Won't hurt to ask 'em. They can close it if it's fixed with very little
effort.
gerry
On Wed, Aug 19, 2015 at 8:57 PM, Rob Latham <robl@xxxxxxxxxxx> wrote:
>
>
> On 08/19/2015 03:55 PM, Gerry Creager - NOAA Affiliate wrote:
>
>> I'll open a case to determine if Cray's MPI-IO library has this problem.
>>
>>
> OK. Might not be any need to do so: David Knaak told me (via off-list
> correspondence) that it was fixed in Cray MPI-IO much the same way I fixed
> it in ROMIO.
>
> ==rob
>
>> gerry
>>
>> On Wed, Aug 19, 2015 at 7:47 PM, Rob Latham <robl@xxxxxxxxxxx> wrote:
>>
>>
>>
>> On 08/18/2015 02:31 PM, Ward Fisher wrote:
>>
>> Hello all,
>>
>> I just wanted to jump in and comment that this issue, recently reported
>> to us by David Knaak at Cray, is now handled in the netCDF-C development
>> branch on GitHub. This fix will be in the upcoming release candidate and
>> eventual final release of netCDF-C 4.4.0.
>>
>> Regarding the question of short reads providing more warning: netCDF
>> was already checking for short reads when ‘paging in’ data from a file,
>> but it assumed that any short read was an error (due to a non-zero
>> |errno| value). The fix shouldn’t incur any performance penalty. The new
>> thing I learned about “short reads” is that one can occur /without/
>> being the result of an error, but rather as the result of an interrupt.
>>
>>
>> I found these short reads would happen in ROMIO when trying to read
>> 2 GiB of data in one shot. Linux would give me back (2GiB-4k) worth
>> of data.
>>
>> Today, most MPI-IO libraries should detect and retry this case.
>> Cray's MPI-IO library is closed source, so I don't know what they do.
>>
>> In general, since short reads are technically allowed, I think
>> developers are going to have to accommodate the possibility of them in
>> their software, one way or another. Developers should already be
>> checking the return value of |read()|, and when it is short, the fix is
>> essentially:
>>
>> 1. Check to see if errno is |EINTR|
>> 2. If so, perform some calculations and resume the read.
>>
>>
>> While that's strictly correct, I worry about short reads that for
>> whatever reason don't set errno to EINTR. So I would check how much
>> data was read. If it is less than requested, continue the read to fetch
>> the missing data. If that continued read returns 0, then you are at EOF
>> and you are done.
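
For illustration, the retry strategy described above could be sketched in C roughly as follows. This is only a minimal sketch, not the actual netCDF-C or ROMIO fix: the helper name read_full is hypothetical, and it assumes a blocking file descriptor and the POSIX read() call. It keeps reading until the request is satisfied or read() returns 0, and it also retries when read() fails with errno set to EINTR before any data arrived.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from netCDF-C or ROMIO): keep calling
 * read() until `count` bytes have arrived, end-of-file is reached, or a
 * real error occurs. Returns the number of bytes stored in `buf`, which
 * is smaller than `count` only at end-of-file, or -1 with errno set. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    assert(buf != NULL || count == 0);

    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);
        if (n > 0) {
            total += (size_t)n;   /* possibly short: loop to fetch the rest */
        } else if (n == 0) {
            break;                /* end-of-file: nothing more to read */
        } else if (errno == EINTR) {
            continue;             /* interrupted before any data: retry */
        } else {
            return -1;            /* genuine error; errno was set by read() */
        }
    }
    return (ssize_t)total;
}
```

A caller would then treat a return value smaller than the requested count as end-of-file rather than as an error, and only a return value of -1 as a failure.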
>>
>> ==rob
>>
>> --
>> Rob Latham
>> Mathematics and Computer Science Division
>> Argonne National Lab, IL USA
>>
>>
>> --
>> Gerry Creager
>> NSSL/CIMMS
>> 405.325.6371
>> ++++++++++++++++++++++
>> “Big whorls have little whorls,
>> That feed on their velocity;
>> And little whorls have lesser whorls,
>> And so on to viscosity.”
>> Lewis Fry Richardson (1881-1953)
>>