I think it is not 1 MB/s but reads issued in 1 MB chunks, which is not
unusual. I believe HDF5's default chunk cache is 1 MiB. Classic and
64-bit-offset netCDF will request larger reads, but the underlying POSIX
layer may still be reading in 1 MB chunks. (Not an expert response, just
some recent experience with PnetCDF reading issues involving a Lustre
timing bug.)
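
For what it's worth, netCDF4-python exposes the HDF5 chunk-cache
controls, so you can check what is actually in effect for a
netCDF-4/HDF5 file. Something like this untested sketch should do it
("test.nc" and "tas" are placeholder names):

    import netCDF4

    # Library-wide default HDF5 chunk cache, applied to variables in
    # files opened after this point; returns (size, nelems, preemption).
    # HDF5's default cache size is 1 MiB.
    print(netCDF4.get_chunk_cache())

    ds = netCDF4.Dataset("test.nc", "r")   # placeholder file name
    var = ds.variables["tas"]              # placeholder variable name
    print(var.get_var_chunk_cache())       # per-variable cache in effect

    # Experiment: give this variable a 64 MiB cache, keep the other
    # settings as they are.
    size, nelems, preemption = var.get_var_chunk_cache()
    var.set_var_chunk_cache(size=64 * 1024 * 1024,
                            nelems=nelems, preemption=preemption)
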
-- Ted
On Oct 30, 2015, at 10:09 AM, Bryan Lawrence <bryan.lawrence@xxxxxxxxxx> wrote:
> Hi Rob
>
> 1 MB/s compares with more like 400 MB/s from the native file system
> through a shallower stack … so we're pretty sure there is some sort of
> buffering involved somewhere (everywhere) in the netCDF/HDF5/Python
> stack. We're after some pointers to understand which buffers live
> where, particularly in the netCDF stack itself.
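>
> For reference, the kind of comparison behind those numbers looks
> roughly like the sketch below (untested here; the file and variable
> names are placeholders): time raw sequential POSIX reads of the file,
> then time reading the same data through netCDF4.
>
> import time
> import netCDF4
>
> FNAME = "big.nc"          # placeholder file name
> CHUNK = 1024 * 1024       # 1 MiB per read
>
> # Raw sequential reads straight through the file system.
> # (Drop the OS page cache between the two passes for a fair test.)
> t0, nbytes = time.time(), 0
> with open(FNAME, "rb") as f:
>     while True:
>         buf = f.read(CHUNK)
>         if not buf:
>             break
>         nbytes += len(buf)
> print("raw: %.1f MB/s" % (nbytes / 1e6 / (time.time() - t0)))
>
> # The same data read record-by-record through the netCDF4 stack.
> var = netCDF4.Dataset(FNAME).variables["tas"]  # placeholder variable
> t0, nbytes = time.time(), 0
> for i in range(var.shape[0]):
>     nbytes += var[i, ...].nbytes
> print("netCDF4: %.1f MB/s" % (nbytes / 1e6 / (time.time() - t0)))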
>
> Cheers
> Bryan
>
>
> On 30 October 2015 at 14:41, <robl@xxxxxxxxxxx> wrote:
>
>
> On 10/28/2015 09:57 AM, Matthew Jones wrote:
> > I am running some tests on an HPC cluster, altering the size of reads to
> > test the performance of the file system.
> >
> >
> > I am using Python. For sequential reads that do not go through
> > netCDF4, the read rate is fairly constant across read sizes. When I
> > introduce the netCDF4 library, however, performance dips at the
> > smaller and larger read sizes and peaks at medium ones (a hill-shaped
> > profile). The peak netCDF4 read rate is about the same as the
> > non-netCDF4 rate, and it occurs at reads of about 1 MB.
> >
> >
> > We think this could be due to buffering somewhere in the netCDF
> > library. Does anyone know of buffering that we should be aware of?
>
> Your software stack is pretty deep here. 1 MB might be a perfectly
> reasonable rate, but some additional experiments (or information from
> your HPC consultants) would help.
>
> Your message describes sequential access. It's possible the file system
> has been tuned for parallel access. Is peak parallel performance
> something you might care about some day soon? Tuning for parallel
> access is a bit different from tuning for serial access.
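>
> (If you do go parallel, the read path through netCDF4-python looks
> roughly like the sketch below. It assumes netCDF/HDF5 built with MPI
> support, mpi4py, and a netCDF4-python release new enough to expose the
> parallel keywords; file and variable names are placeholders.)
>
> from mpi4py import MPI
> import netCDF4
>
> # Collective open of one file across all MPI ranks.
> ds = netCDF4.Dataset("big.nc", "r", parallel=True,
>                      comm=MPI.COMM_WORLD, info=MPI.Info())
> var = ds.variables["tas"]          # placeholder variable name
>
> # Each rank reads its own slab of the first dimension.
> rank, size = MPI.COMM_WORLD.rank, MPI.COMM_WORLD.size
> step = var.shape[0] // size
> data = var[rank * step:(rank + 1) * step, ...]
> ds.close()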
>
> ==rob
>
>
>
> >
> >
> > Many thanks
> >
> > Matt
> >
> >
> >
> > ----------------------------------------
> > Matthew Jones
> > PhD Student
> > Atmosphere, Oceans and Climate
> > Department of Meteorology,
> > University of Reading
> >
> > Room 288, ESSC, Harry Pitt Building,
> > 3 Earley Gate, Reading, RG6 6AL, UK
> >
> > Ext: 5214
> >
> > https://www.linkedin.com/pub/matthew-jones/8b/b81/25a
> > http://www.met.reading.ac.uk/users/users/1887
> >
> >
> >
> >
> >
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>
>
>
>
> --
>
> Bryan Lawrence
> University of Reading: Professor of Weather and Climate Computing.
> National Centre for Atmospheric Science: Director of Models and Data.
> STFC: Director of the Centre for Environmental Data Analysis.
> Ph: +44 118 3786507 or 1235 445012; Web:home.badc.rl.ac.uk/lawrence
> _______________________________________________
> netcdfgroup mailing list
> netcdfgroup@xxxxxxxxxxxxxxxx
> For list information or to unsubscribe, visit:
> http://www.unidata.ucar.edu/mailing_lists/