I looked at the GDAL driver for netcdf. Adding
nc_set_chunk_cache(100000000, 10000, 0.75)
in GDALRegister_netCDF() helps, but I'm not sure whether it's a good
general solution.
The code on lines 482-500
<https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/netcdf/netcdfdataset.cpp#L482>
sets some chunking-related properties if NETCDF_HAS_NC4 is defined. My
problem persists, though, whether or not this is defined (unless I add the
caching call). Note that later, on line 511, nBlockSize is set to 1 for
bottom-up datasets (which mine is), regardless of the define. However, for
other datasets that I have, nBlockSize is also set to 1, and they are fast.
Does anything else in netcdfdataset.cpp ring any bells? I can bring this up
with gdal folks, but wanted to check here first for a possible
recommendation.
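
For concreteness, here is a rough sketch of the cache-size arithmetic that
Dave describes in the quoted message below, generalized a little. It only
estimates how many bytes of uncompressed chunk data one full row of a 2-D
variable touches. The helper names ceil_div and row_of_chunks_bytes are
made up for illustration and are not part of netcdf-c or GDAL. The result
is the kind of number one might pass as the size argument to
nc_set_chunk_cache or nc_set_var_chunk_cache; treat it as a sizing
heuristic under the row-wise read assumption, not a tested recommendation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: integer ceiling division, i.e. how many chunks of
 * length `chunk` are needed to cover a dimension of length `extent`. */
size_t ceil_div(size_t extent, size_t chunk)
{
    assert(chunk > 0); /* guard against division by zero */
    return (extent + chunk - 1) / chunk;
}

/* Hypothetical helper: bytes of uncompressed chunk data touched when
 * reading one full row of a 2-D variable, i.e. every chunk that spans
 * the column dimension, each counted at its full chunk size. */
size_t row_of_chunks_bytes(size_t ncols, size_t chunk_rows,
                           size_t chunk_cols, size_t elem_size)
{
    size_t chunks_across = ceil_div(ncols, chunk_cols);
    return chunks_across * chunk_rows * chunk_cols * elem_size;
}
```

With the quality_flags layout quoted below (4865 columns, chunks of
891 x 1177, 4-byte uint), ceil_div(4865, 1177) is 5 and
row_of_chunks_bytes(4865, 891, 1177, 4) is 20974140 bytes, which matches
the ~21 Mbytes estimate.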
On Thu, Dec 15, 2016 at 5:24 PM, dmh@xxxxxxxx <dmh@xxxxxxxx> wrote:
> The interactions between two independent caches can cause
> problems. I should look at the netcdf cache and see how it interacts
> with the hdf5 cache.
> =Dennis Heimbigner
> Unidata
>
> On 12/15/2016 6:03 PM, Dave Allured - NOAA Affiliate wrote:
>
>> On Thu, Dec 15, 2016 at 4:46 PM, Chris Barker <chris.barker@xxxxxxxx> wrote:
>>
>> On Thu, Dec 15, 2016 at 1:00 PM, dmh@xxxxxxxx <dmh@xxxxxxxx> wrote:
>>
>> 1. Adding this feature to ncdump also requires adding it to the
>> netcdf-c library API. But providing some means for client programs
>> to pass through parameter settings to the hdf5 lib seems like a
>> good idea.
>>
>>
>> Absolutely! That would be very helpful.
>>
>> -CHB
>>
>>
>> This may be premature. The netcdf API already has its own chunk cache
>> with at least two functions to adjust tuning parameters. It seems to me
>> that the netcdf facility would probably handle the current ncdump and
>> gdal cases nicely, though I have not tested it. Please see this
>> relevant documentation:
>>
>> http://www.unidata.ucar.edu/software/netcdf/docs/netcdf_perf_chunking.html
>>
>> Simon, you might want to ask your gdal maintainer to give this a try.
>> If it works, it should be simple and robust. I would suggest increasing
>> the per-variable chunk cache size to hold at least 5 uncompressed
>> qualityFlags.nc chunks, and probably more. 5 is the number of chunks that
>> span a single row for this particular file. This advice presumes that
>> your typical read pattern is similar to ncdump, which I speculate reads
>> across single whole rows first, as I said earlier.
>>
>> columns = 4865 ;
>> rows = 3682 ;
>> uint quality_flags(rows, columns) ;
>> quality_flags:_ChunkSizes = 891, 1177 ;
>>
>> 5 x 891 x 1177 x 4 bytes per uint uncompressed ~= 21 Mbytes
>>
>> Note this is likely to be a little larger than the default cache size in
>> the current netcdf-C library, thus explaining some of the slow read
>> behavior.
>>
>> You might also consider rechunking such data sets to a smaller chunk
>> size. Both nccopy and ncks can do that. The best chunk shape depends on
>> your anticipated spatial read patterns, so give that a little thought.
>>
>> You might also consider reading the entire grid in a single get_vara
>> call to the netcdf API. That is what my fast Fortran test program did.
>> A naive reader that, for example, loops over single rows may incur bad
>> cache activity that could be avoided.
>>
>> --Dave
>>
>>
>> _______________________________________________
>> NOTE: All exchanges posted to Unidata maintained email lists are
>> recorded in the Unidata inquiry tracking system and made publicly
>> available through the web. Users who post to any of the lists we
>> maintain are reminded to remove any personal information that they
>> do not want to be made public.
>>
>>
>> netcdfgroup mailing list
>> netcdfgroup@xxxxxxxxxxxxxxxx
>> For list information or to unsubscribe, visit:
>> http://www.unidata.ucar.edu/mailing_lists/
>>
>>