Re: [netcdf-java] Errors reading certain NetCDF4 data

  • To: Ryan May <rmay@xxxxxxxx>
  • Subject: Re: [netcdf-java] Errors reading certain NetCDF4 data
  • From: Antonio Rodriges <antonio.rrz@xxxxxxxxx>
  • Date: Thu, 19 Feb 2015 21:48:43 +0300
Ryan,

I do have time as my first dimension (Christian suggested making time
the last dimension),
and I thought that after rechunking I would get something like this:

4x4 (lat and lon 2D array stored contiguously on disk), 4x4, 4x4, 4x4, ..., 4x4
<<------------------ the number of rasters is 512 ------------------>>
so the distance between consecutive dates should be not 8 KB but
only 4 x 4 x sizeof(float) = 64 bytes for the expected layout.
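
The arithmetic above can be checked with a minimal Python sketch. It assumes that values inside a chunk are stored in C (row-major) order without compression; the helper name `stride_bytes` is made up for illustration and is not part of any netCDF API.

```python
from math import prod

FLOAT_SIZE = 4  # bytes per 32-bit float, the type of u10m


def stride_bytes(chunk_shape, axis, item_size=FLOAT_SIZE):
    """Bytes between neighbouring values along `axis` inside one row-major chunk."""
    # Dimensions to the right of `axis` vary faster, so one step along
    # `axis` skips a full block of those trailing dimensions.
    return prod(chunk_shape[axis + 1:]) * item_size


# Chunk ordered (time, lat, lon) = (512, 4, 4): the expected layout.
time_first = stride_bytes((512, 4, 4), axis=0)

# Chunk ordered (lat, lon, time) = (4, 4, 512): time as the last dimension.
time_last = stride_bytes((4, 4, 512), axis=2)

print(time_first, time_last)
```

Under these assumptions, two consecutive times for the same point are 64 bytes apart when time is the first chunk dimension and 4 bytes apart when time is the last one.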

Here is the metadata (it does not show the chunk sizes; is there a way
to look at them?):

netcdf file:/d:/RS_DATA/worker/merra_ts/tavg1_2d_slv_Nx/wind_australia_chunked/u10m/chunked/2014_ch.nc {
 dimensions:
   latitude = 103;
   longitude = 122;
   time = UNLIMITED;   // (5088 currently)
 variables:
   double latitude(latitude=103);
     :_Netcdf4Dimid = 0; // int
     :units = "degrees_north";
     :long_name = "Latitude";
   double longitude(longitude=122);
     :_Netcdf4Dimid = 1; // int
     :units = "degrees_east";
     :long_name = "Longitude";
   double time(time=5088);
     :_Netcdf4Dimid = 2; // int
     :units = "hours since 2014-1-1 0";
   float u10m(time=5088, latitude=103, longitude=122);
     :comments = "Unknown1 variable comment";
     :long_name = "Eastward wind at 10 m above displacement height";
     :units = "m s-1";
     :grid_name = "grid-1";
     :grid_type = "linear";
     :level_description = "Earth surface";
     :time_statistic = "instantaneous";
     :missing_value = 9.9999999E14f; // float

 :Conventions = "COARDS";
 :calendar = "standard";
 :comments = "file created by grads using lats4d available from http://dao.gsfc.nasa.gov/software/grads/lats4d/";
 :model = "geos/das";
 :center = "gsfc";
 :history = "Mon Dec 01 20:20:48 2014:
D:\\DATA\\worker\\merra_ts\\tavg1_2d_slv_Nx\\wind_australia\\u10m\\ncks.exe
-4 --cnk_dmn lat,4 --cnk_dmn lon,4 --cnk_dmn time,512 2014.nc
2014_ch.nc\\nWed Oct 15 20:26:23 2014: ncrcat -v u10m -o 2014.nc";
 :nco_openmp_thread_number = 1; // int
 :nco_input_file_number = 212; // int
 :NCO = "20141201";
}
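
For the dimensions above, the cost of reading every time step for one grid point can be estimated with a short Python sketch. It assumes that the chunks really are 512x4x4 in (time, latitude, longitude) order, that a chunk is always read whole, that the partially filled last chunk takes the same space as a full one, and that there is no compression. The function name `point_series_cost` is hypothetical.

```python
from math import ceil, prod


def point_series_cost(n_time, chunk_shape, item_size=4):
    """Return (chunks_touched, bytes_read, bytes_wanted) for one grid point."""
    chunk_time = chunk_shape[0]
    # A single (lat, lon) point lies in exactly one chunk per block of times.
    chunks_touched = ceil(n_time / chunk_time)
    # Assumption: every touched chunk is read in full.
    bytes_read = chunks_touched * prod(chunk_shape) * item_size
    # The values actually needed: one float per time step.
    bytes_wanted = n_time * item_size
    return chunks_touched, bytes_read, bytes_wanted


# time = 5088 from the header above; chunk shape from the ncks call in :history
chunks, read, wanted = point_series_cost(5088, (512, 4, 4))
print(chunks, read, wanted)
```

With these numbers, a full time series for one point touches 10 chunks and reads 327680 bytes to obtain 20352 bytes of wanted data, about 16 times more than needed.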

2015-02-19 21:24 GMT+03:00 Ryan May <rmay@xxxxxxxx>:
> Antonio,
>
> Even with that chunk size, the number of bytes between consecutive points in
> time is 512 x 4 x sizeof(float), which is 8 kb. You may get a few points
> closer together, but they're still not close together. Any read ahead
> function of the disk will be throwing away 99% of the data if all you want
> is all the time for a single point.
>
> If your predominant access pattern is all times for a single point, your
> best throughput will be achieved by making sure that those points are
> consecutive on disk, which means that you should have time be the first
> dimension, not the last. Anything else you do will be papering over the core
> problem.
>
> Ryan
>
> On Thu, Feb 19, 2015 at 10:37 AM, Antonio Rodriges <antonio.rrz@xxxxxxxxx>
> wrote:
>>
>> Christian,
>>
>> According to Russ Rew
>>
>> http://www.unidata.ucar.edu/blogs/developer/entry/chunking_data_why_it_matters
>> chunking should help with my access pattern.
>>
>> After rechunking I expected to have chunks of size 512x4x4, in which
>> the values for a single point at different times are stored very
>> close together on disk.
>
>
>
>
> --
> Ryan May
> Software Engineer
> UCAR/Unidata
> Boulder, CO


