Re: [netcdf-java] Erroneous data from linked HDF files

  • To: Christopher Mueller <cmueller@xxxxxxxxxxxxxx>
  • Subject: Re: [netcdf-java] Erroneous data from linked HDF files
  • From: Christian Ward-Garrison <cwardgar@xxxxxxxx>
  • Date: Fri, 8 Aug 2014 14:52:19 -0600
Hi Chris,

It's definitely a good idea to emit a warning rather than silently
returning bad data. I've created a JIRA issue for this problem, which you
can follow if you wish: https://bugtracking.unidata.ucar.edu/browse/TDS-584
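
A warning of this kind could be driven by a simple filename check. The sketch below is only an illustration and is not NetCDF-Java code: the class LinkedFileGuard and its methods are hypothetical, and it assumes the OceanColor naming convention seen in this thread (a ".main" file with ".x00", ".x01", ... siblings in the same directory). It does not inspect the HDF4 file contents at all.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of NetCDF-Java): filename-based linked-file check. */
final class LinkedFileGuard {
    // Assumption: linked filesets follow the "<base>.main" + "<base>.xNN" convention.
    private static final String MAIN_SUFFIX = ".main";

    /** Returns siblings named "<base>.xNN" for a file named "<base>.main", sorted by path. */
    static List<Path> findSubordinates(Path file) throws IOException {
        List<Path> found = new ArrayList<>();
        String name = file.getFileName().toString();
        if (!name.endsWith(MAIN_SUFFIX)) {
            return found; // naming convention does not apply
        }
        String base = name.substring(0, name.length() - MAIN_SUFFIX.length());
        Pattern subordinate = Pattern.compile(Pattern.quote(base) + "\\.x\\d{2}");
        Path dir = file.toAbsolutePath().getParent();
        try (DirectoryStream<Path> siblings = Files.newDirectoryStream(dir)) {
            for (Path p : siblings) {
                if (subordinate.matcher(p.getFileName().toString()).matches()) {
                    found.add(p);
                }
            }
        }
        Collections.sort(found);
        return found;
    }

    /** Throws if the file looks like the main file of a linked fileset. */
    static void requireUnlinked(Path file) throws IOException {
        List<Path> subs = findSubordinates(file);
        if (!subs.isEmpty()) {
            throw new IOException(file + " appears to be linked to " + subs.size()
                    + " subordinate file(s); data read through it may be incorrect");
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LinkedFileGuard A20021822013212.L3b_MC_RRS.main
        requireUnlinked(Paths.get(args[0]));
        System.out.println("No subordinate files found next to " + args[0]);
    }
}
```

A caller could run requireUnlinked before opening a file, or call findSubordinates and log a warning instead of throwing.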

Cheers,
Christian


On Tue, Aug 5, 2014 at 7:49 AM, Christopher Mueller
<cmueller@xxxxxxxxxxxxxx> wrote:

>  Hi Christian,
>
>  I actually prefer normal files myself, but had a need to use the files
> from the NASA OceanColor site, some of which are provided as .main +
> subordinate (linked) files.  I have used a subordinate file structure with
> HDF5 in the past, but when I did so I was working directly with the HDF
> files (via their HDF Java api), so the linking wasn’t an issue.  The
> primary reason I’m aware of for using subordinates is to keep the size of
> any single file smaller – though I think this is a somewhat antiquated
> reason that’s a holdover from the days of 2GB file limits.
>
>  As to my particular problem, I’ve been able to incorporate the
> aforementioned HDF Java library into our application, which has allowed us
> to read the linked-fileset without issue.  The downside is that we incur
> a requirement for platform-specific binaries, but we don’t have many other
> options! :)  Fortunately, we’re able to segregate the code into a
> “pre-process”, which means we don’t need to worry about distributing the
> platform-specific portions.
>
>  It’s understandable that there is no support for linked HDF files in
> the NetCDF-Java library – as you said, it’s probably not a very frequently
> required functionality.  However – it may be worth trying to find a way to
> at least recognize that a particular dataset is backed by a linked-file so
> that an appropriate error can be thrown.  The concern I have is that, as it
> stands now, the NetCDF-Java library returns data without any indication
> that the data is incorrect.  While in theory, someone should know what
> they’re dealing with and recognize that the data is incorrect, I could
> envision a scenario where it could become a problem.
>
>  Best,
> Chris
>
>   From: Christian Ward-Garrison <cwardgar@xxxxxxxx>
> Date: Friday, August 1, 2014 at 7:17 PM
> To: Christopher Mueller <cmueller@xxxxxxxxxxxxxx>
> Cc: "netcdf-java@xxxxxxxxxxxxxxxx" <netcdf-java@xxxxxxxxxxxxxxxx>
> Subject: Re: [netcdf-java] Erroneous data from linked HDF files
>
>    Hi Chris,
>
>  First off, let me just say that this is an absolutely fantastic bug
> report. I wish I had better news for you, but the simple answer is that
> NetCDF-Java doesn't support linked HDF files. Frankly, you're the first user
> who's even mentioned them to us. Is there a particular reason that you
> prefer linked files to normal files?
>
>  Regards,
>  Christian
>
>
> On Tue, Jul 15, 2014 at 1:29 PM, Christopher Mueller <
> cmueller@xxxxxxxxxxxxxx> wrote:
>
>>   *tl;dr* There appears to be a bug in NetCDF Java with respect to
>> reading linked HDF4 files, which causes data read from the linked
>> file(s) to be erroneous.
>>  Resources
>>
>>    - ToolsUI
>>    - HDFView
>>    - The files mentioned below can be retrieved directly from OceanColor
>>    <http://oceancolor.gsfc.nasa.gov/cgi/l3/A20021822013212.L3b_MC_RRS.main.bz2?sub=bin>
>>    (one at a time), or (for convenience) as one tar.gz file from here
>>    <https://drive.google.com/uc?id=0B6UT7Mn4GZQhMjdLNDBBMFE0TTA&export=download>
>>
>>  Details
>>
>> I'm reading data from the Aqua MODIS L3 Binned products available from
>> the NASA OceanColor <http://oceancolor.gsfc.nasa.gov/> website. It
>> should be noted that these files are HDF4 (4.2.9 according to NetCDF Java -
>> ncdump). Many of the products, such as chlorophyll, Particulate Inorganic
>> Carbon, and Sea Surface Temperature, come as a single file. The NetCDF
>> library reads these files without any difficulty.
>>
>> However, one of the datasets of interest is the Remote Sensing
>> Reflectance data, which is NOT provided as a single file, but as a "main"
>> file and a set of subordinate files which are read via the "main" file as
>> needed (see here for more information
>> <http://oceancolor.gsfc.nasa.gov/PRODUCTS/modis_binned.html>):
>>
>>    - A20021822013212.L3b_MC_RRS.main
>>    - A20021822013212.L3b_MC_RRS.x00
>>    - A20021822013212.L3b_MC_RRS.x01
>>    - A20021822013212.L3b_MC_RRS.x02
>>    - A20021822013212.L3b_MC_RRS.x03
>>    - A20021822013212.L3b_MC_RRS.x04
>>    - A20021822013212.L3b_MC_RRS.x05
>>    - A20021822013212.L3b_MC_RRS.x06
>>    - A20021822013212.L3b_MC_RRS.x07
>>    - A20021822013212.L3b_MC_RRS.x08
>>    - A20021822013212.L3b_MC_RRS.x09
>>    - A20021822013212.L3b_MC_RRS.x10
>>    - A20021822013212.L3b_MC_RRS.x11
>>
>>  NetCDF Java (via ToolsUI) loads the .main file without issue, and
>> permits reading of data variables (e.g. Rrs_412) without raising any
>> errors. However, the data returned is not accurate. Below is a comparison
>> of the data returned by ToolsUI and the same data returned by HDFView
>> (which uses the HDF-java JNI <http://www.hdfgroup.org/products/java/JNI/>
>>  library):
>>
>> Retrieving the first 10 values for variable "Rrs_412"
>>  HDFView
>>
>> Screen Capture <http://cl.ly/WZnD>
>>
>> Opening the .main file in HDFView and looking at the Rrs_412 dataset
>> gives a very different set of data:
>>
>> 0.0055423053, 0.0106070135, 0.006894292, -0.0040368317, -0.0020879991, 
>> -0.0020279996, 0.009794002, 0.011879213, 0.010874448, 0.012330733
>>
>>  ToolsUI
>>
>> Screen Capture <http://cl.ly/WZMW>
>>
>> Opening the .main file and performing an *Ncdump Data* of variable:
>> "Level-3_Binned_Data/Rrs_412(0:10:1).Rrs_412_sum"
>>
>> Returns:
>>
>> float Rrs_412_sum;
>>
>>  data:
>>
>>   {1.86057E-40, 9.403955E-38, 6.4099753E-10, 2.6076459E-9, 1.0297978E21, 
>> 5.6431478E-11, 0.0, -2.9699963E36, 4.59183E-40, 3.67343E-40, 2.60329423E11}
>>
>>  Also, in ToolsUI, *all of the other data variables* (e.g. angstrom,
>> aot_869 & Rrs_*) display values very similar to (most are identical to)
>> those of Rrs_412. This is not the case for HDFView.
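
The difference between the two value sets above can also be flagged mechanically. The following is a minimal sketch, not anything ToolsUI or NetCDF-Java does: the class ReflectanceSanityCheck is hypothetical, and the magnitude bound of 1.0 is an assumption chosen for illustration rather than a limit taken from the OceanColor product documentation. It also treats subnormal floats as suspicious, since several of the ToolsUI values (e.g. 1.86057E-40) fall below Float.MIN_NORMAL, which may indicate that unrelated bytes are being interpreted as floats.

```java
/** Hypothetical plausibility check for reflectance values; thresholds are assumptions. */
final class ReflectanceSanityCheck {
    // Assumed upper bound on the magnitude of a plausible value (illustrative only).
    static final float MAX_ABS = 1.0f;

    /** True for NaN, infinite, subnormal, or implausibly large values. */
    static boolean isSuspicious(float v) {
        if (Float.isNaN(v) || Float.isInfinite(v)) {
            return true;
        }
        float abs = Math.abs(v);
        boolean subnormal = abs > 0.0f && abs < Float.MIN_NORMAL;
        return subnormal || abs > MAX_ABS;
    }

    /** Counts the values that fail the plausibility check. */
    static int countSuspicious(float[] values) {
        int count = 0;
        for (float v : values) {
            if (isSuspicious(v)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Values as reported by HDFView for Rrs_412.
        float[] hdfView = {
            0.0055423053f, 0.0106070135f, 0.006894292f, -0.0040368317f, -0.0020879991f,
            -0.0020279996f, 0.009794002f, 0.011879213f, 0.010874448f, 0.012330733f
        };
        // Values as reported by ToolsUI for Rrs_412_sum.
        float[] toolsUi = {
            1.86057E-40f, 9.403955E-38f, 6.4099753E-10f, 2.6076459E-9f, 1.0297978E21f,
            5.6431478E-11f, 0.0f, -2.9699963E36f, 4.59183E-40f, 3.67343E-40f, 2.60329423E11f
        };
        System.out.println("HDFView suspicious: " + countSuspicious(hdfView) + " of " + hdfView.length);
        System.out.println("ToolsUI suspicious: " + countSuspicious(toolsUi) + " of " + toolsUi.length);
    }
}
```

With these assumed thresholds, none of the ten HDFView values is flagged, while six of the eleven ToolsUI values are (three subnormal, three out of range).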
>>
>> Incidentally, reading the data via OceanColor's SeaDas
>> <http://seadas.gsfc.nasa.gov/> application (which uses NetCDF Java under
>> the hood) results in the same data as ToolsUI.
>>  Wrap-up
>>
>> The evidence above appears to indicate that there is a bug in NetCDF Java
>> related to linked HDF files which results in incorrect data reads from
>> linked files.
>>
>> Does anyone have any idea:
>>
>> *a)* what could be causing the issue?
>> *b)* how could it be addressed?
>>
>>
>>
>>  Thanks in advance,
>> Chris
>>