
[netCDF #YRC-237402]: netcdf4 dimensions output comment out



Wanchun,

> So my question is: they can do that (skip the metadata in a netCDF-4 file
> and still have a valid netCDF-4 file), but I cannot do it myself.  So there
> is something I am missing, right?

Sorry, I guess I didn't understand your question, and now that I understand it,
I'll try again ...

I don't know what processing the JPSS program applied to the original HDF5
files to convert them to netCDF-4 files, but by comparing the output of
"h5dump --header" on the .h5 and the .nc files, I can see that they decided to
structure the netCDF-4 file differently, in terms of the groups used and the
attributes defined in the netCDF-4 file.

I would be very surprised if they actually changed any of the netCDF-4 library
code to do this conversion.  I think they wrote the resulting netCDF-4 files
using an HDF5 program rather than calls to the netCDF-4 library, in order to
create a file without using HDF5 Dimension Scales.  When I said the .nc file
was a netCDF-4 file, I was relying on the fact that the netCDF ncdump utility
could read it and show the structure of the file and its data in ASCII.

But if I use another netCDF utility, nccopy, on the file, it shows errors:

  nccopy TATMS_npp_d20120911_t1359540_e1400256_b04529_c20120911202503549711_noaa_ops.nc tmp.nc
  NetCDF: Name contains illegal characters
  Location: file nccopy.c; line 779

because the JPSS group has created what look like netCDF attributes, but whose
names contain the character "/", which is not allowed in netCDF names; for
example, the global attributes:

  HDF5_internal_address_of_/Data_Products
  HDF5_internal_name_of_/Data_Products
  HDF5_internal_address_of_/All_Data
  HDF5_internal_name_of_/All_Data
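
As a quick way to spot such illegal names, you could grep a CDL dump for a "/"
that occurs before the "=" of an attribute line.  This is only a sketch; the
sample.cdl here is made-up stand-in content, not the real ncdump output:

```shell
# Stand-in CDL fragment (hypothetical content, mimicking an "ncdump -h" dump);
# one attribute name contains "/", one is legal.
cat > sample.cdl <<'EOF'
// global attributes:
    :HDF5_internal_name_of_/All_Data = "/All_Data" ;
    :Conventions = "CF-1.0" ;
EOF
# Attribute lines in CDL look like ':name = value ;', so a "/" appearing
# before the first "=" means the attribute *name* is illegal in netCDF.
grep -E '^[[:space:]]*:[^=]*/[^=]*=' sample.cdl
```

A "/" after the "=" (inside an attribute value) is fine, so the legal
:Conventions line is not flagged.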

Furthermore, since the JPSS group didn't use HDF5 Dimension Scales to represent
the shared dimensions, ncdump shows these dataspace sizes as if they were netCDF
dimensions with "made up" names:

  dimensions:
        phony_dim_0 = UNLIMITED ; // (12 currently)
        phony_dim_1 = UNLIMITED ; // (96 currently)
        phony_dim_2 = UNLIMITED ; // (22 currently)
        phony_dim_3 = UNLIMITED ; // (2 currently)
        phony_dim_4 = UNLIMITED ; // (4 currently)
        phony_dim_5 = UNLIMITED ; // (7 currently)
        phony_dim_6 = UNLIMITED ; // (1 currently)

If I fix the attribute names by replacing the "/" characters with another
character, the resulting dataset can be copied by the nccopy utility using only
the netCDF-4 library calls in the current, unmodified library, but now the
"phony_dim_N" dimensions appear in the output as if they were Dimension Scales
created for real netCDF shared dimensions.

I don't have hdfview installed here, but I imagine that if you looked at the
copy produced by nccopy, it would show "phony_dim_N" datasets corresponding to
the manufactured dimensions.  You could try this by running the commands

  ncdump TATMS_npp_d20120911_t1359540_e1400256_b04529_c20120911202503549711_noaa_ops.nc > tmp.cdl
    [ edit tmp.cdl to replace or remove the "/" characters in the global attribute names ]
  ncgen -b tmp.cdl       # creates the netCDF file tmp.nc from the edited tmp.cdl
  nccopy tmp.nc tmp2.nc  # reads tmp.nc and writes tmp2.nc using netCDF-4 library functions
    [ now view tmp2.nc in hdfview ]
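
The manual editing step could also be scripted.  Here is a hedged sketch using
sed, which rewrites "/" to "_" only in the attribute name (the part before the
first "="), leaving attribute values alone; the tmp.cdl contents below are a
made-up stand-in for the real dump:

```shell
# Stand-in for "ncdump file.nc > tmp.cdl" (hypothetical content).
cat > tmp.cdl <<'EOF'
netcdf sample {
// global attributes:
    :HDF5_internal_name_of_/All_Data = "/All_Data" ;
}
EOF
# The loop (:a ... ta) keeps substituting while a "/" remains before the
# first "=", so names with several slashes are handled too; attribute
# *values* after the "=" (where "/" is legal) are left untouched.
sed -E -e ':a' -e 's|^([[:space:]]*:[^=]*)/([^=]*=)|\1_\2|' -e 'ta' tmp.cdl > tmp_fixed.cdl
cat tmp_fixed.cdl
```

ncgen -b should then accept the fixed file, since "_" is legal in netCDF names.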

If the new dimensions appear in the result, then I think I am right that the
"converted" files were not created by calling the netCDF-4 library, but instead
by using the HDF5 library to emulate the netCDF-4 library, creating a file
that's close enough to fool ncdump, but not a netCDF-4 file that could have
been created using the netCDF-4 library.

As to why the JPSS group used this approach, I'm not sure.  I think they could 
provide a 
better answer to your question than I have been able to.

--Russ

> Wanchun
> 
> address@hidden> wrote:
> 
> > > Hi Russ,
> > >
> > > Thanks for your explanation.
> > >
> > > Recently JPSS program converted some hdf5 files into netcdf4 by removing
> > > some meta data
> > > information.
> > >
> > > One example of output netcdf4:
> > >
> > ftp://ftp.star.nesdis.noaa.gov/pub/corp/scsb/wchen/tmp/TATMS_npp_d20120911_t1359540_e1400256_b04529_c20120911202503549711_noaa_ops.nc
> > >
> > > If viewed in hdfview, without that metadata, it's a netCDF-4 file,
> > > right?  Or do you think this is still an HDF5 file?
> >
> > It's a netCDF-4 file, as running the netCDF-4 ncdump utility shows:
> >
> >   $ ncdump -k
> > ~/Downloads/TATMS_npp_d20120911_t1359540_e1400256_b04529_c20120911202503549711_noaa_ops.nc
> >   netCDF-4
> >
> > Of course it's also an HDF5 file, because netCDF-4 files are just special
> > kinds of
> > HDF5 files with some added artifacts to support the netCDF-4 data model,
> > which is
> > different from the HDF5 data model.
> >
> > > The input hdf5 used is:
> > >
> > ftp://ftp.star.nesdis.noaa.gov/pub/corp/scsb/wchen/tmp/TATMS_npp_d20120911_t1359540_e1400256_b04529_c20120911202503549711_noaa_ops.h5
> >
> > Yes, that's an HDF5 file that's not a netCDF-4 file, and the netCDF-4
> > library can't currently read it, although the netCDF-4 library can read
> > some HDF5 files that don't use the HDF5 features in this list:
> >
> >   http://www.unidata.ucar.edu/netcdf/docs/faq.html#fv15
> >
> > > That's the reason that led me to think it's possible to skip metadata
> > > information in a netCDF-4 file.
> > >
> > > All those red "A" on some original hdf5 fields are gone in output nc
> > file.
> > > I guess they removed attributes.
> >
> > h5dump should work on both files and make it clear what was removed.
> >
> > --Russ
> >
> > > address@hidden> wrote:
> > >
> > > > Wanchun,
> > > >
> > > > > I used the simple C code: simple_xy_nc4_wr.c
> > > > >
> > > >
> > http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-tutorial/simple_005fxy_005fnc4_005fwr_002ec.html#simple_005fxy_005fnc4_005fwr_002ec
> > > > >
> > > > > But generated nc file simple_xy_nc4.nc has x and y besides data
> > when I
> > > > > viewed it in hdfview, but I only want data, not x and y.
> > > > >
> > > > > Can you tell me in which part of netcd4 source code do those
> > dimensions
> > > > > like x and y get written out?   I want to disable them or comment
> > them
> > > > > out so that only data is kept in file.  I searched netcdf4 src
> > directory
> > > > > and can not pinpoint which is the correct code to change.
> > > >
> > > > What you're asking for is not possible, because of a mismatch between
> > > > the netCDF-4 data model and the HDF5 data model.  NetCDF-4 has "shared
> > > > dimensions" such as x and y that can be shared among multiple
> > > > variables, to indicate a common grid.  HDF5, used as the storage layer
> > > > for netCDF-4, lacks the concept of shared dimensions, but provides a
> > > > different mechanism, "dimension scales", on which netCDF-4 shared
> > > > dimensions are implemented.  An HDF5 dimension scale is a special type
> > > > of HDF5 dataset, and one must always be used for every netCDF-4
> > > > dimension, because it is the only HDF5 mechanism available for modeling
> > > > shared dimensions.
> > > >
> > > > If netCDF-4 had "anonymous dimensions", like HDF5 dataspaces, which
> > > > cannot be shared among variables, then you could define a variable that
> > > > had its own dataspace, with unnamed and unsharable dimensions.  If you
> > > > must store just the data but not the metadata about dimensions, you
> > > > probably want to use HDF5 rather than netCDF-4.
> > > >
> > > > For more on netCDF-4 dimensions versus HDF5 dimension scales, see these
> > > > blogs by John Caron, explaining the issues:
> > > >
> > > >   http://www.unidata.ucar.edu/blogs/developer/en/entry/dimensions_scales
> > > >   http://www.unidata.ucar.edu/blogs/developer/en/entry/dimension_scale2
> > > >   http://www.unidata.ucar.edu/blogs/developer/en/entry/dimension_scales_part_3
> > > >   http://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf4_shared_dimensions
> > > >
> > > > --Russ
> > > >
> > > > Russ Rew                                         UCAR Unidata Program
> > > > address@hidden                      http://www.unidata.ucar.edu
> > > >
> > > >
> > > >
> > > > Ticket Details
> > > > ===================
> > > > Ticket ID: YRC-237402
> > > > Department: Support netCDF
> > > > Priority: Normal
> > > > Status: Closed
> > > >
> > > >
> > >
> > >
> >
> 
> 
Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu


