
[netCDF #PFU-753378]: Error in closing netCDF file (due to presence of user-defined type)



Lynton,

> I put in the two lines as described in the JIRA note,
> and then ran the checks (for 4.2.1.1).
> As you will see from the attached file, there are two tests that
> fail. Can you explain this?

If you try it again, it will probably work fine now.  The errors you saw were
due to a transient problem with our remote access DAP test server:

  PASS: test_vara
  Cannot locate test server
  FAIL: test_partvar
  Cannot locate test server
  FAIL: test_varm3
  *** Test: var conversions on URL: file:///work/lappel/local/src/NETCDF/netcdf-4.2.1.1/ncdap_test/testdata3/test.02

If you never want to see these sorts of errors, or have no use for remote
access to data subsets, you can either turn off the remote tests with the
configure option "--disable-dap-remote-tests" or, more drastically, build
without DAP protocol client support at all, with "--disable-dap".

--Russ

> Lynton
> -----Original Message-----
> From: Unidata netCDF Support [mailto:address@hidden]
> Sent: 01 February 2013 18:24
> To: Appel, Lynton
> Cc: address@hidden
> Subject: [netCDF #PFU-753378]: Error in closing netCDF file (due to presence 
> of user-defined type)
> 
> > So far so good:
> > I built the netcdf-4.2.1.1  and applied the bug fix.
> > The tests both work now (egood.c/ebad.c).
> > Also the indications are that the application I am developing also works.
> > so thanks very much.
> > I just wonder what the failed test indicates has broken. So long as it
> > doesn't affect me.....
> 
> I've now fixed a second bug that caused the failed test.  The two bugs
> working together made the test in nc_test4/tst_vars3 appear to work.  The
> two bugs were in two adjacent lines of code in libsrc4/nc4hdf.c.  See the
> Jira ticket NCF-217 for more details.
> 
> --Russ
> 
> > On 01/31/2013 11:05 PM, Unidata netCDF Support wrote:
> > > Hi Lynton,
> > >
> > > I think I found the bug and have a fix for it, but when I apply the fix,
> > > one of our tests fails.  Apparently another fix is  required to make that
> > > test pass, because it currently seems to depend on the buggy code.  But my
> > > fix does seem to make your bug demonstration example work OK.
> > >
> > > The fix involves changing a line of code in version 4.2.1.1 in
> > > libsrc4/nc4hdf.c:2444, from
> > >
> > > >       if (strcmp(dim->name, var->name) && !dim->dirty)
> > > >
> > > > to
> > > >
> > > >       if (!strcmp(dim->name, var->name) && !dim->dirty)
> > >
> > > > If you recompile the library and run "make check", it will fail when
> > > > running nc_test4/tst_vars3, which remains to be fixed.  But if you just
> > > > do "make all" and "make install", it may work on your current code base
> > > > and get you around this particular bug.
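
For readers following the one-character fix above, here is a minimal,
self-contained sketch of why adding "!" flips the test.  It is not the
library code.  strcmp() returns 0 when two strings are equal, so the
original condition was true when the dimension and variable names differed,
while the fixed condition is true when they match.  The function names and
the dim_dirty parameter are hypothetical stand-ins for the dim->name,
var->name, and dim->dirty fields in the quoted line.

```c
#include <stdbool.h>
#include <string.h>

/* Original condition: strcmp() is nonzero when the names DIFFER,
   so this is true for a clean dimension whose name does not match. */
bool buggy_condition(const char *dim_name, const char *var_name, bool dim_dirty)
{
    return strcmp(dim_name, var_name) && !dim_dirty;
}

/* Fixed condition: !strcmp() is true when the names are EQUAL,
   so this is true for a clean dimension whose name matches. */
bool fixed_condition(const char *dim_name, const char *var_name, bool dim_dirty)
{
    return !strcmp(dim_name, var_name) && !dim_dirty;
}
```

With a dimension and a variable both named "time" and a clean dimension,
only fixed_condition() returns true; with names "time" and "c", only
buggy_condition() does.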
> > >
> > > I'll post progress on this on the Jira ticket and let you know if and when
> > > I get the failing test working:
> > >
> > >    https://bugtracking.unidata.ucar.edu/browse/NCF-217
> > >
> > > --Russ
> > >
> > >
> > >
> > >>> I have done some digging into the problem.  The bug appears to be
> > >>> associated with the HDF5 attribute "_Netcdf4Dimid".  Page 14, section
> > >>> B-5 of:
> > >>>
> > >>> https://earthdata.nasa.gov/sites/default/files/esdswg/spg/rfc/esds-rfc-022/nasa_netcdf4_standard_v0.03.pdf
> > >>>
> > >>> describes the function of this attribute. Essentially if the order of
> > >>> the coordinate variables is different from the order of the dimensions,
> > >>> then this attribute must be present in all HDF5 datasets that have the
> > >>> property "dimension_scale".
> > >>>
> > >>> I looked at the netCDF data files written out by the example programmes
> > > >>> you prepared. The HDF5 file written by egood.c doesn't contain the
> > > >>> _Netcdf4Dimid attribute. This is correct behaviour because the
> > > >>> dimensions and variables are written in the "correct" order.
> > >>>
> > > >>> The HDF5 file written by ebad.c contains the _Netcdf4Dimid attribute.
> > >>> However, there appear to be several mistakes:
> > >>> (i) the data set "c" contains a "_Netcdf4Dimid" attribute even
> > >>> though it is not a Dimension_scale.
> > >>> (ii) the data set "time" does not contain a "_Netcdf4Dimid"
> > >>> attribute, but it should!
> > >>> These are the "bugs".
> > >>>
> > > >>> I am not sure how to correct the bugs, but I think I know in which part
> > > >>> of the netCDF code they exist:
> > > >>> The _Netcdf4Dimid attribute is defined in write_netcdf4_dimid (line 1220
> > > >>> of libsrc4/nc4hdf.c).
> > > >>> The decision to write out the attribute is made in
> > > >>> nc4_rec_write_metadata.
> > > >>> So I think there is something wrong with the logic in this part of the
> > > >>> code.
> > >>>
> > > >>> As an aside, I am somewhat confused by the definition of a coordinate
> > > >>> variable. I had understood that a coordinate variable is one for which
> > > >>> the dimension and variable have the same name. By this definition,
> > > >>> there are no coordinate variables in these code examples. However, all
> > > >>> the documentation describes this as a problem to do with coordinate
> > > >>> variables.
> > > >> You're right, but evidently the developer who wrote this code had some
> > > >> confusion about coordinate variables, CF auxiliary coordinate variables,
> > > >> and multidimensional coordinate variables.
> > >>
> > >>> I  hope these observations are helpful. Please let me know.
> > > >> Yes, thanks!  I hope I'll be able to find and fix the bugs soon, and I
> > > >> think your contributions will be very helpful.
> > >>
> > >> --Russ
> > >>
> > >>> On 01/28/2013 04:28 PM, Unidata netCDF Support wrote:
> > >>>> Lynton,
> > > >>>>> Many thanks for this. I am following progress....
> > > >>>>> However, is it possible to have an indication of timescales (or work
> > > >>>>> effort)? This is a serious bug for me and I would prefer to wait for
> > > >>>>> its resolution before continuing with the current software
> > > >>>>> development work.
> > >>>> It's hard to estimate how much work it will take to fix this.  My
> > >>>> latest efforts make it appear as if the problem is in nc_enddef(), but
> > >>>> a quick look at that didn't result in seeing the bug.  This problem is
> > >>>> in the area of trying to model netCDF's shared dimensions using HDF5's
> > >>>> dimension scales, but HDF5 dimension scales aren't adequate by
> > >>>> themselves, so Ed had to "bolt on" extra artifacts, consisting of
> > >>>> lists and attributes in the HDF5 representation that aren't visible in
> > >>>> the netCDF-4 files, to try to fill the gap.  There have been several
> > >>>> bugs in this part of the netCDF-4 implementation, all involving
> > >>>> something breaking depending on someone invoking netCDF functions in
> > >>>> an order that we don't test or didn't anticipate.
> > >>>>
> > >>>> Ed's no longer available for consulting on this, so I'm currently
> > >>>> trying to figure out what's going on by reading about the artifacts in
> > >>>> Appendix B of this document:
> > >>>>
> > >>>>     
> > >>>> https://earthdata.nasa.gov/sites/default/files/esdswg/spg/rfc/esds-rfc-022/nasa_netcdf4_standard_v0.03.pdf
> > >>>>
> > >>>> However, currently getting a blog finished and published on use of
> > >>>> chunking in netCDF-4 is higher priority, because it's overdue.  So
> > >>>> more progress in debugging the netCDF-4 bug, as my next highest
> > >>>> priority, will probably be delayed until later this week ...
> > >>>>
> > >>>> --Russ
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>> However, this may be impractical.
> > >>>>> thanks
> > >>>>> Lynton
> > >>>>>
> > >>>>> On 01/24/2013 10:54 PM, Unidata netCDF Support wrote:
> > >>>>>> Lynton,
> > >>>>>>
> > > >>>>>> The Jira ticket for this bug, with two C example programs, is now
> > > >>>>>> available here:
> > >>>>>>
> > >>>>>>      https://bugtracking.unidata.ucar.edu/browse/NCF-217
> > >>>>>>
> > >>>>>> in case you want to follow the progress.
> > >>>>>>
> > >>>>>> --Russ
> > >>>>>>
> > >>>>>>> Lynton,
> > >>>>>>>
> > > >>>>>>>> Thanks for the reply. In fact the "feature" you picked up was a
> > > >>>>>>>> genuine mistake of mine when translating from the C++ API to the
> > > >>>>>>>> C API. The real problem was somewhat different, as I will explain.
> > > >>>>>>>> The programme I attach is the same as before but with the
> > > >>>>>>>> user-type error corrected and some data assigned to the variable
> > > >>>>>>>> "weightDDXXYY".
> > >>>>>>>>
> > >>>>>>>> I can compile the code fine and run it fine.
> > >>>>>>>>
> > > >>>>>>>> However, when I run ncdump I get problems. In this case the output
> > > >>>>>>>> is wrong, but in other cases ncdump can actually crash.
> > > >>>>>>>> The error appears to be associated with assigning values to the
> > > >>>>>>>> variable "ironBoundaries" on line 44 of efit++.cpp.
> > > >>>>>>>> This causes the dimensioning of weightDDXXYY to be screwed up, at
> > > >>>>>>>> least according to ncdump.
> > > >>>>>>>> However, h5dump appears not to have the same problem, suggesting
> > > >>>>>>>> that there is a problem in ncdump!
> > >>>>>>>>
> > > >>>>>>>> To see this for yourself, compare the files efitOut.txt (ncdump
> > > >>>>>>>> output) and efitOut.hdf5.txt (h5dump output).
> > > >>>>>>>> You will see that the dimensioning of weightDDXXYY is apparently
> > > >>>>>>>> different.
> > >>>>>>>>
> > >>>>>>>> Note as I said before, this is using netCDF version 4.2
> > > >>>>>>> OK, now I can reproduce the bug.  It appears to be an example of a
> > > >>>>>>> bug that depends on the order in which netCDF functions are called,
> > > >>>>>>> but the results should not depend on the order.
> > >>>>>>>
> > > >>>>>>> I'm attaching a version of your program that works when I reorder
> > > >>>>>>> the function calls to appear in the following groups of calls:
> > >>>>>>>
> > >>>>>>> create file and groups
> > >>>>>>> define types
> > >>>>>>> define dimensions
> > >>>>>>> define variables
> > >>>>>>> write data
> > >>>>>>>
> > > >>>>>>> and it works as expected.  I don't know if there's a simpler
> > > >>>>>>> permutation of statement orders that would also work.
> > >>>>>>>
> > > >>>>>>> The fact that it doesn't work in the order you used is definitely a
> > > >>>>>>> major bug.  I'm also creating a Jira ticket for this and will
> > > >>>>>>> consider it a priority to try to diagnose the underlying problem
> > > >>>>>>> and fix it.
> > >>>>>>>
> > >>>>>>> --Russ
> > >>>>>>>
> > >>>>>>>> On 01/24/2013 01:46 PM, Unidata netCDF Support wrote:
> > >>>>>>>>> Hi Lynton,
> > >>>>>>>>>
> > > >>>>>>>>>> I have a short programme that throws up an HDF5 error,
> > > >>>>>>>>>> NC_EHDFERR, when closing. It appears to be connected with
> > > >>>>>>>>>> defining a user-defined type.
> > > >>>>>>>>>> Have you got any idea what the problem is?
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> The output of the code is:
> > >>>>>>>>>> 0 1
> > >>>>>>>>>> 0 2
> > >>>>>>>>>> 0 3
> > >>>>>>>>>> 0 4
> > >>>>>>>>>> 0 5
> > >>>>>>>>>> 0 6
> > >>>>>>>>>> 0 7
> > >>>>>>>>>> 0 9
> > >>>>>>>>>> 0 10
> > >>>>>>>>>> -101 11
> > > >>>>>>>>> It looks to me as if you started to define a netCDF user-defined
> > > >>>>>>>>> type named "ironBoundaryType", but didn't finish that definition.
> > > >>>>>>>>> Then you tried to define netCDF variables of the incompletely
> > > >>>>>>>>> defined type.  It's a bug that the netCDF API lets you do this
> > > >>>>>>>>> without returning an error until you close the file.  I'm not
> > > >>>>>>>>> sure whether there's also a corresponding bug in HDF5 that allows
> > > >>>>>>>>> this.
> > >>>>>>>>>
> > > >>>>>>>>> To complete the definition of the user-defined type, you need to
> > > >>>>>>>>> fill out the type with repeated calls to nc_insert_compound().
> > > >>>>>>>>> Call the nc_insert_compound function once for each field (member)
> > > >>>>>>>>> you wish to insert into the compound type.  Don't define
> > > >>>>>>>>> variables using a type until you finish defining the type.
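
As a rough sketch of the "one call per field" rule above: each call to
nc_insert_compound() needs a field name and the field's byte offset within
the in-memory struct, and offsetof() is the usual way to get that offset in
C.  The struct layout and field names below are hypothetical, since the real
members of "ironBoundaryType" are not shown in this thread.  The block
deliberately does not link against netCDF; the calls it would feed
(nc_def_compound, nc_insert_compound, nc_def_var) appear only in comments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory layout for one record of the compound type.
   With netCDF, nc_def_compound() would be given sizeof(struct iron_boundary). */
struct iron_boundary {
    int    id;
    double r;
    double z;
};

/* One entry per field: the name and offset that a single
   nc_insert_compound(ncid, typeid, name, offset, field_typeid) call needs.
   The field type (e.g. NC_INT, NC_DOUBLE) is omitted to stay library-free. */
struct field_desc {
    const char *name;    /* field name as it would appear in the file */
    size_t      offset;  /* byte offset of the field within the struct */
};

static const struct field_desc iron_boundary_fields[] = {
    { "id", offsetof(struct iron_boundary, id) },
    { "r",  offsetof(struct iron_boundary, r)  },
    { "z",  offsetof(struct iron_boundary, z)  },
};

/* Number of fields, i.e. the number of nc_insert_compound() calls to make
   before the type is complete and may be used in nc_def_var(). */
size_t iron_boundary_field_count(void)
{
    return sizeof iron_boundary_fields / sizeof iron_boundary_fields[0];
}

/* Byte offset of the named field, or (size_t)-1 if there is no such field. */
size_t iron_boundary_field_offset(const char *name)
{
    size_t i;
    for (i = 0; i < iron_boundary_field_count(); i++) {
        if (strcmp(iron_boundary_fields[i].name, name) == 0)
            return iron_boundary_fields[i].offset;
    }
    return (size_t)-1;
}
```

Keeping the field list in one table makes it harder to define a variable of
the type before every member has been inserted, which is the mistake
described above.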
> > >>>>>>>>>
> > > >>>>>>>>> I'll enter a Jira ticket for this later and try to determine where
> > > >>>>>>>>> the bug is, but it may have to wait until after we get the 4.3
> > > >>>>>>>>> release for the C library out ...
> > >>>>>>>>>
> > >>>>>>>>> --Russ
> > > >>>>>>> Russ Rew                                         UCAR Unidata Program
> > > >>>>>>> address@hidden                      http://www.unidata.ucar.edu
> > >>>>>>>
> > >>>>>>>
> > >>>>>> Ticket Details
> > >>>>>> ===================
> > >>>>>> Ticket ID: PFU-753378
> > >>>>>> Department: Support netCDF
> > >>>>>> Priority: Normal
> > >>>>>> Status: Closed
> > >>>>>>
> > >>>
> >
> >
> 

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: PFU-753378
Department: Support netCDF
Priority: Normal
Status: Closed