Re: Should NetCDF character attributes be null-terminated

On 2001-08-03, "Mark Hadfield" <m.hadfield@xxxxxxxxxxx> wrote:

>     When an application writes text data to a netCDF file attribute,
>     should it (may it, must it) terminate the data with a null character?
>
> I am writing netCDF files with IDL (http://www.researchsystems.com/) and
> reading them in a Fortran program. My Fortran program is reading an
> attribute and interpreting its value as the name of a variable in the same
> netCDF file. I had a very frustrating time working out why it did not find
> the variable, when it was very obviously there, until it dawned on me that
> it was a string termination issue. I established that the Fortran program
> was reading the attribute value as
>
>     wind_time\0
>
> where "\0" stands for the ASCII null character and of course it couldn't
> find a variable with a name that matched this.
>
> So I thought, "It's a bug in IDL. When it writes a string value to a netCDF
> attribute, it's adding an ASCII null at the end."
>
> On reading the IDL documentation, I noted the following in the documentation
> for NCDF_ATTINQ (IDL's counterpart to the NF_INQ_ATT family):
>
>     Length
>
>     The number of values stored in the attribute. If the attribute
>     is a string, the number of values indicates one more character
>     than the string length to include the terminating null character.
>     This is the NetCDF convention...

This was a netCDF convention early in netCDF history, but it is now
deprecated.  However, the situation for string-valued attributes is
different from that for character data in variables; see below.

> It's not a bug, it's a feature! But it was news to me that terminating 0s
> are a netCDF convention, so I looked in the netCDF documentation.
> There are references to this issue in a few places in the C and Fortran
> user guides. These are the most relevant:
>
> Section 8.3 (Get Information about an Attribute: nc_inq_att Family) of the C
> guide includes the following
>
> http://www.unidata.ucar.edu/packages/netcdf/guidec/guidec-13.html#HEADING13-265
>
>     lenp
>
>     Pointer to location for returned number of values currently
>     stored in the attribute. For attributes of type NC_CHAR, you
>     should not assume that this includes a trailing zero byte; it
>     doesn't if the attribute was stored without a trailing zero byte,
>     for example from a FORTRAN program. Before using the value
>     as a C string, make sure it is null-terminated. If this parameter is
>     given as '0' (a null pointer), no length will be returned so no
>     variable to hold this information needs to be declared.
>
> So this implies it is permissible but not compulsory to add a trailing zero
> byte.

That's right.
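
To illustrate the reading side, here is a minimal C sketch; it is not
part of the netCDF library.  The helper name att_to_cstring is
hypothetical.  It assumes you have already obtained the attribute's
length and bytes (in the C interface, for example, with nc_inq_attlen
and nc_get_att_text) and shows one way to get a safe C string whether
or not the writer stored a trailing zero byte:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a netCDF library call): turn the raw bytes
 * of an NC_CHAR attribute into a C string, whether or not the writer
 * stored a trailing zero byte.
 *   raw: the attribute bytes as read from the file
 *   len: the number of values reported for the attribute
 * Returns a malloc'd, null-terminated copy (caller frees), or NULL if
 * the allocation fails. */
char *att_to_cstring(const char *raw, size_t len)
{
    assert(raw != NULL || len == 0);

    /* Drop any trailing zero bytes the writer may have stored. */
    while (len > 0 && raw[len - 1] == '\0')
        len--;

    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    if (len > 0)
        memcpy(s, raw, len);
    s[len] = '\0';  /* always terminate before use as a C string */
    return s;
}
```

With this, an attribute stored as wind_time\0 (length 10) and one
stored as wind_time (length 9) both yield the same 9-character name,
which is what a lookup by variable name needs.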

> Section 7.15 (Reading and Writing Character String Values) of the Fortran
> guide
>
> http://www.unidata.ucar.edu/packages/netcdf/guidef/guidef-12.html#HEADING12-1332
>
> includes the following
>
>     In FORTRAN, fixed-length strings may be written to a netCDF
>     dataset without a terminating character, to save space.
>     Variable-length strings should follow the C convention of writing
>     strings with a terminating zero byte so that the intended length
>     of the string can be determined when it is later read by either C
>     or FORTRAN programs.
>
> I'm not sure what "variable-length strings" means in this context. Surely
> the issue is not whether the string being written has variable length
> but whether the reading application has enough info to determine
> the string length without the zero byte. For an attribute the size is
> available from NF_INQ_ATTLEN so the zero byte is unnecessary IMO.

This is discussing variables, rather than attributes.  The difference
is that a variable may hold an array of strings of different lengths,
for example an array of "station names" some of which are 5 characters
in length and some of which are shorter.  NetCDF requires you to
declare the shape of such an array to accommodate the maximum string
length, but some convention is needed to indicate the end of shorter
strings, unless you store a separate array of string lengths.  This is
not an issue with attributes: an attribute can only be scalar or
one-dimensional, so it is not intended to hold an array of strings.
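
As a rough sketch of the variable case in C, assume the character
variable has already been read into a flat buffer of nrows * width
bytes, with each shorter string ended by a zero byte.  The helper names
row_strlen and get_row_string are made up for illustration, and so is
the zero-byte convention chosen here; a file written from FORTRAN might
pad with blanks instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of one string stored in a fixed-width
 * row of a character variable.  Stops at the first zero byte, or at
 * the row width if the string fills the row and has no terminator. */
size_t row_strlen(const char *row, size_t width)
{
    assert(row != NULL);
    const char *end = memchr(row, '\0', width);
    return end ? (size_t)(end - row) : width;
}

/* Copy row i of a [nrows][width] character array into out, which must
 * hold at least width + 1 bytes, as a null-terminated C string.
 * Returns the length of the string. */
size_t get_row_string(const char *data, size_t width, size_t i, char *out)
{
    const char *row = data + i * width;
    size_t n = row_strlen(row, width);
    memcpy(out, row, n);
    out[n] = '\0';
    return n;
}
```

A full-width name (5 characters in a width-5 row) comes back whole,
and a shorter one is cut at its first zero byte.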

> I note that when I take a file with a null-terminated text attribute,
> convert it to CDL with ncdump, then reconvert it to binary with ncgen, the
> trailing zero byte is stripped off (which of course provides me with a
> simple way to "fix" IDL-generated files). This suggests to me that
> null-terminated text attributes are *not* a netCDF convention. Mind you,
> the ncdump man page calls this a bug!
>
> Any comments?

The comment in the ncdump bugs section of the ncdump man page also
refers to variables, not attributes, and merely laments the fact that
there isn't more explicit support for multidimensional arrays of
variable-length strings.

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
russ@xxxxxxxxxxxxxxxx                     http://www.unidata.ucar.edu
