
[netCDF #LWO-337450]: nc_put_att_schar vs uchar vs ubyte



Hi Steve,

> I'm using the C API with NC4/HDF5 backing.
> When I write attributes using the calls
> nc_put_att_schar
> nc_put_att_uchar
> nc_put_att_ubyte
> I find that the resulting HDF5 file attributes are identical.
> Here are the results of h5dump -p for the 3 cases:
> 
> call h5dump -p output
> ---- ----------------
> 
> nc_put_att_schar
> DATATYPE H5T_STD_I8LE
> nc_put_att_uchar
> DATATYPE H5T_STD_I8LE
> nc_put_att_ubyte
> DATATYPE H5T_STD_I8LE
> 
> Similarly the ncdump output shows:
> :attrUchar = 97b, 98b, 99b ;
> :attrSchar = 97b, 98b, 99b ;
> :attrUbyte = 97b, 98b, 99b ;

All of the following functions in the C API were added by mistake:
each one simply duplicates the function whose name has "_uchar" in
place of "_ubyte":

  nc_put_var1_ubyte
  nc_put_var_ubyte
  nc_put_vara_ubyte
  nc_put_vars_ubyte
  nc_put_varm_ubyte
  nc_get_var1_ubyte
  nc_get_var_ubyte
  nc_get_vara_ubyte
  nc_get_vars_ubyte
  nc_get_varm_ubyte
  nc_put_att_ubyte
  nc_get_att_ubyte

The last component of the name of these functions is supposed to
specify the C type of the associated data pointer, not a netCDF
primitive type, so "uchar", indicating "unsigned char", was the
appropriate component of the function name, not "ubyte".  All the
associated functions with "uchar" in their name were part of the
netCDF-3 API, so these new functions added to the netCDF-4 API were
unnecessary and did not follow the intended pattern of the API.

These "_ubyte" functions should probably be deprecated in the C API
because of the confusion they cause, but are currently left in the
netCDF-4 API as harmless aliases for the corresponding "_uchar"
functions.

As for why we have both "_uchar" and "_schar" functions, they should
actually behave differently when converting data in the C specified
type into a wider external netCDF type, as can be done with the
variable functions.  For example, assume the netCDF variable "ivar"
is defined to be of type NC_INT, and you write the byte 255 into it
with either the nc_put_var1_uchar() function or the
nc_put_var1_schar() function.  You will get two different results,
either 255 or -1 as the stored int value, because the conversion to an
int should be different depending on the intended type of the
in-memory value.

For attributes, you always specify the external type when you put the
value, but you should see the same difference as in the variable case
above when specifying an external type for the attribute that's
different from the argument type supplied.  For example

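  unsigned char uc = 255;   /* bit pattern 0xFF */
  size_t len = 1;           /* one attribute value */
  nc_put_att_uchar(ncid, varid, "att1", NC_INT, len, &uc);
  /* same byte, passed as signed char (-1) */
  nc_put_att_schar(ncid, varid, "att2", NC_INT, len, (const signed char *)&uc);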

will store two different attribute values, with att1==255 and
att2==-1.

> However for variables, the HDF5 types do seem to vary.
> 
> var type h5dump -p output
> -------- ----------------
> 
> NC_BYTE
> DATATYPE H5T_STD_I8LE
> NC_UBYTE
> DATATYPE H5T_STD_U8LE
> NC_CHAR
> DATATYPE H5T_STRING {
> STRSIZE 1;
> STRPAD H5T_STR_NULLTERM;
> CSET H5T_CSET_ASCII;
> CTYPE H5T_C_S1;
> }
> NC_STRING
> DATATYPE H5T_STRING {
> STRSIZE H5T_VARIABLE;
> STRPAD H5T_STR_NULLTERM;
> CSET H5T_CSET_ASCII;
> CTYPE H5T_C_S1;
> }
> 
> 
> Is this intentional?

Yes.  Note that there is no relation between the NC_UBYTE macro, which
specifies an external type of an unsigned 8-bit byte, and the _uchar
functions, which specify that the data argument has C type "unsigned
char".  So the NC_UBYTE and NC_BYTE macros specify two distinct netCDF
external types, but the arguments of type "unsigned char" and "signed
char" used in netCDF functions specify an internal C type for the
data.  Data of an internal type "unsigned char" can be written to an
attribute or variable of a completely different type, e.g. NC_DOUBLE
or NC_INT64.

The existence of _ubyte functions amplifies confusion over this subtle
point, which is another reason they should be deprecated.

I've attached a little C program showing the different behaviors of
nc_put_att_uchar() and nc_put_att_schar() when type conversion occurs.

--Russ
Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: LWO-337450
Department: Support netCDF
Priority: Normal
Status: Closed