[netcdfgroup] question about compound types

I'm trying to understand how netcdf handles the alignment of compound types. The empirical evidence suggests that the library assumes that the data is aligned according to the default alignment of the C compiler, regardless of what the user specifies for the offsets of the compound fields. Consider the following C program:

#include <stdlib.h>
#include <stdio.h>
#include "netcdf.h"

int
main()
{
     int ncid,typeid,varid,dimid,ndims,natts,nfields;
     size_t offset,size;
     char name[NC_MAX_NAME + 1];
     int dimids[] = {0};

     struct s1
     {
           short i;
           int j;
     } __attribute__ ((__packed__));

     struct s1 data[1];

     /* Create some phony data. */
     data[0].i = 20000;
     data[0].j = 300000;

     /* Create a file with a compound type. Write a little data. */
     nc_create("test.nc", NC_NETCDF4, &ncid);
     printf("size of compound %zu\n",sizeof(struct s1));
     nc_def_compound(ncid, sizeof(struct s1), "cmp1", &typeid);
     printf("offset 1 %zu\n",NC_COMPOUND_OFFSET(struct s1,i));
     nc_insert_compound(ncid, typeid, "i",
                            NC_COMPOUND_OFFSET(struct s1, i), NC_SHORT);
     printf("offset 2 %zu\n",NC_COMPOUND_OFFSET(struct s1,j));
     nc_insert_compound(ncid, typeid, "j",
                            NC_COMPOUND_OFFSET(struct s1, j), NC_INT);
     nc_def_dim(ncid, "phony_dim", 1, &dimid);
     nc_def_var(ncid, "phony_var", typeid, 1, dimids, &varid);
     nc_put_var(ncid, varid, data);
     nc_close(ncid);

     /* Reopen the file and read back info about compound type */
     /* Note that the size and the offsets are different than */
     /* what was specified above */
     nc_open("test.nc", NC_NOWRITE, &ncid);
     nc_inq_varid(ncid, "phony_var", &varid);
     nc_inq_var (ncid, varid, name, &typeid, &ndims, dimids, &natts);
     nc_inq_compound_size(ncid, typeid, &size);
     printf("size of compound %zu\n",size);
     nc_inq_compound_fieldoffset(ncid, typeid, 0, &offset);
     printf("offset 1 %zu\n",offset);
     nc_inq_compound_fieldoffset(ncid, typeid, 1, &offset);
     printf("offset 2 %zu\n",offset);
     nc_close(ncid);
}

When I run this (on Mac OS 10.5 using the June 1 netcdf-4.1 snapshot), I get:

size of compound 6
offset 1 0
offset 2 2
size of compound 8
offset 1 0
offset 2 4

Note that my data is packed (no padding), and I specified the offsets consistent with that packing, but when I read the data back in I find that the library actually used a different alignment (with padding consistent with the default compiler alignment).

When I raised this issue before, Ed said that packed data is not allowed, and that you must use the default alignment of the C compiler. There are at least three big problems with this policy:

1) It's very confusing that the library ignores the offsets you provide.

2) If you are compiling a program with a C compiler with a different default alignment than the one netcdf assumes, you will get surprising (and wrong) results.

3) There is no way to know what alignment netcdf is going to use, short of creating a compound type and then reading it back in. Since the user can't count on the specified offsets being honored, there's no way to know ahead of time how to lay out the data passed to nc_put_var.

Shouldn't netcdf respect what the user provides for offsets, even if it doesn't agree with the default compiler alignment? I know this makes it hard to read and write C structs - in that case the user must interpret the data on his or her own using the specified offsets. This is how the HDF5 library behaves, and it seems to me that if netcdf deviates from this it will cause all kinds of problems for users down the line when trying to read and write data that doesn't conform to the default alignment expected by the netcdf library. It certainly has caused a lot of headaches for me already.

-Jeff


