[netcdfgroup] question about compound types

To: netcdfgroup@xxxxxxxxxxxxxxxx
Subject: [netcdfgroup] question about compound types
From: Jeff Whitaker <jswhit@xxxxxxxxxxx>
Date: Tue, 02 Jun 2009 05:53:49 -0600

I'm trying to understand how netcdf handles the alignment of compound types. The empirical evidence suggests that the library assumes that the data is aligned according to the default alignment of the C compiler, regardless of what the user specifies for the offsets of the compound fields. Consider the following C program:


#include <stdlib.h>
#include <stdio.h>
#include "netcdf.h"

int
main()
{
     int ncid,typeid,varid,dimid,ndims,natts,nfields;
     size_t offset,size;
     char name[NC_MAX_NAME + 1];
     int dimids[] = {0};

     struct s1
     {
           short i;
           int j;
     } __attribute__ ((__packed__));

     struct s1 data[1];

     /* Create some phony data. */
     data[0].i = 20000;
     data[0].j = 300000;

     /* Create a file with a compound type. Write a little data. */
     nc_create("test.nc", NC_NETCDF4, &ncid);
     /* sizeof and NC_COMPOUND_OFFSET yield size_t, so print with %zu */
     printf("size of compound %zu\n",sizeof(struct s1));
     nc_def_compound(ncid, sizeof(struct s1), "cmp1", &typeid);
     printf("offset 1 %zu\n",NC_COMPOUND_OFFSET(struct s1,i));
     nc_insert_compound(ncid, typeid, "i",
                            NC_COMPOUND_OFFSET(struct s1, i), NC_SHORT);
     printf("offset 2 %zu\n",NC_COMPOUND_OFFSET(struct s1,j));
     nc_insert_compound(ncid, typeid, "j",
                            NC_COMPOUND_OFFSET(struct s1, j), NC_INT);
     nc_def_dim(ncid, "phony_dim", 1, &dimid);
     nc_def_var(ncid, "phony_var", typeid, 1, dimids, &varid);
     nc_put_var(ncid, varid, data);
     nc_close(ncid);

     /* Reopen the file and read back info about compound type */
     /* Note that the size and the offsets are different than */
     /* what was specified above */
     nc_open("test.nc", NC_NOWRITE, &ncid);
     nc_inq_varid(ncid, "phony_var", &varid);
     nc_inq_var (ncid, varid, name, &typeid, &ndims, dimids, &natts);
     nc_inq_compound_size(ncid, typeid, &size);
     printf("size of compound %zu\n",size);
     nc_inq_compound_fieldoffset(ncid, typeid, 0, &offset);
     printf("offset 1 %zu\n",offset);
     nc_inq_compound_fieldoffset(ncid, typeid, 1, &offset);
     printf("offset 2 %zu\n",offset);
     nc_close(ncid);
}

When I run this I get (on Mac OS 10.5 using the June 1 netcdf-4.1 snapshot)

size of compound 6
offset 1 0
offset 2 2
size of compound 8
offset 1 0
offset 2 4

Note that my data is packed (no padding), and I specified the offsets consistent with that packing, but when I read the data back in I find that the library actually used a different alignment (with padding consistent with the default compiler alignment).
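
The difference between the two layouts can be seen without netcdf at all. The following is a minimal sketch, assuming a GCC- or Clang-style compiler that accepts `__attribute__ ((__packed__))`; the names `s1_natural`, `s1_packed` and `print_layouts` are made up for illustration and are not part of the program above.

```c
#include <stddef.h>
#include <stdio.h>

/* Same fields as struct s1, laid out with the compiler's default alignment. */
struct s1_natural
{
    short i;
    int j;
};

/* Same fields again, with padding suppressed (GCC/Clang extension). */
struct s1_packed
{
    short i;
    int j;
} __attribute__ ((__packed__));

/* Print the size and field offsets of both layouts. */
void print_layouts(void)
{
    printf("natural: size %zu, offset i %zu, offset j %zu\n",
           sizeof(struct s1_natural),
           offsetof(struct s1_natural, i),
           offsetof(struct s1_natural, j));
    printf("packed:  size %zu, offset i %zu, offset j %zu\n",
           sizeof(struct s1_packed),
           offsetof(struct s1_packed, i),
           offsetof(struct s1_packed, j));
}
```

With a 2-byte short and a 4-byte int, the packed layout should correspond to the first three lines of output above and the default layout to the last three, but the default layout depends on the compiler and platform.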

When I raised this issue before, Ed said that packed data is not allowed, and you must use the default alignment of the C compiler. There are at least three big problems with this policy:


1) It's very confusing that the library ignores the offsets you provide.

2) If you are compiling a program with a C compiler whose default alignment differs from the one netcdf assumes, you will get surprising (and wrong) results.

3) There is no way to know what alignment netcdf is actually going to use, short of creating a compound type and then reading it back in. Since the user can't count on the specified offsets being used, there's no way to know ahead of time how to lay out the data passed to nc_put_var.

Shouldn't netcdf respect what the user provides for offsets, even if it doesn't agree with the default compiler alignment? I know this makes it hard to read and write C structs - in that case the user must interpret the data on his or her own using the specified offsets. This is how the HDF5 library behaves, and it seems to me that if netcdf deviates from this it will cause all kinds of problems for users down the line when trying to read and write data that doesn't conform to the default alignment expected by the netcdf library. It certainly has caused a lot of headaches for me already.
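
To illustrate what "interpreting the data using the specified offsets" could look like, here is a minimal sketch in plain C that copies fields in and out of a raw byte buffer with memcpy instead of relying on a struct layout. The names `REC_OFF_I`, `REC_OFF_J`, `REC_SIZE`, `rec_pack` and `rec_unpack` are hypothetical, and the offsets are simply the packed ones from the example above; this is not something the netcdf library provides.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical packed record: a short at byte 0 followed directly by an int. */
enum
{
    REC_OFF_I = 0,
    REC_OFF_J = sizeof(short),
    REC_SIZE = sizeof(short) + sizeof(int)
};

/* Copy the two fields into buf at the explicit offsets.
   buf must hold at least REC_SIZE bytes. */
void rec_pack(unsigned char *buf, short i, int j)
{
    memcpy(buf + REC_OFF_I, &i, sizeof i);
    memcpy(buf + REC_OFF_J, &j, sizeof j);
}

/* Copy the two fields back out of buf using the same offsets.
   memcpy avoids unaligned reads through a cast pointer. */
void rec_unpack(const unsigned char *buf, short *i, int *j)
{
    memcpy(i, buf + REC_OFF_I, sizeof *i);
    memcpy(j, buf + REC_OFF_J, sizeof *j);
}
```

If the library honored user-supplied offsets, a buffer filled by something like rec_pack is what would be handed to nc_put_var, with the same offsets passed to nc_insert_compound. Whether netcdf accepts that is exactly the question here.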


-Jeff
