These rules are taken from the HDF5 code. In netCDF they are used by ncgen4 and by the (soon to be released) DAP->netcdf-4 translator.
The key to the layout is the notion of alignment. The alignment of a primitive data type (e.g. char, short, int, etc.) is the memory boundary on which all instances of the type should occur. As a rule, the alignment of a primitive type is equal to its size as reported by sizeof(). Thus, the alignment of a char is 1, a short is 2, and so on. Note that the alignment of long depends on the machine: on 32-bit machines it is 4, and on 64-bit machines where long is 8 bytes it is 8.
However, the above rule is not always correct. On some machines, the alignment boundary may be smaller than sizeof() indicates. For example, on a SPARC, double values can be aligned on a 4-byte boundary instead of the expected 8-byte boundary. This means the alignment must be computed on a per-machine basis (though hopefully not on a per-compiler basis). To compute these true alignments, one must construct the following set of C structs.
    struct S { char f1; T f2; };
T ranges over all of the possible primitive types: char, short, int, float, double, etc. For each such struct, the value of the offsetof(struct S, f2) macro (from stddef.h) must be calculated and used as the alignment for type T. The offset of a field in a C struct is the relative address of the field from the beginning of the struct, where the initial offset is zero. Thus, on a SPARC, offsetof(struct S, f2) when T = double is 4, whereas on a 64-bit X86 machine, offsetof(struct S, f2) when T = double is 8. This value is the alignment that must be used when computing struct offsets as defined below.
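The probe-struct technique described above can be sketched in C as follows. This is a minimal illustration, not the HDF5 or netCDF implementation: the macro names (DEFINE_PROBE, PROBE_ALIGNMENT, TABLE_ENTRY), the table, and the measured_alignment() function are hypothetical names introduced for this example, and the list of types is only a sample.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */
#include <string.h>  /* strcmp */

/* Declare one probe struct per type: a single char followed by a T. */
#define DEFINE_PROBE(TAG, T) struct probe_##TAG { char f1; T f2; }

/* The measured alignment of T is the offset of f2 in its probe struct. */
#define PROBE_ALIGNMENT(TAG) offsetof(struct probe_##TAG, f2)

DEFINE_PROBE(char, char);
DEFINE_PROBE(short, short);
DEFINE_PROBE(int, int);
DEFINE_PROBE(long, long);
DEFINE_PROBE(longlong, long long);
DEFINE_PROBE(float, float);
DEFINE_PROBE(double, double);
DEFINE_PROBE(pointer, void *);

/* Hypothetical record pairing a type name with its size and alignment. */
struct type_alignment {
    const char *name;
    size_t size;
    size_t alignment;
};

#define TABLE_ENTRY(TAG, T) { #T, sizeof(T), PROBE_ALIGNMENT(TAG) }

static const struct type_alignment alignment_table[] = {
    TABLE_ENTRY(char, char),
    TABLE_ENTRY(short, short),
    TABLE_ENTRY(int, int),
    TABLE_ENTRY(long, long),
    TABLE_ENTRY(longlong, long long),
    TABLE_ENTRY(float, float),
    TABLE_ENTRY(double, double),
    TABLE_ENTRY(pointer, void *),
};

/* Return the measured alignment for a type name, or 0 if it is unknown. */
size_t measured_alignment(const char *type_name)
{
    size_t count = sizeof(alignment_table) / sizeof(alignment_table[0]);
    assert(type_name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(alignment_table[i].name, type_name) == 0)
            return alignment_table[i].alignment;
    }
    return 0;
}
```

Because offsetof is evaluated by the compiler for the target machine, the table holds whatever alignments that machine and compiler actually use. For example, measured_alignment("double") returns the value that the text above gives as 4 on a SPARC and 8 on a 64-bit X86 machine.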
To test if a primitive type is properly aligned, the following should be true, where A is the address and alignment is the alignment of the primitive type.
((unsigned long)A) % alignment == 0
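The same test can be written as a small C function. This is a sketch under one assumption that differs from the expression above: it casts the address to uintptr_t (from stdint.h) rather than unsigned long, because unsigned long can be narrower than a pointer on some platforms, such as 64-bit Windows. The function name is_aligned is hypothetical.

```c
#include <assert.h>
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uintptr_t */

/* Return nonzero if address falls on a multiple of alignment. */
int is_aligned(const void *address, size_t alignment)
{
    assert(alignment > 0);  /* a zero alignment would divide by zero */
    return ((uintptr_t)address) % alignment == 0;
}
```

A caller would typically pass the address of a field together with the alignment measured for that field's type.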
Given this, the rules for layout of a C struct are as follows.
- The initial offset is zero.
- Given a current offset, O, and a field F whose alignment is A, the offset of F is O + P, where P is the padding that must be added to make sure that F is aligned to A. P is defined as (O % A == 0) ? 0 : (A - (O % A)).
- After adding field F, the offset is then O = O + P + sizeof(F). This is the same as O + P + A only when the size of F equals its alignment.
- One more rule is needed to complete the description. It appears that the alignment of a nested structure is the alignment of the most stringent field in the nested structure. "Stringent" effectively means the largest alignment.
- The size of a struct is the offset after the last field is added, rounded up to a multiple of the most stringent field alignment.
More simply put, when adding a field, bump the offset until the offset is at the alignment required by the field.
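The layout rules can be sketched as a short C routine. This is an illustrative sketch rather than the HDF5 or netCDF code: the struct field type and the functions padding_for() and layout_struct() are hypothetical names. Note that the sketch advances the offset by each field's size, which matches the field's alignment only when the two happen to be equal.

```c
#include <assert.h>
#include <stddef.h>  /* size_t, NULL */

/* Hypothetical description of one field: size and alignment are inputs,
   offset is filled in by layout_struct(). */
struct field {
    size_t size;
    size_t alignment;
    size_t offset;
};

/* Padding P needed to bring offset up to a multiple of alignment. */
static size_t padding_for(size_t offset, size_t alignment)
{
    assert(alignment > 0);
    return (offset % alignment == 0) ? 0 : alignment - (offset % alignment);
}

/* Assign an offset to every field and return the total struct size.
   If struct_alignment is not NULL, it receives the most stringent
   (largest) field alignment, which is the alignment of the struct. */
size_t layout_struct(struct field *fields, size_t nfields,
                     size_t *struct_alignment)
{
    size_t offset = 0;         /* the initial offset is zero */
    size_t max_alignment = 1;

    for (size_t i = 0; i < nfields; i++) {
        offset += padding_for(offset, fields[i].alignment);
        fields[i].offset = offset;
        offset += fields[i].size;  /* advance past the field itself */
        if (fields[i].alignment > max_alignment)
            max_alignment = fields[i].alignment;
    }

    /* Round the final offset up to the most stringent alignment. */
    offset += padding_for(offset, max_alignment);
    if (struct_alignment != NULL)
        *struct_alignment = max_alignment;
    return offset;
}
```

For a struct with fields of (size, alignment) = (1, 1), (8, 8), (2, 2), the routine assigns offsets 0, 8, and 16 and returns a size of 24. With the SPARC-style double from the text, fields (1, 1) and (8, 4) get offsets 0 and 4 and the size is 12. A nested structure can be handled by laying it out first and then passing its returned size and alignment as a single field of the enclosing struct.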