C Struct Layout Rules

The layout of the fields in a C struct has come up a number of times recently, so it seems appropriate to document the apparent layout rules. This matters to developers who access netcdf-4 datatypes from a language other than C, such as Python or Fortran.

These rules are taken from the HDF5 code. In netcdf they are used in ncgen4 and in the (soon to be released) DAP->netcdf-4 translator.

The key to the layout is the notion of alignment. The alignment of a primitive data type (e.g. char, short, int, etc.) is the memory boundary on which all instances of the type should occur. As a rule, the alignment of a primitive type is equal to its size as reported by sizeof(). Thus, the alignment of a char is 1, a short is typically 2, and so on. Note that the alignment of long depends on the machine and its data model. On 32-bit machines it is typically 4; on most 64-bit Unix-like machines it is 8, although some 64-bit platforms (64-bit Windows, for example) keep long at 4 bytes.

However, the above rule is not always correct. On some machines, the alignment boundary may be smaller than sizeof() indicates. For example, on a SPARC, double values can be aligned on a 4-byte boundary instead of the expected 8-byte boundary. This means the alignment must be computed on a per-machine basis (though hopefully not on a per-compiler basis). To compute these true alignments, one must construct the following set of C structs.

    struct S { char f1; T f2; };

T ranges over all of the possible primitive types: char, short, int, float, double, etc. For each such struct, the value of the offsetof(S,f2) macro (from stddef.h) must be calculated and used as the alignment for type T. The offset of a field in a C struct is the address of the field relative to the beginning of the struct, where the first field has offset zero. Thus, on a SPARC, offsetof(S,f2) when T = double is 4, whereas on a 64-bit x86 machine, offsetof(S,f2) when T = double is 8. This value is the alignment that must be used when computing struct offsets as defined below.
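The probe structs described above can be written out directly. The following is a minimal sketch, not the HDF5 or netcdf code: the names probe_*, alignment_table, and alignment_of are hypothetical and chosen only for illustration, and the list of types is deliberately short.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <string.h>   /* strcmp */

/* One probe struct per primitive type T, each of the form
   struct S { char f1; T f2; }. The names are illustrative only. */
struct probe_char   { char f1; char   f2; };
struct probe_short  { char f1; short  f2; };
struct probe_int    { char f1; int    f2; };
struct probe_long   { char f1; long   f2; };
struct probe_float  { char f1; float  f2; };
struct probe_double { char f1; double f2; };

struct alignment_entry { const char *name; size_t alignment; };

/* offsetof(S, f2) is taken as the alignment of T on this machine. */
static const struct alignment_entry alignment_table[] = {
    { "char",   offsetof(struct probe_char,   f2) },
    { "short",  offsetof(struct probe_short,  f2) },
    { "int",    offsetof(struct probe_int,    f2) },
    { "long",   offsetof(struct probe_long,   f2) },
    { "float",  offsetof(struct probe_float,  f2) },
    { "double", offsetof(struct probe_double, f2) },
};

/* Returns the measured alignment for a type name,
   or 0 if the name is not in the table. */
size_t alignment_of(const char *type_name)
{
    size_t n = sizeof(alignment_table) / sizeof(alignment_table[0]);
    assert(type_name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(alignment_table[i].name, type_name) == 0)
            return alignment_table[i].alignment;
    }
    return 0;
}
```

Calling alignment_of("double") then reports whatever the compiler actually chose for the machine at hand, so no value needs to be hard-coded per platform.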

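To test whether an instance of a primitive type is properly aligned, the following expression should be true, where A is the address of the instance and alignment is the alignment of the primitive type. The cast uses uintptr_t (from stdint.h) rather than unsigned long, because unsigned long is not wide enough to hold a pointer on every 64-bit platform.

 ((uintptr_t)A) % alignment == 0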
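As a small illustration, the test can be wrapped in a helper function. This is a sketch under the assumption that uintptr_t is available (it is optional in the C standard but provided by the common compilers); the name is_aligned is hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uintptr_t */

/* Returns 1 if address is a multiple of alignment, 0 otherwise. */
int is_aligned(const void *address, size_t alignment)
{
    assert(alignment > 0);   /* a zero alignment is meaningless */
    return ((uintptr_t)address % alignment) == 0;
}
```

Every address satisfies is_aligned(p, 1), which matches the rule that char has alignment 1.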

Given this, the rules for layout of a C struct are as follows.

  1. The initial offset is zero
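  2. Given a current offset, O, and a field F whose alignment is A, the offset of F is O + P, where P is the padding that must be added to make sure that F is aligned to A. P is defined as
    (O % A == 0)?0:(A - (O % A)).
  3. After adding field F, the offset is then O = O + P + S, where S is the size of F as given by sizeof(). For most primitive types S equals A, but not when the alignment is smaller than the size (as with double on a SPARC) or when F is a nested struct.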
  4. One more rule is needed to complete the description. It appears that the alignment of a nested structure is the alignment of the most stringent field in the nested structure. "Stringent" effectively means the largest alignment.
  5. The size of a struct is the offset after the last field is added rounded up to a multiple of the most stringent field alignment.

More simply put, when adding a field, bump the offset until the offset is at the alignment required by the field.
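The rules can be sketched as a short routine. This is an illustration only, not the HDF5 or netcdf implementation; struct field_type, padding_for, and layout_struct are hypothetical names. Note that the sketch advances the offset by each field's size, which coincides with its alignment for most primitive types but not for a nested struct or for a type whose alignment is smaller than its size.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */

/* Size and alignment of one field (illustrative type). */
struct field_type { size_t size; size_t alignment; };

/* Padding P needed to bring offset up to a multiple of alignment. */
size_t padding_for(size_t offset, size_t alignment)
{
    assert(alignment > 0);
    return (offset % alignment == 0) ? 0 : alignment - (offset % alignment);
}

/* Fills offsets[i] with the offset of fields[i] and returns the struct
   size. If struct_alignment is not NULL, it receives the alignment of
   the struct as a whole, for use when the struct is itself nested. */
size_t layout_struct(const struct field_type *fields, size_t nfields,
                     size_t *offsets, size_t *struct_alignment)
{
    size_t offset = 0;       /* the initial offset is zero */
    size_t strictest = 1;    /* largest field alignment seen so far */

    for (size_t i = 0; i < nfields; i++) {
        offset += padding_for(offset, fields[i].alignment);  /* pad */
        offsets[i] = offset;                                 /* place F */
        offset += fields[i].size;                            /* step past F */
        if (fields[i].alignment > strictest)
            strictest = fields[i].alignment;
    }

    /* Round the final offset up to the most stringent alignment. */
    offset += padding_for(offset, strictest);
    if (struct_alignment != NULL)
        *struct_alignment = strictest;
    return offset;
}
```

For example, for a char followed by a 4-byte int and a 2-byte short, each aligned to its size, the routine yields offsets 0, 4, and 8 and a total size of 12. To handle a nested struct, lay out the inner struct first and pass its returned size and alignment as one field of the outer struct.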

Unidata Developer's Blog
A weblog about software development by Unidata developers*