NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.
I'm forwarding the following reply from Jeff Long about SILO, for your information ...

--Russ

Russ,

Due to problems on my end, the following email message from two weeks ago got bounced back, and I didn't notice it until today. Sorry about the delay.

Russ,                                                  May 7, 1992

I appreciate your comments and questions. I am mailing you the SILO document, just so you have a complete reference. I am happy to answer any questions via email, though, since that is easier.

Since you do not yet have the document, let me fill you in on some of the background behind SILO. SILO is currently implemented on top of the Portable Database Library (PDBLib) written by Stewart Brown of LLNL. We did this as a stop-gap measure; our long-term goal is to drop the SILO library altogether, and use the HDF/netCDF merge (assuming that the capabilities provided by the SILO extensions are available.)

My immediate goal is to have the HDF/netCDF community agree on a directory and object capability (and programming interface). When that is done, I will modify the SILO interface to match the agreed upon interface. We will then use the SILO/PDBLib combination until such time that the HDF/netCDF project has a library for us to use. We have bought into the idea of a standard INTERFACE, and are willing to switch underlying databases.

DIRECTORIES
-----------

I agree with your assessment that directories are more fundamental. I will try to answer the questions regarding them first.

In SILO, every file is born with a root directory. At present, this directory is called "RootDir", and its ID is 0. Since SILO is trying to provide a Unix-like hierarchy, though, I'm considering changing the root directory name to "/". So, to answer your question, there IS a default directory even if the user has not explicitly called ncdirdef().

I have not yet defined a Fortran interface for the directory functions.
It appears that netCDF is using at most six characters for Fortran subroutine names, which means that the various primitives are limited to one distinct character ('d' for dimension, 'v' for variable, etc.) Directories would therefore require a notation other than 'd'; perhaps 'r':

    ncrdef, ncrget, ncrid, ncrinq, ncrlst, ncrset

GROUPS (aka OBJECTS)
--------------------

The term 'object' is indeed an overused term. I propose that this capability be referred to as 'groups' from here on.

My previous message was probably unclear regarding some aspects of groups. Let me try to clarify. Groups within SILO are treated like dimensions, variables, and attributes in that they are scoped to directories. That is, a SILO file can contain multiple objects with the same name, provided they appear in different directories. (In SILO, only directory ID's are global.) Groups are unique in that they can be composed of components which reside in other directories. Group components can be shared between multiple groups (this is very useful, and frequently used.)

To answer your questions regarding groups directly:

  o The 'type' of an object is like a tag in the HDF world. It indicates what the group contains; current group types are: quad-mesh, unstructured-mesh, and so on. SILO itself does not distinguish between the different types of groups; we have a higher level of functions (called SLIDE) which does that.

  o Group ID's are unique only within directories.

  o We have not yet encountered a need to edit (add to, delete from) groups.

  o The same component can be in multiple groups.
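
The scoping rules above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names (Component, Group, Directory), the directory name "run1", and the method define_group are hypothetical and are not part of SILO's interface. It shows two groups with the same name in different directories sharing one component that lives in a third place.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; this is not SILO's actual API.

@dataclass(frozen=True)
class Component:
    name: str
    kind: str        # e.g. "var" or "dim"
    comp_id: int     # unique only within its parent directory
    parent_dir: int  # directory IDs are the only global IDs

@dataclass
class Group:
    name: str
    group_type: str                       # a tag such as "quad-mesh"
    components: list = field(default_factory=list)

@dataclass
class Directory:
    dir_id: int
    name: str
    groups: dict = field(default_factory=dict)  # group name -> Group

    def define_group(self, group):
        # Names are scoped to the directory, so only clashes
        # inside this directory are rejected.
        if group.name in self.groups:
            raise ValueError(f"group {group.name!r} already exists in {self.name!r}")
        self.groups[group.name] = group

root = Directory(0, "RootDir")
sub = Directory(1, "run1")  # assumed second directory

# One component, stored in directory 1, shared by two groups.
x_coords = Component("X Coords", "var", 15, parent_dir=1)

root.define_group(Group("Sample", "quad-mesh", [x_coords]))
sub.define_group(Group("Sample", "unstructured-mesh", [x_coords]))
```

Both directories now hold a group called "Sample", and each refers to the same component object rather than a copy.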
Here is a diagram of what a group might look like:

    Group Name   = "Sample"
    Group Type   = SLIDE_QUADMESH
    # Components = 5

    Component     Component   Component   Component
    Name          Type        ID          Parent
    ---------------------------------------------
    "X Coords"    var         15          1    {2D var, in dir 1}
    "Y Coords"    var         16          2    {2D var, in dir 2}
    "Num Dims"    dim          8          0    {dim with size = 2}
    "Dims"        var          2          0    {array of dim values}
    "Coord Sys"   var          6          0    {used like attribute}

<< Note that the component types are actually defined constants, and will be of type integer.>>

I am intrigued by your suggestion of using attributes to define groups, rather than extending the interface. Because there can be multiple objects with the same name within a file, I don't believe we could use global attributes to describe a group. Maybe I'm missing something. Anyway, I have come up with a couple of variations on the attribute scheme which I'd like to throw out for discussion. I will use the object described above for illustrating each.

1. Adopt a convention such that variables whose name begins with "GROUP_" are in fact group variables. The value of the variable would be some kind of string representation which describes the dimensions, variables, and groups which comprise the group. An attribute could be used for defining the type. For example:

    GROUP_Sample = "X Coords,var,15,1;Y Coords,var,16,2;..."

2. Like 1 above, but the value of the variable would be a scalar containing the type of group (e.g., SLIDE_QUADMESH). Attributes of this variable whose names begin with "GROUP_" would define the group components. This could be done in one of several ways, including:

    Variable "GROUP_Sample" = SLIDE_QUADMESH

       GROUP_Sample
       Attribute Name                Attribute Value
       --------------                ---------------
    a. "GROUP_X Coords_parent"       1
       "GROUP_X Coords_id"           15
       "GROUP_X Coords_type"         var
          .
          .
          .
    b. "GROUP_X Coords"              {1,15,var}
          .
          .
          .
    c. "GROUP_X Coords_var_dir1"     15

3. Rather than using attributes, use a pair of variables to define a group.
One variable would define the component names, the other variable would define the remaining component data. For example:

    "GROUPNAMES_Sample"   ";X Coords;Y Coords;Num Dims;Dims;CoordSys;"
    "GROUPDATA_Sample"    {var,15,1,var,16,2,dim,8,0,var,2,0,var,6,0}

The names are packed into a single character array, where the first character of the array is to be used as the field delimiter (';' in the example.) The other data is stored in either an nx3 array, or a vector of length n*3.

Regardless of how objects are represented, there is still the issue of whether an interface is provided which builds these group variables, or if the user does it explicitly (perhaps via a higher-level interface). Any comments?

GENERAL
-------

Regarding unlimited dimensions: we do not currently use that feature, so I basically punted when I said that there is only one per file. Whatever seems reasonable to you and other netCDF users is okay with me.

Likewise, I punted when it came to define vs. data mode. As PDBLib does not really adhere to that model, I have glossed over the mode issue in the current version of SILO. I want SILO to do things the netCDF way, however, so I expect that will change.

As you found, I am not yet on the netcdfgroup mailing list. The netCDF document I was working from was marked Version 1.06, which apparently is quite out of date. I will attempt to get the latest document, as well as join the mailing list. I expect SILO to be changed to correspond to the latest version.

----
Jeff Long                 jwlong@xxxxxxxx
LLNL
PO Box 808, L-35
Livermore, CA 94551

>From owner-netcdf-hdf@xxxxxxxxxxxxxxxx 15 2004 Jul -0600 13:42:24
Message-ID: <wrxd62xcb1b.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 15 Jul 2004 13:42:24 -0600
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: questions about compression...
Received: (from majordo@localhost)
    by unidata.ucar.edu (UCAR/Unidata) id i6FJgQ3k029379
    for netcdf-hdf-out; Thu, 15 Jul 2004 13:42:26 -0600 (MDT)
Received: from rodney.unidata.ucar.edu (rodney.unidata.ucar.edu [128.117.140.88])
    by unidata.ucar.edu (UCAR/Unidata) with ESMTP id i6FJgPaW029375
    for <netcdf-hdf@unidata>; Thu, 15 Jul 2004 13:42:25 -0600 (MDT)
Organization: UCAR/Unidata
Keywords: 200407151942.i6FJgPaW029375
Lines: 22
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-netcdf-hdf@xxxxxxxxxxxxxxxx
Precedence: bulk
Reply-To: netcdf-hdf@xxxxxxxxxxxxxxxx

Howdy HDF5 People!

I am looking at your docs to try and learn more about compression and what it means. I see the following:

    herr_t H5Pset_deflate (hid_t plist_id, int level)

    These functions set or query the deflate level of dataset creation property list plist_id. The H5Pset_deflate sets the compression method to H5Z_DEFLATE and sets the compression level to some integer between one and nine (inclusive). One results in the fastest compression while nine results in the best compression ratio. The default value is six if H5Pset_deflate isn't called.

Does this mean that to compress some chunked dataset, I set its compression method to H5Z_DEFLATE and that turns on compression?

I can't seem to find much info about compression, am I missing some?

Ed
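
The level argument in the quoted documentation is a deflate level, and the HDF5 deflate filter is built on zlib. As a rough sketch of what levels one through nine mean, and not an example of the HDF5 API itself, Python's zlib module accepts the same range. The sample data is made up for illustration, and the sizes it produces say nothing about any real dataset.

```python
import zlib

# Highly repetitive sample data (an assumption for illustration only).
data = b"temperature,pressure,humidity\n" * 2000

sizes = {}
for level in (1, 6, 9):
    packed = zlib.compress(data, level)
    # Deflate is lossless: decompressing returns the original bytes.
    assert zlib.decompress(packed) == data
    sizes[level] = len(packed)

for level, size in sizes.items():
    print(f"level {level}: {len(data)} -> {size} bytes")
```

Lower levels spend less effort searching for repeated byte sequences and higher levels spend more, which is the speed versus ratio trade-off the quoted text describes.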