Re: [netcdfgroup] Problem using h5repack with netcdf-4 files

Hi Greg,

> I am trying to use the h5repack utility with netcdf-4 files and it 
> doesn't seem to work.  The data below shows this on a small datafile 
> included in ncdump/small2.nc.  Basically:
> 
>   * Convert the netcdf-3 format small2.nc to netcdf-4 format using
>     nccopy -k 4
>   * Use h5repack on that file to create a third file.
>   * ncdump can access the original and nccopy'd file
>   * ncdump gives an HDF error when accessing the h5repack'd file
>   * A diff of the h5dump output shows that h5repack seems to replace
>     DATASET with GROUP...
> 
> Are there options to h5repack that can be used to produce a valid netcdf 
> file?

I don't know the answer to that question, but the nccopy in netCDF-4.2
has new options for rechunking data for faster access, and has achieved
order-of-magnitude speedups in this area.  In one test case, it was
significantly faster than h5repack.  I can send details if you're
interested.  The nccopy options are:

nccopy: nccopy [-k n] [-d n] [-s] [-c chunkspec] [-u] [-m n] [-h n] [-e n] infile outfile
  [-k n]    specify kind of netCDF format for output file, default same as input
            1 classic, 2 64-bit offset, 3 netCDF-4, 4 netCDF-4 classic model
  [-d n]    set deflation compression level, default same as input (0=none 9=max)
  [-s]      add shuffle option to deflation compression
  [-c chunkspec] specify chunking for dimensions, e.g. "dim1/N1,dim2/N2,..."
  [-u]      convert unlimited dimensions to fixed-size dimensions in output copy
  [-m n]    set size in bytes of copy buffer, default is 5000000 bytes
  [-h n]    set size in bytes of chunk_cache for chunked variables
  [-e n]    set number of elements that chunk_cache can hold
  infile    name of netCDF input file
  outfile   name for netCDF output file
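
As a rough sketch of how these options combine, something like the following would copy your small2.nc to a compressed, rechunked file in one step. The output file name, the deflation level, and the chunk sizes are made up for illustration and are not tuned values. This only shows the nccopy route; it says nothing about whether h5repack can be made to work.

```shell
#!/bin/sh
# Illustrative sketch only: file names and chunk sizes are assumptions.
in=small2.nc                 # input file from the report above
out=small2_chunked.nc4       # hypothetical output name

# -k 4       netCDF-4 classic model output
# -d 1 -s    deflation level 1 with the shuffle option
# -c t/1,m/5 chunk sizes along dimensions t and m (illustrative values)
cmd="nccopy -k 4 -d 1 -s -c t/1,m/5 $in $out"
echo "$cmd"

# Run the copy only if nccopy is installed and the input file exists,
# then print the header of the result, including special attributes.
if command -v nccopy >/dev/null 2>&1 && [ -f "$in" ]; then
    $cmd && ncdump -hs "$out"
fi
```

The -s flag to ncdump prints the special virtual attributes (chunk sizes, deflation level, shuffle), so you can check that the copy was stored as requested.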

I'll be announcing netCDF 4.2 availability later this afternoon.

--Russ

> --Greg
> 
> s927819>  nccopy -k 4 small2.nc small2.nc4
> s927819>  h5repack small2.nc4 small2.nc4.rep
> s927819>  ./ncdump small2.nc4
> netcdf small2 {
> dimensions:
>          t = UNLIMITED ; // (1 currently)
>          m = 5 ;
> variables:
>          byte b(t, m) ;
> data:
> 
>   b =
>    1, 2, 3, 4, 5 ;
> }
> s927819>  ./ncdump small2.nc4.rep
> /Users/gdsjaar/src/SEACAS/TPL/netcdf/netcdf-4.2/ncdump/.libs/ncdump: small2.nc4.rep: NetCDF: HDF error
> s927819>  h5dump small2.nc4>good.out
> s927819>  h5dump small2.nc4.rep>bad.out
> s927819>  diff -c good.out bad.out
> *** good.out    2012-03-20 09:52:47.000000000 -0600
> --- bad.out     2012-03-20 09:52:53.000000000 -0600
> ***************
> *** 1,4 ****
> ! HDF5 "small2.nc4" {
>    GROUP "/" {
>       ATTRIBUTE "_nc3_strict" {
>          DATATYPE  H5T_STD_I32LE
> --- 1,4 ----
> ! HDF5 "small2.nc4.rep" {
>    GROUP "/" {
>       ATTRIBUTE "_nc3_strict" {
>          DATATYPE  H5T_STD_I32LE
> ***************
> *** 17,23 ****
>             DATATYPE  H5T_VLEN { H5T_REFERENCE { H5T_STD_REF_OBJECT }}
>             DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
>             DATA {
> !          (0): (DATASET 255 /t ), (DATASET 547 /m )
>             }
>          }
>       }
> --- 17,23 ----
>             DATATYPE  H5T_VLEN { H5T_REFERENCE { H5T_STD_REF_OBJECT }}
>             DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
>             DATA {
> !          (0): (GROUP 0), (GROUP 0)
>             }
>          }
>       }
> ***************
> *** 59,65 ****
>             DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
>             DATA {
>             (0): {
> !                DATASET 969 /b ,
>                   1
>                }
>             }
> --- 59,65 ----
>             DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
>             DATA {
>             (0): {
> !                GROUP 140734799797120,
>                   1
>                }
>             }
> ***************
> *** 102,108 ****
>             DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
>             DATA {
>             (0): {
> !                DATASET 969 /b ,
>                   0
>                }
>             }
> --- 102,108 ----
>             DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
>             DATA {
>             (0): {
> !                GROUP 140734799797120,
>                   0
>                }
>             }
