Thanks again.
I'll have a look at pnetcdf.
Another reason why we moved towards HDF5 is that, as far as I know, it
should be able to exploit different levels of the memory hierarchy in HPC
simulations. Could pnetcdf do that as well?
Besides that, I'd really like some hints. Why could netcdf be better than
HDF5, or vice versa? Please do your worst.
For the NF90_unlimited, we are already using it in time-dependent simulations
in a way similar to the one you suggest. In the present case, instead, I'm just
filling a huge complex matrix, and the interruption usually happens because
there are limits on the simulation time. I'd really need to check which
elements were filled and which were not, without any prior knowledge of the
status.
Since you mentioned it, I'm very interested in the storage of sparse matrices.
My huge matrix is indeed quite sparse. How does that work?
Best,
D
On Sun, May 3, 2020 at 12:48 AM +0200, "Wei-Keng Liao"
<wkliao@xxxxxxxxxxxxxxxx> wrote:
Hi, Dave
Thanks for following up with the correct information about the dimension
objects.
I admit that I am not familiar with the NetCDF4 dimension representation in
HDF5.
Wei-keng
> On May 2, 2020, at 5:28 PM, Dave Allured - NOAA Affiliate wrote:
>
> Wei-keng, thanks for the info on the latest release. Minor detail, I found
> that hidden dimension scales are still stored as arrays, but the arrays are
> left unpopulated. HDF5 stores these as sparse, which means no wasted space
> in arrays that are never written.
>
> For Davide, I concur with Wei-keng that netcdf-C 4.7.4 is okay for your
> purpose, and should not store wasted space. Version 4.7.3 behaves the same
> as 4.7.4.
>
> I wonder when they changed that, some time between your 4.4.1.1 and 4.7.3.
> Also you used HDF5 1.8.18, I used 1.10.5. That should not make any
> difference here, but perhaps it does.
>
>
> On Sat, May 2, 2020 at 1:01 PM Wei-Keng Liao wrote:
>
> If you used the latest NetCDF 4.7.4, the dimensions will be stored as scalars.
>
> Wei-keng
>
> > On May 2, 2020, at 1:42 PM, Davide Sangalli wrote:
> >
> > Yeah, but BS_K_linearized1 is just a dimension, how can it be 8 GB big?
> > Same for BS_K_linearized2, how can it be 3 GB big?
> > These two are just two numbers
> > BS_K_linearized1 = 2,025,000,000
> > (it was chosen as a maximum variable size in my code to avoid overflowing
> > the maximum allowed integer in standard precision)
> > BS_K_linearized2 = 781,887,360
> >
> > D.
> >
> > On 02/05/20 19:06, Wei-Keng Liao wrote:
> >> The dump information shows there are actually 8 datasets in the file.
> >> Below are the start offsets, sizes, and end offsets of individual datasets.
> >> There is not much padding space in between the datasets.
> >> According to this, your file is expected to be of size 16 GB.
> >>
> >> dataset name                      start offset            size      end offset
> >> BS_K_linearized1                         2,379   8,100,000,000   8,100,002,379
> >> BSE_RESONANT_COMPRESSED1_DONE    8,100,002,379   2,025,000,000  10,125,002,379
> >> BSE_RESONANT_COMPRESSED2_DONE   10,125,006,475   2,025,000,000  12,150,006,475
> >> BS_K_linearized2                12,150,006,475   3,127,549,440  15,277,555,915
> >> BSE_RESONANT_COMPRESSED3_DONE   15,277,557,963     781,887,360  16,059,445,323
> >> complex                         16,059,447,371               8  16,059,447,379
> >> BS_K_compressed1                16,059,447,379      99,107,168  16,158,554,547
> >> BSE_RESONANT_COMPRESSED1        16,158,554,547     198,214,336  16,356,768,883
> >>
> >> Wei-keng
> >>
> >>> On May 2, 2020, at 11:28 AM, Davide Sangalli wrote:
> >>>
> >>> h5dump -Hp ndb.BS_COMPRESS0.005000_Q1
>
>
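As a quick sanity check on the numbers quoted in this thread, the sizes
reported for the two dimension datasets can be compared with the dimension
lengths. The sketch below uses only figures copied from the messages above.
The names DIM_LENGTHS, STORED_BYTES and bytes_per_element are made up for
illustration. The result (4 bytes per element) would be consistent with the
dimensions having been written out as full arrays of 4-byte values, but the
element type itself is an assumption here, not something the thread states.

```python
# Dimension lengths quoted in the thread (number of elements).
DIM_LENGTHS = {
    "BS_K_linearized1": 2_025_000_000,
    "BS_K_linearized2": 781_887_360,
}

# Dataset sizes in bytes, copied from the offset table in the thread.
STORED_BYTES = {
    "BS_K_linearized1": 8_100_000_000,
    "BS_K_linearized2": 3_127_549_440,
}

# Largest value of a default 4-byte signed integer.
INT32_MAX = 2**31 - 1


def bytes_per_element(name):
    """Return stored bytes divided by dimension length for one dimension."""
    size = STORED_BYTES[name]
    length = DIM_LENGTHS[name]
    if size % length != 0:
        raise ValueError(f"{name}: size is not a multiple of the length")
    return size // length


def total_dimension_bytes():
    """Return the total space taken by the two dimension datasets."""
    return sum(STORED_BYTES.values())


if __name__ == "__main__":
    for name in DIM_LENGTHS:
        print(name, bytes_per_element(name), "bytes per element")
    print("total bytes in dimension datasets:", total_dimension_bytes())
```

Both dimension lengths also fit below INT32_MAX, which matches the remark
about avoiding integer overflow, and together the two datasets account for
roughly 11.2 GB of the file.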