Re: [netcdfgroup] retrieving missing data from netcdf3_64BIT_OFFSET formatted netcdf files

  • To: Ramakrishnan N <ram.n.krishnan@xxxxxxxxx>
  • Subject: Re: [netcdfgroup] retrieving missing data from netcdf3_64BIT_OFFSET formatted netcdf files
  • From: Wei-Keng Liao <wkliao@xxxxxxxxxxxxxxxx>
  • Date: Sat, 5 Feb 2022 19:48:06 +0000
Great!

As Gus pointed out, the corrupted file may have been caused by a bug in
the Amber I/O routines. I encourage you to submit a GitHub issue in the
ParmEd repo if the problem persists and is reproducible.

Wei-keng

On Feb 4, 2022, at 12:12 PM, Ramakrishnan N
<ram.n.krishnan@xxxxxxxxx> wrote:

Awesome Wei-Keng. This fixed the issue and the file now displays the data.

Thanks a lot for helping me out.

Best
Ram

On Fri, Feb 4, 2022 at 1:03 PM Wei-Keng Liao
<wkliao@xxxxxxxxxxxxxxxx> wrote:
Attached is a C program that hacks into a NetCDF file and changes
the number of records in the file header.

Compile command:
  gcc hack_numrecs.c -o hack_numrecs

Run command:
  ./hack_numrecs prod.nc 14

Please back up your input file first.

I tested it against the file prod.nc by changing the number of
records to 14 and verified the updated file with the utility
program named 'ncvalidator'. It appears to me that the only problem
with the original prod.nc is the wrong number of records in the
file header.
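
For readers without the attachment, the idea can be sketched in a few
lines of Python. This is not the attached hack_numrecs.c; it is a minimal
illustration that assumes a classic-format file (CDF-1 or CDF-2, i.e.
64-bit offset), where the record count is a 4-byte big-endian integer
stored right after the 4-byte magic. The function names read_numrecs and
set_numrecs are made up for this sketch.

```python
import struct

def read_numrecs(path):
    """Return (version, numrecs) from a classic-format NetCDF header."""
    with open(path, "rb") as f:
        head = f.read(8)
    # Magic is 'C' 'D' 'F' followed by a version byte:
    # 1 = classic, 2 = 64-bit offset. Other formats are not handled here.
    if len(head) < 8 or head[:3] != b"CDF" or head[3] not in (1, 2):
        raise ValueError("not a CDF-1 or CDF-2 file")
    return head[3], struct.unpack(">I", head[4:8])[0]

def set_numrecs(path, numrecs):
    """Overwrite the 4-byte big-endian record count at byte offset 4."""
    read_numrecs(path)  # validates the magic before touching the file
    with open(path, "r+b") as f:
        f.seek(4)
        f.write(struct.pack(">I", numrecs))
```

Calling set_numrecs("prod.nc", 14) on a backup copy would correspond to
the run command above. Only the header field changes; the record data
already in the file is left untouched.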

Wei-keng


On Feb 4, 2022, at 8:51 AM, Ramakrishnan N
<ram.n.krishnan@xxxxxxxxx> wrote:

I have a netcdf file (prod.nc) that contains time series from a molecular
dynamics simulation (Amber force field, OpenMM engine, ParmEd
NetCDFReporter). The NetCDFReporter had some problems and, as a result, the
number of frames reported in the netcdf file is zero. Given below is the
ncdump output for the file:

$ ncdump -h prod.nc
netcdf prod {
dimensions:
        frame = UNLIMITED ; // (0 currently)
        spatial = 3 ;
        atom = 20504 ;
variables:
        char spatial(spatial) ;
        float time(frame) ;
                time:units = "picosecond" ;
        float coordinates(frame, atom, spatial) ;
                coordinates:units = "angstrom" ;

// global attributes:
                :Conventions = "AMBER" ;
                :ConventionVersion = "1.0" ;
                :application = "AmberTools" ;
                :program = "ParmEd" ;
                :programVersion = "3.4.0+11.g1be8ca0f" ;
                :title = "ParmEd-created trajectory" ;
}

However, the netcdf file has a non-zero size that increases linearly with the
number of frames stored, which suggests that the data was in fact written
into it. I tried a number of tools (NCO tools, netCDF4, the SciPy netcdf
reader, xarray) to access the missing data but have not succeeded.
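
The file-size observation can be turned into a rough frame count. The
sketch below assumes the classic NetCDF layout, in which each record holds
one slice of every record variable (here one float for time plus
atom x spatial floats for coordinates). The helper names and the
record_start parameter are hypothetical; record_start stands for the byte
offset where the first record begins, which has to be taken from the
actual file header and is not given in this thread.

```python
def record_size(atoms, spatial=3, float_bytes=4):
    """Bytes per frame for time(frame) + coordinates(frame, atom, spatial)."""
    coords = atoms * spatial * float_bytes  # one coordinates slice
    time = float_bytes                      # one time value
    return coords + time

def estimate_numrecs(file_size, record_start, rec_size):
    """Whole records that fit between record_start and the end of the file."""
    return (file_size - record_start) // rec_size

print(record_size(20504))  # 246052
```

With 20504 atoms each frame takes 246052 bytes, so a file holding 14
frames carries about 3.4 MB of record data on top of its header.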

I have two questions:

1. Does the file contain real data?

2. If so, is there a way to retrieve the data and create a new netcdf
file?

I am desperately looking to salvage nearly 3 microseconds of simulation data,
which would take more than 2 months to regenerate. I would greatly appreciate
it if anyone could provide me with some insight into this problem.

The attached netcdf file has 14 frames that can be used to examine the issue.

Thanks in advance

Best
Ram


<prod.nc>
_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are
recorded in the Unidata inquiry tracking system and made publicly
available through the web.  Users who post to any of the lists we
maintain are reminded to remove any personal information that they
do not want to be made public.


netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit:
https://www.unidata.ucar.edu/mailing_lists/

