Re: [netcdfgroup] Record Dimension Question

Yes, my file is valid as far as I can tell and it is virus checked.  I can 
ncdump it and look at it, manipulate it with python-netCDF4 and ncl, I can 
display it with Panoply and other display software, not sure what else I can do 
to verify its validity as a netcdf file.  I think the problem is that the people 
who wrote the data guard software tried to add some rigor using homegrown checks 
that fail, instead of just letting the netcdf library verify its integrity.  
I'll find out next week, but I'm thinking the data guard software is wrong, not 
the file.

The original grib data has been munged by wgrib2 and cdo for sure.    Not sure 
where it got converted to .nc but from the history metadata, I'd say wgrib2 
performed the conversion.   It's possible Wesley has a loose netcdf 
implementation, I suppose, but nonetheless everything but this data guard 
thinks it is a valid netcdf.

Kevin Havener

-----Original Message-----
From: Chris Barker [mailto:chris.barker@xxxxxxxx] 
Sent: Friday, January 20, 2017 2:19 PM
To: HAVENER, KEVIN F GS-12 USAF ACC 14 WS/WXED <kevin.havener@xxxxxxxxx>
Cc: Dave Allured - NOAA Affiliate <dave.allured@xxxxxxxx>; 
netcdfgroup@xxxxxxxxxxxxxxxx
Subject: Re: [netcdfgroup] Record Dimension Question

" I am trying to pass a netCDF3v1 file through a virus detector-like software 
(more like a firewall-like thing)  that checks for a few things to ascertain 
the file is really a netCDF3 file.  The file is global lon x lat x time (1 time 
step) with 4 variables."


This seems to be a very hard thing to do. Questions:
 
1: is this a netcdf3 file? That you can check with the first few bytes.


2: is this a VALID netcdf3 file -- if you know it's a simple file with these 
particular 4 variables, I suppose you could check the stuff you are checking, 
but it would get pretty ugly to make that more general -- it would make more 
sense to run it through the netcdf C lib, and see if it's valid.


3: is the data in the file corrupted? THAT is pretty much impossible to do in 
the general case -- you can store any binary blob in a valid netcdf3 file. If 
you care about this, then the (to the extent I understand it) "normal" virus 
scanning approach of looking for known malicious blobs may be as good as you 
can do.


So I'd do (2) and run it through the netcdf lib (maybe driven by Python, or 
ncdump or...)
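
A minimal sketch of that library-driven check, assuming the python-netCDF4 
package is installed. The function name opens_as_netcdf and its opener 
parameter are hypothetical, introduced only for illustration (the parameter 
lets the check be exercised without the library). Opening a file mainly 
exercises the library's header parsing; it says nothing about whether the data 
values are sensible.

```python
def opens_as_netcdf(path, opener=None):
    """Return True if the netCDF library can open `path`, else False."""
    if opener is None:
        # Assumption: the netCDF4 package (python-netCDF4) is available.
        from netCDF4 import Dataset as opener
    try:
        dataset = opener(path, "r")
    except (OSError, RuntimeError):
        # The library rejected the file (or it does not exist).
        return False
    dataset.close()
    return True
```

Calling opens_as_netcdf("myfile.nc") would then give a yes/no answer that 
reflects the library's own view of the file rather than a homegrown check.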


Then optionally run a regular generic virus scanner on it.


-CHB




On Fri, Jan 20, 2017 at 10:45 AM, HAVENER, KEVIN F GS-12 USAF ACC 14 WS/WXED 
<kevin.havener@xxxxxxxxx> wrote:


        Unfortunately this file type validator checks into at least byte 19.  
Is there any way from the file metadata to calculate the size of the file?  One 
of the errors that seems to be blocking this file might be "Size computed did 
not match size in header".  Is that something that can be calculated?   There are 
three mystery values in bytes 48 - 63 that I don’t have an explanation for. I 
see no evidence of file size anywhere in the octal dump.
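
On the size question: the classic format stores no total file size, but each 
variable entry in the header carries a size (vsize) and a starting offset 
(begin), so an expected size can be derived from the header plus numrecs. The 
following is a rough sketch under that reading of the classic format grammar 
for CDF-1 and CDF-2 files. HeaderReader and expected_size are hypothetical 
names, and whether the data guard computes its size this way is unknown.

```python
import struct
from math import prod

# Bytes per element for nc_type codes: byte, char, short, int, float, double.
TYPE_SIZE = {1: 1, 2: 1, 3: 2, 4: 4, 5: 4, 6: 8}
STREAMING = 0xFFFFFFFF  # numrecs value used by streaming files


class HeaderReader:
    """Cursor over the big-endian fields of a classic-format header."""

    def __init__(self, data):
        self.data, self.pos = data, 0

    def u32(self):
        (value,) = struct.unpack_from(">I", self.data, self.pos)
        self.pos += 4
        return value

    def u64(self):
        (value,) = struct.unpack_from(">Q", self.data, self.pos)
        self.pos += 8
        return value

    def skip_padded(self, nbytes):
        self.pos += (nbytes + 3) // 4 * 4  # fields are padded to 4 bytes

    def skip_name(self):
        self.skip_padded(self.u32())

    def skip_attrs(self):
        self.u32()  # list tag, or zero when the list is absent
        for _ in range(self.u32()):
            self.skip_name()
            nc_type = self.u32()
            self.skip_padded(self.u32() * TYPE_SIZE[nc_type])


def expected_size(data):
    """Return the file size implied by a CDF-1 or CDF-2 header."""
    if data[:3] != b"CDF" or data[3:4] not in (b"\x01", b"\x02"):
        raise ValueError("not a CDF-1/CDF-2 classic file")
    offset64 = data[3] == 2  # CDF-2 stores 64-bit begin offsets
    r = HeaderReader(data)
    r.pos = 4
    numrecs = r.u32()
    if numrecs == STREAMING:
        raise ValueError("streaming file: record count is not in the header")

    r.u32()  # dimension list tag
    dim_lengths = []
    for _ in range(r.u32()):
        r.skip_name()
        dim_lengths.append(r.u32())  # length 0 marks the record dimension
    r.skip_attrs()  # global attributes

    r.u32()  # variable list tag
    fixed_end, rec_begins, rec_vsizes, rec_unpadded = None, [], [], []
    for _ in range(r.u32()):
        r.skip_name()
        ndims = r.u32()
        shape = [dim_lengths[r.u32()] for _ in range(ndims)]
        r.skip_attrs()
        nc_type, vsize = r.u32(), r.u32()
        begin = r.u64() if offset64 else r.u32()
        if shape and shape[0] == 0:  # record variable
            rec_begins.append(begin)
            rec_vsizes.append(vsize)
            rec_unpadded.append(prod(shape[1:]) * TYPE_SIZE[nc_type])
        else:
            fixed_end = max(fixed_end or 0, begin + vsize)

    if rec_begins:
        # A lone record variable is stored without per-record padding.
        one = len(rec_begins) == 1
        recsize = rec_unpadded[0] if one else sum(rec_vsizes)
        return min(rec_begins) + numrecs * recsize
    return fixed_end if fixed_end is not None else r.pos
```

Passing the leading bytes of the file (enough to cover the whole header) and 
comparing the result with the size on disk gives a rough consistency check. It 
does not cover CDF-5 files or variables too large for a 32-bit vsize.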
        
        Kevin Havener
        
        
        -----Original Message-----
        From: netcdfgroup-bounces@xxxxxxxxxxxxxxxx 
[mailto:netcdfgroup-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Dave Allured - 
NOAA Affiliate
        Sent: Thursday, January 19, 2017 6:40 PM
        To: netcdfgroup@xxxxxxxxxxxxxxxx
        Subject: Re: [netcdfgroup] Record Dimension Question
        
        Kevin,
        
        The information from appendix B is correct but incomplete.  In netcdf-3 
classic format, bytes 4-7 are "numrecs".  This is a big-endian integer with the 
current dimension size, i.e. number of elements, of the unlimited dimension.  
For netcdf-3 files with no unlimited dimension, in other words all fixed 
dimensions, numrecs is present, but the value is undefined.  For streaming 
files, numrecs is defined as all four bytes = FF hex.
        
        The unlimited dimension means the same thing as the record dimension.
        
        I recommend that you use only bytes 0-3 to identify netcdf-3 files.
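
The layout of the first eight bytes described above can be sketched as 
follows. This is an illustration only, assuming a CDF-1 or CDF-2 file; the 
names read_classic_prefix and STREAMING are hypothetical. The returned numrecs 
is meaningful only when the file actually has an unlimited dimension.

```python
import struct

STREAMING = 0xFFFFFFFF  # all four numrecs bytes set to FF hex


def read_classic_prefix(first8):
    """Return (version, numrecs) from a file's first 8 bytes, or None."""
    if len(first8) < 8 or first8[:3] != b"CDF" or first8[3] not in (1, 2):
        return None  # bytes 0-3 do not identify a CDF-1/CDF-2 file
    # Bytes 4-7 hold numrecs as a big-endian unsigned 32-bit integer.
    (numrecs,) = struct.unpack(">I", first8[4:8])
    # None stands for "streaming": the record count is not recorded.
    return first8[3], (None if numrecs == STREAMING else numrecs)
```

For a file like the one in the question, with "C-D-F-1" in bytes 0-3 and 1 at 
byte 7, read_classic_prefix(open(path, "rb").read(8)) would give (1, 1): 
version 1 with one record along the unlimited dimension.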
        
        You might also take a look at how format identification is done in a 
recent version of the "file" utility in Linux distributions.  My recent version 
of "file" identifies netcdf-3 files as "NetCDF Data", and netcdf-4 files as 
HDF5.  My guess is that they look at only bytes 0-3 for netcdf-3, but I am not 
sure.
        
        --Dave
        
        
        On Thu, Jan 19, 2017 at 2:20 PM, HAVENER, KEVIN F GS-12 USAF ACC 14 
WS/WXED <kevin.havener@xxxxxxxxx> wrote:
        
        
                I have what I am sure is a very basic question but I couldn't 
figure out how to search the archives for it, and the documentation left me 
befuddled.
        
                I am trying to pass a netCDF3v1 file through a virus 
detector-like software (more like a firewall-like thing)  that checks for a few 
things to ascertain the file is really a netCDF3 file.  The file is global lon 
x lat x time (1 time step) with 4 variables.
        
                So I've done an octal dump on the file and I'm curious about 
the value that is supposed to be in bytes 4-7, where bytes 0-3 are "C-D-F-1".  
Appendix B in the user's guide says these bytes are the numrecs=length of the 
record dimension.  What is that?  The unlimited dimension?  My example file has 
"1" at byte 7, the example in the user's guide has 0.  My intuition tells me 
that for my file, time is considered the record dimension, but it would also be 
OK to have 0 record dimensions in this file if I don't intend to append to it.
        
                Is my understanding correct?
        
                Kevin Havener, DAFC, 14WS/WXED
        
        
                _______________________________________________
                NOTE: All exchanges posted to Unidata maintained email lists are
                recorded in the Unidata inquiry tracking system and made publicly
                available through the web.  Users who post to any of the lists we
                maintain are reminded to remove any personal information that they
                do not want to be made public.
        
        
                netcdfgroup mailing list
                netcdfgroup@xxxxxxxxxxxxxxxx
                For list information or to unsubscribe,  visit: 
http://www.unidata.ucar.edu/mailing_lists/
        
        
        




-- 


Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@xxxxxxxx