Yes, my file is valid as far as I can tell and it is virus checked. I can
ncdump it and look at it, manipulate it with python-netCDF4 and ncl, I can
display it with Panoply and other display software, not sure what else I can do
to verify its validity as a netcdf file. I think the problem is that the people who
wrote the data guard software tried to add some rigor using some homegrown checks
that fail, instead of just letting the netcdf library verify the file's integrity.
I'll find out next week, but I'm thinking the data guard software is wrong, not
the file.
The original grib data has been munged by wgrib2 and cdo for sure. Not sure
where it got converted to .nc but from the history metadata, I'd say wgrib2
performed the conversion. It's possible Wesley has a loose netcdf
implementation, I suppose, but nonetheless everything but this data guard
thinks it is a valid netcdf.
Kevin Havener
-----Original Message-----
From: Chris Barker [mailto:chris.barker@xxxxxxxx]
Sent: Friday, January 20, 2017 2:19 PM
To: HAVENER, KEVIN F GS-12 USAF ACC 14 WS/WXED <kevin.havener@xxxxxxxxx>
Cc: Dave Allured - NOAA Affiliate <dave.allured@xxxxxxxx>;
netcdfgroup@xxxxxxxxxxxxxxxx
Subject: Re: [netcdfgroup] Record Dimension Question
" I am trying to pass a netCDF3v1 file through a virus detector-like software
(more like a firewall-like thing) that checks for a few things to ascertain
the file is really a netCDF3 file. The file is global lon x lat x time (1 time
step) with 4 variables."
This seems to be a very hard thing to do: Questions:
1: is this a netcdf3 file? that you can check with the first few bytes.
2: is this a VALID netcdf3 file -- if you know it's a simple file with these
particular 4 variables, I suppose you could check the stuff you are checking,
but it would get pretty ugly to make that more general -- it would make more
sense to run it through the netcdf C lib, and see if it's valid.
3: is the data in the file corrupted? THAT is pretty much impossible to do in
the general case -- you can store any binary blob in a valid netcdf3 file. If
you care about this, then the (to the extent I understand it) "normal" virus
scanning approach of looking for known malicious blobs may be as good as you
can do.
So I'd do (2) and run it through the netcdflib (maybe driven by Python, or
ncdump or...)
Then optionally run a regular generic virus scanner on it.
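
One way option (2) could be driven from Python, as a rough sketch. It assumes the third-party python-netCDF4 package is installed; `opens_as_netcdf` is a hypothetical helper name, not something from this thread or from the library. The only library features used are `netCDF4.Dataset` and its `data_model` attribute.

```python
try:
    import netCDF4  # third-party python-netCDF4 package (wraps the netCDF C library)
except ImportError:
    netCDF4 = None


def opens_as_netcdf(path):
    """Return the data model string if the netCDF library opens the file, else None."""
    if netCDF4 is None:
        raise RuntimeError("python-netCDF4 is not installed")
    try:
        with netCDF4.Dataset(path, "r") as ds:
            # Touch the parsed metadata so header problems surface here.
            list(ds.dimensions)
            list(ds.variables)
            return ds.data_model
    except (OSError, RuntimeError):
        # The library refused the file (missing, unreadable, or not netCDF).
        return None
```

A result such as "NETCDF3_CLASSIC" only says that the library could parse the header. It says nothing about whether the data values themselves are sensible, which is question (3) above.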
-CHB
On Fri, Jan 20, 2017 at 10:45 AM, HAVENER, KEVIN F GS-12 USAF ACC 14 WS/WXED
<kevin.havener@xxxxxxxxx> wrote:
Unfortunately this file type validator checks into at least byte 19.
Is there any way from the file metadata to calculate the size of the file? One
of the errors that seems to be blocking this file might be "Size computed did
not match size in header". Is that something that can be calculated? There are
three mystery values in bytes 48 - 63 that I don’t have an explanation for. I
see no evidence of file size anywhere in the octal dump.
Kevin Havener
-----Original Message-----
From: netcdfgroup-bounces@xxxxxxxxxxxxxxxx On Behalf Of Dave Allured -
NOAA Affiliate
Sent: Thursday, January 19, 2017 6:40 PM
To: netcdfgroup@xxxxxxxxxxxxxxxx
Subject: Re: [netcdfgroup] Record Dimension Question
Kevin,
The information from appendix B is correct but incomplete. In netcdf-3
classic format, bytes 4-7 are "numrecs". This is a big-endian integer with the
current dimension size, i.e. number of elements, of the unlimited dimension.
For netcdf-3 files with no unlimited dimension, in other words all fixed
dimensions, numrecs is present, but the value is undefined. For streaming
files, numrecs is defined as all four bytes = FF hex.
The unlimited dimension means the same thing as the record dimension.
I recommend that you use only bytes 0-3 to identify netcdf-3 files.
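
The byte layout described above can be illustrated with a minimal Python sketch using only the standard library. The names `read_classic_prefix` and `STREAMING_NUMRECS` are made up for this example. It accepts only version bytes 1 (classic) and 2 (64-bit offset), and it reads numrecs purely for inspection, since the value is undefined when there is no unlimited dimension.

```python
import struct

# numrecs value used by streaming files: all four bytes = FF hex
STREAMING_NUMRECS = 0xFFFFFFFF


def read_classic_prefix(path):
    """Return (version, numrecs) from the first 8 bytes, or None if not netCDF-3."""
    with open(path, "rb") as f:
        head = f.read(8)
    # Bytes 0-2 are the ASCII letters "CDF"; byte 3 is the format version.
    if len(head) < 8 or head[:3] != b"CDF":
        return None
    version = head[3]
    if version not in (1, 2):
        return None
    # Bytes 4-7 are numrecs, stored as a big-endian 32-bit integer.
    (numrecs,) = struct.unpack(">I", head[4:8])
    return version, numrecs
```

For a file that starts with "CDF", version byte 1, and a 1 at byte 7 (assuming bytes 4-6 are zero), this would return (1, 1). An identification check would look only at the first element and ignore numrecs.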
You might also take a look at how format identification is done in a
recent version of the "file" utility in Linux distributions. My recent version
of "file" identifies netcdf-3 files as "NetCDF Data", and netcdf-4 files as
HDF5. My guess is that they look at only bytes 0-3 for netcdf-3, but I am not
sure.
--Dave
On Thu, Jan 19, 2017 at 2:20 PM, HAVENER, KEVIN F GS-12 USAF ACC 14
WS/WXED <kevin.havener@xxxxxxxxx> wrote:
I have what I am sure is a very basic question but I couldn't
figure out how to search the archives for it, and the documentation left me
befuddled.
I am trying to pass a netCDF3v1 file through a virus
detector-like software (more like a firewall-like thing) that checks for a few
things to ascertain the file is really a netCDF3 file. The file is global lon
x lat x time (1 time step) with 4 variables.
So I've done an octal dump on the file and I'm curious about
the value that is supposed to be in bytes 4-7, where bytes 0-3 are "C-D-F-1".
Appendix B in the user's guide says these bytes are the numrecs=length of the
record dimension. What is that? The unlimited dimension? My example file has
"1" at byte 7, the example in the user's guide has 0. My intuition tells me
that for my file, time is considered the record dimension, but it would also be
OK to have 0 record dimensions in this file if I don't intend to append to it.
Is my understanding correct?
Kevin Havener, DAFC, 14WS/WXED
_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are
recorded in the Unidata inquiry tracking system and made publicly
available through the web. Users who post to any of the lists we
maintain are reminded to remove any personal information that they
do not want to be made public.
netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit:
http://www.unidata.ucar.edu/mailing_lists/
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@xxxxxxxx