Re: [netcdf-java] Data precision while aggregating data

Jon,

The precision of the time vector with "units since XXXX" must
definitely be considered carefully, but we did think about this.

We want to store all our oceanographic time series data with the same
time convention to facilitate aggregation and minimize mods to
existing software.

Choosing time as double precision with units of "days since 1858-11-17
00:00"  should give us a precision of:
  - Better than 3.0e-5 milliseconds until August 31, 2132 and
  - Better than 3.0e-4 milliseconds until October 12, 4596!

(This is actually the definition of "Modified Julian Day", which is
one of the few internationally recognized time conventions that starts
at midnight. See http://tycho.usno.navy.mil/mjd.html for more info.
It also has the advantage of being a date by which nearly all the
world had finally switched to a Gregorian calendar, and early enough
so that most of the data we want to represent will have positive time
values.)
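
A quick way to sanity-check precision figures like these is to ask
Java for the spacing between adjacent doubles near a given day count.
This is only a sketch: the class and method names are made up for
illustration, and the numbers it prints depend on how "precision" is
defined (full spacing versus worst-case rounding error), so they need
not match the figures quoted above exactly. The point is that the
spacing stays well below a millisecond over this range.

```java
public class MjdPrecision {
    static final double MS_PER_DAY = 86_400_000.0;

    /** Spacing between adjacent doubles near the given day count, in milliseconds. */
    static double spacingMs(double days) {
        return Math.ulp(days) * MS_PER_DAY;
    }

    public static void main(String[] args) {
        // The start value from this thread, then day counts just below 1e5 and 1e6.
        double[] samples = {47865.791666666511, 99_999.0, 999_999.0};
        for (double d : samples) {
            System.out.printf("%.1f days: spacing %.3e ms%n", d, spacingMs(d));
        }
    }
}
```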

The bug Sachin reported is a big deal for us, since we want to use
NcML with the THREDDS Data Server to serve our hundreds of
oceanographic time series files as CF-compliant data without changing
any of the original files.  The original files are NetCDF, but with a
non-standard convention for time: an integer array with Julian day,
and a second integer array with milliseconds since midnight.  This
allows integer math with time to give results with no round-off
problems.

We have a script in Matlab (that uses double precision math) to take
our two-integer format for time and create NcML for a CF-compliant
time array using start and increment.  That script produces NcML like
this:

<variable name="time" shape="time" type="double">
  <attribute name="units" value="days since 1858-11-17 00:00:00 UTC"/>
  <attribute name="long_name" value="Modified Julian Day"/>
  <values start="47865.7916666665110000" increment="0.0416666666666667"/>
</variable>
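
For reference, here is a minimal sketch of how a start/increment pair
like this expands into a time vector. It assumes each value is
computed as start + i * increment; how NetCDF-Java actually evaluates
the <values> element is not shown in this thread, and the class and
method names are hypothetical.

```java
public class StartIncrement {
    /** Expand a start/increment pair into n values (assumed rule: start + i * increment). */
    static double[] expand(double start, double increment, int n) {
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            // Multiply rather than repeatedly adding, so rounding error does not accumulate.
            values[i] = start + i * increment;
        }
        return values;
    }

    public static void main(String[] args) {
        double[] t = expand(47865.7916666665110000, 0.0416666666666667, 3);
        for (double v : t) {
            System.out.println(v);
        }
    }
}
```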

As Sachin mentioned, the start time for this file is  "05-Dec-1989
19:00:00", and as proof that we have sufficient precision, when we
simply load the time vector in NetCDF-java and do the double precision
math in Matlab, we get the right start time:

datestr(datenum([1858 11 17 0 0 0]) + 47865.791666666511)

ans =  05-Dec-1989 19:00:00

but when we use the NetCDF-Java time routines to convert to Gregorian, we get

05-Dec-1989 18:59:59 GMT

Clearly our users will not accept this.   I hope this can get resolved soon!!!!
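
One way a one-second discrepancy like this can arise is sketched
below. The start value sits a few microseconds before 19:00:00, so
converting days to integer milliseconds by truncation lands on
18:59:59.999, while rounding lands on 19:00:00.000. This is only an
illustration under that assumption; whether NetCDF-Java or udunits
actually truncates is not established here, and the class and helper
names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class MjdToDate {
    static final long MS_PER_DAY = 86_400_000L;
    // Modified Julian Day number of 1970-01-01 00:00 UTC (the Unix epoch).
    static final long MJD_UNIX_EPOCH = 40587L;

    static final DateTimeFormatter FMT = DateTimeFormatter
        .ofPattern("dd-MMM-yyyy HH:mm:ss", Locale.ENGLISH)
        .withZone(ZoneOffset.UTC);

    /** Format milliseconds since 1858-11-17 00:00 UTC as a UTC date string. */
    static String format(long msSinceMjdEpoch) {
        long unixMs = msSinceMjdEpoch - MJD_UNIX_EPOCH * MS_PER_DAY;
        return FMT.format(Instant.ofEpochMilli(unixMs));
    }

    /** Convert by truncating fractional milliseconds (assumed source of the bug). */
    static String truncated(double mjd) {
        return format((long) (mjd * MS_PER_DAY));
    }

    /** Convert by rounding to the nearest millisecond. */
    static String rounded(double mjd) {
        return format(Math.round(mjd * MS_PER_DAY));
    }

    public static void main(String[] args) {
        double start = 47865.791666666511;
        System.out.println("truncated: " + truncated(start));
        System.out.println("rounded:   " + rounded(start));
    }
}
```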

-Rich

On Tue, May 13, 2008 at 2:52 AM, Jon Blower <jdb@xxxxxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
>  I have seen similar issues (time values being out by a second or two).
>   I was wondering whether it's something to do with udunits and
>  calculating dates on the basis of "units since XXXXXX".  I seem to
>  remember an earlier conversation on this list (or maybe on the CF
>  list) concerning how udunits defines the length of certain time-spans
>  (e.g. a month) and wondered whether this might be the issue?  Jonathan
>  Gregory recommended against using "months since" and "years since" and
>  sticking to seconds or days to avoid ambiguities in the length of a
>  month/year.  But maybe this is a red herring.
>
>  Whatever the issue is I'd be very keen to understand it as it's
>  affecting me too!
>
>  Cheers, Jon
>
>
>  On Mon, May 12, 2008 at 9:31 PM, Sachin Kumar Bhate
>  <skbhate@xxxxxxxxxxxxxxx> wrote:
>
>
> > John,
>  >
>  >  The NcML file shown below attempts to aggregate time series files,
>  >  overriding the time values for each 'time' variable.
>  >
>  >  The aggregation works great and I can access the time values as well,
>  >  but I see that there is loss of precision in the new time values when
>  >  I access values for a coordinate data variable.
>  >
>  >  For example:
>  >
>  >  <<<<
>  >    String URI =
>  >  "http://www.gri.msstate.edu/rsearch_data/nopp/test_agg_precision.ncml";
>  >    String var="T_20";
>  >
>  >    GridDataset gid = GridDataset.open(URI);
>  >    GeoGrid Grid = gid.findGridByName(var);
>  >    GridCoordSys GridCoordS = (GridCoordSys) Grid.getCoordinateSystem();
>  >
>  >     java.util.Date d[] = GridCoordS.getTimeDates();
>  >
>  >     System.out.println("DateString: "+d[0].toGMTString());
>  >   >>>>>
>  >
>  >  The output from the above code for the 1st time value in the java Date
>  >  array.
>  >
>  >  DateString: 5 Dec 1989 18:59:59 GMT
>  >
>  >  But, the correct value should be
>  >
>  >  DateString: 5 Dec 1989 19:00:00 GMT
>  >
>  >
>  >  Just out of curiosity I tried to print the 1st time value being read
>  >  from the NcML,
>  >  by 'ucar.nc2.ncml.NcmlReader.readValues()'. I get,
>  >
>  >  Start = 47865.79166666651;   (Parsed as double)
>  >
>  >  but,  the 1st start value specified in NcML is  '47865.7916666665110000'.
>  >
>  >  Don't care about the trailing '0s', but the digit '1' in the 12th
>  >  decimal place is being dropped and may be causing this problem.
>  >
>  >  Although, parsing it as a 'BigDecimal' does read in the correct value.
>  >
>  >  Start-BigDecimal: 47865.7916666665110000
>  >
>  >
>  >  I am just guessing here, I am not sure if this is what is causing the
>  >  precision problem.
>  >
>  >  Will appreciate your help.
>  >
>  >  thanks..
>  >
>  >  Sachin
>  >
>  >  --
>  >  Sachin Kumar Bhate, Research Associate
>  >  MSU-High Performance Computing Collaboratory, NGI
>  >  John C. Stennis Space Center, MS 39529
>  >  http://www.northerngulfinstitute.org/
>  >
>  >
>  >
>  >  _______________________________________________
>  >  netcdf-java mailing list
>  >  netcdf-java@xxxxxxxxxxxxxxxx
>  >  For list information or to unsubscribe, visit: 
> http://www.unidata.ucar.edu/mailing_lists/
>  >
>
>
>
>  --
>  --------------------------------------------------------------
>  Dr Jon Blower Tel: +44 118 378 5213 (direct line)
>  Technical Director Tel: +44 118 378 8741 (ESSC)
>  Reading e-Science Centre Fax: +44 118 378 6413
>  ESSC Email: jdb@xxxxxxxxxxxxxxxxxxxx
>  University of Reading
>  3 Earley Gate
>  Reading RG6 6AL, UK
>  --------------------------------------------------------------
>
>



-- 
Dr. Richard P. Signell (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598

