This is interesting. I think a move to ISO strings would be a good
one - do you think it's worth bringing this up again with CF? I'd
support this, FWIW.
Am I correct in thinking that the problem is caused because a human
means "calendar days" or "calendar months" but udunits means a
specific, fixed number of milliseconds?
Can this be fixed in NetCDF-Java without change to CF, perhaps by
using Joda-time (a proper calendaring library) instead of udunits for
time handling?
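
One way to test that idea: a minimal sketch in plain Java (no udunits
or Joda-time) using the start value from the NcML quoted below. It
assumes, without confirmation from the NetCDF-Java source, that the
conversion truncates to whole milliseconds at some point. The class and
method names are made up for illustration. If truncation alone
reproduces the one-second error, the length of a "day" may not be the
culprit.

```java
// Hypothetical illustration; not taken from NetCDF-Java or udunits.
class MjdRounding {
    static final double MILLIS_PER_DAY = 86_400_000.0;

    // Convert fractional days to milliseconds, discarding any fraction.
    static long truncatedMillis(double days) {
        return (long) (days * MILLIS_PER_DAY);
    }

    // Convert fractional days to milliseconds, rounding to the nearest one.
    static long roundedMillis(double days) {
        return Math.round(days * MILLIS_PER_DAY);
    }

    public static void main(String[] args) {
        // Start value exactly as written in the NcML.
        double start = Double.parseDouble("47865.7916666665110000");

        // 47865 whole days plus 19 hours, in integer milliseconds
        // (the intended time, 19:00:00 on day 47865).
        long exact = 47865L * 86_400_000L + 19L * 3_600_000L;

        // Offsets in ms from the intended time, relative to the epoch
        // "1858-11-17 00:00:00 UTC".
        System.out.println("truncated offset: " + (truncatedMillis(start) - exact)); // -1
        System.out.println("rounded offset:   " + (roundedMillis(start) - exact));   // 0

        // The shorter printed form denotes the same double, so the "dropped"
        // digit is only a printing effect.
        System.out.println(start == Double.parseDouble("47865.79166666651"));
    }
}
```

If this reasoning holds, the stored value is about 13 microseconds short
of 19:00:00, so truncating lands on 18:59:59.999 while rounding to the
nearest millisecond gives 19:00:00.000.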
Cheers, Jon
On Thu, May 15, 2008 at 2:08 AM, John Caron <caron@xxxxxxxxxxxxxxxx> wrote:
> I'm not quite sure where the inaccuracy comes in, likely in converting between
> Date and the udunits representation. I'll have to see what I can do.
>
> A few comments:
>
> 1) double has 53 bits of accuracy giving slightly under 16 decimal digits of
> accuracy. thus:
>
> public void testDoublePrecision() {
> double dval = 47865.7916666665110000;
> System.out.println(" dval= "+dval);
> }
>
> prints:
>
> dval= 47865.79166666651
>
> 2) preserving the lowest bits of accuracy is tricky and requires care, which I
> promise has not (yet) happened in the CDM units handling. In general,
> relying on the lowest bits being preserved is dicey.
>
> 3) what is the definition of a "day". how accurate do you need that? All I
> could find was this note in the units package:
>
> * Interval between 2 successive passages of sun through vernal equinox
> * (365.242198781 days -- see
> * http://www.ast.cam.ac.uk/pubinfo/leaflets/,
> * http://aa.usno.navy.mil/AA/
> * and http://adswww.colorado.edu/adswww/astro_coord.html):
>
> you may agree, but what if someone uses a different meaning for "day" ??
>
> 4) IMHO, using udunits for calendar dates is a mistake. It's a units package,
> not a calendar package.
>
> 5) "47865.7916666665110000 days since 1858-11-17 00:00:00 UTC" is, um,
> unreadable to humans.
>
> 6) I earlier proposed to CF that we allow ISO date strings, which are more
> readable, are not ambiguous, and don't have a precision problem. Various CF
> authorities thought it wasn't needed because it was redundant with the udunits
> representation.
>
>
>
> Rich Signell wrote:
>>
>> Jon,
>>
>> The precision of the time vector with "units since XXXX" must
>> definitely be considered carefully, but we did think about this.
>>
>> We want to store all our oceanographic time series data with the same
>> time convention to facilitate aggregation and minimize mods to
>> existing software.
>>
>> Choosing time as double precision with units of "days since 1858-11-17
>> 00:00" should give us a precision of:
>> - Better than 3.0e-5 milliseconds until August 31, 2132 and
>> - Better than 3.0e-4 milliseconds until October 12, 4596!
>>
>> (This is actually the definition of "Modified Julian Day", which is
>> one of the few internationally recognized time conventions that starts
>> at midnight. See http://tycho.usno.navy.mil/mjd.html for more info.
>> It also has the advantage of being a date by which nearly all the
>> world had finally switched to a Gregorian calendar, and early enough
>> so that most of the data we want to represent will have positive time
>> values.)
>>
>> The bug Sachin reported is a big deal for us, since we want to use
>> NcML and THREDDS as a way of serving our hundreds of oceanographic
>> time series files as CF compliant using NcML with the THREDDS data
>> server without changing any of the original files. The original
>> files are NetCDF, but with a non-standard convention for time: an
>> integer array with julian day, and a second integer array with
>> milliseconds since midnight. This allows integer math with time to
>> give results with no round off problems.
>>
>> We have a script in Matlab (that uses double precision math) to take
>> our two integer format for time and create NcML for a CF-compliant
>> time array using start and increment. That script produces NcML like
>> this:
>>
>> <variable name="time" shape="time" type="double">
>> <attribute name="units" value="days since 1858-11-17 00:00:00 UTC"/>
>> <attribute name="long_name" value="Modified Julian Day"/>
>> <values start="47865.7916666665110000" increment="0.0416666666666667"/>
>> </variable>
>>
>> As Sachin mentioned, the start time for this file is "05-Dec-1989
>> 19:00:00", and as proof that we have sufficient precision, when we
>> simply load the time vector in NetCDF-java and do the double precision
>> math in Matlab, we get the right start time:
>>
>> datestr(datenum([1858 11 17 0 0 0]) + 47865.791666666511)
>>
>> ans = 05-Dec-1989 19:00:00
>>
>> but when we use the NetCDF-Java time routines to convert to Gregorian, we
>> get
>>
>> 05-Dec-1989 18:59:59 GMT
>>
>> Clearly our users will not accept this. I hope this can get resolved
>> soon!!!!
>>
>> -Rich
>>
>> On Tue, May 13, 2008 at 2:52 AM, Jon Blower <jdb@xxxxxxxxxxxxxxxxxxxx>
>> wrote:
>>>
>>> Hi,
>>>
>>> I have seen similar issues (time values being out by a second or two).
>>> I was wondering whether it's something to do with udunits and
>>> calculating dates on the basis of "units since XXXXXX". I seem to
>>> remember an earlier conversation on this list (or maybe on the CF
>>> list) concerning how udunits defines the length of certain time-spans
>>> (e.g. a month) and wondered whether this might be the issue? Jonathan
>>> Gregory recommended against using "months since" and "years since" and
>>> sticking to seconds or days to avoid ambiguities in the length of a
>>> month/year. But maybe this is a red herring.
>>>
>>> Whatever the issue is I'd be very keen to understand it as it's
>>> affecting me too!
>>>
>>> Cheers, Jon
>>>
>>>
>>> On Mon, May 12, 2008 at 9:31 PM, Sachin Kumar Bhate
>>> <skbhate@xxxxxxxxxxxxxxx> wrote:
>>>
>>>
>>> > John,
>>> >
>>> > The NcML file shown below attempts to aggregate time series files,
>>> > overriding the time values for each 'time' variable.
>>> >
>>> > The aggregation works great and I can access the time values as well,
>>> > but I see that there is loss of precision in the new time values when I
>>> > access values for a coordinate data variable.
>>> >
>>> > For example:
>>> >
>>> > <<<<
>>> > String URI = "http://www.gri.msstate.edu/rsearch_data/nopp/test_agg_precision.ncml";
>>> > String var = "T_20";
>>> >
>>> > GridDataset gid = GridDataset.open(URI);
>>> > GeoGrid Grid = gid.findGridByName(var);
>>> > GridCoordSys GridCoordS = (GridCoordSys) Grid.getCoordinateSystem();
>>> >
>>> > java.util.Date d[] = GridCoordS.getTimeDates();
>>> >
>>> > System.out.println("DateString: "+d[0].toGMTString());
>>> > >>>>>
>>> >
>>> > The output from the above code for the 1st time value in the java
>>> > Date array:
>>> >
>>> > DateString: 5 Dec 1989 18:59:59 GMT
>>> >
>>> > But, the correct value should be
>>> >
>>> > DateString: 5 Dec 1989 19:00:00 GMT
>>> >
>>> >
>>> > Just out of curiosity I tried to print the 1st time value being read
>>> > from the NcML by 'ucar.nc2.ncml.NcmlReader.readValues()'. I get,
>>> >
>>> > Start = 47865.79166666651; (Parsed as double)
>>> >
>>> > but the 1st start value specified in NcML is '47865.7916666665110000'.
>>> >
>>> > I don't care about the trailing '0s', but the digit '1' in the 12th
>>> > decimal place is being dropped and may be causing this problem.
>>> >
>>> > Parsing it as a 'BigDecimal', however, does read in the correct value.
>>> >
>>> > Start-BigDecimal: 47865.7916666665110000
>>> >
>>> >
>>> > I am just guessing here; I am not sure if this is what is causing the
>>> > precision problem.
>>> >
>>> > Will appreciate your help.
>>> >
>>> > thanks..
>>> >
>>> > Sachin
>>> >
>>> > --
>>> > Sachin Kumar Bhate, Research Associate
>>> > MSU-High Performance Computing Collaboratory, NGI
>>> > John C. Stennis Space Center, MS 39529
>>> > http://www.northerngulfinstitute.org/
>>> >
>>> >
>>> >
>>> > _______________________________________________
>>> > netcdf-java mailing list
>>> > netcdf-java@xxxxxxxxxxxxxxxx
>>> > For list information or to unsubscribe, visit:
>>> http://www.unidata.ucar.edu/mailing_lists/
>>> >
>>>
>>>
>>>
>>> --
>>> --------------------------------------------------------------
>>> Dr Jon Blower Tel: +44 118 378 5213 (direct line)
>>> Technical Director Tel: +44 118 378 8741 (ESSC)
>>> Reading e-Science Centre Fax: +44 118 378 6413
>>> ESSC Email: jdb@xxxxxxxxxxxxxxxxxxxx
>>> University of Reading
>>> 3 Earley Gate
>>> Reading RG6 6AL, UK
>>> --------------------------------------------------------------
>>>
>>>
>>>
>>
>>
>>
>
--
--------------------------------------------------------------
Dr Jon Blower Tel: +44 118 378 5213 (direct line)
Technical Director Tel: +44 118 378 8741 (ESSC)
Reading e-Science Centre Fax: +44 118 378 6413
ESSC Email: jdb@xxxxxxxxxxxxxxxxxxxx
University of Reading
3 Earley Gate
Reading RG6 6AL, UK
--------------------------------------------------------------