Re: [netcdf-java] Data precision while aggregating data

John,

Your comment about Java representing time in milliseconds since 1970
gave me an idea:
perhaps the problem is simply a difference in the way that rounding is
done by the routine that calculates (year, mon, day, hour, min, sec)
from decimal days.

In the Matlab routine, time is rounded by default to the nearest second.

In the Java routine, is time rounded to the nearest millisecond, or
perhaps not even rounded, but simply truncated?
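
To make the question concrete, here is a minimal sketch of the two
behaviours. It is not the netcdf-java or udunits code: the class and method
names are made up for illustration, and it assumes the conversion goes through
milliseconds since 1970, using the fact that 1970-01-01 00:00:00 UTC is
Modified Julian Date 40587.

```java
import java.time.Instant;

// Hypothetical illustration only; not the netcdf-java or udunits implementation.
class MjdRounding {
    // 1970-01-01 00:00:00 UTC expressed as a Modified Julian Date.
    static final double MJD_OF_UNIX_EPOCH = 40587.0;
    static final double MILLIS_PER_DAY = 86_400_000.0;

    // Truncates: the cast to long simply drops the fractional millisecond.
    static long toEpochMillisTruncated(double mjd) {
        return (long) ((mjd - MJD_OF_UNIX_EPOCH) * MILLIS_PER_DAY);
    }

    // Rounds to the nearest whole millisecond.
    static long toEpochMillisRounded(double mjd) {
        return Math.round((mjd - MJD_OF_UNIX_EPOCH) * MILLIS_PER_DAY);
    }

    public static void main(String[] args) {
        double mjd = 47865.791666666511;
        System.out.println(Instant.ofEpochMilli(toEpochMillisTruncated(mjd)));
        System.out.println(Instant.ofEpochMilli(toEpochMillisRounded(mjd)));
    }
}
```

With truncation the value lands on 18:59:59.999, which any formatter that
drops the milliseconds will show as 18:59:59; with rounding it lands exactly
on 19:00:00.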

As a test, I tried adding 0.5 milliseconds to my time value:
47865.791666666511 + 0.5/24/3600/1000 = 47865.79166667230

and sure enough, I get the result I was looking for:

05-Dec-1989 19:00:00
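
The same test can be sketched in Java. Again this is only an illustration
under my own assumptions (hypothetical class and method names, conversion via
truncated milliseconds since 1970, MJD 40587 for 1970-01-01), not the actual
library code. It also prints the spacing between adjacent doubles near this
value, converted to milliseconds, to show that the double itself resolves far
less than a millisecond.

```java
// Hypothetical illustration only; not the netcdf-java or udunits implementation.
class HalfMillisOffset {
    // 1970-01-01 00:00:00 UTC expressed as a Modified Julian Date.
    static final double MJD_OF_UNIX_EPOCH = 40587.0;
    static final double MILLIS_PER_DAY = 86_400_000.0;

    // Half a millisecond expressed in days.
    static double halfMillisInDays() {
        return 0.5 / MILLIS_PER_DAY;
    }

    // Assumed behaviour under test: truncate to whole milliseconds since 1970.
    static long truncatedMillis(double mjd) {
        return (long) ((mjd - MJD_OF_UNIX_EPOCH) * MILLIS_PER_DAY);
    }

    // Gap between this double and the next larger one, in milliseconds.
    static double resolutionInMillis(double mjd) {
        return Math.ulp(mjd) * MILLIS_PER_DAY;
    }

    public static void main(String[] args) {
        double mjd = 47865.791666666511;
        System.out.println("truncated:            " + truncatedMillis(mjd));
        System.out.println("truncated after +0.5: " + truncatedMillis(mjd + halfMillisInDays()));
        System.out.println("resolution (ms):      " + resolutionInMillis(mjd));
    }
}
```

Adding half a millisecond before truncating is just rounding to the nearest
millisecond, so the offset value ends on the whole second, one millisecond
later than the plain truncated value.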

-Rich

On Thu, May 15, 2008 at 1:10 PM, John Caron <caron@xxxxxxxxxxxxxxxx> wrote:
>
>
> Rich Signell wrote:
>>
>> John,
>>
>> Four replies to your four comments:   ;-)
>>
>> On Wed, May 14, 2008 at 9:08 PM, John Caron <caron@xxxxxxxxxxxxxxxx>
>> wrote:
>>>
>>> I'm not quite sure where the inaccuracy comes in, likely in converting
>>> between Date and udunits representation. I'll have to see what I can do.
>>>
>>> A few comments:
>>>
>>> 1) double has 53 bits of accuracy giving slightly under 16 decimal digits
>>> of accuracy. thus:
>>>
>>>  public void testDoublePrecision() {
>>>   double dval = 47865.7916666665110000;
>>>   System.out.println(" dval= "+dval);
>>>  }
>>>
>>> prints:
>>>
>>>  dval= 47865.79166666651
>>>
>>
>> Okay, you lost the lowest bit, but you should still be fine.   You
>> still have 11 places after the decimal point.    In Matlab, which uses
>> double precision arithmetic, I don't get a problem converting to
>> gregorian until we drop to 8 places after the decimal point:
>>
>> datestr(datenum([1858 11 17 0 0 0]) + 47865.791666666511) =>
>> 05-Dec-1989 19:00:00
>> datestr(datenum([1858 11 17 0 0 0]) + 47865.79166666651)   =>
>> 05-Dec-1989 19:00:00
>> datestr(datenum([1858 11 17 0 0 0]) + 47865.7916666665)    =>
>> 05-Dec-1989 19:00:00
>> datestr(datenum([1858 11 17 0 0 0]) + 47865.791666666)      =>
>> 05-Dec-1989 19:00:00
>> datestr(datenum([1858 11 17 0 0 0]) + 47865.79166666)        =>
>> 05-Dec-1989 18:59:59
>
> yes, it does seem funny we are losing so much precision. It probably has to
> do with converting internally to a Java date, which uses millisecs since
> 1970.
>
>>
>>> 2) preserving lowest bits of accuracy is tricky, and requires care, which
>>> i promise has not (yet) happened in the CDM units handling. in general,
>>> relying on the lowest bits being preserved is dicey.
>>
>> That's okay -- we don't need to preserve that lowest bit.
>
> how many bits do you need to preserve?
>
>
>>> 3) what is the definition of a "day". how accurate do you need that? All
>>> I could find was this note in the units package:
>>>
>>>        * Interval between 2 successive passages of sun through vernal equinox
>>>        * (365.242198781 days -- see
>>>        * http://www.ast.cam.ac.uk/pubinfo/leaflets/,
>>>        * http://aa.usno.navy.mil/AA/
>>>        * and http://adswww.colorado.edu/adswww/astro_coord.html):
>>>
>>> you may agree, but what if someone uses a different meaning for "day" ??
>>
>> Take a look at udunits.dat:
>> http://www.unidata.ucar.edu/software/udunits/udunits-1/udunits.txt
>>
>> A "day" is precisely defined as 86400 seconds.
>> A "sidereal day" is a different unit.
>
> yes, the 86400 is clear. but how many days are there between date1 and
> date2? you have to deal with leap years etc.
>
>>
>>> 4) IMHO, using udunits for calendar dates is a mistake. it's a units
>>> package, not a calendar package.
>>
>> Maybe, but I think to solve the current problem, we could just find
>> out where the computations are dropping the double precision.
>
> yes, that's the short term solution
>
>
>>
>>> 5) "47865.7916666665110000 days since 1858-11-17 00:00:00 UTC" is, um,
>>> unreadable to humans.
>>
>> What is unreadable about that?   Yes, it's a big number with a lot
>> of precision, and an older date, but I think it's perfectly readable
>> and unambiguous.    And as I mentioned, it's an internationally
>> recognized convention called "Modified Julian Date".
>
> it's unreadable because you can't tell what actual date it represents
> without using software.
>
>>
>>> 6) I earlier proposed to CF that we allow ISO date strings: more
>>> readable, not ambiguous, and no precision problem. Various CF
>>> authorities thought it wasn't needed because it was redundant with the
>>> udunits representation.
>>
>> I think allowing ISO date strings in CF would be a good idea, and I
>> also think allowing a two integer representation in CF would be a good
>> idea (we use Julian day, and milliseconds since midnight as our two
>> integer vectors).   But that idea was also not too popular.   Several
>> people thought it would be a good idea, including Balaji, but there
>> was concern about the need to modify all existing CF applications to
>> handle these new time conventions.     But if this was just handled in
>> UDUNITS, I don't think this would be much of a problem, as I would think
>> that most CF-compliant apps have used the UDUNITS library to do their
>> math.
>
> part of my point to CF is that one must use udunits (which has both C and
> Java versions, as well as multiple releases. do they always agree?). It's a
> mistake to tie long-term semantics as important as time to a single software
> package. better to document what it's supposed to mean, so it can be
> independently implemented.
>



-- 
Dr. Richard P. Signell (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598

