
Re: [uaf_tech] Next UAF telcon: June 10th, 12:30pm EDT (metadata for operational outputs)



Hi John,

This is a "Joe Anyone" answer.  I'd defer to Ted on "best practices" -- e.g. the specific ISO encodings of concepts such as "now" and "now minus 7 days" that would most readily be understood by the growing list of standards-based data discovery tools.

You describe making a dynamically computed start and end date/time (computed by the TDS server) available as THREDDS metadata.  Useful, but as we discussed below, not the whole story for discovery purposes.  The question seems to be what metadata to provide to also communicate the cycle of updates that goes with the dataset.  A natural answer would seem to be that TDS should do as complete a job as it can of generating metadata that follows the Unidata NetCDF Attribute Convention for Dataset Discovery (http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html).  However, the NACDD conventions seem incomplete for this task as-is.

Serving outputs of operational models has fairly well defined needs -- e.g. a forecast run every 2 days at 3AM forecasting 10 days into the future and aggregating in a rolling archive of 30 days into the past.  So I'd hope for this collection of information (schematic):
  • start_time = "present-30d"  // really "most recent update minus 30 days"
  • end_time = "present+10d"  // really "most recent update plus 10 days"
  • update_interval = 48 hours    // (fussy detail:  it is an interval not a frequency)
  • update_interval_reference_time = "2010-05-01T03:00:00+01:00"
Presumably ISO 8601 provides the authoritative way to encode these bits of time information ... if the fuzzy notion of "present" we see here is covered.  "Present" here really means "most recent update".
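
To make the schematic above concrete, here is a minimal Python sketch of how a discovery client might turn those four values into an absolute time range.  It is only an illustration: the function names are hypothetical, the "now" timestamp is just an example, and nothing here is an existing TDS or NACDD feature.  It assumes that updates arrive exactly on schedule.

```python
from datetime import datetime, timedelta, timezone

def most_recent_update(now, reference_time, update_interval):
    """Return the latest scheduled update at or before `now`.

    Hypothetical helper: assumes updates land exactly on the schedule
    defined by reference_time + k * update_interval.
    """
    if now < reference_time:
        raise ValueError("now precedes the update reference time")
    cycles = (now - reference_time) // update_interval  # whole cycles elapsed
    return reference_time + cycles * update_interval

def coverage_window(latest_update, lookback, lookahead):
    """Resolve "present-30d" / "present+10d" style bounds,
    where "present" means the most recent update."""
    return latest_update - lookback, latest_update + lookahead

# Values taken from the schematic; `now` is an example timestamp.
reference = datetime.fromisoformat("2010-05-01T03:00:00+01:00")
interval = timedelta(hours=48)
now = datetime(2010, 6, 10, 12, 4, 57, tzinfo=timezone.utc)

latest = most_recent_update(now, reference, interval)
start, end = coverage_window(latest, timedelta(days=30), timedelta(days=10))

print("latest update:", latest.isoformat())
print("start_time:   ", start.isoformat())
print("end_time:     ", end.isoformat())
```

The point of the sketch is that a harvester holding the interval and reference time can recompute the range on its own, without re-reading the catalog after every update.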
   
    - Steve

=============================

John Caron wrote:
On 6/10/2010 10:18 AM, Steve Hankin wrote:
Hi John,

Thanks for weighing in. Helpful. Since you ended in "Not sure if I covered all the issues", can we touch back to see what this says about the original issue that Rich raised.

The choice to have TDS translate
      <end>present</end>
      <duration>7 days</duration>
into
      Start: 2010-06-03 12:04:57Z
      End: 2010-06-10 12:04:57Z
      Duration: 7 days
has implications for data discovery services and crawling.  While the first encoding ("present" with a duration) remains true when new files are added to the underlying aggregation, the second encoding has to be altered or it becomes out of date.   Does Unidata envision that metadata harvesters will ping these datasets on a regular basis to get the updated information?  Is there (or should there be) metadata in the THREDDS catalog to tell crawlers which datasets require periodic pinging and at what frequency?  Is RAMADDA sensitive to these issues?  In short, what are your thoughts on the data discovery process for datasets that extend to "the present"?
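
The translation described above can be sketched roughly as follows.  This is a hypothetical Python illustration, not TDS code: the function name and argument shapes are assumptions.  It shows why the second encoding goes stale, since the absolute values depend on the clock at the moment of the request.

```python
from datetime import datetime, timedelta, timezone

def resolve_time_coverage(end, duration, now):
    """Turn an (end, duration) pair into absolute start/end times.

    Hypothetical helper: `end` is either the literal string "present"
    or an ISO 8601 timestamp; "present" is resolved against `now`.
    """
    end_time = now if end == "present" else datetime.fromisoformat(end)
    return end_time - duration, end_time

# Request time matching the example in the message above.
now = datetime(2010, 6, 10, 12, 4, 57, tzinfo=timezone.utc)
start_time, end_time = resolve_time_coverage("present", timedelta(days=7), now)

stamp = "%Y-%m-%d %H:%M:%SZ"
print("Start:", start_time.strftime(stamp))
print("End:", end_time.strftime(stamp))
```

Calling the same function a day later yields a different Start and End, while the ("present", 7 days) pair itself never changes.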

Hi Steve:

Good questions that I obviously haven't thought all the way through. Data Discovery should want the first form, since then there's no need to refresh the actual interval. What Data Discovery tools do you want to use?

I'm sorry, I don't know what RAMADDA does, but we could post the question to the RAMADDA email group.

-- 
Steve Hankin, NOAA/PMEL -- address@hidden
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744

"The only thing necessary for the triumph of evil is for good men
to do nothing." -- Edmund Burke