Re: Proposed new specification for THREDDSS Catalogs

To: Roland Schweitzer <Roland.Schweitzer@xxxxxxxx>
Subject: Re: Proposed new specification for THREDDSS Catalogs
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Wed, 05 May 2004 12:25:44 -0600

Roland Schweitzer wrote:

John,

John Caron wrote:
Roland Schweitzer wrote:
John,
I have a question about the THREDDS Dataset Inventory Catalog XML. I don't intend this as a criticism, but rather I'm curious about the choices and trade-offs. All of us that are messing around with XML are wrestling with similar issues.
In general, it seems that relationships between elements in the XML are done via attributes. For example, a <service> element is referred to in the document via the serviceName attribute in the <dataset> element. And a <dataset> element can be repeated by referencing the name of another <dataset> element via the alias attribute.
It seems to me that using this technique then requires that client code must be written to follow these connections. By contrast, it seems that the XML community has attempted to create languages (like XPointer) that would "standardize" these sorts of references. Admittedly, even though the XPointer recommendation is a year old, I have not found (m)any implementations in general purpose XML software.
Can you please comment on these choices and trade-offs for defining the internal connections between bits of XML that went into developing the Inventory Catalog?
Thanks,
Roland
Hi Roland:

<excuse> Sorry it's taken me so long to answer this </excuse>
Anyway, it's not clear that the XPointer spec will become an official standard. XPath seems usable though, and i am open to it. Both the serviceName and the alias = dataset ID are more or less the simple case of XPath using IDs. I think using IDs for datasets is so useful that it should probably be required. Which I would do if we could do so and still allow the minimal datasets like the DODS File Server. This ID reference is so simple that even DTDs have it.
So I'd say full XPath is a bit of overkill right now, but i am open to using it in the future. Do you foresee any new features that might need it?
No excuses needed and no worries.
I don't have any particular features in mind that require full XPath, but my question was directed at the idea that we should get the most bang for the buck that we can out of the validation of documents. In the new catalog schema, every attribute (except name) is optional on the dataset element. This means, simple catalogs are possible. But, I think it also means that there is no way from simply validating the XML to guarantee that the alias references are available in the document. This is a valid document (according to the schema and XMLSpy):
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="blah blah blah">
   <dataset name="billy" ID="b1"/>
   <dataset name="pointer to nothing" alias="sam"/>
</catalog>
even though the dataset named "pointer to nothing" does just that.
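
To make the dangling reference concrete, here is a minimal Python sketch of the kind of lookup client code would have to do to follow an alias. It is only an illustration, not the THREDDS client library: the function name resolve_alias is hypothetical, and the placeholder namespace and the XML declaration are dropped from the sample so that the path expression stays short.

```python
import xml.etree.ElementTree as ET

# The sample catalog from the message. Assumption: the placeholder
# xmlns and the XML declaration are omitted to keep the sketch small.
CATALOG = """<catalog>
   <dataset name="billy" ID="b1"/>
   <dataset name="pointer to nothing" alias="sam"/>
</catalog>"""


def resolve_alias(root, alias):
    """Return the dataset element whose ID equals alias, or None.

    Hypothetical helper. It uses the XPath subset that ElementTree
    supports, and it does not escape quotes inside the alias value.
    """
    return root.find(".//dataset[@ID='%s']" % alias)


root = ET.fromstring(CATALOG)
target = resolve_alias(root, "sam")
print("resolved" if target is not None else "dangling alias")
```

The lookup itself is a one-line path expression, but nothing forces a client to run it, which is why a schema-level check would be preferable.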

I'll be the first to admit I'm not even sure if what I'm thinking about is possible, but I think if there were some way to use the "standard" constructs of XML to enforce the relationship between dataset elements with alias attributes and the dataset elements to which they refer it would somehow be "better". I assume when you "validate" a document with your client library you enforce this relationship, but it seems it might be "better" if an off the shelf validation code (like XML Spy) could enforce this relationship. As I said, I don't know if it is possible and I'm trying to figure this out for XML I'm designing so I'm hoping to benefit from our discussion and your experience designing these catalogs.
Thanks,
Roland

i agree with you on all this; we continue to try to use standard validation as much as possible.

on this particular example, we actually now can validate this, (with the latest version of the schema put out about a week ago and cleverly not announced to anyone yet ;^) at


 http://www.unidata.ucar.edu/schemas/thredds/InvCatalog.1.0.xsd

the way it works is using the "keyref" constraint:

<!--
  Enforce dataset ID references:
    1) Each dataset ID must be unique in the document.
    2) Each dataset alias must reference a dataset ID in the document.
-->
<xsd:unique name="datasetID">
  <xsd:selector xpath=".//dataset" />
  <xsd:field xpath="@ID" />
</xsd:unique>

<xsd:keyref name="datasetAlias" refer="datasetID">
  <xsd:selector xpath=".//dataset" />
  <xsd:field xpath="@alias" />
</xsd:keyref>
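
For a validator that does not enforce these constraints, the two rules in the comment can be approximated outside the schema. The following is a rough Python sketch under stated assumptions, not what any validator or the THREDDS library actually does: check_dataset_refs and local_name are hypothetical names, the namespace URI in the sample is invented, and elements are matched by local name only.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def check_dataset_refs(xml_text):
    """Return a list of ID/alias violation messages (empty if none)."""
    root = ET.fromstring(xml_text)
    datasets = [e for e in root.iter() if local_name(e.tag) == "dataset"]
    problems = []

    # Rule 1 (mirrors xsd:unique): an ID, where present, occurs once.
    id_counts = Counter(
        d.get("ID") for d in datasets if d.get("ID") is not None
    )
    for ds_id, count in sorted(id_counts.items()):
        if count > 1:
            problems.append("duplicate ID %r" % ds_id)

    # Rule 2 (mirrors xsd:keyref): an alias, where present, matches an ID.
    for d in datasets:
        alias = d.get("alias")
        if alias is not None and alias not in id_counts:
            problems.append("alias %r does not match any dataset ID" % alias)
    return problems


# Assumption: "urn:example:catalog" is a made-up namespace URI.
SAMPLE = """<catalog xmlns="urn:example:catalog">
   <dataset name="billy" ID="b1"/>
   <dataset name="pointer to nothing" alias="sam"/>
</catalog>"""

for message in check_dataset_refs(SAMPLE):
    print(message)
```

As with the schema constraints, datasets that carry neither an ID nor an alias are simply ignored, so minimal catalogs still pass.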

interestingly enough, it appears that Xerces is not yet handling this constraint, but XMLSpy seems to. I haven't yet tracked this down, or found out if i need a more current version of Xerces. (i didn't get a chance to try this on your example, let me know if you do...)

IMO, schemas are still bleeding-edge; i'm hoping they get more mature soon. there's a lot of sentiment against W3C Schema; i toyed with Relax-NG as an alternative. Just have to keep trying different stuff for now....

Follow-Ups:
- Re: Proposed new specification for THREDDSS Catalogs
  - From: Roland Schweitzer

References:
- Proposed new specification for THREDDSS Catalogs
  - From: John Caron
- Re: Proposed new specification for THREDDSS Catalogs
  - From: Roland Schweitzer
- Re: Proposed new specification for THREDDSS Catalogs
  - From: John Caron
- Re: Proposed new specification for THREDDSS Catalogs
  - From: Roland Schweitzer
