
Re: orthogonality (was Re: New attempt)



----- Original Message -----
From: "Joe Wielgosz" <address@hidden>
To: "John Caron" <address@hidden>
Sent: Thursday, June 06, 2002 1:18 PM
Subject: Re: orthogonality (was Re: New attempt)


> Hi John,
>
> I didn't respond directly to all the questions you asked but I hope that
> what I wrote is sufficient...
>
> John Caron wrote:
>
> snip..
>
>
> >
> >>Thus I agree with benno that there is not a very
> >>meaningful distinction between them (and reconsider my listing of them
> >>as orthogonal concepts in my previous message).
> >>
> >>I wonder if it would be a good idea to merge these concepts and use a
> >>less loaded word, say "entry", to refer to an entity that has meaning to
> >>THREDDS and to end users, but not to a data access protocol, i.e.
> >>
> >><catalog>
> >><service name="X"/>
> >><service name="Y"/>
> >>...
> >>
> >><entry name="my_dataset">
> >>
> >>    <metadata name="global-metadata" url="..."/>
> >>    <access name="global-X-access"/>
> >>
> >>    <entry name="monthly-data">
> >>      <metadata name="monthly-metadata" url="..."/>
> >>      <access name="X-with-COARDS" serviceType="X" url="..."/>
> >>      <access name="X-with-no-COARDS" serviceType="X" url="..."/>
> >>      <access name="X-flattened-to-2D" serviceType="X" url="http://..."/>
> >>      <access name="Y" serviceType="Y" url="..."/>
> >>      ....
> >>    </entry>
> >>
> >>
> >></entry>
> >>
> >
> > Ok so an "entry" meets meaning 1), while an "access" meets meaning 3)
> > (we don't need to worry about meaning 2) here).
> >
> > Some questions:
> >
> > 1) Should we understand that all the access elements within an entry are
> > different versions of the same dataset? Should we disallow:
> >
> >      <entry name="monthly-data">
> >        <metadata name="monthly-metadata" url="..."/>
> >        <access name="monthly-data from MARS" serviceType="X" url="..."/>
> >        <access name="monthly-data from VENUS" serviceType="X" url="..."/>
> >      </entry>
> >
>
>
> No, I was not implying that for an <entry> tag. I would allow your
> example.
>
>  > 2) is there any relationship between peer elements, in your example
>  >
>  >      <access name="global-X-access"/>
>  >      <entry name="monthly-data">
>
> Not necessarily.
>
> I think what I am trying to suggest is while it may be useful for humans
> to think of some consistent object being accessed via different
> services, this really does not translate into anything meaningful at
> the machine level.
>
> Unless we actually try to define some machine-readable relationship
> between the accesses (e.g. Type 1 aggregation, etc - which gets into the
> whole data model can of worms) the only thing a machine can understand
> is a named and described hierarchy of access objects.
>
> Of course, something is being lost here from the human's point of view.
> Humans seem to want to make a distinction that is not significant to
> machines:
>
> "a collection of accesses to some single underlying object"
> vs
> "a collection of accesses to different underlying objects, that share
> some common theme"
>
> Is this what <dataset> and <collection> have been intended to mean?

Yes.

Machines aren't smart enough to know (i.e. make use of) the fact that this GIF
and this DDS are representations of the same underlying object. But
conveying that fact to users can be useful to them.
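
To make the "named and described hierarchy of access objects" concrete,
here is a rough sketch of what a client can actually extract from such a
catalog. It assumes the hypothetical <entry>/<access> markup from Joe's
example above (not an agreed THREDDS schema); the function name
list_accesses and the shortened sample catalog are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog using the <entry>/<access> markup proposed in this
# thread; the names and "..." URLs are placeholders, not a real schema.
CATALOG = """
<catalog>
  <service name="X"/>
  <service name="Y"/>
  <entry name="my_dataset">
    <metadata name="global-metadata" url="..."/>
    <access name="global-X-access"/>
    <entry name="monthly-data">
      <metadata name="monthly-metadata" url="..."/>
      <access name="X-with-COARDS" serviceType="X" url="..."/>
      <access name="Y" serviceType="Y" url="..."/>
    </entry>
  </entry>
</catalog>
"""


def list_accesses(node, path=()):
    """Return (entry path, access name) pairs for every access in the tree."""
    found = []
    for child in node:
        if child.tag == "access":
            found.append(("/".join(path), child.get("name")))
        elif child.tag == "entry":
            # Descend into nested entries, extending the path with their name.
            found.extend(list_accesses(child, path + (child.get("name"),)))
    return found


accesses = list_accesses(ET.fromstring(CATALOG))
for entry_path, access_name in accesses:
    print(entry_path, "->", access_name)
```

Nothing in the walk says whether two accesses under one entry are the same
underlying object or merely related; that information exists only in the
names and metadata meant for human readers.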

>
> If this is the case then I would suggest that
>
> a) this distinction be preserved by allowing both tags to be
> used (possibly renamed if it would clarify things); and
>
> b) data providers should be encouraged to mark up their catalogs
> appropriately using the two tags, so that THREDDS client UIs can take
> advantage of this to present catalogs in an intuitive way; but
>
> c) these tags should be completely interchangeable in all other ways
> (i.e. same type in the DTD/Schema, and same API calls, any tag that can
> go in a dataset can also go in a collection), since they are
> semantically equivalent at a machine level.
>
> Does that make any sense? Benno, would that satisfy you?
>
> - Joe (ready for a checkup with my ontologist)

Actually, I'm inclined to take it a bit further.

Currently a collection is just some collection of datasets that share some
common theme. If we also allow it to be a dataset (meaning it has a URL, can
be selected, etc.), then I think it should mean that the contained datasets
are subsets or specializations of it. If they are not, it seems to me that
you might as well just represent the collection-as-dataset as a contained
dataset element. [Maybe in this whole discussion I have been trying to
convince myself of that :^] Does everyone agree with that meaning of nested
datasets inside a collection-as-dataset?

PS: There are still semantic differences between collections and datasets: a
dataset has one or more access elements, a collection zero or more.
Collections contain datasets and nested collections.
OTOH, datasets and collections already look so similar in the XML that it's
tempting to combine them (which I was playing with earlier in
http://www.unidata.ucar.edu/projects/THREDDS/xml/InvCatalog.0.6a.dtd)
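
The cardinality rule in the PS could be checked mechanically. This is only
a minimal sketch: it assumes element names <dataset>, <collection> and
<access> with a name attribute, which may not match the actual DTD, and the
helper name check and the sample catalog are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample; element and attribute names are assumptions, and the
# "..." URL is a placeholder.
SAMPLE = """
<catalog>
  <collection name="model-output">
    <dataset name="monthly"><access name="X" url="..."/></dataset>
    <dataset name="daily"/>
    <collection name="archive"/>
  </collection>
</catalog>
"""


def check(node):
    """Return messages for datasets that lack an access element."""
    problems = []
    for child in node:
        if child.tag == "dataset" and child.find("access") is None:
            # A dataset needs one or more access elements.
            problems.append(f"dataset {child.get('name')!r} has no access element")
        elif child.tag == "collection":
            # A collection may have zero accesses; only look inside it.
            problems.extend(check(child))
    return problems


problems = check(ET.fromstring(SAMPLE))
for message in problems:
    print(message)
```

The empty "archive" collection passes, while the "daily" dataset is flagged.
That is about the only difference between the two elements a machine can
enforce.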