Re: orthogonality (was Re: New attempt)

----- Original Message -----
Sent: Thursday, June 06, 2002 1:18 PM

> Hi John,
> I didn't respond directly to all the questions you asked but I hope that
> what I wrote is sufficient...
> John Caron wrote:
> snip..
> >
> >>Thus I agree with benno that there is not a very
> >>meaningful distinction between them (and reconsider my listing of them
> >>as orthogonal concepts in my previous message).
> >>
> >>I wonder if it would be a good idea to merge these concepts and use a
> >>less loaded word, say "entry", to refer to an entity that has meaning to
> >>THREDDS and to end users, but not to a data access protocol, i.e.
> >>
> >><catalog>
> >><service name="X"/>
> >><service name="Y"/>
> >>...
> >>
> >><entry name="my_dataset">
> >>
> >>    <metadata name="global-metadata" url="..."/>
> >>    <access name="global-X-access"/>
> >>
> >>    <entry name="monthly-data">
> >>      <metadata name="monthly-metadata" url="..."/>
> >>      <access name="X-with-COARDS" serviceType="X" url="..."/>
> >>      <access name="X-with-no-COARDS" serviceType="X" url="..."/>
> >>      <access name="X-flattened-to-2D" serviceType="X"
> >>      <access name="Y" serviceType="Y" url="..."/>
> >>      ....
> >>    </entry>
> >>
> >>
> >></entry>
> >>
> >
> > Ok so an "entry" meets meaning 1), while an "access" meets meaning 3)
> > (don't need to worry about meaning 2) here).
> >
> > Some questions:
> >
> > 1) Should we understand that all the access elements within an entry are
> > different versions of the same dataset? Should we disallow:
> >
> >      <entry name="monthly-data">
> >        <metadata name="monthly-metadata" url="..."/>
> >        <access name="monthly-data from MARS" serviceType="X" url="..."/>
> >        <access name="monthly-data from VENUS" serviceType="X"
> >      </entry>
> >
> No, I was not implying that for an <entry> tag. I would allow your
> > 2) Is there any relationship between peer elements, in your example
> >
> >      <access name="global-X-access"/>
> >      <entry name="monthly-data">
> Not necessarily.
> I think what I am trying to suggest is that, while it may be useful for
> humans to think of some consistent object being accessed via different
> services, this really does not translate to anything meaningful at
> the machine level.
> Unless we actually try to define some machine-readable relationship
> between the accesses (e.g. Type 1 aggregation, etc - which gets into the
> whole data model can of worms) the only thing a machine can understand
> is a named and described hierarchy of access objects.
> Of course, something is being lost here from the human's point of view.
> Humans seem to want to make a distinction that is not significant to
> machines:
> "a collection of accesses to some single underlying object"
> vs
> "a collection of accesses to different underlying objects, that share
> some common theme"
> Is this what <dataset> and <collection> have been intended to mean?


Machines aren't smart enough to know (i.e. make use of) the fact that this GIF
and this DDS are representations of the same underlying object. But
conveying that fact to users can be useful to them.
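
To make the "named and described hierarchy of access objects" point concrete, here is a rough Python sketch of what a client could extract from an <entry>-style catalog without any further semantics. It is only an illustration: the catalog is a well-formed variant of the example quoted above, the url values are placeholders, and the function names walk and hierarchy are made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the catalog sketched earlier in the thread.
# The url values are placeholders (assumptions), not real endpoints.
CATALOG = """
<catalog>
  <service name="X"/>
  <service name="Y"/>
  <entry name="my_dataset">
    <metadata name="global-metadata" url="http://example.org/meta/global"/>
    <access name="global-X-access"/>
    <entry name="monthly-data">
      <metadata name="monthly-metadata" url="http://example.org/meta/monthly"/>
      <access name="X-with-COARDS" serviceType="X" url="http://example.org/x/coards"/>
      <access name="Y" serviceType="Y" url="http://example.org/y"/>
    </entry>
  </entry>
</catalog>
"""


def walk(entry, depth=0):
    """Yield (depth, entry name, access names) for an entry and its nested entries."""
    accesses = [a.get("name") for a in entry.findall("access")]
    yield depth, entry.get("name"), accesses
    for child in entry.findall("entry"):
        yield from walk(child, depth + 1)


def hierarchy(xml_text):
    """Return the flattened entry hierarchy of a catalog document."""
    root = ET.fromstring(xml_text)
    rows = []
    for top in root.findall("entry"):
        rows.extend(walk(top))
    return rows


if __name__ == "__main__":
    for depth, name, accesses in hierarchy(CATALOG):
        print("  " * depth + f"{name}: {accesses}")
```

Nothing in this walk can tell whether the accesses under one entry point at the same underlying object or at different ones; that information would have to come from the tag name or from metadata aimed at the user.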

> If this is the case then I would suggest that
> a) this distinction be preserved by allowing both tags to be
> used (possibly renamed if it would clarify things); and
> b) data providers should be encouraged to mark up their catalogs
> appropriately using the two tags, so that THREDDS client UIs can take
> advantage of this to present catalogs in an intuitive way; but
> c) these tags should be completely interchangeable in all other ways
> (i.e. same type in the DTD/Schema, and same API calls, any tag that can
> go in a dataset can also go in a collection), since they are
> semantically equivalent at a machine level.
> Does that make any sense? Benno, would that satisfy you?
> - Joe (ready for a checkup with my ontologist)

Actually I'm inclined to take it a bit further.

Currently a collection is just some collection of datasets that share some
common theme. If we allow it also to be a dataset (meaning it has a URL, can
be selected, etc.) then I think it should have the meaning that contained
datasets are subsets or specializations of it. Because if they are not, it
seems to me that you might as well just represent the collection-as-dataset
as a contained dataset element. [Maybe in this whole discussion I have been
trying to convince myself of that :^] Does everyone agree with that meaning
of nested datasets inside of collection-as-dataset?

PS: There are still semantic differences between collections and datasets: A
dataset has one or more access elements, a collection 0 or more. Collections
contain datasets and nested collections.
OTOH, datasets and collections look so similar already in the XML, it's
tempting to combine them (which I was playing with earlier in
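
The cardinality rule in the PS (a dataset has one or more access elements, a collection zero or more) could be checked roughly as below. This is a sketch under assumptions: the name attribute on <dataset> and <collection>, the sample document, and the function find_problems are all hypothetical and not taken from any THREDDS DTD.

```python
import xml.etree.ElementTree as ET


def find_problems(element, path=""):
    """Return messages for every dataset that has no access element."""
    here = f"{path}/{element.tag}[{element.get('name')}]"
    problems = []
    if element.tag == "dataset" and not element.findall("access"):
        problems.append(f"{here}: a dataset needs at least one access element")
    # A collection may hold zero or more access elements, so nothing to check there.
    for child in element:
        if child.tag in ("dataset", "collection"):
            problems.extend(find_problems(child, here))
    return problems


# Hypothetical sample; the url value is a placeholder.
SAMPLE = """
<collection name="model-runs">
  <dataset name="monthly-data">
    <access name="X" serviceType="X" url="http://example.org/x"/>
  </dataset>
  <collection name="empty-theme"/>
  <dataset name="no-access"/>
</collection>
"""

if __name__ == "__main__":
    for message in find_problems(ET.fromstring(SAMPLE)):
        print(message)
```

If the two tags were merged into one, this check is the only thing that would have to go, which is one way of seeing how small the machine-level difference is.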

Date: 17 Sep 2001 09:35:04 -0700
From: James Gallagher <jgallagher@xxxxxxxxxxx>
To: John Caron <caron@xxxxxxxxxxxxxxxx>
Subject: Re: Aggregation Server Configuration
Cc: DODS Technical Discussions <dods-tech@xxxxxxxxxxxxxxxx>,


For type 1 and 3 aggregations, how do you handle the case where DAS
values (that is, attribute values) vary between datasets?
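
To illustrate the situation the question is about (this does not describe what the Aggregation Server actually does), here is a small Python sketch that splits attributes into those that agree across all member datasets and those that vary. Plain dicts stand in for the DAS attribute tables, and split_attributes is a made-up helper, not a DODS call.

```python
def split_attributes(attribute_sets):
    """Split attributes into (common, varying) across a list of attribute dicts.

    common maps a name to the single value shared by every dataset;
    varying maps a name to the list of per-dataset values (None if missing).
    """
    names = sorted({name for attrs in attribute_sets for name in attrs})
    common, varying = {}, {}
    for name in names:
        values = [attrs.get(name) for attrs in attribute_sets]
        if all(value == values[0] for value in values):
            common[name] = values[0]
        else:
            varying[name] = values
    return common, varying


if __name__ == "__main__":
    # Hypothetical attribute tables for two datasets being aggregated.
    first = {"units": "degC", "long_name": "sea surface temperature"}
    second = {"units": "K", "long_name": "sea surface temperature"}
    common, varying = split_attributes([first, second])
    print("common:", common)
    print("varying:", varying)
```

What to do with the varying set (reject the aggregation, keep the first value, expose all of them) is exactly the open question.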


On 14 Sep 2001 17:18:16 -0600, John Caron wrote:
> Version 0.4 of the Aggregation Server Configuration XML format is available
> for your comments. Changes (partly due to Joe Sirott's comments) include
> renaming some attributes, the use of XML namespaces, and factoring out the
> aggregation definitions from the Catalog.
> Version 0.4 is being used in the latest Agg Server; we are not completely
> happy with it but it is working. We would appreciate any comments on it.
> The Aggregation Server itself has some sketchy documentation at:

James Gallagher                  The Distributed Oceanographic Data System
Voice: 775.337.8612                                      Fax: 775.337.2105