Re: latest Catalog XML



You are quite right: going to the extreme of one dataset per XML document is
not very efficient. Being able to make a THREDDS catalog that contains a single
URL does not mean that I am going to use it for all datasets; that is not my
plan. I just want the ability to do it should it seem appropriate. Given that
one server's collection is another server's dataset, you really ought to allow
it.


I think I can paraphrase your point about dataset aliases this way: since one
can create an alias for a dataset, it does not matter whether the document that
contains it also contains a useless collection. True in a functional sense, I
suppose, but it makes for rather ugly code when one is forced to put in a
collection even when it has no meaning as a desired level of nesting.


At the moment, I am requesting two small changes: let a catalog contain a
collection or a dataset, and let suffix be a property of a service.   The first
one may not be particularly useful, but it is quite minor.   The second one is
extremely useful to me:  I cannot make use of compound services without it.    
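
As a rough sketch of the two changes together, reusing the names from Ethan's
cat2.xml example quoted below: the suffix attribute name, its value, the
placement of service directly under catalog, and the idea that the access URL
would be composed as base + urlPath + suffix are all assumptions made for
illustration, not anything defined by the 0.6d DTD.

```xml
<!-- Hypothetical sketch only; this does not validate against the 0.6d DTD. -->
<catalog name="catalog 2" version="0.6d">
  <!-- Requested change 2: suffix as a property of the service.
       The attribute name and the ".example" value are made up. -->
  <service name="motherlode" serviceType="DODS"
           base="http://motherlode.ucar.edu/cgi-bin/dods/nph-dods"
           suffix=".example" />
  <!-- Requested change 1: the catalog contains a dataset directly,
       with no wrapper collection around it. -->
  <dataset name="dataset 1" ID="ds1" dataType="Grid">
    <access serviceName="motherlode" urlPath="ds1.nc" />
  </dataset>
</catalog>
```

A catalogRef pointing at a document like this would then dereference straight
into the dataset, with no meaningless collection level in between.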

Please Please Please

Benno





Quoting Ethan Davis <address@hidden>:

> Hi Benno,
> 
> At first I thought I agreed with you on the dataset aliasing issue. Now I'm not
> so sure. It seems like the problem boils down to the ability to reference a
> dataset in an external catalog. Currently, the only external reference is a
> catalogRef, which dereferences into a collection. One solution would be to
> switch things around in one way or another so that a catalogRef could be
> dereferenced into a dataset. On the other hand, an aliased dataset already gets
> dereferenced into a dataset. There are two issues at this point: first, dataset
> aliases are currently restricted to local datasets; second, a dataset isn't
> complete on its own (i.e., it doesn't necessarily contain a service). But that
> shouldn't matter: the client would need to parse the entire catalog doc to find
> the referenced dataset, so the dereferencing step could also grab the required
> service. Here's an example:
> 
> Top level catalog (cat1.xml) that references datasets (sorry, I don't know
> XPointer well so I'm faking the actual references):
> 
> <catalog name="catalog 1" version="0.6d">
>   <collection name="collection 1">
>     <service name="motherlode" serviceType="DODS"
>              base="http://motherlode.ucar.edu/cgi-bin/dods/nph-dods" />
>     <dataset alias="http://.../cat2.xml#id(ds1)" />
>     <dataset alias="http://.../cat3.xml#id(ds2)" />
>     <dataset alias="http://.../cat4.xml#id(ds3)" />
>   </collection>
> </catalog>
> 
> Lower level catalog (cat2.xml) that contains info on a single dataset that is
> referenced by cat1.xml:
> 
> <catalog name="catalog 2" version="0.6d">
>   <collection name="collection 2">
>     <service name="motherlode" serviceType="DODS"
>              base="http://motherlode.ucar.edu/cgi-bin/dods/nph-dods" />
>     <dataset name="dataset 1" ID="ds1" dataType="Grid">
>       <access serviceName="motherlode" urlPath="ds1.nc" />
>       <dataset alias="http://.../cat5.xml#id(ds5)" />
>       <dataset alias="http://.../cat6.xml#id(ds6)" />
>     </dataset>
>   </collection>
> </catalog>
> 
> Not sure if this is cleaner than figuring out how to get catalogRefs to
> dereference into datasets rather than collections. The incompleteness of a
> dataset seems to make it a bit harder on the client side. What does everyone
> think?
> 
> 
> Also, one comment on efficiency of transmission. I certainly agree that it
> won't always be reasonable to represent an entire site's holdings in a single
> THREDDS catalog. But doesn't going to one dataset per step go to the other
> extreme, which could also be somewhat inefficient? It certainly side-steps any
> opportunity to reduce the number of HTTP requests involved in the client's
> navigation of the hierarchy. Obviously, though, it will sometimes be desirable
> to do just that.
> 
> Ethan
> 
> Benno Blumenthal wrote:
> > 
> > Hi Ethan,
> > 
> > I don't think so, but I don't know what you are talking about either, so
> > that is not definitive. Let me remind you what I am trying to do. I contend
> > (though people do not necessarily believe me) that it is not reasonable to
> > expect the THREDDS catalog at a particular site to fit into a single file.
> > It is certainly grossly inefficient to transmit the whole tree when the user
> > is only interested in a small branch. I think the IRI Data Library catalog
> > is one of those catalogs. The way I handle that is to transmit one level of
> > nesting at a time, using catalogRef to point to the sub-catalogs. Those
> > sub-catalogs may contain a collection or a dataset: there is no real
> > difference as far as Ingrid is concerned, though I am perfectly happy to
> > call objects without DODS access collections and objects with DODS access
> > datasets. If you use dataset aliasing, then you still have the problem of
> > creating a THREDDS document that describes a dataset within a collection,
> > which means a catalog must be able to contain a single dataset. No real
> > progress.
> > 
> > Benno
> > 
> > 
> > Ethan Davis wrote:
> > 
> > > Benno Blumenthal wrote:
> > > >
> > > > This does not quite work for me, unless you allow a catalog to
> > > > contain exactly one collection or exactly one dataset. Right
> > > > now the only thredds thing that can be pointed to is a catalog,
> > > > with the added provision that a catalog with one collection is
> > > > effectively that collection -- I would like to point to datasets
> > > > with thredds, which requires either that a catalog can point to
> > > > a dataset or a collection, or that you extend collection to be
> > > > isomorphic to dataset.   The catalog modification would be
> > > > preferable.
> > >
> > > Currently dataset aliasing is confined to the current document. What do
> > > people think of extending dataset aliasing to reference datasets external
> > > to the current document? It means a bit higher maintenance risk as far as
> > > broken links, but not much more than the catalogRef already implies. Would
> > > that work for you, Benno?
> > >
> > > Ethan
> > >
> > > > > 6) access element can specify an absolute URL with a serverType -or- a
> > > > > relative URL with a serverID.
> > > > >
> > > >
> > > > If you are going to have a base element in service, it is only fair that
> > > > you have a suffix element, too.  Please leave the suffix element in.
> > >
> > > --
> > > Ethan R. Davis                       Telephone: (303) 497-8155
> > > Software Engineer                    Fax:       (303) 497-8690
> > > UCAR Unidata Program Center          E-mail:    address@hidden
> > > P.O. Box 3000
> > > Boulder, CO  80307-3000             http://www.unidata.ucar.edu/
> > 
> > --
> > Dr. M. Benno Blumenthal          address@hidden
> > International Research Institute for climate prediction
> > Lamont-Doherty Earth Observatory of Columbia University
> > Palisades NY 10964-8000                  (845) 680-4450
> > 
> > 
> 
> -- 
> Ethan R. Davis                       Telephone: (303) 497-8155
> Software Engineer                    Fax:       (303) 497-8690
> UCAR Unidata Program Center          E-mail:    address@hidden
> P.O. Box 3000
> Boulder, CO  80307-3000              http://www.unidata.ucar.edu/
>