Re: Need THREDDS metadata catalog info

To: "Shishir S. Bharathi" <shishir@xxxxxxx>
Subject: Re: Need THREDDS metadata catalog info
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Thu, 19 Jun 2003 14:09:16 -0600

Shishir S. Bharathi wrote:

OK. I assumed that the PICats were also services. This clarifies things. What I meant by mapping was: at what level is the actual search performed, based on the required keywords ?
I'm trying to summarize how to get from a set of keywords to a data item (or set) that satisfies those conditions. Is this what happens ?
1. The data arrives from its source and is stored on a storage device.

yes. a lot of data is archival data, so it doesn't need to arrive.

2. Catalog generators mine this data and generate PICats (and also Dataset catalogs ? Are these different ?)

PICats are all the various THREDDS XML documents, including catalogs, aka "dataset catalogs". The Catalogs are pretty well defined; the other PICats we are still experimenting with.

2.1. Since the data can be of different forms, you generate metadata according to different schema, but the PICat itself adheres to a single schema.

yes. there are a lot of details here we are still prototyping.

3. PICat servers pull this information from the PICats
So what do PICat servers store ? XML documents like InvCatalog.0.6.xml, which is the PICat itself ?

Currently our prototype "PICAT Server", now called "Dataset Searcher", replicates the entire catalog, so it creates an in-memory database. Obviously this won't scale, and we will probably revisit it when scalability becomes an issue. We are considering relational databases, simple BTrees, and text indexing tools such as Lucene.
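To make the "replicate the whole catalog into memory" idea concrete, here is a minimal sketch in Python. It is not the Dataset Searcher code: the XML below is a made-up, simplified stand-in (it does not follow the real InvCatalog 0.6 schema), and `load_catalog` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified catalog for illustration only.
# This is NOT the real InvCatalog 0.6 schema.
SAMPLE_CATALOG = """
<catalog name="example">
  <dataset name="Model run A" id="modelA">
    <keyword>temperature</keyword>
    <keyword>forecast</keyword>
  </dataset>
  <dataset name="Radar mosaic" id="radar1">
    <keyword>reflectivity</keyword>
  </dataset>
</catalog>
"""


def load_catalog(xml_text):
    """Copy every dataset entry into an in-memory dict keyed by dataset id."""
    root = ET.fromstring(xml_text)
    datasets = {}
    for ds in root.iter("dataset"):
        datasets[ds.get("id")] = {
            "name": ds.get("name"),
            # Lower-case the keywords so later lookups are case-insensitive.
            "keywords": {kw.text.strip().lower() for kw in ds.findall("keyword")},
        }
    return datasets


datasets = load_catalog(SAMPLE_CATALOG)
```

Everything lives in one dict, which is simple but grows with the catalog; that is the part a relational database, a BTree, or a text index would replace.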

4. Query the PICat server with the keywords required
5. PICat server looks at the PICats and returns id of a Dataset Catalog
  How is this done ?

Currently just look for keyword matches. That part is easy. The space/time filtering is a bit harder. Our prototype just fits it all in memory, so scanning everything is no big deal. We are considering how to make this scalable for the next funding cycle.


we return a catalog of matches.
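A rough sketch of that kind of scan, in Python, might look like the following. It is only an illustration under assumptions: the `Dataset` record, its field names, and the `search` function are hypothetical, the bounding box is taken as (west, south, east, north) in degrees, and boxes that cross the dateline are not handled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Dataset:
    # Hypothetical record layout, not the THREDDS data model.
    id: str
    keywords: frozenset
    bbox: tuple  # (west, south, east, north) in degrees
    start: date
    end: date


def boxes_overlap(a, b):
    """True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(datasets, keywords, bbox=None, start=None, end=None):
    """Linear scan: keep datasets that carry all keywords and pass the filters."""
    wanted = {k.lower() for k in keywords}
    matches = []
    for ds in datasets:
        if not wanted <= ds.keywords:
            continue  # missing at least one required keyword
        if bbox is not None and not boxes_overlap(ds.bbox, bbox):
            continue  # outside the requested region
        if start is not None and ds.end < start:
            continue  # dataset ends before the requested period
        if end is not None and ds.start > end:
            continue  # dataset starts after the requested period
        matches.append(ds.id)
    return matches


catalog = [
    Dataset("modelA", frozenset({"temperature", "forecast"}),
            (-130, 20, -60, 55), date(2003, 6, 1), date(2003, 6, 19)),
    Dataset("radar1", frozenset({"reflectivity"}),
            (-110, 30, -90, 45), date(2003, 6, 18), date(2003, 6, 19)),
    Dataset("archiveT", frozenset({"temperature"}),
            (0, 40, 20, 60), date(1990, 1, 1), date(1999, 12, 31)),
]

hits = search(catalog, ["Temperature"], bbox=(-100, 35, -95, 40))
```

The scan touches every record on every query, which is fine while everything fits in memory and is exactly what an index would avoid later.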

6. Query the dataset catalog if needed.

same step as 5.


Is this about right ?

yup.


Thanks,
Shishir
