Hi Ted,
Ted Habermann wrote:
We would be very interested in working with you to explore adding this
harvesting approach to GeoNetwork. I expect that there will be a
plethora of challenges. I am using this e-mail to collect and expose
my admittedly long-winded and certainly primordial thoughts on a
couple of them, and possibly to initiate discussion and evolution...
Content - My experience (could easily be incorrect) is that the
THREDDS community has really focused on “use” metadata which tends to
be relatively sparse (most importantly) and generally more customized.
This reflects the emergence of THREDDS from the scientific community
which traditionally shares that focus. As a result, I expect that the
threddsmetadata elements exist only in a small minority of catalogs.
This situation is exacerbated by the evolution of THREDDS towards
auto-generation of catalogs from file systems. I’m fairly sure that
this process does not involve opening the files (for performance
reasons) so metadata that might be in those files is generally not
incorporated in the catalog. I suspect that hand-hewn catalogs with
lots of metadata are rare. BTW - I suspect that the same obvious
(over-)generalization applies to the files that underlie most of these
catalogs (again I have no real quantitative evidence for this). There
are a few groups out there creating netCDF files with really
high-quality metadata content and that number may be growing, but it
is still small. This reflects the fact that most creators and users of
these files understand them pretty well and can generally use them
successfully with information gleaned from conversations or scientific
papers and presentations. The focus on high-quality standard metadata
generally comes more from archives and groups interested in the
preservation of understanding. This is a different group.
Yep - I think I understand where you are coming from. The projects I'm
working with (and the institution I work for) are trying to do the
high-quality standard metadata approach because they've realized its
importance for data management. Tools like THREDDS, GeoNetwork and other
OGC servers are to be introduced as part of a data management policy,
together with procedures, principles etc. This and links between the tools (such
as a THREDDS metadata harvester) are intended (however optimistic that
may sound!) to try and change the culture of "metadata for me and my
collaborators only". From this point of view, THREDDS has a number of
really useful features. Metadata inheritance in the catalog is one of
them, and the potential to use datasetScan catalogs (from filesystems or
even other OPeNDAP servers) to build enhanced catalogs with metadata
elements embedded is another.
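
To make the inheritance point concrete, here is a rough sketch of how a harvester might resolve inherited metadata when walking a catalog. It is only an illustration under stated assumptions: the sample catalog and the harvest() helper are made up for this example, it uses the InvCatalog 1.0 namespace, and it ignores catalogRef links and metadata pulled in via xlink:href, which a real harvester would have to follow.

```python
import xml.etree.ElementTree as ET

# InvCatalog 1.0 namespace used by THREDDS catalogs.
NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical catalog: the collection-level metadata is marked inherited="true".
SAMPLE = f"""<catalog xmlns="{NS}" name="demo">
  <dataset name="Model output">
    <metadata inherited="true">
      <documentation type="summary">Example model runs</documentation>
      <keyword>ocean</keyword>
    </metadata>
    <dataset name="run1.nc" urlPath="model/run1.nc"/>
    <dataset name="run2.nc" urlPath="model/run2.nc">
      <keyword>temperature</keyword>
    </dataset>
  </dataset>
</catalog>"""

# Child elements of <dataset> that are structure, not descriptive metadata.
STRUCTURAL = {"dataset", "access", "catalogRef", "service"}


def qualified(tag):
    """Return the namespace-qualified form of a tag name."""
    return f"{{{NS}}}{tag}"


def local(tag):
    """Strip the namespace from a qualified tag name."""
    return tag.split("}", 1)[-1]


def harvest(node, inherited=(), out=None):
    """Collect (element, text) metadata pairs for every dataset with a urlPath.

    Metadata inside <metadata inherited="true"> applies to the dataset and
    all of its descendants; anything else applies to that dataset only.
    """
    if out is None:
        out = []
    for ds in node.findall(qualified("dataset")):
        passed_down = list(inherited)  # copy so siblings do not share state
        own = []
        for child in ds:
            name = local(child.tag)
            if name == "metadata":
                fields = [(local(m.tag), (m.text or "").strip()) for m in child]
                if child.get("inherited") == "true":
                    passed_down.extend(fields)
                else:
                    own.extend(fields)
            elif name not in STRUCTURAL:
                own.append((name, (child.text or "").strip()))
        if ds.get("urlPath"):
            out.append({
                "name": ds.get("name"),
                "urlPath": ds.get("urlPath"),
                "metadata": passed_down + own,
            })
        harvest(ds, passed_down, out)
    return out


records = harvest(ET.fromstring(SAMPLE))
for record in records:
    print(record["name"], record["metadata"])
```

In this sketch both files pick up the collection-level summary and the "ocean" keyword, while only run2.nc carries "temperature". Each resulting record could then be mapped to a metadata record on the GeoNetwork side.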
I have to leave comment on the rest of your points until I've had more
time to think about them (apologies) and study your slides :-)
Cheers and thanks,
Simon