Hi Dan,
I was not sure whether there was a real distinction between an attribute
of an element and a subelement of an element in xml, which is why I
asked the question.
But on a deeper level .....,
THREDDS v1.0 has the wonderful innovation that any element can have a
vocabulary attribute, which specifies the controlled vocabulary for the
values of that element. I think this is the greatest thing. They also
have this variable example, where the variables element specifies the
controlled vocabulary for each variable within it -- this makes perfect
sense, but the grammar is less than ideal.
Going back to the example,
<variables
xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
<variable name="wv" vocabulary_name="Wind Speed" units="m/s"/>
<variable name="wdir" vocabulary_name="Wind Direction" units="degrees"/>
<variable name="o3c" vocabulary_name="Ozone Concentration" units="g/g"/>
...
</variables>
Conceptually, I would like to write this as
<variables
xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
<variable name="wv">
<vocabulary_name vocabulary="CF-1.0">Wind Speed</vocabulary_name>
<units vocabulary="udunits">m/s</units></variable>
<variable name="wdir">
<vocabulary_name vocabulary="CF-1.0">Wind Direction</vocabulary_name>
<units vocabulary="udunits">degrees</units></variable>
...
</variables>
So each element can specify its own controlled vocabulary, instead of
being stuck formulating a convention that covers all the attributes in
some vague way. Given a standard for transmitting vocabularies, I can
now write software to use that information: validating elements,
displaying additional information (e.g. a controlled vocabulary as an
indexed (keyed) table), allowing conversions (a units attribute can be
used for units conversions, a projection attribute can be used for
projection conversions). Conventions would then get constructed out of
sets of "controlled vocabularies", the advantage being that the
software can understand what a controlled vocabulary is, and written
once, can then understand many conventions. Of course, "controlled
vocabulary" needs to be broadened, particularly as sets of attributes
can interact and are more complicated than simple lists.
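
To make the validation idea concrete, here is a minimal sketch in Python of what such software might look like for the element form above. The VOCABULARIES table is a hypothetical stand-in (there is no standard for transmitting vocabularies yet), and the "furlongs" value is made up to show a failure; only the element and attribute names come from the example.

```python
import xml.etree.ElementTree as ET

NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical stand-ins for real controlled vocabularies; a real system
# would obtain these through whatever vocabulary-transmission standard
# is eventually agreed on.
VOCABULARIES = {
    "CF-1.0": {"Wind Speed", "Wind Direction"},
    "udunits": {"m/s", "degrees"},
}

DOC = f"""
<variables xmlns="{NS}">
  <variable name="wv">
    <vocabulary_name vocabulary="CF-1.0">Wind Speed</vocabulary_name>
    <units vocabulary="udunits">m/s</units>
  </variable>
  <variable name="wdir">
    <vocabulary_name vocabulary="CF-1.0">Wind Direction</vocabulary_name>
    <units vocabulary="udunits">furlongs</units>
  </variable>
</variables>
"""

def check_vocabularies(xml_text):
    """List (variable, element, value, vocabulary) for every subelement
    whose text is not in the vocabulary it declares."""
    problems = []
    root = ET.fromstring(xml_text)
    for variable in root.iter(f"{{{NS}}}variable"):
        for child in variable:
            vocab = child.get("vocabulary")
            if vocab is None:
                continue  # element makes no claim, so nothing to check
            value = (child.text or "").strip()
            if value not in VOCABULARIES.get(vocab, set()):
                tag = child.tag.split("}", 1)[-1]  # drop the namespace part
                problems.append((variable.get("name"), tag, value, vocab))
    return problems

print(check_vocabularies(DOC))
```

The point is that the checker never mentions units or vocabulary_name by name: it only knows what a vocabulary attribute is, so it is written once and works for any element that carries one.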
Part of this I suppose is personal perspective: I think we would get a
lot farther if we set up conventions a few attributes at a time. But
there is a practical side too: it gives us a way of marking a dataset
as obeying a convention with some exceptions. For example, CCM model
output, once it is in a netCDF file, comes marked as following the CF
conventions. The CF conventions start by saying thou shalt use
udunits-compatible units. However, the CCM output I have encountered
has very few units that udunits parses as it currently stands (mostly,
but not entirely, a case problem). At least this way I could mark the
units as not following the convention.
A more positive example is a variable that happens to contain ISO
standard country codes. So the dataset can be marked up according to
CF, plus I can specify that this particular variable's values have the
given controlled vocabulary, making it a whole lot more useful.
Now obviously, in this example it would be better to specify the
controlled vocabulary for both units and vocabulary_name at the
variables level. I just wish we could use a grammar that was as general
as allowing a vocabulary attribute for each element. Some sort of
element-specific inheritance, I suppose.
<vocabularies inherit="true">
<controlby vocabulary="udunits"><attribute>units</attribute></controlby>
<controlby vocabulary="scalethenadd">
<attribute>scale_factor</attribute>
<attribute>add_offset</attribute>
</controlby>
<controlby vocabulary="applyscale">
<attribute>value_min</attribute>
<attribute>value_max</attribute>
<attribute>scale_min</attribute>
<attribute>scale_max</attribute>
<attribute>missing_value</attribute>
</controlby>
</vocabularies>
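
As a rough sketch of how software might read a block like this, the Python below turns it into an attribute-to-vocabulary table. The meaning I give to inherit (start from the table of the enclosing level, then override) is only my assumption, and the inherited entries in the usage line are invented for illustration.

```python
import xml.etree.ElementTree as ET

BLOCK = """
<vocabularies inherit="true">
  <controlby vocabulary="udunits"><attribute>units</attribute></controlby>
  <controlby vocabulary="scalethenadd">
    <attribute>scale_factor</attribute>
    <attribute>add_offset</attribute>
  </controlby>
</vocabularies>
"""

def attribute_vocabularies(xml_text, inherited=None):
    """Map each attribute name to the vocabulary that controls it."""
    root = ET.fromstring(xml_text)
    # Assumed semantics: inherit="true" means start from the enclosing
    # level's mapping; anything else means start from scratch.
    if root.get("inherit") == "true":
        mapping = dict(inherited or {})
    else:
        mapping = {}
    for controlby in root.findall("controlby"):
        for attribute in controlby.findall("attribute"):
            # A local declaration overrides an inherited one.
            mapping[attribute.text.strip()] = controlby.get("vocabulary")
    return mapping

# Hypothetical inherited table from an enclosing level.
print(attribute_vocabularies(BLOCK, inherited={"long_name": "CF-1.0", "units": "SI"}))
```

Grouping several attributes under one controlby is what lets a vocabulary such as scalethenadd describe attributes that only make sense together.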
On the other hand, this might be going too far. Having attributes that
splice together two conventions might be done in the specification of
the vocabulary in THREDDS v1.0, since the convention for transmitting
vocabularies is to be determined. I could then make up my own
convention "almost CF", or "CF plus iso99999" and inherit all of CF plus
whatever changes are needed. Then we are back to exactly what you said
-- all we need to do is specify the proper convention for the whole set
of attributes. Of course we might end up with "per-dataset"
conventions, but since we can describe them in a standard way, perhaps
it is not too bad.
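
A small Python sketch of what I mean by a derived convention, with everything in it hypothetical: the registry layout, the "country" attribute name, and the use of None to mark an exception are all invented for illustration, since the way vocabularies get transmitted is still undecided.

```python
# Hypothetical registry: each convention names an optional parent and
# the attribute-to-vocabulary entries it adds or changes.
CONVENTIONS = {
    "CF": {
        "parent": None,
        "controls": {"units": "udunits", "vocabulary_name": "CF-1.0"},
    },
    "CF plus iso99999": {
        "parent": "CF",
        "controls": {"country": "iso99999"},
    },
    "almost CF": {
        "parent": "CF",
        # None marks an exception: units here do not follow udunits.
        "controls": {"units": None},
    },
}

def resolve(name):
    """Return the full attribute-to-vocabulary table for a convention,
    applying the parent's entries first and then its own."""
    entry = CONVENTIONS[name]
    merged = resolve(entry["parent"]) if entry["parent"] else {}
    merged.update(entry["controls"])
    return merged

print(resolve("CF plus iso99999"))
print(resolve("almost CF"))
```

A per-dataset convention is then just one more small entry in the registry, which is why I think it may not be too bad.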
The kicker (and I am so glad that you asked) is that what
I really want is to specify vocabularies for OpenDAP attributes (THREDDS
giving the variable list is sneaking down into the OpenDAP level of
specificity). So is that part of the next generation of OpenDAP?
Benno
The original conversation:
Benno Blumenthal wrote:
If I understood xml better, I guess I would know the answer to this
question, but here goes.
Suppose I had a variable list, e.g. (taken from the documentation page)
<variables
xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
<variable name="wv" vocabulary_name="Wind Speed" units="m/s"/>
<variable name="wdir" vocabulary_name="Wind Direction" units="degrees"/>
<variable name="o3c" vocabulary_name="Ozone Concentration" units="g/g"/>
...
</variables>
Suppose I want to say that the units are udunits compliant. Can I write
<variables
xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" >
<variable name="wv" vocabulary_name="Wind Speed">
<units vocabulary="udunits">m/s</units></variable>
<variable name="wdir" vocabulary_name="Wind Direction">
<units vocabulary="udunits">degrees</units></variable>
<variable name="o3c" vocabulary_name="Ozone Concentration">
<units vocabulary="udunits">g/g</units></variable>
...
</variables>
I certainly would like to be able to do so.
Currently 'units' are an attribute of <variable>, not a separate element.
But doesn't your example imply that you want to identify the 'authority',
or 'controlled vocabulary', that both the 'units' and the variable
'name' are relative to? The schema allows the catalog to identify the
source of the controlled vocabulary in use; I assume that could be
extended to include the authority for the 'units' that are used as
attributes of a <variable> element. That might negate the necessity of
adding specific <units> elements to the schema. Just a thought. I too am
not an expert on XML Schemas.
Dan
Benno
--
Unidata User Support                    UCAR Unidata Program
(303)497-8643                           P.O. Box 3000
support@xxxxxxxxxxxxxxxx                Boulder, CO 80307
Unidata WWW Service     http://my.unidata.ucar.edu/content/support
--
Dr. M. Benno Blumenthal benno@xxxxxxxxxxxxxxxx
International Research Institute for climate prediction
The Earth Institute at Columbia University
Lamont Campus, Palisades NY 10964-8000 (845) 680-4450