NOTE: The bufrtables mailing list is no longer active. The list archives are made available for historical reasons.
[resending because the first attempt apparently did not make it to the list]

On Wed, Mar 09, 2011 at 12:10:31PM -0700, John Caron wrote:

> Apologies for the long hiatus on this list.
>
> I have written a brief report about BUFR/GRIB with a (possibly
> controversial) recommendation. Feel free to forward to anyone who
> might be interested.
>
> http://www.unidata.ucar.edu/staff/caron/bufr/Summary.html

Hello,

from the experience[1][2][3] I have with BUFR messages, I see a few problems with your proposal:

 1. it would imply that BUFR decoding can only happen when and where there is network connectivity and the central server is working. I am not comfortable tying a long-lived archive to the existence of a third-party server;

 2. alternatively, the archive needs to store and keep up to date an entire mirror of all the tables mentioned by all the BUFRs it contains, and that is more or less what we already have, barring the proposal to standardise a file format for storing tables. But if you retrofit the system that we have now with a standard file format for tables and a working central repository, you basically fix it without the need for hash codes;

 3. 16 bits (0-65535) are IMO not that big a hash space: when you allow everyone to create new tables at will, things may degenerate quickly.

But the biggest problem I have is this: you do need to maximise reuse of BUFR table codes, otherwise making sense of the decoded data is no longer machine computable.

I am maintaining software that not only decodes BUFR bulletins, but also tries to make sense of them: for example, it can understand that a given decoded value is a temperature, that it is sampled at a given vertical level and that it went through a given kind of statistical processing. That is, it can decode a bulletin and say: "There is a temperature reading at 2 meters above ground, maximum over 12 hours."
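A minimal sketch of the kind of interpretation described above, not the actual software: several table B codes are mapped to one physical quantity, and the level and statistical processing are attached to produce a uniform description. The `Reading` type, the `describe` function and the contents of `QUANTITY_BY_CODE` are assumptions made for illustration; a real table would be far larger and would come from the shared WMO tables.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative mapping from table B codes to a physical quantity.
# Only two entries are listed; the real mapping is an assumption here.
QUANTITY_BY_CODE = {
    "B12001": "temperature",
    "B12101": "temperature",
}


@dataclass(frozen=True)
class Reading:
    """A decoded value plus the context found next to it in the bulletin."""
    code: str
    height_m: Optional[float] = None   # vertical level, metres above ground
    statistic: Optional[str] = None    # e.g. "maximum", "minimum", "mean"
    period_h: Optional[int] = None     # period of the statistic, in hours


def describe(reading: Reading) -> str:
    """Turn a decoded reading into a source-independent description."""
    quantity = QUANTITY_BY_CODE.get(reading.code)
    if quantity is None:
        # An unknown code cannot be interpreted without a human in the loop.
        return f"Unknown code {reading.code}: needs a human to interpret"
    text = f"There is a {quantity} reading"
    if reading.height_m is not None:
        text += f" at {reading.height_m:g} meters above ground"
    if reading.statistic is not None and reading.period_h is not None:
        text += f", {reading.statistic} over {reading.period_h} hours"
    return text + "."


if __name__ == "__main__":
    print(describe(Reading("B12101", height_m=2, statistic="maximum", period_h=12)))
```

Two readings that use different codes for the same quantity produce the same description, which is what makes comparison across sources possible. A code missing from the table falls through to the "needs a human" branch, which is the failure mode that unrestricted table creation would make common.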
This interpreted information can be used by meteorologists without having to be aware that temperatures can come as B12001, B12101, B12111, B12112, B12114..B12119 or whatever else. Where I work, the ability to do this is considered a very valuable resource, as it allows readings from different sources to be compared uniformly.

If you have a process where data sharing across centers has to use some well standardised, well known tables (as well as some reasonable standards, or even just practices, for laying out BUFR templates), you can code (I have coded) that sort of interpretation in software.

If instead anyone can at any point start distributing BUFRs that use any B code they want to represent temperature, then the only way to make sense of a decoded bulletin is to have it personally read by an experienced meteorologist.

Even if you don't want machine interpretation of the bulletins, if the lifetime of the archive is long enough then its data can potentially outlive the availability of experienced meteorologists who can remember how to make sense of it.

To have a long-lived archive, IMO what is needed are pervasive standards, stable over time. Instead of designing for chaos, I'd rather see how to make coordination work:

 - propose a standard file format for distributing tables;
 - propose the creation of a repository from which to download the WMO standard tables;
 - propose a process for submission of new table entries, akin to what happens with submissions of new code points to Unicode, or new locales to ISO.

My feeling is that something like Unicode is closer to the kind of thing to model BUFR tables on.

Of course chaos should still be supported, because scientists have to have full freedom of experimentation. But there are already local table numbers that can be used for that, and after the experiments are successful the new entries can be submitted to a new version of the shared tables, so that the shared language can grow.
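On point 3 above, the size of a 16-bit hash space can be checked with a birthday-style estimate. The sketch below assumes that table hashes are uniformly distributed and independent, which is an assumption made here and not something the proposal states; `collision_probability` is a name made up for this example.

```python
def collision_probability(n_tables: int, hash_bits: int = 16) -> float:
    """Probability that at least two of n_tables share a hash value,
    assuming hashes are uniform and independent (birthday problem)."""
    space = 2 ** hash_bits
    if n_tables > space:
        return 1.0  # pigeonhole: a collision is certain
    p_unique = 1.0
    for i in range(n_tables):
        # The (i+1)-th table must avoid the i values already taken.
        p_unique *= (space - i) / space
    return 1.0 - p_unique


if __name__ == "__main__":
    for n in (10, 100, 300, 1000):
        print(f"{n:5d} tables -> P(collision) = {collision_probability(n):.3f}")
```

Under that assumption the chance of at least one collision passes one half at roughly 300 distinct tables, which is not many if every center can create tables at will.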
[1] http://www.arpa.emr.it/dettaglio_documento.asp?id=2927&idlivello=64
[2] http://www.arpa.emr.it/dettaglio_documento.asp?id=514&idlivello=64
[3] http://www.arpa.emr.it/dettaglio_documento.asp?id=1172&idlivello=64

Ciao,

Enrico

--
GPG key: 4096R/E7AD5568 2009-05-08 Enrico Zini <enrico@xxxxxxxxxxxxxx>