NOTE: The galeon mailing list is no longer active. The list archives are made available for historical reasons.
Hi Jon:

I will follow bad internet etiquette and not bottom post. I think I agree with a lot of what you said, except on the question of whether scientists want the original files. Let me give an example. Our high-resolution satellite data might have a global file for each day. Suppose I want a time series of that data for a small region off of California. I actually do not want to download the several thousand files, taking up many gigabytes, just to get that data. It would be nice if I could get just that region, for my time period, in a single file. You can do this with THREDDS/OPeNDAP, and that is the use case we see most. Or think of the Lagrangian case I mentioned with animal tags, where the tags only have position and the scientist wants an environmental variable along that track. Again, they do not want to have to download all of the files to get that small amount of data. What matters is the ability to subset the data before the download and, as discussed at GO-ESSP, sometimes to perform server-side functions (e.g. give me a time series of the integrated heat in the upper 150 m over a region); the result is then the "data" that I will be using over and over again in the analysis.
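[Archive note: the region-and-time subset Roy describes is what an OPeNDAP (DAP2) constraint expression does: the client appends `var[start:stride:stop]` index ranges to the dataset URL and the server returns only that hyperslab. A minimal sketch in Python, assuming a hypothetical THREDDS endpoint and a regularly gridded `sst(time, lat, lon)` variable; the URL, grid origin, and step are made up for illustration:]

```python
def coord_to_index(value, origin, step):
    """Map a coordinate value to its index on a regular grid axis."""
    return round((value - origin) / step)

def dap_subset_url(base_url, var, *index_ranges):
    """Build a DAP2 constraint expression: one [start:stride:stop]
    hyperslab per dimension, appended to the dataset URL."""
    hyperslab = "".join(f"[{start}:1:{stop}]" for start, stop in index_ranges)
    return f"{base_url}?{var}{hyperslab}"

# Hypothetical daily SST on a 0.1-degree grid (lat origin -90, lon origin 0).
lat_range = (coord_to_index(32.0, -90.0, 0.1), coord_to_index(38.0, -90.0, 0.1))
lon_range = (coord_to_index(230.0, 0.0, 0.1), coord_to_index(236.0, 0.0, 0.1))
time_range = (0, 364)  # one year of daily fields

url = dap_subset_url("http://example.org/thredds/dodsC/sst_daily.nc",
                     "sst", time_range, lat_range, lon_range)
print(url)
# http://example.org/thredds/dodsC/sst_daily.nc?sst[0:1:364][1220:1:1280][2300:1:2360]
```

The server evaluates the constraint and ships back only the requested hyperslab, so the client never touches the thousands of global files behind it.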
When we start to include these types of use in our use cases, we need to rethink our services. There is also one other point, one I made to Ben privately. The use cases all assume that the user in the use case will actually be willing to use your service. Our experience is that if you are not delivering data in the form and in the way that users think about and use data, they will go elsewhere for it. That may not be the "Right Way" as decreed from above, but you ignore your users at your own peril.
-Roy On Oct 14, 2008, at 1:05 AM, Jon Blower wrote:
Hi Ben,

I think this is a very nice set of use cases, and I was also very interested in the discussion with Roy M that ensued. These use cases give good examples of "what" a user might want to do with FES data and "why". I think it's just as valuable to look in a bit more detail at "how" they might want to do things. This is another dimension through "use case space", if you like. One could divide methods of use into "get and forget" and "get and reuse". Let me explain further:

1) A decision-maker responding to an emergency situation needs to get the right data as quickly as possible to help make the decision. After having done this the data can be thrown away, or perhaps archived for auditing purposes. Either way, the data aren't reused. It probably doesn't matter too much if the data have been manipulated in some way to expedite the process.

2) A scientist performing a detailed analysis on a dataset (e.g. a reanalysis) needs to look at the data from a whole load of directions and perform lots of analysis tasks. In this case the user will probably want the original data (probably in the original data files), and will keep the data over an extended period of time. The scientist needs to be confident that the data have not been manipulated by the server.

One could also think of these cases as "real time use" and "offline use" respectively. The priorities of each case are different: in case (1) the emphasis is on getting data quickly (requiring a "clever" server); in case (2) the emphasis is on being confident that the data are "correct" (requiring a "dumb" server). Scientists can also operate in "real time" mode when performing initial explorations on data, prior to detailed analysis.

I think WCS fits in best with case (1), because case (2) can be satisfied simply by serving files in a sensible format (i.e. CF-NetCDF) from some kind of file server. Clearly there are some broad-brush generalizations here, but do others basically agree with this?

Cheers, Jon