Re: [galeon] The GALEON wiki and Use Cases

NOTE: The galeon mailing list is no longer active. The list archives are made available for historical reasons.

Hi Roy,

You are quite right of course that subsetting servers are important to
scientists (I did say that I was generalizing!).  I was trying to
highlight that "dumb" servers have their advantages because scientists
can trust that they haven't manipulated the data.  OPeNDAP I guess
falls somewhere between the "dumb" (e.g. Apache) and "smart" servers
(e.g. WCS) because it does basic subsetting in index space and hence
its output is predictable - it behaves exactly as if the users have
done the subsetting locally with no other manipulation and hence is
pretty trustworthy.  I don't want to suggest for a moment that
server-side subsetting isn't important (we use OPeNDAP every day too)
but scientists tell me all the time that "we just need the data".
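
To make the index-space point concrete, here is a minimal sketch in Python.
It is not OPeNDAP's implementation: the function names and the toy grid are
made up for illustration. It only assumes the DAP2-style index range syntax,
"start:stop" or "start:stride:stop", where the stop index is inclusive. The
check at the end shows that the "server" result is identical to slicing the
same array locally, which is why the output is predictable.

```python
def parse_hyperslab(spec):
    """Parse one DAP2-style index range into a Python slice.

    Accepts 'index', 'start:stop' or 'start:stride:stop'.
    The stop index is inclusive, unlike a Python slice.
    """
    parts = [int(p) for p in spec.split(":")]
    if len(parts) == 1:
        start, stride, stop = parts[0], 1, parts[0]
    elif len(parts) == 2:
        start, stride, stop = parts[0], 1, parts[1]
    elif len(parts) == 3:
        start, stride, stop = parts
    else:
        raise ValueError("bad index range: %r" % spec)
    return slice(start, stop + 1, stride)


def server_subset(grid, row_spec, col_spec):
    """Hypothetical server-side subset of a 2-D grid (a list of rows).

    Values are copied out unchanged: no regridding, no interpolation.
    """
    rows = parse_hyperslab(row_spec)
    cols = parse_hyperslab(col_spec)
    return [row[cols] for row in grid[rows]]


# A small made-up grid standing in for a full file held by the server.
grid = [[10 * r + c for c in range(6)] for r in range(5)]

# What the hypothetical server returns for rows 1..3, every 2nd column 0..4.
remote = server_subset(grid, "1:3", "0:2:4")

# What a user would get by downloading everything and slicing locally.
local = [row[0:5:2] for row in grid[1:4]]

assert remote == local
```

Because the subset is defined purely by indices, there is nothing for the
server to "decide", which is the sense in which it sits between the dumb and
smart servers.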

Again, I just wanted to say that "how" something is done is just as
important to use case analysis as "what" is done.

Regarding John G's point about sensors - I'm afraid I don't know
anything about sensors but I guess all data is processed somewhat
before publication: even model results are often translated to a
different format (e.g. Met Office PP format to NetCDF).  I think the
point is that scientists often feel most comfortable getting data from
as close to the original source as possible, as the data providers
intended it.

Cheers, Jon


On Wed, Oct 15, 2008 at 3:49 AM, Roy Mendelssohn
<Roy.Mendelssohn@xxxxxxxx> wrote:
Hi Jon:

I will follow bad internet etiquette and not bottom post, and I think I
agree with a lot of what you said except for the fact of whether scientists
want the original files.  Let me give an example.  Our high resolution
satellite data might have a global file for each day.  I want a time series
of that data for a small region off of California.  You know what, I
actually do not want to download the several thousand files taking up many
gigabytes to get that data.  It would be nice if I could get just that
region for my time period in a single file.  You can do this with
THREDDS/OPeNDAP, and that is the use case we see most.  Or think of the
Lagrangian case I mentioned with animal tags, where the tags only have
position and the scientist wants an environment variable along that track.
Again, they do not want to have to download all of the files to get that
small amount of data.  What matters is the ability to subset the data before
the download and, as discussed at GO-ESSP, sometimes to perform server-side
functions (e.g. give me a time series of the integrated heat in the upper
150 m over a region); the result is then the "data" that I will be using
over and over again in the analysis.
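
The regional time series case above can be sketched roughly as follows. This
is an illustration only, not how THREDDS/OPeNDAP does it: the daily grids are
made-up in-memory lists rather than real satellite files, all names are
hypothetical, and a simple box mean stands in for a server-side function such
as integrated heat.

```python
def index_range(coords, lo, hi):
    """Return (first, last) indices of the coordinates that fall in [lo, hi]."""
    inside = [i for i, c in enumerate(coords) if lo <= c <= hi]
    if not inside:
        raise ValueError("no grid points inside the requested range")
    return inside[0], inside[-1]


def regional_subset(daily_grids, lats, lons, lat_range, lon_range):
    """Cut the same lat/lon box out of every daily grid.

    Returns one small stack (a list of 2-D boxes, one per day) instead of
    the full global grid for each day.
    """
    r0, r1 = index_range(lats, *lat_range)
    c0, c1 = index_range(lons, *lon_range)
    return [[row[c0:c1 + 1] for row in grid[r0:r1 + 1]]
            for grid in daily_grids]


def box_mean_series(stack):
    """Stand-in for a server-side function: one mean value per day."""
    series = []
    for box in stack:
        values = [v for row in box for v in row]
        series.append(sum(values) / len(values))
    return series


# Made-up coordinates and three made-up "daily global files".
lats = [30, 32, 34, 36]
lons = [-126, -124, -122, -120]
daily_grids = [[[100 * day + 10 * r + c for c in range(4)] for r in range(4)]
               for day in range(3)]

stack = regional_subset(daily_grids, lats, lons, (32, 34), (-124, -122))
series = box_mean_series(stack)
```

Here the user receives a 3 x 2 x 2 stack, or just three numbers, rather than
three full grids; with several thousand real files the saving is what makes
the request practical.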

When we start to include these types of uses into our use cases, then we need
to rethink our services.  There is also one other point, one I made to Ben
privately.  The use cases all assume that the user in the use case will
actually be willing to use your service.  Our experience is that if you are
not delivering data in the form and in the way that they think about
and use data, they will go elsewhere for the data.  It may not be the "Right
Way" as decreed from above, but you ignore your users at your own peril.

-Roy


