Re: [thredds] nco as a web service

Hi,

If people are interested in reviving the OPeNDAP server-side functions working group, I'd help out. If nothing else, could we revive that mailing list instead of perpetuating this "nco" thread on the thredds list?

In addition to the KISS principle, I'm also a big fan of standards. OPeNDAP (with DAP2) has already cracked that nut, so that is where I would prefer to start. I think there is a lot of low-hanging fruit (the first 20% of the 80%) that would be trivial for service providers to implement once we have a standard syntax. I believe in an evolutionary design approach based on use cases. That approach may lead to less stable APIs in the beginning, but it tends to be much more fruitful and usable than "big design up front". It sounds like many of us already have an API that "works", and plenty of use cases, so I think we could make a lot of progress evolving some common APIs.

Doug


On 7/2/12 8:25 AM, Roland Schweitzer wrote:
Hi,

I've been following the conversation...  A couple of comments in
general, then a couple specific to this message.

Years ago at the OPeNDAP developers meeting I made a plea for the
community to help define a syntax for server-side functions.  We formed
a working group and had essentially this entire conversation
(application specific syntax vs a functional language, GET vs POST,
synchronous vs asynchronous) and so on.  We even wrote the conversation
down in the Wiki
(http://docs.opendap.org/index.php/Server-side_Functions).  The date on
the document is 2007.  In the end, we couldn't agree on the right
approach, got tired and stopped working on the problem.  I don't know
what lesson is to be learned from that experience except that I'm
probably not the right person to lead the effort to form a consensus on
the right approach.

Our product F-TDS will always allow transformations to be defined using
Ferret syntax.  However, if there is a consensus on a functional
language, I would be thrilled to implement it for F-TDS.

As for Ben's comments on forming the URL, the idea when we built F-TDS
was that an ordinary Ferret user would be able to key in simple
transformations in their desktop clients.  Instead of opening a data
set, they could open the data set with a new variable defined that was a
transformation of existing variables.  However, the reality of stuffing
the Ferret syntax into the URL is ugly and complicated.  The Ferret
scripting language wasn't designed for transmission on the URL so all
kinds of syntax that is significant to Ferret is also significant to
HTTP clients.  Therefore things have to be very carefully encoded and
even then we had to make some extensions to the Ferret syntax to make it
work.  So a functional language that was part of the DAP spec and URL
safe would be a big win.  Some folks in our group think it's still
viable for folks to type Ferret syntax into URLs, but I don't.
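
A minimal sketch of the encoding problem described above, in Python. The expression is a hypothetical Ferret-style definition (the exact F-TDS syntax is not shown in this thread); the point is only that characters such as brackets, commas, colons, and "@" all have to be percent-encoded before the expression can travel inside a URL.

```python
from urllib.parse import quote, unquote

# Hypothetical Ferret-style definition; the real F-TDS syntax may differ.
expression = "let sst_anom = sst[d=1] - sst[d=2,l=1:12@ave]"

# Percent-encode every reserved character (safe="" leaves none unescaped)
# so the expression can be embedded in a URL.
encoded = quote(expression, safe="")

# Decoding must give back exactly what the user typed.
decoded = unquote(encoded)

print(encoded)
```

The encoded form is hard to read and harder to type, which is why having software prepare the URL tends to work better than asking users to do it by hand.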

However, our web application (LAS) uses the F-TDS syntax to do all of
the transformation users request from the Web UI (average, sum, min, max
for now) and to automatically request that a variable be re-gridded when
there is a request to compute the difference between two variables.  All
the fussy preparation of the URL is handled by software and this has
been a big win for us.  LAS is faster and more capable because of this.
  To make sure this works universally, LAS will "wrap" a remote data
source in a local F-TDS URL so it can make the same transformation
requests of remote data albeit without the significant performance win
of doing the transformation local to the data.  So, if  we develop a
common functional language we would jump on that straight away -- both
to implement the functions it defines in F-TDS and to allow LAS to make
requests from remote data with the language.

Roland

On Mon, Jul 2, 2012 at 8:17 AM, Ben Domenico <bendomenico@xxxxxxxxx
<mailto:bendomenico@xxxxxxxxx>> wrote:

    Hi all,

    Just a quick note to emphasize a "use case" that I am especially
    interested in.  That is the case where an end user wants to invoke a
    server side process from within an html document.   Being able to
    specify the process in a URL makes this possible.

    On the other hand, having the user construct the URL by hand is not
    practical.   Roy's approach allows the user to set up the process
    interactively using a browser-based client and then offers the
    resulting URL for the user to embed in a document.  From the user
    viewpoint, this combination is very powerful, but I'm not sure how
    much it limits the complexity of the process that can be specified.

    -- Ben


    On Sun, Jul 1, 2012 at 11:55 PM, Tom Kunicki <tkunicki@xxxxxxxx
    <mailto:tkunicki@xxxxxxxx>> wrote:


        On (C) I definitely concur.  I am not against simplicity and
        HTTP GET requests.  I just want to make sure that the approach
        is discussed and that one doesn't fall into the trap of
        believing HTTP GET is a panacea of simplicity.  These URLs that
        have been posted are pretty complex and aren't the kinds of
        things that anyone but expert users will be crafting by hand.
          There will be a client implementation in front of them and
        they will need to be updated if the server processing API behind
        them changes.  In this case, the client implementation will have
        to change in tandem with the server side processing API. This
        will be true regardless of whether the request is GET, POST,
        PUT, etc.  One benefit of GET is an embeddable link; to my
        knowledge this isn't easily done with POST or PUT.

        Our group uses WPS.  We ran into some holes in some
        implementations and in the specification, so we chose to join
        the WPS 2.0 SWG.

        There are advantages to the WPS specification.  Implementations
        can list a set of supported operations and processes using the
        GetCapabilities request (a GET or POST, we use GET).  Each
        process can be queried for its API including supported inputs
        and outputs (name, mime-type and schema if xml) using a
        DescribeProcess request (GET or POST, we use GET). If you know
        the arguments and types you can parse the DescribeProcess
        response and automatically generate a UI.  We have implemented
        this in JavaScript for our Web-based brokering services.  There
        are Python clients as well as an Arc plugin in progress
        (completed?) by ESRI and 52n, and a QGIS plugin, among others.
        Processes can be executed with an Execute request (a GET or
        POST request, we use POST).  POST for us because we deal with
        some pretty complex inputs (WFS calls with server side geometry
        filtering by reference to a GET or POST request; or Base64
        encoded shapefiles sent in-line).  These would bump us into some
        URL length restrictions we have dealt with in the past.  We don't
        have to use these complex inputs but since WPS offers this
        flexibility we are happy to leverage it.  When we execute
        processes we have the options to execute them synchronously or
        asynchronously (and an implementation can control these options
        by advertising them per process.)  We can query the executing
        process for its completion state (POST, don't know if GET is
        possible as I haven't looked into it).  We can request
        execution results in-line with the response or by reference.
        We provide inputs to WPS calls as the results of other WPS
        calls.  WPS processing implementations can be complex or simple.
          Given our use cases we made an architectural decision to
        leverage some of the more advanced components of the
        specification.  We've developed some complex processing that
        does some cool and useful things that we are able to leverage in
        other projects and share with other groups.  With our processing
        endpoints we can add a process and have it automatically be
        displayed in our UIs.  One of the benefits of WPS was that
        processing end-points became self-documenting.
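
As a rough illustration of the GetCapabilities and DescribeProcess requests described above, here is a minimal Python sketch that builds the two GET URLs in key-value-pair form. The endpoint address and the process identifier are hypothetical placeholders, and the parameter set assumes WPS 1.0.0; a given server may expect more.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the address of a real WPS server.
WPS_ENDPOINT = "http://example.org/wps"


def wps_get_url(request, **params):
    """Build a key-value-pair GET URL for a WPS request."""
    query = {"service": "WPS", "request": request}
    query.update(params)
    return WPS_ENDPOINT + "?" + urlencode(query)


# List the processes the server offers.
capabilities_url = wps_get_url("GetCapabilities")

# Ask for the inputs and outputs of one process (identifier is made up).
describe_url = wps_get_url(
    "DescribeProcess", version="1.0.0", identifier="example.process"
)

print(capabilities_url)
print(describe_url)
```

Both responses are XML documents; parsing the DescribeProcess response is what allows a client to generate a UI automatically.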

        Now, the WPS execute by GET is pretty tricky as it requires some
        double URL encoding.  We are happy using POST and didn't delve
        too much into GET. If there was a need and someone wanted to
        look at this with me (ahem, Roy?) I would be more than happy to
        submit some change requests to simplify the specification for
        some use cases.  In my experience with the OGC standards almost
        everything can be done with GET, it's when you get into the
        outlying use cases you have to represent your requests with POST.
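
To show what double URL encoding looks like in practice, here is a small Python sketch. It assumes a single input passed by value whose content is itself a URL with a query string; the input name and both addresses are hypothetical, and the exact DataInputs grammar should be taken from the WPS specification rather than from this example.

```python
from urllib.parse import quote, unquote

# Hypothetical input value: a URL that itself carries a query string.
reference = "http://example.org/wfs?service=WFS&request=GetFeature"

# First pass: protect the characters that would otherwise be read as
# delimiters inside the DataInputs value (such as ; = @ &).
inner = quote(reference, safe="")

# Hypothetical input name, joined to its encoded value.
data_inputs = "features=" + inner

# Second pass: encode the whole DataInputs value as one query parameter.
outer = quote(data_inputs, safe="")

# Hypothetical WPS endpoint and process identifier.
execute_url = (
    "http://example.org/wps?service=WPS&version=1.0.0&request=Execute"
    "&identifier=example.process&DataInputs=" + outer
)

print(execute_url)
```

After the second pass every "%" from the first pass has become "%25", so a colon ends up as "%253A". That is the part that is easy to get wrong when the URL is assembled by hand.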

        WPS is an OGC specification.  I think the last 2 words of the
        previous sentence instantly turn people off.  But there's some
        real value to the work that's been done.  We've used it as a
        thin wrapper on process execution.  Our initial cut at
        processing involved using simple GET-based services.  We found
        we had to generate a whole suite of utility/supporting GET-based
        services relying on clients to perform operations with correct
        ordering.  The architecture was becoming difficult to maintain
        and document. A large number of tasks have now been implemented
        with the OGC standards suite and available standards
        implementations.  This has saved our group a lot of development
        time and in turn taxpayer dollars.

        Tom Kunicki
        Center for Integrated Data Analytics
        U.S. Geological Survey
        8505 Research Way
        Middleton, WI  53562



        On Jul 1, 2012, at 11:34 PM, Gerry Creager wrote:

         > Roy,
         >
         > That's a good explanation, and one I can live with. However,
        I also agree with Jeff's later comments, that A) in general, the
        same interpreter can handle GET and POST, and B) file uploads
        can't happen with a GET.
         >
         > And, most important: C) KISS is a good mantra.
         >
         > I'll sit back and listen to the debate again.
         >
         > gerry
         >
         > On Sun, Jul 1, 2012 at 3:13 PM, Roy Mendelssohn
        <roy.mendelssohn@xxxxxxxx <mailto:roy.mendelssohn@xxxxxxxx>> wrote:
         > BTW - a discussion we have been having around these parts is
        can you do enough in the way of server-side functions without a
        POST  (ie the URL defines the function).  That is why I would
        like to hear more from people who are running F-TDS and GDS -
        how many requests do they get for server side functions, what is
        the usual response time and download size for these requests, how
        large are the usual expressions?  And then contrast it with a
        WPS or WCPS approach.    I clearly believe in one approach, but
        I would welcome people who are using some of these other
        approaches to describe what they have done, the benefits of
        doing things that way, and what it means for a client.
         >
         > Thanks,
         >
         > -Roy
         >
         > On Jul 1, 2012, at 11:25 AM, Dennis Heimbigner wrote:
         >
         > > Roy-
         > >
         > > > ...  One comment.  I think you misunderstood my point about
         > > > Matlab and R.  I am not interested in Matlab specific
         > > > implementations.  The point was because the URL completely
         > > > defines the request, I can implement scripts in any
        application
         > > > that can send an URL and receive a file in terms of functions
         > > > built-in to that application - that is my clients do not
        break as
         > > > the application or operating system change.
         > >
         > > Not quite sure I understand. This phrase "...receive a file in
         > > terms of functions built-in to that application" sounds
         > > like you are creating an association between functions defined
         > > on the client side and functions defined on the server side.
         > > Can you elaborate?
         > >
         > > > Why I strongly prefer, if it is at all reasonable,
        services that
         > > > only use GET, not POST.
         > >
         > > Again, that is only possible if you keep your requests
         > > short enough to not violate the URL length restrictions.
         > >
         > > =Dennis Heimbigner
         > > Unidata
         > >
         > >
         > >
         > > Roy Mendelssohn wrote:
         > >> Hi Dennis:
         > >> Thanks.  One comment.  I think you misunderstood my point
        about Matlab and R.  I am not interested in Matlab specific
        implementations.  The point was because the URL completely
        defines the request, I can implement scripts in any application
        that can send an URL and receive a file in terms of functions
        built-in to that application  - that is my clients do not break
        as the application or operating system change.
         > >> While I understand why this occurred, a few years ago we
        had straight OPeNDAP implementations.  We had a lot of users
        using scripts we developed for Matlab, running under Windows.
          Due to updates in both Windows and Matlab, the OPeNDAP files
        for Windows stopped working (at least for Matlab).  We had a lot
        of users that were left stranded and stranded for quite a long
        time.  Developing and maintaining clients, particularly clients
        that are working within an application for which you have to
        write code, very quickly becomes a non-trivial exercise.
         > >> Since we switched to a service where the URL completely
        defines the request, our Matlab and R scripts have survived
        quite nicely any number of updates both to the applications
        themselves and to the operating systems.  That is because the
        clients now only use functions built into the applications.
         > >> Why I strongly prefer, if it is at all reasonable,
        services that only use GET, not POST.
         > >> -Roy
         > >> On Jun 28, 2012, at 1:03 PM, Dennis Heimbigner wrote:
         > >>>> I am old and slow, but suppose I am in OpeNDAP, are you
        proposing
         > >>>> to separate say constraint expressions and server-side
        function
         > >>>> requests basically the same (ie I just scan what is
        after each
         > >>>> comma) or do you propose some method that signifies in
        the URL
         > >>>> that what follows is an expression?  In F-TDS and GDS
        the form of
         > >>>> the URL is:
         > >>> First, I am proposing to subsume DAP constraints.
         > >>> Second, I am proposing, like DAP, to put the expressions
         > >>> in the query part of the URL (i.e. after the '?').
         > >>>
 > >>>> http://machine:port/thredds/dodsC/dataset_expr_{dataset2,dataset3,...}{expression1;expression2;...}.URLsuffix?constraint
         > >>> So, I would rewrite this as something more-or-less like this:
         > >>> http://machine.../dataset?expression1,expression2,...
         > >>> Where the expressions would include the references to
        dataset2, dataset3,
         > >>> and the constraint.
         > >>>
         > >>>> BTW, the reason I have asked about the experience of
        people who
         > >>>> are using F-TDS and GDS on whether synchronous requests
        can cover
         > >>>> the large majority of cases, is because I am very partial to
         > >>>> systems where the URL completely defines the request,
        and hence
         > >>>> essentially use GET as the verb.
         > >>> The synchronous/asynchronous issue is, for me, a
        separable issue.
         > >>> I should note that GET has a limit on the size of URLS, so
         > >>> there needs to be ways to deal with that. Two
        possibilities are
         > >>> 1) use POST or PUT, or 2) provide a way to upload a long
        expression
         > >>> in parts USING multiple GETs.
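
A minimal Python sketch of the first possibility (fall back to POST when the URL gets too long). The length limit, the dataset URL, the expression, and the idea of sending the raw expression as the POST body are all assumptions made for illustration; nothing here is part of DAP or of any existing server. The second possibility, uploading in parts with multiple GETs, is not shown.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed limit; real limits vary by server, proxy, and client.
MAX_URL_LENGTH = 2000


def build_request(dataset_url, expression):
    """Return a GET request when the URL is short enough, otherwise a POST."""
    url = dataset_url + "?" + quote(expression, safe=",()")
    if len(url) <= MAX_URL_LENGTH:
        return Request(url, method="GET")
    # Too long for a URL: send the expression in the request body instead.
    return Request(dataset_url, data=expression.encode("utf-8"), method="POST")


# Hypothetical dataset URL and expressions.
short_request = build_request("http://example.org/dataset", "mean(sst)")
long_request = build_request("http://example.org/dataset", "x" * 5000)

print(short_request.get_method(), short_request.full_url)
print(long_request.get_method(), len(long_request.data))
```

Only the GET form yields a link that can be embedded in a document, so a client built this way loses that property exactly for the long expressions.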
         > >>>
 > >>>> The reason for this is long
 > >>>> experience, where client code has broken with changes in
         > >>>> operating system and/or application, fixes were slow in
        coming,
         > >>>> so many users were left with nothing working.  In a
        system where
         > >>>> the URL completely defined the request, say ERDDAP, in
        Matlab:
         > >>>>
 > >>>>>> link='http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.mat?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]';
         > >>>>>> F=urlwrite(link,'cwatch.mat');
         > >>>>>>
         > >>>> Will get the related file, and the entire command is in
        Matlab,
         > >>>> no extra code required.  The same in R is:
         > >>>>
 > >>>>>> download.file(url="http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.nc?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]",
 > >>>>>>               destfile="AGssta.nc",mode='wb')
         > >>>>>>
         > >>>> again, "download.file" is an R command.
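
The same pattern carries over to other environments. As a sketch, here is a Python version that assembles the identical URL from its parts and downloads it using only the standard library. The helper names are made up for this example, and whether the server still offers this dataset is not something the sketch checks.

```python
from urllib.request import urlretrieve

BASE = "http://coastwatch.pfeg.noaa.gov/erddap/griddap"


def griddap_url(dataset, file_type, variable, ranges):
    """Assemble a griddap URL; each range is a (start, stop) pair, stride 1."""
    subset = "".join(
        "[({}):1:({})]".format(start, stop) for start, stop in ranges
    )
    return "{}/{}.{}?{}{}".format(BASE, dataset, file_type, variable, subset)


def download(url, destination):
    """Fetch the URL and save the response to a local file."""
    urlretrieve(url, destination)


# Same subset as the Matlab and R examples above.
url = griddap_url(
    "erdBAsstamday",
    "nc",
    "sst",
    [
        ("2010-01-16T12:00:00Z", "2010-01-16T12:00:00Z"),
        ("0.0", "0.0"),
        ("30", "50.0"),
        ("220", "240.0"),
    ],
)
print(url)
# download(url, "AGssta.nc")  # uncomment to fetch the file
```

As in the Matlab and R cases, nothing beyond the language's built-in HTTP support is needed, because the URL completely defines the request.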
         > >>> I think that we do not want to be R/MATLAB specific
         > >>> in a proposal to put stuff in URLs. I would rather
         > >>> propose to allow uploading of R/MATLAB scripts to serve
         > >>> as additional, user-defined functions.
         > >>>
         > >>> I would prefer to
         > >>>> maintain this simplicity and cover 80% of the cases if
        possible,
         > >>>> than cover the rest but where more complex, application
        specific
         > >>>> code would have to be developed and maintained.
 > >>> Agreed. However my assumption is that the output of any
        function that
         > >>> is not assigned to a single-assignment variable will be
        returned as part
         > >>> of the response; but other ways of specifying this are
        possible within
         > >>> the functional framework I am proposing.
         > >>>
         > >>> =Dennis Heimbigner
         > >>> Unidata
         > >> **********************
         > >> "The contents of this message do not reflect any position
        of the U.S. Government or NOAA."
         > >> **********************
         > >> Roy Mendelssohn
         > >> Supervisory Operations Research Analyst
         > >> NOAA/NMFS
         > >> Environmental Research Division
         > >> Southwest Fisheries Science Center
         > >> 1352 Lighthouse Avenue
         > >> Pacific Grove, CA 93950-2097
         > >> e-mail: Roy.Mendelssohn@xxxxxxxx
        <mailto:Roy.Mendelssohn@xxxxxxxx> (Note new e-mail address)
         > >> voice: (831)-648-9029 <tel:%28831%29-648-9029>
         > >> fax: (831)-648-8440 <tel:%28831%29-648-8440>
         > >> www: http://www.pfeg.noaa.gov/
         > >> "Old age and treachery will overcome youth and skill."
         > >> "From those who have been given much, much will be
        expected" "the arc of the moral universe is long, but it bends
        toward justice" -MLK Jr.
         >
         >
         > _______________________________________________
         > thredds mailing list
         > thredds@xxxxxxxxxxxxxxxx <mailto:thredds@xxxxxxxxxxxxxxxx>
         > For list information or to unsubscribe,  visit:
        http://www.unidata.ucar.edu/mailing_lists/
         >




