Re: [thredds] nco as a web service

Dear Roy et al., 
Sorry for coming late to the party …  Roy asked for some feedback from GDS 
administrators on how server-side analysis is being used. 

On Jul 1, 2012, at 4:13 PM, Roy Mendelssohn wrote:

> ...  That is why I would like to hear more from people who are running F-TDS 
> and GDS - how many requests do they get for server side functions,
I did a quick 'grep' on our GDS log files (100 individual months) and 
calculated an average of 5585 server-side analysis requests per month, which is 
< 1% of the total number of data requests to the server. Many months had 0, the 
maximum was 247811. Most of these were for the real time GFS forecast data; we 
are not serving a whole lot of climate data on our GDS. The complexity of the 
analysis expressions is pretty broad -- some examples are basic subsets (which 
I would describe as users misunderstanding the purpose of server-side analysis), 
simple expressions to get the wind speed and direction from vector components, 
sea-level pressure differences at two grid points, time series of area averages, ensemble 
averages, and variance of ensemble averages (this uses the cached result from 
the ensemble average calculation).
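For readers curious what that tally looks like in practice, here is a minimal sketch of counting `_expr_` requests per monthly log file. The log filenames and line format are assumptions for illustration, not the actual GDS log layout.

```python
# Sketch: count GDS server-side analysis requests ("_expr_" in the
# request URL) per log file, assuming one access log per month.
# Filenames and line format are hypothetical.
import glob
from collections import Counter

def tally_expr_requests(log_glob="gds_access.*.log"):
    """Count lines containing '_expr_' in each file matching log_glob."""
    counts = Counter()
    for path in sorted(glob.glob(log_glob)):
        with open(path) as f:
            counts[path] = sum(1 for line in f if "_expr_" in line)
    return counts

counts = tally_expr_requests()
if counts:
    total = sum(counts.values())
    print(f"average per month: {total / len(counts):.0f}")
    print(f"max: {max(counts.values())}")
```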


> what is the usual response time and download for these request,
It would take some clever parsing of the log entries to get an average time, 
but a cursory glance suggests most are less than 10 seconds. 
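The "clever parsing" might look something like the sketch below: pair up request start/finish entries and difference their timestamps. The log-line layout here (request id plus ISO timestamp) is purely hypothetical.

```python
# Sketch: estimate per-request elapsed time from paired
# start/finish log entries of the (hypothetical) form
# "REQ <id> <ISO timestamp>".
from datetime import datetime

def elapsed_seconds(start_line, finish_line):
    """Return seconds elapsed between two 'REQ <id> <timestamp>' lines."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    t0 = datetime.strptime(start_line.split()[2], fmt)
    t1 = datetime.strptime(finish_line.split()[2], fmt)
    return (t1 - t0).total_seconds()

print(elapsed_seconds("REQ 42 2012-07-01T16:13:05",
                      "REQ 42 2012-07-01T16:13:12"))  # 7.0
```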

> how large are the usual expressions?  
If by 'large' you mean 'lots of characters in the expression', here are some 
examples (1 short, 2 long): 
_expr_{gfs2/gfs.2010062800i}{mag(u10m,v10m)}{8.45:8.45,56.0:56.0,1000:1000,00Z28jun2010:12Z02jul2010}

_expr_{/gfsens/gfsens.2008052300,_exprcache_12118899183320}{tloop(ave(sqrt(pow(t2m-result.2,2)),e=1,e=21))}{-77:-77,39:39,1000:1000,23may2008:28may2008,c00:c00}

_expr_{ssta,z5a}{tmave(const(maskout(aave(ssta.1,lon=-180,lon=-90,lat=-10,lat=10),aave(ssta.1,lon=-180,lon=-90,lat=-10,lat=10)-1.0),1),z5a.2(lev=500),t=1,t=600)}{0:360,0:90,500:500,jan1950:jan1950}

The size of a request in terms of data volume can be constrained by server 
configuration. The third example above is from the GDS documentation, and a lot 
of users try it out and then modify it to suit their needs. It's more of a 
climate-analysis expression: it calculates the mean 500 mb height anomaly 
associated with warm tropical SST anomalies. 
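To illustrate how a client might assemble such a request, here is a sketch that builds the first (short) example above, using the `_expr_{datasets}{expression}{domain}` pattern shown in the quoted log entries. The server hostname is a placeholder, not a real GDS endpoint.

```python
# Sketch: assemble a GDS server-side analysis URL from its three
# parts (dataset list, analysis expression, space/time domain).
# The base URL is a placeholder.
from urllib.parse import quote

def build_expr_url(base, dataset, expression, domain):
    """Join base URL and _expr_{datasets}{expression}{domain} string."""
    expr = f"_expr_{{{dataset}}}{{{expression}}}{{{domain}}}"
    return f"{base}/{quote(expr, safe='_{}(),.:-/')}"

url = build_expr_url(
    "http://server.example/dods",   # placeholder host
    "gfs2/gfs.2010062800i",
    "mag(u10m,v10m)",               # wind speed from vector components
    "8.45:8.45,56.0:56.0,1000:1000,00Z28jun2010:12Z02jul2010",
)
print(url)
```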


> … I would welcome people who are using some of these other approaches to 
> describe what they have done, the benefits of doing things that way, and what 
> it means for a client.  

I would say server-side analysis (of the kind employed by our GDS users) is 
useful on a small scale -- individuals who desire forecast information at their 
particular location. For hard-core climate research that requires the analysis 
of BIG data, we haven't yet been able to exploit the power of server-side 
analysis (moving the analysis to the data). At COLA, we generate a lot of data 
at remote supercomputer centers (e.g., NCAR), but then we move a lot of it back 
to our own disks to analyze it with our favorite tools, or else we login with 
accounts at the remote locations where our data reside and use the analysis 
servers set up there for users to access their data. For CMIP5, it is just not 
practical to try to automate remote analysis of data that are so widely 
distributed, with subtle differences between each data server, and a data 
structure that is highly granular. Nobody at COLA is interested in using a 
browser to do any data analysis; it must be programmable to be useful. 

--Jennifer

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705


