
Re: buffer size and performance



John-

Since we are putting the IDV release out next week, let's not
change the nj22 buffer size until we have time to thoroughly test
things.  After that, we can start testing nj22 changes.

Along those lines, I think it's time to get the IDV and THREDDS
groups together to map out and coordinate plans for the next 3-6
months.  Jeff will be away the week of Aug 7-11, but can we meet
the week after (Aug 14-18), perhaps at the Thursday THREDDS meeting
slot?

Among the things that are high on our plate are:

- nj22/TDS performance testing - the IDV group was tasked by
the Unidata User's Committee and the IDV Steering Committee with
focusing on performance issues and we've made some great improvements
on this issue for the 2.0 release on the IDV side.  We also
need to have this be a high priority on the nj22 side because as
Jeff has discovered, there are some issues there.  From the
user's perspective, it's the IDV's problem when loading data is
slow, but there are some things that are in your domain, not ours.
So, let's work together to address this where we can in the
near future.  We can provide specific examples, but there also
needs to be a systematic way of testing these issues.
- Grid aggregation
  - aggregating Chiz's (and eventually LEAD's) WRF GRIB output
    which has each forecast hour in a separate file.
- Accessing station obs datasets from the TDS and local file system.
  Current issues are:
  - time binning
  - need a query so the client can ask "what do you have?"
  - aggregation of multiple files (requests crossing days)
  - efficient geo/temporal subsetting
- Accessing radar datasets from TDS/local files in the IDV
  - as Tom B showed the world at the Unidata Modelling workshop,
    a catalog is not a good interface to such a large data holding.
    We need a way like we have with ADDE to ask what stations, times
    and products are available.  An interface similar to the
    current IDV chooser is needed.
- Accessing other datasets from TDS/local file system:
   - GINI and McIDAS satellite imagery (formats and query issues exist)
   - lightning (NLDN and new system)
   - upper air
   - profiler

Are there issues from your side that we need to address in the
IDV?

Don


John Caron wrote:
Hi Jeff:

BTW, the default buffer size in Structure.Iterator is 500K, which gives better performance (I think) when the structure is actually a "pseudo-structure", that is, stored by column rather than row.

I could try modifying the default buffer size of RAF; that might improve performance on the server.

The trick is to have a wide range of queries to optimize over, and not just get one case right at the expense of the whole.

Do you want to put together an interesting "load" on the server, maybe based on a bunch of IDV tests or something?
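
One way to make "optimize over a range of queries" concrete is sketched below. It takes the two sets of timings Jeff reports further down in this thread and picks the buffer size with the smallest summed slowdown, where slowdown is a workload's time at that size divided by that workload's best time. The scoring rule and the class and method names are assumptions made for illustration; they are not part of nj22. Only the six sizes measured in both runs are compared.

```java
public class BufferSizeChoice {

    // Buffer sizes measured in both of the runs reported in this thread.
    public static final int[] SIZES = {4096, 8192, 16384, 32768, 65536, 131072};

    // "Total time" values copied from the thread (units are not stated there).
    public static final long[] REMOTE_METAR = {41882, 30348, 25124, 23919, 26789, 27540};
    public static final long[] LOCAL_MADIS = {8557, 8534, 8307, 11004, 9247, 9010};

    private static long min(long[] values) {
        long m = Long.MAX_VALUE;
        for (long v : values) {
            m = Math.min(m, v);
        }
        return m;
    }

    /**
     * Returns the size whose summed relative slowdown across all workloads is
     * smallest. timings[w][i] is the time of workload w at sizes[i].
     * Assumed scoring rule: sum over workloads of time / best time for that workload.
     */
    public static int pick(int[] sizes, long[][] timings) {
        double bestScore = Double.MAX_VALUE;
        int bestSize = -1;
        for (int i = 0; i < sizes.length; i++) {
            double score = 0.0;
            for (long[] workload : timings) {
                score += (double) workload[i] / min(workload);
            }
            if (score < bestScore) {
                bestScore = score;
                bestSize = sizes[i];
            }
        }
        return bestSize;
    }

    public static void main(String[] args) {
        int size = pick(SIZES, new long[][] {REMOTE_METAR, LOCAL_MADIS});
        System.out.println("compromise buffer size:" + size);
    }
}
```

With only these two workloads the rule lands on 16384 rather than the remote-only optimum of 32768, which illustrates the point about not tuning for a single case. A realistic load would need many more query types than two.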



Jeff McWhirter wrote:


What experience do you all have with the performance characteristics of data access with varying buffer sizes? I have found with the point obs that it varies dramatically for remote
data sets, e.g., getting a data iterator with BUFFERSIZE using:

PointObsDataset.getDataIterator(BUFFERSIZE)

and reading this data set (~90000 obs,32 values) http://lead4.unidata.ucar.edu:8080/thredds/dodsC/station/metar/20060716_metar.nc
I get:
buffer size:4096 Total time:41882
buffer size:8192 Total time:30348
buffer size:16384 Total time:25124
buffer size:32768 Total time:23919
buffer size:65536 Total time:26789
buffer size:131072 Total time:27540

It seems as though 32768 is a sweet spot. Any ideas why the higher buffer sizes give worse performance? Are you allocating the buffers repeatedly? (thus triggering worse GC behavior?)

On the other hand, reading a local file:
/upc/share/testdata/station/madis/20060615_1200
which has 23000 obs and 180 vars I get the exact opposite:
buffer size:4096 Total time:8557
buffer size:8192 Total time:8534
buffer size:16384 Total time:8307
buffer size:32768 Total time:11004
buffer size:65536 Total time:9247
buffer size:131072 Total time:9010
buffer size:262144 Total time:9032
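
A sweep of the same shape as the two above can be reproduced on any local file with a sketch like the following. It is only a stand-in: it uses java.io.BufferedInputStream from the standard library rather than PointObsDataset.getDataIterator, reads a temporary 8 MB file rather than either dataset named here, and the class and method names are hypothetical. It reports the best of a few repeats per size to reduce noise from JIT warm-up and file-system caching.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferSizeSweep {

    /** Reads the whole file through a BufferedInputStream of the given size; returns bytes read. */
    public static long readAll(Path file, int bufferSize) throws IOException {
        long total = 0;
        // Small fixed chunk, so the stream's internal buffer size is the only thing that varies.
        byte[] chunk = new byte[1024];
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file), bufferSize)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                total += n;
            }
        }
        return total;
    }

    /** Returns the best wall-clock time in milliseconds over the given number of repeats. */
    public static long timeRead(Path file, int bufferSize, int repeats) throws IOException {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < repeats; i++) {
            long start = System.nanoTime();
            readAll(file, bufferSize);
            best = Math.min(best, (System.nanoTime() - start) / 1_000_000);
        }
        return best;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical test data: a temporary 8 MB file of zeros.
        Path file = Files.createTempFile("sweep", ".bin");
        try {
            Files.write(file, new byte[8 * 1024 * 1024]);
            for (int size = 4096; size <= 262144; size *= 2) {
                System.out.println("buffer size:" + size + " Total time:" + timeRead(file, size, 3));
            }
        } finally {
            Files.delete(file);
        }
    }
}
```

To test the GC question, the same loop could be run with a buffer allocated once and reused versus one allocated per read, comparing the two sets of timings.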



-Jeff


===============================================================================
To unsubscribe thredds-dev, visit:
http://www.unidata.ucar.edu/mailing-list-delete-form.html
===============================================================================


--
*************************************************************
Don Murray                               UCAR Unidata Program
address@hidden                        P.O. Box 3000
(303) 497-8628                              Boulder, CO 80307
http://www.unidata.ucar.edu/staff/donm
*************************************************************