
Re: phone message -- sample XML to follow (here it is)




Hi John -

This sounds great - and good timing too! I'd be very happy to try a beta version and test it out on our server at GFDL. Steve and I are actually at GFDL right now, and heading back to Seattle tonight. Can I grab the beta from the usual place on the THREDDS pages?

thanks!
Kevin

John Caron wrote:
Hi guys:

The good news is that I've found the problem with the caching. Performance is now 
a lot better, though I don't have a measurement, and a lot may depend on your 
server.

The bad news (maybe) is that I am only going to fix this in the 4.0 version of 
NcML/TDS. We are pushing hard to get this out to beta this month. I'd love to 
have you start to use it, to get feedback on other issues that may be lurking.

The main problem was the "anonymous" inner aggregations. To get the caching 
right, we need to give them ids, eg:

<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <netcdf ncoords="36500" id="first100">
      <aggregation type="union">
        <netcdf location="pr_A2.00010101-01001231.nc"/>
        <netcdf location="tasmax_A2.00010101-01001231.nc"/>
        <netcdf location="tasmin_A2.00010101-01001231.nc"/>
      </aggregation>
    </netcdf>
    <netcdf ncoords="36500" id="sec100">
      <aggregation type="union">
        <netcdf location="pr_A2.01010101-02001231.nc"/>
        <netcdf location="tasmax_A2.01010101-02001231.nc"/>
        <netcdf location="tasmin_A2.01010101-02001231.nc"/>
      </aggregation>
    </netcdf>
    <netcdf ncoords="36500" id="third100">
      <aggregation type="union">
        <netcdf location="pr_A2.02010101-03001231.nc"/>
        <netcdf location="tasmax_A2.02010101-03001231.nc"/>
        <netcdf location="tasmin_A2.02010101-03001231.nc"/>
      </aggregation>
    </netcdf>
  </aggregation>
</netcdf>

I might be able to generate auto ids, but for now they have to be added by 
hand. As I said, this will only be useful in the 4.0 version. I'll get a release 
out later today in case you want to try it.
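
Since the ids currently have to be added by hand, a small script can do it for generated NcML. This is only a minimal sketch, not anything TDS provides: the function name add_ids and the "inner0", "inner1", ... naming scheme are assumptions made for illustration, and it only touches nested netcdf elements that wrap an aggregation and have no id yet.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
NETCDF = f"{{{NCML_NS}}}netcdf"
AGG = f"{{{NCML_NS}}}aggregation"


def add_ids(ncml_text, prefix="inner"):
    """Return NcML text in which each anonymous inner aggregation has an id.

    Hypothetical helper: the id naming scheme (prefix + counter) is an
    assumption, not something NcML or TDS prescribes.
    """
    # Keep the NcML namespace as the default namespace when serializing.
    ET.register_namespace("", NCML_NS)
    root = ET.fromstring(ncml_text)
    count = 0
    for agg in root.iter(AGG):
        for child in agg.findall(NETCDF):
            # Only <netcdf> elements that themselves wrap an <aggregation>
            # are "inner aggregations"; plain file references are skipped.
            if child.find(AGG) is not None and child.get("id") is None:
                child.set("id", f"{prefix}{count}")
                count += 1
    return ET.tostring(root, encoding="unicode")
```

Existing ids are left alone, so the script can be rerun on a file that was already edited by hand. It does not check that a generated id differs from a hand-written one.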

John

Steve Hankin wrote:
Hi John,

Thanks for looking into this.   At this moment Kevin is modifying the
code that creates the ncML aggregation configuration from the contents
of our database.  It looks like we will be "down to the wire" in seeing
how much faster TDS becomes when we start using the improved ncML (the
changes are bigger than just moving the ncoords attribute).
Can we ask you to "stand by" and maybe be willing to set your peepers on
it later today?  Kevin's preliminary tests indicated that we will still
be getting the cache hit failures (that is, for unknown reasons TDS rebuilds
the aggregation in cache instead of reusing what it saved previously). But we don't have an up-to-date TDS site to show you yet.

   - Steve

John Caron wrote:
hi kevin:

your ftp site is pretty slow (600 KB/sec) - is it throttled, or just
overwhelmed? should i wait until tonight to try to download these files?

Kevin OBrien wrote:
Hi John -

I did as you suggested and moved the ncoords attribute to the outer
aggregation, and I was able to get to the aggregation in around 28
seconds.  Just to confirm that it wasn't something system-related, I
changed the xml configuration back, and verified that when the ncoords
attribute was in the inner aggregation, it took around 2 minutes to
open.  So that's a big speed improvement!  I think I will expand the
aggregation xml to include full experiments and see how the performance
changes.

By the way, you can get all of these files at
ftp://nomads.gfdl.noaa.gov/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/

and you'll see there are many more that would actually be configured
into the aggregation.

One thing I did notice and have a question about - after I moved the
ncoords attribute to the outer aggregation, and I restarted the server -
a cache file showed up in the cacheAged directory.   When I then just
restarted the server to test the use of cache, after I had opened the
aggregation (which again took around 30 seconds), I noticed that the
cache file in the cacheAged directory had apparently been updated (at
least the time stamp of the file was new).  If nothing in the
aggregation has changed, should it be updating the cache file?  Or
should it use the cache file already there?
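
One way to make the "was the cache file rewritten?" check repeatable is to snapshot the modification times in the cache directory before and after a restart. This is a rough sketch only: snapshot_mtimes and rewritten_files are hypothetical helper names, and a changed timestamp shows that the file was touched, not why TDS touched it.

```python
import os


def snapshot_mtimes(cache_dir):
    """Map each regular file in cache_dir to its modification time."""
    return {
        entry.name: entry.stat().st_mtime
        for entry in os.scandir(cache_dir)
        if entry.is_file()
    }


def rewritten_files(before, after):
    """Return sorted names that are new or whose mtime differs in 'after'."""
    return sorted(
        name for name, mtime in after.items()
        if before.get(name) != mtime
    )
```

Take one snapshot of the cacheAged directory before restarting the server, open the aggregation, take a second snapshot, and pass both to rewritten_files. An empty list would suggest the existing cache file was reused.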

thanks -
kevin

John Caron wrote:
Hi Kevin, Steve:

You should try putting the ncoords attribute on the outer aggregation:

            <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
                    ncoords="36500">
              <aggregation type="union">
                 <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/pr_A2.00010101-01001231.nc" />
                 <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmax_A2.00010101-01001231.nc" />
                 <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmin_A2.00010101-01001231.nc" />
              </aggregation>
            </netcdf>

let me know if that helps.
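
For NcML that is generated rather than written by hand, the same move can be scripted. The following is a minimal sketch under stated assumptions: hoist_ncoords is a hypothetical helper, it only acts when every member of a union carries the same ncoords value, and it has not been checked against any TDS release.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
NETCDF = f"{{{NCML_NS}}}netcdf"
AGG = f"{{{NCML_NS}}}aggregation"


def hoist_ncoords(ncml_text):
    """Move ncoords from the members of a union up to the wrapping <netcdf>."""
    # Keep the NcML namespace as the default namespace when serializing.
    ET.register_namespace("", NCML_NS)
    root = ET.fromstring(ncml_text)
    for wrapper in root.iter(NETCDF):
        union = wrapper.find(AGG)
        if union is None or union.get("type") != "union":
            continue
        members = union.findall(NETCDF)
        counts = {m.get("ncoords") for m in members}
        # Hoist only when all members agree on a single ncoords value.
        if len(counts) == 1 and None not in counts:
            wrapper.set("ncoords", counts.pop())
            for m in members:
                del m.attrib["ncoords"]
    return ET.tostring(root, encoding="unicode")
```

Unions whose members disagree on ncoords, or lack it, are left untouched so that nothing is silently guessed.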

I'd like to test this nested aggregation as a use case. Can I get
those 9 files? Thanks.


Steve Hankin wrote:
(This is a continuation of the conversation that Kevin O'Brien
started with you.)

Hi John,

Below is the ncML and TDS configuration information.  It all "works"
...  except the caching.  Any clues?

     - Steve

===

This from threddsConfig.xml

  <AggregationCache>
    <dir>/home/pmel/DataPortal/apache-tomcat-5.5.25/content/thredds/cacheAged/</dir>
    <scour>24 hours</scour>
    <maxAge>90 days</maxAge>
  </AggregationCache>

===

And this is the latest ncML that Kevin tested:  "It took nearly two
minutes to open the aggregation the first time.  After that, accesses
were quick -- evidently caching was working.  Then I restarted the
tomcat server, and again it took nearly two minutes to open the
aggregation.  I could see that the cache file in the caching
directory was again updated after the second tomcat restart (ie, the
cache was rewritten rather than used)..."

<catalog name="test IPCC Datasets"
         xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <service name="thisDODS3" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset ID="CM2Q-d2_1PctTo4x_j1 atmos daily all vars 00010101-03001231 test"
           name="CM2Q-d2_1PctTo4x_j1 atmos daily all vars 00010101-03001231 test"
           urlPath="ipcc_ar4_CM2.0_R1_1to4x-0_daily_atmos_00010101-03001231_test">
    <serviceName>thisDODS3</serviceName>
    <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
      <aggregation dimName="time" type="joinExisting">
        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
          <aggregation type="union">
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/pr_A2.00010101-01001231.nc"
                    ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmax_A2.00010101-01001231.nc"
                    ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmin_A2.00010101-01001231.nc"
                    ncoords="36500" />
          </aggregation>
        </netcdf>
        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
          <aggregation type="union">
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/pr_A2.01010101-02001231.nc"
                    ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmax_A2.01010101-02001231.nc"
                    ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmin_A2.01010101-02001231.nc"
                    ncoords="36500" />
          </aggregation>
        </netcdf>
        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
          <aggregation type="union">
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/pr_A2.02010101-03001231.nc"
                    ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmax_A2.02010101-03001231.nc"
                    ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmin_A2.02010101-03001231.nc"
                    ncoords="36500" />
          </aggregation>
        </netcdf>
      </aggregation>
    </netcdf>
  </dataset>
</catalog>




Steve Hankin wrote:
Hi John,

We're at the phone number in the signature line below.  Will follow
this email shortly with some XML fragments ... hoping maybe you have
a suggestion.

   - Steve

--
Steve Hankin, NOAA/PMEL -- address@hidden
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744

"The only thing necessary for the triumph of evil is for good men
to do nothing." -- Edmund Burke