Re: [thredds] ncml aggregation 200 times slower in thredds 4.6.1

  • To: Heiko Klein <Heiko.Klein@xxxxxx>
  • Subject: Re: [thredds] ncml aggregation 200 times slower in thredds 4.6.1
  • From: John Caron <caron@xxxxxxxx>
  • Date: Mon, 25 May 2015 12:35:25 -0600
Hi Heiko:

Are you sure there are 4500 files? Is it possible there are 100 files with
49 time coordinates each? If there are ~4500 files with ~50 coords each,
you would see ~225,000 time coordinates.

John

On Fri, May 22, 2015 at 4:34 AM, Heiko Klein <Heiko.Klein@xxxxxx> wrote:

> Hi John,
>
> I read now also David Robertsons mail and tried a
>
> ncdump -v time '
> http://vis-m8.met.no/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml'
>
> to cache the coordinate values. Performance after that get's slightly
> better, from 20s to 17s for a
> $ time ncdump -h '
> http://vis-m8.met.no/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml'
>
>
> I've had now a closer look at the 'cache' entries as described by David.
> Even after a 'ncdump -v time...' only 100 files (of 4575) are cached. The
> thredds 4.3 cache had also only 100 files cached.
>
> Questions are now:
>
> a) Why does 4.6 need all the coordinate values even if not requested? (Or
> why didn't 4.3 need those values?)
>
>
> b) Is there a setting to increase the amount of cached aggregation files?
> (I followed more or less blindly
> http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html,
> so settings with '100' are:
> <NetcdfFileCache>
>   <minFiles>100</minFiles>
> ...
> <TimePartition>
>   <minFiles>100</minFiles>
> ...
> )
>
> I attach my threddsConfig.xml and the mbr000_all.ncml file.
>
> Heiko
>
>
>
> On 2015-05-22 04:39, John Caron wrote:
>
>> ok, so looking closer, i see that there is a single file to cache the
>> results, so .1 seconds is more likely.
>>
>> make sure all files have been cached, by requesting the time coordinate
>> values in a dods request (see post earlier today).
>>
>> if thats the case, and subsequent accesses are still slow, send me the
>> mbr000_all.ncml file.
>>
>> ps, i am gone until monday, so we can resume then...
>>
>>
>>
>> On Thu, May 21, 2015 at 6:58 PM, John Caron <caron@xxxxxxxx
>> <mailto:caron@xxxxxxxx>> wrote:
>>
>>     reading 4575 files in .1 seconds seems a bit too fast. Im guessing
>>     that the dataset is actually getting cached in memory, and you are
>>     seeing that performance. Then the question might be why isnt that
>>     happening as fast in 4.6?
>>
>>     If you have "remote management" enabled, you can see what files are
>>     in the cache, and also clear the cache.
>>
>>
>> http://www.unidata.ucar.edu/software/thredds/v5.0/tds/reference/RemoteManagement.html
>>
>>     so, how long does it take in each version:
>>
>>     1) the first time that the aggregation is called, when theres
>>     nothing in the disk cache (eg
>> lustre/mnt/heikok/FAUNA/mbr000_all.ncml)
>>     2) the first time that the aggregation is called after the TDS
>>     starts up, when the disk cache is populated, but the dataset is not
>>     in memory (or it has been cleared from memroy)
>>     3) how long it takes after its in memory.
>>
>>     i believe that 2) could take 7 secs, and 3) takes .1 second, and
>>     maybe 1) takes 10-120 secs.
>>
>>     its possible that for 4.6, 2) has slowed down to 20 secs, and maybe
>>     its not getting memory cached so 3) never happens. I will
>>     investigate that possibility.
>>
>>     if you get a chance to experiment with checking/clearing memory
>>     cache with the 2 versions, let me know the results.
>>
>>     John
>>
>>     On Wed, May 20, 2015 at 10:41 AM, John Caron <caron@xxxxxxxx
>>     <mailto:caron@xxxxxxxx>> wrote:
>>
>>         ok, we'll see if we can reproduce the problem.
>>
>>
>>
>>         On Wed, May 20, 2015 at 10:25 AM, Heiko Klein
>>         <heiko.klein@xxxxxx <mailto:heiko.klein@xxxxxx>> wrote:
>>
>>             Hi John,
>>
>>             in 4.6.1, request-time stays at ~20s each time I try it.
>>             Only in 4.3.23 I see a huge perfomance-gain (from 7s to
>>             0.1s) after the first fetch.
>>
>>             Heiko
>>
>>             ----- Original Message -----
>>              > Hi Heiko:
>>              >
>>              > Can you see whether the second time you access the
>>             dataset, if the times
>>              > are fast again?
>>              >
>>              > thanks,
>>              > John
>>              >
>>              > On Wed, May 20, 2015 at 2:07 AM, Heiko Klein
>>             <Heiko.Klein@xxxxxx <mailto:Heiko.Klein@xxxxxx>> wrote:
>>              >
>>              > > Hi,
>>              > >
>>              > > I have some performance problems after upgrading to
>>             thredds 4.6.1.
>>              > >
>>              > > I'm aggregating a large dataset with a joinExisting
>>             aggregation. Reading
>>              > > the metadata from the aggregation took, with thredds
>>             4.3.23, about 0.1s
>>              > > (first time up to 7s). After upgrading to 4.6.1, the
>>             same request takes 20s
>>              > > (second time, first time not measured) and is unusable
>>             slow. A 'ncview' of
>>              > > the aggregated dataset is no longer possible.
>>              > >
>>              > > Suspecting some caching problems, I followed the
>>             guidelines in
>>              > >
>>
>> http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html
>>              > > The aggregation-cache contains two files
>>              > >
>>              > > $ ls -l file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
>>              > > lustre/mnt/heikok/FAUNA/mbr000_all.ncml
>>              > > -rw-r--r-- 1 tomcat7 tomcat7 74339 May 20 09:43
>>              > > file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
>>              > > -rw-r--r-- 1 tomcat7 tomcat7 74131 May 20 10:01
>>              > > lustre/mnt/heikok/FAUNA/mbr000_all.ncml
>>              > >
>>              > > (I guess the first one is thredds 4.3, while the second
>>             one is thredds 4.6)
>>              > >
>>              > >
>>              > > The ncml-file is:
>>              > >
>>              > > <netcdf
>>             xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2
>> ">
>>              > >   <aggregation dimName="time" type="joinExisting">
>>              > >     <scan location="." regExp=".*snapMet000\.nc$"
>>             subdirs="true"/>
>>              > >   </aggregation>
>>              > > </netcdf>
>>              > >
>>              > >
>>              > > It is aggregating 4575 files, each about 1.3G.
>>              > >
>>              > >
>>              > > The requests according to the logs are:
>>              > > 2015-05-20T09:49:17.111 +0200 [    372240][     281]
>>             INFO  -
>>              > > threddsServlet - Remote host: 157.249.113.42 - Request:
>>             "GET
>>              > >
>>             /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds
>>             HTTP/1.1"
>>              > > 2015-05-20T09:49:17.116 +0200 [    372245][     281]
>>             INFO  -
>>              > > threddsServlet - Request Completed - 200 - -1 - 5
>>              > > 2015-05-20T09:49:17.117 +0200 [    372246][     282]
>>             INFO  -
>>              > > threddsServlet - Remote host: 157.249.113.42 - Request:
>>             "GET
>>              > >
>>             /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.das
>>             HTTP/1.1"
>>              > > 2015-05-20T09:49:17.131 +0200 [    372260][     282]
>>             INFO  -
>>              > > threddsServlet - Request Completed - 200 - -1 - 14
>>              > > 2015-05-20T09:49:17.134 +0200 [    372263][     283]
>>             INFO  -
>>              > > threddsServlet - Remote host: 157.249.113.42 - Request:
>>             "GET
>>              > >
>>             /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds
>>             HTTP/1.1"
>>              > > 2015-05-20T09:49:17.138 +0200 [    372267][     283]
>>             INFO  -
>>              > > threddsServlet - Request Completed - 200 - -1 - 4
>>              > >
>>              > >
>>              > > Best regards,
>>              > >
>>              > > Heiko
>>              > >
>>              > >
>>              > > --
>>              > > Dr. Heiko Klein                   Norwegian
>>             Meteorological Institute
>>              > > Tel. + 47 22 96 32 58
>>             <tel:%2B%2047%2022%2096%2032%2058>             P.O. Box 43
>>             Blindern
>>              > > http://www.met.no                 0313 Oslo NORWAY
>>              > >
>>              > > _______________________________________________
>>              > > thredds mailing list
>>              > > thredds@xxxxxxxxxxxxxxxx <mailto:
>> thredds@xxxxxxxxxxxxxxxx>
>>              > > For list information or to unsubscribe,  visit:
>>              > > http://www.unidata.ucar.edu/mailing_lists/
>>              > >
>>              >
>>
>>
>>
>>
>>
  • 2015 messages navigation, sorted by:
    1. Thread
    2. Subject
    3. Author
    4. Date
    5. ↑ Table Of Contents
  • Search the thredds archives: