Hi Heiko:
Are you sure there are 4500 files? Is it possible there are 100 files with
49 time coordinates each? If there are ~4500 files with ~50 coords each,
you would see ~225,000 time coordinates.
John
On Fri, May 22, 2015 at 4:34 AM, Heiko Klein <Heiko.Klein@xxxxxx> wrote:
> Hi John,
>
> I read now also David Robertsons mail and tried a
>
> ncdump -v time '
> http://vis-m8.met.no/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml'
>
> to cache the coordinate values. Performance after that get's slightly
> better, from 20s to 17s for a
> $ time ncdump -h '
> http://vis-m8.met.no/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml'
>
>
> I've had now a closer look at the 'cache' entries as described by David.
> Even after a 'ncdump -v time...' only 100 files (of 4575) are cached. The
> thredds 4.3 cache had also only 100 files cached.
>
> Questions are now:
>
> a) Why does 4.6 need all the coordinate values even if not requested? (Or
> why didn't 4.3 need those values?)
>
>
> b) Is there a setting to increase the amount of cached aggregation files?
> (I followed more or less blindly
> http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html,
> so settings with '100' are:
> <NetcdfFileCache>
> <minFiles>100</minFiles>
> ...
> <TimePartition>
> <minFiles>100</minFiles>
> ...
> )
>
> I attach my threddsConfig.xml and the mbr000_all.ncml file.
>
> Heiko
>
>
>
> On 2015-05-22 04:39, John Caron wrote:
>
>> ok, so looking closer, i see that there is a single file to cache the
>> results, so .1 seconds is more likely.
>>
>> make sure all files have been cached, by requesting the time coordinate
>> values in a dods request (see post earlier today).
>>
>> if thats the case, and subsequent accesses are still slow, send me the
>> mbr000_all.ncml file.
>>
>> ps, i am gone until monday, so we can resume then...
>>
>>
>>
>> On Thu, May 21, 2015 at 6:58 PM, John Caron <caron@xxxxxxxx
>> <mailto:caron@xxxxxxxx>> wrote:
>>
>> reading 4575 files in .1 seconds seems a bit too fast. Im guessing
>> that the dataset is actually getting cached in memory, and you are
>> seeing that performance. Then the question might be why isnt that
>> happening as fast in 4.6?
>>
>> If you have "remote management" enabled, you can see what files are
>> in the cache, and also clear the cache.
>>
>>
>> http://www.unidata.ucar.edu/software/thredds/v5.0/tds/reference/RemoteManagement.html
>>
>> so, how long does it take in each version:
>>
>> 1) the first time that the aggregation is called, when theres
>> nothing in the disk cache (eg
>> lustre/mnt/heikok/FAUNA/mbr000_all.ncml)
>> 2) the first time that the aggregation is called after the TDS
>> starts up, when the disk cache is populated, but the dataset is not
>> in memory (or it has been cleared from memroy)
>> 3) how long it takes after its in memory.
>>
>> i believe that 2) could take 7 secs, and 3) takes .1 second, and
>> maybe 1) takes 10-120 secs.
>>
>> its possible that for 4.6, 2) has slowed down to 20 secs, and maybe
>> its not getting memory cached so 3) never happens. I will
>> investigate that possibility.
>>
>> if you get a chance to experiment with checking/clearing memory
>> cache with the 2 versions, let me know the results.
>>
>> John
>>
>> On Wed, May 20, 2015 at 10:41 AM, John Caron <caron@xxxxxxxx
>> <mailto:caron@xxxxxxxx>> wrote:
>>
>> ok, we'll see if we can reproduce the problem.
>>
>>
>>
>> On Wed, May 20, 2015 at 10:25 AM, Heiko Klein
>> <heiko.klein@xxxxxx <mailto:heiko.klein@xxxxxx>> wrote:
>>
>> Hi John,
>>
>> in 4.6.1, request-time stays at ~20s each time I try it.
>> Only in 4.3.23 I see a huge perfomance-gain (from 7s to
>> 0.1s) after the first fetch.
>>
>> Heiko
>>
>> ----- Original Message -----
>> > Hi Heiko:
>> >
>> > Can you see whether the second time you access the
>> dataset, if the times
>> > are fast again?
>> >
>> > thanks,
>> > John
>> >
>> > On Wed, May 20, 2015 at 2:07 AM, Heiko Klein
>> <Heiko.Klein@xxxxxx <mailto:Heiko.Klein@xxxxxx>> wrote:
>> >
>> > > Hi,
>> > >
>> > > I have some performance problems after upgrading to
>> thredds 4.6.1.
>> > >
>> > > I'm aggregating a large dataset with a joinExisting
>> aggregation. Reading
>> > > the metadata from the aggregation took, with thredds
>> 4.3.23, about 0.1s
>> > > (first time up to 7s). After upgrading to 4.6.1, the
>> same request takes 20s
>> > > (second time, first time not measured) and is unusable
>> slow. A 'ncview' of
>> > > the aggregated dataset is no longer possible.
>> > >
>> > > Suspecting some caching problems, I followed the
>> guidelines in
>> > >
>>
>> http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html
>> > > The aggregation-cache contains two files
>> > >
>> > > $ ls -l file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
>> > > lustre/mnt/heikok/FAUNA/mbr000_all.ncml
>> > > -rw-r--r-- 1 tomcat7 tomcat7 74339 May 20 09:43
>> > > file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
>> > > -rw-r--r-- 1 tomcat7 tomcat7 74131 May 20 10:01
>> > > lustre/mnt/heikok/FAUNA/mbr000_all.ncml
>> > >
>> > > (I guess the first one is thredds 4.3, while the second
>> one is thredds 4.6)
>> > >
>> > >
>> > > The ncml-file is:
>> > >
>> > > <netcdf
>> xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2
>> ">
>> > > <aggregation dimName="time" type="joinExisting">
>> > > <scan location="." regExp=".*snapMet000\.nc$"
>> subdirs="true"/>
>> > > </aggregation>
>> > > </netcdf>
>> > >
>> > >
>> > > It is aggregating 4575 files, each about 1.3G.
>> > >
>> > >
>> > > The requests according to the logs are:
>> > > 2015-05-20T09:49:17.111 +0200 [ 372240][ 281]
>> INFO -
>> > > threddsServlet - Remote host: 157.249.113.42 - Request:
>> "GET
>> > >
>> /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds
>> HTTP/1.1"
>> > > 2015-05-20T09:49:17.116 +0200 [ 372245][ 281]
>> INFO -
>> > > threddsServlet - Request Completed - 200 - -1 - 5
>> > > 2015-05-20T09:49:17.117 +0200 [ 372246][ 282]
>> INFO -
>> > > threddsServlet - Remote host: 157.249.113.42 - Request:
>> "GET
>> > >
>> /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.das
>> HTTP/1.1"
>> > > 2015-05-20T09:49:17.131 +0200 [ 372260][ 282]
>> INFO -
>> > > threddsServlet - Request Completed - 200 - -1 - 14
>> > > 2015-05-20T09:49:17.134 +0200 [ 372263][ 283]
>> INFO -
>> > > threddsServlet - Remote host: 157.249.113.42 - Request:
>> "GET
>> > >
>> /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds
>> HTTP/1.1"
>> > > 2015-05-20T09:49:17.138 +0200 [ 372267][ 283]
>> INFO -
>> > > threddsServlet - Request Completed - 200 - -1 - 4
>> > >
>> > >
>> > > Best regards,
>> > >
>> > > Heiko
>> > >
>> > >
>> > > --
>> > > Dr. Heiko Klein Norwegian
>> Meteorological Institute
>> > > Tel. + 47 22 96 32 58
>> <tel:%2B%2047%2022%2096%2032%2058> P.O. Box 43
>> Blindern
>> > > http://www.met.no 0313 Oslo NORWAY
>> > >
>> > > _______________________________________________
>> > > thredds mailing list
>> > > thredds@xxxxxxxxxxxxxxxx <mailto:
>> thredds@xxxxxxxxxxxxxxxx>
>> > > For list information or to unsubscribe, visit:
>> > > http://www.unidata.ucar.edu/mailing_lists/
>> > >
>> >
>>
>>
>>
>>
>>