Hi John,
you are right, there are only 104 files, so the number of files in the
cache files matches the total files. Sorry about adding to the confusion.
Heiko
On 2015-05-25 20:35, John Caron wrote:
Hi Heiko:
Are you sure there are 4500 files? Is it possible there are 100 files
with 49 time coordinates each? If there are ~4500 files with ~50 coords
each, you would see ~225,000 time coordinates.
John
On Fri, May 22, 2015 at 4:34 AM, Heiko Klein <Heiko.Klein@xxxxxx
<mailto:Heiko.Klein@xxxxxx>> wrote:
Hi John,
I read now also David Robertsons mail and tried a
ncdump -v time
'http://vis-m8.met.no/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml'
to cache the coordinate values. Performance after that get's
slightly better, from 20s to 17s for a
$ time ncdump -h
'http://vis-m8.met.no/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml'
I've had now a closer look at the 'cache' entries as described by
David. Even after a 'ncdump -v time...' only 100 files (of 4575) are
cached. The thredds 4.3 cache had also only 100 files cached.
Questions are now:
a) Why does 4.6 need all the coordinate values even if not
requested? (Or why didn't 4.3 need those values?)
b) Is there a setting to increase the amount of cached aggregation
files? (I followed more or less blindly
http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html,
so settings with '100' are:
<NetcdfFileCache>
<minFiles>100</minFiles>
...
<TimePartition>
<minFiles>100</minFiles>
...
)
I attach my threddsConfig.xml and the mbr000_all.ncml file.
Heiko
On 2015-05-22 04:39, John Caron wrote:
ok, so looking closer, i see that there is a single file to
cache the
results, so .1 seconds is more likely.
make sure all files have been cached, by requesting the time
coordinate
values in a dods request (see post earlier today).
if thats the case, and subsequent accesses are still slow, send
me the
mbr000_all.ncml file.
ps, i am gone until monday, so we can resume then...
On Thu, May 21, 2015 at 6:58 PM, John Caron <caron@xxxxxxxx
<mailto:caron@xxxxxxxx>
<mailto:caron@xxxxxxxx <mailto:caron@xxxxxxxx>>> wrote:
reading 4575 files in .1 seconds seems a bit too fast. Im
guessing
that the dataset is actually getting cached in memory, and
you are
seeing that performance. Then the question might be why
isnt that
happening as fast in 4.6?
If you have "remote management" enabled, you can see what
files are
in the cache, and also clear the cache.
http://www.unidata.ucar.edu/software/thredds/v5.0/tds/reference/RemoteManagement.html
so, how long does it take in each version:
1) the first time that the aggregation is called, when theres
nothing in the disk cache (eg
lustre/mnt/heikok/FAUNA/mbr000_all.ncml)
2) the first time that the aggregation is called after the TDS
starts up, when the disk cache is populated, but the
dataset is not
in memory (or it has been cleared from memroy)
3) how long it takes after its in memory.
i believe that 2) could take 7 secs, and 3) takes .1
second, and
maybe 1) takes 10-120 secs.
its possible that for 4.6, 2) has slowed down to 20 secs,
and maybe
its not getting memory cached so 3) never happens. I will
investigate that possibility.
if you get a chance to experiment with checking/clearing memory
cache with the 2 versions, let me know the results.
John
On Wed, May 20, 2015 at 10:41 AM, John Caron
<caron@xxxxxxxx <mailto:caron@xxxxxxxx>
<mailto:caron@xxxxxxxx <mailto:caron@xxxxxxxx>>> wrote:
ok, we'll see if we can reproduce the problem.
On Wed, May 20, 2015 at 10:25 AM, Heiko Klein
<heiko.klein@xxxxxx <mailto:heiko.klein@xxxxxx>
<mailto:heiko.klein@xxxxxx <mailto:heiko.klein@xxxxxx>>> wrote:
Hi John,
in 4.6.1, request-time stays at ~20s each time I
try it.
Only in 4.3.23 I see a huge perfomance-gain (from 7s to
0.1s) after the first fetch.
Heiko
----- Original Message -----
> Hi Heiko:
>
> Can you see whether the second time you access the
dataset, if the times
> are fast again?
>
> thanks,
> John
>
> On Wed, May 20, 2015 at 2:07 AM, Heiko Klein
<Heiko.Klein@xxxxxx <mailto:Heiko.Klein@xxxxxx>
<mailto:Heiko.Klein@xxxxxx <mailto:Heiko.Klein@xxxxxx>>> wrote:
>
> > Hi,
> >
> > I have some performance problems after
upgrading to
thredds 4.6.1.
> >
> > I'm aggregating a large dataset with a
joinExisting
aggregation. Reading
> > the metadata from the aggregation took, with
thredds
4.3.23, about 0.1s
> > (first time up to 7s). After upgrading to
4.6.1, the
same request takes 20s
> > (second time, first time not measured) and is
unusable
slow. A 'ncview' of
> > the aggregated dataset is no longer possible.
> >
> > Suspecting some caching problems, I followed the
guidelines in
> >
http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html
> > The aggregation-cache contains two files
> >
> > $ ls -l
file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
> > lustre/mnt/heikok/FAUNA/mbr000_all.ncml
> > -rw-r--r-- 1 tomcat7 tomcat7 74339 May 20 09:43
> > file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
> > -rw-r--r-- 1 tomcat7 tomcat7 74131 May 20 10:01
> > lustre/mnt/heikok/FAUNA/mbr000_all.ncml
> >
> > (I guess the first one is thredds 4.3, while
the second
one is thredds 4.6)
> >
> >
> > The ncml-file is:
> >
> > <netcdf
xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
> > <aggregation dimName="time" type="joinExisting">
> > <scan location="." regExp=".*snapMet000\.nc$"
subdirs="true"/>
> > </aggregation>
> > </netcdf>
> >
> >
> > It is aggregating 4575 files, each about 1.3G.
> >
> >
> > The requests according to the logs are:
> > 2015-05-20T09:49:17.111 +0200 [ 372240][
281]
INFO -
> > threddsServlet - Remote host: 157.249.113.42 -
Request:
"GET
> >
/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds
HTTP/1.1"
> > 2015-05-20T09:49:17.116 +0200 [ 372245][
281]
INFO -
> > threddsServlet - Request Completed - 200 - -1 - 5
> > 2015-05-20T09:49:17.117 +0200 [ 372246][
282]
INFO -
> > threddsServlet - Remote host: 157.249.113.42 -
Request:
"GET
> >
/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.das
HTTP/1.1"
> > 2015-05-20T09:49:17.131 +0200 [ 372260][
282]
INFO -
> > threddsServlet - Request Completed - 200 - -1 - 14
> > 2015-05-20T09:49:17.134 +0200 [ 372263][
283]
INFO -
> > threddsServlet - Remote host: 157.249.113.42 -
Request:
"GET
> >
/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds
HTTP/1.1"
> > 2015-05-20T09:49:17.138 +0200 [ 372267][
283]
INFO -
> > threddsServlet - Request Completed - 200 - -1 - 4
> >
> >
> > Best regards,
> >
> > Heiko
> >
> >
> > --
> > Dr. Heiko Klein Norwegian
Meteorological Institute
> > Tel. + 47 22 96 32 58
<tel:%2B%2047%2022%2096%2032%2058>
<tel:%2B%2047%2022%2096%2032%2058> P.O.
Box 43
Blindern
> > http://www.met.no 0313 Oslo NORWAY
> >
> > _______________________________________________
> > thredds mailing list
> > thredds@xxxxxxxxxxxxxxxx
<mailto:thredds@xxxxxxxxxxxxxxxx>
<mailto:thredds@xxxxxxxxxxxxxxxx <mailto:thredds@xxxxxxxxxxxxxxxx>>
> > For list information or to unsubscribe, visit:
> > http://www.unidata.ucar.edu/mailing_lists/
> >
>
--
Dr. Heiko Klein Norwegian Meteorological Institute
Tel. + 47 22 96 32 58 P.O. Box 43 Blindern
http://www.met.no 0313 Oslo NORWAY