Hi,
TDS version 4.2.10 shows scalability problems when serving many files
through static catalogs.
The following entry is in the threddsConfig.xml:
<Catalog>
<cache>false</cache>
</Catalog>
The config and a heap dump can be found here: http://cmip3.dkrz.de/test/
The thredds instance is running here: http://cmip3.dkrz.de/thredds
To sum up my findings:
If only a few files are reused, a large number of static catalogs can be
generated without triggering memory problems (about 20,000 catalogs per
GB of memory).
However, if each catalog points to ~40 different files, the number
drops considerably (to ~1,800 catalogs per GB).
The limiting factor seems to be the number of files, which implies that
something about them is being cached (file objects? paths?).
This considerably limits the usability of the TDS for holding CMIP5
replicas.
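As a rough sanity check, the two measurements above can be turned into a per-file memory estimate. This is only a back-of-the-envelope sketch in Python: it assumes the whole difference between the two cases comes from the distinct files, that "GB" means 2^30 bytes, and that the ~1,800 catalogs per GB rate would also hold for the 40,000 datasets mentioned in the earlier message quoted below. The function name is made up for illustration.

```python
# Back-of-the-envelope estimate from the two measured rates.
GIB = 1024 ** 3  # assumption: "GB" is read as 2**30 bytes

catalogs_per_gb_shared = 20_000    # catalogs that reuse the same few files
catalogs_per_gb_distinct = 1_800   # catalogs with ~40 distinct files each
files_per_catalog = 40

bytes_per_catalog_shared = GIB / catalogs_per_gb_shared
bytes_per_catalog_distinct = GIB / catalogs_per_gb_distinct

# Assumption: the extra memory is caused only by the distinct files.
bytes_per_file = (
    bytes_per_catalog_distinct - bytes_per_catalog_shared
) / files_per_catalog


def heap_needed_gb(n_catalogs, per_gb=catalogs_per_gb_distinct):
    """Heap (in GB) needed for n_catalogs at the given catalogs-per-GB rate."""
    return n_catalogs / per_gb


if __name__ == "__main__":
    print(f"per catalog (shared files):   {bytes_per_catalog_shared / 1024:.0f} KiB")
    print(f"per catalog (distinct files): {bytes_per_catalog_distinct / 1024:.0f} KiB")
    print(f"implied cost per file:        {bytes_per_file / 1024:.1f} KiB")
    print(f"heap for 40,000 catalogs:     {heap_needed_gb(40_000):.1f} GB")
```

Under these assumptions each distinct file costs on the order of 13 KiB of heap, and 40,000 such catalogs would need roughly 22 GB, which is more than the 16 GB the JVM currently has.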
Thanks,
Estani
PS: Thanks John for pointing me to the correct email list.
On 22.06.2012 20:16, John Caron wrote:
On 6/19/2012 4:09 AM, Estanislao Gonzalez wrote:
Hi John,
I've been using the latest TDS for a while, since the older one used
by ESGF wouldn't scale.
Now I'm reaching the limit of it too and I wonder if there's anything
I could do.
The TDS has 23,700 catalogs published at the moment (~2.3G) and the
JVM is running with 16GB.
A while ago I ran some stress tests on it and measured ~20,000
catalogs per GB. However, the catalogs were artificially generated
from an existing one, which basically meant that the files they pointed
to were always the same.
Now I'm thinking that perhaps the directory structure or files or
something is still being cached. Is there any configuration I should
be using to prevent the TDS from taking that much memory?
I expect to publish 40,000 datasets. Are there any known limits for
what the TDS can handle? Benchmarks perhaps?
The server is running here: cmip3.dkrz.de/thredds
Thanks,
Estani
Hi Estani:
so you have
<Catalog>
<cache>false</cache>
</Catalog>
on version > 4.2.8? What exact version?
Can you generate a heap dump from your running production server and
make it available for download?
PS: it's generally better to send these questions to
support-thredds@xxxxxxxxxxxxxxxx so they get logged into our tracker,
or to thredds@xxxxxxxxxxxxxxxx so all can follow along.
John
--
Estanislao Gonzalez
Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany
Phone: +49 (40) 46 00 94-126
E-Mail: gonzalez@xxxxxxx