[thredds] 4.2.10 memory scalability issues

Hi,

TDS version 4.2.10 shows scalability issues when serving many files through static catalogs.
threddsConfig.xml contains the following entry:
 <Catalog>
   <cache>false</cache>
 </Catalog>

The config and a heap dump can be found here: http://cmip3.dkrz.de/test/
The thredds instance is running here: http://cmip3.dkrz.de/thredds

To sum up my findings:
If only a few files are reused across catalogs, a large number of static catalogs can be generated without triggering memory problems (about 20,000 catalogs per GB of memory). However, if each catalog points to ~40 different files, that number drops considerably (to about 1,800 catalogs per GB).

The limiting factor seems to be the number of files, which implies that something is being cached for each of them (file objects? paths?).
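
To put the two figures above side by side, here is a rough back-of-envelope sketch in Python. It is not a measurement: it assumes 1 GB = 1024**3 bytes, that heap use grows linearly with the number of catalogs and of distinct files, and that the reused-files case carries negligible per-file cost. The function names are made up for illustration.

```python
# Back-of-envelope memory model built only from the figures quoted above.
# Assumptions (not measured): 1 GB = 1024**3 bytes, and heap use grows
# linearly with the number of catalogs and of distinct files.

GIB = 1024 ** 3  # bytes in one GB, binary convention (assumption)


def bytes_per_catalog(catalogs_per_gb):
    """Average heap bytes consumed per catalog at a given density."""
    return GIB / catalogs_per_gb


def extra_bytes_per_file(reused_rate, distinct_rate, files_per_catalog):
    """Per-file overhead implied by the gap between the two densities."""
    gap = bytes_per_catalog(distinct_rate) - bytes_per_catalog(reused_rate)
    return gap / files_per_catalog


def heap_gb_needed(n_catalogs, catalogs_per_gb):
    """Heap (in GB) needed for n_catalogs if the density stays constant."""
    return n_catalogs / catalogs_per_gb


if __name__ == "__main__":
    print("KiB per catalog, files reused:  ", bytes_per_catalog(20000) / 1024)
    print("KiB per catalog, ~40 files each:", bytes_per_catalog(1800) / 1024)
    print("extra KiB per distinct file:    ",
          extra_bytes_per_file(20000, 1800, 40) / 1024)
```

Under these assumptions a catalog costs roughly 52 KiB when files are reused and roughly 583 KiB when it points to ~40 distinct files, i.e. on the order of 13 KiB per distinct file. If that held, `heap_gb_needed(40000, 1800)` would put the 40,000 datasets mentioned in the quoted message below at roughly 22 GB of heap.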

This considerably limits the usability of the TDS for holding CMIP5 replicas.

Thanks,
Estani

PS: Thanks, John, for pointing me to the correct mailing list.

On 22.06.2012 20:16, John Caron wrote:
On 6/19/2012 4:09 AM, Estanislao Gonzalez wrote:
Hi John,

I've been using the latest TDS for a while, since the older one used by ESGF wouldn't scale. Now I'm reaching its limits too, and I wonder if there's anything I can do.

The TDS has 23,700 catalogs published at the moment (~2.3 GB), and the JVM is running with 16 GB.

A while ago I ran some stress tests on it and measured ~20,000 catalogs per GB. However, the catalogs were artificially generated from an existing one, which basically meant that the files they pointed to were always the same.

Now I suspect that the directory structure, the files, or something else is still being cached. Is there any configuration I should use to keep the TDS from taking that much memory? I expect to publish 40,000 datasets. Are there any known limits on what the TDS can handle? Benchmarks, perhaps?

The server is running here: cmip3.dkrz.de/thredds

Thanks,
Estani


Hi Estani:

so you have

  <Catalog>
    <cache>false</cache>
  </Catalog>

on a version > 4.2.8? Which exact version?

Can you generate a heap dump from your running production server and make it available for download?

PS: it's generally better to send these questions to support-thredds@xxxxxxxxxxxxxxxx so they get logged in our tracker, or to thredds@xxxxxxxxxxxxxxxx so everyone can follow along.

John



--
Estanislao Gonzalez

Max-Planck-Institut für Meteorologie (MPI-M)
Deutsches Klimarechenzentrum (DKRZ) - German Climate Computing Centre
Room 108 - Bundesstrasse 45a, D-20146 Hamburg, Germany

Phone:   +49 (40) 46 00 94-126
E-Mail:  gonzalez@xxxxxxx