Hello Jay,
How much main memory does the host server have?
During ESGF data publication we had problems (e.g. the server stopped
during reinit) when we reached what looked like an internal limit at
about 24000 datasets on one THREDDS Data Server (albedo2.dkrz.de,
cmip3.dkrz.de). We increased main memory from 32 GB to 64 GB. With more
memory, the problems occurred later, less often, or not at all.
Maybe this could help you with your processing problem.
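On a Linux host such as CentOS, the physical memory can be read from the MemTotal line of /proc/meminfo and compared with the JVM heap limit. A minimal sketch; the function name is illustrative and not taken from either server's setup:

```python
def mem_total_gib(meminfo_text):
    """Return the MemTotal value from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:       67108864 kB"
            kib = int(line.split()[1])
            return kib / (1024.0 * 1024.0)
    raise ValueError("MemTotal line not found")
```

On the server itself, the function would be called with the contents of /proc/meminfo, e.g. `mem_total_gib(open("/proc/meminfo").read())`.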
Thanks
Hans
On 11.12.2013 23:36, Jay Alder wrote:
Hi, we've recently released a web application that uses TDS for
mapping, which is getting a lot of traffic. At one point the server
stopped responding altogether, which is a major problem. A quick
restart of tomcat got it going again, so I'm starting to dig into the
logs. We normally get the GET / request complete behavior, but
occasionally we'll have:
GET ...url...
GET ...url...
GET ...url...
GET ...url...
GET ...url...
GET ...url...
GET ...url...
GET ...url...
meanwhile having a 100% CPU spike (with 12 CPUs) for a minute or more
request complete
request complete
request complete
request cancelled by client
request cancelled by client
request complete
request complete
The few times I've seen this occur while watching the logs, the server
seemed to pull out of it OK. However, the time the server failed,
requests were never returned. From the logs, requests came in for
roughly 40 minutes without being completed. Unfortunately, due to the
high visibility, we started to get emails from users and the press
about the application no longer working.
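The pattern described above, requests logged as started but never logged as finished, can be checked mechanically. A minimal sketch, assuming a simplified and hypothetical log layout in which every line starts with a request ID followed by `GET <url>`, `request complete`, or `request cancelled by client`. The real TDS log layout is different, so the parsing would need to be adapted:

```python
def unfinished_requests(lines):
    """Return the IDs of requests that were started but never finished.

    Assumed (hypothetical) line layout:
      "<id> GET <url>"
      "<id> request complete"
      "<id> request cancelled by client"
    """
    open_requests = {}  # request id -> requested URL
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or unrecognised lines
        req_id, rest = parts
        if rest.startswith("GET "):
            open_requests[req_id] = rest[4:]
        elif rest.startswith("request "):
            # Both "complete" and "cancelled by client" close the request
            open_requests.pop(req_id, None)
    return list(open_requests)
```

A list that keeps growing over time would match the 40-minute outage; a list that empties again after a minute would match the temporary CPU spikes.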
Has anyone experienced this before and/or can you give guidance on how
to diagnose or prevent this?
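For a JVM that is using all CPUs or has stopped answering, a common first step is a thread dump taken with the JDK's `jstack <pid>` while the problem is happening. The sketch below only summarizes such a dump by counting thread states. It assumes the dump has been saved as text and relies on the `java.lang.Thread.State:` lines that HotSpot thread dumps contain; the function name is illustrative:

```python
import re
from collections import Counter

# Matches lines such as "   java.lang.Thread.State: BLOCKED (on object monitor)"
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def count_thread_states(dump_text):
    """Count how many threads in a saved thread dump are in each state."""
    return Counter(STATE_RE.findall(dump_text))
```

Many BLOCKED threads, or many RUNNABLE threads sharing the same stack frames, would suggest where to look next in the full dump.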
Here are some config settings:
CentOS 5.7
Java 1.6
TDS 4.3.17
only WMS is enabled
Java -Xmx set to 8 GB (currently using 5.3 GB; the dataset is 600 GB of
30-arcsecond grids for the continental US, 3.4 GB per file)
For better or worse we are configured to use 2 instances of TDS to
keep the catalogs and configuration isolated. I'm not sure if this
matters, but I didn't want to omit it. Since it is a live server I
can't easily change to the preferred proxy configuration.
I am trying not to panic yet. However, if the server goes unresponsive
again, staying calm may no longer be an option.
Jay Alder
US Geological Survey
Oregon State University
104 COAS Admin Building
Office Burt Hall 166
http://ceoas.oregonstate.edu/profile/alder/
--
Hans Ramthun
Tel.: +49 (0)40 460 094 - 112
Deutsches Klimarechenzentrum - DKRZ
Abteilung Datenmanagement http://www.dkrz.de/
Bundesstr. 45a
D-20146 Hamburg Germany
"Only he who knows his destination finds the way." (Laozi)