Great. We’ve actually been talking about the possibility of using some form of
caching (memcached or other) since we started getting these errors. The
application in question can create about 56,000 possible maps. Since it's finite,
a cache service could work out nicely.
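
To make the idea concrete, here is a rough sketch of what I have in mind, not what we actually run. It assumes each map is fully determined by its WMS query string, and it uses an in-memory dictionary as a stand-in for memcached. The names cache_key, MapCache, and render are made up for illustration.

```python
from collections import OrderedDict
from urllib.parse import parse_qsl, urlencode

MAX_MAPS = 56_000  # roughly the number of possible maps mentioned above


def cache_key(query_string):
    """Build a canonical key so equivalent WMS requests share one cache entry."""
    # WMS parameter names are case-insensitive and may arrive in any order,
    # so upper-case the names and sort the pairs before building the key.
    params = parse_qsl(query_string, keep_blank_values=True)
    normalized = sorted((name.upper(), value) for name, value in params)
    return urlencode(normalized)


class MapCache:
    """Small in-memory LRU cache (hypothetical stand-in for a cache service)."""

    def __init__(self, max_entries=MAX_MAPS):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get_or_render(self, query_string, render):
        """Return the cached image, calling render() only on a cache miss."""
        key = cache_key(query_string)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        image = render(query_string)  # only misses reach the map server
        self._entries[key] = image
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used
        return image
```

With the limit set at or above the number of possible maps, nothing would ever be evicted, so each map would be rendered at most once per cache lifetime.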
Thanks
On Dec 13, 2013, at 12:12 AM, Heiko Klein <Heiko.Klein@xxxxxx> wrote:
> Hi Jay,
>
> not sure if this is connected, but we've had similar problems with
> ncWMS/thredds some years ago
> http://www.unidata.ucar.edu/mailing_lists/archives/thredds/2010/msg00069.html
>
> WMS-clients will request many maps/tiles at once and this will give high
> server load if not cached. This might lead to canceled client requests. In
> addition, tomcat6 with ncWMS had a strange bug with requests cancelled by the
> client, leading to a server-crash at the end.
>
>
> We solved this by adding an Apache 'mod_cache' in front of tomcat and made
> tomcat deliver all pages with cache-headers keeping pictures for 7 days in
> the cache. Server-load dropped nicely due to the cache, and tomcat doesn't
> get any more 'client abort exceptions' since those are swallowed by the cache.
>
> Heiko
>
> On 2013-12-11 23:33, Jay Alder wrote:
>> Hi, we’ve recently released a web application that uses TDS for mapping,
>> which is getting a lot of traffic. At one point the server stopped
>> responding altogether, which is a major problem. A quick restart of
>> tomcat got it going again, so I’m starting to dig into the logs. We
>> normally get the GET / request complete behavior, but occasionally we’ll
>> have:
>>
>> GET …url…
>> GET …url…
>> GET …url…
>> GET …url…
>> GET …url…
>> GET …url…
>> GET …url…
>> GET …url…
>>
>> meanwhile having a 100% CPU spike (with 12 CPUs) for a minute or more
>>
>> request complete
>> request complete
>> request complete
>> request cancelled by client
>> request cancelled by client
>> request complete
>> request complete
>>
>> While watching the logs the few times I’ve seen this occur it seems to
>> pull out of it ok. However the time the server failed, requests were
>> never returned. From the logs, requests came in for roughly 40 minutes
>> without being completed. Unfortunately, due to the high visibility, we
>> started to get emails from users and the press about the application no
>> longer working.
>>
>> Has anyone experienced this before and/or can you give guidance on how
>> to diagnose or prevent this?
>>
>> Here are some config settings:
>> CentOS 5.7
>> Java 1.6
>> TDS 4.3.17
>> only WMS is enabled
>> Java -Xmx set to 8Gb (currently taking 5.3, the dataset is 600 Gb of
>> 30-arcsecond grids for the continental US, 3.4 Gb per file)
>> For better or worse we are configured to use 2 instances of TDS to keep
>> the catalogs and configuration isolated. I’m not sure if this matters,
>> but I didn’t want to omit it. Since it is a live server I can’t easily
>> change to the preferred proxy configuration.
>>
>> I am trying not to panic yet. However, if the server goes unresponsive
>> again, staying calm may no longer be an option.
>>
>> Jay Alder
>> US Geological Survey
>> Oregon State University
>> 104 COAS Admin Building
>> Office Burt Hall 166
>> http://ceoas.oregonstate.edu/profile/alder/
>>
>>
>>
>> _______________________________________________
>> thredds mailing list
>> thredds@xxxxxxxxxxxxxxxx
>> For list information or to unsubscribe, visit:
>> http://www.unidata.ucar.edu/mailing_lists/
>>
>
> --
> Dr. Heiko Klein Tel. + 47 22 96 32 58
> Development Section / IT Department Fax. + 47 22 69 63 55
> Norwegian Meteorological Institute http://www.met.no
> P.O. Box 43 Blindern 0313 Oslo NORWAY
>
Jay Alder
US Geological Survey
Oregon State University
104 COAS Admin Building
Office Burt Hall 166
http://ceoas.oregonstate.edu/profile/alder/