Re: [thredds] ncWMS GetCapabilities return sometimes empty document

Hi Heiko,

I'm still puzzled by this problem.  But also, I don't understand how
Tomcat manages threads etc internally.  Could you please explain why you
think that this is caused by a thread being left in an invalid state?

Cheers, Jon

-----Original Message-----
From: Heiko Klein [mailto:Heiko.Klein@xxxxxx] 
Sent: 19 March 2010 08:01
To: Ethan Davis
Cc: thredds@xxxxxxxxxxxxxxxx; Jonathan Blower
Subject: Re: [thredds] ncWMS GetCapabilities return sometimes empty
document

Hi Ethan,

I just tried the new snapshot.

In jetty, the previous exception disappeared, but I get a strange new
one when aborting a GetMap request:

2010-03-19 08:43:07.328:WARN::header full:
java.lang.NullPointerException

and no stack-trace here. Strange, because NullPointerExceptions are
normally SEVERE?


In tomcat, the situation hasn't changed. Still an empty GetCapabilities
document shortly after a GetMap request has been canceled. And the same
old 'SEVERE: Exception initializing page context' log.

Heiko

On 2010-03-19 04:03, Ethan Davis wrote:
> Hi Heiko,
> 
> I believe this is a separate problem that happens when an exception is
> thrown after the GetMap response has been committed and the exception
> gets caught by a catch blocks that calls response.sendError(...)
without
> first checking if response.isCommitted().
> 
> I've added a check to all the places we didn't have it. I made a
> snapshot release with those changes if you want to try it again. The
> snapshot is at
> 
> ftp://ftp.unidata.ucar.edu/pub/thredds/4.1/Snapshot/thredds.war
> 
> On the other hand, I'm surprised we haven't seen this before since, if
> I'm right, this could happen in Tomcat as well. Maybe the time after
the
> response is committed is very short so we just haven't hit it before.
Or
> we just haven't noticed the log messages. If you see it with the
> snapshot release, we'll know my interpretation is incorrect.
> 
> Ethan
> 
> 
> On 3/18/2010 3:56 AM, Heiko Klein wrote:
>> Hi,
>>
>> I'm not familiar enough with the thredds or ncWMS code to be of help
>> there. But I just made another test which might be of help:
>>
>> I tried thredds4.1.3 on jetty 6.1.22. Here, the exception is thrown
>> directly during the client-abort. And it doesn't seem to be forwarded
to
>> GetCapabilities.
>>
>> Exception from the jetty log when aborting a GetMap request ( GET
>>
'http://localhost:8080/thredds/wms/testAll/2004050412_eta_211.nc?REQUEST
=GetMap&LAYERS=Z_sfc&PALETTE=redblue&SERVICE=WMS&FORMAT=image/png&VERSIO
N=1.3.0&CRS=EPSG:4326&BBOX=-130,30,20,90&WIDTH=400&HEIGHT=400&STYLES=BOX
FILL/ncview'
>>> /dev/null ) :
>>
>> 2010-03-18 10:53:21.207:WARN::Committed before 500 null
>> 2010-03-18
10:53:21.208:WARN::/thredds/wms/testAll/2004050412_eta_211.nc
>> java.lang.IllegalStateException: Committed
>>      at org.mortbay.jetty.Response.resetBuffer(Response.java:1024)
>>      at org.mortbay.jetty.Response.sendError(Response.java:240)
>>      at org.mortbay.jetty.Response.sendError(Response.java:342)
>>      at
>>
thredds.server.wms.WMSController.handleRequestInternal(WMSController.jav
a:304)
>>      at
>>
org.springframework.web.servlet.mvc.AbstractController.handleRequest(Abs
tractController.java:153)
>>      at
>>
org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handl
e(SimpleControllerHandlerAdapter.java:48)
>>      at
>>
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherS
ervlet.java:875)
>>      at
>>
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherSe
rvlet.java:809)
>>      at
>>
org.springframework.web.servlet.FrameworkServlet.processRequest(Framewor
kServlet.java:571)
>>      at
>>
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.
java:501)
>>      at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>>      at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>      at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>>      at
>>
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHan
dler.java:1166)
>>      at
>>
thredds.servlet.filter.RequestPathFilter.doFilter(RequestPathFilter.java
:105)
>>      at
>>
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHan
dler.java:1157)
>>      at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
>>      at
>>
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:2
16)
>>      at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>>      at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
>>      at
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
>>      at
>>
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandler
Collection.java:230)
>>      at
>>
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.jav
a:114)
>>      at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>      at org.mortbay.jetty.Server.handle(Server.java:326)
>>      at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>>      at
>>
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConne
ction.java:923)
>>      at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
>>      at
org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>>      at
org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>>      at
>>
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:
409)
>>      at
>>
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java
:582)
>>
>>
>>
>>
>> Jetty was freshly started, I made only one request to the jetty. I
>> aborted the request before it was finished.
>>
>> Best regards,
>>
>> Heiko
>>
>>
>>
>> On 2010-03-18 06:46, Ethan Davis wrote:
>>> Hi again,
>>>
>>> OK, I can reproduce the problem reliably. It wasn't very reliable
when I
>>> was doing it by hand. Now I'm running a script that continually
makes
>>> GetCapabilities requests one after the other. While that script is
>>> running I make GetMap requests and abort them with a ctrl-C. I also
have
>>> Tomcat configured to just 3 threads. (Thanks Heiko for figuring all
that
>>> out! Works like a charm.)
>>>
>>> I'm catching the GetCapabilities request that is going to fail with
a
>>> conditional breakpoint ("response.isCommitted() == true") at the
>>> beginning of the WMSController.handleRequestInternal() method. As we
>>> might expect, it looks like the response is committed from the time
the
>>> org.apache.coyote.http11.Http11Processor creates the request and
>>> response objects.
>>>
>>> I haven't figured out how to reliably catch the problem during the
abort
>>> of the GetMap request. Seems like a Tomcat issue rather than a
TDS/WMS
>>> issue. I would think Tomcat should handle things even if we are
dealing
>>> with an exception poorly.
>>>
>>> I'll dig around a bit more tomorrow. See if I can figure out why the
>>> response doesn't get reset properly.
>>>
>>> Ethan
>>>
>>>
>>> On 3/17/2010 11:03 AM, Ethan Davis wrote:
>>>> Hi Jon, Heiko,
>>>>
>>>> Jon, forgot to mention, we have code in the TDS WMS controller for
>>>> catching SocketException-s and ClientAbortException/IOException-s
>>>> similar to the change you reference.
>>>>
>>>> Have either of you seen this with ncWMS? Or is it so far just
showing up
>>>> in TDS?
>>>>
>>>> Most of the references to this exception that I've found deal with
>>>> exceptions interrupting JSP processing and causing a switch to a
JSP
>>>> error page that then can't create a session object. Since the
problem
>>>> happens in one of our main JSP pages (capabilities_xml.jsp) and the
>>>> ClientAbortException-s are happening in a different thread, I'm
leaning
>>>> towards the "thread in an invalid state" idea as well.
>>>>
>>>> One thing we could try is to set our JSP pages to not create
session
>>>> objects: "<%@ page session='false' ...%>". But since we don't know
>>>> if/how the threads might be invalid, I'm afraid that would just
make
>>>> this problem even harder to see.
>>>>
>>>> I'm going to try for a bit to get things setup so I can debug into
>>>> Tomcat code and see if I can catch this in action.
>>>>
>>>> Ethan
>>>>
>>>> On 3/17/2010 1:24 AM, Ethan Davis wrote:
>>>>> Hi Jon, Heiko,
>>>>>
>>>>> I'm having a hard time reproducing the problem reliably. I've only
seen
>>>>> it twice so far. And haven't been able to catch it in action. I
did get
>>>>> the IllegalStateException from the Jasper JspFactoryImpl both
times:
>>>>>
>>>>>   "java.lang.IllegalStateException: Cannot create a session
>>>>>   after the response has been committed"
>>>>>
>>>>> Googling on the above exception message brings up a fair number of
hits
>>>>> but none of them seem totally relevant. Or when they sound
somewhat
>>>>> relevant, there isn't a solution.
>>>>>
>>>>> More tomorrow.
>>>>>
>>>>> Ethan
>>>>>
>>>>> On 3/16/2010 9:05 AM, Jonathan Blower wrote:
>>>>>> Hi Heiko, Ethan,
>>>>>>
>>>>>> I've been trying to figure out what might be going on here -
thanks for
>>>>>> your great diagnosis Heiko.  I'm not sure what can be done in the
>>>>>> THREDDS/WMS code to prevent this, but I've modified some code in
the
>>>>>> ncWMS WmsController class to attempt to catch exceptions that are
thrown
>>>>>> when clients disconnect:
>>>>>>
>>>>>>
http://www.resc.rdg.ac.uk/trac/ncWMS/browser/branches/tds-refactor-2010-
>>>>>>
01-09/src/java/uk/ac/rdg/resc/ncwms/controller/WmsController.java#L250
>>>>>>
>>>>>> Heiko - you mentioned that it's possible that poor exception
handling
>>>>>> could cause a tomcat thread to be left in an invalid state.  I
can't
>>>>>> reproduce your problem reliably so I can't test this myself, but
I
>>>>>> wonder if you could find out whether this is happening in your
setup?
>>>>>> The VisualVM tool
>>>>>> (http://java.sun.com/javase/6/docs/technotes/guides/visualvm/)
can give
>>>>>> a thread dump from your running Tomcat system - perhaps you could
run
>>>>>> this when an error occurs and see if anything is suspicious?
>>>>>>
>>>>>> Ethan - are you able to reproduce Heiko's problem?
>>>>>>
>>>>>> Cheers, Jon
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ethan Davis [mailto:edavis@xxxxxxxxxxxxxxxx] 
>>>>>> Sent: 05 March 2010 23:32
>>>>>> To: Heiko Klein
>>>>>> Cc: Jonathan Blower; thredds@xxxxxxxxxxxxxxxx
>>>>>> Subject: Re: [thredds] ncWMS GetCapabilities return sometimes
empty
>>>>>> document
>>>>>>
>>>>>> Hi Heiko,
>>>>>>
>>>>>> That is great. Thanks.
>>>>>>
>>>>>> I'm a bit swamped through next week but I will jump on this the
>>>>>> following week before I get to some other WMS work.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Ethan
>>>>>>
>>>>>> On 3/5/2010 4:44 AM, Heiko Klein wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> finally, I managed to generate a simple case how to reproduce
this
>>>>>> error:
>>>>>>>
>>>>>>> a) start tomcat
>>>>>>> b) abort fetching some maps (3-4 times on a tomcat with
>>>>>>> 10thredds/HTTP1.1 connector), i.e. call
>>>>>>> GET
>>>>>>>
>>>>>>
'http://localhost:8081/thredds/wms/data/met.no/hirlam12/wam_nsea.fc.2009
>>>>>>
0604.nc?REQUEST=GetMap&LAYERS=significant_wave_height&PALETTE=redblue&SE
>>>>>>
RVICE=WMS&FORMAT=image/png&VERSION=1.3.0&CRS=EPSG:4326&BBOX=-30,-60,30,9
>>>>>> 0&WIDTH=400&HEIGHT=400&STYLES=BOXFILL/ncview'
>>>>>>>> /dev/null
>>>>>>> and abort this call before it is finished. (Ctrl-C)
>>>>>>> c) get some 'getCapabilities' documents:
>>>>>>> while [ 1 ]; do GET
>>>>>>>
>>>>>>
'http://localhost:8081/thredds/wms/data/met.no/hirlam12/wam_nsea.fc.2009
>>>>>> 0604.nc?service=WMS&version=1.3.0&request=GetCapabilities'
>>>>>>>> testCap.xml; ls -l testCap.xml; done
>>>>>>> ...
>>>>>>> -rw-r--r-- 1 heikok heikok 172126 2010-03-05 12:11 testCap.xml
>>>>>>> -rw-r--r-- 1 heikok heikok 172126 2010-03-05 12:11 testCap.xml
>>>>>>> -rw-r--r-- 1 heikok heikok 172126 2010-03-05 12:11 testCap.xml
>>>>>>> -rw-r--r-- 1 heikok heikok 0 2010-03-05 12:11 testCap.xml
>>>>>>> -rw-r--r-- 1 heikok heikok 172126 2010-03-05 12:11 testCap.xml
>>>>>>> -rw-r--r-- 1 heikok heikok 172126 2010-03-05 12:11 testCap.xml
>>>>>>>
>>>>>>>
>>>>>>> When the capabilites document with 0 byte size is fetched, the
earlier
>>>>>>> mentioned exception is thrown.
>>>>>>>
>>>>>>>
>>>>>>> Since step b) is the crucial step to reproduce the 0 byte
document, I
>>>>>>> guess there is somewhere a badly caught
>>>>>> ClientAbortException/IOException
>>>>>>> when writing to the responses OutputStream, leaving a
tomcat-thread in
>>>>>>> an invalid state.
>>>>>>>
>>>>>>>
>>>>>>> We can solve this problem currently by adding a proxy in front
of
>>>>>>> thredds (apache mod_proxy) which seems to swallow all
>>>>>>> client-connection-aborts before they reach tomcat.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Heiko
>>>>>>>
>>>>>>> On 2010-02-19 16:54, Heiko Klein wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm still trying to track down this problem, so here a short
update:
>>>>>>>>
>>>>>>>> The ncWMS/TDS crashes seems to be connected to concurrency: The
>>>>>> problem
>>>>>>>> occurs when reloading the wms-client (openlayers in firefox),
that
>>>>>> means
>>>>>>>> GetCapabilites and GetMap are run in parallel. We're
downloading
>>>>>> tiles,
>>>>>>>> so reloading will fetch one capability doc + approx 12 maps.
>>>>>>>>
>>>>>>>> * If I configure tomcat to only use one thread, and use the
default
>>>>>>>> 'HTTP/1.1' connector, the problem disappears (but
response-times are
>>>>>>>> very slow).
>>>>>>>> * Using 4 threads + 'HTTP/1.1', the problem appears very fast.
(one
>>>>>> or
>>>>>>>> two reloads)
>>>>>>>> * Using the 'Nio' connector and 4 threads, the problem appears,
but
>>>>>> not
>>>>>>>> so fast (3-4 reloads)
>>>>>>>> * Using the 'Nio' connector with only one thread, is about the
same
>>>>>> as
>>>>>>>> Nio with several threads.
>>>>>>>>
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Heiko
>>>>>>>>

-- 
Dr. Heiko Klein                              Tel. + 47 22 96 30 00
Development Section / IT Department          Fax. + 47 22 69 63 55
Norwegian Meteorological Institute           http://www.met.no
P.O. Box 43 Blindern  0313 Oslo NORWAY



  • 2010 messages navigation, sorted by:
    1. Thread
    2. Subject
    3. Author
    4. Date
    5. ↑ Table Of Contents
  • Search the thredds archives: