
20050310: LDM log errors



Patrick,

The dcrdf problem is caused by a bulletin that exceeds the 100K DCMXBF
limit defined in the $GEMPAK/include/bridge.h and BRIDGE.PRM files.

I am increasing this to 200K in the next release.

The dcwcn message is suspicious: you state that you have upgraded to
5.7.3, yet your pqact.conf entry still shows GEMTBL pointing to the
5.6 directory.
Possibly your pqact.conf needs to be updated to match the running
decoder version, if the table in question isn't in your old distribution.
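
One way to spot such a mismatch is to scan pqact.conf for every GEMTBL
setting and flag any path that does not mention the release you expect.
The following is only a minimal sketch: the helper name, the sample
pqact.conf entry (including its WMO pattern), and the assumption that the
install directory name contains "GEMPAK5.7.3" are all hypothetical and
should be adapted to the local installation.

```python
import re

# Matches the value of any GEMTBL=<path> setting on a line.
GEMTBL_RE = re.compile(r"GEMTBL=(\S+)")


def find_stale_gemtbl(conf_text, expected_fragment):
    """Return (line_number, path) for each GEMTBL path lacking expected_fragment.

    Hypothetical helper for illustration; not part of LDM or GEMPAK.
    """
    stale = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip commented-out entries
        for path in GEMTBL_RE.findall(line):
            if expected_fragment not in path:
                stale.append((lineno, path))
    return stale


if __name__ == "__main__":
    # Made-up pqact.conf entry; the pattern and layout are assumptions.
    sample = (
        "WMO\t^WOUS64\n"
        "\tPIPE\tdecoders/dcwcn -d data/gempak/logs/dcwcn.log\n"
        "\t\t-e GEMTBL=/home/gempak/GEMPAK5.6/gempak/tables\n"
        "\t\tdata/gempak/storm/wcn/YYYYMMDDHHNN.wcn\n"
    )
    # "GEMPAK5.7.3" is an assumed directory-name fragment for the new release.
    for lineno, path in find_stale_gemtbl(sample, "GEMPAK5.7.3"):
        print(f"line {lineno}: GEMTBL points at {path}")
```

To check a real configuration, read the pqact.conf contents into a string
and pass that in place of the sample text.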

Steve Chiswell
Unidata User Support




>From: "Patrick O'Reilly" <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200503101825.j2AIPTv2025398

>Hi,
>
>I apologize in advance for the long email, but wanted to include as much
>info as possible.  I have had these errors for a long while now, and wanted
>to get to the bottom of it but can't.  I have researched why I might be
>having them, and come up empty.  To start, I see about 25-100 hourly of:
>
>Mar 10 17:48:55 thunder pqact[2144]: pbuf_flush 31: time elapsed   4.049759
>Mar 10 17:49:41 thunder pqact[2144]: pbuf_flush 30: time elapsed   5.226785
>Mar 10 17:50:16 thunder pqact[2144]: pbuf_flush 34: time elapsed   4.490816
>Mar 10 17:50:44 thunder pqact[2144]: pbuf_flush 34: time elapsed   7.396645
>Mar 10 17:50:57 thunder pqact[2144]: pbuf_flush 34: time elapsed   6.140092
>Mar 10 17:52:10 thunder pqact[2144]: pbuf_flush 30: time elapsed   2.537028
>Mar 10 17:52:12 thunder pqact[2144]: child 17930 exited with status 127
>Mar 10 17:52:13 thunder pqact[2144]: child 17932 exited with status 127
>Mar 10 17:52:31 thunder pqact[2144]: child 17936 exited with status 127
>Mar 10 17:52:32 thunder pqact[2144]: child 17938 exited with status 127
>
>and interspersed:
>
>Mar 10 17:59:17 thunder pqact[2144]: pipe_dbufput:
>decoders/dcrdf-v4-ddata/gempak/logs/dcrdf.log-eGEMTBL=/home/gempak/GEMPAK5.6
>/gempak/tablesdata/gempak/rdf/YYYYMMDDHH.rdf write error
>
>Mar 10 17:26:40 thunder pqact[2144]: pipe_dbufput:
>decoders/dcwcn-ddata/gempak/logs/dcwcn.log-eGEMTBL=/home/gempak/GEMPAK5.6/ge
>mpak/tablesdata/gempak/storm/wcn/YYYYMMDDHHNN.wcn write error
>
>For giggles, see  http://thunder.storm.uni.edu/data/logs/
>
>So, I checked several things.  Both decoders (dcwcn dcrdf) are in
>~ldm/decoders and are accessible/executable:
>
>-rwxr-xr-x  1 ldm users 114152 Jan 24 16:28 dcwcn
>-rwxr-xr-x  1 ldm users 148928 Jan 24 16:28 dcrdf
>
>Both logs in gempak/logs are there, and are being written to, and both are
>full of:
>
>[17979] 050310/1159[DC 2] Number of bulletins read and processed: 0
>
>also, the directories that the decoders are supposed to be writing files to
>are owned by ldm user, with 755 permissions.  So they exist and are writable
>by ldm.  But, of course, they're empty.
>
>In my gempak/logs/dcwcn.log, it appears that things were peachy until I
>switched from Version 5.6.l.1  to Version 5.7.3:
>
>**SNIP**
>[3030] 041027/1428 [DC 3] Version 5.6.l.1
>[3030] 041027/1428 [DCWCN -9]
>[3030] 041027/1428 [DCWCN -1]
>[3030] 041027/1428 [DCWCN -9]
>[3030] 041027/1428 [DCWCN -1]
>[3030] 041027/1429 [DCWCN -9]
>[3030] 041027/1429 [DCWCN -1]
>[3030] 041027/1429 [DCWCN -10] T.CAN.KMKX.TO.A.9007.000000T0000Z-041027T21
>[3030] 041027/1429 [DCWCN -10] T.CAN.KMKX.TO.A.9007.000000T0000Z-041027T21
>[3030] 041027/1430 [DCWCN -9]
>[3030] 041027/1430 [DCWCN -9]
>[2899] 041027/1431 [DC 2] Interrupt Signal
>[2899] 041027/1431 [DC 5]
>[2899] 041027/1431 [DC 2] Number of bulletins read and processed: 1
>[2899] 041027/1431 [DC 6]
>[2846] 041027/1431 [DC 2] Interrupt Signal
>[2846] 041027/1431 [DC 5]
>[2846] 041027/1431 [DC 2] Number of bulletins read and processed: 5
>[2846] 041027/1431 [DC 6]
>[3030] 041027/1431 [DC 2] Interrupt Signal
>[3030] 041027/1431 [DC 5]
>[3030] 041027/1431 [DC 2] Number of bulletins read and processed: 7
>[2687] 041027/1431 [DC 2] Interrupt Signal
>[3030] 041027/1431 [DC 6]
>[2687] 041027/1431 [DC 5]
>[2498] 041027/1431 [DC 2] Interrupt Signal
>[2687] 041027/1431 [DC 2] Number of bulletins read and processed: 1
>[2557] 041027/1431 [DC 2] Interrupt Signal
>[2687] 041027/1431 [DC 6]
>[2557] 041027/1431 [DC 5]
>[2557] 041027/1431 [DC 2] Number of bulletins read and processed: 1
>[2514] 041027/1431 [DC 2] Interrupt Signal
>[2514] 041027/1431 [DC 5]
>[2557] 041027/1431 [DC 6]
>[2514] 041027/1431 [DC 2] Number of bulletins read and processed: 1
>[2615] 041027/1431 [DC 2] Interrupt Signal
>[2514] 041027/1431 [DC 6]
>[2615] 041027/1431 [DC 5]
>[2615] 041027/1431 [DC 2] Number of bulletins read and processed: 5
>[2615] 041027/1431 [DC 6]
>[2498] 041027/1431 [DC 5]
>[2498] 041027/1431 [DC 2] Number of bulletins read and processed: 1
>[2635] 041027/1431 [DC 2] Interrupt Signal
>[2498] 041027/1431 [DC 6]
>[2635] 041027/1431 [DC 5]
>[2635] 041027/1431 [DC 2] Number of bulletins read and processed: 1
>[2635] 041027/1431 [DC 6]
>[2181] 041027/1431 [DC 2] Interrupt Signal
>[2181] 041027/1431 [DC 5]
>[2181] 041027/1431 [DC 2] Number of bulletins read and processed: 4
>[2181] 041027/1431 [DC 6]
>[3426] 041027/1433[DC 3] Version 5.7.3
>[3426] 041027/1433[FL -1] mzcntys.tbl
>[3426] 041027/1433[DC 5]
>[3426] 041027/1433[DC 2] Number of bulletins read and processed: 0
>[3426] 041027/1433[DC 6]
>[3427] 041027/1433[DC 3] Version 5.7.3
>[3427] 041027/1433[FL -1] mzcntys.tbl
>[3427] 041027/1433[DC 5]
>[3427] 041027/1433[DC 2] Number of bulletins read and processed: 0
>[3427] 041027/1433[DC 6]
>[3431] 041027/1433[DC 3] Version 5.7.3
>[3431] 041027/1433[FL -1] mzcntys.tbl
>[3431] 041027/1433[DC 5]
>[3431] 041027/1433[DC 2] Number of bulletins read and processed: 0
>[3431] 041027/1433[DC 6]
>[3437] 041027/1433[DC 3] Version 5.7.3
>[3437] 041027/1433[FL -1] mzcntys.tbl
>[3437] 041027/1433[DC 5]
>[3437] 041027/1433[DC 2] Number of bulletins read and processed: 0
>[3437] 041027/1433[DC 6]
>[3597] 041027/1438[DC 3] Version 5.7.3
>[3597] 041027/1438[FL -1] mzcntys.tbl
>[3597] 041027/1438[DC 5]
>[3597] 041027/1438[DC 2] Number of bulletins read and processed: 0
>[3597] 041027/1438[DC 6]
>[3598] 041027/1438[DC 3] Version 5.7.3
>[3598] 041027/1438[FL -1] mzcntys.tbl
>[3598] 041027/1438[DC 5]
>[3598] 041027/1438[DC 2] Number of bulletins read and processed: 0
>[3598] 041027/1438[DC 6]
>[3617] 041027/1440[DC 3] Version 5.7.3
>[3617] 041027/1440[FL -1] mzcntys.tbl
>**SNIP**
>
>BTW, I see the mzcntys.tbl in $GEMTBL/stns and it's readable by anybody, as
>are all the rest of the tables.
>
>I think it's possible these errors at times slow down my ldm
>decoding/writing data from queue to disk.  For example, at 1718Z today, a
>1700Z US surface temp plot was created, and had about 10 data points, while
>my ldm feed showed my IDS|DDPLUS feed right on time (though my HDS feed was
>about 25 mins latent). And coincidentally, these pbuf flush/child exited
>errors went from about 25 an hour before 1700Z, to about 100 during the
>1700Z hour.
>
>So...what's going on?  Anyone?
>
>Patrick
>
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.