
Re: 20040510: 20040506: Gempak dcgrib cores



I did the kill -USR2 pid command and let it run for a couple of hours and did 
not see any delay type comments in the log.  Here are the logs and config that 
you asked for.

Adam



Quoting Unidata Support <address@hidden>:

> 
> Adam,
> 
> The core dumps are definitely a problem and need to be resolved.
> They are possibly caused either by the decoder not finding the correct
> tables, or by a corrupted grid file if your pqact is falling behind.
> One way to check whether pqact is falling behind is to issue 2
> "kill -USR2 pid" commands, where pid is the process ID of the pqact
> in question. After issuing the kills, the pqact process will
> be in debug mode. The ldmd.log file will then
> show "Delay" messages giving the time between when the
> product was received and when it was processed by
> pqact. If this is larger than a few seconds, then
> you probably have to split up your pqact processing
> into multiple independent invocations in your ldmd.conf
> file (for example the split pqact.gempak_xxxxx files
> from the gen_pqact.csh script in $NAWIPS/ldm/etc).
> 
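
The delay check described above could be scripted roughly as follows. This is only a sketch: the exact wording of the "Delay" lines in ldmd.log is an assumption here (a field "Delay:" followed by a number of seconds), the function names slow_delays and pqact_debug are made up for illustration, and the 5 second default is an arbitrary reading of "a few seconds".

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: debug lines in ldmd.log contain a field
# "Delay:" followed by the delay in seconds. Adjust the pattern if
# your log lines look different.

# Print the log lines whose reported delay exceeds a threshold.
# usage: slow_delays LOGFILE [THRESHOLD_SECONDS]   (default threshold: 5)
slow_delays() {
    awk -v limit="${2:-5}" '
        /Delay/ {
            for (i = 1; i < NF; i++) {
                # Compare the number that follows the "Delay:" field.
                if ($i ~ /^Delay:?$/ && ($(i + 1) + 0) > (limit + 0)) {
                    print
                    break
                }
            }
        }
    ' "$1"
}

# Send the two USR2 signals mentioned above. The pause keeps the second
# signal from being merged with the first while it is still pending.
# usage: pqact_debug PID
pqact_debug() {
    kill -USR2 "$1" && sleep 1 && kill -USR2 "$1"
}
```

For example, after "pqact_debug pid" has been run and some products have arrived, "slow_delays logs/ldmd.log 5" would list only the entries that took longer than 5 seconds to reach pqact (the log path is a guess at a typical LDM layout). No output suggests pqact is keeping up.
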
> The messages about grid too large will occur with an icing
> product from AWC which requires 1,679,940 points
> where the default build has a maximum of 750,000, so that
> is a correct log message.
> 
> A message of "bulletin too long" also signifies a correct behavior
> by the decoder in detecting a product that is missing the end sequence 
> in the proper location.
> 
> The messages about "no file template" and "table grib3.tbl not found"
> both signify that the cntrgrib1.tbl and gribkey.tbl files in
> $GEMTBL/grid are not correctly being found/read, or there
> is garbage in the product they are decoding.
> If your -e GEMTBL is correct, you may also need to verify that
> the LDM has read access to the files.
> 
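
The read-access check suggested above can be done with a few lines of shell. This is a minimal sketch, assuming it is run from the LDM account so that the permissions tested are the ones the decoder actually has. The function name check_tables is invented for illustration; the two file names are the ones mentioned above.

```shell
#!/bin/sh
# Sketch only: run this as the LDM user, otherwise the result says
# nothing about what the decoder itself is allowed to read.

# Report whether the two grid tables under GEMTBL_DIR/grid are readable.
# Returns 0 if both are readable, 1 otherwise.
# usage: check_tables GEMTBL_DIR
check_tables() {
    status=0
    for f in cntrgrib1.tbl gribkey.tbl; do
        path="$1/grid/$f"
        if [ -r "$path" ]; then
            echo "ok: $path"
        else
            echo "NOT READABLE: $path"
            status=1
        fi
    done
    return "$status"
}
```

With the GEMTBL value from the pqact entry in this thread, the call would be "check_tables /home/ldm/nawips/gempak/tables". A missing file and an unreadable file are reported the same way, so a "NOT READABLE" line is worth following up with ls -l.
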
> Another problem could result if more than one decoder were writing to the
> file at the same time, for example if things were running so slowly that
> pqact fired up a second PIPE because the first was full. That should
> be evident in the logs as overlapping process IDs.
> 
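
One way to look for the overlapping process IDs mentioned above is sketched below. It rests on an assumption about the decoder log format, namely that each line starts with the process ID in square brackets, such as "[12345]". If dcgrib.log looks different, the pattern needs changing. The function name interleaved_pids is made up for illustration.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: every decoder log line begins with the
# writer's process ID in square brackets, e.g. "[12345] ...".

# Print a line each time a PID shows up again after a different PID
# has already written to the log, which suggests two decoders were
# running at the same time rather than one after the other.
# usage: interleaved_pids LOGFILE
interleaved_pids() {
    awk '
        $1 ~ /^\[[0-9]+\]$/ {
            pid = $1
            if (pid != prev && (pid in seen)) {
                print "line " NR ": " pid " resumes after " prev
            }
            seen[pid] = 1
            prev = pid
        }
    ' "$1"
}
```

Running "interleaved_pids logs/gempak/dcgrib.log" and getting no output would mean each decoder finished before the next one started, at least as far as the log shows. Any output points at line numbers worth reading by hand.
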
> Please provide your pqact.conf file, and your dcgrib.log file
> for more information.
> 
> Steve Chiswell
> 
> 
> 
> 
> >From: address@hidden
> >Organization: UCAR/Unidata
> >Keywords: 200405101526.i4AFQ0tK019287
> 
> >I have double checked everything three times.  GEMTBL is set for all
> >instances of dcgrib2 (everything else is working fine, even the call below
> >works most of the time).  The pqact call that is causing the cores is below:
> >
> >HDS|NMC2        ^([HOYZ]|/afs)
> >        PIPE    decoders/dcgrib2 -v 1 -d logs/gempak/dcgrib.log
> >        -e GEMTBL=/home/ldm/nawips/gempak/tables
> >
> >The other instances of dcgrib2 that are being called are for
> >FNEXRAD, NOGAPS, and CMC.  These three have different log names:
> >dcgrib_fnmoc.log, dcgrib_cmc.log, and dcgrib_radar.log.  However, when I
> >grep through the logs for the process IDs of the core files, they all come
> >from dcgrib.log, which is the log for HDS and NMC2.  So in a nutshell, I am
> >still getting core dumps from dcgrib2 and the same type of messages as
> >listed in my first email.  The only thing that has changed this weekend is
> >that I saw that version 5.7.2p2 was out, so I downloaded it, built it, and
> >installed it.  However, I still get the same types of errors.
> >
> >I am really confused as to what is going on.
> >
> >Thanks
> >
> >Adam
> >
> >
> >
> >Quoting Unidata Support <address@hidden>:
> >
> >> 
> >> Adam,
> >> 
> >> Sounds like your GEMTBL environmental variable is not set.
> >> 
> >> The pqact entries for dcgrib2 can be generated from
> >> $NAWIPS/ldm/etc/gen_pqact.csh
> >> (as described at the top of
> >> http://my.unidata.ucar.edu/content/software/gempak/GEMPAK5.7/configuration.html)
> >> 
> >> The gen_pqact.csh script will use your GEMTBL environment variable setting
> >> for creating the pqact.conf entries. Your GEMTBL variable is set by
> >> sourcing $NAWIPS/Gemenviron.
> >> 
> >> The dcgrib2 entries use the -e GEMTBL=location setting for the decoder to
> >> find the tables necessary to decode GRIB. If this is not set, and GEMTBL is
> >> not in the environment of the LDM account that started the pqact process,
> >> you will not get the proper tables.
> >> 
> >> 
> >> Steve Chiswell
> >> Unidata User Support
> >> 
> >> 
> >> 
> >> 
> >> 
> >> >From: address@hidden
> >> >Organization: UCAR/Unidata
> >> >Keywords: 200405061848.i46ImJtK016553
> >> 
> >> >I upgraded GEMPAK to the newest version, 5.7, about a week ago.  Since
> >> >then I have been getting a lot of core files in the LDM directory.
> >> >Grepping through the logs has shown them all to be from the dcgrib2
> >> >decoder.  My dcgrib logs show everything from "no file template",
> >> >"Grid too large", "table grib3.tbl cannot be opened", "bulletin too long",
> >> >and "Grid navigation 255 incompatible with file ....."
> >> >
> >> >HELP!!!!
> >> >
> >> >-- 
> >> >Adam Taylor
> >> >Computing Center
> >> >University of Louisiana at Monroe
> >> >
> >> --
> >> ****************************************************************************
> >> Unidata User Support                                    UCAR Unidata Program
> >> (303)497-8643                                                  P.O. Box 3000
> >> address@hidden                                   Boulder, CO 80307
> >> ----------------------------------------------------------------------------
> >> Unidata WWW Service              http://my.unidata.ucar.edu/content/support
> >> ----------------------------------------------------------------------------
> >> NOTE: All email exchanges with Unidata User Support are recorded in the
> >> Unidata inquiry tracking system and then made publicly available
> >> through the web.  If you do not want to have your interactions made
> >> available in this way, you must let us know in each email you send to us.
> >> 
> >
> >
> >-- 
> >Adam Taylor
> >Computing Center
> >University of Louisiana at Monroe
> >
> 


-- 
Adam Taylor
Computing Center
University of Louisiana at Monroe