
Re: 19991108: I need some assistance



On Fri, 3 Dec 1999, McIDAS wrote:

> I manually deleted the ldm.pq file and restarted ldm.  So far the
> program has been running for more than 40 minutes, so we'll have to

Karli,

Somehow the LDM queue wasn't getting deleted; maybe it was a permissions
problem.
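
If it comes back, one way to test the permissions guess is a small check
like the one below.  This is only a sketch: the function name and the
data/ldm.pq path (relative to the LDM home directory) are assumptions for
illustration, and it ignores sticky-bit directories and read-only file
systems.  The point it illustrates is that removing a file depends on
write and search permission on the containing directory, not on the file's
own mode.

```python
import os


def deletion_blockers(queue_path):
    """Return a list of likely reasons why removing queue_path would fail.

    An empty list means no obvious permission obstacle was found.
    """
    reasons = []
    directory = os.path.dirname(os.path.abspath(queue_path))

    if not os.path.exists(queue_path):
        reasons.append("file does not exist: " + queue_path)

    # unlink() needs write and search (execute) permission on the directory.
    if not os.access(directory, os.W_OK | os.X_OK):
        reasons.append("no write/search permission on " + directory)

    return reasons


if __name__ == "__main__":
    # Hypothetical location of the queue; adjust to the real data directory.
    for reason in deletion_blockers("data/ldm.pq"):
        print(reason)
```

Run it as the ldm user, since that is the account that has to remove the
queue.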

> see if this keeps up.  I just checked the route.k file and to
> my surprise most of the feeds are dated 99317 (but I am not sure what
> day that is supposed to be).  Those are the latest dates except for NLDN
> with 99320.  
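
On what day 99317 is: assuming the route.k stamps are two-digit year plus
day of year (YYDDD), which matches the 99332 in the product IDs you logged
on Nov 28, a minimal Python sketch of the conversion looks like this (the
function name is made up for illustration):

```python
from datetime import date, datetime


def yyddd_to_date(stamp: str) -> date:
    """Convert a YYDDD string (two-digit year, day of year) to a date.

    Python's %y maps 69-99 to 1969-1999 and 00-68 to 2000-2068.
    """
    return datetime.strptime(stamp, "%y%j").date()


if __name__ == "__main__":
    for stamp in ("99317", "99320", "99332"):
        print(stamp, "->", yyddd_to_date(stamp))
    # 99317 -> 1999-11-13
    # 99320 -> 1999-11-16
    # 99332 -> 1999-11-28
```

Under that assumption, feeds dated 99317 were last updated on 13 Nov 1999,
and NLDN on 16 Nov 1999.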


Use the "ldmadmin watch" command to see the dates on the products.  Reading
from left to right, the fields are the machine time, the time the product
was ingested, and the product time.
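
To compare those times without reading them by eye, a product line can be
split into fields.  The sketch below assumes the watch output has the same
layout as the verbose product lines in your Nov 28 log (syslog time,
process, size, ingest time, feed type, sequence number, product ID); the
regular expression and field names are my own guesses at that layout, not
something taken from the LDM sources.

```python
import re
from datetime import datetime

# Assumed layout: "Mon DD HH:MM:SS proc[pid]: size ingest feed seq ident"
LINE_RE = re.compile(
    r"^(?P<logtime>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"
    r"(?P<proc>\S+?)\[(?P<pid>\d+)\]:\s+"
    r"(?P<size>\d+)\s+"
    r"(?P<ingest>\d{14}\.\d{3})\s+"
    r"(?P<feed>\S+)\s+"
    r"(?P<seq>\d+)\s+"
    r"(?P<ident>.*)$"
)


def parse_product_line(line):
    """Return a dict of fields, or None if the line is not a product line."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["size"] = int(fields["size"])
    # Ingest time is CCYYMMDDhhmmss.mmm
    fields["ingest"] = datetime.strptime(fields["ingest"], "%Y%m%d%H%M%S.%f")
    return fields


if __name__ == "__main__":
    # Product line from the Nov 28 log, with the mail line wrap removed.
    sample = ("Nov 28 19:02:03 aqua[5646]:   189889 19991128181050.145"
              "  MCIDAS 000  LWTOA3 193DIALPROD=U1 99332 181048")
    print(parse_product_line(sample))
```

The product time is left inside the product ID here (for this sample, the
trailing 99332 181048), since its position differs from feed to feed.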


> If this fails also, I would be very glad if you log into
> the machine.  Just let me know how would you want me to supply the
> password, if by phone or email.

At this time, I don't think I need the password.

> Also, the year's almost over and I believe I heard that old versions of
> ldm have Y2K problems; is this correct?  If so, what versions of ldm and
> ldm-mcidas should I be running?  Again, I have ldm 5.0.5

The latest version, ldm-5.0.8, has the Y2K fixes.  Binary versions are
available for many platforms.

> and ldm-mcidas
> 7.1.1 (or 7.1.3) running for Mcidas 7.5 which I'll soon upgrade to 7.6
> 
I don't know which version you need, but I'll pass this on to McIDAS
support.

Robb...



> Again, thanks.
> 
> Karli Lopez
> 
> Robb Kambic wrote:
> > 
> > Karli,
> > 
> > The log messages are still saying "Que corrupt:".  I would go to the
> > data directory and delete the ldm.pq file.  Somehow the file might not
> > have been deleted.  Also, the data directory needs to be on a local disk
> > drive.  If you are still having a problem, can I get a login to the machine?
> > 
> > Robb...
> > 
> > On Sun, 28 Nov 1999, McIDAS wrote:
> > 
> > > The machine's clock shows the correct time.  This is an Octane Machine
> > > with 128MB Ram running IRIX 6.4 and has two partitions with 1.3GB and
> > > 5.0GB free respectively.  It is running McIDAS 7.5, ldm-5.0.5 and
> > > ldm-mcidas-7.1.1 (or 7.1.3 if it had been configured correctly).
> > >
> > > After commenting out the line 'exec "pqact"' I got this output:
> > > -----------------------------------------------------------------------
> > > ldm@breeze 45% alias sverb   "bin/rpc.ldmd -vl - etc/ldmd.conf"
> > > ldm@breeze 46% sverb
> > > Nov 28 19:01:34 rpc.ldmd[5644]: Starting Up (built: Aug 22 1997 12:07:40)
> > > Nov 28 19:01:34 aqua[5646]: run_requester: Starting Up: aqua.atmos.uah.edu
> > > Nov 28 19:01:34 striker[5593]: run_requester: Starting Up: striker.atmos.albany.edu
> > > Nov 28 19:01:35 udp.ldmd[5647]: Starting Up
> > > Nov 28 19:01:59 aqua[5646]: lastmatch: c9896c74abb279dea769a3a091a1b891    44766 19991128180616.338  MCIDAS 000  LWTOA3 205 DIALPROD=U3 99332 180612
> > > Nov 28 19:01:59 aqua[5646]: run_requester: 19991128180616.338 TS_ENDT {{FSL2|MCIDAS,  ".*"}}
> > > Nov 28 19:01:59 striker[5593]: lastmatch: b9144b67fb6c5fd60c2fb5938b418cef       84 19991128185500.141    NLDN 000  99332184853
> > > Nov 28 19:01:59 striker[5593]: run_requester: 19991128185500.141 TS_ENDT {{NLDN,  ".*"}}
> > > Nov 28 19:01:59 striker[5593]: FEEDME(striker.atmos.albany.edu): OK
> > > Nov 28 19:01:59 aqua[5646]: FEEDME(aqua.atmos.uah.edu): reclass: 19991128180616.338 TS_ENDT {{MCIDAS,  ".*"}}
> > > Nov 28 19:01:59 striker[5593]: hereis: dup:       84 19991128185500.141    NLDN000  99332184853
> > > Nov 28 19:01:59 aqua[5646]: FEEDME(aqua.atmos.uah.edu): OK
> > > Nov 28 19:02:00 striker[5593]: Que corrupt: ftbl
> > > Nov 28 19:02:00 striker[5593]:       84 19991128190100.696    NLDN 000  99332185459
> > > Nov 28 19:02:00 aqua[5646]: dup    :    44766 19991128180616.338  MCIDAS 000  LWTOA3 205 DIALPROD=U3 99332 180612
> > > Nov 28 19:02:03 aqua[5646]:   189889 19991128181050.145  MCIDAS 000  LWTOA3 193DIALPROD=U1 99332 181048
> > > Nov 28 19:02:03 aqua[5646]: assertion "rp->prev == OFF_NONE" failed: file "pq.c", line 678
> > > Nov 28 19:02:05 rpc.ldmd[5644]: child 5648 terminated by signal 6
> > > Nov 28 19:02:05 rpc.ldmd[5644]: Killing (SIGINT) process group
> > > Nov 28 19:02:05 rpc.ldmd[5644]: Interrupt
> > > Nov 28 19:02:05 rpc.ldmd[5644]: Exiting
> > > Nov 28 19:02:05 striker[5593]: Interrupt
> > > Nov 28 19:02:05 striker[5593]: Exiting
> > > Nov 28 19:02:05 udp.ldmd[5647]: Interrupt
> > > Nov 28 19:02:05 udp.ldmd[5647]: Exiting
> > > Nov 28 19:02:06 rpc.ldmd[5644]: Terminating process group
> > > Nov 28 19:02:29 rpc.ldmd[5644]: child 5646 terminated by signal 6
> > > Nov 28 19:02:29 rpc.ldmd[5644]: Killing (SIGINT) process group
> > > -----------------------------------------------------------------------
> > > after eliminating all requests I still had the same problem:
> > > -----------------------------------------------------------------------
> > >
> > > ldm@breeze 56% !s
> > > sverb
> > > Nov 28 19:31:04 rpc.ldmd[5454]: Starting Up (built: Aug 22 1997 12:07:40)
> > > Nov 28 19:31:05 udp.ldmd[5767]: Starting Up
> > > Nov 28 19:31:45 rpc.ldmd[5454]: child 5717 terminated by signal 6
> > > Nov 28 19:31:45 rpc.ldmd[5454]: Killing (SIGINT) process group
> > > Nov 28 19:31:45 rpc.ldmd[5454]: Interrupt
> > > Nov 28 19:31:45 rpc.ldmd[5454]: Exiting
> > > Nov 28 19:31:45 udp.ldmd[5767]: Interrupt
> > > Nov 28 19:31:45 udp.ldmd[5767]: Exiting
> > > Nov 28 19:31:45 rpc.ldmd[5454]: Terminating process group
> > > Nov 28 19:31:45 rpc.ldmd[5454]: child 5791 terminated by signal 15
> > > -----------------------------------------------------------------------
> > > and this is what top was showing me while I ran LDM without any
> > > requests:
> > > -----------------------------------------------------------------------
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.00,0.07,0.08] 20:08:28   50 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >    karli  6011  6011   0.37    0   20   115    75    0:00 top
> > >     root  1038  1038   0.06    *   20   140    60   10:30 mediad
> > >     root  1117  1112   0.03    *   20  1078    34    8:25 clogin
> > >     root  1102  1102   0.03    *   20   879    96    6:49 Xsgi
> > >     root  5708   171   0.01    *   20   111    53    0:00 telnetd
> > >     root   261   261   0.01    *   +0   121   121    2:36 xntpd
> > >     root  1039   171   0.01    *   20   120    45    0:50 fam
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.13,0.09,0.09] 20:08:33   60 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >      ldm  5997  5950  11.54    *   20  6353   780    0:00 pqexpire
> > >      ldm  6014  5950   3.67    *   20   287    61    0:00 dmmisc.k
> > >      ldm  6021  5950   3.48    *   20   283    59    0:00 dmsyn.k
> > >      ldm  6022  5950   3.36    *   20   271    58    0:00 dmraob.k
> > >    karli  6011  6011   3.20    0   20   116    76    0:00 top
> > >      ldm  6030  5950   2.11    *   20   279    59    0:00 dmsfc.k
> > >     root  5708   171   0.12    *   20   111    53    0:00 telnetd
> > >     root  1102  1102   0.06    *   20   879    96    6:49 Xsgi
> > >     root  1117  1112   0.06    *   20  1078    34    8:25 clogin
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.13,0.09,0.09] 20:08:35   60 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >      ldm  5997  5950   7.84    *   20  6353  1212    0:00 pqexpire
> > >    karli  6011  6011   1.11    0   20   116    76    0:00 top
> > >     root  1117  1112   0.07    *   20  1078    34    8:25 clogin
> > >     root  5708   171   0.07    *   20   111    53    0:00 telnetd
> > >     root  1102  1102   0.05    *   20   879    96    6:49 Xsgi
> > >     root   261   261   0.02    *   +0   121   121    2:36 xntpd
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.18,0.11,0.10] 20:08:38   60 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >      ldm  5997  5950  10.93    *   20  6353  2192    0:00 pqexpire
> > >    karli  6011  6011   1.20    0   20   116    76    0:00 top
> > >     root  1117  1112   0.07    *   20  1078    34    8:25 clogin
> > >     root  1102  1102   0.06    *   20   879    96    6:49 Xsgi
> > >     root  5708   171   0.03    *   20   111    53    0:00 telnetd
> > >      ldm  6005  5950   0.02    *   20  6364    30    0:00 rpc.ldmd
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.18,0.11,0.10] 20:08:39   60 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >      ldm  5997  5950   4.14    *   20  6353  2338    0:00 pqexpire
> > >    karli  6011  6011   0.97    0   20   116    76    0:00 top
> > >     root  1038  1038   0.15    *   20   140    60   10:30 mediad
> > >     root  5708   171   0.06    *   20   111    53    0:00 telnetd
> > >     root  1117  1112   0.06    *   20  1078    34    8:25 clogin
> > >     root  1102  1102   0.05    *   20   879    96    6:49 Xsgi
> > >     root  1039   171   0.03    *   20   120    45    0:50 fam
> > >     root   261   261   0.02    *   +0   121   121    2:36 xntpd
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.24,0.12,0.10] 20:08:43   60 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >    karli  6011  6011   1.76    0   20   116    76    0:00 top
> > >     root  1117  1112   0.05    *   20  1078    34    8:25 clogin
> > >     root  5708   171   0.05    *   20   111    53    0:00 telnetd
> > >     root  1102  1102   0.04    *   20   879    96    6:49 Xsgi
> > >      ldm  5950  5950   0.02    *   20  6363    31    0:00 rpc.ldmd
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.29,0.13,0.10] 20:08:46   60 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >      ldm  5997  5950   6.30    *   20  6353  4050    0:01 pqexpire
> > >    karli  6011  6011   0.95    0   20   116    76    0:00 top
> > >     root   816   816   0.74    *   20   120    58    2:10 sendmail
> > >     root  1117  1112   0.06    *   20  1078    34    8:25 clogin
> > >     root  5708   171   0.05    *   20   111    53    0:00 telnetd
> > >     root  1102  1102   0.05    *   20   879    96    6:49 Xsgi
> > >     root   261   261   0.02    *   +0   121   121    2:36 xntpd
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.33,0.14,0.11] 20:08:50   60 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >      ldm  5997  5950  10.59    *   20  6353  4457    0:01 pqexpire
> > >    karli  6011  6011   0.75    0   20   116    70    0:00 top
> > >     root  1117  1112   0.07    *   20  1078    23    8:25 clogin
> > >     root  5708   171   0.06    *   20   111    51    0:00 telnetd
> > >     root  1102  1102   0.06    *   20   879    88    6:49 Xsgi
> > >      ldm  5950  5950   0.02    *   20  6363    23    0:00 rpc.ldmd
> > >     root   261   261   0.01    *   +0   121   121    2:36 xntpd
> > >      ldm  6005  5950   0.01    *   20  6364    23    0:00 rpc.ldmd
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.37,0.15,0.11] 20:08:54   60 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >      ldm  5997  5950  10.17    *   20  6353  4860    0:01 pqexpire
> > >    karli  6011  6011   0.91    0   20   116    70    0:00 top
> > >     root  1038  1038   0.32    *   20   140    54   10:30 mediad
> > >     root  1102  1102   0.13    *   20   879    85    6:49 Xsgi
> > >     root   261   261   0.11    *   +0   121   121    2:36 xntpd
> > >     root  1117  1112   0.11    *   20  1078    21    8:25 clogin
> > >     root  5708   171   0.07    *   20   111    50    0:00 telnetd
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.37,0.15,0.11] 20:08:56   53 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >    karli  6011  6011   0.88    0   20   116    70    0:00 top
> > >      ldm  5950  5950   0.24    *   20  6363    38    0:00 rpc.ldmd
> > >     root  1117  1112   0.14    *   20  1078    32    8:25 clogin
> > >     root  1102  1102   0.13    *   20   879    91    6:49 Xsgi
> > >      ldm  6024  5950   0.09    *   20   222    40    0:00 startxcd.k
> > >     root  5437   171   0.09    *   20   111    49    0:00 telnetd
> > >     root  5708   171   0.05    *   20   111    50    0:00 telnetd
> > >      ldm  6005  5950   0.04    *   20  6364    35    0:00 rpc.ldmd
> > >     root    77     0   0.02    *   20    96    43    0:02 syslogd
> > >     root   165     0   0.02    *   20    92    42    0:07 portmap
> > >     root   261   261   0.02    *   +0   121   121    2:36 xntpd
> > >
> > > IRIX64 breeze 6.4 02121744 IP30 Load[0.87,0.26,0.15] 20:08:57   50 procs
> > >     user   pid  pgrp   %cpu proc  pri  size   rss    time command
> > >    karli  6011  6011   0.96    0   20   116    70    0:00 top
> > >     root     1     0   0.59    *   20    26    18    0:27 init
> > >     root  1038  1038   0.15    *   20   140    60   10:30 mediad
> > >     root    77     0   0.13    *   20    96    50    0:02 syslogd
> > >     root   165     0   0.12    *   20    92    46    0:07 portmap
> > >     root  1039   171   0.11    *   20   120    45    0:50 fam
> > >     root  5708   171   0.06    *   20   111    50    0:00 telnetd
> > >     root  5437   171   0.06    *   20   111    49    0:00 telnetd
> > >      ldm  5463  5463   0.05    *   20    36    16    0:00 csh
> > >     root  1117  1112   0.05    *   20  1078    32    8:25 clogin
> > >     root  1102  1102   0.04    *   20   879    91    6:49 Xsgi
> > >     root   261   261   0.01    *   +0   121   121    2:36 xntpd
> > > -----------------------------------------------------------------------
> > > Karli Lopez
> > >
> > >
> > > Robb Kambic wrote:
> > > >
> > > > On Mon, 22 Nov 1999, McIDAS wrote:
> > > >
> > > > > Robb,
> > > > > thanks for the tip.  Executing the command yielded some pretty
> > > > > interesting output:
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > ldm@breeze 1% bin/rpc.ldmd -vl - etc/ldmd.conf
> > > > > Nov 22 19:59:03 rpc.ldmd[21390]: Starting Up (built: Aug 22 1997 12:07:40)
> > > > > Nov 22 19:59:03 aqua[21329]: run_requester: Starting Up: aqua.atmos.uah.edu
> > > > > Nov 22 19:59:03 striker[21395]: run_requester: Starting Up: striker.atmos.albany.edu
> > > > > Nov 22 19:59:04 udp.ldmd[21382]: Starting Up
> > > > > Nov 22 19:59:30 aqua[21329]: pq_sequence: xdr_prod_info() failed
> > > > > Nov 22 19:59:30 striker[21395]: pq_sequence: xdr_prod_info() failed
> > > > > Nov 22 19:59:30 aqua[21329]: pq_last: seq:I/O error (errno = 5)
> > > > > Nov 22 19:59:30 aqua[21329]: run_requester: 19991122185903.945 TS_ENDT {{UNIDATA,  ".*"},{FSL2|MCIDAS,  ".*"}}
> > > > > Nov 22 19:59:30 striker[21395]: pq_last: seq:I/O error (errno = 5)
> > > > > Nov 22 19:59:30 striker[21395]: run_requester: 19991122185903.951 TS_ENDT {{NLDN,  ".*"}}
> > > >
> > > > Karli,
> > > >
> > > > The first thing to check is that your machine time is correct. Also,
> > > > comment out the "exec pqact ...." line in your ldmd.conf file.  I would
> > > > also comment out the other request lines in the ldmd.conf until it runs
> > > > correctly.  What type of machine is this?  What's the output of top?
> > > >
> > > > Robb...
> > > >
> > > > > Nov 22 19:59:36 rpc.ldmd[21390]: child 21416 terminated by signal 6
> > > > > Nov 22 19:59:36 rpc.ldmd[21390]: Killing (SIGINT) process group
> > > > > Nov 22 19:59:36 rpc.ldmd[21390]: Interrupt
> > > > > Nov 22 19:59:36 rpc.ldmd[21390]: Exiting
> > > > > Nov 22 19:59:36 striker[21395]: Interrupt
> > > > > Nov 22 19:59:36 striker[21395]: Exiting
> > > > > Nov 22 19:59:36 aqua[21329]: Interrupt
> > > > > Nov 22 19:59:36 aqua[21329]: Exiting
> > > > > Nov 22 19:59:36 udp.ldmd[21382]: Interrupt
> > > > > Nov 22 19:59:36 udp.ldmd[21382]: Exiting
> > > > > Nov 22 19:59:36 rpc.ldmd[21390]: Terminating process group
> > > > > ldm@breeze 2%
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > I got this output in less than a minute.  My guess is that the data
> > > > > stream is failing (but this wouldn't cause it to die) or something is
> > > > > externally killing it.
> > > > > Karli
> > > > >
> > > > > Robb Kambic wrote:
> > > > > >
> > > > > > Karli,
> > > > > >
> > > > > > Run the LDM from its home directory on the command line with the
> > > > > > messages sent to the screen, i.e.
> > > > > >
> > > > > > % bin/rpc.ldmd -vl - etc/ldmd.conf
> > > > > >
> > > > > > This should give us a clue of the problem.
> > > > > >
> > > > > > Robb...
> > >
> > > > ===============================================================================
> > > > Robb Kambic                                Unidata Program Center
> > > > Software Engineer III                      Univ. Corp for Atmospheric Research
> > > > address@hidden                   WWW: http://www.unidata.ucar.edu/
> > > > ===============================================================================
> > >
> > > --
> > >
> > > ====================================================================
> > > Amos Winter                                  address@hidden
> > > Director
> > > Puerto Rico Climatology Center
> > > P.O. Box 9013
> > > Department of Marine Sciences                  phone: (787) 265-5416
> > > University of Puerto Rico - Mayaguez             fax: (787) 265-2195
> > > Mayaguez, PR 00681-9013
> > >
> > 
> 
> 

===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
address@hidden             WWW: http://www.unidata.ucar.edu/
===============================================================================