
Re: 19991201: ldm losing large files




On Thu, 2 Dec 1999, Robb Kambic wrote:

> On Wed, 1 Dec 1999, Unidata Support wrote:
> 
> > >To: address@hidden
> > >From: Jeff Masters <address@hidden>
> > >Subject: ldm losing large files
> > >Organization: .
> > >Keywords: 199912011922.MAA04768
> > 
> > 
> >   Hi, I am using ldmsend to transfer large files between ldm machines. I
> > am having intermittent problems with ldm dropping files that are bigger
> > than about 12Mb. 
> > 
> >   from the source machine named lanina (running Linux slackware 2.0.36 and
> > ldm-5.0.8 with a 400 Mb queue) I issue this command:
> > 
> > ldmsend -h breeze /logs/lanina.log.gz
> > 
> >   I see the following activity in the ldm.log file on the destination
> > machine (breeze), a slackware Linux 2.2.6 box running ldm-5.0.8 with a 600
> > Mb queue: 
> > 
> > Dec 01 19:08:24 breeze lanina[29943]: Connection from
> > lanina.wunderground.com 
> > Dec 01 19:08:24 breeze lanina[29943]: hiya: 19991201190824.745 TS_ENDT
> > {{EXP,  ".*"}} 
> > Dec 01 19:08:24 breeze lanina[29943]: Connection reset by peer 
> > Dec 01 19:08:24 breeze lanina[29943]: Exiting 
> > 
> > Sometimes the file (16Mb) shows up, sometimes not. When the file doesn't
> > show up, an "ldmadmin watch -f EXP" does not show it coming across and
> > entering the queue. No error messages on either machine indicate why
> > the file didn't make it.
> > 
> > Any ideas for troubleshooting? 
> 
> Jeff,
> 
> ldmsend is not as reliable as the newer LDM commands for sending files to a
> remote machine.  Try using pqinsert or pqsend instead: they use the LDM's
> internal connections, whereas ldmsend sets up a temporary connection
> between the machines. One benefit, for example, is that the product can be
> resent if a network connection is broken.  Look at the man pages for
> pqinsert and pqsend.
> 

Thanks, I'll do that. The queue was big enough; there were no "Deleting
oldest" messages. However, I did find that the problem occurred mostly when
I ran ldm-5.0.6 on the upstream machine and 5.0.8 on the downstream
machine. Running the same version of ldm on both seems to fix the problem
in all but one case I tried, which required me to restart the web server
before it would work.
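
For anyone repeating this check, the log search described above can be sketched in Python. This is only an illustration: the function name is hypothetical, and the exact wording of the LDM message is an assumption (the thread quotes only the phrase "deleting oldest"), so the match is case-insensitive on that phrase alone.

```python
import re

# Assumption: only the phrase "deleting oldest" is known from the thread;
# the rest of the log message may differ between LDM versions.
PATTERN = re.compile(r"deleting oldest", re.IGNORECASE)


def find_queue_evictions(lines):
    """Return the lines that mention the queue deleting its oldest product.

    `lines` can be any iterable of strings, for example an open log file.
    """
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


# Usage (the log path is site-specific, so it is left to the caller):
#
#     with open(path_to_ldm_log, errors="replace") as log:
#         hits = find_queue_evictions(log)
#     print(len(hits), "matching message(s)")
```

An empty result suggests the queue never had to evict products to make room, which matches the conclusion that the queue size was not the problem here.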

> 
> 
> > Is there a command I can issue to see how
> > much of the queue is currently being taken up? 
> 
> The "pqutil stats" command gives the stats about the queue. Again look at
> the man page.
> 
> > I have about 80 Mb/hour of
> > data going between the machines, so the 400-600 Mb queue size I am using
> > should be enough to handle a 16Mb file.
> > 
> I would check the log file for "deleting oldest" entries, which are logged
> when old products are deleted to make room for a new product in the queue.
> If you don't see any messages of this type, then the queue is large enough.
> A 400-600 Mb queue is large enough for 80 Mb/hour.
> 
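
The sizing argument above can be checked with a little arithmetic. The sketch below uses only the figures from this thread (80 Mb/hour, a 16 Mb file, 400 and 600 Mb queues). The function names are hypothetical, and the calculation assumes a steady data rate and ignores any per-product overhead in the queue.

```python
def queue_residence_hours(queue_mb, inflow_mb_per_hour):
    """Rough number of hours of data the queue holds before the oldest is evicted."""
    return queue_mb / inflow_mb_per_hour


def product_share(product_mb, queue_mb):
    """Fraction of the whole queue that a single product would occupy."""
    return product_mb / queue_mb


for queue_mb in (400, 600):
    hours = queue_residence_hours(queue_mb, 80)
    share = product_share(16, queue_mb)
    print(f"{queue_mb} Mb queue: about {hours:.1f} h of data; "
          f"a 16 Mb file is {share:.1%} of the queue")
```

Under these assumptions the queues hold roughly 5 and 7.5 hours of data, and a 16 Mb file takes up only a few percent of either one.
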
> Robb...
> 
> 
> ps. Can you add an "allow UNIDATA" line for the following sites for
> failover purposes?
> 

Done. Can you add an "allow" line on thelma and iita for these machines:

yang.sprl.umich.edu
onesky.engin.umich.edu

We will be retiring both blueskies and aldehyde.sprl.umich.edu this month,
so I am setting up a new top-tier machine and backup here at umich.

Thanks, jeff


> grayskies.atmos.washington.edu
> norte.sfsu.edu
> typhoon.atmos.ucla.edu
> vision.soest.hawaii.edu
> ocs.oce.orst.edu
> bodie.met.nps.navy.mil 
> ldmhost.dri.edu
> meteora.ucsd.edu
> nimbus.atmo.arizona.edu
> 
> Thanks,
> Robb...
> 
> 
> > Thanks, Jeff
> > ------------------------------------------------------------------------------
> >  Dr. Jeff Masters (address@hidden)                           (  )  
> >  Chief Meteorologist                              /\ Home of the       (    )
> >  The Weather Underground, Inc.               /\  /  \  /\       /\    (      )
> >  P.O. Box 3605                              /  \/    \/  \ /\  /  \    ------
> >  Ann Arbor, MI 48106-3605            ______/              /  \/    \_   \\\\\
> >  734-994-8824                                   Weather Underground      \`\`\
> >                                             http://www.wunderground.com
> > 
> 
> ===============================================================================
> Robb Kambic                              Unidata Program Center
> Software Engineer III                    Univ. Corp for Atmospheric Research
> address@hidden                   WWW: http://www.unidata.ucar.edu/
> ===============================================================================
>