
Re[2]: [awipsldm] Re: LDM Observations and Comments (fwd)




===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
address@hidden             WWW: http://www.unidata.ucar.edu/
===============================================================================

---------- Forwarded message ----------
Date: 07 Feb 2000 18:08:52 -0500
From: Ken Waters <address@hidden>
To: address@hidden
     address@hidden, address@hidden
Subject: Re[2]: [awipsldm] Re: LDM Observations and Comments


     Thanks for the thorough response, Russ.  I do appreciate it.
     
     I haven't fully absorbed it all yet, but a couple more items of 
     explanation are probably in order.  The files you saw represented a 
     transition and that is why you might have seen what you thought were 
     duplications.  I am moving from having several entries in the 
     pqact.conf file to having just one entry that sends all products to 
     the same script.  Again, the reasoning behind this was that the LDM 
     didn't have the flexibility I needed (e.g., -NOT- operators, ELSE 
     syntax, and top appending).
     
     I'm beginning to rethink that strategy; possibly I'll have a series 
     of scripts, specific to the different data types.  My concern there 
     is that this might become unmanageable in the pqact.conf file, given 
     the great number of data types we receive.
     
     I also think I had the problem of "losing" my stdin feed as you hinted 
     at in your reply.  I think I tried reusing it and discovered that it 
     was lost after the first use.  That's why I had to resort to 
     writing out a temp file.
     
     Hope this all helps clear the picture.  Now, I'll go read your message 
     carefully and see if I can come up with a good solution.
     
     Regards,
     
     Ken


______________________________ Reply Separator _________________________________
Subject: Re: [awipsldm] Re: LDM Observations and Comments
Author:  address@hidden at EXTERNAL
Date:    2/7/2000 4:45 PM


>To: address@hidden
>From: Ken Waters <address@hidden>
>Subject: Re: 20000207: [awipsldm] Re: LDM Observations and Comments 
>Organization: NWS Southern Region
>Keywords: PIPE action, decoder processes
     
Hi Ken,
     
> If you don't mind, I'd like to pursue the matter of my Perl script 
> that kicks off other system calls.  I have attached my script for 
> your reference.
>
> Basically, I write out the stdin to a temp file because there are a 
> series of actions that can be done to a file.  Maybe there is a
> better way.  I know you suggested not writing temp files, but either 
> way I'm forced to use system calls to write the file out, right?
     
If you want to do a series of actions to a product, the usual way to 
handle this is to have multiple pqact.conf pattern-action entries match 
the product, with each specifying the same pattern but a different 
action.  But you must already know this, because you are already doing 
4 things with each product, according to your pqact.conf entries:
     
  # Test script for ALL products
  AFOS     ^(...)(...)(...|..)
       PIPE     -strip /home/ldm/process \1 \2 \3
     
  # Rotate all versions
  AFOS     ^(...)(...)(...|..)
       PIPE     -strip /home/ldm/version.csh \1 \2 \3
     
  AFOS     ^(...)(...)(...|..)
       FILE     -strip -overwrite /home/ldm/data/\1/\2/\1\2\3.1.txt
     
  # Append products
  AFOS     ^(...)(...)(...|..)
       FILE     -strip /home/ldm/data/\1/\2/\1\2\3.txt
     
The above is generating a *lot* of processes, two for each AFOS 
product for starters ("process" and "version.csh"), with these 
generating even more processes as explained below.  The LDM is 
designed to permit you to start up a process once and keep it running 
to handle multiple products, instead of starting a process for each 
product.  That's the way our perl decoders work, and maybe you could 
use the same pattern to have your perl scripts each handle multiple 
products.
     
There's a brief description of this in the Site Manager's Guide:
     
    The PIPE command permits execution of an arbitrary process (an 
    executable program or a shell script, for example) with a data 
    product as standard input to the process. The program should read 
    from standard input and exit upon reading end of file. It should 
    also time out if no input is read for some time.
     
    Like files, pipelines to child processes are cached for use in 
    processing subsequent products. The pipeline will remain open until 
    the LDM server needs to recycle the output descriptor or until the 
    child process exits of its own accord. The -close option forces the 
    pipe to close after one product has been received.
     
The LDM pqact program maintains a list of (currently) 32 open file 
descriptors, corresponding to files it is writing data to and pipes 
down which it is writing data to be read by running "decoder" 
processes.  When a new product comes along and pqact needs to invoke a 
"PIPE" action on it, pqact checks that list to see whether the program 
and argument string are the same as for any of the open pipes.  If so, 
it just writes the data down that pipe and doesn't have to start up a 
new process.
     
In order to work with this model, processes that will handle multiple 
products need to be able to detect the delimiters that separate 
products, parse the product header if necessary, and, when there's no 
input, block until more is available on the pipe they're reading from.  
The only time such a process should exit is when it detects that the 
pipe has been closed on the other end (by pqact, because it needed 
another descriptor and this one had gone unused the longest of any of 
the open descriptors).
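     
To make that concrete, here is a rough perl sketch of that kind of 
long-running reader.  It is only a sketch built on assumptions: I am 
guessing your text products end with an ETX (\003) delimiter, as AFOS 
products usually do, and the header handling is just a placeholder; 
adjust both to match your feed.  The point is the loop structure and 
the quiet exit when the pipe closes.

    #!/usr/bin/perl
    use strict;

    # One long-running process handling every product that pqact writes
    # down the pipe; it exits only when pqact closes the pipe.
    # ASSUMPTION: each product ends with an ETX (\003) delimiter.

    $/ = "\003";                   # read one whole product at a time

    while ( defined( my $product = <STDIN> ) ) {   # blocks until input
        chop $product;             # drop the trailing delimiter

        # Parse whatever per-product information is needed out of the
        # product itself (here, its first line) rather than taking it
        # as command-line arguments.
        my ($header) = $product =~ /^(.*)$/m;

        # ... file, rotate, or decode the product here ...
    }

    # <STDIN> returned undef: pqact closed its end of the pipe because
    # it needed the descriptor elsewhere, so just exit quietly.
    exit 0;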
     
So pqact typically starts up a perl decoder for upper-air products, 
for example, and that one decoder keeps running, decoding every 
upper-air report that it reads from its stdin connected to the pipe 
from pqact.  If we started up a new instance of a decoder for every 
product, it would probably bog down the server, though it still would 
be feasible on modern workstations if there weren't too many products.
     
But you are starting up a couple of perl scripts for every product, 
and the first of these, "process", is starting up lots of other 
processes using the perl "system" function to mv and cp files.  I 
think it would be OK if you were only invoking 1 or 2 processes per 
product, but it looks like you're starting up more like a dozen.
     
In order to use the same process for multiple products, you have to 
make sure you don't use unique arguments for each process invocation, 
but instead let the process parse some of the information that is 
unique to each product.  That way, pqact will see the same process and 
the same argument string, and know to use the existing running process 
without starting up a new one.  For example, the only arguments pqact 
extracts from the product header to invoke our upper air decoder are 
the year and month, so theoretically that one decoder could stay 
running for a month:
     
    # upper air perl decoder
    DDS|IDS     ^U[ABDEFGHIJKLMPQRSTXZ].... .... ([0-3][0-9])
         PIPE     /usr/local/ldm/decoders/ua2nc
              etc/ua.cdl
              data/decoded
              (\1:yy)(\1:mm)
     
> Anyway, what's going on is that my process script is getting hung up 
> on all the "system("cat > $temp/$filenm.tmp");" lines.  At any one
> time, I find about 10-15 open jobs to the process script and each 
> one has another job which is writing out this temp file.  What is 
> puzzling to me is that the other uses of "cat" later in the script
> don't seem to be a problem.  I think what's going on is the various 
> different instances of the script are 'falling all over each other' 
> trying to read from stdin.  Some of these jobs are taking up to 5
> minutes to run!  This is, in some cases, leading to errors in the 
> proper storage of the data files.
     
Since pqact is starting up a new instance of the "process" script for 
each product, never reusing any of them, and it only keeps 32 file 
descriptors open, each new product uses up a file descriptor and 
forces the least-recently-used one to be closed, so the process at the 
other end of that descriptor sees its pipe closed.  I don't know how 
your perl script reacts to its stdin being closed, but that may be 
causing some of the problems you are seeing.
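     
If you do still want the product in a temporary file, one small change 
that might help is to copy stdin inside perl instead of handing the 
pipe to a separate "cat" process, so no extra process is competing for 
stdin.  A minimal sketch, with made-up directory and file names 
standing in for whatever your script already computes:

    #!/usr/bin/perl
    use strict;

    # Placeholder names; substitute the values the real script builds.
    my $tempdir = "/home/ldm/tmp";
    my $filenm  = "product";

    # Slurp the single product from stdin (undef means the pipe was
    # closed with nothing to read).
    my $product = do { local $/; <STDIN> };
    defined $product or exit 0;

    open( TMP, "> $tempdir/$filenm.tmp" )
        or die "can't write $tempdir/$filenm.tmp: $!";
    print TMP $product;
    close TMP;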
     
Also, since you are doing most of the work in the perl scripts in 
"system" calls, lots of extra processes are getting started.  If 
there's some way you can limit the number of "system" calls from the 
perl script and do some of these another way, it would probably help 
cut down on all the CPU overhead involved in creating and destroying 
processes.
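     
For example, perl's standard File::Copy module (and the built-in 
rename) can do what the "mv" and "cp" system calls are doing now 
without forking any child processes.  The file names below are made up 
just to show the substitutions:

    #!/usr/bin/perl
    use strict;
    use File::Copy;        # copy() and move(), no child process needed

    # Hypothetical names, only to illustrate the substitutions.
    my $src = "/home/ldm/data/XXX/YYY/XXXYYYZZ.txt";
    my $dst = "/home/ldm/data/XXX/YYY/XXXYYYZZ.1.txt";

    # instead of system("cp $src $dst"):
    copy( $src, $dst ) or warn "copy failed: $!";

    # instead of system("mv $src $dst"):
    move( $src, $dst ) or warn "move failed: $!";
    # (rename($src, $dst) also works when both names are on the same
    # filesystem)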
     
> It's important that I get this worked out as our data requirements 
> are increasing and I want to ensure this data feed works as well as 
> possible.
>
> For your reference, I have enclosed a copy of (1) my pqact.conf, (2) 
> the process script, and (3) a sample of jobs running [ps -ef].
     
Thanks for sending such a complete description of the problem.  I 
hope this helps explain what is going on.
     
--Russ