
Re: 20010516: 20010514: 20010507: 20010507: oabsnd and swap space



I think you're right - I increased the swap space by 250 MB and things
seem to be running normally.  Thanks for your help.

Chris
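
For anyone following the same fix: a minimal sketch of the arithmetic behind
the swap recommendation, using the figures from the "top" output quoted below
and the "mkfile" and "swap -a" commands Steve names. The swap file path is a
hypothetical placeholder, and the script only prints the commands; it does not
run them. On Solaris they would be run by hand as root.

```shell
#!/bin/sh
# Figures taken from the "top" output quoted in this thread (all in MB).
RAM_MB=512
SWAP_USED_MB=250
SWAP_FREE_MB=125

# Hypothetical location for the new swap file; pick a filesystem with room.
SWAPFILE=${SWAPFILE:-/export/swapfile}

swap_plan() {
    total=$((SWAP_USED_MB + SWAP_FREE_MB))
    shortfall=$((RAM_MB - total))
    echo "swap now: ${total} MB, RAM: ${RAM_MB} MB, shortfall: ${shortfall} MB"
    if [ "$shortfall" -gt 0 ]; then
        # Printed only, never executed by this script.
        echo "mkfile ${shortfall}m ${SWAPFILE}"   # create the file
        echo "swap -a ${SWAPFILE}"                # add it as swap
        echo "swap -l"                            # list swap areas to confirm
    fi
}

swap_plan
```

A swap file added this way is not kept across a reboot unless it is also
listed in /etc/vfstab.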


On Thu, 17 May 2001, Unidata Support wrote:

> 
> Chris,
> 
> I logged in and took a look at your system. I couldn't run "sar" on
> your system.  You didn't have "top" that I could find so I copied one over:
> 
> ----
> load averages:  1.50,  0.77,  0.50                14:36:58
> 95 processes:  90 sleeping, 3 running, 1 zombie, 1 on cpu
> CPU states: 39.4% idle, 36.5% user,  9.2% kernel, 14.9% iowait,  0.0% swap
> Memory: 512M real, 13M free, 250M swap in use, 125M swap free
> ---
> 
> It appears that it really is a lack of system resources that is causing the
> problem of forking the process. Some of the problem could possibly be
> alleviated with fewer httpd processes. You could probably stand to add
> some swap space since you only have about 375MB available with 512MB of RAM.
> 
> In GEMPAK 5.4, the maximum grid size was 100,000 grid points. I increased this
> size to 400,000 in GEMPAK 5.6 to enable the use of several of
> NCEP's larger grids. 
> 
> Since oabsnd (about 90M of space is allocated on launching) probably uses up
> the available free memory, the forking of gplt fails.
> 
> We have 3 options:
> 
> 1) add more swap and see if this helps. You can do this with "mkfile" and
> adding it to swap with "swap -a". If you have the disk space, adding
> 128MB would get you closer to the 1 to 1 ratio.
> 
> 2) recompile GEMPAK 5.6 with the smaller grid sizes defined so that you have
> the same array sizes as when you were happy with GEMPAK 5.4.
> 
> 3) I was able to get the upperair.csh script to somewhat work around the
> problem by forcing gplt to be launched before running oabsnd. This worked
> better, but still ran into problems when McIDAS was doing imgremap.k.
> 
> 
> Here is what I tried:
> 
> I created a directory under $GEMDATA/tmp/upperair.chiz and copied your
> upperair.csh script to that directory for tinkering. I changed some paths
> in the script so as not to overwrite your grids in $HOME, and to
> use the upperair.chiz directory.
> 
> Following the gddelt section you have, I added:
# Let's just launch a program to get gplt fired up. Use this one
> # gplt for the entire script.
> echo " "
> echo "get gplt launched....."
> gpmap << GPLT
> 
>    e
> GPLT
> 
> 
> 
> Launching gpmap gets gplt running.  Now, you won't have to worry about
> forking gplt in oabsnd, since it is already running. You can still see:
> "Killed" when trying to launch oabsnd if you don't have enough memory.
> 
> 
> I also removed the individual gpend commands you had in the script
> since you only need it after you are finished with all your oabsnd
> invocations. Since it takes time to fork the gplt process, you are
> better off only having to start it up once.
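
The resulting script structure can be sketched as follows. This is an
illustration only, not the edited upperair.csh: it is written in sh rather
than csh, the levels 850 and 500 are made-up examples (925 is the level from
the original report), and the run_gempak wrapper and DRY_RUN switch are
hypothetical helpers that just print the program names unless DRY_RUN=0.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) prints what would be run instead of running it.
DRY_RUN=${DRY_RUN:-1}

run_gempak() {
    prog=$1
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $prog"
        cat > /dev/null          # discard the program input
    else
        "$prog"                  # program reads its input from stdin
    fi
}

upperair_sketch() {
    # 1) Start gplt once by launching gpmap and exiting immediately.
    run_gempak gpmap <<EOF

e
EOF

    # 2) Every oabsnd run now reuses the gplt that is already running.
    for level in 925 850 500; do
        run_gempak oabsnd <<EOF
LEVELS = $level
r

e
EOF
    done

    # 3) Shut gplt down once, after the last oabsnd invocation.
    run_gempak gpend </dev/null
}

upperair_sketch
```

The point of the layout is that gplt is forked once at the top and ended once
at the bottom, rather than once per oabsnd call.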
> 
> I left the upperair.chiz directory (with "top" there too) for you.
> 
> One other thing: in your .cshrc, you source Gemenviron, then you set
> your path. Since Gemenviron will add GEMEXE and SCRIPTS_EXE to your path,
> it is better to set your path first (without hardcoding the gempak
> binary directory in it), and then source Gemenviron. Since you were not
> adding the SCRIPTS_EXE directory into the PATH, you were overriding the PATH
> and preventing scripts like "cleanup" from working.
> 
> Running "cleanup -c" will remove the message queues and kill off the
> gplt and parent processes for a user.
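
For reference, leftover message queues like the ones in the ipcs listing
further down can also be found by hand. The sketch below is not how "cleanup"
works internally (that is not shown in this thread); it is a hypothetical
helper, stale_queue_cmds, that reads Solaris-style ipcs output and prints one
"ipcrm -q" command per message queue owned by a given user. It removes
nothing itself.

```shell
#!/bin/sh
# Read ipcs output on stdin; print an ipcrm command for each message queue
# (rows whose first field is "q") whose OWNER column matches the given user.
stale_queue_cmds() {
    owner=$1
    awk '$1 == "q" && $5 == owner { print "ipcrm -q " $2 }' owner="$owner" -
}

# Example with two rows copied from the ipcs listing in this thread:
# the message queue row is reported, the shared memory row is skipped.
stale_queue_cmds gempak <<'EOF'
q   2951   0x4b3fb75  --rw-rw-rw-   gempak   ldm  4181   0  23:30:03
m        202   0          --rw-rw-rw-   gempak      ldm 19174   294
EOF
# prints: ipcrm -q 2951
```

On a live system the input would come from "ipcs -q" instead of the
here-document. Check that no gplt is still using a queue before removing it.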
> 
> Steve Chiswell
> Unidata User Support
> 
> 
> 
> 
> 
> 
>  
> >From: address@hidden (Chris Hennon)
> >Organization: UCAR/Unidata
> >Keywords: 200105161652.f4GGqdp13789
> 
> >Steve -
> >
> >I apologize for taking up so much of your time.  I'll understand if you
> >have other things to take care of.
> >
> >I've been working with one specific script which I am running by itself.
> >Hopefully, this specific example will yield some useful information.  I
> >was wondering if you could login to my machine and take a look.  The
> >script is located in:
> >
> >/usr/local/gempak/scripts/upperair/upperair.csh
> >
> >Basically, it runs oabsnd multiple times to create upperair grids, then
> >calls a variety of other scripts that produce upperair plots.  When this
> >script completes, it leaves behind several message queues:
> >
> >ipcs -pt
> >IPC status from <running system> as of Wed May 16 11:15:48 EDT 2001
> >T     ID      KEY      MODE         OWNER   GROUP LSPID LRPID STIME
> >STIME    RTIME    CTIME
> >Message Queues:
> >q   2951   0x4b3fb75  --rw-rw-rw-   gempak   ldm  4181   0  23:30:03
> >q    102   0x4b3fbe4  --rw-rw-rw-   gempak   ldm  4292   0  23:30:11
> >q    103   0x4b3fc28  --rw-rw-rw-   gempak   ldm  4360   0  23:30:20
> >q    104   0x4b3fca4  --rw-rw-rw-   gempak   ldm  4484   0  23:30:28
> >q    105   0x4b3fd0d  --rw-rw-rw-   gempak   ldm  4589   0  23:30:36
> >q    106   0x4b3fd81  --rw-rw-rw-   gempak   ldm  4705   0  23:30:44
> >q    107   0x4b3fdf1  --rw-rw-rw-   gempak   ldm  481    0  23:30:52
> >q   1108   0x4b402d8  --rw-rw-rw-   gempak   ldm  6072   0  23:35:20
> >q    109   0x4b402f1  --rw-rw-rw-   gempak   ldm  6097   0  23:35:39
> >T         ID      KEY        MODE        OWNER    GROUP  CPID  LPID
> >ATIME    DTIME    CTIME
> >Shared Memory:
> >m        202   0          --rw-rw-rw-   gempak      ldm 19174   294
> >17:01:19 17:01:19 17:01:19
> >T         ID      KEY        MODE        OWNER    GROUP   OTIME    CTIME
> >Semaphores:
> >twister:[/home/chennon/output/gifs/sat/1998]%
> >
> >but no gplt processes.  There is a log file from the last time I tried to 
> >run the script in:
> >
> >/usr/local/gempak/logs/upperair.log
> >
> >I ran it just after a reboot, so the system should have been clean.  One
> >other thing that happened after the rebuild that shouldn't have an impact
> >but I thought I would mention - we turned off a bunch of system processes
> >due to security concerns - the ones that are no longer active are in
> >/etc/rc2.d/turnedoff.  I don't see any that would have an impact on gempak
> >programs but I thought I would mention it. 
> >
> >I appreciate your efforts.  Thanks.
> >
> >Chris
> >  
> >================================================
> >| Chris Hennon              Ohio State University   |
> >| Tropical Meteorology      address@hidden   |
> >|                                              |
> >| Dept of Geography   Office: 1155 Derby Hall  |
> >| 1036 Derby Hall     Phone : (614) 292-2704   |
> >| Columbus, OH 43210  Fax   : (614) 292-6213   |
> >================================================
> >
> >On Mon, 14 May 2001, Unidata Support wrote:
> >
> >> 
> >> Chris,
> >> I'm not saying that you can't run more than 1 GEMPAK program at the same
> >> time. What I can say is:
> >> 1) if you have a program that frequently exits abnormally, and leaves behind
> >>    a gplt, or other process, then the likelihood is that system resources
> >>    will start to run short.
> >> 
> >> 2) If 2 processes ask for a gplt at the same time, it is possible for both
> >>    programs to be issued the same message queue ID by the system. This
> >>    happens because until the program actually gets the gplt process running,
> >>    the system will keep handing out the same available message queue. Using
> >>    _gf programs where the gplt and gf processes are linked to the
> >>    application reduces the total number of processes running on your system
> >>    at any one time, and avoids the use of message queues, thereby avoiding
> >>    the possible conflict above.
> >> 
> >> 3) If multiple programs are running at the same time, you should have ntl
> >>    running on the display so that all processes use the shared color map so
> >>    you don't run out of colors on the display (you can run ntl on a screen:1
> >>    as well). Or, use the gif device driver that doesn't require an X display
> >>    to be running (you'll have to use message queues for the gif driver,
> >>    except with the radmap_sw program which I do have linked with gif instead
> >>    of gf).
> >> 
> >> Nothing has changed in the underlying message queue system between 5.4 and
> >> 5.6, or the shared color system, so that isn't a cause for differences.
> >> 
> >> When you say that models take 2-3 hours to run, are you saying that the time
> >> over which the data arrives is 2-3 hours, or are you saying that the GEMPAK
> >> programs take that long to run?
> >> I can help you organize actions to kick off when the LDM receives necessary
> >> grids, or determine when all the pieces of data exist so that you don't have
> >> to run programs multiple times to recreate plots as more data arrives. Let
> >> me know if I can help you.
> >> 
> >> Steve Chiswell
> >> Unidata User Support
> >> 
> >> 
> >> 
> >> 
> >> 
> >> 
> >> >From: address@hidden (Chris Hennon)
> >> >Organization: UCAR/Unidata
> >> >Keywords: 200105142213.f4EMDfp11331
> >> 
> >> >Steve -
> >> >
> >> >This issue seems to have been resolved after a reboot, though I am not
> >> >sure why.
> >> >
> >> >Just to clarify,
> >> >are you saying that two or more gempak programs cannot be running at the
> >> >same time?  When I was using 5.4 and before the rebuild, I sometimes had 4
> >> >or 5 scripts cranking along at the same time with no problem.  I've
> >> >followed your suggestions, using the _gf programs where possible and using
> >> >master scripts for large jobs.  But there are still issues with
> >> >overlapping jobs - for example, surface fields get plotted every hour, but
> >> >running the NGM, ETA, and AVN models takes at least 2-3 hours.
> >> >
> >> >Thanks ahead.
> >> >
> >> >Chris
> >> >
> >> >================================================
> >> >| Chris Hennon           Ohio State University   |
> >> >| Tropical Meteorology      address@hidden   |
> >> >|                                              |
> >> >| Dept of Geography   Office: 1155 Derby Hall  |
> >> >| 1036 Derby Hall     Phone : (614) 292-2704   |
> >> >| Columbus, OH 43210  Fax   : (614) 292-6213   |
> >> >================================================
> >> >
> >> >On Mon, 7 May 2001, Unidata Support wrote:
> >> >
> >> >> 
> >> >> Chris,
> >> >> 
> >> >> I was actually referring to the grid dimensions, can you send me the
> >> >> GDINFO for your grid file?
> >> >> 
> >> >> Steve Chiswell
> >> >> Unidata User Support
> >> >> 
> >> >> 
> >> >> 
> >> >> >From: address@hidden (Chris Hennon)
> >> >> >Organization: UCAR/Unidata
> >> >> >Keywords: 200105071933.f47JXqp00391
> >> >> 
> >> >> >Steve -
> >> >> >
> >> >> >The upperstr.grd file is pretty big:
> >> >> >
> >> >> >twister:[/usr/local/gempak/grids]% ls -l
> >> >> >-rw-r--r--   1 gempak   ldm      2575360 Apr 12 23:30 upperstr.grd
> >> >> >
> >> >> >oabsnd is version 5.6.a, as is dcuair.  
> >> >> >
> >> >> >Chris
> >> >> >
> >> >> >================================================
> >> >> >| Chris Hennon        Ohio State University   |
> >> >> >| Tropical Meteorology      address@hidden   |
> >> >> >|                                              |
> >> >> >| Dept of Geography   Office: 1155 Derby Hall  |
> >> >> >| 1036 Derby Hall     Phone : (614) 292-2704   |
> >> >> >| Columbus, OH 43210  Fax   : (614) 292-6213   |
> >> >> >================================================
> >> >> >
> >> >> >On Mon, 7 May 2001, Unidata Support wrote:
> >> >> >
> >> >> >> 
> >> >> >> Chris,
> >> >> >> 
> >> >> >> What is the size of the $HOME/grids/upperstr.grd file?
> >> >> >> What version of GEMPAK are you running (eg 5.6, 5.6.C)?
> >> >> >> Are you running a different version of the dcuair decoder?
> >> >> >> 
> >> >> >> For example:
> >> >> >>  GEMPAK-OABSND>version
> >> >> >> 
> >> >> >>  GEMPAK Version 5.6.c.1
> >> >> >> 
> >> >> >> % dcuair -help
> >> >> >> ....
> >> >> >> >Version 5.6.c.1<
> >> >> >> 
> >> >> >> 
> >> >> >> Steve Chiswell
> >> >> >> Unidata User Support
> >> >> >> 
> >> >> >> 
> >> >> >> >From: address@hidden (Chris Hennon)
> >> >> >> >Organization: UCAR/Unidata
> >> >> >> >Keywords: 200105071647.f47Gltp15071
> >> >> >> 
> >> >> >> >Steve -
> >> >> >> >
> >> >> >> >I double checked and all looks well there:
> >> >> >> >
> >> >> >> >twister:[/usr/local/gempak/scripts/upperair]% cd $GEMEXE
> >> >> >> >twister:[/usr/local/gempak/bin/sol]% ls -l gplt
> >> >> >> >-rwxr-xr-x   1 gempak   ldm       496276 Apr 23 13:45 gplt*
> >> >> >> >twister:[/usr/local/gempak/bin/sol]% cd ../../scripts/upperair
> >> >> >> >twister:[/usr/local/gempak/scripts/upperair]% oabsnd
> >> >> >> > SNFILE    Sounding data file                $RAW_UPA/20010507_upa.gem
> >> >> >> > GDFILE    Grid file                         $HOME/grids/upperstr.grd
> >> >> >> > SNPARM    Sounding parameter list           tmpc
> >> >> >> > STNDEX    Stability indices                  
> >> >> >> > LEVELS    Vertical levels                   925
> >> >> >> > VCOORD    Vertical coordinate type          PRES
> >> >> >> > DATTIM    Date/time                         12
> >> >> >> > DTAAREA   Data area for OA                   
> >> >> >> > GUESS     Guess file*time                    
> >> >> >> > GAMMA     Convergence parameter             0.3
> >> >> >> > SEARCH    Search radius/Extrapolation       20/EX
> >> >> >> > NPASS     Number of passes                  2
> >> >> >> > QCNTL     Quality control threshold          
> >> >> >> > Parameters requested: SNFILE,GDFILE,SNPARM,STNDEX,LEVELS,VCOORD,DATTIM,
> >> >> >> > DTAAREA,GUESS,GAMMA,SEARCH,NPASS,QCNTL.
> >> >> >> > GEMPAK-OABSND>r
> >> >> >> >Could not fork
> >> >> >> > [GEMPLT -101]  NOPROC   - Nonexistent executable.
> >> >> >> > [OABSND -3]  Fatal error initializing GEMPLT.
> >> >> >> >twister:[/usr/local/gempak/scripts/upperair]%
> >> >> >> >
> >> >> >> >Chris
> >> >> >> >
> >> >> >> >================================================
> >> >> >> >| Chris Hennon             Ohio State University   |
> >> >> >> >| Tropical Meteorology      address@hidden   |
> >> >> >> >|                                              |
> >> >> >> >| Dept of Geography   Office: 1155 Derby Hall  |
> >> >> >> >| 1036 Derby Hall     Phone : (614) 292-2704   |
> >> >> >> >| Columbus, OH 43210  Fax   : (614) 292-6213   |
> >> >> >> >================================================
> >> >> >> >
> >> >> >> >On Mon, 7 May 2001, Unidata Support wrote:
> >> >> >> >
> >> >> >> >> 
> >> >> >> >> Chris,
> >> >> >> >> 
> >> >> >> >> OABSFC requires that "gplt" be found. The non-existent
> >> >> >> >> executable seems to indicate that $GEMEXE/gplt is either
> >> >> >> >> not being found, that you don't have permission to execute it, 
> >> >> >> >> or that for some reason the system is not able to execute gplt.
> >> >> >> >> 
> >> >> >> >> Since it says non-existent, it sounds like the program is
> >> >> >> >> not being found. See if there is any problem with your $GEMEXE
> >> >> >> >> environmental variable (which is set when you sourced Gemenviron),
> >> >> >> >> and double check that gplt is executable as well.
> >> >> >> >> 
> >> >> >> >> The attempt to execute gplt occurs when you run the analysis,
> >> >> >> >> i.e., not when you first start up oabxxx.
> >> >> >> >> 
> >> >> >> >> Steve Chiswell
> >> >> >> >> Unidata User Support
> >> >> >> >> 
> >> >> >> >> 
> >> >> >> >> 
> >> >> >> >> >From: address@hidden (Chris Hennon)
> >> >> >> >> >Organization: UCAR/Unidata
> >> >> >> >> >Keywords: 200105071618.f47GIbp13844
> >> >> >> >> 
> >> >> >> >> >Steve -
> >> >> >> >> >
> >> >> >> >> >I've run into a curious problem.  I'm trying to run "oabsnd" for just one
> >> >> >> >> >level and one variable and the program exits with a NOPROC - Nonexistent
> >> >> >> >> >executable and "Could not fork" errors.  I think I have plenty of swap
> >> >> >> >> >space:
> >> >> >> >> >
> >> >> >> >> >swap -s
> >> >> >> >> >total: 67792k bytes allocated + 167728k reserved = 235520k used, 159608k
> >> >> >> >> >available
> >> >> >> >> >
> >> >> >> >> >There are no rogue processes around that I can see.  There are no dead
> >> >> >> >> >message queues.  In the past, I have run oabsnd under the same conditions
> >> >> >> >> >without a problem, even with more levels and more variables.  The support
> >> >> >> >> >archives all seem to indicate a problem with either swap space or orphaned
> >> >> >> >> >processes but it doesn't appear that I have those issues.  Any ideas?
> >> >> >> >> >Thanks.
> >> >> >> >> >
> >> >> >> >> >Chris    
> >> >> >> >> >
> >> >> >> >> >================================================
> >> >> >> >> >| Chris Hennon          Ohio State University   |
> >> >> >> >> >| Tropical Meteorology      address@hidden   |
> >> >> >> >> >|                                              |
> >> >> >> >> >| Dept of Geography   Office: 1155 Derby Hall  |
> >> >> >> >> >| 1036 Derby Hall     Phone : (614) 292-2704   |
> >> >> >> >> >| Columbus, OH 43210  Fax   : (614) 292-6213   |
> >> >> >> >> >================================================
> >> >> >> >> >
> >> >> >> >> 
> >> >> >> >> ****************************************************************************
> >> >> >> >> Unidata User Support                                    UCAR Unidata Program
> >> >> >> >> (303)497-8644                                                  P.O. Box 3000
> >> >> >> >> address@hidden                                   Boulder, CO 80307
> >> >> >> >> ----------------------------------------------------------------------------
> >> >> >> >> Unidata WWW Service                        http://www.unidata.ucar.edu/
> >> >> >> >> ****************************************************************************