[McIDAS #XDP-858749]: Mcidas Memory problems



Hi Mary Ellen,

re:
> for the second time in a week our McIDAS jobs launched from the crontab have
> begun failing due to a memory problem. The error message is:
> 
> "Cannot make positive UC: could not create 384320-byte shared memory segment"

This is typically an indication that McIDAS sessions are not being set up
and exited correctly.  (A cron-initiated script that does McIDAS processing
is a session in the same way that an interactive session is.)  If the
processing script is not set up correctly (meaning that environment
variables needed by McIDAS are not defined correctly) or is interrupted
and so does not shut down correctly, shared memory segments that have
been created by the user running the script(s) are not returned to the
OS.  When enough shared memory segments are left behind, the OS runs out
of the shared memory it needs to create new segments.
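
One way to catch the "not set up correctly" case early is to have the
cron script check its environment before it runs any McIDAS command.
Here is a minimal sketch.  The function name is made up for illustration,
and the variable names in the usage comment are my assumption of what a
typical McIDAS-X script defines; substitute whatever your scripts rely on.

```shell
# Print the name of every variable in the argument list that is unset
# or empty.  Prints nothing when all of them are defined.
# (Hypothetical helper, not part of McIDAS.)
missing_env_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Possible use at the top of a cron script (variable names are assumptions):
#   missing=$(missing_env_vars MCDATA MCPATH PATH)
#   if [ -n "$missing" ]; then
#       echo "McIDAS environment incomplete: $missing" >&2
#       exit 1
#   fi
```

Bailing out before the first McIDAS command means a misconfigured run
never gets far enough to create a shared memory segment.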

re:
> I can log on to the machine as user "mcidas" and successfully run commands
> but not as user "sleet" which owns and runs the scripts through the cron.
> For example, a simple aka.k LIST command produces:
> 
> sleet@tornado:~/mcidas/data$ aka.k LIST
> aka.k: Cannot make positive UC: could not create 384320-byte shared memory 
> segment

Try this:

- log on as 'sleet' and run:

ipcs -a

I am willing to bet that there are a number of shared memory segments owned by
'sleet'.  These can be cleared using the 'ipcrm' utility; do a 'man ipcrm'
on your machine to see how to run it.
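
A rough sketch of how the leftover segments could be picked out of the
'ipcs' listing.  It assumes a Linux machine where 'ipcs -m' prints its
columns in the order key, shmid, owner, perms, bytes, nattch; check the
header on your system first.  The function name is made up for
illustration.

```shell
# Read `ipcs -m` output on stdin and print the IDs of segments that are
# owned by the user given as $1 and have no process attached (nattch 0).
# Assumed column order: key shmid owner perms bytes nattch [status]
stale_shm_ids() {
    awk -v owner="$1" '
        $2 ~ /^[0-9]+$/ && $3 == owner && $6 == 0 { print $2 }
    '
}

# Review the list first:
#   ipcs -m | stale_shm_ids sleet
# Then, once satisfied that no McIDAS job is running as that user:
#   ipcs -m | stale_shm_ids sleet | while read -r id; do
#       ipcrm -m "$id"
#   done
```

Limiting the match to segments with no attached processes is a
precaution so that a session that is still running is left alone.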

The other things that will be created along with the shared memory segments
are subdirectories of the .mctmp directory in the HOME directory of the
user.  Along with running 'ipcs -a', also run:

<as 'sleet'>
ls -alt ~sleet/.mctmp

I am willing to bet that there are LOTS of subdirectories listed.  These
can be removed by 'sleet'.
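
A minimal sketch for finding the old subdirectories, assuming a 'find'
that supports -mindepth/-maxdepth (GNU and BSD versions do).  The
function name and the two-day cutoff in the usage comment are arbitrary
choices for illustration, not anything McIDAS defines.

```shell
# List subdirectories of directory $1 that were last modified more than
# $2 days ago.  Only direct children are considered.
list_old_mctmp_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +"$2"
}

# Review the list first:
#   list_old_mctmp_dirs "$HOME/.mctmp" 2
# Then remove them, as the owning user, while no McIDAS job is running:
#   list_old_mctmp_dirs "$HOME/.mctmp" 2 | while IFS= read -r d; do
#       rm -rf "$d"
#   done
```

The age test is only a heuristic: a directory's modification time changes
when files are added to or removed from it, so a long-running session
could look older than it is.  That is why the removal step should be run
only when nothing is active.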

re:
> When this happened the first time, by simply logging onto the machine
> everything was fixed. Literally Randy and did nothing except log on and it
> started working.

This is surprising.

re: 
> The symptom is back and a simple log on is not clearing things up this time.

A simple login is not the solution even if it works.

re:
> We rebooted the machine and that fixed things this time.

Rebooting the machine will forcibly remove all of the shared memory
segments for all users.  It will not, however, delete the subdirectories
created by McIDAS (the subdirectories of ~user/.mctmp).

re:
> Obviously, we don't
> want to have to reboot this machine every 3-4 days but rather we need to
> determine what is causing the memory overload or how to flush the memory
> nightly.

The "overload", as you call it, is caused by scripts running McIDAS commands
either not being set up correctly, or not terminating correctly.  Again, when
this happens the shared memory segments that are created do not get deleted.
After enough such failures, there is no shared memory left out of which
new shared memory segments can be created.

The fix will likely be twofold:

- carefully investigate the scripts being run and fix whatever problems
  they have

- make sure that your OS is configured to have enough shared memory

  The Unidata McIDAS User Guide has a section on setting the shared memory
  in various OSes:


Unidata HomePage
http://www.unidata.ucar.edu

  Software -> McIDAS
  http://www.unidata.ucar.edu/software/mcidas/

    Documentation & Training
    http://www.unidata.ucar.edu/software/mcidas/#documentation

      McIDAS user's guide
      http://www.unidata.ucar.edu/software/mcidas/current/users_guide/toc.html

Installing and Configuring McIDAS-X
http://www.unidata.ucar.edu/software/mcidas/current/users_guide/InstallingandConfiguringMcIDAS-X.html#17470

  Installing McIDAS-X on Unix or Mac OS X Workstations
  http://www.unidata.ucar.edu/software/mcidas/current/users_guide/InstallingMcIDAS-XonUnixorMacOSXWorkstations.html#63755

    Preparing the Workstation
    http://www.unidata.ucar.edu/software/mcidas/current/users_guide/PreparingtheWorkstation.html#25760

      Allocating Sufficient Shared Memory -> Shared Memory Configuration
      http://www.unidata.ucar.edu/software/mcidas/current/users_guide/workstation.html
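
To judge whether the configured limits are too low or segments are simply
piling up, it helps to total what each user currently holds.  This is a
rough sketch, again assuming the Linux 'ipcs -m' column order (key, shmid,
owner, perms, bytes, nattch).  The function name is made up for
illustration.

```shell
# Read `ipcs -m` output on stdin and print one line per owner:
#   owner  number_of_segments  total_bytes
shm_usage_by_owner() {
    awk '
        $2 ~ /^[0-9]+$/ { count[$3]++; bytes[$3] += $5 }
        END {
            for (o in count) printf "%s %d %d\n", o, count[o], bytes[o]
        }
    ' | sort
}

# Compare the totals against the system limits (Linux):
#   ipcs -m | shm_usage_by_owner
#   ipcs -lm
```

If one user's segment count keeps climbing from day to day, the scripts
are leaking segments and raising the limits will only delay the failure.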

re:
> The issue is critical because of our data download schedules and
> realtime processing are being interrupted.

I understand.

re:
> Any thoughts would be greatly appreciated.

Please review what I wrote above and let me know if you have any questions.

re:
> Thanks so much!

No worries.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: XDP-858749
Department: Support McIDAS
Priority: Normal
Status: Closed