Re: computer freezes

Gabe Langbauer wrote:
Thanks Art,

I've checked out the hardware and it seems fine...We don't have any NFS
mounted, so that's not a problem...I've checked all system logs, none
produce anything...I've done a little logging now and it appears that a
gempak "gf" program that runs at about the same time as my cleanup script
"runs away" such that it runs for several hours, taking up 99% of one of
our CPUs, then CRASH!!!

Gabe,
I would have responded to this thread earlier, but I was out late last week. We had tons of problems last fall with our LDM server running Linux, and they sound very similar to the behavior you are describing. I had migrated our LDM from FreeBSD to Linux (Slackware, not that it makes much difference) and had huge problems with I/O waits bogging the entire system down. Our LDM server runs numerous gempak scripts from cron, which would frequently go weird on me and add to the system load. The LDM runs on a dual-CPU Dell with a hardware RAID-5 SCSI controller (384MB cache on the controller). I tweaked the frequency and configuration of our scour scripts endlessly and never could get the system to run reliably. It would eventually have numerous instances of scour all competing for the disk. I spent a LOT of time and effort tweaking the filesystem parameters, kernel configuration, application configurations, etc. Regardless, it would eventually load up to the point that it became totally unresponsive and I could not log in on the console to clean things up. We have a two-channel NOAAport feed and file nearly everything for both gempak and mcidas, so our load is probably a bit atypical.
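
For what it's worth, one common guard against cron jobs piling up on each other is to run each job under an exclusive lock with a hard timeout. The sketch below is only an illustration of that idea, not something from our setup: the function name `run_exclusive`, the lock file path, and the timeout value are all hypothetical, and it assumes a Unix system with Python 3.

```python
import fcntl
import subprocess


def run_exclusive(cmd, lock_path, timeout_s):
    """Run cmd only if no other run holds lock_path; kill it after timeout_s.

    Returns ("skipped", None) if the lock is busy, ("timeout", None) if the
    command had to be killed, or ("done", exit_code) otherwise.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fail at once if a previous
            # run is still holding it, instead of queueing up behind it.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return ("skipped", None)
        try:
            # subprocess.run kills the direct child when the timeout expires.
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            return ("timeout", None)
        return ("done", result.returncode)
    # The lock is released when lock_file is closed on leaving the with block.
```

A cron wrapper might call something like `run_exclusive(["/path/to/cleanup-script"], "/tmp/cleanup.lock", 3600)`, where the script path, lock path, and one-hour limit are placeholders. Note that the timeout only kills the direct child, so a script that spawns its own background processes would need extra handling.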

Eventually, I tired of babysitting the server and migrated the system back to FreeBSD. The combination of the runaway gempak processes and scour problems proved to be too much hassle. Since then, the system has run nearly flawlessly. The only problems I've had were created by my own actions: tweaking a script or something of that nature. The system I/O waits are about 1/10 of what they were with Linux (using ext3, reiserfs, or XFS) when running the LDM scour. The scour now takes about 1/20th the time that it did under Linux. Gempak scripts behave *much* more predictably as well. Personally, I prefer to run Linux whenever possible, as I've been using it since '94 or '95, but I think it could be worth your time to look at FreeBSD or Solaris x86, as I'd expect either to perform much better under high load.


--
Mark Tucker
Meteorology Dept. Systems Administrator
Lyndon State College
http://apollo.lsc.vsc.edu
mark.tucker@xxxxxxxxxxxxxxx
(802)-626-6328

