
20041230: Bigbird.tamu.edu status



>From: Gerry Creager n5jxs <address@hidden>
>Organization: AATLT, Texas A&M University
>Keywords: 200412201755.iBKHtxlI027412 IDD

Hi Gerry,

I logged onto bigbird today and noticed that the load average was
very high (15-20).  A quick 'df -k' showed that the /data file
system was full.  Closer examination showed that the /data file system
is now very small relative to the amount of data that the LDM has been
configured to FILE in it:

[ldm@bigbird VIS]$ df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda2             56997280  11037704  43064264  21% /
/dev/hda1               101086     21240     74627  23% /boot
none                   1036156         0   1036156   0% /dev/shm
/dev/sda1            196079060 182878548  13200512  94% /data
10.2.9.100:/data1/bigbird
                     1446051616 565418624 807177792  42% /safety
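
For reference, a check like the one above could be scripted.  The following
is only a sketch, not something running on bigbird: the function names and
the 90% threshold are made up for illustration, and the parser assumes mount
points contain no spaces.  The percentage is computed as
used / (used + available), rounded up, which reproduces the Use% column in
the listing above.

```python
import math

# 'df -k' listing copied from this message; the NFS entry wraps onto two lines.
DF_OUTPUT = """\
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda2             56997280  11037704  43064264  21% /
/dev/hda1               101086     21240     74627  23% /boot
none                   1036156         0   1036156   0% /dev/shm
/dev/sda1            196079060 182878548  13200512  94% /data
10.2.9.100:/data1/bigbird
                     1446051616 565418624 807177792  42% /safety
"""


def parse_df(text):
    """Return {mount_point: (used_kb, available_kb)} from 'df -k' output."""
    rows = {}
    pending = None  # device name carried over from a wrapped line
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if not fields:
            continue
        if len(fields) == 1:  # long device name printed on its own line
            pending = fields[0]
            continue
        if pending is not None:
            fields = [pending] + fields
            pending = None
        _device, _total, used, avail, _pct, mount = fields[:6]
        rows[mount] = (int(used), int(avail))
    return rows


def use_percent(used_kb, avail_kb):
    """Percent used, rounded up, relative to used + available space."""
    total = used_kb + avail_kb
    return math.ceil(100 * used_kb / total) if total else 0


def nearly_full(text, threshold=90):
    """List mount points whose usage is at or above the (hypothetical) threshold."""
    return [mount for mount, (used, avail) in parse_df(text).items()
            if use_percent(used, avail) >= threshold]


print(nearly_full(DF_OUTPUT))
```

With the listing above, only /data crosses the 90% mark.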

When I logged on, /data was 100% used.  I deleted three days of NEXRAD
Level II data, and got enough space to "breathe".  After doing this, I
watched the load average drop dramatically:

20041230.2214  14.59 14.49 13.42   24  11  35   3721  13M  17M  0 scourBY(number|day)
20041230.2215  13.24 14.18 13.38   24  11  35   3630   3M  17M  0 scourBY(number|day)
20041230.2217   9.93 12.45 12.83   24  11  35   3610  10M  16M  0 scourBY(number|day)
20041230.2218  10.27 11.80 12.57   24  11  35   3626   3M  15M  0 scourBY(number|day)
20041230.2219  10.49 11.53 12.42   24  11  35   3588   3M  15M  0 scourBY(number|day)
20041230.2220   8.20 10.75 12.10   24  11  35   3521   3M  15M  0 scourBY(number|day)
20041230.2221   4.38  9.24 11.49   24  11  35   3533   4M  15M  0 scourBY(number|day)
20041230.2221   2.86  7.91 10.89   24  11  35   3543   3M  15M  0 scourBY(number|day)
20041230.2223   5.50  7.62 10.60   24  11  35   3560   2M  15M  0 scourBY(number|day)
20041230.2223   6.04  7.32 10.30   24  11  35   3545   3M  15M  0 scourBY(number|day)
20041230.2225   4.19  6.59  9.86   24  11  35   3509   3M  15M  0 scourBY(number|day)
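
The drop is easier to see if the load averages are pulled out of the
listing.  This is a rough sketch only: it assumes the three numbers after
the timestamp are the 1-, 5- and 15-minute load averages, it ignores the
remaining columns, and the function name is invented for the example.

```python
import re

# First and last samples from the listing above.
LOG = """\
20041230.2214  14.59 14.49 13.42   24  11  35   3721  13M  17M  0 scourBY(number|day)
20041230.2225   4.19  6.59  9.86   24  11  35   3509   3M  15M  0 scourBY(number|day)
"""

# Timestamp followed by three decimal numbers (assumed to be load averages).
SAMPLE = re.compile(r"^(\d{8}\.\d{4})\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)")


def load_samples(text):
    """Return (timestamp, load1, load5, load15) for each line that matches."""
    samples = []
    for line in text.splitlines():
        match = SAMPLE.match(line)
        if match:  # wrapped continuation lines simply do not match
            stamp, *loads = match.groups()
            samples.append((stamp, *map(float, loads)))
    return samples


samples = load_samples(LOG)
first, last = samples[0], samples[-1]
print(f"1-minute load went from {first[1]} at {first[0]} to {last[1]} at {last[0]}")
```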

Also, after freeing up some space, I noticed that the CPU use by the
rpc.ldmd processes that are feeding downstream machines climbed.  This
told me that they were trying to catch up on feeds that had been slowed
while bigbird was laboring under the lack of space in /data.

The questions I need to ask are:

- why is /data so small?

- is /data scheduled to be expanded back into the > 1 TB range that it
  used to be in?

If the size of /data will not be increased in the very near future, I
strongly recommend turning off all LDM pqact processing actions so that
LDM data relaying will not be adversely affected.

Have a great New Year!

Cheers,

Tom
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.

>From address@hidden  Thu Dec 30 20:07:01 2004

I've gotta look into this: I upgraded the hardware to a total of 2.3TB 
yesterday.  I'll go kill some stuff and see what's happening.

I've been working on some other buglets, and didn't notice that!

I'll be back with you!
gerry