
Re: 20040527: bigbird rises from the ashes :-)



Unidata Support wrote:
From: Gerry Creager N5JXS <address@hidden>
Organization: Texas A&M University -- AATLT
Keywords:  200405270011.i4R0BjtK010667 LDM Linux RAID JFS


Hi Gerry,


I do not yet propose to rename it 'phoenix'...


I was going to recommend that you do if all goes well :-)  I just
watched Flight of the Phoenix again (tenth time?) the other night, so
this was on my mind.

I want to see it fly farther than the next crash before we rename it. I love that movie!

re: chunk-size setup in /etc/raidtab


I did find several web references recently that recommended 4k chunking. I'd done larger chunks with the 'hardware' RAID, and we know the results, although these were likely muddied by other issues. I don't mind re-chunking if you want to try it, and I'd be willing to go to 128k for a test.
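For anyone following the chunk-size question, here is a rough sketch of how chunk size changes the number of member disks a single file touches. It assumes plain round-robin striping with no parity and a file that starts on a chunk boundary, which is a simplification and may not match the RAID level bigbird actually uses. The 48 KiB file size and the four-disk array are hypothetical numbers chosen only for illustration.

```python
def chunks_touched(file_size, chunk_size, n_disks):
    """Return (chunks, disks) used by one file under simple striping.

    Assumes round-robin striping without parity and a file that
    starts at the beginning of a chunk (both are simplifications).
    """
    n_chunks = -(-file_size // chunk_size)  # ceiling division
    return n_chunks, min(n_chunks, n_disks)


KIB = 1024
# Hypothetical 48 KiB data file on a hypothetical 4-disk array.
print(chunks_touched(48 * KIB, 4 * KIB, 4))    # (12, 4)
print(chunks_touched(48 * KIB, 128 * KIB, 4))  # (1, 1)
```

Under these assumptions a small chunk spreads each file across every disk, while a 128k chunk leaves a small file on a single disk. Which of the two is faster depends on the workload, so a test run is still the only real answer.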


Well, bigbird is cruising right now, so it is hard to argue with the 4K
chunk-size:

-- tail end of ~ldm/logs/bigbird.uptime

20040527.1628   0.91  1.26  1.82    8   0   8   1970  49M    0  0 scourBYnumber
20040527.1629   0.56  1.10  1.73    8   0   8   1983  48M    0  0 scourBYnumber
20040527.1630   1.13  1.17  1.71    8   0   8   2010  50M    0  0 scourBYnumber
20040527.1631   2.07  1.49  1.80    8   0   8   2021  48M    0  0 scourBYnumber
20040527.1632   1.21  1.34  1.72    8   0   8   2056  49M    0  0 scourBYnumber
20040527.1633   0.83  1.24  1.66    8   0   8   2067  49M    0  0 scourBYnumber
20040527.1634   0.65  1.12  1.59    8   0   8   2082  48M    0  0 scourBYnumber
20040527.1635   0.88  1.12  1.56    8   0   8   2101  49M    0  0 scourBYnumber
20040527.1636   0.50  0.96  1.48    9   0   9   2123  48M    0  0 scourBYnumber
20040527.1637   1.11  1.08  1.49    8   0   8   2155  49M    0  0 scourBYnumber
20040527.1638   0.98  1.06  1.45    8   0   8   2184  48M    0  0 scourBYnumber
20040527.1639   0.96  1.04  1.42    8   0   8   2204  48M    0  0 scourBYnumber
20040527.1640   1.87  1.24  1.46    8   0   8   2228  49M    0  0 scourBYnumber
20040527.1641   1.74  1.35  1.49    8   0   8   2258  48M    0  0 scourBYnumber
20040527.1642   0.87  1.18  1.42    8   0   8   2280  49M    0  0 scourBYnumber
20040527.1643   0.76  1.08  1.37    8   0   8   2288  49M    0  0 scourBYnumber
20040527.1644   1.56  1.19  1.38    8   0   8   2315  49M    0  0 scourBYnumber

These are the lowest load averages I have ever seen on bigbird while
the LDM and decoders are running!
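A minimal sketch for pulling the load averages out of a log like the one above, in case it helps compare runs before and after a change. It assumes the three numbers after the timestamp are the 1-, 5-, and 15-minute load averages (the listing has no header, so this is a guess based on the comment about load averages). The function names are hypothetical and the remaining columns are ignored.

```python
SAMPLE = """\
20040527.1628   0.91  1.26  1.82    8   0   8   1970  49M    0  0 scourBYnumber
20040527.1629   0.56  1.10  1.73    8   0   8   1983  48M    0  0 scourBYnumber
20040527.1630   1.13  1.17  1.71    8   0   8   2010  50M    0  0 scourBYnumber
"""


def load_averages(text):
    """Return a list of (timestamp, load1, load5, load15) tuples.

    Assumes fields 2-4 of each line are load averages; lines with
    fewer than four fields are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        rows.append((fields[0],
                     float(fields[1]), float(fields[2]), float(fields[3])))
    return rows


def peak_load(rows):
    """Return the highest 1-minute load average in the parsed rows."""
    return max(row[1] for row in rows)


rows = load_averages(SAMPLE)
print(len(rows), peak_load(rows))  # 3 1.13
```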

I suspect going back to ext3 may also help that. Its journaling scheme is fairly efficient, and JFS, while theoretically better for the file sizes we're seeing, isn't necessarily efficient at journaling.

Curiosity:  was the setup error in raidtab specifying that there were
spare disks that weren't there?  I didn't study the differences between
raidtab and raidtab.old last night...

I had 2 disks identified in there that didn't exist and I'd forgotten to remove them. I was trying to run spares on the secondary IDE channels and that wasn't helping. I've pruned the RAID size by one disk and made the pruned disk a spare now.
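A sketch of a sanity check that might catch this kind of raidtab mistake before the array is built. It assumes a single raiddev section using the usual raidtools keywords (nr-raid-disks, nr-spare-disks, device, raid-disk, spare-disk). The function name and the device names in the usage example are hypothetical, not bigbird's actual configuration.

```python
import os


def check_raidtab(text, exists=os.path.exists):
    """Return a list of problems found in one raiddev section.

    Compares the declared disk counts with the entries actually
    listed, and reports device paths that do not exist. The `exists`
    argument can be replaced for testing.
    """
    declared = {"nr-raid-disks": 0, "nr-spare-disks": 0}
    listed = {"raid-disk": 0, "spare-disk": 0}
    devices = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if len(fields) < 2:
            continue
        key, value = fields[0], fields[1]
        if key in declared:
            declared[key] = int(value)
        elif key in listed:
            listed[key] += 1
        elif key == "device":
            devices.append(value)

    problems = []
    for count_key, entry_key in (("nr-raid-disks", "raid-disk"),
                                 ("nr-spare-disks", "spare-disk")):
        if declared[count_key] != listed[entry_key]:
            problems.append(
                f"{count_key} is {declared[count_key]} but "
                f"{listed[entry_key]} {entry_key} entries are listed")
    for dev in devices:
        if not exists(dev):
            problems.append(f"{dev}: device node not found")
    return problems
```

For example, run against the live file with `check_raidtab(open("/etc/raidtab").read())`; an empty list means the counts agree and every listed device is present.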

I am all for staying with this setup if it works well.  Again, I am
just trying to learn as much as possible about RAID on Linux.  More and
more Unidata sites are moving to Linux (and Linux clusters), and
installing RAIDs since disks are so cheap (too bad the same can't be
said for memory!).

Indeed. I'd love to help identify a good config for hard disk/RAID and memory. I think that's going to be important in the long run.

By the way, my Google searches last night showed that O'Reilly has a
book out on Linux and RAIDs: Managing RAID on Linux.  I will be picking
up a copy of this tomorrow if it is in the store, otherwise I will be
ordering it ASAP.

I've got it. It's rather disparaging of s/w RAID, and of IDE RAID in general. While I would love to do a SCSI RAID, I can't afford the disks and most of the SCSI disks are considerably smaller... Most of what the author offers is a list of alternatives, with very few concrete suggestions. I used it as a guide, with interpolation from SCSI to IDE, for raidtab settings. There's a brief discussion of chunk sizing. I've seen better offerings from O'Reilly. I have a pretty complete library for reference.

Gotta go into the office. I've got a PlanetLab node dead and I've got to troubleshoot it before Dell will send a technician on-site to repair.

Gerry
--
Gerry Creager -- address@hidden
Network Engineering -- AATLT, Texas A&M University  
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843