
20040508: bigbird LDM and GEMPAK setup



>From:  Gerry Creager N5JXS <address@hidden>
>Organization:  Texas A&M University -- AATLT
>Keywords:  200405081241.i48CfXtK004315 LDM GEMPAK ntpdate ntpd

Hi Gerry,

>All appears happy at this point.  Thanks for the help.

It appears that the LDM was stopped at about 2355Z yesterday and
the queue was deleted and remade.  I see this in the log output of the
performance monitoring script that I set up, /home/ldm/logs/bigbird.uptime:

--  date.time  load1 load5 load15  rec feed tot age   mem  swap  scouring

20040507.2350   1.64  1.75  1.87    2   0   2   5040  48M   1M  0 prune_dir
20040507.2351   1.96  1.92  1.93    2   0   2   5084  49M   1M  0 prune_dir
20040507.2352   2.11  1.94  1.93    2   0   2   5132  49M   1M  0 prune_dir
20040507.2353   1.67  1.89  1.91    2   0   2   5180  49M   1M  0 prune_dir
20040507.2354   1.16  1.71  1.85    2   0   2   5226  51M   1M  0 prune_dir
20040507.2355   1.16  1.62  1.81    2   0   2   5276  46M   1M  0 prune_dir
20040507.2356   4.43  2.35  2.04    2   0   2     30 123M   1M  0 prune_dir
20040507.2357  15.10  5.41  3.09    2   0   2     90  49M   1M  0 prune_dir
20040507.2358  30.85 11.54  5.33    2   0   2    150  50M   1M  0 prune_dir
20040507.2359  55.88 22.20  9.37    2   0   2    209  48M   1M  0 prune_dir
20040508.0000  42.46 24.82 11.10    2   0   2    270  48M   2M  0 prune_dir
20040508.0001  33.47 25.40 12.15    2   0   2    330  49M   2M  0 prune_dir

You can follow the age of the oldest product in the queue (age column)
and see it grow and shrink as CONDUIT data ebbs and flows.  When it drops
to near zero, it means that the queue has most likely been deleted and
remade.  This is reinforced by the load average rocketing at the same
time (the 1-minute load goes from 1.16 at 2355Z to 55.88 at 2359Z).
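
The check described above can be sketched in a few lines of Python.  This
is only an illustration, not part of the monitoring script: the function
name find_queue_resets and the drop_ratio threshold are hypothetical, and
the column position of the age field is assumed from the header line of
the listing above.

```python
def find_queue_resets(lines, drop_ratio=0.1):
    """Flag samples where the oldest-product age collapses.

    A sample is flagged when its age is below drop_ratio times the
    previous sample's age.  Both the threshold and the field index
    are assumptions based on the listing above.
    """
    resets = []
    prev_age = None
    for line in lines:
        fields = line.split()
        # Skip blank lines, the header row, and anything too short.
        if len(fields) < 8 or not fields[0][:8].isdigit():
            continue
        age = int(fields[7])  # 8th column: age of oldest product (seconds)
        if prev_age is not None and prev_age > 0 and age < prev_age * drop_ratio:
            resets.append((fields[0], prev_age, age))
        prev_age = age
    return resets


sample = [
    "--  date.time  load1 load5 load15  rec feed tot age   mem  swap  scouring",
    "",
    "20040507.2354   1.16  1.71  1.85    2   0   2   5226  51M   1M  0 prune_dir",
    "20040507.2355   1.16  1.62  1.81    2   0   2   5276  46M   1M  0 prune_dir",
    "20040507.2356   4.43  2.35  2.04    2   0   2     30 123M   1M  0 prune_dir",
    "20040507.2357  15.10  5.41  3.09    2   0   2     90  49M   1M  0 prune_dir",
]

for stamp, before, after in find_queue_resets(sample):
    print(f"{stamp}: age fell from {before} to {after} s; queue probably remade")
```

A sharp drop in age is only a hint; as noted above, a simultaneous jump in
load average makes the interpretation more convincing.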

Did you delete the queue?  If yes, any particular reason why?  Just
curious...

>That said:  Chiz released Gempak 5.7p2 yesterday.

Yes.  I was not aware that he was on the brink of releasing 5.7p2.
If I had been, I might have suggested waiting before setting up
GEMPAK stuff.

>I'm now going to try 
>to get it alive, and add the new pqact instances.  Hopefully I won't 
>wedge something in the process

The big difference between p1 and p2 is the ability to read the bzip2
compressed Level II data directly.  Chiz changed the CRAFT pqact.gempak
actions to FILE the data and not run it through dcnexr2.  This will
dramatically cut down on system CPU and disk use for that data, but
will add more work for the applications that do the display (since they
will have to uncompress the data each time they want to use it).  The
downside at the moment is that the IDV does not (yet) work with the
bzip2 compressed files.

I suggest leaving bigbird configured the way it is for a bit so we can
study its performance while decoding Level II and CONDUIT data.  I am
of the mind to also turn on all decoding so that we can compare your
dual Xeon system against our dual Athlon 2800+.  Both systems are
running the same Fedora Core 1 kernel, 2.4.22-1.2188.nptlsmp (even
though mine is for Athlon), so this will be an interesting comparison.

I also want to turn on scouring in the same way that I am running it on
our dual Athlon 2800+ machine.

>In looking at the performance info (latency, etc) bigbird's timing is 
>all over the graph.

You have to be careful when looking at CRAFT latencies.  Not all of the
NEXRAD inject systems have synchronized clocks yet, so some of them
continually show latencies in the future or not so near past.  CONDUIT
latencies look good.
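
To make the clock issue concrete, here is a minimal sketch of how a
latency figure can go negative.  It assumes latency is computed as
receipt time minus the product's creation time; the function names and
the 3600-second cutoff for "not so near past" are illustrative choices,
not values taken from the LDM.

```python
from datetime import datetime, timezone


def latency_seconds(product_time, receipt_time):
    """Receipt time minus the creation time stamped on the product."""
    return (receipt_time - product_time).total_seconds()


def classify(latency, max_plausible=3600.0):
    """Label a latency; max_plausible is an arbitrary illustrative cutoff."""
    if latency < 0:
        return "future"   # injecting clock is ahead of the receiver
    if latency > max_plausible:
        return "stale"    # injecting clock is far behind, or data really is old
    return "ok"


received = datetime(2004, 5, 8, 12, 0, 0, tzinfo=timezone.utc)
# Hypothetical product stamped 90 seconds after it was received here.
stamped = datetime(2004, 5, 8, 12, 1, 30, tzinfo=timezone.utc)

lat = latency_seconds(stamped, received)
print(lat, classify(lat))
```

A well-synchronized receiver will still report odd latencies if the
injecting system's clock is wrong, which is the situation described above.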

>Could you give me the hostname of your ntp server 
>again?  I'm thinking that Stratum-1's are not created equally.  I'm 
>currently referencing tick.uh.edu, which is supposedly a -1 source.

I jumped onto bigbird this morning and mucked with clock setting:

- edited /etc/init.d/ntpd and changed:

chkconfig: - 58 74

  to:

chkconfig: 345 58 74

- I then turned ntpd off:

chkconfig --level 2345 ntpd off

- finally, I added a job to run ntpdate from root's crontab:

MAILTO=""
#
# Set the clock using ntpdate
0,15,30,45 * * * * /usr/sbin/ntpdate timeserver.unidata.ucar.edu

The clock on bigbird was well synchronized before the change,
so this was not really necessary.  We can undo the ntpdate stuff and
reenable running ntpd later if desired.

>I'm moving my lab in a month and I plan to institute GPS time at that 
>point in the new lab.  maybe, just maybe, that'll fix this for me.

The clock on bigbird wasn't off.  The problem is the time on products
from several of the NEXRADs.  This will be fixed as Build 5 is installed
at the various radars.

>Again, thanks for the help on short notice!

No worries.  That was pretty much a record install: LDM and GEMPAK
in under 10 minutes. :-)

Cheers,

Tom

>From address@hidden  Sat May  8 11:51:09 2004

>Howdy!

re: Did you delete the queue?  If yes, any particular reason why?  Just
curious...

>Yes.  I wanted a clean restart _after_ checking some issues with ntp 
>that I hoped would clean up some of the time discrepancies between us 
>and you guys.

>Didn't help that I could see.

re: I suggest leaving bigbird configured the way it is for a bit so we can
study its performance while decoding Level II and CONDUIT data.  I am
of the mind to also turn on all decoding so that we can compare your
dual Xeon system against our dual Athlon 2800+.  Both systems are
running the same Fedora Core 1 kernel, 2.4.22-1.2188.nptlsmp (even
though mine is for Athlon), so this will be an interesting comparison.

>OK.  WILCO.

re: I also want to turn on scouring in the same way that I am running it on
our dual Athlon 2800+ machine.

>scour is currently on.  Feel free to change it and see what can be done 
>on the "bigger smoke" machines to head toward an optimum settings config.

re: setting up ntpdate

>Redhat and Fedora Core have a 'redhat-config-time' command that allows 
>you to select the time server of your choice; works pretty well.  That's 
>how I'd config'd tick.uh.edu

>Wish me luck.  I'm off to IOOS Techs in DC next week, then proposal 
>writing en masse for 2 days for the SURA SCOOP project.  I expect to see 
>OPeNDAP and DODS strongly supported there, as well as some commentary on 
>LDM.  Especially since I'm going to be raising the issues on data 
>aggregation and distribution...

>Take care; have a good weekend!
>gerry
>-- 
>Gerry Creager -- address@hidden
>Texas Mesonet -- AATLT, Texas A&M University   
>Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
>Page: 979.228.0173
>Office: 903A Eller Bldg, TAMU, College Station, TX 77843