
20030911: LDM question



>From: "Mark J. Laufersweiler" <address@hidden>
>Organization: OU
>Keywords: 200309111605.h8BG5kLd002485 LDM rpc.ldmd split feed

Hi Mark,

>We are getting ready to switch out our ldm machines here at OU and I
>have been poking around as to performance issues. (We are switching
>from a software disk raid to a hardware raid.)

OK.

>While doing a netstat to look at connections, I get the following:
>
>> netstat -a | grep ldm
>tcp4       0      0  stokes.unidata-ldm     snapcount.ocs.ou.49405 ESTABLISHED
>tcp4       0   1996  stokes.unidata-ldm     kelvin.ca.uky.ed.33071 ESTABLISHED
>tcp4       0     44  stokes.unidata-ldm     pluto.met.fsu.ed.46831 ESTABLISHED
>tcp4       0     44  stokes.unidata-ldm     pluto.met.fsu.ed.46824 ESTABLISHED
>tcp4       0     44  stokes.unidata-ldm     weather3.ca.uky..32771 ESTABLISHED
>tcp4       0  15048  stokes.unidata-ldm     weather3.ca.uky..32770 ESTABLISHED
>tcp4       0      0  stokes.2355            suomildm1.cosmic.unida ESTABLISHED
>tcp4       0   1288  stokes.unidata-ldm     stokes1.1025           ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     chinook.phsx.uka.48254 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     chinook.phsx.uka.48189 ESTABLISHED
>tcp4       0   2772  stokes.unidata-ldm     orchid.roc.noaa..62119 ESTABLISHED
>tcp4       0      0  stokes.3827            eldm.fsl.noaa.go.unida ESTABLISHED
>tcp4       0     44  stokes.unidata-ldm     ldm.iihr.uiowa.e.32848 ESTABLISHED
>tcp4       0     44  stokes.unidata-ldm     blizzard.storm.u.36106 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     papagayo.unl.edu.49558 ESTABLISHED
>tcp4       0     44  stokes.unidata-ldm     zephir.eas.slu.e.1305  ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     ocs059.ocs.ou.ed.41454 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     winkie.caps.ou.e.23848 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     munchkin.caps.ou.11333 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     bergeron.snr.mis.53693 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     bergeron.snr.mis.53692 ESTABLISHED
>tcp4       0     44  stokes.unidata-ldm     aqua.nsstc.uah.e.33558 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     nat.nssl.noaa.go.27254 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     blizzard.storm.u.33981 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     blizzard.storm.u.33979 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     blizzard.storm.u.33978 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     bergeron.snr.mis.53677 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     bergeron.snr.mis.53676 ESTABLISHED
>tcp4       0   1736  stokes.unidata-ldm     bergeron.snr.mis.53675 ESTABLISHED
>tcp4       0      0  stokes.unidata-ldm     bergeron.snr.mis.53674 ESTABLISHED
>tcp4       0      0  *.unidata-ldm          *.* LISTEN
>
>I am noticing that many of my downstream sites are connected more
>than once. How does this relate to/affect the performance of the hosting
>machine?

What is most likely going on is that your downstream sites have split
their feed requests to stokes, something that LDM-6 made easy.  To
verify that this is, in fact, what is going on, grep for 'topo' in your
~ldm/logs/ldmd.log* files.
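
A split feed is just multiple "request" lines for the same upstream
host in a downstream site's ~ldm/etc/ldmd.conf, each asking for a
different slice of a feed; each request line results in a separate
rpc.ldmd connection on stokes, which is why the same downstream host
shows up more than once in your netstat output.  As an illustration
only (the feed type, patterns, and hostname below are made up for the
example), a downstream ldmd.conf might contain:

  request CONDUIT "[09]$" stokes.your.domain
  request CONDUIT "[18]$" stokes.your.domain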

As to performance, yes, the extra connections do have an effect, but
given the small number of connections I see on stokes, I would say
that the effect is minimal.

>From top:
>
>last pid: 61742;  load averages:  3.40,  3.31,  3.84     up 4+09:35:21  15:17:
> 44
>88 processes:  1 running, 87 sleeping
>CPU states: 10.1% user,  0.0% nice,  6.2% system,  9.3% interrupt, 74.3% idle
>Mem: 703M Active, 82M Inact, 173M Wired, 44M Cache, 112M Buf, 1664K Free
>Swap: 1623M Total, 14M Used, 1609M Free
>
>  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
>  404 ldm       18   0  1947M 40504K pause  198:13  3.03%  3.03% pqact
>61445 ldm        2   0 24396K  7900K select   2:55  2.49%  2.49% dcgrib2
>  418 ldm        2   0  1945M 27700K select  46:51  1.17%  1.17% rpc.ldmd
>61694 ldm        2   0 20836K  1512K select   0:02  1.07%  1.07% dclsfc
>61742 ldm       31   0  2024K   968K RUN      0:00  1.94%  0.93% top
>  403 ldm       18   0  1945M 36048K pause   61:46  0.83%  0.83% pqbinstats
>61693 ldm        2   0 23096K  2552K select   0:03  0.39%  0.39% dcacft
>  406 ldm       18   0  1945M 74524K pause   18:04  0.15%  0.15% rtstats
>  405 ldm       18   0  1945M 36092K pause   12:11  0.10%  0.10% rtstats
>  493 ldm       18   0  1945M 36104K pause   11:03  0.10%  0.10% rpc.ldmd
>  535 ldm        2   0  1945M 37172K select   9:54  0.10%  0.10% rpc.ldmd
>  446 ldm        2   0  1945M 38528K select   9:16  0.10%  0.10% rpc.ldmd
>22743 ldm       18   0  1945M 37168K pause    5:52  0.10%  0.10% rpc.ldmd
>  440 ldm        2   0  1945M 37852K select  11:12  0.05%  0.05% rpc.ldmd
>  449 ldm        2   0  1945M 38384K select   9:51  0.05%  0.05% rpc.ldmd
>  408 ldm        2   0  1945M 32040K select  19:09  0.00%  0.00% rpc.ldmd
>  465 ldm       18   0  1945M 36828K pause   17:39  0.00%  0.00% rpc.ldmd
>  525 ldm       18   0  1945M 75704K pause   16:32  0.00%  0.00% rpc.ldmd

This listing does not reflect all of the connections that should be
running.  The number of connections from downstream sites can easily
be counted with:

netstat | grep stokes.unidata-ldm | wc -l
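
If you want a per-host breakdown rather than just a total, a standard
shell pipeline along these lines should do it (the awk field number
assumes the column layout of the netstat output above, where the
foreign address is the fifth column):

  netstat | grep stokes.unidata-ldm | awk '{print $5}' | \
      sort | uniq -c | sort -rn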

As far as the load average goes, I see that pqact is at the top of CPU
use, and that dcgrib2 is second.  My observation is that dcgrib2 pretty
much always shows up near the top of folks' top listings.  Are you
processing CONDUIT data with it?
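
If you are not sure, one quick check is to look at the pqact.conf
entries that invoke dcgrib2 and see which feed types drive them; for
example (assuming the usual ~ldm/etc location for the pattern-action
file, and keeping in mind that entries span tab-continued lines, so
the feed type may be a line or two above the match):

  grep -n dcgrib2 ~ldm/etc/pqact.conf
  grep -in conduit ~ldm/etc/pqact.conf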

Tom