
[LDM #VWJ-362177]: ldm will not start - hupsyslog issue?



Hi Chris,

I moved the LDM installation on blizzard from /home/ldm to /local/ldm
today.  I did this by:

<as 'ldm' before the change>

- delete all LDM binaries in ~ldm/ldm-6.10.1/bin
- remove the ~ldm/logs link
- delete the LDM queue in ~ldm/var/queues

<as 'root'>

- create /local
- create /local/ldm
- change ownership of /local/ldm to ldm:users
- modified /etc/passwd entries for 'ldm' to match its new home
- change ownership of /data/local/ldm to ldm:users

<as 'ldm' after the changes by 'root'>
- move everything in /home/ldm to /local/ldm
- create /data/local/ldm/logs
- softlink ~ldm/logs to /data/local/ldm/logs
- rebuilt and installed LDM-6.10.1
- created the ~ldm/decoders and ~ldm/util directories

<as 'root'>
- reviewed the LDM-related entries in the rsyslog configuration file
  /etc/rsyslog.conf... found that the entry pointing to the LDM
  log file had _not_ been changed by the new installation.

  This was not surprising since the LDM installation assumes that
  once it has configured /etc/rsyslog.conf, it doesn't need to do
  it again.

  I edited /etc/rsyslog.conf and set the LDM log file entry to
  point at its physical location, /data/local/ldm/logs/ldmd.log
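
One way to confirm which file rsyslogd is configured to use for LDM
messages is to pull the local0 entry out of the configuration file.
This is only a minimal sketch and not part of the LDM distribution:
the function name 'ldm_log_target' is hypothetical, and it assumes the
LDM entry is a line whose selector begins with 'local0.' (the facility
used for LDM logging on this system).

```shell
# Print the destination configured for the local0 facility in an
# rsyslog-style configuration file (default: /etc/rsyslog.conf).
# Hypothetical helper; assumes the LDM entry starts with "local0.".
ldm_log_target() {
    conf=${1:-/etc/rsyslog.conf}
    # Field 1 is the selector, field 2 the action (log file).
    # A leading "-" on the action (asynchronous write) is stripped.
    awk '$1 ~ /^local0\./ { sub(/^-/, "", $2); print $2 }' "$conf"
}
```

Running 'ldm_log_target' with no argument reads /etc/rsyslog.conf; the
path it prints should match the physical location of the LDM log file.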

<as 'ldm'>
- tested the newly built 'hupsyslog' to make sure that it could
  send a HUP signal to the rsyslogd daemon; it could

- tested to see if 'ldm' could log to the new LDM log file:

  logger -p local0.debug 'test of ldm logging'

  I found that this did _not_ work!?
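
The logger test above can be made self-checking by tagging the message
with a unique marker and then looking for that marker in the log file.
What follows is a rough sketch under stated assumptions: the function
names are hypothetical, the default log path is the one used on
blizzard, and the one-second pause is a guess at how long rsyslogd
needs to write the message.

```shell
# Return success if the marker string appears in the given log file.
log_has_marker() {
    grep -F -q -- "$1" "$2"
}

# Send a uniquely tagged message to local0.debug, then look for it.
# Hypothetical helper; the default path is the LDM log on blizzard.
check_ldm_logging() {
    logfile=${1:-/data/local/ldm/logs/ldmd.log}
    marker="ldm-logging-test-$$-$(date +%s)"
    logger -p local0.debug "$marker"
    sleep 1   # assumed to be enough time for rsyslogd to write
    log_has_marker "$marker" "$logfile"
}
```

A non-zero exit status from 'check_ldm_logging' means the message never
reached the log file, which is the symptom described here.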

<as 'root'>
- reviewed /var/log/messages to see what was happening with
  rsyslogd logging of LDM log messages

  I found that rsyslogd had not reread its configuration file,
  or, at least, it had not updated its in-memory configuration
  for the location of the LDM log file

- restarted rsyslogd:

  service rsyslog restart

<as 'ldm'>
- re-tested LDM logging using logger (see above)

  This time logging worked as it should

- reviewed LDM registry settings in ~ldm/etc/registry.xml

  Since longtime GEMPAK users are used to having GEMPAK
  pattern-action file actions write to the ~ldm/data directory
  structure, I modified the <pqact> and <pqsurf> entries
  <datadata-path> to be /local/ldm.

  I also modified the <log> registry entry to use the link
  reference to the LDM log file: /local/ldm/logs/ldmd.log.

- reviewed LDM REQUEST entries in ~ldm/etc/ldmd.conf

  I saw that it was set up to request WMO|UNIWISC from
  storm5.atms.unca.edu.   Aside: WMO|UNIWISC is the same
  thing as UNIDATA, so the following two REQUEST lines
  are equivalent:

REQUEST WMO|UNIWISC ".*" storm5.atms.unca.edu

REQUEST UNIDATA ".*" storm5.atms.unca.edu

- added an ldmd.conf ALLOW entry for storm5.atms.unca.edu:

ALLOW   ANY     ^storm5\.atms\.unca\.edu\.?$    .*

- noticed that the firewall on blizzard is _not_ set up to allow
  inbound requests on port 388

  This would need to be changed if you want to contact the LDM
  on blizzard from the LDM on storm5: a rule allowing inbound
  requests on port 388 would need to be added to /etc/sysconfig/iptables,
  and iptables would need to be restarted ('service iptables restart').
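
A quick way to see whether a firewall rules file already admits LDM
traffic is to search it for an ACCEPT rule on port 388.  This is a
sketch only: the function name 'allows_ldm_port' is hypothetical, and
the rule shown in the comment assumes the usual layout of
/etc/sysconfig/iptables on RedHat Enterprise 6; the file on a given
machine may be organized differently.

```shell
# Return success if an iptables rules file has an INPUT rule that
# ACCEPTs traffic to destination port 388 (hypothetical helper).
#
# Assumed form of such a rule; it must appear before any final
# REJECT rule in the INPUT chain:
#   -A INPUT -m state --state NEW -m tcp -p tcp --dport 388 -j ACCEPT
allows_ldm_port() {
    rules=${1:-/etc/sysconfig/iptables}
    grep -E -q -- '^-A INPUT .*--dport 388 .*-j ACCEPT' "$rules"
}
```

Reading /etc/sysconfig/iptables normally requires 'root'.  The same
check applies to any machine that must accept LDM connections.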

At this point, I figured that the LDM on blizzard should be ready to
start REQUESTing data from storm5.  To test this I ran 'ldmping' and
'notifyme'; both failed.  Strangely, the failure of the 'notifyme'
invocation:

notifyme -vl- -h storm5.atms.unca.edu

seemed to indicate some sort of a routing problem:

> notifyme -vl- -h storm5.atms.unca.edu
Aug 20 20:03:17 notifyme[18798] NOTE: Starting Up: storm5.atms.unca.edu: 20120820200317.203 TS_ENDT {{ANY,  ".*"}}
Aug 20 20:03:17 notifyme[18798] NOTE: LDM-5 desired product-class: 20120820200317.203 TS_ENDT {{ANY,  ".*"}}
Aug 20 20:03:17 notifyme[18798] INFO: Resolving storm5.atms.unca.edu to 152.18.68.172 took 0.000343 seconds
Aug 20 20:03:17 notifyme[18798] ERROR: NOTIFYME(storm5.atms.unca.edu): 12: h_clnt_create(storm5.atms.unca.edu): No route to host

The 'No route to host' message is what made me think that there was some
kind of a routing problem.

At this point, our lead system administrator and I got together to discuss
the 'No route to host' indication.  After playing around for a while, we
decided to run 'tcpdump' (as 'root') in one window while running the
'notifyme' invocation above in another window (as 'ldm').  'tcpdump'
indicated the following for the connection being attempted by 'notifyme':

'unreachable - admin prohibited'

At the same time, a 'notifyme' from blizzard to the Unidata-operated
toplevel IDD cluster, idd.unidata.ucar.edu, worked fine.

The combination of these two things strongly suggests that the firewall
on storm5 is not set up to allow inbound traffic on port 388.  Please check
to see if this is the case and, if it is, add an appropriate entry to
/etc/sysconfig/iptables (and potentially to /etc/sysconfig/ip6tables).
Remember that after modifying the firewall configuration file, the
firewall needs to be restarted ('service iptables restart' as 'root').
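
The 'tcpdump' check described above can be repeated after the firewall
change to confirm that the rejection is gone.  The sketch below is an
illustration under assumptions, not the exact procedure we used: the
function name is hypothetical, and the tcpdump options and host filter
in the comment are one plausible invocation.

```shell
# Read tcpdump output on stdin; succeed if it shows an ICMP
# "admin prohibited" rejection, which a firewall REJECT rule
# typically generates (hypothetical helper).
saw_admin_prohibited() {
    grep -q 'unreachable - admin prohibited'
}

# Assumed usage, as 'root' in one window while 'notifyme' runs as
# 'ldm' in another:
#   tcpdump -l -n -c 20 host storm5.atms.unca.edu | saw_admin_prohibited
```

A zero exit status means the rejection was seen in the capture; once
port 388 is open on storm5, the same capture should no longer show it.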

While looking around, I happened to notice that the GEMPAK settings
in 'ldm's environment referred to GEMPAK6.6.0.  This was due to the
NAWIPS configuration line in /home/gempak/GEMPAK6.6.0/Gemenviron.
I changed this from:

setenv NAWIPS /home/gempak/GEMPAK6.6.0

to:

setenv NAWIPS /home/gempak/NAWIPS

Now the source '/home/gempak/NAWIPS/Gemenviron' invocation in
~ldm/.cshrc will use the NAWIPS link in /home/gempak.

The other thing I noticed was the reference to the ldm-mcidas
decoder 'pnga2area' in ~ldm/etc/pqact.conf.  I do not see that
the ldm-mcidas decoders have been built on blizzard.
Since blizzard is running a new RedHat Enterprise version (6.3),
you will likely run into problems building the ldm-mcidas-2008
distribution.  Because of this, I copied over a statically-linked
(at least as far as the LDM libraries are concerned) distribution
of ldm-mcidas-2012 (as yet unreleased) from my CentOS 6.3 development
system.

I then copied relevant decoders and configuration files to
appropriate directories:

<as 'ldm'>
cd ~ldm
tar xvzf ldm-mcidas-2012.tar.gz

cd ldm-mcidas-2012/bin
cp area2png pnga2area zlibg2gini ~/decoders
cd ../etc
cp SAT* ~/etc

Wrap-up:

I think that the LDM is now set up correctly on blizzard and
should be able to REQUEST data from storm5 as soon as the firewall
on storm5 is configured to allow the port 388 requests.

Please let us know if you have any questions about what was done
or what remains to be done.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: VWJ-362177
Department: Support LDM
Priority: Normal
Status: Closed