[LDM #RCR-468652]: FW: metrics logs



Hi Clint,

I think your crontab entry

    * * * * 0 bin/ldmadmin newmetrics

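will create a new metrics file every minute on Sunday: the minute and hour
fields are both "*", so only the day-of-week field (0 = Sunday) restricts
when it runs.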
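
If the goal is one new file per week, as the comment in the crontab says,
the minute and hour fields need fixed values. A minimal sketch, assuming
midnight on Sunday is an acceptable rollover time (that time is my choice,
not something taken from your setup):

    # minute hour day-of-month month day-of-week
    0 0 * * 0 bin/ldmadmin newmetrics

Any fixed minute and hour would do; what matters is that neither field is
left as "*".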

> Fellow LDMers,
> 
> For some time, I've been monitoring our LDMs (on several machines) by
> plotting the 5-min loads in the metrics log and putting them on a web page I
> could check periodically.  Sometime overnight, on each of the machines,
> something strange happened.  Rather than adding to the existing log file, 
> each time the ldmadmin addmetrics command is executed by cron, a new metrics 
> file is created.  Here's what I'm seeing in the log directory now:
> 
> [ldm@idd ~]$ ls -l logs
> total 16
> -rw-r--r--. 1 ldm apps   0 Aug 24  2014 ldmd.log
> -rw-r--r--. 1 ldm apps   0 Aug 23  2014 ldm-mcidas.log
> -rw-r--r--. 1 ldm apps   0 Jun  7 10:36 metrics.txt
> -rw-r--r--. 1 ldm apps 112 Jun  7 10:35 metrics.txt.1
> -rw-r--r--. 1 ldm apps 111 Jun  7 10:30 metrics.txt.2
> -rw-r--r--. 1 ldm apps 112 Jun  7 10:25 metrics.txt.3
> -rw-r--r--. 1 ldm apps 110 Jun  7 10:20 metrics.txt.4
> 
> And here's the crontab for user ldm on that machine:
> 
> [ldm@idd ~]$ crontab -l
> #
> # Monitor system performance
> #
> */5 * * * * bin/ldmadmin addmetrics
> */5 * * * * ./plot_load.sh &> /dev/null
> #
> # New metrics file every week
> * * * * 0 bin/ldmadmin newmetrics
> 
> and here's the ldm registry:
> 
> [ldm@idd ~]$ regutil
> /delete-info-files : 0
> /hostname : idd.unl.edu
> /insertion-check-interval : 300
> /reconciliation-mode : do nothing
> /check-time/enabled : 1
> /check-time/limit : 10
> /check-time/warn-if-disabled : 1
> /check-time/ntpdate/command : /usr/sbin/ntpdate
> /check-time/ntpdate/servers : ntp.ucsd.edu ntp1.cs.wisc.edu ntppub.tamu.edu otc1.psu.edu timeserver.unidata.ucar.edu
> /check-time/ntpdate/timeout : 5
> /metrics/count : 4
> /metrics/file : /usr/local/ldm/logs/metrics.txt
> /metrics/files : /usr/local/ldm/logs/metrics.txt*
> /metrics/netstat-command : /bin/netstat -A inet -t -n
> /metrics/top-command : /usr/bin/top -b -n 1
> /log/count : 7
> /log/file : /usr/local/ldm/var/logs/ldmd.log
> /log/rotate : 1
> /pqsurf/config-path : /usr/local/ldm/etc/pqsurf.conf
> /pqsurf/datadir-path : /usr/local/ldm/var/data
> /scour/config-path : /usr/local/ldm/etc/scour.conf
> /surf-queue/path : /usr/local/ldm/var/queues/pqsurf.pq
> /surf-queue/size : 2000000
> /server/config-path : /usr/local/ldm/etc/ldmd.conf
> /server/ip-addr : 0.0.0.0
> /server/max-clients : 256
> /server/max-latency : 3600
> /server/port : 388
> /server/time-offset : 3600
> /queue/path : /usr/local/ldm/var/queues/ldm.pq
> /queue/size : 4G
> /queue/slots : default
> /pqact/config-path : /usr/local/ldm/etc/pqact.conf
> /pqact/datadir-path : /usr/local/ldm/var/data
> 
> If I run ldmadmin addmetrics manually, it appends to the logs/metrics.txt 
> file normally.
> 
> Has anyone else ever experienced this type of behavior?  As I said, it
> occurred on more than one machine (6, to be exact) overnight; some running
> CentOS 6, some running CentOS 7.  It had been working fine for months up
> until today. I've attached a sample plot from one of the machines.
> 
> Thanks,
> Clint
> 
> ====================================================================
> Clinton M. Rowe
> Professor and Graduate Chair                     phone:(402)472-1946
> Earth & Atmospheric Sciences                       fax:(402)472-4917
> University of Nebraska-Lincoln
> address@hidden

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: RCR-468652
Department: Support LDM
Priority: Normal
Status: Closed