[ldm-users] 20200424: Re: ldm data dir question

Hi Everyone,

The easiest way to get a lot of information about an LDM is to:

<as 'ldm'>
ldmadmin config

The next thing is to see about disk space:

df -h

The next thing is to see about RAM:

cat /proc/meminfo

I think it would be most useful to see the output from all of these
commands on Jack's system.
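
For convenience, the three commands above can be wrapped into one report to paste into a reply. This is only a rough sketch: collect_ldm_diagnostics is a made-up helper name (not an LDM utility), and it assumes a Linux host where /proc/meminfo exists.

```shell
#!/bin/sh
# Sketch: gather the three requested outputs into a single report.
# collect_ldm_diagnostics is a hypothetical helper, not part of LDM.
collect_ldm_diagnostics() {
    echo "== ldmadmin config =="
    if command -v ldmadmin >/dev/null 2>&1; then
        ldmadmin config
    else
        echo "ldmadmin not found in PATH (run this as the 'ldm' user)"
    fi

    echo "== df -h =="
    df -h 2>&1 || echo "df reported an error"

    echo "== /proc/meminfo =="
    if [ -r /proc/meminfo ]; then
        cat /proc/meminfo
    else
        echo "/proc/meminfo not readable on this system"
    fi
}
```

Running "collect_ldm_diagnostics > ldm-diag.txt" as the 'ldm' user would leave everything in one file.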

Cheers,

Tom

On 4/24/20 3:45 PM, Gerry Creager - NOAA Affiliate via ldm-users wrote:
I'm also interested in the size of the product queue (look in ~ldm/etc/registry.xml for the queue size) vs. the amount of RAM available. It sounds like you could be hammering system memory.
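
One rough way to put numbers on that comparison is to measure the queue file on disk against MemTotal. The sketch below is an assumption-laden helper (queue_vs_ram is a hypothetical name); it looks at the file size rather than the registry setting, and the optional second argument exists only so it can be pointed at a sample file instead of /proc/meminfo.

```shell
#!/bin/sh
# Sketch: compare the on-disk size of the product queue with total RAM.
# queue_vs_ram is a hypothetical helper, not part of LDM.
queue_vs_ram() {
    queue_file=$1
    meminfo=${2:-/proc/meminfo}

    # Size of the queue file in kB (wc -c reports bytes).
    queue_bytes=$(wc -c < "$queue_file") || return 1
    queue_kb=$((queue_bytes / 1024))

    # MemTotal is reported in kB, e.g. "MemTotal:  8009444 kB".
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' "$meminfo")
    [ -n "$mem_kb" ] || return 1

    echo "queue_kb=$queue_kb mem_kb=$mem_kb percent_of_ram=$((queue_kb * 100 / mem_kb))"
}
```

For the system in this thread the call would presumably be "queue_vs_ram /home/ldm/var/queues/ldm.pq".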

gerry

On Fri, Apr 24, 2020 at 8:44 PM Mike Zuranski <zuranski.wx@xxxxxxxxx> wrote:

    Hi Jack,

    First thing I want to point out is (barring any symlink or similar
    shenanigans) your product queue is not under /home/ldm/var/data/.
    As shown by LDM's error message, the product queue is the
    /home/ldm/var/queues/ldm.pq file.  That single file will house the
    entire queue, so you wouldn't see excessive files from that.

    That being said, the times I've had issues like yours with not being
    able to log in or issue commands, it was usually because of either a
    full root partition ("/"), full /tmp partition (unlikely that's
    relevant here, but just FYI), full memory, or full inodes on a
    partition.  I see Tom already asked about "df -h" output, and you
    already checked inodes and that appears fine.  But those have been
    some of my experiences as well.

    So what IS in /home/ldm/var/data ?  My guess is that's where LDM is
    saving data to, and that configuration would be found in your pqact
    file(s).  One thing you could try is running the following command
    to see what LDM will attempt to save in that directory (assuming
    your pqact file(s) are named "pqact..." and in that dir, otherwise
    adjust accordingly):
    "grep var/data ~/etc/pqact* | grep -i file" (without quotes)

    Side-note to the above:  By default, relative paths with the FILE
    action will start in the "/home/ldm" directory.  This is set in
    ~/etc/registry.xml under /pqact/datadir-path, and you can check it
    with "regutil /pqact/datadir-path" (without quotes).  If that points
    straight to your /home/ldm/var/data/ dir then THAT becomes the
    default starting point for relative paths (and it might make the
    above grep command come back empty).

    If there are actions to save data there they should (hopefully but
    not guaranteed to) be listed by that grep command, and that could
    point you where to look next.  If it comes back empty then maybe
    something's getting PIPEd to a script which is in turn saving data
    there, but that might be harder to track down.  Either way, it's
    hard to know without looking in that directory or your pqact(s) what
    might be happening, but hopefully this will yield a clue or two.
    It's possible you're getting more than you think you're asking for,
    and it's leading to that directory filling up... and if that's on
    the root partition it could explain the log in / lock up issues.

    You also mentioned ldmadmin scour doesn't seem to be doing much.
    Check ~/etc/scour.conf to see where it's doing actual scouring.
    Maybe it's not looking in that data directory, or maybe it is
    letting files stay too long.
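
A quick way to see what scour is actually configured to do is to strip the comments and blank lines from scour.conf, leaving only the active entries. The helper below is just a sketch (active_scour_entries is a hypothetical name); as far as I understand the format, each active entry names a directory and a retention period in days, optionally followed by a filename pattern.

```shell
#!/bin/sh
# Sketch: print only the active (non-comment, non-blank) lines of a
# scour configuration file. active_scour_entries is a hypothetical name.
active_scour_entries() {
    conf=${1:-$HOME/etc/scour.conf}
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$conf"
}
```

If this prints nothing, or prints nothing that covers /home/ldm/var/data, then scour has no instructions for that directory.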

    I'd also be curious about the size of your product queue vs. the
    size of the partition it's on.  If it's able to get made and LDM
    starts at all it's probably fine, but it is worth paying attention
    to.  The size of the queue gets defined in ~/etc/registry.xml, then
    just compare "ls -lh /home/ldm/var/queues/ldm.pq" and "df -h" to see
    how much of the partition the queue is filling up.  I try to ensure the
    partition it's on stays at 75% or less, though I don't think that's
    a true hard/fast rule, just guidance.
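
That partition check can be scripted along these lines. It is a minimal sketch: both function names are hypothetical, and the 75 threshold is only the informal guidance mentioned above, not anything LDM enforces.

```shell
#!/bin/sh
# Sketch: report how full the partition holding a given file is.
# Both helpers are hypothetical; 75 is informal guidance, not an LDM limit.
partition_use_percent() {
    # df -P prints one POSIX-format line per filesystem; column 5 is
    # the capacity used, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

check_queue_partition() {
    used=$(partition_use_percent "$1") || return 1
    [ -n "$used" ] || return 1
    if [ "$used" -gt 75 ]; then
        echo "WARNING: partition for $1 is ${used}% full"
    else
        echo "OK: partition for $1 is ${used}% full"
    fi
}
```

For example, "check_queue_partition /home/ldm/var/queues/ldm.pq" would report on the partition that holds the queue file.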

    Some reference pages that may be useful to you if you haven't seen
    these already:
    https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/ldmd.conf.html
    https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/pqact.conf.html
    https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/scour.conf.html
    https://www.unidata.ucar.edu/software/ldm/ldm-current/basics/LDM-registry.html


    Per your last email:
     >  just to confirm... find and rm on the data dir won't mess up /
    confuse the ldm queue stuff?

    It shouldn't.  Again, from what I've seen in your original email
    that's not where the queue is.  And even if it were, scour shouldn't
    touch it as long as it keeps updating (though rm -rf would).  I'd
    double-check ~/etc/registry.xml to verify the queue is housed
    elsewhere, but it sounds like you should be fine on this.
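
For the find-and-rm cleanup itself, something like the sketch below lets you preview what would go before deleting anything. old_files is a hypothetical helper, not an LDM tool. Note that find's "-mtime +N" counts whole 24-hour periods, so "+1" matches files at least two days old and "+0" matches files at least one day old.

```shell
#!/bin/sh
# Sketch: list, and only on request delete, regular files older than N days.
# old_files is a hypothetical helper, not part of LDM.
old_files() {
    dir=$1
    days=$2
    action=${3:-list}
    case "$action" in
        # -mtime +N: age in whole 24-hour periods is greater than N.
        list)   find "$dir" -type f -mtime +"$days" -print ;;
        delete) find "$dir" -type f -mtime +"$days" -exec rm -f {} + ;;
        *)      echo "usage: old_files DIR DAYS [list|delete]" >&2; return 2 ;;
    esac
}
```

Running "old_files /home/ldm/var/data 1" shows the candidates, and adding "delete" as a third argument removes them. Only point it at the data directory, never at /home/ldm/var/queues.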

    Hope some of this helps you out,

    -Mike

    ======================
    Mike Zuranski
    Meteorology Support Analyst
    College of DuPage - Nexlab
    Weather.cod.edu <http://weather.cod.edu/>
    ======================


    On Fri, Apr 24, 2020 at 1:32 PM Jack Snodgrass <jack@xxxxxxxxxxxxxx> wrote:

        having issues with our server ( centos7 ) that runs ldm...
        locking up. It has happened 2 times in the last 3 weeks or so.
        The server is pingable... so it's not totally dead.. but you
        can't get a local or remote console to start. can't figure out
        if it is out of memory or file handles or what.... it's like a
        ghost of itself.

        After rebooting... the  /home/ldm/var/data/ has around 350,000
        files in it.  I am not sure if that is 'ok' or a bit extra.

        We are running a

        ldmadmin scour

        command... via cron but I don't know what that is doing exactly
        or if it's doing much.

        when I try and restart ldm it says:

        Checking the product-queue...
        The writer-counter of the product-queue isn't zero. Either a process
        has the product-queue open for writing or the queue might be
        corrupt.
        Terminate the process and recheck or use
             pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q /home/ldm/var/queues/ldm.pq
        to validate the queue and set the writer-counter to zero.
        LDM not started


        In the past.... during testing and what not.. I've been able to
        run:
        pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q /home/ldm/var/queues/ldm.pq

        and ldm would start after that. This time.. with the 350K files
        or so.. that pqcat stuff fails.

        I am deleting older ( than a day ) files from the
        /home/ldm/var/data/ directory... going to see if

        pqcat -l- -s -q /home/ldm/var/queues/ldm.pq && pqcheck -F -q /home/ldm/var/queues/ldm.pq


        will work or if I have to rm -rf /home/ldm/var/data/ and start a
        new q.


        If ldmadmin scour does not let us remove enough files from
        /home/ldm/var/data/ can I use find and rm to remove files or do
        they have to be removed using ldm to keep any queues or indexes
        in sync?

        - jack

        -- *jack* - Southlake Texas - http://mylinuxguy.net - *817-601-7338*
        _______________________________________________
        NOTE: All exchanges posted to Unidata maintained email lists are
        recorded in the Unidata inquiry tracking system and made publicly
        available through the web.  Users who post to any of the lists we
        maintain are reminded to remove any personal information that they
        do not want to be made public.


        ldm-users mailing list
        ldm-users@xxxxxxxxxxxxxxxx
        For list information or to unsubscribe,  visit:
        https://www.unidata.ucar.edu/mailing_lists/



--
Gerry Creager
NSSL/CIMMS
405.325.6371
++++++++++++++++++++++
/The way to get started is to quit talking and begin doing./
/   Walt Disney/


--
+----------------------------------------------------------------------+
* Tom Yoksas                                      UCAR Unidata Program *
* (303) 497-8642 (last resort)                           P.O. Box 3000 *
* yoksas@xxxxxxxx                                    Boulder, CO 80307 *
* Unidata WWW Service                     http://www.unidata.ucar.edu/ *
+----------------------------------------------------------------------+

