Re: motherlode problems?



Tom

> Isn't this the ldm 5.0.x behavior?  I thought in 5.1.x it actually
> physically allocates the entire logical queue size upon queue creation.

No, 5.0.x and 5.1.x currently behave the same in this respect.  In
either case, if you create a product queue with pqcreate (or ldmadmin
mkqueue), then on some Unix systems "ls -l" and "du -k" (or "ls -s")
will show very different sizes for the queue.  For example, on a
Solaris system using LDM 5.0.x:

 $ pqcreate -c -s 1000M -q test.pq
 $ ls -l test.pq
 -rw-r--r--   1 russ     ustaff   1073168384 Aug 17 13:23 test.pq
 $ du -k test.pq
 24040  test.pq

So although "ls -l" says the queue takes a gigabyte of space, "du -k"
shows it only occupies about 24 Mbytes of disk.  We see the same kind
of difference using LDM 5.1.x, except both sizes are a little smaller
because 5.1.x assumes a larger typical product size and so has less
overhead.

This is because some Unix file systems support "sparse files" with
holes -- pages that are not allocated on disk because nothing has been
written to them yet. When such a page is read into memory, the system
fills it with zeros; only when it is modified does the page get
physical disk space allocated.
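
To see the same effect without an LDM installation, here is a minimal
Python sketch. It is not part of the LDM, and the function names and the
file name demo.pq are made up for illustration. It creates a file with a
large logical size, writes nothing to it, and compares the size "ls -l"
would report with the space "du" would report. It assumes a Unix system
where st_blocks counts 512-byte units.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (size "ls -l" would report, bytes actually allocated on disk)."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as on most Unix systems.
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, size):
    """Create a file of the given logical size without writing any data."""
    with open(path, "wb") as f:
        # Extending the file this way leaves a hole on file systems that
        # support sparse files; others may allocate the space right away.
        f.truncate(size)


def demo(size=64 * 1024 * 1024):
    """Create a temporary sparse file and report both of its sizes."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.pq")  # hypothetical file name
        make_sparse_file(path, size)
        return apparent_and_allocated(path)


if __name__ == "__main__":
    apparent, allocated = demo()
    print("apparent size :", apparent, "bytes")
    print("allocated size:", allocated, "bytes")
```

On a file system with sparse-file support the allocated size should come
out far smaller than the apparent size; on one without it, the two will
be close.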

This feature can lead to problems with the LDM if you create a large
queue in your data partition that runs out of disk space much later,
when it is filling up with data.
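
One rough way to check whether an existing queue is exposed to this is
to compare the space its holes could still claim with the free space
left in the partition. The following is only a sketch under the same
assumptions as "du" (st_blocks in 512-byte units); the function names
are hypothetical and nothing like this ships with the LDM.

```python
import os
import tempfile


def unallocated_bytes(path):
    """Bytes of the file's logical size that have no disk blocks yet."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as on most Unix systems.
    return max(0, st.st_size - st.st_blocks * 512)


def free_bytes(path):
    """Bytes available to unprivileged users on the file's partition."""
    vfs = os.statvfs(os.path.dirname(os.path.abspath(path)))
    return vfs.f_bavail * vfs.f_frsize


def shortfall(path):
    """Bytes the partition would lack if every hole were filled (0 if none)."""
    return max(0, unallocated_bytes(path) - free_bytes(path))


def demo(size=1024 * 1024):
    """Build a small sparse file and report its holes and any shortfall."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.pq")  # hypothetical file name
        with open(path, "wb") as f:
            f.truncate(size)
        return unallocated_bytes(path), shortfall(path)


if __name__ == "__main__":
    holes, missing = demo()
    print("unallocated bytes:", holes)
    print("shortfall        :", missing)
```

A nonzero shortfall means the queue cannot fill completely unless other
files in the partition are removed first. The answer is only a snapshot,
since other data written to the partition later changes it.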

We were just bitten by this problem, so we're considering changing LDM
5.1.2 to actually allocate all the disk blocks in pqcreate.  This will
make pqcreate run slower, but will make sure it fails early if there
isn't enough disk space to contain the queue, rather than later, when
data products have filled the holes that were left unallocated at
creation and the disk runs out of space.
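
The idea of failing early can be sketched in Python as follows. This is
an illustration of the general technique (writing every block at
creation time), not the actual pqcreate change; the function name
preallocate and the chunk size are assumptions made for the example.

```python
import os
import tempfile


def preallocate(path, size, chunk=1024 * 1024):
    """Create a file of `size` bytes by writing zeros to every block.

    Raises OSError (typically ENOSPC) right away if the partition cannot
    hold the whole file, instead of leaving holes to be filled later.
    """
    zeros = bytes(chunk)
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
        # Push the data to disk so a lack of space shows up here.
        f.flush()
        os.fsync(f.fileno())


def demo(size=4 * 1024 * 1024):
    """Preallocate a temporary file and return (apparent, allocated) bytes."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.pq")  # hypothetical file name
        preallocate(path, size)
        st = os.stat(path)
        # Assumption: st_blocks is in 512-byte units.
        return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    apparent, allocated = demo()
    print("apparent size :", apparent, "bytes")
    print("allocated size:", allocated, "bytes")
```

Writing zeros is slower than leaving holes, which is the same trade-off
described above. On file systems that compress or deduplicate data, even
explicitly written zeros may not reserve space, so this approach is not
a guarantee everywhere.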

This problem could be worse now that queues larger than 2 gigabytes
are possible, because the lack of disk space could show up days after
starting the LDM, with nothing to indicate the reason for the crash.

At this point, we think actually allocating the whole queue is the
best way to eliminate the "sparse file" syndrome, even though it will
make creating LDM queues take longer.


Robb...