John Caron wrote:
David Robertson wrote:
Hi John,
I'm guessing that you have caching disabled, or that it is ineffective for
some reason. Can you send me your threddsConfig.xml file to verify
that? If that's true, is that deliberate?
NetcdfFileCache is deliberately disabled. We have several datasets
that get replaced with identical file names on a regular basis
(hourly, daily, etc.). This setup dates back to TDS 3.16 so perhaps
this is better handled in 4.1 and I can turn the cache back on.
Try re-enabling it on a test server and see whether the changes to the
identical file(s) get discovered.
I have turned NetcdfFileCache back on, but I'll have to wait until
tomorrow to see if the changes to identical files are picked up. We no
longer update every few hours as I thought.
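For reference, re-enabling it just meant restoring the NetcdfFileCache
element in threddsConfig.xml, something like the following sketch (element
names as documented for the TDS; the limits and scour interval here are
only illustrative values, not what we actually use):

    <NetcdfFileCache>
      <minFiles>100</minFiles>
      <maxFiles>200</maxFiles>
      <scour>20 min</scour>
    </NetcdfFileCache>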
I have a couple more questions about NetcdfFileCache. (1) What format
will the NetCDF cache files have? The dataset in question is a
collection of NetCDF-4 files because we needed to compress them to
save space. (2) Will TDS 4.1 cache the entire dataset in question? Each
yearly aggregation is ~3000 files, and if the files are cached in
NetCDF-3 format our scratch space will fill up quite quickly.
NetcdfFileCache is an object cache, it keeps Java objects, including the
aggregation lists, in memory for fast response. The data itself is not
cached, so the file size or format doesn't matter.
Under those circumstances, any access to an aggregation has to
rebuild the aggregation, no matter what the recheckEvery setting is.
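(For context, recheckEvery here means the attribute on the NcML
aggregation element; a minimal sketch, with an illustrative path and
interval:

    <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
      <aggregation dimName="time" type="joinExisting" recheckEvery="15 min">
        <scan location="/data/yearly/2009/" suffix=".nc" />
      </aggregation>
    </netcdf>

Without the object cache, that aggregation list is rebuilt on every
access regardless of the recheckEvery value.)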
TDS 4.1 now has a file system cache using ehcache, which will only do
an OS file scan when the directory changes.
This could be what's causing the shorter delays (10 seconds), as that
is about how long it takes to list the directory in question (~29,000
files) when it hasn't been accessed in a while.
Are all 29,000 files in the same directory as the updating files?
No, the updating files are a completely separate dataset. The directory
with 29,000 files is satellite data so new files are added ~12 times a
day. No files are modified in this directory; files are only added.
If so, you have a large number of files (29,000) and some of them change
without actually updating the directory entry (since a new file is not
created). We probably can't optimize both problems at the same time. But
if you can force the directory "last modified" to update (write a dummy
file?) perhaps things will work correctly.
Also, break up your files into separate directories, with most files in
unchanging directories. This should help a lot.
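For example (a hypothetical layout), the current year's files could go in
their own subdirectory and the aggregation could scan the parent with
subdirs, so only one small directory ever changes:

    <aggregation dimName="time" type="joinExisting" recheckEvery="15 min">
      <!-- e.g. /data/satellite/2009/, /data/satellite/2010/, ... ;
           only the current year's subdirectory receives new files -->
      <scan location="/data/satellite/" subdirs="true" suffix=".nc" />
    </aggregation>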
We've been tossing that idea around for a while. I guess the people
writing the cron scripts will have to make them smarter so the files end
up in the proper directory when the calendar year changes.
Thanks,
Dave