On 5/2/2011 12:27 PM, jerry y pan wrote:
Hi John,
Our TDS (4.2) serves some compressed netCDF files (*.nc.gz) and it works
fine, except that the very first access to them is slow (they are relatively
large files, about 400 MB each). Subsequent accesses are much
faster, but access becomes slow again after a period of inactivity.
I can see that TDS uncompresses these files to the temp data location.
My question is whether TDS cleans up these temp files, so that they have
to be decompressed again next time, which would explain the later
slowness. If so, is there a way to keep the cache there permanently?
Or is the faster response right after the first access perhaps due to an
in-memory cache? Is there any configuration I could use to tweak the cache?
Thanks,
-Jerry Pan
Hi Jerry:
Yes, compressed files are uncompressed the first time they are accessed, and
that's likely why you see the slowdown.
To control how these files are cached, see:
http://www.unidata.ucar.edu/projects/THREDDS/tech/tds4.2/reference/ThreddsConfigXMLFile.html#DiskCache
I would suggest that you use
<DiskCache>
  <alwaysUse>true</alwaysUse>
  <scour>1 hour</scour>
  <maxSize>10 Gb</maxSize>
</DiskCache>
and choose maxSize carefully. The cache directory is
{tomcat}/content/thredds/cache/cdm/ by default, or you can set it in the XML above.
Monitor the cache directory closely to see which files are uncompressed,
and perhaps test accessing the datasets with and without compression and
time the difference.
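A minimal sketch of the monitoring step, in Python. The function name
cache_usage is hypothetical and not part of TDS; it simply walks whatever
directory you pass in (for example the cache directory mentioned above) and
reports how much space the files in it currently take.

```python
import os

def cache_usage(cache_dir):
    """Return (total_bytes, entries) for all files under cache_dir.

    entries is a list of (path, size_bytes) tuples, largest file first.
    """
    entries = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                entries.append((path, os.path.getsize(path)))
            except OSError:
                # The file may have been removed between listing and stat.
                continue
    entries.sort(key=lambda e: e[1], reverse=True)
    total = sum(size for _path, size in entries)
    return total, entries
```

Running this periodically (say, from cron) and comparing the output over time
should show which files get uncompressed and whether they disappear again
after a scour.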
Essentially this is a space / time tradeoff. I assume you don't want to
store the files uncompressed, so you have to pay the price of that. The
trick is to make maxSize big enough to keep the "working set"
uncompressed, i.e. if there is a relatively small "hot" set of files that
get accessed a lot, you want to give enough cache space to keep them
uncompressed.
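One rough way to size the working set, sketched in Python under the
assumption that you already know which .nc.gz files are "hot" (the list
hot_files and both function names are hypothetical). It streams each file
through gzip and counts the uncompressed bytes, without writing anything to
disk.

```python
import gzip

def uncompressed_size(gz_path, chunk_size=1 << 20):
    """Return the uncompressed size in bytes of a gzip file, read in chunks."""
    total = 0
    with gzip.open(gz_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total

def working_set_bytes(hot_files):
    """Sum the uncompressed sizes of the frequently accessed files."""
    return sum(uncompressed_size(p) for p in hot_files)
```

The result is a lower bound for maxSize; some headroom on top is probably
wise, since other files will pass through the cache as well.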
John