We have a THREDDS server running on Linux here at ERD that serves many
datasets (http://oceanwatch.pfeg.noaa.gov:8081/thredds/catalog.html).
Each dataset is an aggregation, created by putting many individual
files in that dataset's directory.
The problem: I have noticed that the time to get data from one data
file in a dataset is roughly proportional to the number of files in
that dataset's directory. As a result, access to data in directories
with many files is very slow.
Here are the results from a test, listed in the order in which the
subtests were run (all in one run of the test program). The GA ssta
hday subtest was added to test the theory that the number of files in
the directory is correlated with the THREDDS OPeNDAP response time.
AG ssta 3day: 190 files, 719 ms
CM usfc hday: 2138 files, 13359 ms
GA ssta hday: 1018 files, 7063 ms
MB chla 1day: 185 files, 625 ms
QN curl 8day: 537 files, 3141 ms
That looks like a great correlation to me.
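To put a number on that, here is a minimal sketch that computes the Pearson correlation and a least-squares line for the five results above. The function name pearson_and_fit is just illustrative; the only inputs are the file counts and times already listed. On these five points the correlation comes out close to 1, with a slope of several milliseconds per file.

```python
import math

# (dataset, files in directory, response time in ms), copied from the results above
RESULTS = [
    ("AG ssta 3day", 190, 719),
    ("CM usfc hday", 2138, 13359),
    ("GA ssta hday", 1018, 7063),
    ("MB chla 1day", 185, 625),
    ("QN curl 8day", 537, 3141),
]


def pearson_and_fit(xs, ys):
    """Return (r, slope, intercept) for a least-squares line ys ~ slope*xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


files = [row[1] for row in RESULTS]
times_ms = [row[2] for row in RESULTS]
r, slope, intercept = pearson_and_fit(files, times_ms)

if __name__ == "__main__":
    print(f"Pearson r = {r:.4f}")
    print(f"fit: time_ms = {slope:.2f} * files + {intercept:.1f}")
    for name, n_files, ms in RESULTS:
        print(f"{name}: {ms / n_files:.2f} ms per file")
```

Five points from a single run are a small sample, so this only supports the trend; it does not say where the time is being spent.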
Any idea why THREDDS is so slow at opening one file? Linux does slow
down with this many files in a directory, but not nearly this much. Is
there anything we can do to Linux or THREDDS to improve this?
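One way to separate the filesystem cost from the THREDDS cost would be to time a plain directory scan at the same file counts. The sketch below is only an assumption about how one might do that: it creates empty dummy files in a temporary directory (not our real data files), and the helper names time_directory_scan and make_dummy_directory are hypothetical. It does not exercise THREDDS at all.

```python
import os
import tempfile
import time


def time_directory_scan(path):
    """Return (entry_count, elapsed_ms) for one pass that stats every entry in path."""
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            entry.stat()  # force a metadata lookup for each file
            count += 1
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return count, elapsed_ms


def make_dummy_directory(n_files):
    """Create a temporary directory holding n_files empty placeholder files."""
    tmp = tempfile.TemporaryDirectory()
    for i in range(n_files):
        # Empty stand-ins; real data files would be far larger.
        open(os.path.join(tmp.name, f"file_{i:05d}.nc"), "w").close()
    return tmp


if __name__ == "__main__":
    # File counts taken from the test results earlier in this message.
    for n in (185, 190, 537, 1018, 2138):
        with make_dummy_directory(n) as dir_name:
            count, ms = time_directory_scan(dir_name)
            print(f"{count} files: scan took {ms:.2f} ms")
```

If the scan times here are far below the response times reported above, that would suggest the delay is mostly inside the server rather than in Linux, though empty files on a local temporary filesystem are not a faithful stand-in for the real directories.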
Thank you.
Sincerely,
Bob Simons
Satellite Data Product Manager
Environmental Research Division
NOAA Southwest Fisheries Science Center
1352 Lighthouse Ave
Pacific Grove, CA 93950-2079
(831)658-3205
bob.simons@xxxxxxxx
<>< <>< <>< <>< <>< <>< <>< <>< <><