Roy Mendelssohn wrote:
Hi John:
Some questions on THREDDS aggregation.
1. Is there a limit to the number of files that can be aggregated over?
Nope, no limit.
2. Can aggregation occur over sub-directories of a directory structure?
using "scan" i assume?
supposedly you can have multiple scan directives within the aggregation, but its not well tested. But I would try this if you need this feature.
the scan directive is still pretty primitive, we will continue to improve it, adding a
"recurse" tag might be one way.
Remember that the aggregated files need to be pretty much homogeneous.
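For example, an aggregation with multiple scan directives might look like the
sketch below (untested, per the above; the directory paths are made up):

  <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
    <aggregation dimName="time" type="joinExisting">
      <!-- one scan element per subdirectory (hypothetical paths) -->
      <scan location="/data/model/2005/" suffix=".nc" />
      <scan location="/data/model/2006/" suffix=".nc" />
    </aggregation>
  </netcdf>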
3. For a lot of time periods, when aggregating fields over time, do you
have any feel for the trade-off in speed of aggregation between the size of
the netcdf files and the number of files aggregated over? (i.e., if we
have 6-hourly data, should we produce 6-hourly files, daily files,
weekly files, or monthly files, and what would be the likely speed
tradeoff if we want to extract a time series of a relatively small
region?)
My intuition is that you want to create fewer large files, not lots of little
files. It costs the same to open a big file as a little one. My current rule of
thumb is to try to write files that are 50-200 MB.
In the future, we may add a feature that tries to open all the needed files
in different threads. That might argue for smaller file sizes, but it's
theoretical at this point.
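Related note on the open cost: with a joinExisting aggregation the server
normally has to open each file to read its time coordinate, which is part of
why fewer, larger files tend to win. If I remember the NcML schema right, when
you list files explicitly the ncoords attribute tells the server how many time
steps each file holds, so it can defer those opens. A sketch, with made-up
paths and 6-hourly data packed into monthly files:

  <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
    <aggregation dimName="time" type="joinExisting">
      <!-- ncoords = time steps per file: 31 days x 4 per day = 124 -->
      <netcdf location="/data/sst/sst_200501.nc" ncoords="124" />
      <!-- February: 28 days x 4 per day = 112 -->
      <netcdf location="/data/sst/sst_200502.nc" ncoords="112" />
    </aggregation>
  </netcdf>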
TIA,
-Roy