Re: [thredds] Aggregation Cache File Naming Generator Change - ID to urlPath

  • To: Michael McDonald <mcdonald@xxxxxxxxxxxxx>
  • Subject: Re: [thredds] Aggregation Cache File Naming Generator Change - ID to urlPath
  • From: Christian Ward-Garrison <cwardgar@xxxxxxxx>
  • Date: Tue, 5 Jan 2016 20:49:30 -0700
Hi Michael,

It looks like this change was made in v4.5.0, and I'm not really sure why.
The commit [1] just says "Get index file naming correct, so putting indexes
in cache works". I'll bring this up in our meeting on Thursday.

For the time being, the only way to work around this issue in v4.6 is to
ensure that no dataset's urlPath is a leading path (directory prefix) of
another's. So, for example, you're going to have trouble
with "GOMl0.04/expt_32.5", because it is a prefix of another urlPath,
"GOMl0.04/expt_32.5/hrly": the cache would need "expt_32.5" to be both a
file and a directory. Perhaps you could rename the latter to something
like "GOMl0.04/expt_32.5-hrly"? Not ideal, I know.
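
A minimal sketch of how one might check a set of urlPaths for this kind
of conflict before upgrading. It is not part of TDS; the function name
find_prefix_conflicts is hypothetical, and it assumes the urlPaths have
already been extracted from the catalogs into a plain list of strings.

```python
def find_prefix_conflicts(url_paths):
    """Return (shorter, longer) pairs where `shorter` is a directory
    prefix of `longer`, i.e. `longer` starts with `shorter + "/"`."""
    # De-duplicate and sort so that any prefix precedes the paths it prefixes.
    paths = sorted(set(p.strip("/") for p in url_paths))
    conflicts = []
    for i, shorter in enumerate(paths):
        prefix = shorter + "/"
        for longer in paths[i + 1:]:
            if longer.startswith(prefix):
                conflicts.append((shorter, longer))
    return conflicts


if __name__ == "__main__":
    example = [
        "GOMl0.04/expt_32.5",
        "GOMl0.04/expt_32.5/hrly",
        "GOMl0.04/expt_32.5-hrly",
    ]
    for shorter, longer in find_prefix_conflicts(example):
        print(f"{shorter!r} is a path prefix of {longer!r}")
```

An empty result would suggest that no cache entry needs to be both a
file and a directory; it says nothing about other causes of failure.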

Another solution, which would require some new code, is to allow the user
to specify how the cache files are named in threddsConfig.xml. This is
actually already possible for GRIB indexes ([2], GribIndex.policy).
Probably wouldn't be much work to add for aggregations.

With respect to the default nestedDirectory naming policy, it's not clear
to me how to avoid collisions in a general way. Maybe that's why
oneDirectory was the default for so long.
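
For illustration only, here is a sketch of one flat (single-directory)
naming scheme that cannot produce the file-versus-directory clash,
because "/" never appears in the generated name. This is an assumption
about what a scheme could look like, not how TDS names its cache files;
flat_cache_name and original_url_path are hypothetical helpers. It does
not address the nested layout.

```python
from urllib.parse import quote, unquote


def flat_cache_name(url_path):
    """Map a urlPath to a single file name by percent-encoding it.
    With safe="" the "/" becomes "%2F" and a literal "%" becomes "%25",
    so distinct urlPaths always map to distinct names."""
    return quote(url_path, safe="")


def original_url_path(cache_name):
    """Invert flat_cache_name."""
    return unquote(cache_name)


if __name__ == "__main__":
    for p in ["GOMl0.04/expt_32.5", "GOMl0.04/expt_32.5/hrly"]:
        print(p, "->", flat_cache_name(p))
```

Because the encoding is reversible, a urlPath containing "/" and a
different urlPath containing "-" in the same position would not map to
the same file, unlike a plain "/" to "-" substitution.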

Cheers,
Christian

[1]
https://github.com/Unidata/thredds/commit/79345f770cf600c774ced0b807ec5eebc37ed9c1
[2]
http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html#GribIndexWriting

On Mon, Jan 4, 2016 at 9:20 AM, Michael McDonald <mcdonald@xxxxxxxxxxxxx>
wrote:

> THREDDS Team:
>
> Did the XML file naming generator for the aggregation cache files
> (stored in cache/agg) change/flip from the dataset "ID" value to
> "urlPath" when going from v4.3.23 to v4.6.x?
>
> If so, why was this done? It is preventing us from upgrading to the
> latest 4.6.3, because the urlPath structure we currently use (which
> nicely mimics our FTP listing) needs to stay the same for obvious
> legacy reasons.
>
> e.g., we recently went in and changed all "/" to "-" in our dataset IDs
> (only) to fix this cache/agg file naming issue on our production
> v4.3.23 TDS server. What's odd is that there seems to have been a
> "collision detector" for creating these cache files, as some dirs had
> files with a "-" replacing the "/" when conflicts occurred - not so in
> v4.6.x.
>
>
> <dataset ID="GOMl0.04-expt_32.5" urlPath="GOMl0.04/expt_32.5">...
> <dataset ID="GOMl0.04-expt_32.5-2014" urlPath="GOMl0.04/expt_32.5/2014">...
> <dataset ID="GOMl0.04-expt_32.5-2014-hrly"
> urlPath="GOMl0.04/expt_32.5/2014/hrly">...
>
> v4.3.23
> http://tds.hycom.org/thredds (agg cache works fine with no "/" in the
> dataset IDs - many flat files in the cache/agg with no directories)
>
> file naming structure (in v4.3.23) looks to be generated from the dataset
> "ID"s
>
> cache/agg/GOMl0.04-expt_32.5
> cache/agg/GOMl0.04-expt_32.5-2014
> cache/agg/GOMl0.04-expt_32.5-2014-hrly
> cache/agg/GOMl0.04-expt_32.5-2015
> cache/agg/GOMl0.04-expt_32.5-2015-hrly
>
> ::now the problems occur:
>
> v4.6.3
> http://beta.hycom.org/thredds (agg cache seems to be using the dataset
> "urlPath" for generating the XML filenames in cache/agg and there is
> no collision avoidance, as we have directories in our cache/agg, even
> though we changed all "/" to "-" in the IDs in the catalogs)
>
> cache/agg/GOMl0.04/expt_32.5
> cache/agg/GOMl0.04/expt_32.5/hrly
> cache/agg/GOMl0.04/expt_32.5/2014
> cache/agg/GOMl0.04/expt_32.5/2015
> cache/agg/GOMl0.04/expt_32.5/2016
>
> so we get a "partial caching" of the datasets (i.e., the leaf datasets
> such as "GOMl0.04/expt_32.5/2015/hrly" are missing because the server
> cannot write a cache file when there is already a *file*
> "GOMl0.04/expt_32.5/2015" in cache/agg).
>
> e.g., errors on our beta.hycom.org/thredds server running the latest
> v4.6.3 (identical catalogs as our v4.3.23 server)
>
> java.io.FileNotFoundException:
> /var/lib/tomcat/content/thredds/cache/agg/GOMl0.04/expt_32.5 (Is a
> directory)
>
> java.io.FileNotFoundException:
> /var/lib/tomcat/content/thredds/cache/agg/GOMl0.04/expt_32.5/2015/hrly
> (Not a directory)
>
> java.io.FileNotFoundException:
> /var/lib/tomcat/content/thredds/cache/agg/GOMl0.04/expt_32.5/2014/hrly
> (Not a directory)
>
>
> What's the fix for this?
>
> --
> Michael McDonald
> Florida State University