Re: [thredds] Problem between OpenDap and THREDDS when netcdf files are modified

Hi Hoop,

The dynamic dataset handling in the NcML aggregation code was designed
to deal with the appearance of new datasets more than data being
appended to existing datasets. The NcML aggregations are also limited to
straightforward aggregations based on homogeneity of dimensions and
coordinate variables; they don't use any coordinate system or higher
level feature information that might be available. This makes straight
NcML aggregation somewhat fragile and hard to generalize to more complex
situations.

FeatureCollections are designed to use the CDM's understanding of
coordinate systems and feature types to both simplify configuration and
make aggregations more robust and general.

While the FMRC collection capability was designed for a time series of
forecast runs, I believe it should handle a simple time series of grids
as well. (John, can you add more information on this?)
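
For what it's worth, here is a rough, untested sketch of what an FMRC
featureCollection for your OISST files might look like. The location,
path, and service name are taken from your snippet; everything else is
my assumption and should be checked against the featureCollection
documentation, in particular the #yyyy# date extractor, the olderThan
and rescan values, and the choice of datasetTypes.

```xml
<!-- Sketch only: attribute values below are assumptions, not a tested config -->
<featureCollection name="SST NOAA OISST V2 HighRes" featureType="FMRC"
                   path="Datasets/aggro/OISSThires">
  <metadata inherited="true">
    <serviceName>odap</serviceName>
    <dataType>Grid</dataType>
  </metadata>
  <!-- #yyyy# is assumed to mark the four-digit year in each file name -->
  <collection spec="/Projects/Datasets/noaa.oisst.v2.highres/sst\.day\.mean\.#yyyy#\.v2\.nc$"
              olderThan="5 min"/>
  <!-- Assumed schedule: rescan the collection every 15 minutes -->
  <update startup="true" rescan="0 0/15 * * * ? *"/>
  <!-- "Best" is the view that should behave like a single time series -->
  <fmrcConfig regularize="false" datasetTypes="Best Files"/>
</featureCollection>
```

If this works as I expect, the "Best" dataset would take the place of
your joinExisting aggregation, and the periodic rescan would be what
picks up time steps appended to the latest file.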

Ethan

On 2/23/2012 3:21 PM, Hoop wrote:
> Ethan,
> 
> This reminds me of an issue we are having, with version 4.2.7.
> Here is the relevant snippet from our config:
> <dataset name="SST NOAA OISST V2 HighRes" ID="SST_OISST_V2_HighRes"
>     urlPath="Datasets/aggro/OISSThires.nc" serviceName="odap" dataType="grid">
>     <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
>         <aggregation dimName="time" type="joinExisting" recheckEvery="15 min">
>             <scan location="/Projects/Datasets/noaa.oisst.v2.highres/"
>                   regExp="sst\.day\.mean\.....\.v2\.nc$" subdirs="false"/>
>         </aggregation>
>     </netcdf>
> </dataset>
> 
> The behavior we are getting in our time series, which is based on
> NetCDF files with a year's worth of time steps (or less), is as follows:
> In between re-boots of Tomcat, new time steps added to the latest file
> are not added to the aggregation.  However, if the calendar marches along
> and a new file for a new year is added to our archive without rebooting
> Tomcat, the timesteps for the new file are added, without the ones that
> would complete the previous year, resulting in a discontinuity along the
> time axis.  And someone somewhere may e-mail us complaining that our
> OPeNDAP object is not CF-compliant because the time steps aren't all of
> the same size.  %}
> 
> I looked at the featureCollection documentation link you gave, but since
> our data are not forecasts, nor point data, nor in GRIB2 format, that
> didn't seem the right fit.  Maybe I'm wrong; I'm severely sleep-deprived
> right now....
> 
> We also have some time series in monthly files (to keep the individual
> file size under 2 Gbytes).  We have not tried aggregating any of those
> time series.  Could be an interesting challenge.
> 
> Thanks for any help.
> 
> -Hoop
> 
> On 02/23/12 14:23, thredds-request@xxxxxxxxxxxxxxxx wrote:

> Ethan Davis wrote:
>>
>> Hi Claude,
>>
>> The version of the TDS running at http://web.aria.fr:443/thredds/ is
>> several years old (Version 4.0.26 - 20090831.2140). The current stable
>> release is 4.2.9 (20111108.1758). You should probably upgrade as there
>> have been lots of improvements.
>>
>> The TDS does some dataset caching which can greatly improve the
>> performance for static datasets but causes some problems for dynamic
>> datasets. The datasetScan configuration construct you are using does not
>> deal specifically with dynamic datasets. A more recently introduced (in
>> TDS 4.2) configuration construct, featureCollection, can deal with
>> dynamic datasets. Here's a link to the featureCollection documentation:
>>
>>> http://www.unidata.ucar.edu/projects/THREDDS/tech/tds4.2/reference/collections/FeatureCollections.html
>>
>> Hope that helps,
>>
>> Ethan
>>
>> On 2/22/2012 8:20 AM, Claude DEROGNAT wrote:
>>> I have been using OpenDap and a THREDDS server to provide netcdf data for a
>>> long time.
>>> I am currently developing a system that will model plumes in real time.
>>> The model runs every 30 minutes. At the beginning of each day, it creates a
>>> result file; then, every 30 minutes, the result file is overwritten with a
>>> new one containing the additional time frame.
>>>
>>> I observed a strange behavior between OpenDap and THREDDS in this case:
>>>     - The file is continuously updated on the OpenDap server.
>>>     - The THREDDS catalog shows in the 'Dates' field that the file has been
>>> updated (modified),
>>>     - but the Access/OpenDap target file is not modified, and the available
>>> time frames stay the same as at the last Tomcat reboot. Do I need to change
>>> anything in the THREDDS configuration? I have attached my
>>> threddsConfig.xml file...
>>>
>>> Regards       
>>>
>>> Ing. Claude DEROGNAT, PhD 
>>
>> Claude also wrote:
>>> My IT department said that the file I sent you is an old one. My problem
>>> is probably linked to my THREDDS catalog, so the question probably
>>> belongs on the THREDDS mailing list. I sent them my question on Monday,
>>> just after your message, and I still have no response...
>>>
>>> So I am sending you my catalog; if you can have a look at it, you may
>>> find why there is a mismatch between
>>>
>>> http://web.aria.fr:443/thredds/dodsC/CHIMERE@CAMAC@reference@p02/CF_CHIMERE_20120115_d03.nc.html
>>> and
>>> http://web.aria.fr:443/LENVIS/CHIMERE/CAMAC/reference/p02/CF_CHIMERE_20120115_d03.nc.html
>>>
>>> For instance:
>>>
>>> Why is the second one continuously updated, but does not allow
>>> access to the whole set of variables stored in the NetCDF file?
>>>
>>> Why is the first one not continuously updated, even though the date
>>> in the THREDDS catalog presentation always shows the right
>>> modified time?
>>
>> catalog.xml:
>>> [snip]
>>>   <service name="multiService" base="" serviceType="compound">
>>>     <service name="ncdods" serviceType="OPENDAP" base="/thredds/dodsC/" />
>>>     <service name="httpService" serviceType="HTTPServer" base="/thredds/fileServer/" />
>>>     <service name="wcsService" serviceType="WCS" base="/thredds/wcs/" />
>>>   </service>
>>> [snip]
>>>     <datasetScan name="CF_CHIMERE_@YYYYMMDD@_d01.nc"
>>>                  ID="/LENVIS/CHIMERE/CAMAC/reference/p00/dataset"
>>>                  path="CHIMERE@CAMAC@reference@p00"
>>>                  location="/data/nc/LENVIS/CHIMERE/CAMAC/reference/p00"
>>>                  harvest="true">
>>>        <sort>
>>>          <lexigraphicByName increasing="false"/>
>>>        </sort>
>>>      </datasetScan>
>>> [snip]