
Re: [Ncwms-users] Still some cache or time issues for our curvilinear FMRC OpenDAP data?



That is sufficient in all cases, but may not be necessary.

In an aggregation, files are rechecked whenever the recheckEvery time has been 
exceeded. If a request is in progress at the same time, it can see the data 
values shift and an inconsistent data model.
We need to make sure to delete files only when no request is in progress. To 
do that correctly, we have to integrate the TDS with the scouring program. We 
are looking into that. Meanwhile, if there is a time when requests are unlikely 
to happen (e.g. 2 am), delete then. If you can't afford to allow even the 
possibility of an incorrect response, then you should stop the server, delete, 
and restart.
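
To illustrate what "only delete files when a request is not happening" could look like, here is a minimal sketch. It is not how the TDS works today (that integration does not exist yet); the ScourGate class and its method names are hypothetical, and it assumes the server and the scouring job live in the same process.

```python
import threading


class ScourGate:
    """Hypothetical gate: requests register themselves, scouring runs only when idle."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0  # number of requests currently in flight

    def begin_request(self):
        # Blocks while a scour is running, because the scour holds the lock.
        with self._lock:
            self._active += 1

    def end_request(self):
        with self._lock:
            self._active -= 1

    def try_scour(self, delete_files):
        """Call delete_files() only if no request is in flight; report whether it ran."""
        with self._lock:
            if self._active > 0:
                return False
            delete_files()
            return True


gate = ScourGate()
deleted = []

gate.begin_request()
ran_during_request = gate.try_scour(lambda: deleted.append("his_20090513.nc"))
gate.end_request()
ran_when_idle = gate.try_scour(lambda: deleted.append("his_20090513.nc"))
```

In this sketch the scouring job simply skips its turn when a request is active and tries again later, which is the same idea as deleting at 2 am but enforced rather than hoped for.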

Note that adding files at the end of the aggregation is not a problem. Any 
current request will just not see the new files.

Richard Signell wrote:
John,

Please confirm:
For now, if providers want to make sure that their time-changing
aggregations from TDS are properly handled by clients like ncWMS, they
need to make sure that files are not removed from the start of the
aggregation.   And if they are, thredds needs to be restarted.

Correct?
Thanks,
Rich

On Wed, May 20, 2009 at 3:00 PM, Richard Signell <address@hidden> wrote:
John,

I know ALL the details of this dataset!  ;-)

This machine is running a TDS4, and the FMRC aggregation is working on
a directory that is updated each day (if everything goes well)
with a new forecast file.   There is a cron job that runs once a day
that moves any files in this directory to an archive location.

Right now the files in this directory look like this:
[rsignell@omglnx1 his]$ ls
his_20090513.nc  his_20090515.nc  his_20090517.nc
his_20090514.nc  his_20090516.nc  his_20090520.nc

So there are a few days missing here because things *did* go wrong.
(problems with A/C in the cluster room).

Tomorrow there should be a new "his_20090521.nc" file around 9:15 am,
and at 10:00 the cronjob runs and will remove the "his_20090513.nc"
file.     The aggregation does a "recheckEvery=10min", and the ncWMS
which points to the "best time series" from the FMRC also has a
recheck of every 10 minutes.

If I know that deleting the old files is causing a problem, we can
take an alternate approach of keeping all the files for a month, not
deleting any old ones, and then start new the next month (restarting
THREDDS and NcWMS, I suppose).


Is that enough info?

-Rich



On Wed, May 20, 2009 at 10:28 AM, John Caron <address@hidden> wrote:
Jon Blower wrote:
Hi Rich,

I think what's happened here is the following:
1) ncWMS synced with your OPeNDAP server and noticed a time dimension
of length, say, 100
2) The data under the OPeNDAP server changed, reducing the length of
the time dimension to, say, 80
3) Before the next sync with ncWMS, you requested a map that
corresponded to a time index of, say, 84, which was off the end of the
actual data but valid as far as ncWMS knew.
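
The three steps above can be reproduced in a small sketch. The CachedClient class is hypothetical and only stands in for a client that caches the time-axis length at sync time; it also assumes the 20 missing steps were removed from the start of the axis, which is what happens when old files are moved out of the aggregation directory.

```python
class CachedClient:
    """Hypothetical client that remembers the time-axis length from its last sync."""

    def __init__(self, server):
        self.server = server
        self.cached_len = 0

    def sync(self):
        self.cached_len = len(self.server)

    def get_map(self, time_index):
        # The client validates against its cached view, not the live data.
        if not 0 <= time_index < self.cached_len:
            raise ValueError("index outside the cached time axis")
        return self.server[time_index]


server_times = list(range(100))   # step 1: time dimension of length 100
client = CachedClient(server_times)
client.sync()

del server_times[:20]             # step 2: oldest 20 steps removed, length is now 80

try:                              # step 3: index 84 is valid for the client only
    client.get_map(84)
    outcome = "ok"
except IndexError:
    outcome = "index error"

shifted = client.get_map(10)      # no error, but this is the step that used to be index 30
```

The second request is the more worrying case: it succeeds, but returns data for a different time than the one the client thinks it asked for.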

I guess this is a potentially-serious problem.  Even if you didn't get
an error, you might get the wrong data.  This could happen at any time
when you add or remove data from the beginning or middle of a dataset.
 It won't happen if you add new data on to the end of a dataset (which
is what I assumed would almost always happen).

Any ideas as to how this could be fixed?  We could sync with the
OPeNDAP server with every single GetMap request, but this would
seriously affect performance.  Does OPeNDAP have a versioning system
whereby you can find out whether a dataset has changed without
downloading the whole metadata record?
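
As an illustration of the kind of cheap change check being asked about, here is a sketch that fingerprints only the time coordinate values. This is not an existing OPeNDAP feature; the function names are hypothetical, and it assumes the client can fetch the time axis far more cheaply than the full metadata record.

```python
import hashlib


def axis_fingerprint(time_values):
    """Hash the time coordinate values; any added or removed step changes the digest."""
    h = hashlib.sha256()
    for t in time_values:
        h.update(repr(t).encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def needs_resync(cached_fingerprint, current_time_values):
    """True if the time axis differs from the one seen at the last sync."""
    return axis_fingerprint(current_time_values) != cached_fingerprint


at_sync = axis_fingerprint(range(100))
unchanged = needs_resync(at_sync, range(100))
after_scour = needs_resync(at_sync, range(20, 100))
```

This still costs one round trip per check, so it narrows the window rather than closing it.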

Cheers, Jon
This is a deep problem. OPeNDAP considers itself a stateless protocol, like 
HTTP. But for datasets that can change, this is a problem. Aggregation datasets 
are particularly susceptible.

As with HTTP, we need to use sessions to prevent state changes that cause these 
kinds of errors. We are looking to redesign the TDS to deal with this, but we 
need to be able to control the adding and removing of files, make guesses as to 
session timeout, get clients to return cookies, etc. So it's a rather big and 
messy job. And that's just the TDS, dunno what other OPeNDAP servers might do. 
In a way, we've done all the easy stuff and now we're on to the harder stuff.

Rich, if you know any of the details of this dataset, how/why/when it changes, 
what kind of OPeNDAP server it is, etc., that would be useful.

_______________________________________________
Ncwms-users mailing list
address@hidden
https://lists.sourceforge.net/lists/listinfo/ncwms-users



--
Dr. Richard P. Signell   (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598