On Wed, 28 Oct 2009 09:28:36 -0500
Rob Ross <rross@xxxxxxxxxxx> wrote:
> It is a mistake to think that there is any rhyme or reason to the
> cache update and replacement policy in NFS. In fact it is ok for a
> client implementation to cache the file size and other metadata too,
> and return an out-of-date version to a process.
I won't argue about the (non)reliability/consistency when working over NFS. I
know there are many agents, caching on every host concerned, etc. All I'm
asking for is reducing the time until the data hits the disk... for my use case
it has been shown to be "good enough" to make the writer push its writes
towards the NFS server (call 'sync' on the shell).
> > "The function NF90_SYNC offers a way to synchronize the disk copy
> > of a netCDF dataset with in-memory buffers"
> > "For a writer, this flushes buffers to disk."
>
> It is supposedly synchronizing that process's in-memory buffers with
> the copy on the server (on disk). Actually, what is probably
> happening is that the dirty regions are being pushed to the server,
Actually not. I grepped through the netCDF source and did not find an fsync() call.
nc_sync() just seems to trigger plain write()s. I may be wrong here, as there
is some IO layering going on inside the code and I am not that familiar with it.
The situation looks to me like NetCDF only hands the data over to the operating
system buffers, so the NFS server, let alone the reading client, never gets a
chance to see the new data while it is still held in the writer's system buffers.
A call to 'sync' triggers sending of the data over NFS, resulting in an update
of the data in the visualization in acceptable time. That is all I'm asking
for... I know that this is no substitute for a cluster file system and there is
no guarantee, but it appears to work 99% of the time, instead of giving a broken
plot in 10% of cases and a very late plot in all cases.
> My guess is that the issue isn't with the writer (who would call
> fsync()) at all, but the reader.
Of course there are delays and possible inconsistencies on the reader side, but
practice shows that our NFS setup seems to be "good enough" for my purpose if
it gets the data at all from the writer.
> readers cache data and hand that data back to processes without
> bothering to check if the data is up-to-date with respect to the
> data on the server.
Getting old data would be a different issue: I simply would not get updates of
the plot. The plotter needs updated header data to see that there is a new
record appended. Once it has figured out that much, it needs to access newly
written data which has not been read before on that machine -- the file is
being grown just now, where should the cache come from?
Anyhow, I said I won't argue about how NFS should behave here, let's focus on
what kind of sync()ing NetCDF does / should do.
> > In any case, one should clarify the documentation...
>
> How would you propose clarifying it? Something like:
>
> "Note, the view of the file relative to other processes is file
> system dependent, so this call is not adequate to ensure that the
> most up-to-date file state is available at all processes."
As I see it -- I would still like to see a comment from someone who really
knows what NetCDF I/O code does and what it does not -- it would be more
accurate like this:
"This does flush any NetCDF-internal buffers of the calling process, handing
the data over to the operating system. This should (on any sane operating
system) make the data immediately available to other processes on the same
machine, but it does not guarantee that any actual write operation takes place
on the hard disk or network share the data file is residing on."
And I still hold my point that adding some hook to actually call fsync() (or
similar functionality on non-UNIX) on the underlying file handle would be nice.
When writing a file in plain C one has the option to tell the system to
actually commit the file changes using fsync(). With the NetCDF API one
currently does not have that option since the underlying file is hidden from
the user code, while the demand for it is valid, IMHO*.
But since it can have severe performance drawbacks, fsync() should not be
called implicitly -- only on specific request!
Alrighty then,
Thomas.
* Like, having worked on a document over night, duly saving it along the way,
but still losing hours of work during a power outage because your text editor
did _not_ use fsync() on saves and the kernel filesystem buffering was set up to
be quite patient...
--
Dipl. Phys. Thomas Orgis
Atmospheric Modelling
Alfred-Wegener-Institute for Polar and Marine Research