Hi All,
I guess one of the points that advocates of "processing near the
data" are missing is that many interesting processes involve
integrating multiple datasets, which are often not in one place to
begin with. You would have to move data anyway: maybe not all of the
datasets, but at least some of them.
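For instance, differencing observations against model output that
live on two different servers: wherever the computation runs, the
data from at least one server (here, both) has to cross the network
to get there. A rough sketch in Python, assuming a DAP-enabled NCO
build; the URLs and variable layout are made up:

import subprocess

# Both inputs live on remote servers; ncbo pulls their data to this
# machine over OPeNDAP before computing the difference.
url_a = "http://serverA.example.org/dodsC/obs.nc"    # placeholder URL
url_b = "http://serverB.example.org/dodsC/model.nc"  # placeholder URL
subprocess.run(["ncbo", "-O", "-y", "sub", url_a, url_b, "diff.nc"],
               check=True)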
Upendra
On 6/18/2012 5:22 PM, Ben Domenico wrote:
Hi Jeff,
I agree that, in many cases, the processing needs to be near the data,
but that does not rule out using a brokering layer. The broker, in
fact, can be set up to run on the same network or even on the same
machine as the data server. The idea is just that it communicates via
web services, which makes it easier to have some of the development
take place in different languages, with different compilers, even
with different development teams. It just doesn't all have to be part
of the same
server program. That's the beauty of using a third tier between client
and server.
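As a concrete (if minimal) sketch of that third tier, here is roughly
what a broker that shells out to NCO could look like. The /subset
endpoint, its parameters, and the assumption that ncks is on the PATH
are all mine, not part of any existing server:

import os
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

class BrokerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical API: /subset?url=<opendap-url>&var=<variable>
        query = parse_qs(urlparse(self.path).query)
        url, var = query["url"][0], query["var"][0]
        with tempfile.TemporaryDirectory() as tmp:
            out_path = os.path.join(tmp, "out.nc")
            # Shell out to NCO; the broker itself can run on the same
            # host as the data server or anywhere else on the network.
            subprocess.run(["ncks", "-O", "-v", var, url, out_path],
                           check=True)
            with open(out_path, "rb") as f:
                data = f.read()
        self.send_response(200)
        self.send_header("Content-Type", "application/x-netcdf")
        self.end_headers()
        self.wfile.write(data)

HTTPServer(("", 8080), BrokerHandler).serve_forever()

Because the tiers only talk over HTTP, each one can be written in a
different language by a different team.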
-- Ben
On Mon, Jun 18, 2012 at 3:06 PM, Jeff McWhirter
<jeff.mcwhirter@xxxxxxxxx> wrote:
Hi Ben,
This is a terrific idea. One suggestion I have is to build it
so the processing services can be set up in a brokering layer
-- that is, so the input datasets can be accessed via web
services and the output can be served via web services. I
don't mean that this should be the only way to implement the NCO
processing; rather, just keep it in mind so it's relatively easy to
set up such a three-tier architecture for the NCO processing.
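In that picture, the client's side of the NCO processing is just web
requests in and out; something like the following, where the broker
endpoint, its parameters, and the variable name "T" are purely
illustrative:

import urllib.parse
import urllib.request

broker = "http://localhost:8080/subset"  # illustrative broker endpoint
dataset = "http://ramadda.org/repository/alias/brokerexample"
params = urllib.parse.urlencode({"url": dataset, "var": "T"})
# The input dataset is named by URL and the result comes back over
# HTTP; the client never needs to know where the processing ran.
with urllib.request.urlopen(broker + "?" + params) as resp:
    with open("result.nc", "wb") as out:
        out.write(resp.read())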
I just heard from Charlie Zender and have confirmed that the NCO
routines can operate on OPeNDAP URLs. This opens up numerous
possibilities. In the context of RAMADDA, one can have explicit
OPeNDAP links, e.g.:
http://ramadda.org/repository/alias/brokerexample
All of the RAMADDA data services (cataloging, metadata ingest,
subsetting, NCO (soon), grid visualizations, etc.) are available for
that OPeNDAP link.
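For example, pointing ncks at that link directly. This is a sketch
that assumes a DAP-enabled NCO build, and the variable name "T" is a
placeholder:

import subprocess

# ncks treats the OPeNDAP URL like a local filename; with -v it asks
# the server for just the one variable rather than the whole file.
url = "http://ramadda.org/repository/alias/brokerexample"
subprocess.run(["ncks", "-O", "-v", "T", url, "subset.nc"], check=True)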
However, we have to keep the performance ramifications in mind. It
still takes a long time to move gigabytes of data across a network.
This brings up the importance of moving the computation to the data,
instead of moving the data to the computation. For some data sets and
many use cases, remote access to data works very well, so things like
brokering are tractable. However, for *big* data sets (e.g., climate
model output) we need to come up with richer mechanisms (like running
NCO on local data) to bring computation to the data.
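One way to keep the byte count down with what we already have:
hyperslab before averaging, so the DAP client only fetches the
requested slab rather than the full file. Again a sketch; the URL,
variable, and dimension names are placeholders:

import subprocess

url = "http://example.org/thredds/dodsC/model/output.nc"  # placeholder
# ncra averages over the record dimension; -d restricts the request to
# the first twelve time steps, so only that slab crosses the network.
subprocess.run(["ncra", "-O", "-d", "time,0,11", "-v", "tas",
                url, "mean.nc"], check=True)

That helps, but for truly big runs the richer answer is still to run
NCO where the data lives and ship only the reduced result.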
-Jeff
_______________________________________________
thredds mailing list
thredds@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit:
http://www.unidata.ucar.edu/mailing_lists/