
Re: THREDDS and iRODS - TUES JULY 20 1PM MDT



Thanks,

We have had multiple requests for Java 7 and nio support in Jargon. This is on the radar, but I do not anticipate having that in the medium time frame (6-9 months is a possibility, depending on other priorities).

In the meantime, Jargon does have an implementation of a RandomAccessFile, so at least we're part-way there.  I will look at ucar.unidata.io.RandomAccessFile.

Cheers,
Mike Conway


On Jul 20, 2010, at 10:29 AM, John Caron wrote:

Hi all:

Some background in anticipation of our call today.

The THREDDS Data Server (TDS) uses the Common Data Model Library (CDM, aka netcdf-Java library) to access local and remote data. The TDS is the web services layer, using servlets. The CDM is a general data access library used in many other applications also. Both are pure java.

The CDM, by default, wants a java.io.RandomAccessFile. We wrap that in a ucar.unidata.io.RandomAccessFile, and also can use subclasses ucar.unidata.io.InMemoryRandomAccessFile and ucar.unidata.io.http.HTTPRandomAccessFile. So one possibility is to provide an iRODS specific subclass of ucar.unidata.io.RandomAccessFile.
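To make the subclass idea concrete, here is a minimal, self-contained sketch. It does not use the real ucar.unidata.io.RandomAccessFile API: `BufferedRandomAccess`, `readAt`, and `InMemoryBackedAccess` are hypothetical names invented for illustration, and an in-memory byte array stands in for a remote iRODS object. The point is only the shape of the design: the base class owns seeking and buffering, and a backend-specific subclass supplies one "fetch bytes at offset" method.

```java
import java.io.IOException;

/** Hypothetical stand-in for a buffered random-access base class (not the real ucar API). */
abstract class BufferedRandomAccess {
    private final byte[] buffer;
    private long bufferStart = 0; // file offset of buffer[0]
    private int bufferLength = 0; // number of valid bytes currently buffered
    private long position = 0;    // current logical file position

    protected BufferedRandomAccess(int bufferSize) {
        this.buffer = new byte[bufferSize];
    }

    /** Backend hook: fetch up to len bytes at file offset pos; return the count, or -1 at end of data. */
    protected abstract int readAt(long pos, byte[] dest, int offset, int len) throws IOException;

    public void seek(long pos) {
        position = pos;
    }

    /** Reads one unsigned byte, refilling the buffer on a miss; returns -1 at end of data. */
    public int read() throws IOException {
        if (position < bufferStart || position >= bufferStart + bufferLength) {
            int n = readAt(position, buffer, 0, buffer.length);
            if (n <= 0) {
                return -1;
            }
            bufferStart = position;
            bufferLength = n;
        }
        return buffer[(int) (position++ - bufferStart)] & 0xff;
    }
}

/** Hypothetical backend: a byte array plays the role of a remote object. */
class InMemoryBackedAccess extends BufferedRandomAccess {
    private final byte[] remote;
    int requests = 0; // how many times the backend was asked for data

    InMemoryBackedAccess(byte[] remote, int bufferSize) {
        super(bufferSize);
        this.remote = remote;
    }

    @Override
    protected int readAt(long pos, byte[] dest, int offset, int len) {
        requests++;
        if (pos >= remote.length) {
            return -1;
        }
        int n = (int) Math.min(len, remote.length - pos);
        System.arraycopy(remote, (int) pos, dest, offset, n);
        return n;
    }
}

class SubclassSketch {
    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        InMemoryBackedAccess raf = new InMemoryBackedAccess(data, 64);
        raf.seek(500);
        System.out.println(raf.read());   // low 8 bits of 500, i.e. 244
        System.out.println(raf.read());   // 245, served from the buffer
        System.out.println(raf.requests); // 1
    }
}
```

An iRODS-backed version would replace the array copy in `readAt` with a call into whatever read-at-offset facility Jargon provides; that call is deliberately left out here because its exact signature is not given in this thread.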

Another interesting possibility is the new java.nio.file.spi package in Java 7, which allows one to plug in file system implementations. I suspect it would be a good fit with iRODS. There's a lot of new functionality in Java 7 that we are interested in using. See http://java.sun.com/developer/technicalArticles/javase/nio/ if you haven't already been following this. I'm guessing we will see a Java 7 release in 6-9 months.
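As an illustration of why the pluggable file system route is attractive, here is a small sketch written against the java.nio.file API as it shipped in Java 7. Client code reads a chunk at an offset through a `Path` and a `SeekableByteChannel`, without knowing which provider sits behind the `Path`. The class and method names (`NioChunkReader`, `readChunk`) are made up for this example, and it runs against the default local file system; an iRODS provider is only assumed here, not something that exists in this thread.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class NioChunkReader {
    /** Reads up to length bytes starting at offset; returns fewer if the file ends first. */
    static byte[] readChunk(Path path, long offset, int length) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(path, StandardOpenOption.READ)) {
            channel.position(offset);
            ByteBuffer buf = ByteBuffer.allocate(length);
            // Loop because a single read() may return fewer bytes than requested.
            while (buf.hasRemaining() && channel.read(buf) != -1) {
                // keep filling the buffer
            }
            buf.flip();
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chunk-demo", ".bin");
        try {
            Files.write(tmp, new byte[] {10, 20, 30, 40, 50, 60});
            byte[] chunk = readChunk(tmp, 2, 3);
            System.out.println(Arrays.toString(chunk)); // [30, 40, 50]
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

If a provider for a custom URI scheme were installed, the same `readChunk` could in principle be handed a `Path` obtained from that provider, which is the appeal of the approach for a data grid.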

The CDM can't serve arbitrary files; it's a subsetting service that needs to understand the details of the file. The list of file formats we currently understand is at:

http://www.unidata.ucar.edu/software/netcdf-java/formats/FileTypes.html

Random access assumes that it's efficient to move around in the file and access small chunks of data. Performance depends on file layout and read access patterns, both of which can be hard to predict in some cases.
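A toy model can show how much the access pattern matters. The sketch below counts how many buffer-sized fetches a single-buffer reader would need to serve a series of one-byte reads. The buffer size, stride, and the `AccessPatternDemo` class are arbitrary assumptions for illustration, not measurements of the CDM or of iRODS; over a network, each fetch would roughly correspond to a round trip.

```java
class AccessPatternDemo {
    /** Counts buffer refills needed to serve one-byte reads at the given offsets, in order. */
    static int countFetches(long[] offsets, int bufferSize) {
        long bufferStart = -1; // -1 means nothing is buffered yet
        int fetches = 0;
        for (long pos : offsets) {
            boolean hit = bufferStart >= 0
                    && pos >= bufferStart
                    && pos < bufferStart + bufferSize;
            if (!hit) {
                bufferStart = pos; // refill the buffer starting at the requested offset
                fetches++;
            }
        }
        return fetches;
    }

    public static void main(String[] args) {
        int n = 1024;
        int bufferSize = 256;
        long[] sequential = new long[n];
        long[] strided = new long[n];
        for (int i = 0; i < n; i++) {
            sequential[i] = i;            // contiguous bytes
            strided[i] = (long) i * 4096; // one byte every 4096 bytes
        }
        System.out.println(countFetches(sequential, bufferSize)); // 4
        System.out.println(countFetches(strided, bufferSize));    // 1024
    }
}
```

The same number of bytes is requested in both cases, but the strided pattern never reuses the buffer, so every read becomes a separate fetch.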

We have a plug-in architecture for adding new formats. So one aspect of the desirability question is, what holdings are in iRODS, and how much work is it to make them readable by the CDM? Also, are the web services appropriate for these files? Who are your clients? TDS is mostly oriented to earth science data.

Mike Conway wrote:

Hi John,

I think one of the basic goals would be an exploration of how we could layer THREDDS on top of the iRODS file system such that the files of whatever format could be served from the data grid. This is at an exploratory stage, and the basic questions would be on feasibility and desirability.

iRODS does have facilities for file transfer, but I think in the simplest terms it's more about a distributed data grid, or data cloud that can manage distributed data based on policies built into iRODS (replication, metadata extraction, security, federation, etc).

There is a Java library (Jargon) that can integrate Java-based applications with the data grid, and Jargon, among other things, provides an iRODS-specific implementation of the java.io.* libraries. Could these libraries be plugged in to THREDDS? That was a primary line of investigation.

Note that THREDDS would see the files as java.io.File, or related streams, but underneath would use the iRODS protocol to expose data stored anywhere on the grid. Given that that is desirable and feasible, it would allow the data served by THREDDS to gain the distributed and policy-driven management of the iRODS grid.

Regards,
Mike Conway

On Jul 8, 2010, at 8:21 AM, John Caron wrote:

Brian Etherton wrote:

Hi team,

How abouts Tuesday... July 20th?

Brian E.

Hi Brian:

I assume we converged on TUES JULY 20 1PM MDT ?

In preparation, can someone in your group summarize your current thinking about how THREDDS/TDS could be used with iRODS? My limited understanding is that iRODS is oriented towards file transfer, while the TDS is geared towards implementing subsetting protocols like OPeNDAP and WCS/WMS, which requires the ability to parse and understand the semantics in the file. Among other things, this means that the set of files that can be served by TDS is limited. Any thoughts on that? What goals do you see as possible?

Specific technical background material on iRODS would be welcome also.

John

Mike Conway
DICE Center
Jargon, Java, Interface Developer
address@hidden <mailto:address@hidden>
------------------------------------------------
Google voice/video: address@hidden <mailto:address@hidden>
Skype: michael.c.conway


