Seeking .nc advice for sequence data

To: netcdf-java@xxxxxxxxxxxxxxxx
Subject: Seeking .nc advice for sequence data
From: "Bob Simons" <Bob.Simons@xxxxxxxx>
Date: Wed, 21 Dec 2005 15:24:38 -0800

I am trying to figure out how to store lots of sequence-like data in .nc files for efficient access via OPeNDAP. In particular, I am trying to determine whether actual OPeNDAP Sequences (Structures with an unlimited dimension in the .nc file) are appropriate for our purposes.

Yes, I could store the data in a file on the computer where the program needing access is running, and not have to access it via OPeNDAP, so that network transmission time would be minimized. But this project is partly an experiment in dealing with remotely accessed data. So I am trying to design a solution where the data is accessed from another computer via OPeNDAP.

Here's an example. Let's say I want to store all NDBC buoy data in a .nc file. There are over 100 buoys. For each buoy, there are readings for some time period (e.g., just 1989, or from 1990 to the present). The readings are an hour apart. Several variables (e.g., WindSpeed and WindDirection) are measured at each time point. Since we work with real-time data, I plan to update this file frequently (every day, but ideally every hour).


The problem is, I need to have *quick* access via OPeNDAP:

* Across all buoys at a specific time point, e.g., what is the wind speed at all buoys at 2004-12-14T09:00Z?
* Or, across all available time points at a specific buoy, e.g., what is the wind speed at one buoy for every time point?

Regarding the first requirement, from what I understand, if I use sequences, there is no way to get the data for a given time point without reading either the whole file up to that time point or a whole variable. Either would seem to take too long if I want the values for 100 buoys, given that I am using OPeNDAP to connect to a remote computer and need the response quickly for my CoastWatch Browser program, which graphs the data for on-line users who expect a quick response.
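To illustrate the contrast with sequences, here is a minimal sketch of how both queries could be expressed as array subsets, assuming a hypothetical array variable windSpeed[time][buoy] on a regular hourly time axis. The variable name, the dimension order, and the helper functions are assumptions for illustration only; the bracketed ranges follow the OPeNDAP (DAP2) start:stride:stop array constraint syntax.

```python
from datetime import datetime, timezone

def hour_index(t, t0):
    """Index of time t on an hourly axis whose first sample is at t0."""
    seconds = (t - t0).total_seconds()
    if seconds < 0 or seconds % 3600 != 0:
        raise ValueError("time is not on the hourly axis")
    return int(seconds // 3600)

def all_buoys_at(var, t_index, n_buoys):
    """Constraint for one time step across every buoy."""
    return f"{var}[{t_index}][0:1:{n_buoys - 1}]"

def one_buoy_all_times(var, buoy_index, n_times):
    """Constraint for every time step at one buoy."""
    return f"{var}[0:1:{n_times - 1}][{buoy_index}]"

if __name__ == "__main__":
    # Assumed axis start; a real file would record this in the time variable.
    t0 = datetime(2004, 1, 1, tzinfo=timezone.utc)
    t = datetime(2004, 12, 14, 9, tzinfo=timezone.utc)
    print(all_buoys_at("windSpeed", hour_index(t, t0), 100))
    print(one_buoy_all_times("windSpeed", 7, 24))
```

With this kind of layout the server only has to read the requested slab, because the time index is computed by arithmetic rather than found by scanning records.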

Since the time range of available data for each buoy varies greatly, it seems grossly wasteful of space to have a common Time dimension for all buoys. Doing so would probably force me over the 2GB file size, which is generally trouble. So I am thinking about either:

* A time dimension for each buoy (e.g., time14978 for buoy 14978) and several variables which use that dimension to store the data for that buoy (e.g., windSpeed14978, windDirection14978, etc.). This setup would be replicated for each buoy.
* Or, a Group for each buoy, again with a time dimension and several variables in each group to store the data for each buoy. (If this is a new .nc feature, does OPeNDAP deal with this yet?)
* Or, an ArrayObject.1D of variables, each element of which is an ArrayObject.1D of the variables for a given buoy. (I'm not sure if this can be done.)
* Or, an ArrayObject.2D of variables, with buoys as one dimension and the various variables (e.g., WindSpeed, WindDirection) on the other dimension. (I'm not sure if this can be done.)
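The space concern with a common Time dimension can be estimated with simple arithmetic. The sketch below compares a dense buoy-by-time grid against per-buoy time dimensions. The 4-byte value size, the 365-day year, and all the example inputs (record lengths, variable count) are assumptions chosen only to show the calculation, not actual NDBC figures.

```python
HOURS_PER_YEAR = 24 * 365   # ignoring leap days for a rough estimate
BYTES_PER_VALUE = 4         # assuming 32-bit floats

def dense_bytes(n_buoys, n_years, n_vars):
    """Size if every buoy shares one time axis spanning n_years."""
    return n_buoys * n_years * HOURS_PER_YEAR * n_vars * BYTES_PER_VALUE

def ragged_bytes(years_per_buoy, n_vars):
    """Size if each buoy stores only its own years of data."""
    return sum(years_per_buoy) * HOURS_PER_YEAR * n_vars * BYTES_PER_VALUE

if __name__ == "__main__":
    # Hypothetical mix: a few long records, many short ones.
    years_per_buoy = [30] * 10 + [5] * 90
    n_vars = 10
    dense = dense_bytes(len(years_per_buoy), max(years_per_buoy), n_vars)
    ragged = ragged_bytes(years_per_buoy, n_vars)
    print(f"dense:  {dense / 1e9:.2f} GB")
    print(f"ragged: {ragged / 1e9:.2f} GB")
```

The dense size grows with the longest record multiplied by the number of buoys, while the per-buoy size grows only with the data that actually exists.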

I plan to solve the updating problem by leaving rows of missing values at the end of the data for each active buoy. As new data comes in, I will replace the missing values with actual data. Then, I only have to rewrite the file (to add more rows of missing values) once in a while, not every time.
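The padding idea can be sketched in memory as follows. This is only an illustration of the bookkeeping, not netCDF code: the PaddedSeries class, the MISSING fill value, and the default padding size are all hypothetical, and the "rewrite" counter stands in for the expensive step of rewriting the file with more padding rows.

```python
MISSING = -9999.0  # assumed fill value; a real file would declare its own

class PaddedSeries:
    """One buoy's variable, stored with spare rows of missing values."""

    def __init__(self, values, pad_rows):
        self.n_valid = len(values)
        self.data = list(values) + [MISSING] * pad_rows
        self.rewrites = 0  # how often the padding had to be extended

    def append(self, value, pad_rows=24):
        if self.n_valid == len(self.data):
            # Padding exhausted: the costly case, done only once in a while.
            self.data.extend([MISSING] * pad_rows)
            self.rewrites += 1
        # Common case: overwrite the next missing value in place.
        self.data[self.n_valid] = value
        self.n_valid += 1

if __name__ == "__main__":
    series = PaddedSeries([5.1, 4.8], pad_rows=24)
    for reading in [5.0, 5.3, 6.1]:
        series.append(reading)
    print(series.n_valid, series.rewrites)
```

Readers of such a file would need to treat trailing missing values as "not yet observed", so the count of valid rows (or the fill value itself) has to be discoverable by the client.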

Which approach sounds best? Is there another approach? Do you have any advice?

Are sequences the wrong way to go? Of course, that could change if one could efficiently access specific ranges from variables in a Sequence/Structure. But it is my understanding that that is not currently possible.

Although I gave this specific example, we store a lot of sequence-like data where I work. Whatever .nc file structure is appropriate for the buoys will likely be appropriate for much of this other data. So I want to get it right.


Thank you.


Sincerely,

Bob Simons
Satellite Data Product Manager
Environmental Research Division
NOAA Southwest Fisheries Science Center
1352 Lighthouse Ave
Pacific Grove, CA 93950-2079
(831)658-3205
bob.simons@xxxxxxxx
<>< <>< <>< <>< <>< <>< <>< <>< <><

Follow-Ups:
- Re: Seeking .nc advice for sequence data
  - From: John Caron
