
RE: 20050922:DODS/THREDDS confusion



I'm happy to work on it... though I'm having trouble understanding what the
STRING_AS_ARRAY stuff is there for.  Is this an attempt to fix the problem?
With debugging turned on, I don't see many of its parts being invoked when
strings are encountered (such as in NCConnect::parse_array_dims()).


-----Original Message-----
From: address@hidden
[mailto:address@hidden] On Behalf Of James Gallagher
Sent: Thursday, February 23, 2006 4:52 PM
To: David Wojtowicz
Cc: 'Ethan Davis'; address@hidden;
address@hidden; 'support-thredds'
Subject: Re: 20050922:DODS/THREDDS confusion

David,

It would be fantastic if you could look into fixing this! If you need  
some help, let me know. After the 6th of March I should be able to  
really dive into this.

James

On Feb 23, 2006, at 2:35 PM, David Wojtowicz wrote:

> I've pinned the problems down somewhat.
>
> The actual netCDF file that holds the data on the TDS has string data
> that looks like this (from dncdump on the local file):
>
> ---------------------------------
> dimensions:
>    station = 4818;
>    id_len = 4;
> variables:
>    char station_id(station, id_len) ;
>       station_id:long_name = "Station id" ;
>       station_id:reference = "sfmetar_sa.tbl" ;
> ---------------------------------
>
>
>
> However, the TDS represents it via DAP as (pertinent bits of the DDS
> and DAS):
>
>
> ---------------------------------
> Dataset {
>     String station_id[station = 4818];
> }
> Attributes {
>     station_id {
>         String long_name "Station id";
>         String reference "sfmetar_sa.tbl";
>         DODS {
>             Int32 strlen 4;
>             String dimName id_len;
>         }
>     }
> }
> ---------------------------------
>
> So, the two-dimensional character array has been converted to a
> one-dimensional array of strings, with the string length (strlen) and
> the name of the dropped dimension (dimName) stored in the nested
> attribute "DODS".
>
> When you do dncdump on the dods URL, you get:
>
> ---------------------------------
> dimensions:
>         station = 4818 ;
> variables:
>         char station_id(station) ;
>                 station_id:long_name = "Station id" ;
>                 station_id:reference = "sfmetar_sa.tbl" ;
>                 station_id:DODS =
> dncdump: Attribute not found  (program terminates)
> ----------------------------------
>
>
> Dncdump aborts when it gets to the nested attribute "DODS" because a
> nested attribute is represented as a container, and the Attr_container
> type is something that the code in Dattr.cc can't deal with.
>
>
> Now, even if it could deal with nested attributes, there'd still be the
> problem of figuring out what to do with the strings.
>
> I temporarily hacked around the Attr_container problem and had it
> pretend the container was an empty character array, i.e.
> station_id:DODS = "" ; essentially making it harmless to the rest of
> the netCDF API.
>
> dncdump then ran through without aborting, but if I asked it to dump
> "station_id" (-v station_id option) it got the strings all wrong
> because the second dimension, id_len, isn't there.
>
> So, there are two problems:
>
>  1) Can't deal with nested attributes (i.e. containers)
>  2) Can't deal with DAP strings
>
>
> Ideally the output from the DAP network version should look just like
> the output from the local file version, so that the differences are
> transparent to the user.
>
> This would involve fixing the attribute handling code so that it does
> not pass the nested attribute containers through to the netCDF API
> (which is how the Java version behaves).
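
A rough sketch of that first fix, in Python rather than the C++ of the
client library, just to show the idea. The dict layout is a made-up
stand-in for the DAS shown earlier, and split_attributes is a
hypothetical helper, not a function in libdap or the nc client library:

```python
# Hypothetical in-memory form of the DAS for one variable: plain
# attributes map to values, nested containers map to dicts.
das_station_id = {
    "long_name": "Station id",
    "reference": "sfmetar_sa.tbl",
    "DODS": {"strlen": 4, "dimName": "id_len"},
}


def split_attributes(attrs):
    """Separate flat attributes from nested containers.

    Returns (flat, containers). `flat` is what a netCDF-style client
    would expose as ordinary attributes; `containers` holds the nested
    tables, kept aside so the client can still read hints from them.
    """
    flat = {}
    containers = {}
    for name, value in attrs.items():
        if isinstance(value, dict):
            containers[name] = value
        else:
            flat[name] = value
    return flat, containers


flat, containers = split_attributes(das_station_id)
hint = containers.get("DODS", {})
print(sorted(flat))
print(hint.get("strlen"), hint.get("dimName"))
```

The container is held back rather than thrown away because the strlen
and dimName hints inside it are what the string handling would need.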
>
> It would also involve converting the string arrays back to
> two-dimensional character arrays as far as the netCDF API is concerned.
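
As an illustration only, here is a Python sketch of that conversion,
assuming the client has already read strlen from the "DODS" container.
The function name, the NUL padding, and the sample ids are assumptions
of the sketch, not behavior taken from the existing code:

```python
def strings_to_char_rows(strings, strlen, pad="\0"):
    """Turn a 1-D list of strings into fixed-width rows of characters.

    Each string is truncated or padded to exactly `strlen` characters,
    giving the (station, id_len) layout a netCDF-3 reader expects.
    """
    rows = []
    for s in strings:
        fixed = s[:strlen].ljust(strlen, pad)
        rows.append(list(fixed))
    return rows


# Made-up station ids; the second one is shorter than strlen on purpose.
dap_strings = ["KCMI", "ABC"]
rows = strings_to_char_rows(dap_strings, strlen=4)

# Shape is now (number of strings, strlen).
print(len(rows), len(rows[0]))

# Joining each row and stripping the padding recovers the strings.
recovered = ["".join(row).rstrip("\0") for row in rows]
print(recovered == dap_strings)
```

The second dimension would be reported under the name given by dimName
(id_len here), so that a dump of the remote dataset matches the local one.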
>
>
> I've already been hacking around in the code and thinking about this
> for a while.  If you'd like, I could take a first shot at it.
>
>
> -----
> David Wojtowicz, Sr. Research Programmer, Sysadmin
> Dept of Atmospheric Sciences / Computer Services
> University of Illinois at Urbana-Champaign
> address@hidden  (217) 333-8390
>
> -----Original Message-----
> From: James Gallagher [mailto:address@hidden]
> Sent: Thursday, February 23, 2006 10:14 AM
> To: Ethan Davis
> Cc: David Wojtowicz; address@hidden;
> address@hidden; 'support-thredds'
> Subject: Re: 20050922:DODS/THREDDS confusion
>
> This is in Trac as ticket 320. I'll try to get a fix out by the 6th
> of March.
>
> James
>
> On Feb 23, 2006, at 7:02 AM, Ethan Davis wrote:
>
>> Hi David,
>>
>> Yeah, this has been an issue for some time (it seems to have started
>> on this list in 2003:
>> http://www.unidata.ucar.edu/support/help/MailArchives/dods-tech/msg01910.html).
>>
>> The underlying issue is that netCDF-3 doesn't understand String, it
>> only knows char and arrays of char, and OPeNDAP knows String but
>> not char. So, the mapping between netCDF-3 and OPeNDAP is a bit
>> thorny. The current nc servers map char arrays to String arrays (so
>> each String contains one character). The TDS maps char arrays into
>> Strings. The problem is that the nc client library doesn't know how
>> to map Strings back into char arrays just from the DDS/DAS because
>> the length of Strings can vary. (The TDS contains some hints in the
>> DAS; that is how the netCDF-java library knows how to deal with them.)
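
To make the two mappings concrete, a small Python sketch follows. The
sample data is invented, the function names are hypothetical, and
stripping the trailing padding in the second mapping is an assumption
about what a server might do, not a statement about the TDS:

```python
# Hypothetical rows of a netCDF-3 char(station, id_len) variable with
# id_len = 4. "\0" marks unused trailing positions.
char_rows = [
    ["A", "B", "\0", "\0"],
    ["C", "\0", "\0", "\0"],
]


def one_char_per_string(rows):
    """First mapping: every char becomes its own one-character String."""
    return [[ch for ch in row] for row in rows]


def one_string_per_row(rows, pad="\0"):
    """Second mapping: each row collapses into a single String."""
    return ["".join(row).rstrip(pad) for row in rows]


per_char = one_char_per_string(char_rows)
per_row = one_string_per_row(char_rows)

# The first mapping keeps both dimensions. The second keeps only the
# station dimension, and the widest string need not equal id_len.
widest = max(len(s) for s in per_row)
print(len(per_char[0]), widest)
```

In this example the widest string is shorter than the original id_len,
which is why a client cannot rebuild the char array from the data alone
and needs a hint such as the one carried in the DAS.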
>>
>> I don't know if this comes up anywhere else. But it will be an
>> issue for any OPeNDAP dataset that contains String arrays. Do any
>> of the other servers map things to String arrays?
>>
>> Sorry I don't actually have any answers but since this is in some
>> ways an issue for the TDS, I'll look into this some more.
>>
>> Ethan
>
> --
> James Gallagher                jgallagher at opendap.org
> OPeNDAP, Inc                   406.723.8663
>
>

--
James Gallagher                jgallagher at opendap.org
OPeNDAP, Inc                   406.723.8663