Re: [netcdfgroup] Sort of Inverse of netcdf diskless mode

  • To: Roy Mendelssohn - NOAA Federal <roy.mendelssohn@xxxxxxxx>
  • Subject: Re: [netcdfgroup] Sort of Inverse of netcdf diskless mode
  • From: Doug Hunt <dhunt@xxxxxxxx>
  • Date: Thu, 20 Nov 2014 11:51:33 -0700
Hi Roy: It's not too big a deal; the netCDF 3 format is nicely orthogonal and easy to parse. The attached perl library is only around 350 lines long with adequate comments, but it gives you most of the basic functionality you need for reading.

Regards,

  Doug

On 11/20/14 10:36, Roy Mendelssohn - NOAA Federal wrote:
Hi Doug:

Thanks for the comments. I would be interested in seeing the perl code, just to see how complicated 
it is to do this (though I don’t really know perl). For my purposes, since this is something 
we will be distributing, we want to keep what we do to code that is “native” to the 
program, in this case R.

-Roy

On Nov 20, 2014, at 8:48 AM, Doug Hunt <dhunt@xxxxxxxx> wrote:

Hi Roy:  I had a similar problem earlier this year.  I wanted to open a netCDF 
3 file and read from it using the netCDF library when the file was presented to 
me in memory instead of as a disk file.

I spent some time examining the netCDF library source and talking to UNIDATA 
folks about this and determined that the library was too hard to change.  The 
interface depended upon passing in a file name and it would be too hard to 
modify it to add an 'ncopen' that took a pointer to memory instead.

What I ended up doing was writing a reader from scratch in perl (since this 
application was in perl using the PDL::NetCDF interface) that reads the raw 
netCDF3 binary format.
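
As a rough illustration of what reading the raw format involves, here is a minimal sketch in core perl (no PDL) that decodes only the fixed 8-byte prefix of a netCDF classic image held in a string: the 'CDF' magic, the version byte, and the big-endian record count. The name read_prefix and the hand-built test image are hypothetical and are not part of the attached library.

```perl
use strict;
use warnings;

# Decode the fixed 8-byte prefix of a netCDF classic image held in a string.
# Takes a reference to the string so the data is not copied.
sub read_prefix {
    my ($bytes_ref) = @_;
    die "too short to be netCDF\n" if length($$bytes_ref) < 8;

    # a3 = 3 raw bytes, C = unsigned byte, N = 32-bit big-endian unsigned
    my ($magic, $version, $numrecs) = unpack('a3 C N', $$bytes_ref);
    die "not a netCDF classic file\n" unless $magic eq 'CDF';

    return { version => $version, numrecs => $numrecs };
}

# Hypothetical stand-in for a real file image: magic, version 1, 3 records.
my $image = 'CDF' . pack('C', 1) . pack('N', 3);

my $hdr = read_prefix(\$image);
printf "version=%d numrecs=%d\n", $hdr->{version}, $hdr->{numrecs};
```

Everything after this prefix (dimension, attribute and variable lists) is read the same way: a 4-byte tag, a 4-byte count, then entries whose names are padded to a 4-byte boundary.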

This has proved to be a good lightweight solution for us, but it does restrict 
us to netCDF 3.

I'd be happy to share my perl netCDF 3 reader library if you are interested.

Incidentally, as a result of this experience, I came away with an appreciation 
for the elegance and good design of the netCDF 3 binary format and a suspicion 
of the netCDF 4 format as being too complex and having too much library 
overhead.

Regards,

  Doug

On 11/19/14 18:41, Roy Mendelssohn - NOAA Federal wrote:
Hi All:

Netcdf now has the option of  creating or reading a file entirely in memory, a 
nice option to have. I am wondering if there is any way to do sort of the 
inverse.  Some web libraries allow for a pure binary download of a file into 
memory, so what I have sitting there is essentially a binary image of the 
netcdf file.  Is there any way to open that image without writing to disk?

As an example, the R httr library allows this. I can, for example, download a netcdf file 
from ERDDAP to memory, and if I then do a binary save in httr and read it back in using the 
ncdf4 package, it all works great. But it would be even better if I could skip the extra 
steps and “open” the image in ncdf4 directly (ncdf4, btw, can do whatever the 
netcdf libraries can do, so it is a question of the base libraries).
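
The save-then-reopen workaround described above can be sketched roughly as follows. This is written in perl only to match the code attached elsewhere in this thread (the case described here is R with httr and ncdf4), and it assumes nothing beyond the core File::Temp module. The name spill_to_tempfile and the 8-byte stand-in image are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write an in-memory file image to a temporary file and return its path,
# so that a reader which only accepts file names can open it.
sub spill_to_tempfile {
    my ($bytes_ref) = @_;
    my ($fh, $path) = tempfile(SUFFIX => '.nc', UNLINK => 1);
    binmode $fh;                          # raw bytes, no newline translation
    print {$fh} $$bytes_ref or die "write failed: $!";
    close $fh or die "close failed: $!";
    return $path;                         # file is removed at program exit
}

# Hypothetical stand-in for a downloaded image: just the 8-byte prefix.
my $image = 'CDF' . pack('C', 1) . pack('N', 0);
my $path  = spill_to_tempfile(\$image);

# A file-based netCDF reader would be given $path here; this sketch only
# reads the bytes back to confirm the round trip.
open my $in, '<:raw', $path or die "open failed: $!";
my $roundtrip = do { local $/; <$in> };
close $in;
print "round trip ", ($roundtrip eq $image ? "ok" : "mismatch"), "\n";
```

The cost is one extra write and one extra read of the whole file, which is exactly the step an in-memory open would remove.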

Thanks,

-Roy


**********************
"The contents of this message do not reflect any position of the U.S. Government or 
NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new address and phone***
110 Shaffer Road
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: Roy.Mendelssohn@xxxxxxxx www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.

_______________________________________________
netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe,  visit: 
http://www.unidata.ucar.edu/mailing_lists/



#
##  Copyright (c) 1995-2013 University Corporation for Atmospheric Research
## All rights reserved
#
#/**----------------------------------------------------------------------
# @file       NC3.pm
#
# This module allows quick reading of netCDF 3 data from perl strings.
# See https://www.unidata.ucar.edu/software/netcdf/docs/classic_format_spec.html
# for detailed format specification.
#
# Example:
#   # $filetext holds the netCDF file data; pass a reference to it
#   my $nc   = NC3->new(\$filetext)->parse;
#   my $temp = $nc->get('temp'); # fetch the temperature from the NC3 object, $nc
#
# @author     Doug Hunt
# @since      11/12/2013
# @version    $URL$ $Id$
# -----------------------------------------------------------------------*/

package NC3;

# Constants

use constant NC_BYTE   => 1;
use constant NC_CHAR   => 2;
use constant NC_SHORT  => 3;
use constant NC_INT    => 4;
use constant NC_FLOAT  => 5;
use constant NC_DOUBLE => 6;

use vars qw(%ncsize %pdltype);

use PDL;

%ncsize  = (NC_BYTE()  => 1, NC_CHAR()   => 1, NC_SHORT() => 2, NC_INT() => 4,
            NC_FLOAT() => 4, NC_DOUBLE() => 8);
%pdltype = (NC_BYTE()  => PDL::byte,  NC_SHORT()  => PDL::short, NC_INT() => PDL::long,
            NC_FLOAT() => PDL::float, NC_DOUBLE() => PDL::double);

#/**----------------------------------------------------------------------
# @sub new
#
# Wrap a reference to a string containing netCDF 3 data.  Return a handle
# for future operations (call parse on it before reading).
#
# @parameter  $ncdata_r -- Reference to a perl string containing netCDF data.
# @return     NC3 object
# ----------------------------------------------------------------------*/
sub new {
  my $class    = shift;
  my $ncdata_r = shift;
  my %opts   = @_;

  return bless {DATA => $ncdata_r, %opts}, $class;
}

#/**----------------------------------------------------------------------
# @sub parse
#
# Parse the header of the netcdf file contained in the object.
#
# @parameter  $self -- NC3 object
# @return     none (header parsed, info stored in object)
# ----------------------------------------------------------------------*/
sub parse {
  my $self = shift;

  my $d = $self->{DATA}; # ref to scalar
  my $idx = 0;

  die "Not a netCDF file" if (substr($$d, $idx, 3) ne 'CDF'); $idx+=3;

  $self->{VERSION} = unpack ("C", substr($$d, $idx, 1)); $idx++;
  die "Bad version number: $self->{VERSION}" if ($self->{VERSION} != 1 && $self->{VERSION} != 2);

  $self->{NUMRECS} = unpack ("N", substr($$d, $idx, 4)); $idx+=4;
  die "Streaming not supported" if ($self->{NUMRECS} == 0xffffffff);

  $self->{DIMS} = parse_dimlist_ ($d, \$idx);
  $self->{GATS} = parse_attlist_ ($d, \$idx);
  ($self->{VARS}, $self->{VARNAMES}) = parse_varlist_ ($d, \$idx);

  return $self;
}


#/**----------------------------------------------------------------------
# @sub get
#
# Get a variable by name
#
# @parameter  $self -- NC3 object
# @           $name -- The name of the variable to fetch
# @return     $variable PDL or string
# ----------------------------------------------------------------------*/
sub get {
  my $self = shift;
  my $name = shift;

  my ($nctype, $vsize, $offset, $dims) = @{$self->{VARS}{$name}}[1, 2, 3, 5];

  # Return NC_CHAR as a perl string
  if      ($nctype == NC_CHAR) {
    return substr (${$self->{DATA}}, $offset, $vsize);
  }

  my @dimlens = map { $self->{DIMS}[$_][1] } @$dims;

  # PDL variables vary fastest in the *leftmost* dimension (FORTRAN style).
  # NetCDF variables vary fastest in the *rightmost* dimension (C style).  Thus the 'reverse'.
  my $pdl = PDL->zeroes($pdltype{$nctype}, reverse @dimlens);

  # Create a PDL variable from the DATA field, the offset and the vsize
  $pdl->make_physical;
  ${$pdl->get_dataref} = substr(${$self->{DATA}}, $offset, $vsize);
  $pdl->upd_data;
  $pdl->bswap2 if ($ncsize{$nctype} == 2);
  $pdl->bswap4 if ($ncsize{$nctype} == 4);
  $pdl->bswap8 if ($ncsize{$nctype} == 8);
  return $pdl;
}


#/**----------------------------------------------------------------------
# @sub getatt
#
# Get an attribute by name, either global or belonging to a variable
#
# @parameter  $self -- NC3 object
# @           $name -- The name of the attribute to fetch
# @           $varname -- The name of the variable this attribute is associated
# @                       with.  UNDEF means a global attribute.
# @return     $variable -- PDL or string
# ----------------------------------------------------------------------*/
sub getatt {
  my $self    = shift;
  my $name    = shift;
  my $varname = shift // '';

  my ($offset, $nctype, $nelem);
  if ($varname) { # variable attribute
    ($offset, $nctype, $nelem) = @{$self->{VARS}{$varname}[4]{$name}};
  } else { # global attribute
    ($offset, $nctype, $nelem) = @{$self->{GATS}{$name}};
  }
  return unpack_attribute_($self->{DATA}, $offset, $nctype, $nelem);
}


#/**----------------------------------------------------------------------
# @sub getvariablenames
#
# Get a list of all variable names in the netCDF data in definition order.
#
# @parameter  $self -- NC3 object
# @return     $varnames -- Reference to perl list of variable names
# ----------------------------------------------------------------------*/
sub getvariablenames {
  my $self    = shift;
  return $self->{VARNAMES};
}

#/**----------------------------------------------------------------------
# @sub getdimensionnames
#
# Get a list of all dimension names (in order) from the entire netCDF
# data set.  If a variable name is specified, then return just the
# dimension names (in order) for that variable.
#
# @parameter  $self -- NC3 object
# @           $var  -- Name of a variable, or undef
# @return     $dimnames -- Reference to perl list of dimension names
# ----------------------------------------------------------------------*/
sub getdimensionnames {
  my $self = shift;
  my $var  = shift // '';

  if ($var) { # just the dimension names for this variable
    my @dims = @{$self->{DIMS}};
    return [map { $dims[$_][0] } @{$self->{VARS}{$var}[5]}];
  } else { # all dimension names in file
    my $ndims = scalar (@{$self->{DIMS}});
    return [map { $self->{DIMS}[$_][0] } (0..$ndims-1)];
  }

}


#-------------------------------------------------------------------------
## Internal functions
#-------------------------------------------------------------------------

#/**----------------------------------------------------------------------
# @sub parse_dimlist_
#
# Internal function:  Parse a dimension list
#
# @parameter  $d    -- Data ref
# @           $$idx -- Reference to index into data string to start of dimension list
# @return     $dims -- [[NAME1 => dimlength1], [NAME2, dimlength2], ...]
# ----------------------------------------------------------------------*/
sub parse_dimlist_ {
  my $d   = shift;
  my $idx = shift;

  my $id  = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;

  if ($id == 0x00000000) {
    $$idx +=4;
    return [];  # absent dims list
  } elsif ($id != 0x0000000A) {
    die "Expecting ID = NC_DIMENSION (0x0A), found ID = $id";
  }

  my $nelems = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;

  my @dims;
  for (my $i=0;$i<$nelems;$i++) {
    my $namelen = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
    my $name    = substr ($$d, $$idx, $namelen); $$idx+=$namelen;
    my $padlen  = ($namelen % 4) == 0 ? 0 : 4 - ($namelen % 4);
    $$idx += $padlen;
    my $dimlen  = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
    push (@dims, [$name, $dimlen]);
  }

  return \@dims;
}


#/**----------------------------------------------------------------------
# @sub parse_attlist_
#
# Internal function:  Parse an attribute list
#
# @parameter  $d    -- Data ref
# @           $$idx -- Reference to index into data string to start of attribute list
# @return     $atts -- {NAME1 => [OFFSET, NCTYPE, NELEM], NAME2 => ...}
# ----------------------------------------------------------------------*/
sub parse_attlist_ {
  my $d   = shift;
  my $idx = shift;

  my $id  = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;

  if ($id == 0x00000000) {
    $$idx +=4;
    return {};  # absent atts list (hash ref, as in the non-empty case)
  } elsif ($id != 0x0000000C) {
    die "Expecting ID = NC_ATTRIBUTE (0x0C), found ID = $id";
  }

  my $nelems = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;

  my %atts;
  for (my $i=0;$i<$nelems;$i++) {
    my $namelen  = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
    my $name     = substr ($$d, $$idx, $namelen); $$idx+=$namelen;
    my $padlen  = ($namelen % 4) == 0 ? 0 : 4 - ($namelen % 4);
    $$idx += $padlen;
    my ($nctype, $nelem) = unpack ("NN", substr ($$d, $$idx, 8)); $$idx+=8;
    $atts{$name} = [$$idx, $nctype, $nelem];
    my $size = $nelem*$ncsize{$nctype};
    $padlen  = ($size % 4) == 0 ? 0 : 4 - ($size % 4);
    $$idx += ($size + $padlen);
  }

  return \%atts;
}


#/**----------------------------------------------------------------------
# @sub parse_varlist_
#
# Internal function:  Parse a variable list
#
# @parameter  $d    -- Data ref
# @           $$idx -- Reference to index into data string to start of variable list
# @return     $vars -- {NAME1 => [RANK, NCTYPE, VSIZE, OFFSET, {ATTRS}, [DIMIDS]], NAME2 => ...}
# @           $varnames -- [NAME1, NAME2, ...] in definition order
# ----------------------------------------------------------------------*/
sub parse_varlist_ {
  my $d   = shift;
  my $idx = shift;

  my $id  = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;

  if ($id == 0x00000000) {
    $$idx +=4;
    return ({}, []);  # absent var list (same two values as the non-empty case)
  } elsif ($id != 0x0000000B) {
    die "Expecting ID = NC_VARIABLE (0x0B), found ID = $id";
  }

  my $nvars = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;

  my %vars;
  my @vars;
  for (my $i=0;$i<$nvars;$i++) {
    my $namelen  = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
    my $name     = substr ($$d, $$idx, $namelen); $$idx+=$namelen;
    my $padlen  = ($namelen % 4) == 0 ? 0 : 4 - ($namelen % 4);
    $$idx += $padlen;

    my @dims;
    my $rank     = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
    for (my $j=0;$j<$rank;$j++) {
      $dims[$j] = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
    }
    my $atts   = parse_attlist_($d, $idx);
    my $nctype = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
    my $vsize  = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
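    # The begin offset is 4 bytes in version 1 (classic) files and 8 bytes in
    # version 2 (64-bit offset) files; the version byte follows the 'CDF' magic.
    my $offset;
    if (unpack ("C", substr ($$d, 3, 1)) == 2) {
      my ($hi, $lo) = unpack ("NN", substr ($$d, $$idx, 8)); $$idx+=8;
      $offset = $hi * 4294967296 + $lo;
    } else {
      $offset = unpack ("N", substr ($$d, $$idx, 4)); $$idx+=4;
    }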

    $vars{$name} = [$rank, $nctype, $vsize, $offset, $atts, \@dims];
    push (@vars, $name); # keep an ordered list as well as a hash
  }

  return (\%vars, \@vars);
}


#/**----------------------------------------------------------------------
# @sub unpack_attribute_
#
# Internal function:  Unpack an attribute into a PDL
#
# @parameter  $d      -- Data ref
# @           $offset -- Byte offset into data string of the start of the attribute values
# @           $nctype -- NetCDF data type: NC_BYTE | NC_CHAR | NC_SHORT | NC_INT | NC_FLOAT | NC_DOUBLE (1-6)
# @           $nelem  -- Number of elements
# @return     $pdl    -- A PDL of the correct type containing the values or a string (NC_CHAR)
# ----------------------------------------------------------------------*/
sub unpack_attribute_ {
  my $d      = shift;
  my $offset = shift;
  my $nctype = shift;
  my $nelem  = shift;

  # Return NC_CHAR as a perl string
  if      ($nctype == NC_CHAR) {
    return substr ($$d, $offset, $nelem);
  }

  my $pdl = PDL->zeroes($pdltype{$nctype}, $nelem);
  $pdl->make_physical;
  my $size = $nelem*$ncsize{$nctype};
  ${$pdl->get_dataref} = substr($$d, $offset, $size);
  $pdl->upd_data;
  $pdl->bswap2 if ($ncsize{$nctype} == 2);
  $pdl->bswap4 if ($ncsize{$nctype} == 4);
  $pdl->bswap8 if ($ncsize{$nctype} == 8);

  return $pdl;
}

1;  # a perl module must end with a true value
