
[netCDFJava #HIP-291542]: Question on NcML aggregation



Matt,

Let me see if I understand what you're trying to do:  For your aggregation, are 
you trying to scan a remote directory somehow, as in, somewhere on another 
machine?  Or are you trying to look somewhere else up a local directory tree?  
Depending on the operating system (Linux/Mac/Windows), the path name can be 
tricky.  What system are you on?
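
For reference, here is a minimal sketch of a scan that points at an absolute
local directory rather than the directory the NcML file sits in.  The path
/data/bml/ is only a placeholder, so substitute wherever your copies of the
.nc files live.  As far as I know, scan walks a directory on a file system,
so I would not expect a THREDDS dodsC URL to work as the location.

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <!-- /data/bml/ is a hypothetical absolute path; replace it with your own -->
    <scan location="/data/bml/" regExp=".*DOU001_000.*\.nc$" />
  </aggregation>
</netcdf>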

-Lansing

> Li,
> 
> There are four buoys and four station names. The joinExisting seems to be
> working correctly: I was able to use MATLAB to point at the NcML and load
> in a subset of the data. However, MATLAB wouldn't allow me to load the
> entire 2290302 values for all the DOU* files.
> 
> Lansing,
> 
> I've also been working on using the scan element to use a regular
> expression to aggregate all the appropriate files. I have been able to get
> it working in a local directory where I have a copy of the .nc files. Is it
> possible to get the scan element to look somewhere else? I've tried to
> adjust the location to the appropriate path, but none of them are working.
> I was thinking of trying to get it to scan the
> http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/ directory for a regular
> expression. Below is an example of the working regEx ncml, using local
> files.
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
> <aggregation dimName="time" type="joinExisting">
> <scan location="" regExp=".*DOU001_000.*" suffix=".nc" />
> 
> </aggregation>
> </netcdf>
> 
> I might be out of my league trying to do something like this.
> 
> Thanks,
> 
> Matt
> 
> 
> 
> address@hidden> wrote:
> 
> > Hi Lansing,
> >
> > It would be great if we could set up different buoys in one NcML
> > aggregation.
> >
> > Hi Matt,
> > Do you have any idea how many buoys and station names there are across
> > all the files?
> >
> >
> > Thanks,
> > Li
> >
> > address@hidden> wrote:
> >
> >> Li,
> >>
> >> After looking at the files you are serving up, I think joinExisting is
> >> the way to go.  You'll need to set up different NcML aggregations for the
> >> different buoys (DOU and SEF, it looks like).  Is there some other
> >> information you thought users would be selecting?  I'm not entirely sure
> >> what you mean in your comments below on joinExisting.
> >>
> >> -Lansing
> >>
> >> > Thanks Lansing and Matt!
> >> >
> >> > I put the three kinds of aggregation on the test THREDDS server.
> >> > Here are a few comments.
> >> >
> >> > http://data.nodc.noaa.gov/thredds/catalog/testdata/20130328/catalog.html
> >> >
> >> > joinNew: There is "timeagg" in the OPeNDAP selection, but the actual
> >> > temperature values are extracted from the first timeagg, no matter which
> >> > timeagg we choose.
> >> >
> >> > joinExisting: this one picks up all the values for all the joined files,
> >> > though there is no list of station_name, platform1, or instrument for
> >> > users to select.
> >> >
> >> > Union: this one picks up only the first file in the aggregation, and
> >> > there is no way to select the files after it.
> >> >
> >> > The joinExisting is the way to go.
> >> >
> >> > Thanks,
> >> > Li
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > address@hidden> wrote:
> >> >
> >> > > Matt,
> >> > >
> >> > > I set up an aggregation using four of the files.  At its simplest,
> >> > > this works to join the datasets along the existing time dimension:
> >> > >
> >> > > <?xml version="1.0" encoding="UTF-8"?>
> >> > > <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
> >> > >   <aggregation dimName="time" type="joinExisting">
> >> > >     <netcdf id="1" location="DOU001_000_20070530_20070906.nc"/>
> >> > >     <netcdf id="2" location="DOU001_000_20070906_20071107.nc"/>
> >> > >     <netcdf id="3" location="DOU001_000_20080807_20081204.nc"/>
> >> > >     <netcdf id="4" location="DOU001_000_20100803_20101117.nc"/>
> >> > >   </aggregation>
> >> > > </netcdf>
> >> > >
> >> > > Since you already have a good dimension for aggregating (time), then
> >> > > you probably don't need to worry about doing a joinNew.  You would
> >> > > use that if there were no time data in the files, and you wanted to
> >> > > add it into the dataset you were serving, for instance.
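> >> > >
> >> > > As a rough sketch of that other case (not something your files
> >> > > need), a joinNew could look like the following.  The file names,
> >> > > the variable name "T", and the coordValue numbers are all made up
> >> > > for illustration:
> >> > >
> >> > > <?xml version="1.0" encoding="UTF-8"?>
> >> > > <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
> >> > >   <aggregation dimName="time" type="joinNew">
> >> > >     <!-- hypothetical variable to stack along the new dimension -->
> >> > >     <variableAgg name="T"/>
> >> > >     <!-- coordValue supplies the new coordinate value for each file -->
> >> > >     <netcdf location="jan.nc" coordValue="0"/>
> >> > >     <netcdf location="feb.nc" coordValue="31"/>
> >> > >   </aggregation>
> >> > > </netcdf>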
> >> > >
> >> > > Let me know if there is something else you are trying to do; I'm
> >> > > happy to help.
> >> > >
> >> > > -Lansing
> >> > >
> >> > > > Lansing,
> >> > > >
> >> > > > When I set it up using the "joinExisting" on the time dimension it
> >> > > > seems to aggregate them. Below is the ncml:
> >> > > > <?xml version="1.0" encoding="UTF-8"?>
> >> > > > <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
> >> > > >   <variable name="lat">
> >> > > >     <attribute name="standard_name" type="String" value="latitude"/>
> >> > > >     <attribute name="axis" type="string" value="Y"/>
> >> > > >     <attribute name="units" type="string" value="degrees_north"/>
> >> > > >   </variable>
> >> > > >   <variable name="lon">
> >> > > >     <attribute name="standard_name" type="string" value="longitude"/>
> >> > > >     <attribute name="axis" type="string" value="X"/>
> >> > > >     <attribute name="units" type="string" value="degrees_east"/>
> >> > > >   </variable>
> >> > > >   <variable name="time">
> >> > > >     <attribute name="standard_name" type="string" value="time"/>
> >> > > >     <attribute name="axis" type="string" value="T"/>
> >> > > >     <attribute name="units" type="string" value="seconds since 1970-01-01 00:00:00"/>
> >> > > >   </variable>
> >> > > >   <aggregation dimName="time" type="joinExisting">
> >> > > >     <netcdf id="1" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20070530_20070906.nc"/>
> >> > > >     <netcdf id="2" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20070906_20071107.nc"/>
> >> > > >     <netcdf id="3" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20080807_20081204.nc"/>
> >> > > >     <netcdf id="4" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20100803_20101117.nc"/>
> >> > > >     <netcdf id="5" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20101117_20110318.nc"/>
> >> > > >   </aggregation>
> >> > > > </netcdf>
> >> > > >
> >> > > > Is there a reason the joinExisting would work, but not the joinNew?  I
> >> > > > thought joinNew would be able to aggregate anything, since you're
> >> > > > creating a new variable to aggregate them with.
> >> > > >
> >> > > > I apologize if this is something simple that I am just missing; I am
> >> > > > just starting to get my bearings with this.
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > Matt
> >> > > >
> >> > > >
> >> > > > address@hidden> wrote:
> >> > > >
> >> > > > > Lansing,
> >> > > > >
> >> > > > > Below are a few of the files I am trying to aggregate.
> >> > > > >
> >> > > > >
> >> > > > > http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20070530_20070906.nc.html
> >> > > > >
> >> > > > > http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20070906_20071107.nc.html
> >> > > > >
> >> > > > > http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20080807_20081204.nc.html
> >> > > > >
> >> > > > > http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20100803_20101117.nc.html
> >> > > > >
> >> > > > > http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20101117_20110318.nc.html
> >> > > > >
> >> > > > > Here is the ncml for the joinNew I've been fiddling around with:
> >> > > > >
> >> > > > > <?xml version="1.0" encoding="UTF-8"?>
> >> > > > > <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
> >> > > > >   <!-- <variable name="time" type="int">
> >> > > > >     <attribute name="units" type="string" value="seconds since 1970-01-01 00:00:00"/>
> >> > > > >     <attribute name="_CoordinateAxisType" value="Time" />
> >> > > > >   </variable> -->
> >> > > > >   <aggregation dimName="timeagg" type="joinNew">
> >> > > > >     <variableAgg name="A"/>
> >> > > > >     <netcdf id="1" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20070530_20070906.nc"/>
> >> > > > >     <netcdf id="2" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20070906_20071107.nc"/>
> >> > > > >     <netcdf id="3" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20080807_20081204.nc"/>
> >> > > > >     <netcdf id="4" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20100803_20101117.nc"/>
> >> > > > >     <netcdf id="5" location="http://data.nodc.noaa.gov/thredds/dodsC/nmsp/bml/DOU001_000_20101117_20110318.nc"/>
> >> > > > >   </aggregation>
> >> > > > > </netcdf>
> >> > > > >
> >> > > > > For some reason the Aggregation Variables are not being listed...
> >> > > > > Am I completely missing something?
> >> > > > > I truly appreciate your assistance in this matter. :)
> >> > > > >
> >> > > > > Thanks,
> >> > > > >
> >> > > > > Matt
> >> > > > >
> >> > > > >
> >> > > > > address@hidden> wrote:
> >> > > > >
> >> > > > >> Matt,
> >> > > > >>
> >> > > > >> If you send me a few of the files, I can set up the aggregation and
> >> > > > >> see if there are any pitfalls in the data.  It's not always
> >> > > > >> straightforward.
> >> > > > >>
> >> > > > >> -Lansing
> >> > > > >>
> >> > > > >> > Lansing,
> >> > > > >> >
> >> > > > >> > The times "should be" unified, but I should look into that.
> >> > > > >> > Probably, to avoid complications, a joinNew would suffice in this
> >> > > > >> > case.
> >> > > > >> >
> >> > > > >> > Matt
> >> > > > >> >
> >> > > > >> >
> >> > > > >> > address@hidden> wrote:
> >> > > > >> >
> >> > > > >> > > Li,
> >> > > > >> > >
> >> > > > >> > > Are the times in the data sets unified (i.e., all the same), or are
> >> > > > >> > > they interlaced?  In other words, if you were to set up a single
> >> > > > >> > > time axis for all of the buoys, would the data line up on the axis,
> >> > > > >> > > or would it be scattered around?  If the times are all the same,
> >> > > > >> > > then you should be able to do a simple Union as described here:
> >> > > > >> > >
> >> > > > >> > > http://www.unidata.ucar.edu/software/netcdf/ncml/v2.2/Aggregation.html
> >> > > > >> > >
> >> > > > >> > > Otherwise, you will need to do a joinNew, wherein you declare a new
> >> > > > >> > > dimension as the aggregation dimension.  This is described on the
> >> > > > >> > > same page.
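> >> > > > >> > >
> >> > > > >> > > For illustration only, a Union is about the simplest aggregation
> >> > > > >> > > to write.  The two file names below are placeholders, and the
> >> > > > >> > > sketch assumes each file holds different variables on the same
> >> > > > >> > > coordinates:
> >> > > > >> > >
> >> > > > >> > > <?xml version="1.0" encoding="UTF-8"?>
> >> > > > >> > > <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
> >> > > > >> > >   <aggregation type="union">
> >> > > > >> > >     <!-- hypothetical files, one per group of variables -->
> >> > > > >> > >     <netcdf location="buoy_temperature.nc"/>
> >> > > > >> > >     <netcdf location="buoy_salinity.nc"/>
> >> > > > >> > >   </aggregation>
> >> > > > >> > > </netcdf>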
> >> > > > >> > >
> >> > > > >> > > -Lansing
> >> > > > >> > >
> >> > > > >> > > > Hi Lansing,
> >> > > > >> > > >
> >> > > > >> > > > I think there is a group of buoys that are reporting data
> >> > > > >> > > > simultaneously.  The data sets should be overlapping in space.
> >> > > > >> > > > I included Matt, who is the data officer for this data set, in
> >> > > > >> > > > our conversation.
> >> > > > >> > > >
> >> > > > >> > > > Hi Matt,
> >> > > > >> > > > Please correct me if the description of the data is not right.
> >> > > > >> > > >
> >> > > > >> > > > Thanks a lot,
> >> > > > >> > > > Li
> >> > > > >> > > >
> >> > > > >> > > >
> >> > > > >> > > > On Fri, Mar 29, 2013 at 10:40 AM, Unidata netCDF Java
> >> > > > >> > > > address@hidden> wrote:
> >> > > > >> > > >
> >> > > > >> > > > > Li,
> >> > > > >> > > > >
> >> > > > >> > > > > Are the data sets from the buoys overlapping in time, space, or
> >> > > > >> > > > > both?  That is, do you have many buoys that are reporting data
> >> > > > >> > > > > simultaneously, or are the individual data set series generated
> >> > > > >> > > > > by each buoy temporally distinct?
> >> > > > >> > > > >
> >> > > > >> > > > > Also, are the nc files generated by the buoys, or have they been
> >> > > > >> > > > > generated through some post-processing from a raw data set?
> >> > > > >> > > > >
> >> > > > >> > > > > If the files are not too large, feel free to send me a few
> >> > > > >> > > > > representative files to work with locally, and I will try to set
> >> > > > >> > > > > up an aggregation.
> >> > > > >> > > > >
> >> > > > >> > > > > Regards,
> >> > > > >> > > > >   Lansing Madry
> >> > > > >> > > > >   Unidata
> >> > > > >> > > > >   Boulder, Colorado
> >> > > > >> > > > >
> >> > > > >> > > > > > Dear Sir,
> >> > > > >> > > > > >
> >> > > > >> > > > > > I am trying to reach the experts about NcML aggregation.
> >> > > > >> > > > > >
> >> > > > >> > > > > > I have a group of time series buoy nc files. I tried to
> >> > > > >> > > > > > aggregate them by time, but failed since each of them holds a
> >> > > > >> > > > > > time series itself.
> >> > > > >> > > > > >
> >> > > > >> > > > > > It would be great if anyone could advise me on how I should
> >> > > > >> > > > > > aggregate these data.
> >> > > > >> > > > > >
> >> > > > >> > > > > > Thanks and Regards,
> >> > > > >> > > > > > Li
> >> > > > >> > > > > >
> >> > > > >> > > > > >
> >> > > > >> > > > >
> >> > > > >> > > > >
> >> > > > >> > > > > Ticket Details
> >> > > > >> > > > > ===================
> >> > > > >> > > > > Ticket ID: HIP-291542
> >> > > > >> > > > > Department: Support netCDF Java
> >> > > > >> > > > > Priority: Normal
> >> > > > >> > > > > Status: Open
> >> > > > >> > > > >
> >> > > > >> > > > >
> >> > > > >> > > >
> >> > > > >> > > >
> >> > > > >> >
> >> > > > >> >
> >> > > > >> > --
> >> > > > >> > Mathew Biddle, Oceanographer
> >> > > > >> > NOAA/NODC UMD/ESSIC/CICS E/OC1
> >> > > > >> > 1315 East-West Hwy
> >> > > > >> > Silver Spring, MD 20910-3282
> >> > > > >> > Phone: (301) 713-3272 X163
> >> > > > >> > Email: address@hidden
> >> > > > >> > http://www.nodc.noaa.gov/
> >> > > > >> > http://www.facebook.com/noaa.nodc
> >> > > > >> >
> >> > > > >> >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> >
> >> >
> >
> 
> 
> 
> 


Ticket Details
===================
Ticket ID: HIP-291542
Department: Support netCDF Java
Priority: Normal
Status: Open