Dear All
I have a gridded dataset composed of annual netCDF files containing daily data
for the time period 1890-2012. These data have been exposed as a THREDDS NcML
aggregation which has been given a DOI (digital object identifier) and
therefore cannot be modified in any way. The data
provider has now provided additional data for the years 2013 and 2014 and I am
trying to create a second NcML aggregation for 1890-2014 (which will also be
DOI'd). The data for 1890-2012 are in one folder whilst the data for 2013 and
2014 are in a second folder. I need to keep the files in their respective
folders as the folders have been check-summed for data auditing purposes.
Before I create the NcML aggregation on one of our THREDDS servers I have
created an NcML file which I am testing using the Java ToolsUI-4.6 application.
The NcML file consists of:
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation type="joinExisting" dimName="time">
    <scan location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/"
          regExp="CEH_GEAR_daily_GB_20[0-9]{2}.nc" />
    <scan location="Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/"
          regExp="CEH_GEAR_daily_GB_201[0-9].nc" />
  </aggregation>
</netcdf>
Here I'm trying to use two folder scans, each with a regular expression: the
first scan selects the netCDF files for 2000-2012 from the first folder and the
second selects the files for 2013 and 2014 from the second folder. Note that
I'm only selecting 2000-2014 to speed up development; ultimately I need to
create the aggregation for the whole period 1890-2014, which is why I'm keen to
use folder scans rather than list the individual files. When I check the
aggregation using the NcML | Aggregation tab in ToolsUI I get the following
summary:
Type=joinExisting
dimName=time
Datasets (15)
Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/CEH_GEAR_daily_GB_2013.nc
range=[0:365) (365)
Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/CEH_GEAR_daily_GB_2014.nc
range=[365:730) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2000.nc
range=[730:1096) (366)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2001.nc
range=[1096:1461) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2002.nc
range=[1461:1826) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2003.nc
range=[1826:2191) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2004.nc
range=[2191:2557) (366)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2005.nc
range=[2557:2922) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2006.nc
range=[2922:3287) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2007.nc
range=[3287:3652) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2008.nc
range=[3652:4018) (366)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2009.nc
range=[4018:4383) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2010.nc
range=[4383:4748) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2011.nc
range=[4748:5113) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2012.nc
range=[5113:5479) (366)
timeUnitsChange=true
totalCoords=5479
Aggregation Variables
time(time=5479)
rainfall_amount(time=5479, y=1251, x=701)
min_dist(time=5479, y=1251, x=701)
Cache Variables
time (ucar.nc2.ncml.AggregationOuterDimension$CoordValueVar)
Variable Proxies
lat proxy ucar.nc2.ncml.Aggregation$DatasetProxyReader
lon proxy ucar.nc2.ncml.Aggregation$DatasetProxyReader
crs cached
rainfall_amount proxy ucar.nc2.ncml.AggregationExisting
min_dist proxy ucar.nc2.ncml.AggregationExisting
x proxy ucar.nc2.dataset.CoordinateAxis1D
y proxy ucar.nc2.dataset.CoordinateAxis1D
time proxy ucar.nc2.dataset.CoordinateAxis1D
Hence the files for 2013-2014 are given the time steps 0 to 730 whilst the
files for 2000-2012 are given the time steps 730 to 5479, which is incorrect:
the files for 2000-2012 should cover 0 to 4749 and those for 2013-2014 should
cover 4749 to 5479. My only guess is that the aggregation is ordering the files
alphabetically by full path, since the Z:/eidchub folder sorts before the
Z:/thredds folder?
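To test that guess outside ToolsUI, here is a minimal Python sketch. It is my
own illustration and says nothing about how netCDF-Java actually sorts scanned
files, which I have not checked. It sorts the same 15 paths as plain strings,
once by full path and once by file name only:

```python
# Rough check of the alphabetical-ordering guess. The variable names are
# illustrative; only the folder paths and file names come from my NcML.
first_dir = "Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/"
second_dir = "Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/"

paths = [f"{first_dir}CEH_GEAR_daily_GB_{y}.nc" for y in range(2000, 2013)]
paths += [f"{second_dir}CEH_GEAR_daily_GB_{y}.nc" for y in range(2013, 2015)]

# Sort by the whole path string, then by the file name alone.
by_path = sorted(paths)
by_name = sorted(paths, key=lambda p: p.rsplit("/", 1)[-1])

# Each path ends in "YYYY.nc", so the year is the 4 characters before ".nc".
years_by_path = [int(p[-7:-3]) for p in by_path]
years_by_name = [int(p[-7:-3]) for p in by_name]

print(years_by_path)
print(years_by_name)
```

Sorting by full path puts 2013 and 2014 ahead of 2000-2012, which is the order
ToolsUI reports; sorting by file name alone gives the chronological order.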
I have checked the netCDF files and the time coordinates in all files are
defined relative to 1800-01-01, for example:
double time(time=366);
:units = "days since 1800-1-1";
:calendar = "gregorian";
:long_name = "Time in days since 1800-1-1 (on 2012-1-1: day 77431)";
and have the expected values:
2000-01-01 to 2012-12-31: 73048 to 77796 (days since 1800-01-01)
and
2013-01-01 to 2014-12-31: 77797 to 78526 (days since 1800-01-01).
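As a cross-check on those values, a small Python sketch (an illustration only)
recomputes the day offsets from 1800-01-01. It assumes that Python's proleptic
Gregorian dates agree with the files' "gregorian" calendar, which should hold
for dates after 1582:

```python
from datetime import date

# Reference date taken from the units attribute "days since 1800-1-1".
EPOCH = date(1800, 1, 1)


def days_since_epoch(d):
    """Return whole days between 1800-01-01 and the given date."""
    return (d - EPOCH).days


# First and last day of each block of files, plus the date quoted in long_name.
for d in (date(2000, 1, 1), date(2012, 1, 1), date(2012, 12, 31),
          date(2013, 1, 1), date(2014, 12, 31)):
    print(d.isoformat(), days_since_epoch(d))
```

The offsets agree with the values listed above, so the time coordinates
themselves are continuous across the two folders.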
I also attempted to use the timeUnitsChange="true" flag when defining the
aggregation but this didn't appear to have any effect.
I then created an NcML file creating a joinExisting aggregation specifying the
individual netCDF files for the years 2000-2014:
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation type="joinExisting" dimName="time">
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2000.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2001.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2002.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2003.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2004.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2005.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2006.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2007.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2008.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2009.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2010.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2011.nc"/>
    <netcdf location="Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2012.nc"/>
    <netcdf location="Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/CEH_GEAR_daily_GB_2013.nc"/>
    <netcdf location="Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/CEH_GEAR_daily_GB_2014.nc"/>
  </aggregation>
</netcdf>
and I get the following summary when I check the NcML file in ToolsUI:
Type=joinExisting
dimName=time
Datasets (15)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2000.nc
range=[0:366) (366)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2001.nc
range=[366:731) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2002.nc
range=[731:1096) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2003.nc
range=[1096:1461) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2004.nc
range=[1461:1827) (366)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2005.nc
range=[1827:2192) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2006.nc
range=[2192:2557) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2007.nc
range=[2557:2922) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2008.nc
range=[2922:3288) (366)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2009.nc
range=[3288:3653) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2010.nc
range=[3653:4018) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2011.nc
range=[4018:4383) (365)
Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/CEH_GEAR_daily_GB_2012.nc
range=[4383:4749) (366)
Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/CEH_GEAR_daily_GB_2013.nc
range=[4749:5114) (365)
Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/CEH_GEAR_daily_GB_2014.nc
range=[5114:5479) (365)
timeUnitsChange=false
totalCoords=5479
Aggregation Variables
time(time=5479)
rainfall_amount(time=5479, y=1251, x=701)
min_dist(time=5479, y=1251, x=701)
Cache Variables
time (ucar.nc2.ncml.AggregationOuterDimension$CoordValueVar)
Variable Proxies
lat proxy ucar.nc2.ncml.Aggregation$DatasetProxyReader
lon proxy ucar.nc2.ncml.Aggregation$DatasetProxyReader
crs cached
rainfall_amount proxy ucar.nc2.ncml.AggregationExisting
min_dist proxy ucar.nc2.ncml.AggregationExisting
x proxy ucar.nc2.dataset.CoordinateAxis1D
y proxy ucar.nc2.dataset.CoordinateAxis1D
time proxy ucar.nc2.dataset.CoordinateAxis1D
which correctly aggregates the time dimension across the files.
Therefore, is it possible to use multiple folder scans when creating an NcML
joinExisting aggregation? And if it is, can anybody see what I'm doing wrong
in my NcML file?
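If multiple scans cannot be made to work, my fallback would be to generate the
explicit file list with a script rather than type 125 entries by hand. A
minimal Python sketch follows. The helper name build_ncml is hypothetical, and
it assumes every year from 1890 to 2014 has a file named
CEH_GEAR_daily_GB_YYYY.nc, which I have only shown above for 2000-2014:

```python
# Hypothetical helper: writes a joinExisting aggregation that lists every
# annual file explicitly, in chronological order, across the two folders.
FIRST_DIR = "Z:/thredds/uk_rainfall/GB/daily/netCDF/uncompressed/detail/"
SECOND_DIR = "Z:/eidchub/f2856ee8-da6e-4b67-bedb-590520c77b3c/"


def build_ncml(start=1890, split=2013, end=2014):
    """Return NcML text; years before `split` come from the first folder."""
    lines = [
        '<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">',
        '  <aggregation type="joinExisting" dimName="time">',
    ]
    for year in range(start, end + 1):
        folder = FIRST_DIR if year < split else SECOND_DIR
        # Assumed naming pattern: CEH_GEAR_daily_GB_YYYY.nc for every year.
        lines.append(
            f'    <netcdf location="{folder}CEH_GEAR_daily_GB_{year}.nc"/>'
        )
    lines += ["  </aggregation>", "</netcdf>"]
    return "\n".join(lines)


ncml = build_ncml()
print(ncml)
```

The printed text could be saved as the NcML file and checked in ToolsUI in the
same way as the hand-written version above.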
Many thanks for any help that anyone can provide. Best wishes, Simon.