Hi Ben,
Thanks very much indeed for this. When I foretold the overflow in my comments
I never seriously expected it to happen! Since then I've come across a few
really big datasets - out of interest, what are your datasets? High-res remote
sensing?
I'll incorporate your code into ncWMS, and it will find its way into THREDDS at
the next sync point (not sure when this will be - up to Unidata largely).
Many thanks also for providing a test case!
Cheers, Jon
-----Original Message-----
From: Ben Caradoc-Davies [mailto:Ben.Caradoc-Davies@xxxxxxxx]
Sent: 05 September 2011 10:13
To: Jon Blower
Cc: thredds mailing list
Subject: Thredds WMS support for large source grids
Jon,
I tested WMS in thredds 4.2.6 with large NetCDF source grids and encountered an
integer overflow in ncwms PixelMap. (You foretold this in the comments!) The
attached patch fixes this defect at the cost of a small increase in memory use.
You might remember writing (in PixelMap):
// Calculate a single integer representing this grid point in the source grid
// TODO: watch out for overflows (would only happen with a very large grid!)
int sourceGridIndex = j * this.sourceGridISize + i;
The integer overflow appears when the source grid has more than 2**31-1 points.
For example, this limit is exceeded with a 26 GB NetCDF file with a single
ubyte variable on a 92255x301081 grid.
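To make the arithmetic concrete, here is a minimal sketch (not the PixelMap code itself; the class and method names are hypothetical, and it assumes the i axis is the one of size 92255). It contrasts the original int expression with the same expression widened to long:

```java
final class GridIndexSketch {
    // Same form as the original expression: every operand is an int,
    // so the multiplication wraps silently once it exceeds 2**31-1.
    static int intIndex(int i, int j, int sourceGridISize) {
        return j * sourceGridISize + i;
    }

    // Casting one operand to long before multiplying makes the whole
    // expression 64-bit, so the full index is preserved.
    static long longIndex(int i, int j, int sourceGridISize) {
        return (long) j * sourceGridISize + i;
    }
}
```

For the last point of that grid (i = 92254, j = 301080), longIndex returns 27776227654, while intIndex returns a different, wrapped value. Note that a wrapped value is not necessarily negative, so a simple sign check would not reliably detect the overflow.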
The attached patch includes Xiangtan Lin's CdmUtils fix to force
DataReadingStrategy.SCANLINE for HDF5:
http://mailman.unidata.ucar.edu/mailing_lists/archives/thredds/2011/msg00312.html
The PixelMap change replaces the single array, in which source and target grid
offsets were stored as integers packed into a single long, with two long arrays,
one for source and one for target. This costs extra memory but may, in addition
to supporting large grids, improve performance by avoiding packing and unpacking.
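The difference between the two storage schemes can be sketched roughly as follows. This is an illustration only, not the patched PixelMap: the class name, the method names, the bit layout of the packed long and the array growth policy are all assumptions.

```java
final class OffsetStorageSketch {
    // Old scheme (bit layout assumed): two int offsets packed into one long,
    // source in the high 32 bits and target in the low 32 bits.
    static long pack(int sourceOffset, int targetOffset) {
        return ((long) sourceOffset << 32) | (targetOffset & 0xFFFFFFFFL);
    }

    static int unpackSource(long packed) {
        return (int) (packed >>> 32);
    }

    static int unpackTarget(long packed) {
        return (int) packed;
    }

    // New scheme: two parallel long arrays, so a source offset can be
    // larger than Integer.MAX_VALUE and no packing step is needed.
    private long[] sourceOffsets = new long[16];
    private long[] targetOffsets = new long[16];
    private int size = 0;

    void put(long sourceOffset, long targetOffset) {
        if (size == sourceOffsets.length) {
            // Double both arrays together so they stay the same length.
            sourceOffsets = java.util.Arrays.copyOf(sourceOffsets, size * 2);
            targetOffsets = java.util.Arrays.copyOf(targetOffsets, size * 2);
        }
        sourceOffsets[size] = sourceOffset;
        targetOffsets[size] = targetOffset;
        size++;
    }

    long sourceAt(int n) { return sourceOffsets[n]; }
    long targetAt(int n) { return targetOffsets[n]; }
    int size() { return size; }
}
```

In this sketch the packed scheme stores one long per entry and the parallel-array scheme stores two, which is where the extra memory goes.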
It also includes:
- a minor CdmUtils static initialiser change to appease ecj (the Eclipse
compiler)
- access changes in HorizontalCoordSys to support unit testing
- a fix for axis sizes needed when LatLonCoordSys is explicitly instantiated in
the unit test (otherwise they can never be set)
- a unit test in which only the small() test method passes before the patch is
applied (to ensure existing behaviour is preserved for small grids); all test
methods ensure the expected source grid offset monotonicity
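A monotonicity check of the kind mentioned in the last item could look roughly like this. It is a sketch, not the test in the patch: the class and method names are hypothetical, and it assumes "monotonic" means non-decreasing.

```java
final class MonotonicitySketch {
    // Returns true if every offset is greater than or equal to the one
    // before it. Empty and single-element arrays count as monotonic.
    static boolean isNonDecreasing(long[] offsets) {
        for (int n = 1; n < offsets.length; n++) {
            if (offsets[n] < offsets[n - 1]) {
                return false;
            }
        }
        return true;
    }
}
```

An index that has wrapped past 2**31-1 typically shows up as a sudden drop in an otherwise increasing sequence of offsets, which a check like this would catch.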
The patch is against the ncwms-src.jar distributed with thredds 4.2.6 (I'm
guessing the ncwms tds4.2-20101102 branch).
With this patch applied and the replacement ncwms.jar installed in WEB-INF/lib,
thredds 4.2.6 can serve a test 647 GB NetCDF4/HDF5 file via WMS:
http://siss2.anu.edu.au/thredds/godiva2/godiva2.html?server=http://siss2.anu.edu.au/thredds/wms/ga/test/PRISM_UTM55_wgs84.nc
The test file has a single ubyte variable on a 461276x1505407 grid.
Performance is better than I expected; the aligned source and target grids plus
the nearest-point mapping from target to source seem to do the trick.
Kind regards,
--
Ben Caradoc-Davies <Ben.Caradoc-Davies@xxxxxxxx>
Software Engineering Team Leader
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre