Hi,
We (Harry Mangalam and I) are benchmarking NCO in various settings and
have learned interesting things about bottlenecks to total throughput.
NCO is (not surprisingly) I/O-limited in situations such as file
concatenation, where data are copied straight from one or more input
files to a single output file with no intervening arithmetic.
For x86 machines, much of this bottleneck appears to be byte-swapping.
My understanding is that the swap???() routines in ncx.c convert data
from the big-endian netCDF internal format into the machine native
little-endian format.
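
For readers who have not looked at this layer, here is a minimal sketch
of the kind of per-element work such a swap involves. It is not the
ncx.c code: the name swap4_sketch() and its signature are made up for
illustration, and it assumes 4-byte elements (e.g., NC_INT or NC_FLOAT)
and non-overlapping buffers.

```c
#include <stddef.h>

/* Hypothetical helper (NOT the actual ncx.c routine): copy nelems
 * 4-byte elements from src to dst, reversing the byte order of each
 * element. src and dst must not overlap. */
void swap4_sketch(void *dst, const void *src, size_t nelems)
{
    const unsigned char *in = src;
    unsigned char *out = dst;
    size_t i;

    for (i = 0; i < nelems; i++, in += 4, out += 4) {
        /* Big-endian byte 0 becomes little-endian byte 3, and so on. */
        out[0] = in[3];
        out[1] = in[2];
        out[2] = in[1];
        out[3] = in[0];
    }
}
```

A plain copy between two files of the same external type pays this cost
twice, once on read and once on write, and the second swap exactly
undoes the first.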
However, no conversion ought to be _necessary_, even on x86 machines,
if the variables are simply being copied from some hyperslab in
an input file to some hyperslab (of the same external type) in the
output file.
Currently, however, the netCDF interface provides no speedy way (that
we know of) to copy directly from the input file to the output file
without incurring the swap???() penalty of the I/O layer.
It seems there is a niche for a more efficient method of copying
variables which bypasses as many parts of the I/O layer as possible.
We suggest a new set of nc_copy_*() routines (for netCDF4?) which,
theoretically, could be implemented as simple byte streams from input
to output files without any *-endian conversions slowing things down.
int nc_copy_var(int ncid_1,int ncid_2,int varid_1,int varid_2);
The nc_copy_vara(), nc_copy_vars(), and nc_copy_varm() routines follow
by analogy, e.g.,
int nc_copy_vara(int ncid_1,int ncid_2,int varid_1,int varid_2,
const size_t *start_1,const size_t *count_1,
const size_t *start_2,const size_t *count_2);
Note that nc_copy_var*() routines require no type information.
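
To illustrate the byte-stream idea, here is a rough sketch of what the
innermost loop of such a routine might reduce to. It is only a sketch
under stated assumptions: copy_raw_bytes() is a hypothetical name, it
uses plain stdio rather than the netCDF I/O layer, and it assumes the
caller already knows the byte offsets and length of a contiguous region
in each file. Working out those offsets from the variable IDs and
hyperslab arguments is the part the library would have to supply.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy nbytes from byte offset in_off of `in` to
 * byte offset out_off of `out` without interpreting the data, so no
 * type information and no endian conversion are needed.
 * Returns 0 on success, -1 on a seek, read, or write failure. */
int copy_raw_bytes(FILE *in, long in_off, FILE *out, long out_off,
                   size_t nbytes)
{
    unsigned char buf[4096];

    if (fseek(in, in_off, SEEK_SET) != 0 ||
        fseek(out, out_off, SEEK_SET) != 0)
        return -1;

    while (nbytes > 0) {
        size_t want = nbytes < sizeof buf ? nbytes : sizeof buf;
        size_t got = fread(buf, 1, want, in);

        if (got == 0)
            return -1; /* premature EOF or read error */
        if (fwrite(buf, 1, got, out) != got)
            return -1;
        nbytes -= got;
    }
    return 0;
}
```

The helper takes only offsets and a byte count, which is why a routine
built this way would need no type information.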
Would this be a useful addition to a future version of netCDF?
Thanks,
Charlie
--
Charlie Zender, surname@xxxxxxx, (949) 824-2987, Department of Earth
System Science, University of California, Irvine CA 92697-3100