Sorry, let me add some additional information.
I have been given a product specification covering several different files, but
the overall gist of each file is as follows (the data type varies from byte to
long):
<dimension name="columns" length="512"/>
<dimension name="rows" length="45000"/>
<dimension name="orphan_pixels" length="1" isUnlimited="true"/>
<variable name="var1" shape="rows columns" type="short">
<variable name="var1_orphan" shape="orphan_pixels" type="short">
<variable name="var2" shape="rows columns" type="short">
<variable name="var2_orphan" shape="orphan_pixels" type="short">
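For a sense of scale, the dimensions above fix how much data one variable holds and how much memory a buffer of rows would need. A small sketch of that arithmetic follows; the class and method names are made up for illustration, and only the 512 columns, the 45000 rows and the byte-to-long type range come from the spec above.

```java
/** Hypothetical helper: size arithmetic for a rows-by-columns variable. */
final class GridSize {
    private GridSize() {}

    /** Bytes needed to hold rows x columns elements of the given width. */
    static long bytes(long rows, long columns, int bytesPerElement) {
        if (rows < 0 || columns < 0 || bytesPerElement <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        return rows * columns * bytesPerElement;
    }
}
```

With these numbers, `GridSize.bytes(45000, 512, Short.BYTES)` is 46,080,000 bytes (about 44 MiB) for a complete short variable, `GridSize.bytes(45000, 512, Long.BYTES)` is 184,320,000 bytes for a long one, and `GridSize.bytes(100, 512, Short.BYTES)` is 102,400 bytes for a 100-row buffer of shorts.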
For instance, at most we are writing to 18 files sequentially (currently only
13). Note that I did try to use the NetCDF-Java library with multiple threads,
but it causes a seg fault (https://github.com/Unidata/thredds/issues/577). At
its fastest we’ve been seeing data getting to the writers every couple of
milliseconds; we then convert the data arrays (stored in lists) into
NetCDF-Java Arrays and go on to write them:
public <T> void writeData(String internalName, List<T> data, int[] shape,
        int[] origin, Class<T> type) {
    // move data into netcdf data shape
    Array rawData = Array.factory(type, shape);
    for (int i = 0; i < data.size(); i++) {
        rawData.setObject(i, data.get(i));
    }
    this.writeData(internalName, rawData, origin);
}

public void writeVariable(String name, int[] origin, Array values) throws
        IOException, InvalidRangeException {
    LOG.trace("Writing Variable {} to netcdf file", name);
    Variable var = netcdfFileWriter.findVariable(name);
    Objects.requireNonNull(var,
            String.format("Variable with name: %s cannot be found", name));
    this.netcdfFileWriter.write(var, origin, values);
}
What we then see when sampling with VisualVM (I'm working on setting up another
profiler so I can get average call times) is that
`ucar.nc2.jni.netcdf.Nc4Iosp.writeData()` takes up a lot of the run time.
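Since each incoming row currently turns into its own write call, one thing that may be worth trying is collecting rows and handing them over as a single rows-by-columns block. The following is only a sketch under assumptions: `BlockSink` and `RowBlockBuffer` are hypothetical names, the element type is fixed to short for brevity, and in real use the sink would wrap the existing writeVariable call after building an Array from the flat data. Whether this reduces the time spent in `Nc4Iosp.writeData()` would need measuring.

```java
/** Hypothetical sink; in real use it would wrap the existing write call. */
interface BlockSink {
    void write(int[] origin, int[] shape, short[] flatRowMajor);
}

/** Collects fixed-width rows and emits them as one rows-by-columns block. */
final class RowBlockBuffer {
    private final int columns;
    private final int rowsPerBlock;
    private final short[] block;   // row-major storage for buffered rows
    private final BlockSink sink;
    private int buffered = 0;      // rows currently held in the buffer
    private int nextRow = 0;       // file row index where the next block starts

    RowBlockBuffer(int columns, int rowsPerBlock, BlockSink sink) {
        if (columns <= 0 || rowsPerBlock <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.columns = columns;
        this.rowsPerBlock = rowsPerBlock;
        this.block = new short[columns * rowsPerBlock];
        this.sink = sink;
    }

    /** Copies one row into the buffer and flushes when the block is full. */
    void addRow(short[] row) {
        if (row.length != columns) {
            throw new IllegalArgumentException("expected " + columns + " values");
        }
        System.arraycopy(row, 0, block, buffered * columns, columns);
        buffered++;
        if (buffered == rowsPerBlock) {
            flush();
        }
    }

    /** Writes any buffered rows; call once at the end for a partial block. */
    void flush() {
        if (buffered == 0) {
            return;
        }
        short[] data = new short[buffered * columns];
        System.arraycopy(block, 0, data, 0, data.length);
        sink.write(new int[] {nextRow, 0}, new int[] {buffered, columns}, data);
        nextRow += buffered;
        buffered = 0;
    }
}
```

The origin and shape passed to the sink have the same meaning as the `origin` and `shape` arguments in the methods above, so a block of 100 rows starting at row 200 arrives as origin {200, 0} and shape {100, 512}.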
Hope this helps clarify my situation.
From: Bob Simons - NOAA Federal [mailto:bob.simons@xxxxxxxx]
Sent: 28 June 2016 16:32
To: Robin Moss
Cc: netcdf-java@xxxxxxxxxxxxxxxx
Subject: Re: [netcdf-java] Performance Issues and Buffering
You don't say how you are writing the data, other than "a row at a time".
Is the row dimension an unlimited dimension? (That is what I would recommend
trying.)
Or have you pre-allocated space in the variables and are now writing data into
that space?
Or are you reading the entire file, adding one row of data, then writing the
entire file? (That is bound to be slow when the number of rows gets larger.)
On Tue, Jun 28, 2016 at 1:26 AM, Robin Moss
<robin.moss@xxxxxxxxxxxxxx> wrote:
Hello,
I’m hoping I can get some pointers to improve the way I’m using the NetCDF
library.
At the moment the processing I’m doing writes to several different NetCDF
files, multiple variables a row at a time. These writes are not currently
multi-threaded.
When the processed data is small (hundreds of rows) I don’t see any issues;
however, when I start running a bigger chain (tens of thousands of rows) I see
the performance of NetCDF-Java plummet. A quick look at what’s happening with
VisualVM shows that most of my application’s time (~60%) is spent in
`Nc4Iosp.writeData()`.
This leads me to believe I’m using the library wrong ☺. My initial thought,
having worked with the C library directly before, was to adjust the write
buffer, but I don’t see any support for that in the Java library, and
considering it would likely affect the C library I’m not sure it would help
with the write data call.
I had briefly looked into just buffering my rows so I write every 10-100 rows
to see what effect that would have on performance and memory usage, however I
hit a bit of an issue with the variables that have an unlimited dimension of
columns (most variables I have are row x column), in that I was unable to
figure out how to create an Array that supported unlimited dimensions.
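On the unlimited-dimension point: as far as I understand it, the in-memory Array always has a concrete shape, and "unlimited" is only a property of the dimension in the file, which grows when a block is written at an origin past its current length. A minimal sketch of the bookkeeping this implies, assuming the growing dimension is the first one; `AppendCursor` is a hypothetical helper, not part of the NetCDF-Java API.

```java
/** Hypothetical helper: tracks where the next block goes along a growing first dimension. */
final class AppendCursor {
    private final int[] trailingShape; // fixed sizes of the remaining dimensions
    private int next = 0;              // next free index along the growing dimension

    AppendCursor(int... trailingShape) {
        this.trailingShape = trailingShape.clone();
    }

    /** Concrete in-memory shape for a block holding count entries. */
    int[] shapeFor(int count) {
        int[] shape = new int[trailingShape.length + 1];
        shape[0] = count;
        System.arraycopy(trailingShape, 0, shape, 1, trailingShape.length);
        return shape;
    }

    /** Returns the origin for the next block and advances the cursor by count. */
    int[] claim(int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        int[] origin = new int[trailingShape.length + 1]; // trailing entries stay 0
        origin[0] = next;
        next += count;
        return origin;
    }

    /** Number of entries claimed so far along the growing dimension. */
    int length() {
        return next;
    }
}
```

For a one-dimensional variable such as var1_orphan, `new AppendCursor()` gives shape {n} and origins {0}, {n}, and so on; for a row x column variable, `new AppendCursor(512)` gives shape {n, 512} with origins {0, 0}, {n, 0}, and so on. The returned arrays would be passed as the `shape` and `origin` arguments of the existing write methods.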
We currently use the NetcdfFileWriter to write data to the underlying NetCDF 4
files. I know the API suggests using FileWriter2, but I couldn’t see a way
to use it that also allowed us to ‘stream’ data into the underlying files.
Any suggestions would be greatly appreciated.
Thanks,
Robin
--
Sincerely,
Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
99 Pacific St., Suite 255A (New!)
Monterey, CA 93940 (New!)
Phone: (831)333-9878 (New!)
Fax: (831)648-8440
Email: bob.simons@xxxxxxxx
The contents of this message are mine personally and
do not necessarily reflect any position of the
Government or the National Oceanic and Atmospheric Administration.
<>< <>< <>< <>< <>< <>< <>< <>< <><