Re: [netcdf-java] 4.0 updates: C and java speed

Hi John,

> It is possible that for various reasons, Java will
> be "several times slower" than C code, so you'll have to decide if the
> increase in productivity is worth it.

In what circumstances would Java-NetCDF be "several times slower" than
C?  I've always found nj to be very close to our old C++ code in terms
of speed, and nj4 appears to be faster than nj2.  (In a server
environment with a persistent, modern JVM - as you say, running a
one-off Java program has a big overhead in bootstrapping the JVM and
loading classes.)
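
One way to keep start-up and JIT warm-up out of such a comparison is to time each iteration separately and look at the later runs only. A minimal sketch, assuming a hypothetical workload() that stands in for one open/read/close cycle (it is not a netcdf-java call, and the class and method names are made up for illustration):

```java
import java.util.function.IntSupplier;

public class WarmupTiming {

    // Keeps results "used" so the JIT is less likely to discard the work.
    static int sink;

    /** Hypothetical stand-in for one open/read/close cycle; NOT a netcdf-java call. */
    public static int workload() {
        int sum = 0;
        for (int i = 0; i < 100_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    /** Runs the task `runs` times and returns each run's elapsed time in milliseconds. */
    public static double[] timeRuns(IntSupplier task, int runs) {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink += task.getAsInt();
            millis[i] = (System.nanoTime() - start) / 1.0e6;
        }
        return millis;
    }

    /** Mean of the values from index `from` (inclusive) to the end of the array. */
    public static double meanFrom(double[] values, int from) {
        double total = 0;
        for (int i = from; i < values.length; i++) {
            total += values[i];
        }
        return total / (values.length - from);
    }

    public static void main(String[] args) {
        double[] t = timeRuns(WarmupTiming::workload, 100);
        // The first run includes class loading and interpreted execution.
        System.out.printf("first run: %f ms%n", t[0]);
        // Skipping the first 10 runs is an arbitrary warm-up cutoff for this sketch.
        System.out.printf("mean of runs 11-100: %f ms%n", meanFrom(t, 10));
    }
}
```

If the first run is much slower than the later mean, the difference is probably start-up cost rather than the library itself. The actual numbers will vary by JVM and machine.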

I'd just like to know if there are any particular things I should look
out for that might badly affect performance.

Cheers, Jon

On Wed, May 6, 2009 at 11:21 PM, John Caron <caron@xxxxxxxxxxxxxxxx> wrote:
> Hi Bill:
>
> I made a few mods to your program (attached)
>
> 1) removed the print statements, which are notoriously slow.
> 2) did the whole open/read/close loop 100 times
> 3) added timing, and got:
>
> that took 1248.659775 millisecs
>
> which is about 13 msecs per call. When I get a chance I will try to compare
> to the C code.
>
> None of this is all that definitive; it's very hard to get accurate timings
> on small programs. For one thing, the Java compiler runs at runtime, and
> it's somewhat nondeterministic, so running a program once will very likely
> look very bad. If you are doing a CGI-type server, where the Java
> application starts up for each request, that will be very slow.
>
> I can pretty much promise you that java performance is within a factor of 2
> of C code, and more likely within 20% of C code, in a long-running server
> environment. There are certain things it can do faster, like memory
> allocation and multithreading.
>
> Anyway, I could look at your actual production code to see if there are some
> ways to help speed it up. It is possible that for various reasons, Java will
> be "several times slower" than C code, so you'll have to decide if the
> increase in productivity is worth it.
>
> Bill Moninger wrote:
>>
>> Hi John,
>>
>> sorry for the delay in getting back to you.
>>
>> I've attached two programs that read a netcdf file of RUC output in a
>> hybrid coordinate system, on a 40km grid. One's in C, and one's in java.
>>
>> A typical netcdf file that they read may be found at
>> http://ruc.noaa.gov/ruc_native_40.nc (53 M in size).
>>
>> Generally, I read 6 variables from this file to generate soundings (SkewT
>> plots), and my C program reads the file and generates the sounding in far
>> less than a second.  The java version takes several times longer, and since
>> we have many hits on our web page that generates on-the-fly soundings, it
>> would be a big increase in load on our server to switch to java-netCDF.
>>
>> I've stripped both the C and java programs down to the minimum. Each reads
>> one vertical column of data from the 'vpt' variable and prints it out for
>> each level. (The production program reads 6 variables from one vertical
>> column and generates a sounding for that column.)
>>
>> On my web server, the java program takes 0.58 seconds, and the C program
>> takes 0.01 seconds.
>>
>> It may be that for my simple processing netcdf-java is just overkill. But if it
>> can be sped up to approach the time of C, I'd like to use it because I can
>> use the same code (with different config files) to read netCDF, grib, and
>> grib2, and things would be much easier to manage.
>>
>> Any thoughts you have will be gratefully received.
>>
>> -Bill
>>
>> On 4/4/2009 12:43 PM, John Caron wrote:
>>>
>>>
>>> Bill Moninger wrote:
>>>>
>>>> Hi Robb,
>>>>
>>>> thanks for the information.  I'll take a look at regenerating the gbx
>>>> files.
>>>>
>>>> For what it's worth--the *biggest* percentage slowdown is not with grib
>>>> or grib2 files, but with netCDF files, surprisingly enough. My C routine
>>>> (using an earlier version of netCDF) reads the files almost instantly--the
>>>> java-netcdf4 arrangement reads the file much more slowly.
>>>
>>> That's interesting. When you say "read" do you mean read all the data, or
>>> just opening the file? I assume netcdf 3 formatted files?
>>>
>>> Can you send a sample program that has this slowdown? Are you comparing
>>> against a C program or earlier versions of java-netcdf?
>>>
>>> BTW, java 1.6 should be 20-30% faster than java 1.5, particularly if you
>>> use the -server option.
>>>
>>>>
>>>> -Bill
>>>>
>>>> On 3/31/2009 12:56 PM, Robb Kambic wrote:
>>>>>
>>>>> On Fri, 27 Mar 2009, Bill Moninger wrote:
>>>>>
>>>>>> Hello netcdf-java folks,
>>>>>>
>>>>>> Thanks to good help from the netcdf-java staff, I'm now able to read
>>>>>> and generate soundings from RUC files in netCDF, grib, and grib2 format.
>>>>>> It's really nice to be able to use the same code for all three formats.
>>>>>>
>>>>>> Unfortunately, I find that, at least as I've implemented it,
>>>>>> netcdf-java is 20% to 50% slower than my previous methods (using C).
>>>>>>
>>>>>
>>>>> Bill,
>>>>>
>>>>> If you use the grib index files (the files with the gbx suffix that
>>>>> are usually in the same dir as the grib file), you should delete them
>>>>> all and then regenerate them. The new index files read in much quicker.
>>>>> Currently, I am working on grib performance issues.
>>>>>
>>>>> Robb...
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Moreover, it appears that java 1.6 is slower than 1.5 (though I
>>>>>> haven't recompiled the underlying UCAR code in 1.6--only my code).
>>>>>>
>>>>>> If folks have any thoughts about how to speed things up, I will be
>>>>>> much obliged to hear them.
>>>>>>
>>>>>> -Bill
>>>>>> --
>>>>>> William R. Moninger         http://www-frd.fsl.noaa.gov/~moninger/
>>>>>> NOAA / Earth Systems Research Laboratory / Global Systems Division
>>>>>> 325 Broadway, R/GSD1                           voice: 303-497-6435
>>>>>> Boulder, CO 80305                              fax:   303-497-3329
>>>>>>
>>>>>> _______________________________________________
>>>>>> netcdf-java mailing list
>>>>>> netcdf-java@xxxxxxxxxxxxxxxx
>>>>>> For list information or to unsubscribe, visit:
>>>>>> http://www.unidata.ucar.edu/mailing_lists/
>>>>>
>>>>>
>>>>> ===============================================================================
>>>>> Robb Kambic                       Unidata Program Center
>>>>> Software Engineer III               Univ. Corp for Atmospheric Research
>>>>> rkambic@xxxxxxxxxxxxxxxx           WWW: http://www.unidata.ucar.edu/
>>>>>
>>>>> ===============================================================================
>>>>
>>>
>>
>
>
> import ucar.nc2.iosp.grib.*;
> import ucar.nc2.iosp.grib.GribServiceProvider.*;
>
> import java.*;
> import java.io.*;
> import java.util.*;
>
> import ucar.ma2.*;
> import ucar.nc2.*;
> import ucar.nc2.util.*;
> import ucar.nc2.units.DateFormatter;
>
> public class Tester {
>    private static final String VERSION = "0.01";
>
>    double[] Tvar;
>
>    public static void main(String[] args) {
>        Tester gs = new Tester(args);
>        System.exit(0);
>    }
>
>    public Tester(String[] args) {
>        String filename = "D:\\work\\moninger\\ruc_native_40.nc";
>
>        long start = System.nanoTime();
>        int tuv_levels = 0;
>
>        for (int count = 0; count < 100; count++) {
>            NetcdfFile ncfile = null;
>            try {
>
>                ncfile = NetcdfFile.open(filename);
>                //System.out.println("ncfile is "+ncfile);
>                Array data4D;
>                Variable v = null;
>                Attribute a = null;
>
>                // get grid parameters for most variables
>                Dimension d = ncfile.findDimension("z");
>                if (d == null) {
>                    System.out.println("Bad dimension for z");
>                    System.exit(1);
>                }
>                tuv_levels = d.getLength();
>
>                // get variables
>                int[] origin = new int[]{0, 0, 40, 50};
>                int[] tuv_size = new int[]{1, tuv_levels, 1, 1};
>                v = ncfile.findVariable("vpt");
>                data4D = v.read(origin, tuv_size);
>                Tvar = (double[]) data4D.reduce().get1DJavaArray(double.class);
>                //System.out.println("successfully read " + filename);
>
>            } catch (Exception e) {
>                System.out.println("Exception: " + filename + " " + e);
>                e.printStackTrace();
>                System.exit(1);
>
>            } finally {
>                if (null != ncfile) try {
>                    ncfile.close();
>                    //System.out.println("closed file");
>                } catch (IOException ioe) {
>                    System.out.println("trying to close " + filename + " " + ioe);
>                }
>            }
>
>          /*  for (int i = 0; i < tuv_levels; i++) {
>                System.out.println("i: " + i + " t " + Tvar[i]);
>            }  */
>
>
>        }
>
>        long stop = System.nanoTime();
>        System.out.printf("that took %f millisecs %n", (stop - start) / 1000.0 / 1000.0);
>
>
>    }
> }
>
>
>
>



-- 
Dr Jon Blower
Technical Director, Reading e-Science Centre
Environmental Systems Science Centre
University of Reading
Harry Pitt Building, 3 Earley Gate
Reading RG6 6AL. UK
Tel: +44 (0)118 378 5213
Fax: +44 (0)118 378 6413
j.d.blower@xxxxxxxxxxxxx
http://www.nerc-essc.ac.uk/People/Staff/Blower_J.htm


