Re: [netcdfgroup] NetCDF for parallel usage



On 10/17/2014 11:25 AM, Ed Hartnett wrote:
Unless things have changed since my day, it is possible to read pnetcdf
files with the netCDF library. It must be built with --enable-pnetcdf
and --with-pnetcdf=/some/location, IIRC.

Ed!

In this case, Samrat Rao was using pnetcdf to create CDF-5 (giant variable) formatted files. To refresh your memory, Argonne and Northwestern developed this file format with UCAR's sign-off, with the understanding that we (ANL and NWU) would never expect UCAR to add support for it unless we did the work ourselves. I took a stab at it a few years back, and Wei-keng is taking a second crack at it right now.

The classic file formats CDF-1 and CDF-2 are fully interoperable between pnetcdf and netcdf.
==rob



On Fri, Oct 17, 2014 at 6:33 AM, Samrat Rao <samrat.rao@xxxxxxxxx> wrote:

    Hi,

    I'm sorry for the late reply.

    I have no classic/netCDF-3 datasets; all datasets are yet to be
    generated. All my codes are also new.

    Initially I tried pnetcdf and wrote a few variables, but found that
    the resulting format was CDF-5, which 'normal' netCDF would not
    read.
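
    If I understand correctly, CDF-5 is selected by the NF_64BIT_DATA
    create flag. A minimal F90 sketch of the kind of call involved
    (this is not my actual code, and the file name is made up):

        program cdf5_sketch
          implicit none
          include 'mpif.h'
          include 'pnetcdf.inc'
          integer :: ierr, ncid
          call MPI_Init(ierr)
          ! NF_64BIT_DATA requests the CDF-5 'giant variable' format;
          ! plain NF_CLOBBER would yield classic CDF-1 instead, which
          ! any netCDF build can read
          ierr = nfmpi_create(MPI_COMM_WORLD, 'out.nc', &
                              IOR(NF_CLOBBER, NF_64BIT_DATA), &
                              MPI_INFO_NULL, ncid)
          ierr = nfmpi_enddef(ncid)
          ierr = nfmpi_close(ncid)
          call MPI_Finalize(ierr)
        end program cdf5_sketch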

    I also need to read some bits of netCDF data in MATLAB, so I
    thought of sticking to the usual netCDF-4 compiled for parallel
    I/O. It is also likely that I will have to share my workload with
    others in my group and/or leave the code for future people to work
    on.

    Does MATLAB read CDF-5 files?

    So I preferred the usual netCDF. Rob, I hope you are not annoyed.

    But most of the above is for another day. Currently I am stuck
    elsewhere.

    With a smaller number of processors, 216, the single netCDF file
    gets created (I create a single netCDF file for each variable), but
    for anything above that I get these errors:
    NetCDF: Bad chunk sizes.
    I am not sure where these errors come from.
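
    One thing I plan to try (a hedged sketch only; the variable name
    and chunk sizes below are made up, and I have not verified that
    this fixes the error) is defining each variable with explicit chunk
    sizes instead of relying on the library's defaults:

        subroutine def_chunked_var(ncid, dimids, varid)
          use netcdf
          implicit none
          integer, intent(in)  :: ncid, dimids(3)
          integer, intent(out) :: varid
          integer :: status, chunks(3)
          ! each chunk size must be at least 1 and should not exceed
          ! the corresponding dimension length
          chunks = (/ 64, 64, 64 /)
          status = nf90_def_var(ncid, 'u', NF90_DOUBLE, dimids, varid)
          status = nf90_def_var_chunking(ncid, varid, NF90_CHUNKED, &
                                         chunks)
        end subroutine def_chunked_var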

    Then I shifted to dumping output from each processor in plain
    binary; this works up to about 1500 processors. Above that number
    the code gets stuck and eventually aborts.

    This issue is not new. My colleague also had problems running his
    code on 1500+ processors.

    Today I came to know that opening a large number of files (each
    processor writes one file) can overwhelm the system. Solving this
    requires more than rudimentary writing techniques, or at least an
    understanding of the system's inherent parameters and bottlenecks.
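
    One of the less rudimentary techniques I was pointed to is having
    all processors write into a single shared file with collective
    MPI-IO, so only one file is opened regardless of processor count.
    A sketch, assuming a simple 1-D decomposition (names are
    hypothetical):

        subroutine write_shared(local, nlocal, rank)
          implicit none
          include 'mpif.h'
          integer, intent(in) :: nlocal, rank
          double precision, intent(in) :: local(nlocal)
          integer :: fh, ierr
          integer(kind=MPI_OFFSET_KIND) :: offset
          call MPI_File_open(MPI_COMM_WORLD, 'field.dat', &
               MPI_MODE_CREATE + MPI_MODE_WRONLY, MPI_INFO_NULL, &
               fh, ierr)
          ! every rank writes its slab at a disjoint byte offset
          ! (8 bytes per double precision value)
          offset = int(rank, MPI_OFFSET_KIND) * nlocal * 8
          call MPI_File_write_at_all(fh, offset, local, nlocal, &
               MPI_DOUBLE_PRECISION, MPI_STATUS_IGNORE, ierr)
          call MPI_File_close(fh, ierr)
        end subroutine write_shared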

    So netCDF is probably out of reach for now; I will try again once
    the plain binary write from each processor gets sorted out.

    Does anyone have any suggestion?

    Thanks,
    Samrat.


    On Thu, Oct 2, 2014 at 7:52 PM, Rob Latham <robl@xxxxxxxxxxx> wrote:



        On 10/02/2014 01:24 AM, Samrat Rao wrote:

            Thanks for your replies.

            I estimate that I will require approximately 4000
            processors and a total grid of 2.5 billion points for my
            F90 code. So I need to think about and understand which is
            better: parallel netCDF or the 'normal' one.


        There are a few specific nifty features in pnetcdf that can let
        you get really good performance, but 'normal' netCDF is a fine
        choice, too.

            Right now I do not know how to use parallel-netCDF.


        It's almost as simple as replacing every 'nf' call with
        'nfmpi', but you will be just fine if you stick with UCAR
        netCDF-4.
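
        For instance (a sketch only, with declarations omitted and the
        file name made up), the create call just gains a communicator
        and an MPI info argument:

            ! serial netCDF, F77-style interface:
            ierr = nf_create('out.nc', NF_CLOBBER, ncid)
            ! pnetcdf equivalent: 'nfmpi' prefix plus comm and info:
            ierr = nfmpi_create(MPI_COMM_WORLD, 'out.nc', NF_CLOBBER, &
                                MPI_INFO_NULL, ncid)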

            Secondly, I hope that the netCDF-4 files created by either
            parallel netCDF or the 'normal' one are mutually
            compatible. For analysis I will be extracting data using
            the usual netCDF library, so if I use parallel-netCDF there
            should be no compatibility issues.


        For truly large variables, parallel-netcdf introduced, with some
        consultation from the UCAR folks, a 'CDF-5' file format.  You
        have to request it explicitly, and then in that one case you
        would have a pnetcdf file that netcdf tools would not understand.

        In all other cases, we work hard to keep pnetcdf and "classic"
        netcdf compatible.  UCAR NetCDF has the option of an HDF5-based
        backend (in fact, it is not optional if you want parallel I/O
        with NetCDF-4), and that format is not compatible with
        parallel-netcdf.  By now, your analysis tools surely have been
        updated to understand the new HDF5-based backend?
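
        Opening that HDF5-based format for parallel I/O from F90 looks
        roughly like this (a sketch, assuming a netCDF-4 build against
        parallel HDF5; the file name is made up):

            program par_nc4
              use mpi
              use netcdf
              implicit none
              integer :: ierr, ncid, status
              call MPI_Init(ierr)
              ! NF90_NETCDF4 selects the HDF5-based format; NF90_MPIIO
              ! plus the comm/info arguments enable parallel I/O
              status = nf90_create('out.nc', &
                                   ior(NF90_NETCDF4, NF90_MPIIO), &
                                   ncid, comm = MPI_COMM_WORLD, &
                                   info = MPI_INFO_NULL)
              status = nf90_close(ncid)
              call MPI_Finalize(ierr)
            end program par_nc4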

        I suppose it's possible you've got some six-year-old analysis
        tool that does not understand NetCDF-4's HDF5-based file
        format.  Parallel-netcdf would allow you to simulate with
        parallel I/O and produce a classic netCDF file.  But I would be
        shocked and a little bit angry if that were actually a good
        reason to use parallel-netcdf in 2014.


        ==rob


        --
        Rob Latham
        Mathematics and Computer Science Division
        Argonne National Lab, IL USA




    --

    Samrat Rao
    Research Associate
    Engineering Mechanics Unit
    Jawaharlal Nehru Centre for Advanced Scientific Research
    Bangalore - 560064, India

    _______________________________________________
    netcdfgroup mailing list
    netcdfgroup@xxxxxxxxxxxxxxxx
    For list information or to unsubscribe,  visit:
    http://www.unidata.ucar.edu/mailing_lists/



--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


