
[netCDF #VFX-630532]: Writing multiple variables at once using NetCDF4 parallel



Oops, I just noticed that I commented out the use of strides in the
attachment I sent. As far as I know, that wasn't necessary to make
things work, it was just an experiment that I forgot to back out.
I've got a meeting now, but I'll try it with the strided access restored
later this afternoon, and let you know if I'm wrong and that actually
makes a difference ...
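
For readers unfamiliar with the strided calls mentioned here: along each
dimension, nc_put_vars_double takes a start index, a count, and a stride, and
writes to positions start, start + stride, start + 2*stride, and so on. The
helper below only illustrates that index arithmetic. The name strided_indices
is hypothetical and not part of the netCDF API.

```c
#include <assert.h>
#include <stddef.h>

/* Fill idx[] with the positions along one dimension that a strided write
 * touches: start, start + stride, ..., start + (count - 1) * stride.
 * Returns the number of positions filled in, or 0 if stride is not
 * positive (netCDF rejects non-positive strides). */
size_t strided_indices(size_t start, size_t count, ptrdiff_t stride,
                       size_t *idx)
{
    assert(idx != NULL || count == 0);
    if (stride <= 0)
        return 0;
    for (size_t k = 0; k < count; k++)
        idx[k] = start + k * (size_t)stride;
    return count;
}
```

With a stride of 1 in every dimension, the positions are contiguous, which is
the same region nc_put_vara_double would write.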

--Russ
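
A minimal sketch of the approach described in the quoted message below: define
all dimensions and variables first, then have each rank write its own variable
in independent mode. This is not the attachment from the thread. The variable
names, the dimension length NX, and the one-variable-per-rank layout are
assumptions made for illustration. It needs a netCDF-C library built against
parallel HDF5, plus an MPI compiler wrapper and launcher.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <netcdf_par.h>   /* parallel prototypes such as nc_create_par() */
#include <netcdf.h>       /* NC_CLOBBER, NC_NETCDF4, nc_def_var(), ... */

#define NX 8              /* hypothetical length of every variable */

/* Abort all ranks if a netCDF call fails. */
#define CHECK(call) do { int e_ = (call); if (e_ != NC_NOERR) { \
    fprintf(stderr, "netCDF error: %s\n", nc_strerror(e_)); \
    MPI_Abort(MPI_COMM_WORLD, 1); } } while (0)

int main(int argc, char **argv)
{
    int rank, nprocs, ncid, dimid;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* NC_MPIIO is included for releases of the 4.3 era; as far as I know,
     * newer releases treat it as deprecated and do not need it. */
    CHECK(nc_create_par("try.nc", NC_CLOBBER | NC_NETCDF4 | NC_MPIIO,
                        MPI_COMM_WORLD, MPI_INFO_NULL, &ncid));

    /* Metadata phase: every rank makes the same calls in the same order. */
    CHECK(nc_def_dim(ncid, "x", NX, &dimid));
    int *varids = malloc((size_t)nprocs * sizeof *varids);
    if (varids == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);
    for (int i = 0; i < nprocs; i++) {
        char name[32];
        snprintf(name, sizeof name, "var%d", i);   /* hypothetical names */
        CHECK(nc_def_var(ncid, name, NC_DOUBLE, 1, &dimid, &varids[i]));
    }
    CHECK(nc_enddef(ncid));

    /* Data phase: each rank writes only the variable matching its rank. */
    double data[NX];
    for (int k = 0; k < NX; k++)
        data[k] = rank + 0.1 * k;
    size_t start[1] = {0}, count[1] = {NX};
    CHECK(nc_var_par_access(ncid, varids[rank], NC_INDEPENDENT));
    CHECK(nc_put_vara_double(ncid, varids[rank], start, count, data));

    CHECK(nc_close(ncid));
    free(varids);
    MPI_Finalize();
    return 0;
}
```

Built and launched along the lines of the mpicc and mpirun commands quoted
below, this should leave one variable per rank in try.nc, which can be
inspected with ncdump.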

> Hi Charles,
> 
> I've got your example working OK in parallel if I first create the try.nc
> dimensions and variables in NC_COLLECTIVE mode (the default for
> metadata operations like creating dimensions and variables), then write
> the data to the variables in parallel in NC_INDEPENDENT mode, the
> default for data I/O operations. This at least shows that you can write
> data to multiple variables in parallel. This doesn't show you can
> intersperse creating variables, writing data, and jumping in and out of
> definition mode, but that will definitely be less efficient than defining
> the dimensions and variables first, before writing the data.
> 
> I've attached my version of your example for you to test and verify it
> does what you want.
> 
> --Russ
> 
> 
> > > Hum...
> > >
> > > Commenting out nc_enddef and nc_redef leads to a netcdf file that fails
> > > when using ncdump on it.
> > >
> > > Thanks for your help I appreciate it. Will be (anxiously) waiting for
> > > your answer.
> >
> > To get your test_append.c example to compile without warnings,
> > I had to change a statement from
> >
> > ierr = nc_put_vars_double(fh, ivar, start, count, stride,
> >
> > to
> >
> > ierr = nc_put_vars_double(fh, ivar[i], start, count, stride,
> >
> > I also changed the includes to:
> >
> > #include <netcdf_par.h>
> > #include <netcdf.h>
> >
> > I built statically:
> >
> > $ mpicc -o test_append test_append.c -I${NCDIR}/include -L${NCDIR}/lib \
> >     -lnetcdf -L${H5DIR}/lib -lhdf5_hl -lhdf5 -L/usr/local/lib -ldl -lm -lz \
> >     -lcurl
> >
> > with no warnings or errors emitted.
> >
> > On one core, the program ran with no problems and seemed to
> > produce the expected output, whether I left the nc_redef() and
> > nc_enddef() calls in or not.
> >
> > But I'm also seeing problems when using 2 cores.
> >
> > $ mpirun -n 2 ./test_append
> >
> > So I'll try to figure out what's going on with this tomorrow ...
> >
> > > Also for full disclosure my netcdf lib was compiled with --enable-parallel
> > > (i.e. with the pnetcdf lib as well). Should I try to NOT do that?
> >
> > I don't think that should make any difference, since you're not calling
> > nc_create_par with NC_PNETCDF or-ed into the create-mode flag.
> >
> > --Russ
> >
> > > C.
> > >
> > > On 06/03/2015 03:08 PM, Unidata netCDF Support wrote:
> > > > >> Unfortunately that does not change anything... I tried to include
> > > > >> netcdf.h after netcdf_par.h; it doesn't change anything.
> > > > >>
> > > > >> I tried not including netcdf.h, but then the code does not compile at
> > > > >> all, as it needs things like NC_CLOBBER etc...
> > > > Sorry for the bad advice.  I was hoping it was something simple, but now
> > > > I see even the minimal documentation about netcdf_par.h is wrong, it
> > > > obviously can't be used instead of netcdf.h.
> > > >
> > > > I'll have to reproduce the problem here with your test_append.c and see
> > > > > if I can get it working. Unfortunately our parallel I/O expertise is
> > > > > gone, so it may take some time.
> > > >
> > > >> Any other idea?
> > > > I'm not sure what you intend with nc_redef() and nc_enddef().
> > > > There is no need for either nc_redef() or nc_enddef() calls for
> > > > netCDF-4 files, unless you're setting special variable properties
> > > > such as compression. Quoting from the docs:
> > > >
> > > > >   For netCDF-4 files (i.e. files created with NC_NETCDF4 in the cmode
> > > > >   in their call to nc_create()), it is not necessary to call nc_redef()
> > > > >   unless the file was also created with NC_STRICT_NC3. For straight-up
> > > > >   netCDF-4 files, nc_redef() is called automatically, as needed.
> > > > >     ...
> > > > >   It's not necessary to call nc_enddef() for netCDF-4 files. With
> > > > >   netCDF-4 files, nc_enddef() is called when needed by the netcdf-4
> > > > >   library. User calls to nc_enddef() for netCDF-4 files still flush
> > > > >   the metadata to disk.
> > > > >     ...
> > > > >   For netCDF-4/HDF5 format files there are some variable settings (the
> > > > >   compression, endianness, fletcher32 error correction, and fill value)
> > > > >   which must be set (if they are going to be set at all) between the
> > > > >   nc_def_var() and the next nc_enddef(). Once the nc_enddef() is called,
> > > > >   these settings can no longer be changed for a variable.
> > > >
> > > > > So, you might also try just removing the nc_enddef() and nc_redef()
> > > > > calls, though I don't want to send you on another wild goose chase.
> > > > > I'll get around to testing that here soon ...
> > > >
> > > > --Russ
> > > >
> > > >> On 06/03/2015 09:55 AM, Unidata netCDF Support wrote:
> > > >>> Hi Charles,
> > > >>>
> > > > >>>> I just set out to port our software to use NetCDF4 parallel (NetCDF
> > > > >>>> 4.3.3.1, HDF5 1.8.13).
> > > > >>>>
> > > > >>>> I’m trying to have multiple cores write to different variables. Is
> > > > >>>> that possible, or can you only write to ONE variable at a time?
> > > > >>>>
> > > > >>>> When running the attached code with ONE core, it works happily; when
> > > > >>>> running with multiple cores (mpirun -n 2 or more), only the variables
> > > > >>>> written by the highest rank are actually in the file.
> > > > >>>>
> > > > >>>> Is it possible to achieve what I’m trying to do? If so, what am I
> > > > >>>> doing wrong?
> > > > >>> Yes, I think it is.  One thing you're doing wrong is including
> > > > >>> netcdf_par.h after netcdf.h. According to a note in some release
> > > > >>> notes that doesn't seem to have made it into the actual
> > > > >>> documentation:
> > > > >>>
> > > > >>>   Users of parallel I/O with netCDF-4 please note: starting with the
> > > > >>>   4.1.2 release the parallel I/O functions are prototyped in
> > > > >>>   netcdf_par.h, not netcdf.h. You must include netcdf_par.h BEFORE
> > > > >>>   netcdf.h to use parallel I/O with netCDF-4.
> > > > >>>   http://www.unidata.ucar.edu/netcdf/release-notes-4.1.2.html
> > > > >>>
> > > > >>> That also seems to be in the 2012 netCDF workshop section on using
> > > > >>> parallel I/O:
> > > > >>>
> > > > >>>   For parallel builds you must include "netcdf_par.h" before (or
> > > > >>>   instead of) netcdf.h.
> > > > >>>   http://www.unidata.ucar.edu/netcdf/workshops/2012/pnetcdf/BuildingParallel.html
> > > > >>>
> > > > >>> I don't know if that's the problem, but it's the first problem I
> > > > >>> spotted, so you might want to try just deleting the
> > > > >>> "#include <netcdf.h>" statement from your test_append.c.
> > > >>>
> > > >>> --Russ
> > > > >>>> Also I tried to stay in def mode only first (i.e. two loops, one
> > > > >>>> declaring all variables, then move in redef mode and write variables
> > > > >>>> in 2nd loop) but it didn’t make a difference.
> > > >>>>
> > > >>>> Thanks for any pointer,
> > > >>>>
> > > >>>> C.
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>> Russ Rew                                         UCAR Unidata Program
> > > >>> address@hidden                      http://www.unidata.ucar.edu
> > > >>>
> > > >>>
> > > >>>
> > > >>> Ticket Details
> > > >>> ===================
> > > >>> Ticket ID: VFX-630532
> > > >>> Department: Support netCDF
> > > >>> Priority: Normal
> > > >>> Status: Closed
> > > >>>
> > > >>>
> > > >>
> > > >>
> > >
> > >
> > >
> 

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: VFX-630532
Department: Support netCDF
Priority: Normal
Status: Closed