Re: [netcdfgroup] netcdf4 python

To: Chris Barker <chris.barker@xxxxxxxx>
Subject: Re: [netcdfgroup] netcdf4 python
From: "Elizabeth A. Fischer" <elizabeth.fischer@xxxxxxxxxxxx>
Date: Fri, 27 May 2016 18:03:57 -0400

Chris,



> I think there is a key different use-case. It looks like spack was
> designed with super computers in mind, which means, as you say, a very
> particular configuration, so collections of binary packages aren't very
> useful.
>
> conda, on the other hand, was designed with "standard" systems in mind --
> pretty easy on Windows and OS-X, and they have taken a least common
> denominator approach that works pretty well for Linux. But yeah, the
> repositories of binaries are only useful because of this standardization.
>

This is not true.  Supercomputers are just big shared Linux systems.
Simultaneously installing multiple versions of the same thing (without a
new repo for each one) is more obviously necessary there than on PCs.  But
it's still important on PCs as well.  People are increasingly finding
Environment Modules to be a useful tool for PCs.

Spack is useful on any computer where you need control of your
configuration and need more than one version of something installed --- or
where the version YOU need installed is different from the version the
autobuilder authors need installed.  This is almost always the
case when you're dealing with significant software stacks.  Maybe you need
to use a little-known library that's incompatible with the latest version
of NetCDF.

Going through a set of 50 Conda packages and fixing the version numbers and
MD5 codes to get your particular software stack working is quite tedious.
EasyBuild works like Conda --- one version, one package --- and the result
has been an unmanageable proliferation of recipes for each package, many
of them differing only slightly.  Before I tried Spack, I tried EasyBuild;
it was a nightmare.

I thought of auto-rewriting EasyBuild recipes too.  To fix EasyBuild,
you'd want a system that produces a specific repo of recipes from a repo of
multi-version "proto-recipes."  That is, you will want a system that
contains Spack's version-wrangling stuff.  Now that Spack manages my
versions, I'd hate to do it by hand again.  In a Conda repo, how do you
even know that you haven't specified incompatible versions of some library
deep down?  (E.g., Package A requires NetCDF 1 and Package B requires NetCDF
2.)  Spack roots out and reports/fixes these kinds of problems.
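
To make the Package A / Package B example concrete, here is a minimal
Python sketch of the kind of check involved.  It is only an illustration
under stated assumptions: the package names, the version strings, the
REQUIREMENTS table and the find_conflicts() helper are all hypothetical,
and this is not how Spack or Conda actually implement it.

```python
from collections import defaultdict

# Hypothetical dependency table: each package maps to the exact version it
# requires of each direct dependency.  Names and versions are made up.
REQUIREMENTS = {
    "my-app": {"package-a": "1.0", "package-b": "1.0"},
    "package-a": {"netcdf": "1"},
    "package-b": {"netcdf": "2"},
    "netcdf": {},
}


def find_conflicts(root, requirements):
    """Walk the dependency DAG from `root` and return
    {dependency: {version: [packages requiring it]}} for every dependency
    that is requested at more than one version."""
    requested = defaultdict(lambda: defaultdict(list))
    seen = set()
    stack = [root]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        for dep, version in requirements.get(pkg, {}).items():
            requested[dep][version].append(pkg)
            stack.append(dep)
    return {
        dep: {version: sorted(who) for version, who in versions.items()}
        for dep, versions in requested.items()
        if len(versions) > 1
    }


conflicts = find_conflicts("my-app", REQUIREMENTS)
for dep, versions in sorted(conflicts.items()):
    for version, requirers in sorted(versions.items()):
        print(f"{dep} {version} required by {', '.join(requirers)}")
```

With 50 packages the table is too large to check by eye, which is why
having a tool do this walk matters.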

Before using Spack, I did look at Conda, at least as described by Continuum
Analytics.  I could never find a description of how to get it to do the things
you say it can do; for example, build everything from source.  The first
step in the Conda docs was to download a binary Python, which would not
match my build.  The reality of software is that even if it "can" do
something, it only CAN do that if users can find documentation describing
how.



> I'm going to extend that -- building python extensions with conda is VERY
> doable -- it really helps make it much easier (than not using conda or
> spack...). And you can build conda packages that you can then distribute to
> users that may not have the "chops" to do the building themselves.
>

Typing "spack install" or "conda install" does not require many chops.
Doesn't matter whether under the hood it's installing binaries or building
from source.  I have Spack-based builds that install smoothly without
modification on a moderate variety of Linux systems (i.e. those I've tested
on).


> IIUC, the big difference, and use case for spack is that it makes all this
> doable on a highly specialized system.
>

I'm running CentOS 7.  Is that highly specialized?  No... the use case for
Spack is if any of the following are true:

1. You need more than one version of stuff installed simultaneously, without having
to rebuild entire separate software repos.

2. You want a system that will be flexible in terms of which versions of
stuff you use, allowing you to selectively bypass problematic versions of
particular packages for your build.  You want the system to easily update
to new versions as they come out.

3. You don't want to manage and check version compatibility in your
software DAG by hand.

4. You want a portable, trouble-free way to build your software stack on a
wide variety of other systems, allowing others to install your stuff with
less than the 2 weeks of build time required manually.

5. You want the auto-builder to help find dependencies and set up builds
for the projects YOU create, as well as for standard projects that come in
the repo.

6. You want to distribute your software in a way that doesn't assume it
will be the top level; that allows others to build on top of it if they
like.

7. You're a supercomputer admin, and you want to get away from
hand-building packages and hand-editing module files every time someone
requests a new package be installed.
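
As an illustration of point 1, one way a tool can keep several versions
of a package installed side by side is to give every distinct
configuration its own install directory.  The sketch below is an
assumption-laden example, not Spack's actual code: the install_prefix()
helper, the /opt/stack root, and the version and compiler strings are
all hypothetical.

```python
import hashlib


def install_prefix(name, version, compiler, deps=()):
    """Return a distinct install directory for one configuration.

    `deps` is an iterable of 'name@version' strings for the direct
    dependencies.  The directory layout here is hypothetical."""
    # Build a canonical description of the configuration; sorting the
    # dependencies makes the result independent of argument order.
    spec = f"{name}@{version}%{compiler}^" + "^".join(sorted(deps))
    digest = hashlib.sha1(spec.encode("utf-8")).hexdigest()[:8]
    return f"/opt/stack/{compiler}/{name}-{version}-{digest}"


# Made-up versions: three configurations that must not overwrite each other.
a = install_prefix("netcdf", "4.3.3", "gcc-4.9", ["hdf5@1.8.16"])
b = install_prefix("netcdf", "4.4.0", "gcc-4.9", ["hdf5@1.8.16"])
c = install_prefix("netcdf", "4.4.0", "gcc-4.9", ["hdf5@1.10.0"])
print(a, b, c, sep="\n")
```

Because the directory depends on the version, the compiler and the
dependency versions, a new build never clobbers an existing one, and a
module file can point at whichever prefix a given user needs.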

-- Elizabeth
