Re: [thredds] How to download bulk datasets?

Hi Heiko:

We use catalog.xml exactly because there's no standard HTML index format. A simple Java GUI app could make this easy to do, but I'm not clear if that would help your case.

John

On 5/6/2010 3:16 AM, Heiko Klein wrote:
Hi John,

I don't think there is a standard format for directory index / listings.
Looking at the different implementations (Tomcat (DefaultServlet,
listings = true), Jetty (dirAllowed = true), Apache (mod_dir,
DirectoryIndex)), the common pattern is that they all have links to all
(non-hidden) files in the directory, and not much more (possibly a link to
the parent directory and some GIF/PNG icons that differ between file and
directory). THREDDS listings of 'datasetScan' look very similar to the
Tomcat listings, except that they link to the dataset-overview page, and
not to the fileServer page.
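
To make the pattern described above concrete, here is a minimal Python sketch of such a listing: a page that holds little more than a parent link and one link per non-hidden file. The function name `render_index` and the file names are hypothetical, and the markup is not taken from Tomcat, Jetty, Apache or THREDDS; it only illustrates the shape that link-following downloaders rely on.

```python
from html import escape
from urllib.parse import quote


def render_index(title, names):
    """Return a bare-bones HTML directory listing for the given file names."""
    lines = [
        "<html><head><title>%s</title></head><body>" % escape(title),
        "<h1>%s</h1>" % escape(title),
        "<ul>",
        '<li><a href="../">Parent Directory</a></li>',
    ]
    for name in sorted(names):
        if name.startswith("."):
            continue  # skip hidden files, as the servers above do
        # Percent-encode the link target, HTML-escape the visible label.
        href = escape(quote(name), quote=True)
        lines.append('<li><a href="%s">%s</a></li>' % (href, escape(name)))
    lines += ["</ul>", "</body></html>"]
    return "\n".join(lines)


# Hypothetical file names, for illustration only.
print(render_index("ice-drift", ["example_b.nc", "example_a.nc", ".hidden"]))
```

Whether a given downloader accepts such a page still depends on where the links point, so this is a sketch of the format only.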

RAMADDA looks like a solution for a completely different type of user,
except for the embedded FTP server.

Best regards,

Heiko


On 2010-05-05 01:28, John Caron wrote:
Hi Heiko:

TDS specializes in the logical subsetting of datasets, so we haven't
thought much about file downloading.

The index is provided by THREDDS catalogs, e.g.

view-source:http://thredds.met.no/thredds/catalog/data/met.no/ice-drift/catalog.xml


If it was me, I would write a nice little client app to make it easy to
select files and download. Perhaps we will throw one together.
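
As a rough illustration of what such a client could do, here is a minimal Python sketch that reads a catalog.xml document and lists the fileServer URLs of its files. It assumes an InvCatalog 1.0 document in which an HTTPServer service declares the download base path and each file is a dataset element with a urlPath attribute. The function name `file_urls`, the sample catalog and its file names are hypothetical; nested catalogs (catalogRef elements) are not followed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# XML namespace of THREDDS InvCatalog 1.0 documents.
NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"

# Hypothetical, cut-down catalog for illustration; a real catalog.xml
# carries more metadata and real file names.
SAMPLE_CATALOG = """\
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="all" serviceType="Compound" base="">
    <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
    <service name="http" serviceType="HTTPServer" base="/thredds/fileServer/"/>
  </service>
  <dataset name="ice-drift" ID="data/met.no/ice-drift">
    <dataset name="example_a.nc" urlPath="data/met.no/ice-drift/example_a.nc"/>
    <dataset name="example_b.nc" urlPath="data/met.no/ice-drift/example_b.nc"/>
    <dataset name="readme.txt" urlPath="data/met.no/ice-drift/readme.txt"/>
  </dataset>
</catalog>
"""


def file_urls(catalog_xml, catalog_url, suffix=".nc"):
    """Return download URLs for all datasets whose urlPath ends in suffix."""
    root = ET.fromstring(catalog_xml)

    # Find the base path of the plain HTTP download service.
    base = None
    for service in root.iter(NS + "service"):
        if service.get("serviceType", "").lower() == "httpserver":
            base = service.get("base")
            break
    if base is None:
        return []  # this catalog offers no plain file download

    urls = []
    for dataset in root.iter(NS + "dataset"):
        url_path = dataset.get("urlPath")
        if url_path and url_path.endswith(suffix):
            # Resolve the server-relative path against the catalog's own URL.
            urls.append(urljoin(catalog_url, base + url_path))
    return urls


catalog_url = "http://thredds.met.no/thredds/catalog/data/met.no/ice-drift/catalog.xml"
for url in file_urls(SAMPLE_CATALOG, catalog_url):
    print(url)
```

The printed list could be saved to a text file and passed to `wget -i`, which would cover the bulk-download case without any directory listing.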

If there is some standard format for "index.html" that works with wget
and other clients, perhaps we can provide that.

Otherwise, RAMADDA is another good solution.

John

On 5/3/2010 3:47 AM, Heiko Klein wrote:
Hi,

We are moving more and more from our FTP solutions to THREDDS with HTTP
and OPeNDAP enabled.

Some users complain about this solution, since it is no longer possible
to download bulk datasets, that is, all files in one directory. Our
FTP server supported 'ls', and several FTP clients have support for that,
so e.g.
$ ftp ftp.my.server
ftp> cd directory
ftp> mget *.nc
worked well.

There are some HTTP downloaders that support mirroring a directory,
which would be comparable, but this requires a proper directory listing
for the HTTP download.

An example:
http://thredds.met.no/thredds/catalog/data/met.no/ice-drift/
contains daily files for several years. Two clicks further,
http://thredds.met.no/thredds/fileServer/data/met.no/ice-drift/ice-drift_ice_drift_nh_polstere-625_multi-oi_200912311200-201001021200.nc

is one of those files.

wget -r -l1 --no-parent -A.nc \
  'http://thredds.met.no/thredds/fileServer/data/met.no/ice-drift/'
was my best try to get all netcdf-files in the ice-drift catalog.
Unfortunately, this requires an ice-drift/index.html (or
directory-listing), which doesn't exist.


Does anybody know of a solution to download several (hundred) files
from a THREDDS server in a simple way?
I even thought about aggregation, but as far as I can see, this doesn't work
with an HTTP downloader and instead requires an OPeNDAP client (e.g. nco),
which might be too complicated, and might lead to errors if products
change over the years (better resolution, updated metadata...).

Best regards,

Heiko

_______________________________________________
thredds mailing list
thredds@xxxxxxxxxxxxxxxx
For list information or to unsubscribe,  visit:
http://www.unidata.ucar.edu/mailing_lists/
