Re: [thredds] THREDDS directory scanning question

  • To: Georgi Kostov - NOAA Affiliate <georgi.kostov@xxxxxxxx>
  • Subject: Re: [thredds] THREDDS directory scanning question
  • From: Brian Blanton <bblanton@xxxxxxxxx>
  • Date: Wed, 16 Sep 2015 15:44:46 +0000
Indeed, that is another suggestion I'll try. Thanks a billion.

Cheers,

Brian O. Blanton, Ph.D.
Director of Environmental Initiatives
Oceanographer
Renaissance Computing Institute
University of North Carolina at Chapel Hill
100 Europa Drive
Suite 540 Chapel Hill, NC, 27517

Brian_Blanton@xxxxxxxxx
919-445-9620 (O)


From: Georgi Kostov - NOAA Affiliate
Date: Wednesday, September 16, 2015 at 11:43 AM
To: Brian Blanton
Cc: John Caron, Christian Ward-Garrison, thredds@xxxxxxxxxxxxxxxx
Subject: Re: [thredds] THREDDS directory scanning question

Brian,

One possibility in this case then might be to create a single directory 
containing symlinks to your various directories 
<simulationDir>/data/content/<possible netCDF files here>.  Maybe even name the 
symlinks with the simulation name.  Then expose that umbrella directory of 
symlinks to TDS and let it scan away. You may not even need to restart TDS when 
you add a symlink to a new simulation.
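
A minimal sketch of the symlink approach described above, in Python. The function name link_simulations and its parameters are hypothetical; the only thing taken from the thread is the <simulationDir>/data/content layout. It assumes netCDF files end in ".nc" and that the umbrella directory lives outside the simulation tree. Whether TDS picks up new links without a restart is not something this script can confirm.

```python
import os
from pathlib import Path


def link_simulations(sim_root, umbrella, suffix=".nc"):
    """Symlink each <simulationDir>/data/content that holds netCDF files.

    Returns the sorted names of the simulations that have a link.
    """
    sim_root, umbrella = Path(sim_root), Path(umbrella)
    umbrella.mkdir(parents=True, exist_ok=True)
    linked = []
    for sim_dir in sorted(p for p in sim_root.iterdir() if p.is_dir()):
        content = sim_dir / "data" / "content"
        if not content.is_dir():
            continue
        # Only files sitting directly inside data/content are considered.
        if not any(f.is_file() and f.suffix == suffix for f in content.iterdir()):
            continue
        # Name the link after the simulation, as suggested in the thread.
        link = umbrella / sim_dir.name
        if not link.is_symlink() and not link.exists():
            os.symlink(content.resolve(), link, target_is_directory=True)
        linked.append(sim_dir.name)
    return linked
```

Running it again after a new simulation arrives only adds the missing links, so it could be called from cron or at the end of each simulation run.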

IHTH,
Georgi


On Wed, Sep 16, 2015 at 11:34 AM, Brian Blanton <bblanton@xxxxxxxxx> wrote:
Thank you both for these replies.  This is a great community.

Christian, re: your idea, I hadn't thought of that, so I will try something 
like that.  The exclude list could get very long, but that's probably OK.  [I 
don't know anything about Java, so adding to the TDS would be impossible for me.]

John, re: #1, in this specific case, yes, I know exactly at what directory 
level the netCDF files would exist, if at all.  It's

<simulationDir>/data/content/<possible netCDF files here>

But I can also envision that generally, I might want to build a "dynamic" 
catalog where this is not known a priori.

re: #2, the directory names are literally <simulationDir>/data/content/, where 
<simulationDir> is some unique identifier.  In this particular case, it's a 
GUID.

The script I have will make a catalog that works; I was just wondering if TDS 
could do it instead.



Cheers,

Brian O. Blanton, Ph.D.


From: John Caron
Date: Tuesday, September 15, 2015 at 8:33 PM
To: Christian Ward-Garrison
Cc: Brian Blanton, thredds@xxxxxxxxxxxxxxxx
Subject: Re: [thredds] THREDDS directory scanning question

Hi Brian:

1) It means we would have to scan the directory contents before deciding to 
include it. Can we assume that we only need to scan one level, and not recurse, 
meaning we wouldn't find:

dirWith/dirWithout/dirWith

since the nested dirWith would not get shown once dirWithout is hidden?

2) Are the directories themselves named in a way that can be filtered?
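
To make the one-level assumption in 1) concrete, here is a small Python sketch. It is not TDS code; has_netcdf and visible_dirs are hypothetical names, and ".nc" as the netCDF suffix is an assumption. A directory is shown only if it directly contains a netCDF file, and a directory that fails the check hides everything beneath it, which is one reading of the dirWith/dirWithout/dirWith case.

```python
from pathlib import Path


def has_netcdf(directory, suffix=".nc"):
    """True if `directory` directly contains at least one netCDF file."""
    return any(p.is_file() and p.suffix == suffix for p in Path(directory).iterdir())


def visible_dirs(root, suffix=".nc"):
    """List directories under `root` that pass the one-level check.

    The walk descends only into directories that are themselves shown.
    """
    shown = []
    for child in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        if has_netcdf(child, suffix):
            shown.append(child)
            shown.extend(visible_dirs(child, suffix))
        # A directory that fails the check is hidden with everything below it.
    return shown
```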

John


On Tue, Sep 15, 2015 at 5:52 PM, Christian Ward-Garrison <cwardgar@xxxxxxxx> wrote:
Hi Brian,

There is currently no way to specify a filter that accepts a collection 
(directory) only if it contains at least one *.nc file. Well, technically there 
is (crawlableDatasetFilterImpl), but it's not documented and is slated to be 
removed in 5.0 anyway [1]. Plus, you'd have to write the logic in Java and add 
it to the classpath when you ran the TDS. Tedious.

An idea: in your catalog-building script, instead of explicitly including 
individual NetCDF files, you might instead try explicitly *excluding* 
directories with no NetCDF files in them.
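
A rough Python sketch of what generating such an exclude list might look like. The function name exclude_elements is hypothetical, and the sketch assumes the <simulationDir>/data/content layout mentioned elsewhere in the thread and a ".nc" suffix. It also assumes the filter's wildcard is matched against the directory name rather than the full path; that should be checked against the TDS documentation for the version in use. The element syntax simply mirrors the filter in the original question.

```python
from pathlib import Path
from xml.sax.saxutils import quoteattr


def exclude_elements(sim_root, suffix=".nc"):
    """Return one <exclude> line per simulation directory without netCDF files."""
    lines = []
    for sim_dir in sorted(p for p in Path(sim_root).iterdir() if p.is_dir()):
        content = sim_dir / "data" / "content"
        # Check only files sitting directly inside data/content.
        has_nc = content.is_dir() and any(
            f.is_file() and f.suffix == suffix for f in content.iterdir()
        )
        if not has_nc:
            # quoteattr adds the surrounding quotes and escapes XML specials.
            name = quoteattr(sim_dir.name)
            lines.append(f'<exclude wildcard={name} collection="true"/>')
    return lines
```

The returned lines could be pasted, or written by the script, inside the <filter> element of the datasetScan.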

Cheers,
Christian

[1] http://www.unidata.ucar.edu/software/thredds/v5.0/tds/UpgradingTo5.html#_datasetscan

On Mon, Sep 14, 2015 at 12:31 PM, Brian Blanton <bblanton@xxxxxxxxx> wrote:
Hi All, I have a directory scanning question.  I want to expose only netCDF 
files in a large directory structure that will have many directories without 
netCDF files.  If a directory does not have a netCDF file, I need to have it 
hidden.

So, if I have

dir1/subdir1/<bunch of files but NO netCDF files>
dir1/subdir2/<bunch of files and some netCDF files>

I only want dir1/subdir2/<netCDF files> to be in the catalog, and not 
dir1/subdir1 with nothing at the endpoint.

I can make a script that builds a catalog file that explicitly contains the 
netCDF files, but would rather do this with datasetScan.  I was hoping that the 
following would exclude everything except netCDF files.

<filter>
    <include wildcard="*.nc" collection="false"/>
    <exclude wildcard="/*" collection="true"/>
</filter>

I hope I've articulated this well.  If anyone has any guidance, that would be 
great.


Cheers,

Brian O. Blanton, Ph.D.


_______________________________________________
thredds mailing list
thredds@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit:
http://www.unidata.ucar.edu/mailing_lists/





--
Georgi Kostov
Team ERT/STG, US Government Contractor
Data Access Branch, NOMADS/NCMP Team
NOAA's National Centers for Environmental Information (NCEI)
151 Patton Ave., Suite 468, Asheville, NC 28801-5001
georgi.kostov@xxxxxxxx // (828) 271-4921 // http://nomads.ncdc.noaa.gov/


