Hi Bas,
Your plan sounds good. Here are a few comments and answers:
1) We have some JUnit tests in the test/src directory that might help
you get going. Take a look at
test/src/thredds/cataloggen/TestCollectionLevelScanner.java. If you can
get that running and play with it some it should help you understand
CollectionLevelScanner and CrawlableDataset. Some of the files in
test/src/thredds/crawlabledataset might be helpful to look at as well.
Most of the tests use data from the test/data directory (including some
data to crawl) so you should have everything you need to get the tests
working for you. There are a few spots in the crawlabledataset tests
that use data local to my machine (sorry, we're still trying to clean up
our testing).
2) The ant build script, build.xml, contains a "makeWar" target for
building the thredds.war file. Also, "compile" and "test-setup" might
come in handy for getting the above tests running ("cleanCompileTest"
does both after a clean). If you have trouble with that, I can certainly
build a .war with your files included. Just let me know.
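For reference, the targets mentioned above would be invoked roughly like this (assuming ant is on your path and you run it from the directory containing build.xml; the target names are from the current build script):

```shell
ant cleanCompileTest   # clean, then run "compile" and "test-setup"
ant compile            # compile the source
ant test-setup         # stage the test data/classes
ant makeWar            # build the thredds.war file
```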
3) Hmm. Nothing jumps out at me as to why your "Test" class wouldn't
work. Does it actually fail to generate a catalog? Or just generate a
skeleton catalog?
4) Yeah, the collection, catalog, and current levels are kind of
confusing. Basically, if you have a hierarchical data collection (I'll
talk in terms of CrawlableDatasetFile where the collection is
represented by a local file directory structure containing the data
files), the directory at the top of your data collection would be the
collection level. CollectionLevelScanner only scans one level
(directory) at a time. The current level is the directory that you want
to scan. The catalog level is where it gets confusing. The issue is if
you want to build a catalog that shows multiple levels of the
collection. To do this, you need to construct a CollectionLevelScanner
and call generateCatalog() on it for each level to be cataloged and then
piece the resulting catalogs together (see StandardCatalogBuilder for an
example of this). Because the service base URLs might be relative, you
need to know the relative location of the datasets currently being
crawled to construct the dataset URL. The catalog level specifies the
relative location. You leave the catalog level null if you are not
building a multi-level catalog.
If I ever get around to refactoring this stuff, this is one of the
things I might rethink. Sorry it is confusing. Hope this explanation helps.
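A rough sketch of the multi-level pattern described above (the scanner constructor and generateCatalog() are as used in your Test class; listDatasets() and isCollection() are my reading of the CrawlableDataset interface, and the final merge step is only hinted at, so treat this as a sketch rather than working code):

```java
package thredds.cataloggen;

import thredds.catalog.InvCatalogImpl;
import thredds.catalog.InvService;
import thredds.crawlabledataset.CrawlableDataset;
import thredds.crawlabledataset.CrawlableDatasetFactory;

public class MultiLevelSketch {
    public static void main( String[] args ) throws Exception {
        // Top of the data collection (adjust the path for your machine).
        String collectionPath = "myData";
        CrawlableDataset collectionLevel = CrawlableDatasetFactory
                .createCrawlableDataset( "file:///d:/thredds/", null, null );
        InvService service = new InvService( "my_service", "File",
                "file:///d:/", null, null );

        // Single-level catalog: catalogLevel is null, currentLevel is the
        // one directory to scan (here the top of the collection).
        CollectionLevelScanner topScanner = new CollectionLevelScanner(
                collectionPath, collectionLevel, null, collectionLevel,
                null, service );
        topScanner.scan();
        InvCatalogImpl topCatalog = topScanner.generateCatalog();

        // Multi-level catalog: one scanner per sub-directory. Here
        // catalogLevel (the top of the collection) tells each scanner where
        // the assembled catalog will live, so relative dataset URLs come
        // out right.
        for ( Object o : collectionLevel.listDatasets() ) {
            CrawlableDataset child = (CrawlableDataset) o;
            if ( !child.isCollection() )
                continue;
            CollectionLevelScanner subScanner = new CollectionLevelScanner(
                    collectionPath, collectionLevel, collectionLevel, child,
                    null, service );
            subScanner.scan();
            InvCatalogImpl subCatalog = subScanner.generateCatalog();
            // ... then piece subCatalog into topCatalog the way
            // StandardCatalogBuilder does.
        }
    }
}
```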
5) You can see the XML representation of an InvCatalog by creating an
InvCatalogFactory
InvCatalogFactory fac = InvCatalogFactory.getDefaultFactory( false );
and calling one of the writeXML() methods on it.
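For example (a minimal sketch: the InvCatalogImpl constructor call is just a stand-in so the snippet is self-contained, and I am assuming the no-argument-beyond-catalog writeXML() overload returns the document as an XML string; normally the catalog would come from generateCatalog()):

```java
package thredds.cataloggen;

import thredds.catalog.InvCatalogFactory;
import thredds.catalog.InvCatalogImpl;

public class WriteXmlSketch {
    public static void main( String[] args ) throws Exception {
        // Stand-in catalog; in practice use the result of
        // CollectionLevelScanner.generateCatalog().
        InvCatalogImpl catalog = new InvCatalogImpl( "testCatalog", "1.0", null );

        InvCatalogFactory fac = InvCatalogFactory.getDefaultFactory( false );
        // writeXML() has several overloads (String, file, stream); this one
        // is assumed to return the catalog as an XML string.
        System.out.println( fac.writeXML( catalog ) );
    }
}
```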
Hope this all helps. Let me know if you have more questions.
Ethan
Bas Retsios wrote:
Hello Ethan,
Thank you for your e-mail.
I hope you have some time and patience to help me, as I am now
convinced that the best approach is to go with a CrawlableDataset.
I have come up with the following approach:
- Understand how the class CrawlableDatasetFile works
- For this, I will need to easily "run" a part of Thredds from within
Eclipse. As I cannot find any existing code that does this, I will
make my own "Test" class for performing some calls
involving CrawlableDatasetFile.
- After understanding the existing functionality, I will make a
similar class, named e.g. CrawlableDatasetDods. If more classes are
needed, I will also make them. As much as possible, I will copy code
from DodsDirDatasetSource and similar classes.
- I will change my "Test" class, to make calls that involve the new
CrawlableDatasetDods classes.
- All new CrawlableDatasetDods-related classes will be tested
extensively with my "Test" class, using at least two different DODS
servers as a source.
- As I do not know how to make the thredds.war file, I will submit all
classes to you, then I would like you to send me a new .war file
containing the entire thredds.
- If you think the classes I have created are useful, you can include
them in the official Thredds release. I have to discuss the copyright
with my boss, but we do have an "open-source" policy for everything we
develop.
- Unless I have made a terrible mistake in estimating the time needed
for the above, my intention is to have everything ready within one
week.
Please inform me if you have a better approach.
I have started making a simple "Test" class. See attachment:
Test.java. In this file, you will see that I could not figure out how
to properly use CollectionLevelScanner, CrawlableDataset, InvService
and InvCatalogImpl.
Could you please help me improve the "Test" class, and in particular
in the following aspects:
1. I have no data that CrawlableDatasetFile could crawl. Do you have a
link to some data that I can extract on my harddisk?
2. Help me understand collectionPath, collectionLevel and catalogLevel
by changing my code in Test.java (note that I already read the JavaDoc
of CollectionLevelScanner before writing Test.java)
3. Point me to some code with which I could see that a proper
InvCatalogImpl was generated (other than catalog.getName()).
Thanks in advance,
Bas Retsios.
Software Developer
IT Department, Sector Remote Sensing & GIS
International Institute for Geo-information Science and Earth
Observation (ITC)
P.O. Box 6, 7500 AA Enschede, The Netherlands
Phone +31 (0)53 4874 573, telefax +31 (0)53 4874 335
E-mail retsios@xxxxxx, Internet http://www.itc.nl
------------------------------------------------------------------------
package thredds.cataloggen;

import java.io.IOException;
import java.lang.reflect.InvocationTargetException;

import thredds.catalog.InvCatalogImpl;
import thredds.catalog.InvService;
import thredds.crawlabledataset.CrawlableDataset;
import thredds.crawlabledataset.CrawlableDatasetFactory;
import thredds.crawlabledataset.CrawlableDatasetFilter;

public class Test {
    public static void main( String[] args ) throws IOException,
            IllegalArgumentException, ClassNotFoundException,
            NoSuchMethodException, IllegalAccessException,
            InvocationTargetException, InstantiationException {
        String collectionPath = "file:///d:/thredds/";

        // Collection and catalog levels both point at the top of the
        // collection; currentLevel and filter are left null for now.
        CrawlableDataset collectionLevel = CrawlableDatasetFactory
                .createCrawlableDataset( "file:///d:/thredds/", null, null );
        CrawlableDataset catalogLevel = CrawlableDatasetFactory
                .createCrawlableDataset( "file:///d:/thredds/", null, null );
        CrawlableDataset currentLevel = null;
        CrawlableDatasetFilter filter = null;

        InvService service = new InvService( "my_service", "File",
                "file:///d:/", null, null );

        CollectionLevelScanner cls = new CollectionLevelScanner(
                collectionPath, collectionLevel, catalogLevel, currentLevel,
                filter, service );
        cls.scan();
        InvCatalogImpl catalog = cls.generateCatalog();
        System.out.println( catalog.getName() );
    }
}
--
Ethan R. Davis Telephone: (303) 497-8155
Software Engineer Fax: (303) 497-8690
UCAR Unidata Program Center E-mail: edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO 80307-3000 http://www.unidata.ucar.edu/
---------------------------------------------------------------------------