Hi Christian,
Yes, I've discovered the same thing over the past 24 hours!
When I call Tika at the command line, using the following, I now can
successfully open and parse a .grib2 file.
java -classpath
.:netcdfAll-4.3.jar:tika-app/target/tika-app-1.6-SNAPSHOT.jar:annie-parsers.jar
org.apache.tika.cli.TikaCLI --metadata gdas1.forecmwf.2014062612.grib2
However, I also get the following warnings (presumably caused by the
duplicate versions), which print above my parsed text:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/Users/IGSWAHWSWBURGESS/Development/tikadev/tika/netcdfAll-4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/Users/IGSWAHWSWBURGESS/Development/tikadev/tika/tika-app/target/tika-app-1.6-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
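
For anyone hitting the same warning, here is a minimal sketch (not part of
Tika or NetCDF-Java) that lists every classpath entry supplying a given
resource. The class name BindingLocator and the method locate are made up
for illustration; the default resource path is the one named in the SLF4J
output above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: lists every classpath location that supplies a resource. */
final class BindingLocator {

    /** Returns the URLs of all copies of resourcePath visible to this class's loader. */
    static List<URL> locate(String resourcePath) {
        ClassLoader loader = BindingLocator.class.getClassLoader();
        try {
            // getResources (plural) reports every match, not just the first one found.
            return Collections.list(loader.getResources(resourcePath));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default taken from the SLF4J warning; pass another resource path as args[0].
        String resource = args.length > 0
                ? args[0]
                : "org/slf4j/impl/StaticLoggerBinder.class";
        List<URL> hits = locate(resource);
        System.out.println(hits.size() + " location(s) provide " + resource);
        for (URL url : hits) {
            System.out.println("  " + url);
        }
    }
}
```

Run it with the same -classpath value as the TikaCLI command. Two or more
locations would be consistent with the "multiple SLF4J bindings" message.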
So, good news: I've got .grib2 parsing with Tika! Next, I'll take care of
the NetCDF-Java 4.3+ issue on the Tika side of things.
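
Since two versions of NetCDF-Java end up on the classpath, it can help to
check which jar a class such as ucar.nc2.NetcdfFile is actually loaded
from. The following is only a sketch using standard JDK calls; the class
name WhichJar and the method origin are made up for illustration.

```java
import java.net.URL;
import java.security.CodeSource;

/** Hypothetical helper: reports where a class was loaded from. */
final class WhichJar {

    /** Returns the jar or directory URL for className, or a short placeholder. */
    static String origin(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> type = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            URL location = (source == null) ? null : source.getLocation();
            return (location == null) ? "(JDK runtime)" : location.toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Default is the NetCDF-Java class used in this thread; pass another as args[0].
        String name = args.length > 0 ? args[0] : "ucar.nc2.NetcdfFile";
        System.out.println(name + " <- " + origin(name));
    }
}
```

With the classpath from the command above, the printed location should
show whether the netcdfAll-4.3.jar copy or the copy bundled in the Tika
app jar is the one in use.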
Many thanks for your help.
Annie
On Tue, Jul 29, 2014 at 11:53 AM, Christian Ward-Garrison <cwardgar@xxxxxxxx
> wrote:
> Hi Annie,
>
> It turns out that this is an issue with Tika. tika-app-1.6-SNAPSHOT.jar
> actually includes netcdf-4.2.20, so you have 2 different versions of the
> same library on the classpath, which is always bad. And unfortunately,
> using Tika's bundled netcdf-4.2.20 alone won't work, because support for
> reading GRIB files was only added to NetCDF-Java in 4.3+.
>
> This may be a problem that only the Tika developers know how to fix. I
> suggest opening a ticket with them.
>
> Good luck!
> Christian Ward-Garrison
>
>
> On Mon, Jul 28, 2014 at 3:19 PM, Annie Burgess <anniebryant@xxxxxxxxx>
> wrote:
>
>> Hi Christian,
>>
>> Your code works great - no problems when I compile and run. I've
>> modified my code to use NetcdfDataset rather than NetcdfFile to open the
>> .grib2 file. However, I'm still getting an error in my code:
>>
>> % javac -classpath
>> ../../../../tika-core/target/tika-core-1.6-SNAPSHOT.jar:../../../../netcdfAll-4.3.jar
>> org/apache/tika/parser/grib/GribParser.java
>>
>> % java -classpath
>> tika-app/target/tika-app-1.6-SNAPSHOT.jar:annie-parsers.jar:netcdfAll-4.3.jar
>> org.apache.tika.cli.TikaCLI --text gdas1.forecmwf.2014062612.grib2
>>
>>
>> -----------------------GribParser.java------------------------
>> import java.io.ByteArrayOutputStream;
>> import java.io.IOException;
>> import java.io.InputStream;
>> import java.util.Collections;
>> import java.util.Set;
>> import java.io.File;
>>
>> import org.apache.tika.exception.TikaException;
>> import org.apache.tika.metadata.Metadata;
>> import org.apache.tika.mime.MediaType;
>> import org.apache.tika.parser.AbstractParser;
>> import org.apache.tika.parser.ParseContext;
>> import org.apache.tika.parser.Parser;
>> import org.apache.tika.sax.XHTMLContentHandler;
>> import org.xml.sax.ContentHandler;
>> import org.xml.sax.SAXException;
>>
>> import ucar.nc2.NetcdfFile;
>> import ucar.nc2.dataset.NetcdfDataset;
>>
>>
>> public class GribParser extends AbstractParser {
>>
>> private final Set<MediaType> SUPPORTED_TYPES =
>> Collections.singleton(MediaType.application("x-grib2"));
>>
>> public Set<MediaType> getSupportedTypes(ParseContext context) {
>> return SUPPORTED_TYPES;
>> }
>>
>> public void parse(InputStream stream, ContentHandler handler,
>> Metadata metadata, ParseContext context) throws IOException,
>> SAXException, TikaException {
>>
>> System.err.println(" Check 1 ");
>>
>> File gribFile = new File("gdas1.forecmwf.2014062612.grib2");
>>
>> NetcdfFile ncFile =
>> NetcdfDataset.openFile(gribFile.getAbsolutePath(), null);
>>
>> System.err.println(" Check 2 ");
>>
>> }
>> }
>>
>> ---------------------OUTPUT---------------------
>>
>> Check 1
>> Exception in thread "main" org.apache.tika.exception.TikaException:
>> TIKA-198: Illegal IOException from
>> org.apache.tika.parser.grib.GribParser@483ad415
>> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:249)
>>
>> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:243)
>> at
>> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
>> at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:141)
>> at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:420)
>> at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:111)
>> Caused by: java.io.IOException: Cant read
>> /Users/IGSWAHWSWBURGESS/Development/tikadev/tika/gdas1.forecmwf.2014062612.grib2:
>> not a valid CDM file.
>> at ucar.nc2.NetcdfFile.open(NetcdfFile.java:734)
>> at ucar.nc2.NetcdfFile.open(NetcdfFile.java:384)
>> at
>> ucar.nc2.dataset.NetcdfDataset.openOrAcquireFile(NetcdfDataset.java:687)
>> at ucar.nc2.dataset.NetcdfDataset.openFile(NetcdfDataset.java:564)
>> at org.apache.tika.parser.grib.GribParser.parse(GribParser.java:82)
>>
>> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:243)
>> ... 5 more
>>
>> ---------------------------------------------
>>
>> The netcdfAll jar and the .grib2 file are in the same directory where I
>> am running GribParser. This could totally be a classpath issue... I'm
>> just stumped.
>>
>> Any other ideas?
>>
>> Thanks!
>> Annie
>>
>>
>> On Mon, Jul 28, 2014 at 12:40 PM, Christian Ward-Garrison <
>> cwardgar@xxxxxxxx> wrote:
>>
>>> Hi Annie,
>>>
>>> I see you're using netcdfAll-4.3.jar. That actually already contains the
>>> grib module, so it should be all you need. I created a minimal example:
>>>
>>> ------------------------------------- Foo.java
>>> -------------------------------------
>>>
>>> import java.io.IOException;
>>> import java.io.File;
>>> import ucar.nc2.NetcdfFile;
>>> import ucar.nc2.dataset.NetcdfDataset;
>>>
>>> public class Foo {
>>> public static void main(String[] args) throws IOException {
>>> File gribFile = new File("foo.grib2");
>>> NetcdfFile ncFile =
>>> NetcdfDataset.openFile(gribFile.getAbsolutePath(), null);
>>> try {
>>> System.out.println(ncFile.toString());
>>> } finally {
>>> ncFile.close();
>>> }
>>> }
>>> }
>>>
>>> ------------------------------------- Shell commands
>>> -------------------------------------
>>>
>>> javac -cp netcdfAll-4.3.jar Foo.java
>>>
>>> java -cp .;netcdfAll-4.3.jar Foo
>>> (on Linux or Mac OS X the separator is ':', i.e. java -cp .:netcdfAll-4.3.jar Foo)
>>>
>>>
>>> That should work as long as netcdfAll-4.3.jar and a file named
>>> "foo.grib2" are in the same directory as Foo.java. If you move things,
>>> you'll obviously need to modify the commands. Does this example work for
>>> you?
>>>
>>> Cheers,
>>> Christian
>>>
>>>
>>>
>>> On Mon, Jul 28, 2014 at 1:52 PM, Annie Burgess <anniebryant@xxxxxxxxx>
>>> wrote:
>>>
>>>> Hi Christian,
>>>>
>>>> Thanks for your response. I've cut down the code (pasted below) to a
>>>> sort of bare-bones version that is ONLY trying to open the .grib2 file as
>>>> if it were a .nc file.
>>>>
>>>> I built Apache Tika from:
>>>>
>>>> svn co http://svn.apache.org/repos/asf/tika/trunk tika
>>>> mvn install
>>>>
>>>> I pulled netcdfAll and toolsUI .jar files from:
>>>> http://www.unidata.ucar.edu/downloads/netcdf/netcdf-java-4/index.jsp
>>>>
>>>> I pulled the grib .jar from:
>>>> http://mvnrepository.com/artifact/edu.ucar/grib/8.0.29
>>>>
>>>> I compile the code as:
>>>> [asc-227-196:src/main/java] AB% javac -classpath
>>>> ../../../../tika-core/target/tika-core-1.6-SNAPSHOT.jar:../../../../toolsUI-4.3.jar:../../../../netcdfAll-4.3.jar:../../../../grib-8.0.29.jar
>>>> org/apache/tika/parser/grib/GribParser.java
>>>>
>>>> I run the code as:
>>>> [asc-227-196:~/Development/tikadev/tika] AB% java -classpath
>>>> tika-app/target/tika-app-1.6-SNAPSHOT.jar:annie-parsers.jar:netcdfAll-4.3.jar:grib-8.0.29.jar:toolsUI-4.3.jar
>>>> org.apache.tika.cli.TikaCLI --text gdas1.forecmwf.2014062612.grib2
>>>>
>>>> CODE:
>>>>
>>>> --------------------------------------------------
>>>> package org.apache.tika.parser.grib;
>>>>
>>>> import java.io.ByteArrayOutputStream;
>>>> import java.io.IOException;
>>>> import java.io.InputStream;
>>>> import java.util.Collections;
>>>> import java.util.Set;
>>>> import java.util.List;
>>>> import java.util.Iterator;
>>>>
>>>> // Tika and SAX imports
>>>> import org.apache.tika.exception.TikaException;
>>>> import org.apache.tika.io.IOUtils;
>>>> import org.apache.tika.metadata.Metadata;
>>>> import org.apache.tika.metadata.Property;
>>>> import org.apache.tika.metadata.TikaCoreProperties;
>>>> import org.apache.tika.mime.MediaType;
>>>> import org.apache.tika.parser.AbstractParser;
>>>> import org.apache.tika.parser.ParseContext;
>>>> import org.apache.tika.parser.Parser;
>>>> import org.apache.tika.sax.XHTMLContentHandler;
>>>> import org.xml.sax.ContentHandler;
>>>> import org.xml.sax.SAXException;
>>>>
>>>> import ucar.grib.grib2.*;
>>>> import ucar.nc2.*;
>>>>
>>>> /**
>>>> * A {@link Parser} for <a
>>>> * href="http://www.unidata.ucar.edu/software/netcdf/index.html">NetCDF</a>
>>>> * files using the UCAR, MIT-licensed <a
>>>> * href="http://www.unidata.ucar.edu/software/netcdf-java/">NetCDF for Java</a>
>>>> * API.
>>>> */
>>>> public class GribParser extends AbstractParser {
>>>>
>>>> private final Set<MediaType> SUPPORTED_TYPES =
>>>> Collections.singleton(MediaType.application("x-grib2"));
>>>> /*
>>>> * (non-Javadoc)
>>>> *
>>>> * @see
>>>> *
>>>> org.apache.tika.parser.Parser#getSupportedTypes(org.apache.tika.parser
>>>> * .ParseContext)
>>>> */
>>>> public Set<MediaType> getSupportedTypes(ParseContext context) {
>>>> return SUPPORTED_TYPES;
>>>> }
>>>> /*
>>>> * (non-Javadoc)
>>>> *
>>>> * @see org.apache.tika.parser.Parser#parse(java.io.InputStream,
>>>> * org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata,
>>>> * org.apache.tika.parser.ParseContext)
>>>> */
>>>> public void parse(InputStream stream, ContentHandler handler,
>>>> Metadata metadata, ParseContext context) throws IOException,
>>>> SAXException, TikaException {
>>>>
>>>> System.err.println(" Check 1 ");
>>>> String name = "/Users/IGSWAHWSWBURGESS/POLARCYBER/gdas1.forecmwf.2014062612.grib2";
>>>>
>>>> if (name == null) {
>>>> name = "";
>>>> }
>>>>
>>>> NetcdfFile ncFile = NetcdfFile.open(name, null);
>>>> System.err.println(" Check 2 ");
>>>> }
>>>> }
>>>>
>>>>
>>>> OUTPUT:
>>>>
>>>> Check 1
>>>>
>>>> Exception in thread "main" org.apache.tika.exception.TikaException:
>>>> Unexpected RuntimeException from
>>>> org.apache.tika.parser.grib.GribParser@261a53b9
>>>> at
>>>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:245)
>>>> at
>>>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:243)
>>>> at
>>>> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
>>>> at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:141)
>>>> at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:420)
>>>> at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:111)
>>>> Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodError:
>>>> ucar.grib.grib2.Grib2WriteIndex.writeGribIndex(Ljava/io/File;Ljava/lang/String;Lucar/unidata/io/RandomAccessFile;Z)Lucar/grid/GridIndex;
>>>> at ucar.nc2.NetcdfFile.<init>(NetcdfFile.java:1326)
>>>> at ucar.nc2.NetcdfFile.open(NetcdfFile.java:744)
>>>> at ucar.nc2.NetcdfFile.openInMemory(NetcdfFile.java:670)
>>>> at org.apache.tika.parser.grib.GribParser.parse(GribParser.java:93)
>>>> at
>>>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:243)
>>>> ... 5 more
>>>> Caused by: java.lang.NoSuchMethodError:
>>>> ucar.grib.grib2.Grib2WriteIndex.writeGribIndex(Ljava/io/File;Ljava/lang/String;Lucar/unidata/io/RandomAccessFile;Z)Lucar/grid/GridIndex;
>>>> at
>>>> ucar.nc2.iosp.grib.GribGridServiceProvider.writeIndex(GribGridServiceProvider.java:348)
>>>> at
>>>> ucar.nc2.iosp.grib.GribGridServiceProvider.getIndex(GribGridServiceProvider.java:292)
>>>> at
>>>> ucar.nc2.iosp.grib.GribGridServiceProvider.open(GribGridServiceProvider.java:118)
>>>> at ucar.nc2.NetcdfFile.<init>(NetcdfFile.java:1308)
>>>> ... 9 more
>>>>
>>>>
>>>>
>>>> Note, if I use a .nc file the code runs successfully.
>>>>
>>>> OUTPUT:
>>>>
>>>> Check 1
>>>> Check 2
>>>>
>>>>
>>>> I am sort of a java newbie, so please let me know if I've left out any
>>>> critical information!
>>>>
>>>> Thank you for any help/insight you can give.
>>>>
>>>> Annie
>>>>
>>>>
>>>> On Sun, Jul 27, 2014 at 10:44 PM, Christian Ward-Garrison <
>>>> cwardgar@xxxxxxxx> wrote:
>>>>
>>>>> Hi Annie,
>>>>>
>>>>> This is the result of the GRIB module not being on the classpath when
>>>>> you execute your Java program. Can you give me more details about your
>>>>> setup? Can you provide your build file (Maven, Ant, Gradle, etc.)?
>>>>>
>>>>> Cheers,
>>>>> Christian
>>>>>
>>>>>
>>>>> On Wed, Jul 23, 2014 at 5:06 PM, Annie Burgess <anniebryant@xxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> Greetings all,
>>>>>>
>>>>>> I am trying to create a script that will mimic the output of NCDump.
>>>>>> I have successfully done this for NetCDF files, and now I am trying to
>>>>>> apply it to grib2 files. I am using the NetCDF-Java library in
>>>>>> conjunction with Apache Tika to do this. Other posts have indicated
>>>>>> I should be able
>>>>>> to open my grib2 files, just as if they were .nc files. However, I
>>>>>> continue to get the following error:
>>>>>>
>>>>>> "Caused by: java.io.IOException: Cant read
>>>>>> gdas1.forecmwf.2014062612.grib2:
>>>>>> not a valid CDM file."
>>>>>>
>>>>>> To open the .nc files, this is the bit of code I use (with the
>>>>>> exception of changing the .nc file to a .grib2 file):
>>>>>>
>>>>>> String name = "gdas1.forecmwf.2014062612.grib2";
>>>>>>
>>>>>> if (name == null) {
>>>>>> name = "";
>>>>>> }
>>>>>>
>>>>>> try {
>>>>>> NetcdfFile ncFile = NetcdfFile.openInMemory(name, os.toByteArray());
>>>>>> // first parse out the set of global attributes
>>>>>> for (Attribute attr : ncFile.getGlobalAttributes()) {
>>>>>> Property property = resolveMetadataKey(attr.getName());
>>>>>> if (attr.getDataType().isString()) {
>>>>>> metadata.add(property, attr.getStringValue());
>>>>>> } else if (attr.getDataType().isNumeric()) {
>>>>>> // numeric attributes are stored as their integer value
>>>>>> int value = attr.getNumericValue().intValue();
>>>>>> metadata.add(property, String.valueOf(value));
>>>>>> }
>>>>>> }
>>>>>> } catch (IOException e) {
>>>>>> throw new TikaException("NetCDF parse error", e);
>>>>>> }
>>>>>>
>>>>>> Also, I am using the netcdfAll-4.3.jar at the command line. Does
>>>>>> anyone have any insight as to why I'd be getting the 'not a valid
>>>>>> CDM' error? I have checked the file using the NetCDF (4.3) GUI and
>>>>>> the file looks good.
>>>>>>
>>>>>> Thank you for any insight you can give.
>>>>>>
>>>>>> Annie
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> netcdf-java mailing list
>>>>>> netcdf-java@xxxxxxxxxxxxxxxx
>>>>>> For list information or to unsubscribe, visit:
>>>>>> http://www.unidata.ucar.edu/mailing_lists/
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>
--
------------------------------------------------------------------------------------------
Ann Bryant Burgess, PhD
Postdoctoral Fellow
Computer Science Department
University of Southern California
Viterbi School of Engineering
Los Angeles, CA
Alaska Science Center/USGS
Anchorage, AK
Cell: (585) 738-7549
Office: (907) 786-7059
Fax: (907) 786-7150
E-mail: anniebryant.burgess@xxxxxxxxx
Office Address: 4210 University Dr., Anchorage, AK 99508-4626
-------------------------------------------------------------------------------------------