Hi Annie,
It turns out that this is an issue with Tika. tika-app-1.6-SNAPSHOT.jar
actually bundles netcdf-4.2.20, so you end up with two different versions
of the same library on the classpath. Because tika-app comes first in your
-classpath, the JVM will typically resolve the NetCDF classes from the
older bundled copy, and netcdfAll-4.3.jar is effectively shadowed. And
unfortunately, using Tika's bundled netcdf-4.2.20 alone won't work, because
support for reading GRIB files was only added to NetCDF-Java in 4.3+.
This may be a problem that only the Tika developers know how to fix. I
suggest opening a ticket with them.
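In the meantime, one way to confirm the clash is to ask the JVM where it
actually finds a class. The sketch below is only an illustration: the class
name ClasspathProbe and its two helper methods are made up for this example,
and ucar.nc2.NetcdfFile is used as the default probe target simply because
it appears in your stack traces. It prints the location the class was loaded
from, then every classpath entry that contains a copy of it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.security.CodeSource;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Tika or NetCDF-Java.
public class ClasspathProbe {

    /** Every classpath location that contains a .class file for the given class name. */
    public static List<URL> copiesOf(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = ClasspathProbe.class.getClassLoader();
        try {
            return Collections.list(loader.getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The location the class is actually loaded from, as text. */
    public static String loadedFrom(String className) {
        try {
            // 'false' avoids running the class's static initializers.
            Class<?> c = Class.forName(className, false,
                    ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(JDK / bootstrap)" : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Default target is an assumption taken from the stack traces in this thread.
        String name = args.length > 0 ? args[0] : "ucar.nc2.NetcdfFile";
        System.out.println("loaded from: " + loadedFrom(name));
        for (URL url : copiesOf(name)) {
            System.out.println("copy: " + url);
        }
    }
}
```

Run it with the same -classpath you pass to TikaCLI (plus the directory
holding ClasspathProbe.class). If more than one "copy:" line shows up, the
entry that appears first on the classpath is normally the one that wins.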
Good luck!
Christian Ward-Garrison
On Mon, Jul 28, 2014 at 3:19 PM, Annie Burgess <anniebryant@xxxxxxxxx>
wrote:
> Hi Christian,
>
> Your code works great - no problems when I compile and run. I've modified
> my code to use NetcdfDataset rather than NetcdfFile to open the .grib2
> file. However, I'm still getting an error in my code:
>
> % javac -classpath
> ../../../../tika-core/target/tika-core-1.6-SNAPSHOT.jar:../../../../netcdfAll-4.3.jar
> org/apache/tika/parser/grib/GribParser.java
>
> % java -classpath
> tika-app/target/tika-app-1.6-SNAPSHOT.jar:annie-parsers.jar:netcdfAll-4.3.jar
> org.apache.tika.cli.TikaCLI --text gdas1.forecmwf.2014062612.grib2
>
>
> -----------------------GribParser.java------------------------
> import java.io.ByteArrayOutputStream;
> import java.io.IOException;
> import java.io.InputStream;
> import java.util.Collections;
> import java.util.Set;
> import java.io.File;
>
> import org.apache.tika.exception.TikaException;
> import org.apache.tika.metadata.Metadata;
> import org.apache.tika.mime.MediaType;
> import org.apache.tika.parser.AbstractParser;
> import org.apache.tika.parser.ParseContext;
> import org.apache.tika.parser.Parser;
> import org.apache.tika.sax.XHTMLContentHandler;
> import org.xml.sax.ContentHandler;
> import org.xml.sax.SAXException;
>
> import ucar.nc2.NetcdfFile;
> import ucar.nc2.dataset.NetcdfDataset;
>
>
> public class GribParser extends AbstractParser {
>
> private final Set<MediaType> SUPPORTED_TYPES =
> Collections.singleton(MediaType.application("x-grib2"));
>
> public Set<MediaType> getSupportedTypes(ParseContext context) {
> return SUPPORTED_TYPES;
> }
>
> public void parse(InputStream stream, ContentHandler handler,
> Metadata metadata, ParseContext context) throws IOException,
> SAXException, TikaException {
>
> System.err.println(" Check 1 ");
>
> File gribFile = new File("gdas1.forecmwf.2014062612.grib2");
>
> NetcdfFile ncFile = NetcdfDataset.openFile(gribFile.getAbsolutePath(),
> null);
>
> System.err.println(" Check 2 ");
>
> }
> }
>
> ---------------------OUTPUT---------------------
>
> Check 1
> Exception in thread "main" org.apache.tika.exception.TikaException:
> TIKA-198: Illegal IOException from
> org.apache.tika.parser.grib.GribParser@483ad415
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:249)
>
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:243)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
> at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:141)
> at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:420)
> at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:111)
> Caused by: java.io.IOException: Cant read
> /Users/IGSWAHWSWBURGESS/Development/tikadev/tika/gdas1.forecmwf.2014062612.grib2:
> not a valid CDM file.
> at ucar.nc2.NetcdfFile.open(NetcdfFile.java:734)
> at ucar.nc2.NetcdfFile.open(NetcdfFile.java:384)
> at ucar.nc2.dataset.NetcdfDataset.openOrAcquireFile(NetcdfDataset.java:687)
> at ucar.nc2.dataset.NetcdfDataset.openFile(NetcdfDataset.java:564)
> at org.apache.tika.parser.grib.GribParser.parse(GribParser.java:82)
>
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:243)
> ... 5 more
>
> ---------------------------------------------
>
> netcdfAll and .grib2 files are in the same directory where I am running
> GribParser.java. This could totally be a classpath issue... I'm just
> stumped.
>
> Any other ideas?
>
> Thanks!
> Annie
>
>
> On Mon, Jul 28, 2014 at 12:40 PM, Christian Ward-Garrison <
> cwardgar@xxxxxxxx> wrote:
>
>> Hi Annie,
>>
>> I see you're using netcdfAll-4.3.jar. That actually already contains the
>> grib module, so it should be all you need. I created a minimal example:
>>
>> ------------------------------------- Foo.java -------------------------------------
>>
>> import java.io.IOException;
>> import java.io.File;
>> import ucar.nc2.NetcdfFile;
>> import ucar.nc2.dataset.NetcdfDataset;
>>
>> public class Foo {
>> public static void main(String[] args) throws IOException {
>> File gribFile = new File("foo.grib2");
>> NetcdfFile ncFile =
>> NetcdfDataset.openFile(gribFile.getAbsolutePath(), null);
>> try {
>> System.out.println(ncFile.toString());
>> } finally {
>> ncFile.close();
>> }
>> }
>> }
>>
>> ------------------------------------- Shell commands -------------------------------------
>>
>> javac -cp netcdfAll-4.3.jar Foo.java
>>
>> java -cp .;netcdfAll-4.3.jar Foo
>> (on Linux or macOS, use ':' instead of ';' as the classpath separator)
>>
>>
>> That should work as long as netcdfAll-4.3.jar and a file named
>> "foo.grib2" are in the same directory as Foo.java. If you move things,
>> you'll obviously need to modify the commands. Does this example work for
>> you?
>>
>> Cheers,
>> Christian
>>
>>
>>
>> On Mon, Jul 28, 2014 at 1:52 PM, Annie Burgess <anniebryant@xxxxxxxxx>
>> wrote:
>>
>>> Hi Christian,
>>>
>>> Thanks for your response. I've cut down the code (pasted below) to a
>>> sort of bare-bones version that is ONLY trying to open the .grib2 file as
>>> if it were a .nc file.
>>>
>>> I built Apache Tika from:
>>>
>>> svn co http://svn.apache.org/repos/asf/tika/trunk tika
>>> mvn install
>>>
>>> I pulled netcdfAll and toolsUI .jar files from:
>>> http://www.unidata.ucar.edu/downloads/netcdf/netcdf-java-4/index.jsp
>>>
>>> I pulled the grib .jar from:
>>> http://mvnrepository.com/artifact/edu.ucar/grib/8.0.29
>>>
>>> I compile the code as:
>>> [asc-227-196:src/main/java] AB% javac -classpath
>>> ../../../../tika-core/target/tika-core-1.6-SNAPSHOT.jar:../../../../toolsUI-4.3.jar:../../../../netcdfAll-4.3.jar:../../../../grib-8.0.29.jar
>>> org/apache/tika/parser/grib/GribParser.java
>>>
>>> I run the code as:
>>> [asc-227-196:~/Development/tikadev/tika] AB% java -classpath
>>> tika-app/target/tika-app-1.6-SNAPSHOT.jar:annie-parsers.jar:netcdfAll-4.3.jar:grib-8.0.29.jar:toolsUI-4.3.jar
>>> org.apache.tika.cli.TikaCLI --text gdas1.forecmwf.2014062612.grib2
>>>
>>> CODE:
>>>
>>> --------------------------------------------------
>>> package org.apache.tika.parser.grib;
>>>
>>> import java.io.ByteArrayOutputStream;
>>> import java.io.IOException;
>>> import java.io.InputStream;
>>> import java.util.Collections;
>>> import java.util.Set;
>>> import java.util.List;
>>> import java.util.Iterator;
>>>
>>> //JDK imports
>>> import org.apache.tika.exception.TikaException;
>>> import org.apache.tika.io.IOUtils;
>>> import org.apache.tika.metadata.Metadata;
>>> import org.apache.tika.metadata.Property;
>>> import org.apache.tika.metadata.TikaCoreProperties;
>>> import org.apache.tika.mime.MediaType;
>>> import org.apache.tika.parser.AbstractParser;
>>> import org.apache.tika.parser.ParseContext;
>>> import org.apache.tika.parser.Parser;
>>> import org.apache.tika.sax.XHTMLContentHandler;
>>> import org.xml.sax.ContentHandler;
>>> import org.xml.sax.SAXException;
>>>
>>> import ucar.grib.grib2.*;
>>> import ucar.nc2.*;
>>>
>>> /**
>>> * A {@link Parser} for <a
>>> * href="http://www.unidata.ucar.edu/software/netcdf/index.html">NetCDF</a>
>>> * files using the UCAR, MIT-licensed <a
>>> * href="http://www.unidata.ucar.edu/software/netcdf-java/">NetCDF for Java</a>
>>> * API.
>>> */
>>> public class GribParser extends AbstractParser {
>>>
>>> private final Set<MediaType> SUPPORTED_TYPES =
>>> Collections.singleton(MediaType.application("x-grib2"));
>>> /*
>>> * (non-Javadoc)
>>> *
>>> * @see org.apache.tika.parser.Parser#getSupportedTypes(
>>> * org.apache.tika.parser.ParseContext)
>>> */
>>> public Set<MediaType> getSupportedTypes(ParseContext context) {
>>> return SUPPORTED_TYPES;
>>> }
>>> /*
>>> * (non-Javadoc)
>>> *
>>> * @see org.apache.tika.parser.Parser#parse(java.io.InputStream,
>>> * org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata,
>>> * org.apache.tika.parser.ParseContext)
>>> */
>>> public void parse(InputStream stream, ContentHandler handler,
>>> Metadata metadata, ParseContext context) throws IOException,
>>> SAXException, TikaException {
>>>
>>> System.err.println(" Check 1 ");
>>> String name = "/Users/IGSWAHWSWBURGESS/POLARCYBER/gdas1.forecmwf.2014062612.grib2";
>>>
>>> if (name == null) {
>>> name = "";
>>> }
>>>
>>> NetcdfFile ncFile = NetcdfFile.open(name, null);
>>> System.err.println(" Check 2 ");
>>> }
>>> }
>>>
>>>
>>> OUTPUT:
>>>
>>> Check 1
>>>
>>> Exception in thread "main" org.apache.tika.exception.TikaException:
>>> Unexpected RuntimeException from
>>> org.apache.tika.parser.grib.GribParser@261a53b9
>>> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:245)
>>> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:243)
>>> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
>>> at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:141)
>>> at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:420)
>>> at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:111)
>>> Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodError:
>>> ucar.grib.grib2.Grib2WriteIndex.writeGribIndex(Ljava/io/File;Ljava/lang/String;Lucar/unidata/io/RandomAccessFile;Z)Lucar/grid/GridIndex;
>>> at ucar.nc2.NetcdfFile.<init>(NetcdfFile.java:1326)
>>> at ucar.nc2.NetcdfFile.open(NetcdfFile.java:744)
>>> at ucar.nc2.NetcdfFile.openInMemory(NetcdfFile.java:670)
>>> at org.apache.tika.parser.grib.GribParser.parse(GribParser.java:93)
>>> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:243)
>>> ... 5 more
>>> Caused by: java.lang.NoSuchMethodError:
>>> ucar.grib.grib2.Grib2WriteIndex.writeGribIndex(Ljava/io/File;Ljava/lang/String;Lucar/unidata/io/RandomAccessFile;Z)Lucar/grid/GridIndex;
>>> at ucar.nc2.iosp.grib.GribGridServiceProvider.writeIndex(GribGridServiceProvider.java:348)
>>> at ucar.nc2.iosp.grib.GribGridServiceProvider.getIndex(GribGridServiceProvider.java:292)
>>> at ucar.nc2.iosp.grib.GribGridServiceProvider.open(GribGridServiceProvider.java:118)
>>> at ucar.nc2.NetcdfFile.<init>(NetcdfFile.java:1308)
>>> ... 9 more
>>>
>>>
>>>
>>> Note, if I use a .nc file the code runs successfully.
>>>
>>> OUTPUT:
>>>
>>> Check 1
>>> Check 2
>>>
>>>
>>> I am sort of a java newbie, so please let me know if I've left out any
>>> critical information!
>>>
>>> Thank you for any help/insight you can give.
>>>
>>> Annie
>>>
>>>
>>> On Sun, Jul 27, 2014 at 10:44 PM, Christian Ward-Garrison <
>>> cwardgar@xxxxxxxx> wrote:
>>>
>>>> Hi Annie,
>>>>
>>>> This is the result of the GRIB module not being on the classpath when
>>>> you execute your Java program. Can you give me more details about your
>>>> setup? Can you provide your build file (Maven, Ant, Gradle, etc.)?
>>>>
>>>> Cheers,
>>>> Christian
>>>>
>>>>
>>>> On Wed, Jul 23, 2014 at 5:06 PM, Annie Burgess <anniebryant@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> Greetings all,
>>>>>
>>>>> I am trying to create a script that will mimic the output of NCDump.
>>>>> I have successfully done this for NetCDF files, and now I am trying to
>>>>> apply it to grib2 files. I am using the NetCDF-Java library in
>>>>> conjunction with Apache Tika to do this. Other posts have indicated I
>>>>> should be able to open my grib2 files, just as if they were .nc files.
>>>>> However, I continue to get the following error:
>>>>>
>>>>> "Caused by: java.io.IOException: Cant read
>>>>> gdas1.forecmwf.2014062612.grib2:
>>>>> not a valid CDM file."
>>>>>
>>>>> To open the .nc files, this is the bit of code I use (with the
>>>>> exception of changing the .nc file to a .grib2 file):
>>>>>
>>>>> String name = "gdas1.forecmwf.2014062612.grib2";
>>>>>
>>>>> if (name == null) {
>>>>> name = "";
>>>>> }
>>>>>
>>>>> try {
>>>>> NetcdfFile ncFile = NetcdfFile.openInMemory(name, os.toByteArray());
>>>>> // first parse out the set of global attributes
>>>>> for (Attribute attr : ncFile.getGlobalAttributes()) {
>>>>> Property property = resolveMetadataKey(attr.getName());
>>>>> if (attr.getDataType().isString()) {
>>>>> metadata.add(property, attr.getStringValue());
>>>>> } else if (attr.getDataType().isNumeric()) {
>>>>> int value = attr.getNumericValue().intValue();
>>>>> metadata.add(property, String.valueOf(value));
>>>>> }
>>>>> }
>>>>> } // catch block not shown in this excerpt
>>>>>
>>>>> Also, I am using the netcdfAll-4.3.jar at the command line. Does
>>>>> anyone have any insight as to *why* I'd be getting the 'not a valid
>>>>> CDM' error? I have checked the file using the NetCDF (4.3) GUI and the
>>>>> file looks good.
>>>>>
>>>>> Thank you for any insight you can give.
>>>>>
>>>>> Annie
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> netcdf-java mailing list
>>>>> netcdf-java@xxxxxxxxxxxxxxxx
>>>>> For list information or to unsubscribe, visit:
>>>>> http://www.unidata.ucar.edu/mailing_lists/
>>>>>