Re: [thredds] Catalog example for AWS S3 resource?

  • To: "H. Joe Lee" <hyoklee@xxxxxxxxxxxx>
  • Subject: Re: [thredds] Catalog example for AWS S3 resource?
  • From: Sean Arms <sarms@xxxxxxxx>
  • Date: Tue, 10 Mar 2020 08:31:55 -0600
Greetings Joe,

Thank you for the report! I just merged a fix for the
NegativeArraySizeException issue, and have enabled ToolsUI's NcML tab
to work with S3 objects (you just need to make sure the "modes ->
NetcdfFile -> use builders" menu option is checked). Keep in mind this
is a first pass, and not optimized at all at this point.

Cheers,

Sean


On Mon, Mar 9, 2020 at 3:08 PM H. Joe Lee <hyoklee@xxxxxxxxxxxx> wrote:
>
>  Thanks, Sean!
>
>   Both the .war file and the sample catalog.xml worked like a charm.
>   For example, I could visualize MOP03T v7 on S3 using Panoply via THREDDS OPeNDAP.
>   The Unidata Java team is amazing!
>
>   So far, I found two issues though:
>
>   1) The toolsUI NcML tab doesn't work with s3:// URLs.
>   2) It can't open a huge (15G~40G) netCDF-4 file like TerraFusion [1].
>
>  Here's the error message that I got when I opened TerraFusion:
>
> Error {
>     code = 500;
>     message = "com.google.common.util.concurrent.UncheckedExecutionException: java.lang.NegativeArraySizeException";
> };
>
>   Sincerely,
>
>
> [1] https://registry.opendata.aws/terrafusion/
> --
> Datafy everything in HDF for faster AI.
>
>
>
>
> On Mon, Mar 9, 2020 at 2:43 PM Sean Arms <sarms@xxxxxxxx> wrote:
>>
>> Greetings Joe,
>>
>> I recently split the netCDF-Java and TDS codebases into their own
>> repositories, and the repository holding the appropriate TDS code is
>> located at:
>>
>> https://github.com/Unidata/tds
>>
>> If you build the current master branch, you'll have everything you
>> need at this point. The most recent snapshot should work as well:
>>
>> https://artifacts.unidata.ucar.edu/repository/unidata-snapshots/edu/ucar/tds/5.0.0-SNAPSHOT/tds-5.0.0-20200308.175757-566.war
>>
>> (just be sure to rename it to thredds.war before deploying it).
>>
>> The sample catalog I added to our integration tests for the TDS can be
>> found here:
>>
>> https://github.com/Unidata/tds/blob/master/tds/src/test/content/thredds/tds-s3.xml
>>
>> Cheers,
>>
>> Sean
>>
>>
>> On Mon, Mar 9, 2020 at 8:39 AM H. Joe Lee <hyoklee@xxxxxxxxxxxx> wrote:
>> >
>> >   Thanks, Ethan!
>> >
>> >   It's so cool to see toolsUI can access NASA HDF-EOS5 on S3.
>> > I hope both IDV and Panoply can use the new netCDF-Java soon, too.
>> >
>> >   By the way, will the master branch of THREDDS use the latest netCDF-Java?
>> > If not, what should I modify in the THREDDS source code to build
>> > THREDDS with the netCDF-Java snapshot?
>> >
>> >   I'm very excited to try the new THREDDS catalog with an S3 datasetRoot path!
>> >
>> > Sincerely,
>> >
>> > --
>> > Datafy everything in HDF for faster AI.
>> >
>> >
>> >
>> >
>> > On Wed, Mar 4, 2020 at 10:52 AM Ethan Davis <edavis@xxxxxxxx> wrote:
>> >>
>> >> Hi Joe,
>> >>
>> >> [Sorry for the delayed response.]
>> >>
>> >> The S3 work moved to the Unidata/netCDF-java repo in PR #173 ("S3 
>> >> Support"). This PR got merged into master a week or so ago and is 
>> >> available in the netCDF-Java 5.3.0-SNAPSHOT release (and will be in the 
>> >> upcoming 5.3.0 release). The latest TDS code built with netCDF-Java 
>> >> 5.3.0-SNAPSHOT can be configured to serve an individual netCDF file 
>> >> stored as an S3 object using a datasetRoot configuration, e.g.
>> >>
>> >>
>> >> <?xml version="1.0" encoding="UTF-8"?>
>> >> <catalog name="Test TDS S3"
>> >>   xmlns="https://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
>> >>   xmlns:xlink="https://www.w3.org/1999/xlink"
>> >>   xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance"
>> >>   xsi:schemaLocation="https://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0
>> >>     https://www.unidata.ucar.edu/schemas/thredds/InvCatalog.1.0.6.xsd">
>> >>
>> >>   <datasetRoot path="s3-test" location="s3://noaa-goes16" />
>> >>
>> >>   <dataset name="Test GOES-16 S3" ID="testS3Grid"
>> >>            urlPath="s3-test/ABI-L1b-RadC/2019/363/21/OR_ABI-L1b-RadC-M6C16_G16_s20193632101189_e20193632103574_c20193632104070.nc"
>> >>            dataType="Grid"/>
>> >>
>> >> </catalog>
>> >>
>> >>
>> >> In this case, the datasetRoot location is the bucket name, and the 
>> >> urlPath is the datasetRoot path combined with the key. We rely on the AWS 
>> >> Java SDK (v2) to handle credentials, setting of region, etc. For now, you 
>> >> can set the region by creating a credentials file ~/.aws/credentials that 
>> >> looks like:
>> >>
>> >>
>> >> [default]
>> >> region=us-east-1
>> >>
>> >>
>> >> This is how netCDF-Java knows which region to use for bucket access. We 
>> >> may look at other mechanisms to make that a bit more integrated into TDS 
>> >> configuration, but for now that should work.
>> >>
>> >>
>> >> Once the netCDF-Java 5.3.0 release comes out, TDS snapshot builds will be 
>> >> built with this capability. For now, you would need to build the TDS and 
>> >> explicitly tell it to build with netCDF-Java 5.3.0-SNAPSHOT.
>> >>
>> >> Cheers,
>> >>
>> >> Ethan
>> >>
>> >> On Tue, Feb 4, 2020 at 2:30 PM H. Joe Lee <hyoklee@xxxxxxxxxxxx> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>>   Is it possible to serve netCDF data on AWS S3 using THREDDS?
>> >>>   It seems possible based on the S3 feature branch [1].
>> >>>
>> >>>   If so, can someone share an example THREDDS catalog configuration?
>> >>>
>> >>>   Regards,
>> >>>
>> >>> [1] https://github.com/Unidata/thredds/tree/feature/s3+hdfs
>> >>>
>> >>>
>> >>> _______________________________________________
>> >>> NOTE: All exchanges posted to Unidata maintained email lists are
>> >>> recorded in the Unidata inquiry tracking system and made publicly
>> >>> available through the web.  Users who post to any of the lists we
>> >>> maintain are reminded to remove any personal information that they
>> >>> do not want to be made public.
>> >>>
>> >>>
>> >>> thredds mailing list
>> >>> thredds@xxxxxxxxxxxxxxxx
>> >>> For list information or to unsubscribe,  visit: 
>> >>> https://www.unidata.ucar.edu/mailing_lists/
>> >

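For reference, the datasetRoot mapping described in the quoted message from Ethan (the location is the bucket, and the urlPath is the datasetRoot path combined with the key) can be sketched in a few lines of Python. This is only an illustration of that rule, not the code the TDS runs; the function name resolve_s3_object and its parameters are hypothetical.

```python
from urllib.parse import urlparse


def resolve_s3_object(url_path, root_path, root_location):
    """Return (bucket, key) for a catalog urlPath under an S3 datasetRoot.

    Hypothetical helper: it mirrors the rule stated in the thread, not
    the TDS implementation.
    """
    prefix = root_path.rstrip("/") + "/"
    if not url_path.startswith(prefix):
        raise ValueError("urlPath does not start with the datasetRoot path")

    parsed = urlparse(root_location)
    if parsed.scheme != "s3":
        raise ValueError("datasetRoot location is not an s3:// URI")

    # Everything after the datasetRoot path is treated as the object key.
    key = url_path[len(prefix):]

    # Assumption: if the location carries a key prefix after the bucket
    # name, it is prepended to the key.
    base = parsed.path.strip("/")
    if base:
        key = base + "/" + key
    return parsed.netloc, key


bucket, key = resolve_s3_object(
    "s3-test/ABI-L1b-RadC/2019/363/21/"
    "OR_ABI-L1b-RadC-M6C16_G16_s20193632101189_e20193632103574_c20193632104070.nc",
    "s3-test",
    "s3://noaa-goes16",
)
print(bucket)
print(key)
```

With the sample catalog values, the bucket is noaa-goes16 and the key is the part of the urlPath that follows "s3-test/".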

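The credentials file shown in the quoted message is a plain INI-style profile file. The sketch below reads the region from text in that format using Python's standard configparser. It only illustrates the file layout; the AWS Java SDK does its own parsing, and the function name read_region is hypothetical.

```python
import configparser


def read_region(profile_text, profile="default"):
    """Return the region set for a profile, or None if it is not set.

    Hypothetical helper for illustration; the AWS SDK resolves the
    region itself and does not use this code.
    """
    parser = configparser.ConfigParser()
    parser.read_string(profile_text)
    if not parser.has_section(profile):
        return None
    return parser.get(profile, "region", fallback=None)


# Same content as the ~/.aws/credentials example in the thread.
sample = "[default]\nregion=us-east-1\n"
print(read_region(sample))
```

To check a real file, pass its contents, for example the text read from ~/.aws/credentials.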