Re: [thredds] HDF5 reading strategy in Thredds WMS

Dear Jon,

Thanks for the feedback.

The scanline algorithm does give me better performance, as it results in a small 
memory footprint. Sorry, I'm not able to provide you with a sample file; it's too 
big to send (>4GB).

I agree that "the algorithm to decide the DataReadingStrategy needs to take into 
account the size of the data grid".
In my case, all my local netCDF4/HDF5 files are bigger than 4GB and my server 
has 8GB of memory, so the scanline reading strategy is more appropriate.
However, if my server had greater capacity, I wouldn't care too much about the 
reading strategy.
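
To illustrate what a size-aware decision might look like, here is a rough sketch. It is not ncWMS code: the class name, the choose() method, the heapFraction parameter and the enum are stand-ins I made up for this example, and it only looks at grid size versus available heap (it ignores whether the file is remote or compressed). It assumes the worst case for BOUNDING_BOX is reading the whole grid as 4-byte floats.

```java
public class SizeAwareStrategySketch {

    /** Stand-in for the ncWMS enum; only the two values named in this thread. */
    public enum DataReadingStrategy { SCANLINE, BOUNDING_BOX }

    /**
     * Hypothetical heuristic: allow BOUNDING_BOX only if a full-grid float
     * array would fit within a chosen fraction of the maximum heap.
     */
    public static DataReadingStrategy choose(long gridWidth, long gridHeight,
                                             long maxHeapBytes, double heapFraction) {
        // Worst case: the requested bounding box covers the whole grid.
        long worstCaseBytes = Math.multiplyExact(
                Math.multiplyExact(gridWidth, gridHeight), (long) Float.BYTES);
        long budgetBytes = (long) (maxHeapBytes * heapFraction);
        return worstCaseBytes <= budgetBytes
                ? DataReadingStrategy.BOUNDING_BOX
                : DataReadingStrategy.SCANLINE;
    }

    public static void main(String[] args) {
        // Use the heap actually available to this JVM; 0.25 is an arbitrary fraction.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println(choose(50_000, 50_000, maxHeap, 0.25));
    }
}
```

With an 8GB heap and a 50000 * 50000 grid this would pick SCANLINE, while a small grid would still get BOUNDING_BOX.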

Therefore, a configurable option in wmsConfig.xml would be a better solution. 
It would let users decide which algorithm to use depending on their service 
capacity.
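
As a sketch of how such an option could be honoured in code (an assumption on my part, not an existing ncWMS or THREDDS feature): a configured string overrides the automatic choice, and an empty value or "auto" falls back to it. The class name, the resolve() method and the accepted values are all hypothetical, and no wmsConfig.xml element of this kind exists as far as I know.

```java
import java.util.Locale;

public class ConfigurableStrategySketch {

    /** Stand-in for the ncWMS enum; only the two values named in this thread. */
    public enum DataReadingStrategy { SCANLINE, BOUNDING_BOX }

    /**
     * Hypothetical resolution rule: null, blank or "auto" keeps the
     * automatically detected strategy; otherwise the configured name wins.
     * Unknown names raise IllegalArgumentException (from Enum.valueOf).
     */
    public static DataReadingStrategy resolve(String configured,
                                              DataReadingStrategy autoDetected) {
        if (configured == null || configured.trim().isEmpty()) {
            return autoDetected;
        }
        String value = configured.trim().toUpperCase(Locale.ROOT);
        if ("AUTO".equals(value)) {
            return autoDetected;
        }
        return DataReadingStrategy.valueOf(value);
    }

    public static void main(String[] args) {
        DataReadingStrategy auto = DataReadingStrategy.BOUNDING_BOX;
        System.out.println(resolve("scanline", auto)); // configured value wins
        System.out.println(resolve("auto", auto));     // falls back to detection
    }
}
```

Failing on an unknown value, rather than silently falling back, seems safer for a server setting, but that is a design choice.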


Best regards,
Lin

-----Original Message-----
From: thredds-bounces@xxxxxxxxxxxxxxxx 
[mailto:thredds-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Jon Blower
Sent: Monday, 13 June 2011 8:48 PM
To: thredds@xxxxxxxxxxxxxxxx
Subject: Re: [thredds] HDF5 reading strategy in Thredds WMS

Dear Lin,

Very sorry for the slow reply to this.  If the scanline algorithm gives you 
acceptable performance then there is no reason not to use it for your dataset.  
Would you be able to send me a sample file so that I can do some tests?

Perhaps the algorithm to decide the DataReadingStrategy needs to take into 
account the size of the data grid.  Or perhaps we should make this configurable.

Best wishes,
Jon

----------------------------------------------------------------------

Message: 1
Date: Mon, 23 May 2011 15:42:40 +1000
From: <Xiangtan.Lin@xxxxxxxx>
To: <thredds@xxxxxxxxxxxxxxxx>
Subject: [thredds] HDF5 reading strategy in Thredds WMS
Message-ID:
        <E8172DAF55455A40AD034D781B39F62285FE0B70F5@xxxxxxxxxxxxxxxxxxxxxxxxxx>
        
Content-Type: text/plain; charset="us-ascii"

Hi all,

The BOUNDING_BOX reading strategy is used for file formats other than netCDF 
and HDF4 in Thredds WMS.

uk.ac.rdg.resc.ncwms.cdm.CdmUtils

public static DataReadingStrategy getOptimumDataReadingStrategy(NetcdfDataset nc) {
    String fileType = nc.getFileTypeId();
    return "netCDF".equals(fileType) || "HDF4".equals(fileType)
            ? DataReadingStrategy.SCANLINE
            : DataReadingStrategy.BOUNDING_BOX;
}

I'm working with an 8GB netCDF4/HDF5 file which has a 50000 * 50000 grid. With 
the BOUNDING_BOX reading strategy, it theoretically requires 9.3GB of memory (for 
creating a float array of 50000 * 50000 entries at 4 bytes each), which leads to 
an out-of-memory exception on my test machine.
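
The 9.3GB figure can be checked with a few lines of standalone Java. This is only the arithmetic, not ncWMS code; the class and method names are made up for the example.

```java
import java.util.Locale;

public class GridMemoryEstimate {

    /** Bytes needed to hold width * height values as 4-byte floats. */
    public static long floatGridBytes(long width, long height) {
        return Math.multiplyExact(
                Math.multiplyExact(width, height), (long) Float.BYTES);
    }

    public static void main(String[] args) {
        long entries = 50_000L * 50_000L;
        long bytes = floatGridBytes(50_000, 50_000);
        double gib = bytes / (1024.0 * 1024.0 * 1024.0);
        System.out.println(String.format(Locale.ROOT,
                "%d entries, %d bytes, %.1f GiB", entries, bytes, gib));
        // A single Java array is indexed by int, so this many entries
        // could not be held in one float[] regardless of heap size.
        System.out.println("Exceeds max array length: "
                + (entries > Integer.MAX_VALUE));
    }
}
```

That is 10,000,000,000 bytes, or about 9.3 when divided by 1024^3.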

I've tried using the SCANLINE reading strategy by adding another OR condition 
(|| "HDF5".equals(fileType)) to the above code, and it appears the out-of-memory 
issue is gone.
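
For clarity, the change amounts to the following. This is a standalone sketch so that it compiles without the ncWMS and netCDF-Java jars: the enum is a stand-in for the real DataReadingStrategy, and the method takes the file type ID as a String rather than a NetcdfDataset. The class name is made up.

```java
public class ReadingStrategySketch {

    /** Stand-in for the ncWMS enum; only the two values named in this thread. */
    public enum DataReadingStrategy { SCANLINE, BOUNDING_BOX }

    /**
     * Same decision as the quoted method, plus the extra "HDF5" condition.
     * A null or unrecognised file type falls through to BOUNDING_BOX.
     */
    public static DataReadingStrategy getOptimumDataReadingStrategy(String fileType) {
        return "netCDF".equals(fileType)
                || "HDF4".equals(fileType)
                || "HDF5".equals(fileType)
                ? DataReadingStrategy.SCANLINE
                : DataReadingStrategy.BOUNDING_BOX;
    }

    public static void main(String[] args) {
        // "other" is an arbitrary placeholder for any other file type ID.
        for (String type : new String[] {"netCDF", "HDF4", "HDF5", "other"}) {
            System.out.println(type + " -> " + getOptimumDataReadingStrategy(type));
        }
    }
}
```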

According to the documentation, the SCANLINE strategy is recommended for local 
uncompressed files, while the BOUNDING_BOX is recommended for remote and 
compressed files.

Can someone please advise whether it is OK to use SCANLINE strategy for my 
local uncompressed HDF5 files by modifying the above code?
Is there anything else I should be aware of?

Regards and thanks,
Lin


_______________________________________________
thredds mailing list
thredds@xxxxxxxxxxxxxxxx
For list information or to unsubscribe,  visit: 
http://www.unidata.ucar.edu/mailing_lists/ 


