
20000414: XCD grid decoding problems at UVa (hopefully final)



>From: Anthony James Wimmers <address@hidden>
>Organization: UVa
>Keywords: 200004091534.JAA08082 XCD DMGRID ETA MRF FTP backup RTMODELS.CFG 
>NOGRIB.CFG GRIBDEC.CFG HRS.CFG RESOLV.SRV DECINFO

Tony,

>As Dr. Frankenstein said, "It works!" Here's to hoping that it
>keeps going.

Oh ye of little faith :-)

>Thanks again for all the help, and the education. 

You are welcome, and the education was on both of our ends.

The "fix" was not as simple as you may think.  This message is intended
to bring you up to date on and review exactly what was done on your
machine to get things working correctly.  (Note that all modified files
are in a new addendum that I put together on April 13).  It is written
so that others looking through our tracking system can (hopefully) use
the information to their benefit:

o modify RTMODELS.CFG:

  o move the start of the decoded NGM grids from 5051 to 5501; this was
    done to make room in the GRID namespace for the ETA > H+48 grids

    The lines from RTMODELS.CFG that demonstrate this are:

    ETA= 1 5011 120000 240000 480000  (see next comment)
    NGM= 3 5501 120000 240000 480000

    NOTE: if one wanted to file each ETA run into its own set of GRID
    files (i.e., 0Z run in one set of files; 6 Z run in a different set
    of files; etc.), then it is advised to move the ETA grids to a
    new set of GRID numbers and leave the NGM grids where they are.

  o change how ETA grids are stored:

    change:

    ETA=  3 5011 120000 240000 480000

    to:

    ETA=  1 5011 120000 240000 480000

    This says to store all grids with forecast times > 48 hours in a
    separate set of grid files (which did not previously exist)

  o change decoding of MRF to MRF and AVN; this change required mods
    to not only RTMODELS.CFG, but also to gbtbpds001.av1.  See the section
    on gbtbpds001.av1 below for details.

o modify the ADDE dataset definitions to match the new GRID file numbers
  for ETA, NGM, and ALL (ALL is all grids, so it also needs
  modifying).  The quickest way to do this is to edit the ADDE dataset
  definition file, RESOLV.SRV.  This has to be done for each user
  who has his/her own datasets defined.  Sites using the remote
  ADDE server technique only need to modify the copy in
  ~mcidas/workdata (~mcidas/uvaworkdata in your case).  The modified
  lines from RESOLV.SRV will look like:

N1=RTGRIDS,N2=ALL,TYPE=GRID,RT=Y,K=GRID,R1=5001,R2=5540,C=Real-Time Grids,
N1=RTGRIDS,N2=ETA,TYPE=GRID,RT=Y,K=GRID,R1=5011,R2=5070,C=Real-Time ETA Grids,
N1=RTGRIDS,N2=NGM,TYPE=GRID,RT=Y,K=GRID,R1=5501,R2=5540,C=Real-Time NGM Grids,
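
For anyone wanting to double-check the ranges after editing, here is a
minimal shell sketch.  The function names list_ranges and ranges_overlap
are hypothetical (they are not McIDAS commands), and the parsing assumes
only the comma-separated N2=, R1= and R2= fields visible in the lines
above; nothing else about the RESOLV.SRV format is assumed.

```shell
#!/bin/sh
# Hypothetical helper: print "N2 R1 R2" for every RTGRIDS dataset line
# found in a RESOLV.SRV-style file given as $1.
list_ranges() {
    awk -F, '/N1=RTGRIDS/ {
        n2 = r1 = r2 = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^N2=/) n2 = substr($i, 4)
            if ($i ~ /^R1=/) r1 = substr($i, 4)
            if ($i ~ /^R2=/) r2 = substr($i, 4)
        }
        print n2, r1, r2
    }' "$1"
}

# Hypothetical helper: succeed (exit status 0) if the inclusive ranges
# $1..$2 and $3..$4 share at least one GRID file number.
ranges_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}
```

Running `list_ranges RESOLV.SRV` in the working directory prints one line
per dataset.  With the values above, `ranges_overlap 5011 5070 5501 5540`
fails (no collision), whereas an NGM range still starting at 5051 would
collide with the ETA range.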
  
o update gbtbpds001.av1: A modified version of gbtbpds001.av1 has been
  in a McIDAS-X 7.6 addendum since Feb 12, 2000, but it was not
  installed in your active McIDAS-X working directory, uvaworkdata.  On
  top of the previous changes to gbtbpds001.av1, there was yet another
  change in the model ID for ECMWF grids (now referenced as 195 instead
  of 194).  This model number has been changing on and off for at least
  a couple of years.  SSEC came up with a way to get around this: they
  provide two new 'gbtb' tables which I uploaded and installed on your
  system.  More on this in the next section.

  For reference, the relevant lines in this context are:

068 | 80 Wave triangular, 18-layer spectral model from aviation run      | AVN
069 | 80 Wave triangular, 18-layer spectral model from MRF run           | MRF
077 | 126 wave triangular, 18-layer Spectral Model from aviation run     | AVN
078 | 126 wave triangular, 18-layer Spectral Model from MRF run          | MRF
080 | 62 wave triangular, 18-layer Spectral Model from MRF run           | MRF
081 | Spectral Statistical Interpolation from aviation run               | AVN
082 | Spectral Statistical Interpolation from final run                  | MRF
083 | ETA Model - 80 km version                                          | ETA
084 | ETA Model - 40 km version                                          | ETA
085 | ETA Model - 30 km version                                          | ETA
089 | ETA Model - 48 km version                                          | ETA
090 | 62 Wave triangular, 28 layer 'Medium Range Forecast'               | MRF
091 | 62 Wave triangular, 28 layer 'Aviation'                            | AVN
092 | 62 Wave triangular, 28 layer 'Medium Range Forecast' final         | MRF
093 | 62 Wave triangular, 28 layer 'GDAS'                                | MRF
110 | ETA Model - 15 km version                                          | ETA
195 | European Center For Medium Range Weather Forecasting Model         | ECMF
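
To see which model name a given model ID maps to in an installed copy of
the table, a small lookup can be sketched in shell.  The function name
model_name is hypothetical, and the sketch assumes that every entry has
the three '|'-separated columns shown above (ID, description, name); the
real file may contain other lines that this does not handle.

```shell
#!/bin/sh
# Hypothetical helper: print the model name (third column) for model ID $1
# in the '|'-separated table file $2.  IDs are compared numerically, so
# "84" matches the table entry "084".
model_name() {
    awk -F'|' -v id="$1" '$1 + 0 == id + 0 {
        gsub(/ /, "", $3)   # strip the padding around the name
        print $3
    }' "$2"
}
```

For example, `model_name 84 gbtbpds001.av1` should print ETA if the table
matches the lines above, and printing nothing for an ID means the table
has no entry for it.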

o install two new tables used by dmgrid.k:  gbtbpds001.a74v1 and
  gbtbpds001.a98v1.  This gets around having to ever modify gbtbpds001.av1
  for ECMWF model ID changes (yahoo!).

o modify GRIBDEC.CFG: change MAXGRD=2000 to MAXGRD=5000.  Since there are
  more grids for the various models, the GRID file needed to have room
  to store more.  This is especially true since the 6 Z and 18 Z ETA
  runs are, by default, stored in the 0 Z and 12 Z ETA run files,
  respectively.  The effect of increasing MAXGRD= is to increase the
  GRID file header size.

o modify NOGRIB.CFG: change '89' and '85' both to '84'.  PDS octet 6
  (the model ID) of all ETA GRIB messages was changed to '84' on about
  March 29, 2000.  NOGRIB.CFG needed to be updated to reflect this change.

o install code modifications for routines used in dmgrid.k (_THIS_ was
  the real fix!):

  grib.h
  Mcgribdecoder.c

  The following is part of an alert I received from SSEC the same day
  I was talking to you by phone.  I had a chance to really go over
  their update after we hung up:

    "During the week of 2 April 2000, the National Weather Service
    began transmitting what appears to be a radar product in GRIB
    format on the NOAAPORT broadcast. The product contains
    approximately 165,000 points and therefore exceeds an array size in
    the GRIB decoder. This results in an array overflow error when the
    GRIB decoder attempts to decode the product. The exact behavior of
    the error is platform dependent, but resulted in segmentation
    violations on both our Sparc system running Solaris 7 and our PC
    system running Solaris Intel 7. The error may result in groups of
    needed grids missing from your grid files.  This addendum corrects
    the overflow error and instructs the GRIB decoder to not decode
    these radar products since it is unable to due to navigation
    issues."

  NOTE the comment "The error may result in groups of needed grids missing
  from your grid files."  This was, in fact, exactly what you were
  seeing on your machine (a Sun Solaris 7 box).  We were not seeing
  the problem on our Solaris x86 box, but that was probably just pure luck!

  The installation of the new code required (everything done as the user
  'mcidas'):

  o FTP the new modules, grib.h and Mcgribdecoder.c, to your
    ~mcidas/mcidas7.6/src directory

  o stop the McIDAS-XCD grid data monitor:

    cd ~mcidas/uvaworkdata   (for other sites this would be ~mcidas/workdata)
    decinfo.k SET DMGRID INACTIVE

  o "touch" Mcmkmcgrid.c so that make will rebuild it:

    cd ~mcidas/mcidas7.6/src
    touch Mcmkmcgrid.c

  o make a new executable for the grid data monitor, dmgrid.k:

    make dmgrid.k

  o install the new dmgrid.k:

    rm ~/bin/dmgrid.k
    ln dmgrid.k ~/bin

  o restart the grid data monitor:

    cd ~mcidas/uvaworkdata   (for other sites this would be ~mcidas/workdata)
    decinfo.k SET DMGRID ACTIVE

  By the way, the new, large product is a 10 km national radar summary
  created from Radar Coded Messages (RCM) that are generated at NEXRAD
  PUPs.  McIDAS does not _yet_ decode these into anything useful, but
  it _will_ sometime in the future.
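
For sites that have to repeat the rebuild, the steps above (after the
FTP of grib.h and Mcgribdecoder.c) can be collected into one shell
function.  This is only a sketch: the commands are the ones listed
above, but the wrapper, the variable names WORKDATA, SRC and DRY_RUN,
and the use of 'rm -f' instead of 'rm' are additions that are not part
of the original procedure.  It is meant to be run as the user 'mcidas'.

```shell
#!/bin/sh
# Sketch only; WORKDATA, SRC and DRY_RUN are hypothetical settings.
WORKDATA=${WORKDATA:-$HOME/workdata}   # ~mcidas/uvaworkdata at UVa
SRC=${SRC:-$HOME/mcidas7.6/src}
DRY_RUN=${DRY_RUN:-1}                  # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop DMGRID, rebuild and relink dmgrid.k, then restart DMGRID.
# The chain stops at the first command that fails.
rebuild_dmgrid() {
    run cd "$WORKDATA" &&
    run decinfo.k SET DMGRID INACTIVE &&
    run cd "$SRC" &&
    run touch Mcmkmcgrid.c &&
    run make dmgrid.k &&
    run rm -f "$HOME/bin/dmgrid.k" &&   # -f: a missing link is not an error
    run ln dmgrid.k "$HOME/bin" &&
    run cd "$WORKDATA" &&
    run decinfo.k SET DMGRID ACTIVE
}
```

Calling rebuild_dmgrid with the default DRY_RUN=1 only lists the
commands, which makes it easy to review the paths before setting
DRY_RUN=0 and running it for real.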

Also, the following inquiry by Bryan Batson of the Johnson Space Flight
Center (JSFC) to SSEC makes me feel that the effort of changing the
size of the grid spool file, HRS.SPL, from 16 to 128 MB was not wasted
effort:

  "Since reconfiguring our XCD system to accommodate the new ETA data,
  I've noticed a problem in that DMGRID does not seem to be processing
  all the grids it should be. My initial thinking on this is that the
  HRS.SPL file is wrapping before DMGRID can catch up (also, my XCD
  system running near 100% CPU load..with HP 755/99 MHZ system). I've
  told Chad about this and he thinks this may be the case, also. I will
  probably be increasing the size of HRS.SPL to see if that helps."

I don't think that windfall was necessarily having a problem as extreme
as Bryan's, but his comment and proposed solution sound mighty
suspicious.  I believe that increasing HRS.SPL from 16 to 32 MB may have
been all that was necessary.  Bumping it up to 128 MB was most likely
massive overkill.  If you feel like it, and if you want to recoup some
of your disk space, then I would recommend stepping the size of your
HRS.SPL file back to 32 MB.  For reference for others, this is done as
follows:

o stop the LDM (done as user running the LDM):

  ldmadmin stop

o change the size of HRS.SPL in the HRS.CFG file (done as 'mcidas'):

  cd ~mcidas/uvaworkdata  (for most sites this would be ~mcidas/workdata)

  change:

SPOOLENG=16             # size of the spool file to use in megabytes

  to:

SPOOLENG=32             # size of the spool file to use in megabytes

o delete HRS.SPL and GRIBDEC.PRO (done as user 'mcidas'):

  cd ~mcidas/uvaworkdata  (for most sites this will be ~mcidas/workdata)
  lwu.k DEL HRS.SPL
  lwu.k DEL GRIBDEC.PRO

o restart the LDM (done as user running the LDM):

  ldmadmin start

HRS.SPL and GRIBDEC.PRO will be recreated by DMGRID after grid products
start coming in again.
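
The HRS.CFG edit in the procedure above can also be made
non-interactively.  The following is a minimal sketch, assuming the
setting appears at the start of a line exactly as shown above;
set_cfg_number is a hypothetical helper name, not a McIDAS or LDM
command.

```shell
#!/bin/sh
# Hypothetical helper: replace the number in a line starting with KEY=<number>
# in FILE, leaving any trailing comment untouched.  The original file is
# kept as FILE.bak.
#   $1 = key (e.g. SPOOLENG), $2 = new value, $3 = config file
set_cfg_number() {
    key=$1 value=$2 file=$3
    cp "$file" "$file.bak" || return 1
    sed "s/^$key=[0-9][0-9]*/$key=$value/" "$file.bak" > "$file"
}
```

Run as 'mcidas' in the working directory, after the LDM has been
stopped, this would be: set_cfg_number SPOOLENG 32 HRS.CFG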

Simple fix, heh ;-)

Please let me know if you see any irregularities on your system after
these changes.

Tom