NOTE: The decoders mailing list is no longer active. The list archives are made available for historical reasons.
David,

This is discouraging. I spent hours looking through raw bulletins to try to make the decoder correct. I might try looking at the station ID before processing; I don't know if that will help. My philosophy is that it's better to disregard bulletins/reports than to enter "bad" data into a file. That said, your example bulletin should be discarded. Ugh. Will let you know about my new ideas.

Robb...
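For illustration, a minimal sketch of the station-ID pre-check idea mentioned above, assuming a %STATIONS lookup like the one the attached metar2nc script builds from sfmetar_sa.tbl; the known_station() helper and the sample table entry are hypothetical and not part of the decoder:

#!/usr/bin/perl
# Hypothetical illustration only -- not metar2nc code.  Drop a report whose
# leading station ID is not present in the station table.
use strict;
use warnings;

# Illustrative entry; real values would come from sfmetar_sa.tbl.
my %STATIONS = ( "KDEN" => "wmo_id lat lon elev" );

sub known_station {
    my ($report) = @_;
    # A METAR report is expected to begin with a 3-6 character station ID.
    return 0 unless $report =~ /^(\w{3,6})\s/;
    return exists $STATIONS{$1} ? 1 : 0;
}

for my $report (
    "KDEN 041753Z 09005KT 10SM FEW120 07/M07 A3012",
    "XXXX 041753Z 00000KT 9999 CAVOK 15/10 Q1013",
) {
    print known_station($report) ? "keep:    " : "discard: ";
    print "$report\n";
}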
On Thu, 4 Mar 2004, David Larson wrote:

I do see non-US bulletins that are split "badly" according to this code change ...

628
SANK31 MNMG 041706
METAR
MNPC 041700Z 08012KT 7000 BKN016 29/26 Q1015
MNRS 041700Z 06010KT 9999 FEW022 BKN250 30/23 Q1012
MNJG 041700Z 36004KT 7000 VCRA BKN016 22/17 Q1015
MNJU 041700Z 10006KT 9999 SCT025 32/19 Q1013
MNCH 041700Z 02010KT 9999 SCT030 33/21 Q1012
MNMG 041700Z 07016KT 9999 SCT025 32/20 Q1011 A2988
MNBL 041700Z 10008KT 9999 SCT019 SCT070 29/25 Q1014

Not only is each line a new report, but what worsens it is that the *last* entry *is* separated by an equal! Yuck. Perhaps this is just a junk bulletin? I'm surprised that it could even go out this way.

Does anyone you know even make an attempt to use the perl metar decoder for non-US stations? I've tried long enough to estimate the work as a *lot*.

Dave

David Larson wrote:
> I've looked into this problem, which I didn't know existed.
>
> Your code is now:
>
>     # Separate bulletins into reports
>     if( /=\n/ ) {
>         s#=\s+\n#=\n#g ;
>         @reports = split( /=\n/ ) ;
>     } else {
>         #@reports = split ( /\n/ ) ;
>         s#\n# #g ;
>         next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
>         $reports[ 0 ] = $_ ;
>     }
>
> But based on your assumption that these bulletins will not, and cannot,
> contain multiple reports (which seems reasonable), then there really only
> needs to be one split, right?  Because if there is no equal, the entire
> line will be placed into the first report.  This seems to be a slight
> simplification:
>
>     # Separate bulletins into reports
>     if( /=\n/ ) {
>         s#=\s+\n#=\n#g ;
>     } else {
>         s#\n# #g ;
>     }
>     @reports = split( /=\n/ ) ;
>     ... snip ... the next line is placed down many lines
>     next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
>
> Also, it is an error to have multiple time specifications in any report,
> right?  So that can be generalized as well, as I have done above.
>
> You asked for my comments, and well, there you have them!  :-)  I might
> take a closer look at the rest of the changes as well, but that will be
> delayed a bit.
>
> I sure appreciate your quick responses to all correspondence.
>
> Dave
>
> Robb Kambic wrote:
>
>> David,
>>
>> Yes, I know about the problem.  The problem exists in bulletins that
>> don't use the = sign to separate reports.  The solution is to assume
>> that bulletins that don't use = only have one report.  I scanned many
>> raw reports and this seems to be true, so I changed the code to:
>>
>> < @reports = split ( /\n/ ) ;
>> ---
>>
>>> #@reports = split ( /\n/ ) ;
>>> s#\n# #g ;
>>> next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
>>> $reports[ 0 ] = $_;
>>>
>>
>> The new code is attached.  I'm also working on a newer version of the
>> decoder, it's in the ftp decoders directory, ie
>>
>> metar2nc.new and metar.cdl.new
>>
>> The pqact.conf entry needs to change \2:yy to \2:yyyy because it now
>> uses the century too.  The cdl is different: it merges vars that have
>> different units into one, ie wind knots, mph, and m/s are all stored
>> using winds m/s.  Also, it stores all reports per station into one
>> record.  Take a look, I would appreciate any comments before it's
>> released.
>>
>> Robb...
>>
>>
>> On Tue, 2 Mar 2004, David Larson wrote:
>>
>>
>>> Robb,
>>>
>>> I've been chasing down a problem that seems to cause perfectly good
>>> reports to be discarded by the perl metar decoder.  There is a comment
>>> in the 2.4.4 decoder that reads "reports appended together wrongly", the
>>> code in this area takes the first line as the report to process, and
>>> discards the next line.
>>>
>>> To walk through this, I'll refer to the following report:
>>>
>>> 132
>>> SAUS80 KWBC 021800 RRD
>>> METAR
>>> K4BL 021745Z 12005KT 3SM BR OVC008 01/M01 RMK SLP143 NOSPECI 60011
>>> 8/2// T00061006 10011 21017 51007
>>>
>>> The decoder attempts to classify the report type ($rep_type on line 257
>>> of metar2nc), in doing so, it classifies this report as a "SPECI" ...
>>> which isn't what you'd expect by visual inspection of the report.
>>> However, perl is doing the right thing given that it is asked to match
>>> on #(METAR|SPECI) \d{4,6}Z?\n# which exists in the remarks of the
>>> report.
>>>
>>> The solution is probably to bind the text to the start of the line with
>>> a caret.  Seems to work pretty well so far.
>>>
>>> I've changed the lines (257-263) in metar2nc-v2.4.4 from:
>>>
>>>     if( s#(METAR|SPECI) \d{4,6}Z?\n## ) {
>>>         $rep_type = $1 ;
>>>     } elsif( s#(METAR|SPECI)\s*\n## ) {
>>>         $rep_type = $1 ;
>>>     } else {
>>>         $rep_type = "METAR" ;
>>>     }
>>>
>>> To:
>>>
>>>     if( s#^(METAR|SPECI) \d{4,6}Z?\n## ) {
>>>         $rep_type = $1 ;
>>>     } elsif( s#^(METAR|SPECI)\s*\n## ) {
>>>         $rep_type = $1 ;
>>>     } else {
>>>         $rep_type = "METAR" ;
>>>     }
>>>
>>> I simply added the caret (^) to bind the pattern to the start of the
>>> report.
>>>
>>> Let me know what you think.
>>> Dave
>>>
>>>
>>
>> ------------------------------------------------------------------------
>>
>> #! /usr/local/bin/perl
>> #
>> # usage: metar2nc cdlfile [datatdir] [yymm] < ncfile
>> #
>> #
>> #chdir( "/home/rkambic/code/decoders/src/metar" ) ;
>>
>> use NetCDF ;
>> use Time::Local ;
>> # process command line switches
>> while ($_ = $ARGV[0], /^-/) {
>>     shift;
>>     last if /^--$/;
>>     /^(-v)/ && $verbose++;
>> }
>> # process input parameters
>> if( $#ARGV == 0 ) {
>>     $cdlfile = $ARGV[ 0 ] ;
>> } elsif( $#ARGV == 1 ) {
>>     $cdlfile = $ARGV[ 0 ] ;
>>     if( $ARGV[ 1 ] =~ /^\d/ ) {
>>         $yymm = $ARGV[ 1 ] ;
>>     } else {
>>         $datadir = $ARGV[ 1 ] ;
>>     }
>> } elsif( $#ARGV == 2 ) {
>>     $cdlfile = $ARGV[ 0 ] ;
>>     $datadir = $ARGV[ 1 ] ;
>>     $yymm = $ARGV[ 2 ] ;
>> } else {
>>     die "usage: metar2nc cdlfile [datatdir] [yymm] < ncfile $!\n" ;
>> }
>> print "Missing cdlfile file $cdlfile: $!\n" unless -e $cdlfile ;
>>
>> if( -e "util/ncgen" ) {
>>     $ncgen = "util/ncgen" ;
>> } elsif( -e "/usr/local/ldm/util/ncgen" ) {
>>     $ncgen = "/usr/local/ldm/util/ncgen" ;
>> } elsif( -e "/upc/netcdf/bin/ncgen" ) {
>>     $ncgen = "/upc/netcdf/bin/ncgen" ;
>> } elsif( -e "./ncgen" ) {
>>     $ncgen = "./ncgen" ;
>> } else {
>>     open( NCGEN, "which ncgen |" ) ;
>>     $ncgen = <NCGEN> ;
>>     close( NCGEN ) ;
>>
>>     if( $ncgen =~ /no ncgen/ ) {
>>         die "Can't find NetCDF utility 'ncgen' in PATH, util/ncgen, /usr/local/ldm/util/ncgen, /upc/netcdf/bin/ncgen, or ./ncgen : $!\n" ;
>>     } else {
>>         $ncgen = "ncgen" ;
>>     }
>> }
>> # the data and the metadata directories
>> $datadir = "." if( ! $datadir ) ;
>> $metadir = $datadir . "/../metadata/surface/metar" ;
>> # redirect STDOUT and STDERR
>> open( STDOUT, ">$datadir/metarLog.$$.log" ) ||
>>     die "could not open $datadir/metarLog.$$.log: $!\n" ;
>> open( STDERR, ">&STDOUT" ) ||
>>     die "could not dup stdout: $!\n" ;
>> select( STDERR ) ; $| = 1 ;
>> select( STDOUT ) ; $| = 1 ;
>>
>> die "Missing cdlfile file $cdlfile: $!\n" unless -e $cdlfile ;
>>
>> # year and month
>> if( ! $yymm ) {
>>     $theyear = (gmtime())[ 5 ] ;
>>     $theyear = ( $theyear < 100 ? $theyear : $theyear - 100 ) ;
>>     $theyear = sprintf( "%02d", $theyear ) ;
>>     $themonth = (gmtime())[ 4 ] ;
>>     $themonth++ ;
>>     $yymm = $theyear . sprintf( "%02d", $themonth ) ;
>> } else {
>>     $theyear = substr( $yymm, 0, 2 ) ;
>>     $themonth = substr( $yymm, 2 ) ;
>> }
>> # file used for bad metars or prevention of overwrites to ncfiles
>> open( OPN, ">>$datadir/rawmetars.$$.nc" ) ||
>>     die "could not open $datadir/rawmetars.$$.nc: $!\n" ;
>> # set error handling to verbose only
>> $result = NetCDF::opts( VERBOSE ) ;
>>
>> # set interrupt handler
>> $SIG{ 'INT' } = 'atexit' ;
>> $SIG{ 'KILL' } = 'atexit' ;
>> $SIG{ 'TERM' } = 'atexit' ;
>> $SIG{ 'QUIT' } = 'atexit' ;
>>
>> # set defaults
>>
>> $F = -99999 ;
>> $A = \$F ;
>> $S1 = "\0" ;
>> $AS1 = \$S1 ;
>> $S2 = "\0\0" ;
>> $AS2 = \$S2 ;
>> $S3 = "\0\0\0" ;
>> $AS3 = \$S3 ;
>> $S4 = "\0\0\0\0" ;
>> $AS4 = \$S4 ;
>> $S8 = "\0" x 8 ;
>> $AS8 = \$S8 ;
>> $S10 = "\0" x 10 ;
>> $AS10 = \$S10 ;
>> $S15 = "\0" x 15 ;
>> $AS15 = \$S15 ;
>> $S32 = "\0" x 32 ;
>> $AS32 = \$S32 ;
>> $S128 = "\0" x 128 ;
>> $AS128 = \$S128 ;
>>
>> %CDL = (
>> "rep_type", 0, "stn_name", 1, "wmo_id", 2, "lat", 3, "lon", 4, "elev", 5,
>> "ob_hour", 6, "ob_min", 7, "ob_day", 8, "time_obs", 9,
>> "time_nominal", 10, "AUTO", 11, "UNITS", 12, "DIR", 13, "SPD", 14,
>> "GUST", 15, "VRB", 16, "DIRmin", 17, "DIRmax", 18, "prevail_VIS_SM", 19,
>> "prevail_VIS_KM", 20, "plus_VIS_SM", 21, "plus_VIS_KM", 22,
>> "prevail_VIS_M", 23, "VIS_dir", 24, "CAVOK", 25, "RVRNO", 26,
>> "RV_designator", 27, "RV_above_max", 28, "RV_below_min", 29,
>> "RV_vrbl", 30, "RV_min", 31, "RV_max", 32, "RV_visRange", 33, "WX", 34,
>> "vert_VIS", 35, "cloud_type", 36, "cloud_hgt", 37,
>> "cloud_meters", 38, "cloud_phenom", 39, "T", 40, "TD", 41,
>> "hectoPasc_ALTIM", 42, "inches_ALTIM", 43, "NOSIG", 44,
>> "TornadicType", 45, "TornadicLOC", 46, "TornadicDIR", 47,
>> "BTornadic_hh", 48, "BTornadic_mm", 49,
>> "ETornadic_hh", 50, "ETornadic_mm", 51, "AUTOindicator", 52,
>> "PKWND_dir", 53, "PKWND_spd", 54, "PKWND_hh", 55, "PKWND_mm", 56,
>> "WshfTime_hh", 57, "WshfTime_mm", 58, "Wshft_FROPA", 59, "VIS_TWR", 60,
>> "VIS_SFC", 61, "VISmin", 62, "VISmax", 63, "VIS_2ndSite", 64,
>> "VIS_2ndSite_LOC", 65, "LTG_OCNL", 66, "LTG_FRQ", 67, "LTG_CNS", 68,
>> "LTG_CG", 69, "LTG_IC", 70, "LTG_CC", 71, "LTG_CA", 72, "LTG_DSNT", 73,
>> "LTG_AP", 74, "LTG_VcyStn", 75, "LTG_DIR", 76, "Recent_WX", 77,
>> "Recent_WX_Bhh", 78, "Recent_WX_Bmm", 79, "Recent_WX_Ehh", 80,
>> "Recent_WX_Emm", 81, "Ceiling_min", 82, "Ceiling_max", 83,
>> "CIG_2ndSite_meters", 84, "CIG_2ndSite_LOC", 85, "PRESFR", 86,
>> "PRESRR", 87,
>> "SLPNO", 88, "SLP", 89, "SectorVIS_DIR", 90, "SectorVIS", 91, "GR", 92,
>> "GRsize", 93, "VIRGA", 94, "VIRGAdir", 95, "SfcObscuration", 96,
>> "OctsSkyObscured", 97, "CIGNO", 98, "Ceiling_est", 99, "Ceiling", 100,
>> "VrbSkyBelow", 101, "VrbSkyLayerHgt", 102, "VrbSkyAbove", 103,
>> "Sign_cloud", 104, "Sign_dist", 105, "Sign_dir", 106, "ObscurAloft", 107,
>> "ObscurAloftSkyCond", 108, "ObscurAloftHgt", 109, "ACFTMSHP", 110,
>> "NOSPECI", 111, "FIRST", 112, "LAST", 113, "Cloud_low", 114,
>> "Cloud_medium", 115, "Cloud_high", 116, "SNINCR", 117,
>> "SNINCR_TotalDepth", 118,
>> "SN_depth", 119, "SN_waterequiv", 120, "SunSensorOut", 121,
>> "SunShineDur", 122,
>> "PRECIP_hourly", 123, "PRECIP_amt", 124, "PRECIP_24_amt", 125,
>> "T_tenths", 126,
>> "TD_tenths", 127, "Tmax", 128, "Tmin", 129, "Tmax24", 130, "Tmin24", 131,
>> "char_Ptend", 132, "Ptend", 133, "PWINO", 134, "FZRANO", 135,
>> "TSNO", 136, "PNO", 137, "maintIndicator", 138, "PlainText", 139,
>> "report", 140, "remarks", 141 ) ;
>>
>> # default netCDF record structure, contains all vars for the METAR reports
>> @defaultrec = ( $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS3, $A, $A,
>> $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS2, $A, $A, [( $S3, $S3, $S3, $S3 )],
>> [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], [( $F, $F, $F, $F )],
>> [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], [( $F, $F, $F, $F )], $AS32, $A,
>> [( $S4, $S4, $S4, $S4, $S4, $S4 )], [( $F, $F, $F, $F, $F, $F )],
>> [( $F, $F, $F, $F, $F, $F )], [( $S4, $S4, $S4, $S4, $S4, $S4 )],
>> $A, $A, $A, $A, $A, $AS15, $AS10, $AS2, $A, $A, $A, $A, $AS4, $A, $A,
>> $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS10, $A, $A, $A, $A, $A,
>> $A, $A, $A, $A, $A, $AS2, [( $S8, $S8, $S8 )], [( $F, $F, $F )],
>> [( $F, $F, $F )], [( $F, $F, $F )], [( $F, $F, $F )], $A, $A, $A, $A,
>> $A, $A, $A, $A, $AS2, $A, $A, $A, $A, $AS2, $AS8, $A, $A, $A, $A,
>> $AS3, $A, $AS3, $AS10, $AS10, $AS10, $AS8, $AS3, $A, $A, $A, $A, $A,
>> $AS1, $AS1, $AS1, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A,
>> $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $A, $AS128, $AS128,
>> $AS128 ) ;
>>
>> # two fold purpose array, if entry > 0, then var is requested and it's value
>> # is the position in the record, except first entry
>> @W = ( 0 ) x ( $#defaultrec +1 ) ;
>> $W[ 0 ] = -1 ;
>>
>> # open cdl and create record structure according to variables
>> open( CDL, "$cdlfile" ) || die "could not open $cdlfile: $!\n" ;
>> $i = 0 ;
>> while( <CDL> ) {
>>     if( s#^\s*(char|int|long|double|float) (\w{1,25})## ) {
>>         ( $number ) = $CDL{ $2 } ;
>>         push( @rec, $defaultrec[ $number ] ) ;
>>         $W[ $number ] = $i++ ;
>>     }
>> }
>> close CDL ;
>> undef( @defaultrec ) ;
>> undef( %CDL ) ;
>>
>> # read in station data
>> if( -e "etc/sfmetar_sa.tbl" ) {
>>     $sfile = "etc/sfmetar_sa.tbl" ;
>> } elsif( -e "./sfmetar_sa.tbl" ) {
>>     $sfile = "./sfmetar_sa.tbl" ;
>> } else {
>>     die "Can't find sfmetar_sa.tbl station file.: $!\n" ;
>> }
>> open( STATION, "$sfile" ) || die "could not open $sfile: $!\n" ;
>>
>> while( <STATION> ) {
>>     s#^(\w{3,6})?\s+(\d{4,5}).{40}## ;
>>     $id = $1 ;
>>     $wmo_id = $2 ;
>>     $wmo_id = "0" . $wmo_id if( length( $wmo_id ) == 4 ) ;
>>     ( $lat, $lon, $elev ) = split ;
>>     $lat = sprintf( "%7.2f", $lat / 100 ) ;
>>     $lon = sprintf( "%7.2f", $lon / 100) ;
>>
>>     # set these vars ( $wmo_id, $lat, $lon, $elev )
>>     $STATIONS{ "$id" } = "$wmo_id $lat $lon $elev" ;
>> }
>> close STATION ;
>>
>> # read in list of already processed reports if it exists
>> # open metar.lst, list of reports processed in the last 4 hours.
>> if( -e "$datadir/metar.lst" ) {
>>     open( LST, "$datadir/metar.lst" ) ||
>>         die "could not open $datadir/metar.lst: $!\n" ;
>>     while( <LST> ) {
>>         ( $stn, $rtptime, $hr ) = split ;
>>         $reportslist{ "$stn $rtptime" } = $hr ;
>>     }
>>     close LST ;
>>     #unlink( "$datadir/metar.lst" ) ;
>> }
>> # Now begin parsing file and decoding observations breaking on cntrl C
>> $/ = "\cC" ;
>>
>> # set select processing here from STDIN
>> START:
>> while( 1 ) {
>>     open( STDIN, '-' ) ;
>>     vec($rin,fileno(STDIN),1) = 1;
>>     $timeout = 1200 ; # 20 minutes
>>     $nfound = select( $rout = $rin, undef, undef, $timeout );
>>     # timed out
>>     if( ! $nfound ) {
>>         print "Shut down, time out 20 minutes\n" ;
>>         &atexit() ;
>>     }
>>     &atexit( "eof" ) if( eof( STDIN ) ) ;
>>
>>     # Process each line of metar bulletins, header first
>>     $_ = <STDIN> ;
>>     #next unless /METAR|SPECI/ ;
>>     s#\cC## ;
>>     s#\cM##g ;
>>     s#\cA\n## ;
>>     s#\c^##g ;
>>
>>     s#\d\d\d \n## ;
>>     s#\w{4}\d{1,2} \w{4} (\d{2})(\d{2})(\d{2})?.*\n## ;
>>     $tday = $1 ;
>>     $thour = $2 ;
>>     $thour = "23" if( $thour eq "24" ) ;
>>     $tmin = $3 ;
>>     $tmin = "00" unless( $tmin ) ;
>>     next unless ( $tday && defined( $thour ) ) ;
>>     $time_trans = thetime( "trans" ) ;
>>     if( s#(METAR|SPECI) \d{4,6}Z?\n## ) {
>>         $rep_type = $1 ;
>>     } elsif( s#(METAR|SPECI)\s*\n## ) {
>>         $rep_type = $1 ;
>>     } else {
>>         $rep_type = "METAR" ;
>>     }
>>     # Separate bulletins into reports
>>     if( /=\n/ ) {
>>         s#=\s+\n#=\n#g ;
>>         @reports = split( /=\n/ ) ;
>>     } else {
>>         #@reports = split ( /\n/ ) ;
>>         s#\n# #g ;
>>         next if( /\d{4,6}Z.*\d{4,6}Z/ ) ;
>>         $reports[ 0 ] = $_ ;
>>     }
>>
>>
>
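As a cross-check, here is a small standalone sketch (not metar2nc itself) that exercises the two code regions discussed in this thread against the two sample inputs quoted above: the caret-anchored report-type match from the 2 Mar note, and the simplified bulletin split together with an illustrative per-report generalization of the multiple-time-group test. The rep_type() and split_reports() helper names exist only in this sketch.

#!/usr/bin/perl
# Standalone sketch, not metar2nc code.
use strict;
use warnings;

# (1) Classify the report type, with the pattern anchored to the start of the
# bulletin so remark text such as "NOSPECI 60011" cannot match.
sub rep_type {
    my ($bulletin) = @_;
    return $1 if $bulletin =~ m#^(METAR|SPECI) \d{4,6}Z?\n#;
    return $1 if $bulletin =~ m#^(METAR|SPECI)\s*\n#;
    return "METAR";
}

# (2) Split a bulletin into report candidates, then drop any candidate that
# contains more than one time group (reports appended together wrongly).
sub split_reports {
    my ($bulletin) = @_;
    if ( $bulletin =~ /=\n/ ) {
        $bulletin =~ s#=\s+\n#=\n#g;
    } else {
        $bulletin =~ s#\n# #g;
    }
    my @keep;
    for my $candidate ( split /=\n/, $bulletin ) {
        ( my $flat = $candidate ) =~ s#\n# #g;   # flatten so the test spans the whole candidate
        if ( $flat =~ /\d{4,6}Z.*\d{4,6}Z/ ) {   # two time groups => bad candidate
            warn "discarding candidate with multiple time groups\n";
            next;
        }
        push @keep, $candidate;
    }
    return @keep;
}

# The K4BL report from the 2 Mar mail; "NOSPECI 60011" sits in the remarks.
my $k4bl = "METAR\n"
         . "K4BL 021745Z 12005KT 3SM BR OVC008 01/M01 RMK SLP143 NOSPECI 60011\n"
         . "8/2// T00061006 10011 21017 51007\n";
print "rep_type: ", rep_type($k4bl), "\n";       # prints METAR, not SPECI

# Three reports from the 4 Mar Nicaraguan bulletin (header lines already
# stripped); only the last report is terminated by an equal sign.
my $mnmg = "MNPC 041700Z 08012KT 7000 BKN016 29/26 Q1015\n"
         . "MNRS 041700Z 06010KT 9999 FEW022 BKN250 30/23 Q1012\n"
         . "MNBL 041700Z 10008KT 9999 SCT019 SCT070 29/25 Q1014=\n";
my @reports = split_reports($mnmg);
print scalar(@reports), " report(s) kept\n";     # prints 0: the junk bulletin is dropped

With the per-report test, the Nicaraguan bulletin is discarded outright rather than decoded as one mega-report, while a well-formed bulletin whose reports each end in "=" still passes, since each candidate then carries only one time group.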
===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
rkambic@xxxxxxxxxxxxxxxx                   WWW: http://www.unidata.ucar.edu/
===============================================================================