
Re: problems with motherlode and sunshine



William Noon wrote:
> 
> Is anyone else having a problem keeping a connection with motherlode and
> sunshine?  Overnight we lost about a half hour's worth of data and
> at 13Z or so we started getting disconnects to both sunshine and motherlode.
> 
> Some data has been trickling in but we are lagging by about an hour now...
> 
> Traceroutes and pings to both sites seem fine.
> 
> --Bill Noon
> Northeast Regional Climate Center
> Cornell University

Hi Bill,

I looked at the logs on motherlode.  The logs were loaded with messages
about your host, snow, like the following:

Jul 01 23:00:46 motherlode.ucar.edu snow[17156]: Connection from snow.cit.cornell.edu
Jul 01 23:00:46 motherlode.ucar.edu snow(feed)[17156]: Starting Up: 20010701215240.249 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}
Jul 01 23:00:47 motherlode.ucar.edu snow(feed)[17156]: topo:  snow.cit.cornell.edu FSL2|WMO
Jul 01 23:00:47 motherlode.ucar.edu snow(feed)[17156]: RECLASS: 20010701220048.328 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}
Jul 01 23:10:25 motherlode.ucar.edu snow(feed)[17156]: h_clnt_call: snow.cit.cornell.edu: BLKDATA: time elapsed  30.416736
Jul 01 23:13:40 motherlode.ucar.edu snow(feed)[17156]: HVJI85 ECMF 011200 /mECMWF_199: RPC: Server can't decode arguments (11)
Jul 01 23:13:40 motherlode.ucar.edu snow(feed)[17156]: pq_sequence failed: I/O error (errno = 5)
Jul 01 23:13:40 motherlode.ucar.edu snow(feed)[17156]: Exiting

There was only one other site for which these messages occur.  That
site is in Costa Rica, and we have been working with them regarding bad
connectivity.  This, coupled with the fact that nobody else reported
having such problems, makes me think the problem was at your site.
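
A check like "which downstream hosts show these errors" can be
sketched in Python as below.  This is only an illustration, not an
LDM tool: the regular expression, the function name and the choice of
error markers are assumptions based on the layout of the log lines
quoted in this message, and it expects each log entry on a single
unwrapped line.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpts in this message:
#   "Jul 01 23:13:40 motherlode.ucar.edu snow(feed)[17156]: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<server>\S+) "
    r"(?P<tag>[^\s(\[]+)(?:\((?P<role>\w+)\))?\[(?P<pid>\d+)\]: "
    r"(?P<msg>.*)$"
)

# Substrings treated as signs of a failed feed (an assumption for this sketch).
ERROR_MARKERS = ("pq_sequence failed", "RPC:")


def count_errors_by_host(lines):
    """Count error-looking log lines per downstream host tag (e.g. 'snow')."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a line in the assumed format
        if any(marker in match.group("msg") for marker in ERROR_MARKERS):
            counts[match.group("tag")] += 1
    return counts
```

Feeding it the lines of a log file, for example
count_errors_by_host(open(path)) with path being wherever the LDM log
lives on your system, gives a per-host tally that makes it easy to see
whether one site or many are affected.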

Looking at the logs, I see there were also some connectivity problems
on 6/29 around 2:30Z.

Jun 29 02:30:29 motherlode.ucar.edu snow(feed)[16108]: RECLASS: 20010629013050.608 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}
Jun 29 03:17:06 motherlode.ucar.edu snow(feed)[16108]: pq_sequence failed: I/O error (errno = 5)
Jun 29 03:17:06 motherlode.ucar.edu snow(feed)[16108]: Exiting

[A roughly eight hour disconnect???]

Jun 29 11:27:11 motherlode.ucar.edu snow[17531]: Connection from snow.cit.cornell.edu
Jun 29 11:27:11 motherlode.ucar.edu snow(feed)[17531]: Starting Up: 20010629102104.040 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}
Jun 29 11:27:12 motherlode.ucar.edu snow(feed)[17531]: topo:  snow.cit.cornell.edu FSL2|WMO
Jun 29 11:27:12 motherlode.ucar.edu snow(feed)[17531]: FZPN26 KWBC 291020 /pOFFPZ6: RPC: Unable to receive (4)
Jun 29 11:27:12 motherlode.ucar.edu snow(feed)[17531]: pq_sequence failed: I/O error (errno = 5)
Jun 29 11:27:12 motherlode.ucar.edu snow(feed)[17531]: Exiting

Jun 29 11:33:09 motherlode.ucar.edu snow[18019]: Connection from snow.cit.cornell.edu
Jun 29 11:33:09 motherlode.ucar.edu snow(feed)[18019]: Starting Up: 20010629102746.556 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}
Jun 29 11:33:09 motherlode.ucar.edu snow(feed)[18019]: topo:  snow.cit.cornell.edu FSL2|WMO
Jun 29 11:33:10 motherlode.ucar.edu snow(feed)[18019]: RECLASS: 20010629103331.848 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}

Then, around 14Z things started failing:

Jun 29 14:02:23 motherlode.ucar.edu snow(feed)[18019]: h_clnt_call: snow.cit.cornell.edu: BLKDATA: time elapsed  33.885993
Jun 29 14:06:01 motherlode.ucar.edu snow(feed)[18019]: pq_sequence failed: I/O error (errno = 5)
Jun 29 14:06:01 motherlode.ucar.edu snow(feed)[18019]: Exiting

Jun 29 14:06:31 motherlode.ucar.edu snow[2814]: Connection from snow.cit.cornell.edu
Jun 29 14:06:31 motherlode.ucar.edu snow(feed)[2814]: Starting Up: 20010629140403.619 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}
Jun 29 14:06:32 motherlode.ucar.edu snow(feed)[2814]: topo:  snow.cit.cornell.edu FSL2|WMO
Jun 29 14:07:32 motherlode.ucar.edu snow(feed)[2814]: pq_sequence failed: I/O error (errno = 5)
Jun 29 14:07:32 motherlode.ucar.edu snow(feed)[2814]: Exiting

Jun 29 14:08:10 motherlode.ucar.edu snow[3002]: Connection from snow.cit.cornell.edu
Jun 29 14:08:10 motherlode.ucar.edu snow(feed)[3002]: Starting Up: 20010629140403.619 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}
Jun 29 14:08:11 motherlode.ucar.edu snow(feed)[3002]: topo:  snow.cit.cornell.edu FSL2|WMO
Jun 29 14:08:44 motherlode.ucar.edu snow(feed)[3002]: h_clnt_call: snow.cit.cornell.edu: BLKDATA: time elapsed  33.328610
Jun 29 14:12:02 motherlode.ucar.edu snow(feed)[3002]: h_clnt_call: snow.cit.cornell.edu: BLKDATA: time elapsed  30.584037
Jun 29 14:14:32 motherlode.ucar.edu snow(feed)[3002]: pq_sequence failed: I/O error (errno = 5)
Jun 29 14:14:32 motherlode.ucar.edu snow(feed)[3002]: Exiting

Jun 29 14:18:36 motherlode.ucar.edu snow[4038]: Connection from snow.cit.cornell.edu
Jun 29 14:18:36 motherlode.ucar.edu snow(feed)[4038]: Starting Up: 20010629140741.491 TS_ENDT {{FSL2,  ".*"},{WMO,  ".*"}}
Jun 29 14:18:37 motherlode.ucar.edu snow(feed)[4038]: topo:  snow.cit.cornell.edu FSL2|WMO
Jun 29 14:19:16 motherlode.ucar.edu snow(feed)[4038]: ZEGZ98 KRHA 291303 /mNWS_151: RPC: Server can't decode arguments (11)

After this, the "can't decode arguments" problem continues up through
about 17:50Z today.  After that there is just a RECLASS and a "time
elapsed" message, but otherwise it looks as though the problem has gone
away.
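
To measure how long each disconnect lasted, one can pair every
"Exiting" line with the next "Connection from" line for the same host.
The following is a rough sketch under stated assumptions, not an LDM
utility: the function names are made up for illustration, the host tag
defaults to "snow", the year has to be supplied because these log
lines do not carry one, and each log entry is expected on a single
unwrapped line.

```python
from datetime import datetime


def parse_stamp(line, year=2001):
    """Parse the leading 'Jun 29 03:17:06' stamp; the year is an assumption."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def disconnect_gaps(lines, host_tag="snow", year=2001):
    """Return (exit_time, reconnect_time, gap) for each Exiting/Connection pair."""
    gaps = []
    last_exit = None
    for line in lines:
        text = line.rstrip()
        if f" {host_tag}(feed)[" in text and text.endswith("]: Exiting"):
            # The feeder process for this host went away.
            last_exit = parse_stamp(text, year)
        elif (f" {host_tag}[" in text and "]: Connection from" in text
              and last_exit is not None):
            # The host reconnected; record how long it was gone.
            connected = parse_stamp(text, year)
            gaps.append((last_exit, connected, connected - last_exit))
            last_exit = None
    return gaps
```

Each gap comes back as a datetime.timedelta, so sorting the result by
its third element shows the longest outages first.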

Did something happen on your campus?

Anne
-- 
***************************************************
Anne Wilson                     UCAR Unidata Program            
address@hidden                 P.O. Box 3000
                                  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************