Robert -- Thanks for your specifications. We're planning an upgrade for our
EDEX server with a single 1.9TB SAS "Mixed Use" SSD, which uses MLC NAND flash
technology and has a reliability/endurance rating of 3 DWPD (Drive Writes Per
Day). Unfortunately, our server doesn't support an NVMe interface. The atop
utility reports that our current 10K SAS drive reaches 100% busy even when no
CAVE clients are running. The process with the highest disk-occupation percentage
varies among ldmd, java, and httpd (all mostly write operations). I'm hoping to
report back here with improved drive performance once the new SSD is in place!
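In case it helps anyone else isolate the writers, here's a rough way to rank
processes by bytes written over an interval (a minimal sketch, not anything
official; it assumes the third-party psutil package, and reading other users'
I/O counters may require root):

    import time
    import psutil  # third-party: pip install psutil

    INTERVAL = 10  # seconds between samples

    def write_bytes_by_pid():
        """Snapshot cumulative bytes written per process."""
        counts = {}
        for p in psutil.process_iter(['name']):
            try:
                counts[p.pid] = (p.info['name'], p.io_counters().write_bytes)
            except (psutil.AccessDenied, psutil.NoSuchProcess):
                pass  # skip processes we can't read or that exited
        return counts

    before = write_bytes_by_pid()
    time.sleep(INTERVAL)
    after = write_bytes_by_pid()

    # Report the ten biggest writers over the interval.
    deltas = [(wb - before[pid][1], name, pid)
              for pid, (name, wb) in after.items() if pid in before]
    for delta, name, pid in sorted(deltas, reverse=True)[:10]:
        print(f"{name} (pid {pid}): {delta / 1e6:.1f} MB written in {INTERVAL}s")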
Michael -- As a workaround while we wait for this new drive to arrive, we
attempted to have eight students simultaneously connect to
edex-cloud.unidata.ucar.edu for the first time today, April 24, 2019, between
about 14:05-14:20Z (I'd removed all of their ~/caveData directories prior to
today so everyone was starting from scratch). It took an excruciatingly long
time to get past the initial splash screen, on the order of several minutes.
CAVE eventually said it was "not responding", and asked if we wanted to force
quit or wait. Students ended up needing to work in two groups in order to use
CAVE successfully. Once CAVE actually launched, loading data was relatively
responsive. I didn't have an opportunity to try turning off data caching. It
may be helpful, when looking at log files on the cloud EDEX server, to know that
the hostnames of the computers the students used today begin with l-dl-asac315
followed immediately by two additional digits.
-Jason
_________________________________________
Jason N. T. Kaiser
Atmospheric Sciences Data Systems Administrator
Northern Vermont University-Lyndon
http://atmos.NorthernVermont.edu
-----Original Message-----
From: Haley, Robert E <haley787@xxxxxxxx>
Sent: Monday, April 8, 2019 11:55 AM
To: Kaiser, Jason N. <jason.kaiser@xxxxxxxxxxxxxxxxxxx>; Michael James
<mjames@xxxxxxxx>
Cc: awips2-users@xxxxxxxxxxxxxxxx
Subject: RE: [EXTERNAL] Re: [awips2-users] EDEX to CAVE latency with multiple
simultaneous users
Hello Jason,
Currently we're running two Samsung 970 EVO NVMe M.2 1TB solid state drives in
RAID 1 on the PCIe card. We wanted to try a mainstream, off-the-shelf drive to
see how it performed before looking into something fancier, and didn't really
compare the NAND architecture of different drives. The sales pitch for V-NAND
seemed good enough.
According to the RAID management utility, we're writing 1.38 terabytes of data
per day, and the 970 EVO has a rated write endurance of 600 terabytes. At that
rate (600 / 1.38, or roughly 435 days), we're expecting only about a year of
service before the drives have to be replaced.
At the very least I'd recommend the 970 Pro, which has twice the write
endurance (1,200 terabytes) for "only" a 50% price increase. At the enterprise
level there are "write intensive" SSDs with ten times the write endurance, but
they start at more than $4,000, which is a pretty tough pill to swallow. We
figure that replacing the drives every year or two, in addition to saving money,
gives us the opportunity to swap in higher-performance, higher-capacity options
as they come out, improving the capability of our AWIPS 2 server over time.
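For what it's worth, the replacement-interval math is just rated endurance
divided by measured daily writes (figures from above; since RAID 1 mirrors every
write, both drives wear at the same rate):

    # Rough drive lifespan: rated endurance (TBW) / measured daily writes.
    TB_PER_DAY = 1.38  # reported by the RAID management utility

    for drive, tbw in [("970 EVO 1TB", 600), ("970 Pro 1TB", 1200)]:
        days = tbw / TB_PER_DAY
        print(f"{drive}: {tbw} TBW / {TB_PER_DAY} TB/day "
              f"= ~{days:.0f} days (~{days / 365:.1f} years)")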
Robert Haley
Weather Systems Administrator
Applied Aviation Sciences, College of Aviation
600 S. Clyde Morris Blvd.
Daytona Beach, FL 32114
386.323.8033
haley787@xxxxxxxx
Embry-Riddle Aeronautical University
Florida | Arizona | Worldwide
-----Original Message-----
From: Kaiser, Jason N. <jason.kaiser@xxxxxxxxxxxxxxxxxxx>
Sent: Friday, April 5, 2019 2:50 PM
To: Haley, Robert E <haley787@xxxxxxxx>; Michael James <mjames@xxxxxxxx>
Cc: awips2-users@xxxxxxxxxxxxxxxx
Subject: RE: [EXTERNAL] Re: [awips2-users] EDEX to CAVE latency with multiple
simultaneous users
Robert,
Thank you for sharing your experience; that's very informative. Out of
curiosity, would you mind sharing the brand and model number of the NVMe SSDs
you use? With EDEX constantly performing significant amounts of disk writing,
I've read that an SSD's underlying NAND flash type may be an important
consideration when determining its long-term reliability/endurance (i.e., Drive
Writes Per Day).
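(For comparing spec sheets: a DWPD rating and a total-bytes-written rating are
interchangeable once you fix the capacity and warranty period. A quick sketch
with made-up example numbers; the 5-year default is an assumption to check
against the actual spec:)

    # Convert a DWPD rating to an equivalent TBW figure.
    # Example numbers only; warranty_years must come from the spec sheet.
    def dwpd_to_tbw(capacity_tb, dwpd, warranty_years=5):
        return capacity_tb * dwpd * 365 * warranty_years

    print(dwpd_to_tbw(1.0, 1))  # 1 DWPD on a 1 TB drive over 5 years -> 1825.0 TBW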
Jason N. T. Kaiser
Atmospheric Sciences Data Systems Administrator
Northern Vermont University-Lyndon
-----Original Message-----
From: Haley, Robert E <haley787@xxxxxxxx>
Sent: Thursday, April 4, 2019 1:46 PM
To: Kaiser, Jason N. <jason.kaiser@xxxxxxxxxxxxxxxxxxx>; Michael James
<mjames@xxxxxxxx>
Cc: awips2-users@xxxxxxxxxxxxxxxx
Subject: RE: [EXTERNAL] Re: [awips2-users] EDEX to CAVE latency with multiple
simultaneous users
Jason,
We experienced an issue similar to what you're describing, and for us the
culprit was insufficient disk I/O on the EDEX server, even with an array of
eight 10K RPM 12G SAS hard disks.
When a class started launching CAVE and loading data, not only would their
clients slow down (even menus would take time to populate), but we also saw
data-processing latency on EDEX start climbing until the class was done loading
their initial data sets. In a few cases the EDEX server could not catch up with
processing, and we had to stop the LDM to give EDEX a chance to clear the
backlog.
Monitoring with top, we saw I/O wait typically between 5% and 10%, with
instances as high as 20%.
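If anyone wants to log that figure over time instead of watching top, the same
number can be sampled from /proc/stat (a minimal sketch, Linux only; field
order per man 5 proc):

    import time

    def cpu_times():
        """Jiffy counters from the aggregate 'cpu' line of /proc/stat."""
        with open("/proc/stat") as f:
            return [int(x) for x in f.readline().split()[1:]]

    INTERVAL = 5  # seconds between samples
    t0 = cpu_times()
    time.sleep(INTERVAL)
    t1 = cpu_times()

    deltas = [b - a for a, b in zip(t0, t1)]
    iowait_pct = 100.0 * deltas[4] / sum(deltas)  # field 5 of the cpu line is iowait
    print(f"iowait: {iowait_pct:.1f}% over {INTERVAL}s")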
It's worth noting we originally had possibly the most inefficient disk setup
imaginable: EDEX was running on a VM with virtual storage, so the hypervisor was
dealing with two layers of file systems on top of a RAID 5 array. That was a lot
of extra work...
We replaced the hard disk array with a PCIe RAID card holding two NVMe SSDs,
attached the SSD array directly to the VM, and the difference was MIND BLOWING.
I/O wait stays below 1%, and the data-processing latency messages have
disappeared entirely, even when a class of 30 students is using CAVE. We even
saw CPU usage drop significantly, probably because very little time is now
wasted waiting on read/write operations.
Robert Haley
Weather Systems Administrator
Applied Aviation Sciences, College of Aviation
600 S. Clyde Morris Blvd.
Daytona Beach, FL 32114
386.323.8033
haley787@xxxxxxxx
Embry-Riddle Aeronautical University
Florida | Arizona | Worldwide
-----Original Message-----
From: awips2-users-bounces@xxxxxxxxxxxxxxxx
<awips2-users-bounces@xxxxxxxxxxxxxxxx> On Behalf Of Kaiser, Jason N.
Sent: Wednesday, April 3, 2019 12:36 PM
To: Michael James <mjames@xxxxxxxx>
Cc: awips2-users@xxxxxxxxxxxxxxxx
Subject: [EXTERNAL] Re: [awips2-users] EDEX to CAVE latency with multiple
simultaneous users
Hi Michael,
/awips2/cave/ is mounted locally on each machine's SSD. Only the home
directories are NFS-mounted. Multiple sessions of CAVE are run as different
users (i.e., students each log in to Linux with their own user account), so
you're correct: no two users should be reading/writing the same ~/caveData
directory at the same time. I will try turning off data caching and see if that
alleviates the problem.
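For anyone who wants to verify the same thing on their own lab machines, here's
a quick check of which filesystem type backs a path (a minimal sketch reading
/proc/mounts, Linux only; the paths at the bottom are just ours):

    import os

    def fs_type(path):
        """Return (mount point, fs type) backing path; longest match wins."""
        path = os.path.realpath(path)
        best, best_type = "", "unknown"
        with open("/proc/mounts") as f:
            for line in f:
                _, mount_point, fstype = line.split()[:3]
                if path == mount_point or path.startswith(mount_point.rstrip("/") + "/"):
                    if len(mount_point) > len(best):
                        best, best_type = mount_point, fstype
        return best, best_type

    for p in ["/awips2/cave", os.path.expanduser("~")]:
        mount, fstype = fs_type(p)
        print(f"{p}: mounted at {mount} ({fstype})")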
-Jason
From: Michael James <mjames@xxxxxxxx>
Sent: Wednesday, April 3, 2019 10:52 AM
To: Kaiser, Jason N. <jason.kaiser@xxxxxxxxxxxxxxxxxxx>
Cc: awips2-users@xxxxxxxxxxxxxxxx
Subject: Re: [awips2-users] EDEX to CAVE latency with multiple simultaneous
users
Hi Jason,
I don't believe that CAVE using an NFS-mounted user home directory should
result in the performance issues you are experiencing, but I wonder if multiple
users running the same CAVE executable over NFS could cause this... is that how
the application is being used? (meaning /awips2/cave/ is on an NFS mount and
each user is running the app from that mount?) In our classrooms we have seen
no issues with multiple CAVE clients connecting to a single server, and I have
not seen network latency caused by multiple clients connecting at the same time.
Can we confirm that the multiple sessions of CAVE are run as different users,
meaning no two users would be reading/writing the same ~/caveData directory at
the same time?
Perhaps turning off data caching (CAVE > Preferences > Cache) would reduce the
latency to an acceptable level?
_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are recorded in
the Unidata inquiry tracking system and made publicly available through the
web. Users who post to any of the lists we maintain are reminded to remove any
personal information that they do not want to be made public.
awips2-users mailing list
awips2-users@xxxxxxxxxxxxxxxx
For list information, to unsubscribe, or change your membership options, visit:
http://www.unidata.ucar.edu/mailing_lists/