Software pools supercomputing power
Scattered resources join to track storms
- By Doug Beizer
- Jun 08, 2006
The overwhelming devastation left by Hurricane Katrina is somewhat less shocking when taken within the context of overall storm statistics from 2005: The season logged a record 28 named storms, 15 of which reached hurricane level. Four of those rated Category 5, the strongest storms, with wind speeds exceeding 155 mph.
With the National Oceanic and Atmospheric Administration predicting that 2006 also will be "a very active hurricane season," it's more urgent than ever that supercomputers used to predict the paths of storms be ready for their task.
The Navy's Fleet Numerical Meteorology and Oceanography Center in Monterey, Calif., helps track storms.
"We're a weather and oceanography modeling organization within the Navy," said Jay Morford, an enterprise high-performance computing grid architect at Fleet Numerical.
The center provides data to Navy organizations, other Defense Department groups and some commercial entities. "For the hurricanes that hit the Gulf Coast last year, we were a big player in modeling those and sending out the hurricane tracks to the various Navy commands, and general commands and resources, along the Gulf Coast," Morford said. "The track itself is pretty simple; what goes into making that track can be very complex."
Making myriad calculations to model, in detail, the tracks of such storms requires the power and speed of high-performance computers. Using the same model, different levels of information are extracted for different users.
"What you'd see on the 5 o'clock news is condensed and translated into very simple terms," Morford said.
Looking to this year's hurricane season and anticipating there would be no money to buy new hardware for the computer center, officials at the Navy center began casting around for a way to run all necessary computer cycles under tight deadlines.
They found the answer in installing software that would let scientists in the Monterey facility access high-performance computers in the center's sister facility in Mississippi.
For the project, the Navy chose two products that Platform Computing Inc., Markham, Ontario, developed for high-performance computing. Platform LSF manages and accelerates batch workload processing for compute- and data-intensive workloads. Platform LSF MultiCluster lets the Monterey facility access the Mississippi location's computers.
"LSF is a middleware-like product that lets an end user submit a piece of work that requires multiple computers to perform," said David Tabor, an accounts manager for Platform. "The software schedules those computers, manages them, ensures that the work is done successfully and then returns the result to the end users."
If storm trackers in Monterey need to run a model, but the necessary computers aren't available there, the multicluster product lets them forward the job to computers in Mississippi.
The system itself parses the work, letting Mississippi computers that are allocated to the California center work as though they are physically a part of a California cluster.
The job can be split up, with some of the work run on California's machines and some on Mississippi's machines, depending on how the product is implemented, Tabor said.
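The forwarding and splitting behavior Tabor describes can be sketched roughly in Python. This is a minimal illustration, not Platform's implementation or its API: the `Cluster` class, the `place_job` function, the `allow_split` flag and the host counts are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """A hypothetical pool of compute hosts at one site."""
    name: str
    free_hosts: int


def place_job(hosts_needed, local, remote, allow_split=True):
    """Return a {cluster name: host count} placement, or None if the job must wait.

    Illustrative policy only (an assumption, not the vendor's logic):
    prefer the local cluster, then split across both sites if that is
    allowed, then forward the whole job to the remote site.
    """
    if hosts_needed <= local.free_hosts:
        return {local.name: hosts_needed}
    combined = local.free_hosts + remote.free_hosts
    if allow_split and local.free_hosts > 0 and hosts_needed <= combined:
        return {
            local.name: local.free_hosts,
            remote.name: hosts_needed - local.free_hosts,
        }
    if hosts_needed <= remote.free_hosts:
        return {remote.name: hosts_needed}
    return None


# Made-up capacities for the two sites.
monterey = Cluster("monterey", free_hosts=4)
mississippi = Cluster("mississippi", free_hosts=16)

print(place_job(2, monterey, mississippi))                      # fits locally
print(place_job(10, monterey, mississippi))                     # split across sites
print(place_job(10, monterey, mississippi, allow_split=False))  # forwarded whole
print(place_job(32, monterey, mississippi))                     # must wait
```

Whether a job is split or forwarded whole is a configuration choice in this sketch, mirroring Tabor's point that the behavior depends on how the product is implemented.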
The new system makes it fast and painless to do what once was slow and laborious.
Before the multicluster system was installed, a storm tracker who needed to run a model first had to check every computer in a cluster manually to see which ones were available, then select a subset of computers for the job.
"Now all he does is submit one long string of environmental information to our software," Tabor said. "Our software tracks each computer automatically, all that status information. We're constantly looking at and retrieving information from the individual computer logs."
The Platform software also helps with a common challenge in high-performance computing: monitoring the progress of jobs.
A job request typically will require several computers to handle the task. Monitoring all of those to ensure they are all performing presents a substantial challenge.
During the cycle, several things can go wrong: A computer may fail, or a network attached to a computer may fail. Quickly catching any error can save hours of computing time.
The center configured the Platform software to monitor the status of the environment. If something goes wrong, rules written by the center's IT staff tell the software what to do.
"You can either start the job all over again, or take that one component and send it to a different computer. It just depends on how the customer sets that up," Tabor said. "And it all works inside a secure environment, in the Navy's case."
The project's success could affect how the center will buy high-performance computing hardware, Morford said. With hardware set up in different time zones, work could be scheduled across centers during off-peak hours.
But for today, Morford said he and his colleagues are better prepared for the 2006 hurricane season, which began June 1.
"With the multicluster, we've been able to run additional models that we could not have run in California," he said.If you have an innovative solution that you installed in a government agency, contact Staff Writer Doug Beizer at firstname.lastname@example.org.