New Urgency for Disaster Recovery Planning
"It's Really Possible to Have Large Structures Destroyed Without Any Warning"
Protect and Defend
- By James Schultz
- Oct 04, 2001
A rescuer takes a break from recovery efforts at the World Trade Center site in New York City on Sept. 13, two days after two hijacked planes crashed into the towers and destroyed them.
September's terrorist attacks on New York and Washington have forever changed the way organizations plan disaster recovery for their computer systems.
"People are aware that the ways they've traditionally been backing up data aren't sufficient. Last night's backup isn't good enough," said Christopher Midgley, chief technology officer at LiveVault Corp., an information technology backup and restoration company in Marlboro, Mass.
One of LiveVault's customers, a financial institution located near the World Trade Center, escaped physical destruction Sept. 11, but suffered potentially catastrophic damage to its computer systems. Immediately after the disaster, administrators attempted to restore internal networks, but found the loss of power corrupted mission-critical data. A systemwide restoration appeared the only option.
Fortunately, the institution had established an IT disaster recovery plan and backed up essential applications and data to a remote facility via high-speed Internet connections. Working from a satellite office outside of New York and using software provided by LiveVault, onsite specialists were able to work online to bring systems back to pre-attack condition in less than an hour.
Businesses and government have always acknowledged the need to guard against information loss, Midgley said, but the assaults in Manhattan and Washington are now focusing unprecedented attention on IT disaster recovery.
"The ideal is to have data current, up to the minute, so that if a disaster does occur, restoration is relatively immediate," he said.
IT disaster recovery planning began in earnest in the 1990s, as systems engineers anticipated the transition to and through the year 2000. Architectures were reviewed, software code modified, legacy systems examined and potential incompatibilities rectified.
Even as the transition was successfully implemented, however, experts knew full well Y2K protections were but a first step. In a society increasingly dependent on automation and new generations of computing devices, the threat of damage to essential systems would have to be anticipated and thwarted.
"What happened Sept. 11 was devastating," said Terry Rice, principal security engineer for information assurance for CACI International Inc., an IT services and products company in Arlington, Va. "We need to re-evaluate everything we've been doing. Organizations need to look closely at the risks they face and develop strategies to mitigate those risks across the entire information-assurance domain."
The west face of the Pentagon, shown Sept. 14, sustained heavy damage when a hijacked plane slammed into it Sept. 11.
Before Sept. 11, federal and state governments were developing, revising or implementing IT disaster-recovery plans. Private-sector vendors were beginning to pitch online data-protection services to government, even as federal and state agencies moved forward with plans for their own internal, duplicate sites.
Both of those trends are now likely to accelerate, experts said, in particular because the utter devastation of the World Trade Center has demonstrated that, absent planning and redundancy, a concentrated attack can annihilate IT capabilities for any organization or company.
"The last few weeks have really put a new perspective on disaster recovery," said Norm Linker, head of the Disaster Recovery Project for the New Jersey Office of Information Technology. "You can be attacked with no warning whatsoever. It underscores how vulnerable we all can be."
A key concern is having the requisite physical space and computing capability to continue operations even after partial or complete destruction of a primary data-handling facility. Disaster plans must include provisions for in-depth resiliency, from computational capacity to high-speed telecommunications links. Staff must also be trained and drills conducted to confirm that backup equipment at offsite locations is up to the task.
"After this [attack], you'd better exercise your [disaster-recovery] plan," said Tim Atkin, vice president of critical infrastructure programs for SRA International Inc. of Fairfax, Va., an IT-services provider. "You must test your ability to get to your backup systems and your data after a simulated disaster. Disaster recovery planning is not only resuming IT operations, but it's also personnel and physical location. Can you move your people to your backup site, and is your IT infrastructure such that you can continue mission-critical operations there?"
SRA has about 24 people at the Pentagon, Atkin said. Some are helping the Defense Department with data recovery, while others are helping maintain data and communications flow. Atkin said specialists are attempting to understand how the Washington attack may have affected the Pentagon's IT interoperability and interdependency.
Nevertheless, he said, the Defense Department has robust disaster recovery programs in place, part of which is post-event assessment. Atkin expects that within the next 60 days lessons learned will become lessons applied, and that they are likely to be adopted by all government agencies.
"In the aftermath of Sept. 11, agencies are taking a comprehensive look at security. IT protection should intensify," he said. "In the past, people have stovepiped those protections in different parts of the IT organization. That won't work. You have to integrate your physical, personnel and IT security within an overall framework."
Before the terrorist attacks, New Jersey's Office of Information Technology was already planning to ramp up its own disaster recovery efforts. The office is seeking funding from New Jersey's governor and legislature for a state-owned, state-run backup facility that would be physically separate from its primary site in Trenton. Although tied into local power grids and telephone systems, the backup location would have its own internal power-generating capabilities in the form of diesel generators and batteries.
New Jersey's disaster recovery plan also provides for installing high-availability servers that can immediately switch to the backup network in the event of disaster. Compatible servers with concurrent updating will also be used to ensure little or no loss of critical data or applications.
In addition, the IT office is making provisions for data protection for information stored on the state's mainframes, which will be electronically vaulted to the backup site. Depending on funding availability, the IT office hopes to complete the project within 18 months.
"This is intended to protect the state's [IT] infrastructure. After the events of September 11, we believe we've made the right decision," said Kathy Krepcio, information technology office chief of staff. "This has happened very close to home. It has shaken us all."
For planners in government and business looking to safeguard critical information, the future of disaster recovery may be remote. That is, backups will be more up-to-the-minute, made online, and stored in physically separate locations at heavily protected, secure facilities, such as those run by LiveVault partner Iron Mountain of Boston.
Networking between critical sites should also increase. Experts such as Robert Manchise, chief scientist for Anteon Corp. of Fairfax, Va., predict the maturation of layered telecomputing systems that include applications servers connected to Web servers connected to database servers. He believes the only sure way to protect critical IT is to ensure deep redundancy.
"After [Sept. 11], people realize it is really possible to have large structures destroyed without any warning," Manchise said. "Most people are reconsidering whether it's enough to make weekly or biweekly system backups and then move it offsite. Maybe there needs to be hot backups: streaming data constantly sent to offsite, secure areas. It's costly to implement, but so is data loss."
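The trade-off Manchise describes comes down to simple arithmetic: whatever was created after the last completed backup is lost when the primary site fails. The sketch below compares that loss window for weekly, nightly and streaming backups. The failure time, the backup times and the function name `data_loss_window` are hypothetical values chosen for illustration, not figures from any organization quoted here.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Return how much work is lost: everything after the last completed backup."""
    if failure < last_backup:
        raise ValueError("failure time precedes the last backup")
    return failure - last_backup


# Hypothetical timeline: the primary site fails at 9 a.m. on a Tuesday.
failure = datetime(2001, 10, 2, 9, 0)

# Hypothetical time of the last completed backup under each schedule.
last_backups = {
    "weekly": datetime(2001, 9, 28, 23, 0),    # previous Friday night
    "nightly": datetime(2001, 10, 1, 23, 0),   # previous night
    "streaming": datetime(2001, 10, 2, 8, 59), # one minute before failure
}

losses = {name: data_loss_window(t, failure) for name, t in last_backups.items()}

for name, lost in losses.items():
    print(f"{name:>9}: {lost} of data lost")
```

Under these assumed times, the weekly schedule loses more than three days of work, the nightly schedule loses 10 hours and the streaming approach loses about a minute, which is the gap that has to be weighed against the cost of implementation.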
As for cost, Ron Salluzzo, a senior vice president for the state and local practice at KPMG Consulting Inc., McLean, Va., said a good disaster recovery plan shouldn't add much to the cost of deploying any system.
KPMG advocates more efficient resource management by identifying existing computer systems capable of assuming the duties of those which may be impaired or destroyed. The key to this approach, Salluzzo said, is classifying an organization's IT functions into mission-critical and nonmission-critical. In an emergency, the noncritical operations can be put on hold while the essential ones are carried out.
"If you think it through -- think about how many mainframe and servers you have, have a consistent technology platform, have pretty good standards in terms of how things go off -- then you have a likelihood of managing problems within your own organization. And that's the cheapest option," Salluzzo said, adding that KPMG advises clients to "foresee situations where they could not see access to a building for a month," and plan accordingly.
"We put a lot of attention into putting a system up, but we're also very concerned with the infrastructure behind it," he said. "Certainly a disaster recovery plan is one of the things we work on, as we do on security and privacy, how front-end systems connect to back-end systems. At the end of the day, being positioned for the unexpected has to be part of everyone's business strategy."
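The triage Salluzzo describes, classifying IT functions as mission-critical or not and putting the rest on hold in an emergency, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `load` figures and the idea of measuring surviving systems in abstract "capacity units" are all hypothetical and do not represent KPMG's method.

```python
from dataclasses import dataclass


@dataclass
class ITFunction:
    name: str
    mission_critical: bool
    load: int  # hypothetical capacity units the function needs


def triage(functions, surviving_capacity):
    """Split functions into (running, on_hold) for the capacity that survives.

    Mission-critical functions are placed first; noncritical ones run only
    if capacity is left over.
    """
    running, on_hold = [], []
    remaining = surviving_capacity
    # sorted() is stable, so critical functions come first and the
    # original order is kept within each group.
    for fn in sorted(functions, key=lambda f: not f.mission_critical):
        if fn.load <= remaining:
            running.append(fn.name)
            remaining -= fn.load
        else:
            on_hold.append(fn.name)
    return running, on_hold


# Hypothetical inventory and surviving capacity after losing a primary site.
inventory = [
    ITFunction("intranet news", mission_critical=False, load=2),
    ITFunction("payroll", mission_critical=True, load=4),
    ITFunction("archive reports", mission_critical=False, load=5),
    ITFunction("emergency dispatch", mission_critical=True, load=3),
]

running, on_hold = triage(inventory, surviving_capacity=8)
print("running:", running)
print("on hold:", on_hold)
```

With eight units of surviving capacity, the two critical functions are carried out and the two noncritical ones wait, which mirrors the "essential first" ordering in the text.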
Whether agencies opt for "hot" or even "warm" backups, or secondary sites, the terrorist attacks have guaranteed that IT assurance strategies will never again be the same.
In and out of government, disaster recovery has assumed an urgency unprecedented in an information age that suddenly seems to have transitioned from a promising adolescence to a more troubled young adulthood.
"This is a wakeup call," said New Jersey's Krepcio. "We're dealing with scenarios we've never had to deal with before. We're meeting every day to discuss those issues and take actions that have a timetable. You never know when another attack will occur. We don't want to be caught in complacency."