Tech success: Netcool walks the network beat
It used to be sufficient to manage networks ad hoc, using a stable of applications. But after the Sept. 11 terrorist attacks, that approach was no longer an option for the Air Force.
The service had to get a comprehensive view of its wired and wireless networks deployed over nine major commands across the globe. Commanders knew that their computer networks would play an important role in the war on terror.
"In the Defense Department, they're not going to get to network-centric warfare and assure that information gets to the right warfighter at the right time without proactively managing the whole information chain from beginning to end," said Rick Miller, general manager for the federal business of Micromuse Inc. of San Francisco.
To build that operational picture, the Air Force is using Micromuse's suite of Netcool solutions. EDS Corp. is the systems integrator on the project, which is part of the Combat Information Transport System, said Randy Warner, Netcop project manager for EDS. The Air Force and its contractors call the project Netcop, for network common operation picture, he said.
Achieving a common operational picture generally means having a real-time view of the networks' health. It's basically an umbrella management solution above specific and local management tools.
"It means they needed to get a consolidated operational management view of what is happening across their wired and wireless infrastructure," Miller said. "And they want to be notified as quickly as possible when something is either not available or performance is degraded."
Lt. Col. Terry Gold, an Air Force program manager, said achieving uniformity within the force's networks was a priority.
"We had network management and security systems with individual interfaces and individual consoles," Gold said. "That made it difficult to get an overall picture of what was going on in our network."
At Air Force bases, IT managers use a variety of tools to monitor local infrastructures. The information they glean is then forwarded to the Micromuse solution.
"Micromuse's suite of products is basically the engine into which everything else integrates," Warner said. "All those disparate systems, whether they're CiscoWorks, Remedy, Veritas or even a government homegrown solution, are integrated into this manager of managers."
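The "manager of managers" idea can be illustrated with a minimal Python sketch. It assumes each local tool emits records in its own format and that a small adapter per tool maps them into one shared event shape. The tool names (tool_a, tool_b), field names and severity codes are hypothetical and do not reflect Netcool's or any vendor's actual interfaces.

```python
from typing import Callable, Dict, List

CommonEvent = Dict[str, str]

def from_tool_a(raw: dict) -> CommonEvent:
    # Hypothetical tool A reports a hostname, a text level and a message.
    return {
        "source": "tool_a",
        "node": raw["hostname"],
        "severity": raw["level"].lower(),
        "summary": raw["msg"],
    }

def from_tool_b(raw: dict) -> CommonEvent:
    # Hypothetical tool B reports a numeric priority instead of a text level.
    severities = {1: "critical", 2: "major", 3: "minor"}
    return {
        "source": "tool_b",
        "node": raw["device"],
        "severity": severities.get(raw["priority"], "unknown"),
        "summary": raw["description"],
    }

# One adapter per local management tool (assumed registry).
ADAPTERS: Dict[str, Callable[[dict], CommonEvent]] = {
    "tool_a": from_tool_a,
    "tool_b": from_tool_b,
}

def consolidate(feeds: Dict[str, List[dict]]) -> List[CommonEvent]:
    """Merge records from every tool into one list of common events."""
    events: List[CommonEvent] = []
    for tool, records in feeds.items():
        adapter = ADAPTERS[tool]
        events.extend(adapter(record) for record in records)
    return events
```

With every event in the same shape, a single console can sort or filter across all sources, which is the consolidated view the project is after.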
For example, administrators at the Air Force's network operations security center for the Pacific region at Hawaii's Hickam Air Force Base can monitor all the bases under their charge. Using Micromuse, the operator sees a world map with Hawaii highlighted, Warner said. A mouse click on Hawaii highlights the bases.
From there, a navigation tree lets subject matter experts view areas of the network that are important to them.
"If I'm a security person, I click on that tab and get all the events that I would need to know about and act upon," Warner said.
If a network problem does arise, the Micromuse suite includes tools, called Virtual Operator, that automate fixes for some issues. Using pre-written scripts, Virtual Operator can address common events, such as a router port becoming unavailable. The first thing Virtual Operator might do in correcting the problem is ping the port.
"If the port failed due to an operating system glitch, there could be a reboot of the port or an operations command to clear the port," Miller said. "You also can configure the system, so if you did have a port failure, you could suppress the alarm until trouble-shooting steps have been completed."
By the time the alert reaches a network operator, he or she knows all those other remedies have been tried and the problem lies deeper. Alerts also can be customized, so they can be suppressed for certain events that have been deemed not urgent.
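The flow described above (try scripted fixes first, hold the alarm while they run, and escalate only if they fail) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Virtual Operator's actual scripting interface: the Event class, the handle function and the stub steps ping_port and clear_port are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Event:
    device: str
    kind: str                      # for example "port_down"
    urgent: bool = True            # non-urgent events are never escalated
    attempts: List[str] = field(default_factory=list)

# A remediation step returns True when it has fixed the problem.
Step = Callable[[Event], bool]

def handle(event: Event, steps: List[Step],
           notify: Callable[[Event], None]) -> str:
    """Run scripted fixes in order; alert the operator only if all fail."""
    if not event.urgent:
        return "suppressed"
    for step in steps:
        event.attempts.append(step.__name__)   # record what was tried
        if step(event):
            return "resolved"                  # alarm never reaches the operator
    notify(event)                              # every scripted remedy failed
    return "escalated"

def ping_port(event: Event) -> bool:
    # Stub: a real step would probe the port; here it always fails.
    return False

def clear_port(event: Event) -> bool:
    # Stub: pretend clearing the port works only on one device.
    return event.device == "router-a"
```

Because the event carries the list of attempted steps, an operator who does receive the alert can see which remedies have already been tried.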
When Micromuse is first installed, its auto-discovery tool finds all the network pieces. The process can take as little as a couple of hours or up to two to three days in a large network environment.
An applications-discovery tool performs a similar search on all the applications found on a network. "It discovers all the servers that are running applications, pulls down configuration information and comes back and does the analysis," Miller said. "It also draws an applications dependency map."
When outages occur, that map and data are used to help root out the problem's source.
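As a rough illustration of how a dependency map can narrow down the source of an outage, the Python sketch below treats a failing component as a likely root cause when none of the components it directly depends on is also failing. The function name, the map layout and the component names are hypothetical, and the article does not describe the product's actual analysis.

```python
from typing import Dict, List, Set

def root_causes(depends_on: Dict[str, List[str]], failing: Set[str]) -> Set[str]:
    """Return failing components whose direct dependencies are all healthy."""
    return {
        component
        for component in failing
        if not any(dep in failing for dep in depends_on.get(component, []))
    }

# Hypothetical dependency map: each key depends on the items in its list.
depends_on = {
    "portal": ["app-server"],
    "app-server": ["database", "switch-1"],
    "database": ["switch-1"],
}

# Three components report problems; the switch is healthy.
suspects = root_causes(depends_on, {"portal", "app-server", "database"})
```

Here the portal and the application server fail only because something beneath them failed, so the sketch points at the database. It checks direct dependencies only, so a gap in the reported failures would produce more than one suspect.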
Air Force officials are already reaping benefits from the project, Warner said.
"The information grid is becoming part of our weapon system of the future," Miller said. "The availability of this information and being able to manage when a source cannot get to the warfighter are critical. ... We need to know why the information didn't get through and fix the problem as quickly as possible."
If you have an innovative solution that you recently installed in a government agency, contact Staff Writer Doug Beizer at firstname.lastname@example.org.