Data mining gets a makeover
Call it fusion as new tools expand hunt for terrorists
- By Alice Lipowicz
- Sep 15, 2006
New software is making it easier for police at a recently opened intelligence fusion center in Los Angeles to investigate terrorist networks. It could be a success for counter-terrorism, information-sharing and data mining on the local level, though similar efforts have had a rocky path on the federal level.
The Los Angeles Joint Regional Intelligence Center opened in July with new software that lets officers do much more powerful searches, looking across multiple databases in a region for criminals and terrorists.
For example, an officer can run one search to get all reports of suspicious activity at public water plants in all Los Angeles County cities and towns, and all recent arrests and traffic citations in those jurisdictions for people named in those reports.
Previously, it would have required multiple searches to get the same information.
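To make the single-search idea concrete, here is a minimal Python sketch of one query fanning out across several record sets. It is an illustration only and does not represent how the Memex software works; the `Record` type, the sample databases and the `federated_search` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str  # which (hypothetical) database the record came from
    person: str  # name attached to the record
    detail: str  # free-text description


# Hypothetical stand-ins for databases that once needed separate logins.
SUSPICIOUS_ACTIVITY = [
    Record("suspicious_activity", "J. Doe", "photographing water plant intake"),
    Record("suspicious_activity", "R. Roe", "loitering near reservoir gate"),
]
ARRESTS = [
    Record("arrests", "J. Doe", "trespassing"),
    Record("arrests", "A. Smith", "burglary"),
]
CITATIONS = [
    Record("citations", "R. Roe", "expired registration"),
]


def federated_search(keyword, report_db, linked_dbs):
    """Find reports matching a keyword, then pull records in the other
    databases for every person named in those reports."""
    reports = [r for r in report_db if keyword.lower() in r.detail.lower()]
    names = {r.person for r in reports}
    linked = [r for db in linked_dbs for r in db if r.person in names]
    return reports, linked


reports, linked = federated_search(
    "water plant", SUSPICIOUS_ACTIVITY, [ARRESTS, CITATIONS]
)
for record in reports + linked:
    print(record.source, record.person, record.detail)
```

The point of the sketch is that the second lookup is driven by the names returned from the first, which is the step an officer previously had to repeat by hand in each system.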
"We can now do what everyone calls 'connecting the dots,' " said Lt. Robert Fox, co-program manager for the intelligence center, a joint project of the city police, county sheriff, FBI and Homeland Security Department. The new analysis tools, developed by Memex Inc., could reveal a pattern that would require further investigation, he said.
Los Angeles is not alone in relying on newly developed data-mining tools to bolster intelligence operations. Other state and city fusion centers are installing similar software.
There is a caveat, however: the software often is not officially termed "data mining" because of privacy controversies associated with the label.
"Data mining is politically volatile," said Drew Ladner, a member of the Markle Foundation Task Force on National Security in the Information Age, and general manager of the JBoss division of Red Hat Inc., an open-source software company.
Nonetheless, state and local centers are enhancing their intelligence fusion capabilities apparently without encountering the red tape, turf battles and privacy concerns that have caused similar federal programs to falter.
Worthy goal
Since the Sept. 11, 2001, terrorist attacks, the intelligence community's ability to connect the dots against terrorism has been viewed as a valuable goal. Failure to gather and fuse intelligence from disparate sources has been cited repeatedly as one of the shortcomings of pre-9/11 intelligence.
To correct those problems, the Homeland Security Department, National Security Agency and other agencies have initiated information-sharing and data mining for counter-terrorism.
However, NSA's National Information-Sharing Environment has yet to begin operations, and the Homeland Security Information Network for sharing data nationwide was criticized by the DHS inspector general as ineffective and underused.
Data mining describes computer searches for words, numbers or other data, and links between such elements across several large databases. A distinction is made between searches originated by a human's inquiry into a system, and those originating from machine intelligence and from analytic, predictive programs. Law enforcement programs are the first type, while the second type is being used in commercial applications, such as identity theft prevention and marketing.
Several federal data mining projects have stumbled over privacy concerns. The Defense Department's Total Information Awareness and DHS' Computer-Assisted Passenger Prescreening System II were both discontinued. The Multi-State Anti-Terrorism Information Exchange (Matrix), a high-profile data-mining program based on Seisint Inc. technology and initially run by 13 states, saw its membership dwindle and its DHS funding disappear.
States pick up slack
Nonetheless, information-sharing and data mining remain key activities at a growing number of state and local intelligence fusion centers. Forty-two states have either set up such centers or plan to do so, said Charles Allen, chief intelligence officer for DHS, at a Sept. 7 House hearing.
By October 2007, DHS intends to deploy federal intelligence officers to 18 state fusion centers to enhance collaboration, he said.
Several fusion centers recently have begun using software that can search separate, often large databases to identify, link, mine and share information about suspected criminals and terrorists.
At the Los Angeles intelligence center, for example, software developed by Memex, whose parent company is Memex Technology Ltd. of Kilbride, Scotland, enables both information-sharing and data mining. It makes databases searchable, regardless of size and structure. It also ensures compliance with all laws and requirements pertaining to privacy, disclosure and data maintenance, said John McCarthy, Memex's director of law enforcement solutions.
"It is working really well," the intelligence center's Fox said of the Memex software. "We've had access to these data sources before, but you had to log in separately for each one. Memex makes it easier, faster and eliminates some of the human error."
At the Maryland Coordination and Analysis Center, officials this year deployed information-sharing and searching software known as the Digital Information Gateway (Dig) from Visual Analytics Inc. of Poolesville, Md., said Jim Pettit, a spokesman for the state's homeland security division.
"It is a tool in our belt to help connect the dots," Pettit said. "The goal is to use software to help connect disparate databases with one common user interface."
The Florida Law Enforcement Department is using the Facts (Factual Analysis Criminal Threat) data discovery tool developed by Seisint, acquired in 2004 by LexisNexis U.S., a subsidiary of Reed Elsevier Group PLC. Facts software offers capabilities similar to those of Matrix, said Mark Zadra, chief of investigations for the department.
Using both public and commercial databases, the Facts software lets an investigator quickly generate leads. For example, if a white van is spotted in connection with a kidnapping attempt, the system can quickly search for all owners of white vans within a 25-mile radius and can assemble driver's license photographs of those owners to be shown to a witness, Zadra said.
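A query of the kind Zadra describes can be sketched in a few lines of Python. This is a hypothetical illustration, not the Facts software: the vehicle records, coordinates and function names are invented, and distance is computed with the standard haversine formula.

```python
import math


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    earth_radius_miles = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_miles * math.asin(math.sqrt(a))


# Hypothetical registration records; owners and locations are made up.
VEHICLES = [
    {"owner": "Owner A", "color": "white", "body": "van", "lat": 28.54, "lon": -81.38},
    {"owner": "Owner B", "color": "white", "body": "van", "lat": 30.33, "lon": -81.66},
    {"owner": "Owner C", "color": "blue", "body": "van", "lat": 28.55, "lon": -81.37},
    {"owner": "Owner D", "color": "white", "body": "sedan", "lat": 28.54, "lon": -81.39},
    {"owner": "Owner E", "color": "white", "body": "van", "lat": 28.60, "lon": -81.20},
]


def find_owners(records, color, body, lat, lon, radius_miles):
    """Return owners of matching vehicles registered within the radius."""
    return [
        v["owner"]
        for v in records
        if v["color"] == color
        and v["body"] == body
        and haversine_miles(lat, lon, v["lat"], v["lon"]) <= radius_miles
    ]


# The search responds only to the investigator's query: white vans
# registered within 25 miles of the (hypothetical) sighting location.
leads = find_owners(VEHICLES, "white", "van", 28.54, -81.38, 25)
print(leads)
```

The search returns a list of leads for a human to follow up, such as pulling photographs for a witness. It does not rank or predict anything on its own.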
Matrix was criticized for its intrusiveness, but according to Zadra, the LexisNexis software is simply a tool for investigation, not a basis for an arrest.
"A lot of people view data mining negatively," Zadra said. "They think you can go have the computer run and predict and look for patterns, and tell me the top 10 terrorists in the state. Facts has no capability to do that. It only responds to a query. We call it data discovery." LexisNexis officials declined to respond to a request for comment.
Use of data mining against terrorists theoretically could harness the same predictive tools that are being used to prevent identity theft and fraud. The analytics programs cull through millions of pieces of data to find patterns suggesting theft, such as a person who applies for credit cards using multiple addresses and multiple Social Security numbers. The software then generates searches, in real time, for additional examples of those patterns.
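The credit card pattern mentioned above can be expressed as a simple rule, shown here as a Python sketch. It is an assumption-laden toy, not ID Analytics' method or any vendor's product: the application records, the threshold and the `flag_suspicious` function are hypothetical, and commercial systems would use far richer statistical models.

```python
from collections import defaultdict

# Hypothetical credit card applications; all values are invented.
APPLICATIONS = [
    {"name": "Pat Example", "address": "1 First St", "ssn": "000-00-0001"},
    {"name": "Pat Example", "address": "2 Second St", "ssn": "000-00-0002"},
    {"name": "Lee Sample", "address": "9 Ninth Ave", "ssn": "000-00-0003"},
    {"name": "Lee Sample", "address": "9 Ninth Ave", "ssn": "000-00-0003"},
]


def flag_suspicious(applications, min_distinct=2):
    """Flag applicants who used several addresses AND several SSNs."""
    addresses = defaultdict(set)
    ssns = defaultdict(set)
    for app in applications:
        addresses[app["name"]].add(app["address"])
        ssns[app["name"]].add(app["ssn"])
    return sorted(
        name
        for name in addresses
        if len(addresses[name]) >= min_distinct
        and len(ssns[name]) >= min_distinct
    )


flagged = flag_suspicious(APPLICATIONS)
print(flagged)
```

Unlike a query typed by an investigator, a rule like this scans every record without anyone asking about a specific person, which is the distinction at the center of the privacy debate.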
Those same techniques could be used for finding terrorists, said Xuhui Shao, vice president of analytics at ID Analytics Inc. of San Diego. "We've seen increased awareness within DHS of how advanced analytics can help with homeland security," Shao said.
But it might be better to delay new data mining programs until regulations are written to address privacy and civil liberties concerns, cautioned the Markle Foundation's Ladner.
"There is going to be greater interest and pressure to use and mine data," he said. "The challenge, and the way forward, is to ensure that the regulatory framework develops more rapidly than implementation of the technologies."
Staff Writer Alice Lipowicz can be reached at firstname.lastname@example.org.
Alice Lipowicz is a staff writer covering government 2.0, homeland security and other IT policies for Federal Computer Week.