Report: Homeland security data mining generates concern

The National Security Agency is now sponsoring intelligence data mining with massive databases that are growing as fast as four petabytes per month, according to a new report published by the Congressional Research Service.

The report, "Data Mining and Homeland Security," written by Jeffrey Seifert, highlights the growing popularity of data mining and its benefits while also outlining its limitations and possible privacy and mission-creep concerns. It was posted online by the Federation of American Scientists.

Although data mining has become more common in recent years, its rapid expansion in homeland security raises concerns related to how it is implemented and monitored.

Data mining, which can help reveal patterns and relationships, requires skilled personnel to determine the value and significance of the information. What's more, data mining on its own does not show causal relationships. Issues of data quality, interoperability, mission creep and privacy also have been obstacles in applying data mining, the report said.

The National Security Agency's recently disclosed surveillance of alleged domestic terrorists has sparked privacy concerns in Congress, as have the former Terrorism Information Awareness project and the Homeland Security Department's Computer-Assisted Passenger Prescreening System II, both now discontinued. The Multi-State Anti-Terrorism Information Exchange, formerly operated by Florida and several other states, and the Defense Department's Able Danger project also have attracted attention.

Lesser-known data mining programs include the NSA's Novel Intelligence from Massive Data program, which is being developed through grants from its Advanced Research and Development Activity arm.

In the program's name, "massive data" refers to data that is especially challenging for common data analysis tools because of its unusually large size, such as a petabyte or greater, as well as to databases of great complexity and varied formats, such as those that include unstructured text, audio, video, graphs, diagrams, images, maps, equations, chemical formulas or tables.

"Some intelligence data sources grow at a rate of four petabytes per month now, and the rate of growth is increasing," the Congressional Research Service report states, quoting from the advanced research activity's former Web site. The huge expansion in electronic communications means that NSA requires more sophisticated data tools.

"Whereas NSA once predicted it was in danger of becoming proverbially deaf due to the spreading use of encrypted communications, it appears that NSA may now be at greater risk of being 'drowned' in information," the report states.

About the Author

Alice Lipowicz is a staff writer covering government 2.0, homeland security and other IT policies for Federal Computer Week.
