A closer look at search engines
- By Doug Beizer
- Jul 01, 2005
For many tasks, the Google method of Web search,
organizing results by relevancy, is the way to go. But when it came to
managing $3 billion worth of research funds at the Office of AIDS Research at
the National Institutes of Health, a more sophisticated search solution was
needed.
With that in mind, the AIDS Research Office, as well as
other institutes and offices at NIH, is using search technology developed for
commercial industry to manage scientific programs.
Used for Web-based Yellow Pages and at business information
provider Dun & Bradstreet Corp., i411 Inc.'s information search and
discovery technology differs from some search engines in that it organizes
results by facets and by categories within those facets, said Amin Hassan, vice
president of government strategies and solutions at i411 of Herndon, Va.
For example, "location" could be a facet, and
categories under that could be states or cities.
"Facets give you wide views, whereas categories and
subcategories are more detailed within a facet," Hassan said.
"Google does a fantastic job at bringing back results.
But when the results come back, it is a massive, long list, and you may not be
able to find the results you're looking for."
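The facet-and-category organization Hassan describes can be pictured with a minimal Python sketch. It is not i411's implementation; the sample records, the "industry" facet and the function names facet_counts and narrow are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical search results; each record carries a value for every facet.
results = [
    {"name": "Acme Plumbing", "location": "Virginia", "industry": "Plumbing"},
    {"name": "Best Pipes", "location": "Virginia", "industry": "Plumbing"},
    {"name": "City Electric", "location": "Maryland", "industry": "Electrical"},
]

def facet_counts(records, facets):
    """Count how many records fall into each category of each facet."""
    counts = defaultdict(Counter)
    for record in records:
        for facet in facets:
            if facet in record:
                counts[facet][record[facet]] += 1
    return {facet: dict(c) for facet, c in counts.items()}

def narrow(records, facet, category):
    """Keep only the records that belong to one category of a facet."""
    return [r for r in records if r.get(facet) == category]

# Wide view: every facet with its categories and hit counts.
summary = facet_counts(results, ["location", "industry"])

# Detailed view: drill into one category instead of scanning a long list.
virginia = narrow(results, "location", "Virginia")
```

Here summary gives the wide view across facets, while narrow plays the role of choosing one category within a facet.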
At NIH, i411's technology has been integrated into a
solution used to monitor grants, said Ken Fang, chief executive officer of Altum
Inc., the Reston, Va., software developer coordinating the project.
NIH officials turned to Altum and i411 to get the tools to
better manage several of the organizations' grant portfolios. Last year, the
federal budget included nearly $400 billion for grants, Fang said.
Grant money that makes it to NIH is distributed as
individual grants to schools and
organizations to study diseases such as breast cancer and Alzheimer's disease.
"They need tools to be able to figure out what's the
latest going on with a particular research program," Fang said.
The search and discovery engine helps grant managers decide
how much certain programs should be funded in the future, and if there are any
gaps in how money is being allocated.
"It is like how a mutual-fund manager would manage their
investment portfolio," Fang said. "It gives directors better access to the
information they need."
In the Office of AIDS Research, for example, there are
about 7,000 active projects and 1,500 grantees, including
Merck & Co. Inc. Without the i411/Altum tool, monitoring all those
programs was a challenge.
"Before we came in, grant managers would have to ask each
of these institutes and centers for listings of all the projects," Fang said.
"They would collect them manually, via Excel spreadsheets and that sort of
thing. Basically, that's how they responded to requests for information or to
do their program analysis."
The i411 technology can work with any number of databases of any type,
structured or unstructured, Hassan said.
"What we do then is drive connectors into these databases
without really intruding into them, and create a common index," he said. "It
is through that index that our search technology comes in to search the
Because i411 relies on the database index, a systems
integrator has to perform front-end work to use the technology. The amount of
work depends on the number of databases, the format they are in and
whether they are classified, Hassan said.
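As a rough picture of a common index built over several databases, the Python sketch below reads from two in-memory sources without modifying them and merges their text into one searchable index. It is not how i411's connectors work; the sources, field names and the functions connector and build_index are hypothetical.

```python
from collections import defaultdict

# Hypothetical sources standing in for two separate databases.
grants_db = [{"id": 1, "title": "Vaccine trial funding"}]
projects_db = [{"id": 7, "summary": "Vaccine research in clinics"}]

def connector(source_name, rows, text_field):
    """Read rows from one source without modifying it; yield (key, text)."""
    for row in rows:
        yield (source_name, row["id"]), row[text_field]

def build_index(connectors):
    """Map each lowercase word to the set of record keys that contain it."""
    index = defaultdict(set)
    for conn in connectors:
        for key, text in conn:
            for word in text.lower().split():
                index[word].add(key)
    return index

index = build_index([
    connector("grants", grants_db, "title"),
    connector("projects", projects_db, "summary"),
])

# A single lookup now spans both sources.
hits = sorted(index["vaccine"])
```

The front-end work in this toy version is writing one connector call per source, which is why the effort grows with the number and format of the databases.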
After the index is created, it has to be updated regularly
for i411 to work properly. How often the index is updated depends on how fast
the data changes.
For Yellow Pages listings, for instance, updates are made
twice a day. "That is because a 14 million-record database changes quite
fast," Hassan said. "By contrast, the Dun & Bradstreet data volatility
isn't much, so we get those updates every two weeks."
In addition to incremental index updates, databases also
sometimes need to be updated from top to bottom, Hassan said.
"It really depends on the customer's requirements, the
size of their databases and the databases' volatility," he said.
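The difference between an incremental update and a top-to-bottom rebuild can be sketched in Python as follows. This is an illustration under assumed data, not the vendor's update process; the record contents and the functions full_rebuild and incremental_update are hypothetical.

```python
def full_rebuild(records):
    """Rebuild the whole index from scratch: word -> set of record ids."""
    index = {}
    for rec_id, text in records.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(rec_id)
    return index

def incremental_update(index, rec_id, old_text, new_text):
    """Re-index one changed record in place, leaving the rest untouched."""
    # Remove the record from every word of its old text.
    for word in (old_text or "").lower().split():
        ids = index.get(word)
        if ids is not None:
            ids.discard(rec_id)
            if not ids:
                del index[word]
    # Add the record under every word of its new text.
    for word in (new_text or "").lower().split():
        index.setdefault(word, set()).add(rec_id)

records = {1: "plumber herndon", 2: "electrician reston"}
index = full_rebuild(records)

# One listing changes; only that record is re-indexed.
incremental_update(index, 1, records[1], "plumber reston")
records[1] = "plumber reston"
```

After the incremental update, the index matches what a full rebuild of the changed records would produce, but only one record was touched. The more volatile the data, the more often such updates would need to run.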
The data in the NIH grants is fairly stable, so the updates
do not have to be made daily, Hassan said.
Under a General Services Administration schedule, there are
two types of licenses for i411: one for an internal enterprise and another for
public information. Regardless of the license, the product uses a Web-based
The tool also is used for preparing for conferences and
seminars, Fang said. If researchers are planning a conference that correlates
condom use and the spread of AIDS,
the Altum tool can be used to collect the relevant data. The process used to
take weeks and could easily overlook important data.
"Instead of taking months to find out information, it is
now at their fingertips, literally within seconds," Fang said. "It increases
productivity and lets them better maximize the public research dollars."
Doug Beizer is a staff writer for Washington Technology.