The name game

Carolyn B. Tilley, head of the MEDLARS Management Section at the National Library of Medicine, says many hospitals, health care billing systems, libraries and databases use vocabularies and concepts from the Unified Medical Language System.

For 16 years, the National Library of Medicine has been trying to solve the biggest interface problem between computers and medicine: vocabulary.

A computer doesn't necessarily know whether a heart attack is the same thing as a myocardial infarction. Does "cold" mean the common cold or chronic obstructive lung disease, sometimes abbreviated COLD?

Although NLM's Unified Medical Language System has long operated under the public radar, the connections it makes are helping workers at all health care levels, said Carolyn Tilley, head of NLM's Medical Literature Analysis and Retrieval System Management Section.

The Unified Medical Language System got its start in 1986, when NLM director Donald Lindberg asked Congress for funds because the vocabulary problem was hindering use of computers in medicine.

The same vocabulary roadblock exists in areas besides medicine, Tilley said. There are often different names for the same thing: for example, TV and television. In medicine, hypertension is the same thing as high blood pressure.

Conversely, the same word or phrase can mean vastly different things. An internist could use the word ventilation to mean respiration, as in breathing, while an occupational-health specialist might use ventilation to mean the flow of fresh air within a building.

Human minds can grasp the different connotations, but disparate computer systems and databases need help to decipher the linguistic nuances.

In the first few years of the unified-language project, NLM hired a number of universities and other organizations, through task-order research contracts, to define possible solutions and then build them.

Apelon Inc. of Ridgefield, Conn., formerly Lexical Technologies Inc. of Alameda, Calif., was one of the original grantees and recently signed a five-year, $14.9 million contract to continue working on the unified medical language.

The first edition of the Metathesaurus had just eight vocabularies. It now has 95, of which 65 are in English. Within those vocabularies are 871,584 concepts and 2.1 million concept names.

All synonyms for a concept map to a unique identifying number. For each concept, the researchers chose one word or phrase as the primary concept name: for example, hypertension. All other synonyms, such as high blood pressure, map to the primary concept name.
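The mapping can be pictured as a small lookup table. The Python sketch below is only an illustration under stated assumptions, not the Metathesaurus file format: the identifiers `C0001` and `C0002` and the names `build_index` and `lookup_concept` are invented for the example.

```python
# Hypothetical concept records: identifier -> (primary name, synonyms).
# The identifiers are invented for illustration, not real Metathesaurus IDs.
CONCEPTS = {
    "C0001": ("hypertension", ["high blood pressure"]),
    "C0002": ("myocardial infarction", ["heart attack"]),
}


def build_index(concepts):
    """Map every name, primary or synonym, to its concept identifier."""
    index = {}
    for concept_id, (primary, synonyms) in concepts.items():
        for name in [primary, *synonyms]:
            index[name.lower()] = concept_id
    return index


INDEX = build_index(CONCEPTS)


def lookup_concept(term):
    """Return (identifier, primary name) for a term, or None if unknown."""
    concept_id = INDEX.get(term.strip().lower())
    if concept_id is None:
        return None
    return concept_id, CONCEPTS[concept_id][0]


print(lookup_concept("High blood pressure"))  # ('C0001', 'hypertension')
print(lookup_concept("heart attack"))         # ('C0002', 'myocardial infarction')
```

Because every name resolves to an identifier first, two systems that store different names for the same condition can still be matched on the identifier.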

Today the Metathesaurus includes the American Medical Association's Current Procedural Terminology, commonly used in medical billing; the College of American Pathologists' Systematized Nomenclature of Medicine; and the vocabulary of the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders.

Other Metathesaurus vocabularies include the National Drug File from the Veterans Health Administration, the National Cancer Institute Thesaurus, the University of Washington's Digital Anatomist, Britain's Read Clinical Classification and NLM's own Medical Subject Headings.

Besides the Metathesaurus, the Unified Medical Language System has two other so-called knowledge sources: the Semantic Network and the Specialist Lexicon.

The Semantic Network consists of 134 high-level terms for grouping concepts, Tilley said. The semantic types are organized in a parent-and-child hierarchy. For example, the term "biologic function" has two children, physiologic function and pathologic function, plus numerous grandchildren.
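A parent-and-child hierarchy of this kind can be sketched as a child-to-parent table. The Python below is a rough illustration, not NLM's data format: only the "biologic function" branch comes from Tilley's example, the grandchild "disease or syndrome" is added here as an assumption, and the function names are hypothetical.

```python
# Hypothetical child -> parent table for a few semantic types.
# Only the "biologic function" branch comes from the article; the
# grandchild entry is an illustrative assumption.
PARENT = {
    "physiologic function": "biologic function",
    "pathologic function": "biologic function",
    "disease or syndrome": "pathologic function",
}


def children(semantic_type):
    """Return the direct children of a semantic type, alphabetically."""
    return sorted(c for c, p in PARENT.items() if p == semantic_type)


def ancestors(semantic_type):
    """Return the ancestors of a semantic type, nearest parent first."""
    chain = []
    while semantic_type in PARENT:
        semantic_type = PARENT[semantic_type]
        chain.append(semantic_type)
    return chain


print(children("biologic function"))
# ['pathologic function', 'physiologic function']
print(ancestors("disease or syndrome"))
# ['pathologic function', 'biologic function']
```

Walking up the table lets a system treat a narrowly typed concept as a member of every broader group above it.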

The third component, developed later than the other two knowledge sources, is the Specialist Lexicon. It relates words by their parts of speech and by their rules for inflection, such as how plurals are formed.
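A toy version of that idea is a table that maps inflected or derived word forms to a base form, so differently worded queries reduce to the same search terms. The Python sketch below assumes a two-entry table and a hypothetical `normalize_query` function; the real Specialist Lexicon is far larger and has its own file format.

```python
# Toy lexicon: inflected or derived form -> base form.
# These two entries are invented for illustration only.
BASE_FORMS = {
    "diabetic": "diabetes",
    "children": "child",
}


def normalize_query(query):
    """Rewrite a free-text query as base forms joined by AND."""
    words = query.lower().split()
    return " AND ".join(BASE_FORMS.get(word, word) for word in words)


print(normalize_query("diabetic children"))  # diabetes AND child
print(normalize_query("diabetes child"))     # diabetes AND child
```

Words missing from the table pass through unchanged, so the sketch never drops a search term it does not recognize.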

"Specialist Lexicon is useful when going through medical records or doing a Web search," Tilley said. The rules within Specialist Lexicon can normalize words -- a search for "diabetic children" would be the same search as "diabetes AND child."

These three knowledge sources are "huge, honking ASCII files" designed for systems developers, Tilley said. "It's not rocket science, but it's very complex."

NLM distributes all three knowledge sources free to users who sign a license agreement. Metathesaurus users need separate licenses for some of the source vocabularies, however. About half of the vocabularies are completely free, but some require end users to notify the vocabulary's owner or request permission if they plan to translate the vocabulary into another language or incorporate it into a computer system.

The field of medical informatics is growing fast, because drug information systems and computerized records save lives and cut costs, Tilley said. NLM has updated the Metathesaurus annually, but this year it started publishing quarterly updates.

"With all the new drugs, medicine is very dynamic," Tilley said.

Government Computer News Staff Writer Patricia Daukantas can be reached at pdaukantas@postnewsweektech.com.