Old documents meet new technologies

Scanning paper and microfilm remains a growth industry for imaging companies.

Even with today's digital databases
and electronic forms, government
warehouses are filled with mountains
of paper and film documents.

For example, medical records of
military personnel must be maintained
throughout their lives. That
could mean retaining microfilm
records for about 100 years in some
cases.

Despite decades of computer-centric
processes, scanning and data
mining of old documents are still
growth industries, said several technology
vendors at this year's FOSE
show in Washington. FOSE is produced
by 1105 Government
Information Group, publisher of
Washington Technology.

NextScan's NextStar software, for
example, was designed to scan
microfilm and microfiche.

Traditionally, microfilm scanners
must center film frames in a window
to capture images, but NextStar captures
more than a window view, said Mike Oris, a
NextScan sales manager.

"We scan edge to edge of the film, from the
beginning all the way through to the end," he
said. "The ribbon image guarantees the capture
of every image on that roll -- nothing is
skipped, nothing is truncated -- and that
gives you better quality images and minimizes
the need for rescanning."

NextStar draws a box around each frame
and numbers it. If an image is not found,
tools alert the operator that something was
skipped. If an image was chopped in half, the
operator can go back and fix it. No
rescanning is necessary because all
the frames are available in the ribbon
image.
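
The frame-finding step Oris describes can be illustrated with a short
sketch. It is not NextScan's implementation. It assumes the ribbon has
already been reduced to a hypothetical list of per-column density values,
and the function names, threshold and tolerance are invented for
illustration.

```python
def find_frames(profile, threshold=0.5):
    """Return (start, end) column ranges where the ribbon holds image content.

    profile is a hypothetical list of per-column density values (0.0 to 1.0)
    measured along the length of the scanned ribbon.
    """
    frames = []
    start = None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i                      # a frame begins here
        elif value < threshold and start is not None:
            frames.append((start, i))      # the frame ended at the previous column
            start = None
    if start is not None:                  # frame runs to the end of the ribbon
        frames.append((start, len(profile)))
    return frames


def flag_suspect_frames(frames, expected_width, tolerance=0.25):
    """Number frames from 1 and return the numbers whose width looks wrong."""
    suspects = []
    for number, (start, end) in enumerate(frames, start=1):
        width = end - start
        if abs(width - expected_width) > tolerance * expected_width:
            suspects.append(number)        # possibly chopped or merged frame
    return suspects
```

Because the whole ribbon stays on hand, an operator tool built this way
could redraw the box for a flagged frame without going back to the film.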

Oris said the need to scan vast
quantities of film will continue for
years. "Since the 1960s to the early
1990s, microfilm was the major
storage medium used by banks,
insurance companies and government
agencies for the long-term
preservation of information and
images," he said. "They needed a
medium that could span a long time
frame, and microfilm and fiche were
the only mediums used at the time."

The reservoir of microfilm still
exists. Microfilm is no longer manufactured,
but people are concerned
about making the information from
existing film more readily available.

"They need to scan the microfilm
and bring that information into the
digital world so they can distribute
it, manipulate it and do whatever
they need with it," Oris said.

Once old documents are scanned into a
database, government agencies need better
ways to search and manipulate the data.

Kyos Systems Inc.'s TransFORM lets
organizations scan and transfer documents
securely. Its analytics can detect information
patterns, track data and audit paper-created
or electronic data.
The system converts each piece of paper
into a relational database. Traditional scanning
simply turns documents into a static form such
as a PDF, said Kevin Pang, Kyos' president and
chief executive officer.

"PDFs are pretty ... difficult to search," he
said. "There's a huge data penalty
because as you scan more and more
things into your system, it begins to
slow down. The memory is more
difficult to manage, [and] search
becomes a very untenable experience
for a lot of end users."

Rather than treating a form as one
file, Kyos breaks it down into its elements,
which frees the data from the paper and allows
it to move through Kyos' system, Pang said.
The Defense Department has warehouses
full of paper that are difficult to search and
share securely. Traditional scanning methods
often rely on a Google-based search that
requires the user to visually determine if the
correct document was found.

Kyos semantically understands each document
and tags all the data elements. Then a
variety of search and security rules can be
applied and the data aggregated on demand.
"One of the things that we do for the military
is allow them to ask questions like 'Show
me the last 20 blood pressure readings taken
over the last 10 years, sorted by date,'" Pang
said. "Then, 'Show me all the data that's related
to that, for example, medications and
procedures.'"
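
Once form fields are stored as rows rather than as page images, a
question like the one Pang describes becomes an ordinary database
query. The sketch below uses Python's built-in sqlite3 module. The
table name, column names and the helper function are assumptions made
for illustration and do not reflect Kyos' actual schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical readings table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings ("
        "patient_id INTEGER, kind TEXT, value TEXT, taken_on TEXT)"
    )
    return conn


def last_blood_pressure(conn, patient_id, as_of, limit=20, years=10):
    """Return up to `limit` blood pressure readings from the `years` before
    `as_of` (an ISO date string), most recent first."""
    return conn.execute(
        """
        SELECT taken_on, value
        FROM readings
        WHERE patient_id = ?
          AND kind = 'blood_pressure'
          AND taken_on >= date(?, ?)   -- start of the look-back window
          AND taken_on <= ?
        ORDER BY taken_on DESC
        LIMIT ?
        """,
        (patient_id, as_of, f"-{years} years", as_of, limit),
    ).fetchall()
```

A follow-up question about related medications or procedures would be
a second query against the same patient and date range, which is the
kind of on-demand aggregation a single scanned PDF cannot support.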

With all that aggregated data, doctor and
patient can sit down together and monitor
the patient's progress.

To achieve that level of search capability,
Kyos used algorithms originally developed for
the study of genetics. When a new form comes
into a system, it is not considered an exception;
it is viewed as a mutation. In essence,
the algorithms allow the form to evolve.

"Forms change for a reason," Pang said.
Generally it is because someone wants to
add data to a form or create new data
relationships. The system maps all the
structural elements that constitute a form
and builds semantic knowledge about
what the form is designed to do.
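
The article does not say which genetics algorithms Kyos uses. As a
loose stand-in, the sketch below treats a form as an ordered list of
field names and uses sequence comparison from Python's standard
difflib module to report what a new version added or dropped. The
function name and field names are hypothetical.

```python
from difflib import SequenceMatcher


def form_mutations(known_fields, new_fields):
    """Compare two ordered field lists and return (added, removed) fields."""
    added, removed = [], []
    matcher = SequenceMatcher(None, known_fields, new_fields, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed.extend(known_fields[i1:i2])   # fields missing from the new form
        if tag in ("insert", "replace"):
            added.extend(new_fields[j1:j2])       # fields the new form introduced
    return added, removed
```

Under this view, a form revision that adds an allergies field is
recorded as a change to a known form and not rejected as an unknown
document.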

Eastman Kodak Co. has also recognized the importance
of manipulating scanned files with the
release of its Capture Pro Software. It is
designed to assist in complex capture jobs, said
Craig Carlisle, a Kodak technical manager.
Compatible with nearly all Kodak document
scanners, Capture Pro Software
provides improved methods for
capturing and extracting information,
including intelligent selective
image display and post-scan
image-processing features.

The software has single-click
capture capability, which helps
condense complex capture tasks to the push of
a button or selection of a user-definable shortcut,
such as Scan and Send to E-mail, Scan to
PDF, Scan and Automatically Index, or Name
and Output Image Files.

It also has automatic page-orientation correction
to improve throughput and ensure
that images are captured accurately. Its Batch
Explorer gives a complete overview of what
has been captured, which aids in quick selections
of documents.

Doug Beizer (dbeizer@1105govinfo.com) is a staff
writer at Washington Technology.
