| Ablate |
To
remove. Used to describe the laser-readable "pits" in the recorded
layer of optical disks. |
| Acetate-base
film |
A film substrate used in microfilm
production. Considered a safety film (ANSI Standard). |
| ADC |
Analog
to Digital converter. Changes analog signals to digital representations
(numbers). |
| Additive
Color |
All
the colors in the light spectrum add up to make white light. Computer
monitors use three additive colors: Red, Green & Blue (RGB). |
| ADF |
Automatic Document
Feeder. This is the means by which a scanner feeds the paper document. |
| AIIM |
The
Association for Information and Image Management – focused on electronic
imaging. |
| Aliasing |
When
computer graphics output has jagged edges or a stair stepped appearance when
magnified. The corrective technique is "anti-aliasing". |
| Alphanumeric |
Characters
composed of letters, numbers (and sometimes punctuation marks). Excludes
printer/flow control characters, (Carriage Return/XON & XOFF). |
| Analog |
The
electrical replica or waveform of a physical process caused by changes in
amplitude or frequency. Opposite of digital (Zeros & Ones). |
| Annotations |
The changes or additions made to a document using sticky
notes, a highlighter, or other electronic tools. Document images or text can
be highlighted in different colors, redacted (blacked-out or whited-out),
stamped (e.g. “FAXED” or “CONFIDENTIAL”), or have electronic sticky notes
attached. Annotations should be overlaid and not change the original
document. |
| ANSI |
American
National Standards Institute. Member of ISO and IEC. |
| Aperture
Card |
An
IBM punch card with a window which holds a 35mm frame of microfilm. Indexing
information is punched in the card. |
| ASCII |
(pronounced
ask-ee) American Standard Code for Information Interchange. A computer coding
structure for letters, numbers and characters in which seven bits are used to
identify each individual entity (128 maximum), often carried in an eight-bit byte with one bit for parity. When
no parity bit is used, all eight bits can be used to represent up to 256
characters; this character set is extended ASCII. |
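The seven-bit code plus parity bit described above can be sketched in a few lines of Python. This is only an illustration: the helper name `with_even_parity` is hypothetical, and even parity stored in the top bit is an assumption (odd parity is equally possible).

```python
def with_even_parity(ch: str) -> int:
    """Return an 8-bit value: 7-bit ASCII code plus an even-parity bit in bit 7."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    # Parity bit is 1 when the 7-bit code has an odd number of set bits,
    # so the full 8-bit value always has an even number of set bits.
    parity = bin(code).count("1") % 2
    return code | (parity << 7)

print(ord("A"))                               # 65
print(format(with_even_parity("A"), "08b"))   # 01000001
print(format(with_even_parity("C"), "08b"))   # 11000011
```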
| ASCII |
American Standard Code for Information Interchange. Used
to define computer text built on a set of 128 alphanumeric and
control characters (256 in extended ASCII). ASCII has been a standard, non-proprietary text format
since 1963. |
| Aspect
Ratio |
The
relationship of the height and width of any image. This must always be
preserved to prevent distortion. |
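As a rough sketch of preserving aspect ratio when resizing, the hypothetical helper `fit_within` below scales both dimensions by the same factor so the image fits inside a bounding box. The pixel sizes are illustrative only.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit the box without changing the aspect ratio."""
    # Use the smaller of the two scale factors so neither dimension overflows.
    scale = min(max_width / width, max_height / height)
    return round(width * scale), round(height * scale)

# A letter-size page scanned at 300 dpi (2550 x 3300 pixels), shown in an 800 x 800 viewer.
print(fit_within(2550, 3300, 800, 800))  # (618, 800)
```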
| AVI |
Audio-Video
Interleave. A Microsoft standard for Windows animation files. The format
interleaves audio and animation to provide medium quality multimedia. |
| Backfiles |
Existing
paper or microfilm files. |
| Bar
Code |
A
method of representing data by combining lines of varying width (e.g. |
| Bar Code |
A small pattern of vertical lines that is read by a laser
or an optical scanner, and which corresponds to a record in a database. An
add-on component to imaging software, this feature is designed to increase
the speed with which documents can be archived. |
| Batch Processing |
The name of the technique used to input a large amount of
information in a single step, as opposed to individual processes. |
| BBS |
Bulletin
Board System. |
| BCS |
Boston
Computer Society, one of the first associations of PC/Apple users and one of
the largest and most active. |
| Beginning Document Number or BegDoc# |
The number assigned to the first page of a document or record. |
| Bibliographical/Objective Coding |
Extracting information from electronic documents, such as date
created, author, recipient and CC, and linking each image to the information in
pre-defined objective fields. In
contrast to subjective coding, where legal interpretations of data in
a document are linked to individual documents. |
| BIOS |
Basic
Input/Output System – the specific PC input/output
"rules" and the programs which execute these to allow the transfer
of information to/from the "central processing unit" of the PC. |
| BIT |
Binary
Digit. Single position in base 2 arithmetic (2^n) – either on (1) or off (0). |
| Bit
Map |
Creating
characters or images by creating a "picture" (matrix) of individual
bits (pixels). The individual bits may just be binary (black and white) or
high definition color. In color systems, the "z-axis" of each pixel
has a value which represents the "shade of gray" or color of the
bit. This value can be as high as 32 bits for very high resolution color.
This results in a large, uncompressed file. For instance, a 300 dpi, E-Size
drawing bit map is approximately 16MB. |
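The size of an uncompressed bit map follows directly from the page dimensions, the scan resolution and the bits stored per pixel. The sketch below uses a hypothetical helper, `raw_bitmap_bytes`, and assumes the E-size figure refers to a 34" by 44" sheet, which gives roughly 16 MB at one bit per pixel. A 36" by 48" sheet would come to about 19.4 MB.

```python
def raw_bitmap_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Uncompressed size in bytes of a page scanned at the given resolution."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes

print(raw_bitmap_bytes(8.5, 11, 300))       # 1051875  (about 1 MB, bi-tonal)
print(raw_bitmap_bytes(34, 44, 300))        # 16830000 (about 16 MB, bi-tonal)
print(raw_bitmap_bytes(8.5, 11, 300, 24))   # 25245000 (24-bit color)
```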
| Bitmap/Bitmapped |
See BMP and
Raster/Rasterized. |
| Bi-Tonal |
Bi-tonal (black and white only, one bit per pixel). A Bi-tonal image is created by a thresholding process from a grayscale input, either during the scanning process or subsequently. Thresholding is an irreversible process which results in speckled images with noticeably "stair-stepped" diagonal lines. |
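A minimal sketch of thresholding, assuming 8-bit gray values where 0 is black and 255 is white. The function name `to_bitonal` and the cutoff of 128 are illustrative choices, not a standard.

```python
def to_bitonal(gray_rows, cutoff=128):
    """Map 8-bit gray values to one bit per pixel (1 = white, 0 = black)."""
    return [[1 if value >= cutoff else 0 for value in row] for row in gray_rows]

page = [[250, 130, 120, 10]]
print(to_bitonal(page))  # [[1, 1, 0, 0]]
```

The values 250 and 130 both become 1, so the original grays cannot be recovered from the output. This is why thresholding is irreversible.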
| BMP |
Bit
Map. A format unique to Windows for electronic graphics files. |
| BMP |
A native file
format of Windows for storing images called “bitmaps.” |
| Boolean Logic |
The use of the
terms “AND,” “OR” and “NOT” in conducting searches. Used to widen or narrow
the scope of a search. |
| Box |
A
square graphic element on a form used to enter a single character, usually
used in strings for entering constrained data. |
| BPI |
Bits
Per Inch. For instance, this defines data densities in disk and magnetic tape
systems. |
| bps |
bits
per second. |
| Briefcase |
A method to
simplify the transport of a group of documents from one computer to another. |
| Burn (CDs or DVDs) |
To record or write
data on a CD or DVD. |
| Business
Process Outsourcing |
Business
process outsourcing occurs when an organization turns over the management and
optimization of a business function, such as accounts payable or purchasing,
to a third party that conducts the activity based on a set of predetermined
performance metrics.
Business Process Outsourcing
- IT Outsourcing: focus on life cycle and PC uptime
- BPO manages people and processes
- “It's important for companies to recognize their core
competencies compared with activities that could be handled more
efficiently by a third party”, says Charles Kafoglis, a partner at
PricewaterhouseCoopers in New York.
For example, “back-office functions such as payroll or accounts
receivables aren't likely to "make or break" a company, so it
might make sense to farm them out if someone else can support them more
effectively,” says Kafoglis.
- Companies outsource to streamline processes, save time or
leverage the strengths of third-party specialists. Small companies outsource in order to
cut costs and build a function like accounts receivables in a short
time. Large companies, on the
other hand, traditionally choose BPO to improve their efficiencies.
|
| Buss |
(also
Bus) The
"highway" which connects the various components of a computer
system. |
| BYTE |
Eight
bits. One byte can encode a letter, number or other character, with a maximum
of 256 distinct values. KB –
Kilobytes, a thousand bytes (actually 2^10 or 1,024 bytes). MB – Megabytes, a million bytes
(actually 2^20 or 1,024 KB or 1,048,576 bytes). GB – Gigabytes, a billion bytes
(actually 2^30 or 1,024 MB or 1,073,741,824 bytes). |
| Cache |
A
dedicated, high speed portion of computer memory which can be used for the
temporary storage of frequently used data to make the application run faster
(prevents having to constantly access the data from disk/tape storage). |
| Caching (of Images) |
The temporary
storage of image files on a hard disk for later migration to permanent
storage, like an optical or CD jukebox. |
| CCD |
Charge
Coupled Device. A computer chip (with say 2048 cells) whose output is
proportional to the light or color passed by it. Individual CCD's or arrays
of these are used in scanners as a high-resolution, “digital camera" to
"read" documents. These devices are micro-chip size and their
resolutions run as high as 1000 pixels per inch. |
| CCITT |
Consultative
Committee for International Telephone & Telegraphy. Sets standards for
phones, faxes, modems etc. The standard exists primarily for fax documents. |
| CCITT
Group 4 |
A
compression technique/format that generally reduces a file by about 5:1 over
RLE and 40:1 over bitmap. For example, at a 300 dpi scan rate, the
approximate storage requirements are:
Size  Raw    RLE     Group 4
A     1MB    200K    40K
B     2MB    400K    75K
C     4MB    820K    150K
D     8MB    1.6MB   300K
E     16MB   3.2MB   580K |
| CD |
Compact
Disk. A 4 3/4" diameter device which can be read by a laser beam. |
| CD Publishing |
An alternative to
photocopying large volumes of paper documents. This method involves coupling
image and text documents with viewer software on CDs. Sometimes search
software is included on the CDs to enhance search capabilities. |
| CDMA |
Code-Division
Multiple Access – an emerging wireless communication technology for all
digital voice and data networks. |
| CDPD |
Cellular
Digital Packet Data. A data communication standard which uses the unused
capacity (bandwidth) of cellular voice providers. |
| CD-R |
Compact
Disk Recordable. The standards for recording CD-ROM disks. Standard digital disks
are 4 3/4" in diameter and can store 650 MB. Each disk
has a layer of laser sensitive, dyed polymer plastic sandwiched against a
reflective layer between protective layers. When the laser burns a spot in
the polymer, the reflective surface shows through the hole. Dye polymer is
easier to burn and requires a much lower power laser than burning holes
through a metal layer (as in ablative optical WORM drives). The CD-R media
is gold in appearance, rather than the silver surface of a typical CD-ROM.
The
logical format standard is ISO (International Standards Organization)
9660. There are several standard formats:
1. "Yellow Book" – for simple computer data or images. Divides the tracks into 2,352-byte sectors, of which 2,048 bytes hold data and 304 bytes are devoted to headers, mode selection and error correction.
2. "CD-ROM-XA" (Extended Architecture) or "Mode 2" – for interleaving data, audio and video on the same disks. Mode 2 sacrifices error correction for larger usable data storage. Sectors have 2,336 bytes of data space.
3. "Red Book" or "CD-Digital Audio" – for digitally sampled audio, technically PCM 16-bit stereo audio sampled at 44.1 kHz. The standard for recording music.
4. "Orange Book" or "Multisession" – the standard that software follows to encode a blank CD. Part I is the standard for "rewritable" (MO) CD-ROMs. Groups of data can be added to the disk at different times.
5. "Green Book" or "CD-i" – for interactive games or video.
|
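The sector sizes above can be tied back to the 650 MB capacity with a little arithmetic. The sketch below rests on two assumptions that are not stated in this glossary: a 74-minute disc and a read rate of 75 sectors per second. The function name `capacity_bytes` is hypothetical.

```python
SECTOR_BYTES = 2352        # raw sector size from the Yellow Book entry
SECTORS_PER_SECOND = 75    # assumption: standard single-speed rate

def capacity_bytes(minutes, data_bytes_per_sector):
    """User-data capacity of a disc of the given playing time."""
    sectors = minutes * 60 * SECTORS_PER_SECOND
    return sectors * data_bytes_per_sector

mode1 = capacity_bytes(74, 2048)   # Yellow Book data sectors
mode2 = capacity_bytes(74, 2336)   # Mode 2 sectors, less error correction
print(mode1, round(mode1 / 2**20))  # 681984000 650
print(mode2, round(mode2 / 2**20))  # 777888000 742
print(SECTOR_BYTES - 2048)          # 304 bytes of overhead per data sector
```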
| CD-R |
Short for
CD-Recordable. This is a CD that can be written (or recorded) only once. It
can be copied to distribute a large amount of data. CD-Rs can be read on any
CD-ROM drive whether on a standalone computer or network system. This makes
interchange between systems easier. |
| CD-Recordable |
Often
also used as an acronym for CD-ROM's that can be written more than once. The
succeeding writings must utilize unused sections of the original, with a
library or directory of the total use. Optical storage technology using
formats compatible with CD-ROM's. CD-ROM discs must be
"pre-mastered" to ensure that the data is correctly formatted.
Using a "double speed" recorder, it takes about a half hour to burn
a complete 650MB disc. |
| CD-ROM |
Compact
Disk – Read Only Memory. A type of high density optical disk with a 4 3/4"
diameter and a 650MB capacity. The information (1's or 0's) is permanently
etched by a laser into the surface of the disk and read by a laser beam. The
ISO 9660 standard defines how a CD-ROM is written for computer interface. It
is not rewritable. It is legally accepted and written on a single-side. |
| CD-ROM |
Compact Disc Read
Only Memory. Written on a large scale and not on a standard computer CD burner
(CD writer), they are an optical disk storage media popular for storing
computer files as well as digitally recorded music. |
| CD-ROM Drive |
A computer drive
that reads compact discs. |
| Centronics
Interface |
A
parallel interface standard for connecting printers and other devices to
computers. Pioneered by Centronics Inc., a printer manufacturer in New
Hampshire. Uses a 36 pin connector. See SPP. |
| CGA |
Color
Graphics Adapter. (See VGA). |
| Character Treatment |
The use of all
caps or another standard form of treating letters in a coding project. |
| CIE |
Commission
Internationale de l'Éclairage. The international commission on color matching
and illumination systems. |
| Cine-Mode |
Data
recorded on a film strip such that it can be read by a human when held
vertically. |
| Cinepak |
A
compression algorithm, see MPEG. |
| CITIS |
Contractor
Integrated Technical Information Service. The Department Of Defense now
requires contractors to have an electronic document image and management
system |
| Client/Server |
A
computer system functionally distributed across several nodes on a network,
sometimes called a distributed application. The basic theory is that the
various components of the system can be tailored to perform specific
functions, hopefully for the good of the entire network. Client/Server
systems are also typified by a high degree of parallel processing across
distributed nodes. Usually the clients are individual PC's connected to
server(s) which act as central storehouses and "traffic cops" for
information and applications. |
| Client-Server Architecture vs. File-Sharing |
Two common
application software architectures found on computer networks. With
file-sharing applications, all searches occur on the workstation, while the
document database resides on the server. With client-server architecture, CPU
intensive processes (such as searching and indexing) are completed on the
server, while image viewing and OCR occur on the client. File-sharing
applications are easier to develop, but they tend to generate tremendous
network data traffic in document imaging applications. They also expose the
database to corruption through workstation interruptions. Client-server
applications are harder to develop, but dramatically reduce network data
traffic and insulate the database from workstation interruptions. |
| CMYK |
Cyan,
Magenta, Yellow and Black. A subtractive method used in four color printing
and Desktop Publishing. |
| Coding |
A means of
capturing specific, standardized data from a collection of documents and
creating a database linking the data to the images. The term “coding” is generally used in the legal and medical
markets. It is similar to “indexing”
in the commercial marketplace. |
| COLD |
Computer
Output to Laser Disk. The computer system contains files of ASCII data (from
input or application programs) or bit-mapped files previously scanned from
microfilm documents or pictures. These output files are compressed by a
factor of 5-20:1 from the original documents and stored on WORM optical/laser
disks. The stored data is then available to all on the network. Generally,
the format of these databases is compatible with SQL and imaging formats. |
| COLD |
Computer Output to
Laser Disk. A computer programming process that outputs electronic records
and printed reports to laser disk instead of a printer. Can be used to
replace COM (Computer Output to Microfilm) or printed reports such as
green-bar. |
| COM |
Computer
Output to Microfilm. The computer converts and stores data directly on
microfilm/fiche from a variety of available inputs. This older technology is
cheaper and more convenient than paper, but one of the most difficult to use
in actually storing and retrieving the data. |
| COM |
Computer Output to
Microfilm. A process that outputs electronic records and computer generated
reports to microfilm. |
| Comb |
A
series of boxes with their top missing. Tick marks guide text entry. Used in
forms processing rather than boxes. |
| Comic
Mode |
Human-readable
data, recorded on a strip of film which can be read when the film is moved
horizontally to the reader. |
| Component
Video |
Separate
luminosity and color signals that provide the highest possible signal
quality. Distinct from video standards such as NTSC or PAL. |
| Composite
Video |
A
video stream that combines red, green, blue and synchronization signals into
one so it only requires one connector. Composite video is used by most
televisions and VCR's. |
| Compression |
Any
method which reduces the amount of data necessary to transmit information
from one point to another. Compression generally eliminates redundant
information and/or predicts where changes will occur. "Lossless"
compression techniques totally preserve the integrity of the input.
"Lossy" methods discard some of the original information. |
| Compression Ratio |
The ratio of the
file size of an uncompressed file to that of its compressed version, e.g., with a 20:1
compression ratio, an uncompressed file of 1 MB is compressed to 50 KB. |
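The 20:1 example can be checked with a short calculation. The helper names below are hypothetical, and `zlib` is used only as a convenient lossless compressor from the Python standard library. It is not one of the imaging formats discussed in this glossary.

```python
import zlib

def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Ratio of original size to compressed size (20.0 means 20:1)."""
    return uncompressed_bytes / compressed_bytes

def compressed_size(uncompressed_bytes, ratio):
    """Expected compressed size for a given ratio."""
    return uncompressed_bytes / ratio

print(compressed_size(1_000_000, 20))   # 50000.0, i.e. 1 MB becomes 50 KB

# Highly repetitive data compresses well; the ratio depends on the content.
data = b"scanned page " * 1000
packed = zlib.compress(data)
print(compression_ratio(len(data), len(packed)) > 1)   # True
print(zlib.decompress(packed) == data)                 # True: lossless round trip
```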
| Continuous
Tone |
An
image (e.g. a photograph) which has all the values of gray from white to
black. |
| Convergence |
Where
the RGB signals "converge" on a single pixel. That pixel should be
white at full brightness of the RGB components. |
| CPI |
Characters
Per Inch |
| CPU |
Central
Processing Unit – The portion of a computer which performs most of the
logical and arithmetic functions. |
| CPU |
Central Processing
Unit. The “brain” of the computer. |
| CRC |
Cyclical
Redundancy Checking. Used in data communications to create a checksum
character (hexadecimal) at the end of a data block. |
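A small sketch of appending and verifying a checksum, using the CRC-32 function in Python's standard `zlib` module. CRC-32 is one common variant and is an assumption here, since the entry does not name a specific polynomial. The frame layout (data followed by a 4-byte big-endian checksum) and the function names are illustrative.

```python
import zlib

def append_crc(block: bytes) -> bytes:
    """Return the data block followed by its 4-byte CRC-32 checksum."""
    return block + zlib.crc32(block).to_bytes(4, "big")

def check_crc(frame: bytes) -> bool:
    """Recompute the checksum at the receiving end and compare."""
    block, received = frame[:-4], frame[-4:]
    return zlib.crc32(block).to_bytes(4, "big") == received

frame = append_crc(b"123456789")
print(frame[-4:].hex())                     # cbf43926
print(check_crc(frame))                     # True
print(check_crc(b"023456789" + frame[-4:])) # False: corrupted data is detected
```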
| CYAN |
A
colored ink. Reflects blue & green & absorbs red. |
| DAC |
Digital
to Analog Converter. Changes digital numbers to an electrical waveform. |
| DAD |
Digital
Audio Disk – "compact disk". |
| DAT |
Digital
Audio Tape – Although generally used for audio, a DAT (120 meters long) can
hold up to 10 gigabytes if used for digital data storage. Has the
disadvantage of being a serial, rather than a random access device. |
| Data Extraction |
The process of
pulling information out of either hard copy or electronic documents. The process may be manual (read and key)
or electronic via a pattern recognition methodology. |
| DB |
Data
Base. Information arranged in the computer in a rigorous, defined format to
allow ease of recording and retrieval. |
| Descenders |
The
portion of a character which falls below the main part of the letter (e.g. g,
p, q). |
| De-shading |
Removing shaded
areas to render images more easily recognizable by OCR. De-shading software
typically searches for areas with a regular pattern of tiny dots. |
| De-skewing |
The process of
straightening skewed (off-center) images. De-skewing is one of the image
enhancements that can improve OCR accuracy. Documents often become skewed
when they are scanned or faxed. |
| De-speckling |
Removing isolated
speckles from an image file. Speckles often develop when a document is
scanned or faxed. |
| DIA/DCA |
Document
Interchange Architecture / Document Content Architecture. IBM standards for transmission and storage of
voice, text or video over networks. |
| Digital |
A
system of mathematics consisting solely of zeros and ones. The mathematics
used by digital computers. Used to represent characters and numbers and to
mathematically manipulate these. |
| Digitize |
The
process of converting an analog value into a digital (numeric)
representation. |
| Disc |
An
optical disc. |
| Disk |
A
magnetic floppy or hard disk. |
| Disk/Disc |
Round,
flat storage media with layers of material which enable the recording of
data. |
| Dithering |
Creating the illusion of new colors and shades by varying
the pattern of dots. Newspaper photographs, for example, are dithered. If you
look closely, you can see that different shades of gray are produced by
varying the patterns of black and white dots. There are no gray dots at all.
The more dither patterns that a device or program supports, the more shades
of gray it can represent. In printing, dithering is usually called halftoning,
and shades of gray are called halftones.
|
| Dithering |
Manipulating
the arrangement or shape of dots to simulate gray tones (e.g. newspaper
pictures). |
| Dithering |
The process of
converting grays to different densities of black dots, usually for the
purposes of printing or storing color or grayscale images as black and white
images. |
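One simple way to vary dot patterns is ordered dithering, sketched below with a 2x2 threshold matrix. This is only one of several dithering methods. The matrix, the function name `ordered_dither` and the convention that 1 is a white dot are assumptions for illustration.

```python
BAYER_2X2 = [[0, 2],
             [3, 1]]

def ordered_dither(gray_rows):
    """Convert 8-bit grays (0 = black, 255 = white) to 1-bit dots (1 = white)."""
    out = []
    for y, row in enumerate(gray_rows):
        out_row = []
        for x, value in enumerate(row):
            # Each position in the repeating 2x2 tile gets its own threshold,
            # so a flat gray area turns into a pattern of black and white dots.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out

print(ordered_dither([[128, 128], [128, 128]]))  # [[1, 0], [0, 1]]  half the dots white
print(ordered_dither([[64, 64], [64, 64]]))      # [[1, 0], [0, 0]]  a darker gray
```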
| Document |
One or several
single pages of images that make a logical single communication of
information. Examples include a
letter, a report, a memo or an airline ticket. |
| Document Date |
The original
creation date of a document, usually noted on the document itself. For a letter, this is the date
written on the letter; for an email, it is the date stamp of the email. |
| Document Imaging Programs |
Software used to
store, manage, retrieve and distribute documents quickly and easily on the
computer. |
| Document
Sizes |
(U.S.):
A Size: 8.5" by 11" (A4)
B Size: 11" by 17" (A3)
C Size: 17" by 22" (A2)
D Size: 24" by 36" (A1)
E Size: 36" by 48" (A0)
|
| Document Type or Doc Type |
A typical field used in bibliographical
coding. Typical doc type examples
include letter, memo, report, article and others. |
| Document/Record |
A document is a
page or collection of pages that are physically or logically (or both)
linked. |
| Dot
Pitch |
Distance
of one pixel in a CRT to the next pixel on the vertical plane. The smaller
the number, the higher quality display. |
| DPI |
Dots
per inch. |
| Drag-and-Drop |
The movement of
on-screen objects by dragging them across the screen with the mouse. |
| DRAM |
Dynamic
Random Access Memory, a memory technology which is periodically
"refreshed" or updated – as opposed to "static" RAM chips
which do not require refreshing. The term is often used to refer to the
memory chips themselves. Varieties are:
- CDRAM: Cache DRAM (contains static cache)
- EDODRAM: Extended data out DRAM
- EDRAM: Enhanced DRAM (contains a static memory buffer and cache controller)
- SDRAM: Synchronous DRAM (added clock and burst addressing capability)
- SGRAM: Synchronous Graphics RAM (a single port SDRAM)
- WRAM: Window RAM (dual port video RAM)
- VRAM: Video RAM (a dual ported DRAM, good for graphics) |
| DSP |
Digital
Signal Processor (Processing) – a special purpose computer (or technique)
which digitally processes signals and electrical/analog waveforms. |
| DTP |
Desktop
Publishing. PC systems used to prepare direct print output or output suitable
for printing presses. |
| Duplex |
Two-sided page(s) |
| Duplex Scanners vs. Double-Sided Scanning |
Duplex scanners automatically
scan both sides of a double-sided page, producing two images at once.
Double-sided scanning uses a single-sided scanner to scan double-sided pages,
scanning one collated stack of paper, then flipping it over and scanning the
other side. |
| DVD |
Digital Video Disc
or Digital Versatile Disc. A plastic disc, like a CD, on which data can be
written and read. DVDs are faster, can hold more information, and can support
more data formats than CDs. |
| EDI |
Electronic
Data Interchange. Eliminating forms altogether by encoding the data as close
as possible to the point of the transaction. (e.g. Paying your phone bill
direct from your PC to the system used by the phone company.) |
| EDMS |
Electronic
Document Management Systems. |
| EGA |
Enhanced
Graphics Adapter. See VGA. |
| EIA |
Electronic
Industries Association – a trade association. |
| EIM |
Electronic
Image Management. |
| EISA |
Extended
Industry Standard Architecture. One of the standard busses used for PC's. |
| Electronic Data Discovery (EDD or ED) |
The discovery process, just like that for paper documents, applied to documents and data that exist in a medium that can only be accessed through the use of a computer. |
| Electronic Document Management |
Imaging,
Indexing/Coding and Archiving of scanned images |
| Electrostatic
Printing |
Paper
is exposed to electron charge. Toner sticks to the charged pixels. |
| Em |
In
any print font or size, a unit equal to the width of the letter "M" in
that font and size. |
| En |
Half
the width of an Em. |
| Encryption |
The
coding of messages to increase security, so that a transmission is readable only
by recipients able to decode it using the same algorithm. |
| End Document Number or End Doc# |
The number assigned to the last single
page image of a document. |
| End User Program |
The program used
to perform searches, viewing and retrieval of a scanned and/or coded
collection of images. Examples
include Summation, Concordance, JFS Litigators Notebook, Ringtail, Paradox,
InMagic DB/Textworks and many others. |
| Endorser |
A
little printer in a scanner that adds a document-control number to each
scanned sheet. Some forms control processing software can control this
printer. |
| Enhanced Titles |
The act of reading
a document and creating a meaningful title for it in Bibliographical
coding. The opposite of Verbatim
Titles. |
| EOF |
End
of File. A distinctive code which uniquely marks the end of a data file. |
| EPP |
Enhanced
Parallel Port – also known as Fast Mode Parallel Port. A new, industry
standard parallel port, with high transfer rates competitive with SCSI. |
| EPS |
Encapsulated
PostScript. Uncompressed files for images, text and objects. Only print on
PostScript printers. |
| Erasable Optical Drive |
A type of optical
drive that uses erasable optical discs. |
| ESDI |
Enhanced
Small Device Interface. A defined, common electronic interface for
transferring data between computers and peripherals, particularly disk
drives. |
| FAT |
File
Allocation Table – An internal data table on DOS-based disks that lists the
contents and address of each file on the disk. |
| FAX |
Short
for facsimile. A process of transmitting documents by scanning them to
digital, converting to analog, transmitting over phone lines and reversing
the process at the other end and printing. "Group 3" indicates the
3rd generation of faxes which transmits a page at 9600 baud in about a minute
– with a normal resolution of 203 x 98 dpi and a fine resolution of 203 x
196. |
| FLOPS |
FLOPS are floating-point operations per second. Floating-point is, according to IBM, "a method of encoding real numbers within the limits of finite precision available on computers." Using floating-point encoding, extremely long numbers can be handled relatively easily. A floating-point number is expressed as a basic number or mantissa, an exponent, and a number base or radix (which is often assumed). The number base is usually ten but may also be 2. Floating-point operations require computers with floating-point registers. The computation of floating-point numbers is often required in scientific or real-time processing applications and FLOPS is a common measure for any computer that runs these applications. |
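The mantissa and exponent parts of a floating-point number can be inspected with Python's standard `math` module. This sketch uses base 2, which is what `math.frexp` reports. The decimal form at the end is just string formatting in base ten.

```python
import math

def decompose(x):
    """Split x into (mantissa, exponent) such that x == mantissa * 2**exponent."""
    return math.frexp(x)

mantissa, exponent = decompose(6.0)
print(mantissa, exponent)             # 0.75 3, since 0.75 * 2**3 == 6.0
print(math.ldexp(mantissa, exponent)) # 6.0, rebuilt from the two parts

# The same idea in base ten: a mantissa and a power of ten.
print(format(123456.0, ".3e"))        # 1.235e+05
```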
| Fiber
Optics |
Transmitting
with light pulses over cables made from thin strands of glass. |
| Field Separator |
A code, usually a comma, that separates the fields in a record.
(Also, a "delimiter") |
| Field or Data Field |
A name for an
individual piece of standardized data to be extracted from an image
collection. Fields can be the author
of a document, a recipient, the date of a document or any other piece of data
common to most documents in an image collection. |
| Flatbed Scanner |
A flat-surface
scanner that allows users to input books and other documents. |
| Folder Browser |
A system of
on-screen folders (usually hierarchical or “stacked”) used to organize
documents. For example, the File Manager program in Microsoft Windows is a
type of folder browser that displays the directories on your disk. |
| Forensics |
In document
management terms, forensic work is comprised of:
- Recreating “deleted” or missing files from hard drives
- Validating dates and logged in authors / editors of documents
- Certifying key elements of documents and/or hardware for legal purposes
|
| Forms Processing |
A specialized
imaging application designed for handling pre-printed forms. Forms processing
systems often use high-end (or multiple) OCR engines and elaborate data
validation routines to extract hand-written or poor quality print from forms
that go into a database. This type of imaging application faces major
challenges, since many of the documents scanned were never designed for
imaging or OCR. |
| Forms
Routing |
The
process of routing a form throughout an organization electronically –
without any paper copies. |
| FTP |
File
Transfer Protocol. An Internet protocol to move files from one computer to
another. |
| Full
Duplex |
Data
communications devices which allow full speed transmission in both directions
at the same time. |
| Full
Text Search |
The
ability to search a data file for specified key(s) defined by the occurrence
of words, numbers and/or combinations or patterns thereof. |
| Full-text Indexing and Search |
Enables the
retrieval of documents by either their word or phrase content. Every word in
the document is indexed into a master word list with pointers to the
documents and pages where each occurrence of the word appears. |
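The master word list with pointers can be sketched as a small inverted index. The document names, the sample text and the function names `build_index` and `search` are all hypothetical. A production system would also handle stemming, stop words and phrase queries.

```python
import re
from collections import defaultdict

def build_index(documents):
    """documents: {doc_id: [page_text, ...]} -> {word: [(doc_id, page_no), ...]}"""
    index = defaultdict(list)
    for doc_id, pages in documents.items():
        for page_no, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].append((doc_id, page_no))
    return index

def search(index, word):
    """Return the distinct (document, page) pointers for a word."""
    return sorted(set(index.get(word.lower(), [])))

docs = {
    "memo": ["Invoice attached", "See invoice 42"],
    "letter": ["Thank you"],
}
index = build_index(docs)
print(search(index, "Invoice"))  # [('memo', 1), ('memo', 2)]
print(search(index, "missing"))  # []
```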
| Fuzzy Logic |
A full-text search
procedure that looks for exact matches as well as similarities to the search
criteria, in order to compensate for spelling or OCR errors. |
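A minimal sketch of a search that tolerates OCR errors, using `difflib.get_close_matches` from the Python standard library. The word list, the 0.8 similarity cutoff and the name `fuzzy_find` are assumptions for illustration. Commercial search engines use their own matching rules.

```python
import difflib

def fuzzy_find(term, indexed_words, cutoff=0.8):
    """Return indexed words that match the term exactly or closely, best first."""
    return difflib.get_close_matches(term.lower(), indexed_words, n=5, cutoff=cutoff)

# "inv0ice" stands in for an OCR misread of "invoice" (zero instead of the letter o).
words = ["invoice", "inv0ice", "memo"]
print(fuzzy_find("invoice", words))  # ['invoice', 'inv0ice']
```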
| GIF |
A
compressed file format used by the CompuServe system for photographs. Limited
to 256 colors. |
| GIF |
CompuServe’s
native file format for storing images. |
| Gigabyte |
A
billion bytes or 1,000 megabytes (See "BYTE"). |
| Gigabyte |
One billion bytes.
Also expressed as one thousand megabytes. In terms of image storage capacity,
one gigabyte equals approximately 17,000 8 1/2" x 11" pages scanned
at 300 dpi, stored as TIFF Group IV images. |
| Gray Scale |
The use of many shades of gray to represent an image. Continuous-tone
images, such as black-and-white photographs, use an almost unlimited number
of shades of gray. Conventional computer hardware and software, however, can
only represent a limited number of shades of gray (typically 16 or 256).
Gray-scaling is the process of converting a continuous-tone image to an image
that a computer can manipulate. |
| Gray
Scale |
The
binary range of a graphic representation between pure black and pure white. A
scale of 256 shades of gray will be a better representation than 16 shades. |
| Grayscale |
See
“Scale-to-Gray.” |
| Groupware |
Software
designed to operate on a network and allow several people to work together on
the same documents and files. |
| GUI |
Graphical
User Interface, or "gooey". Presenting an interface to the computer
user comprised of pictures and icons, rather than words and numbers. |
| Half
Duplex |
Transmission
systems which can send and receive, but not at the same time. |
| Halftone |
The
graphic representation of an object by dots, which simulate continuous tones.
Usually used to represent or replicate an original photograph input. |
| Halftone dots |
Vary in size; larger appear darker, smaller appear
lighter. |
| HD |
High
Density (Floppy Disks) – A 5.25" holds 1.2 MB and a 3.5" holds 1.4
MB. |
| Hexadecimal |
A
number system with a base of 16 (2^4), so each digit represents 4 bits. The position digits are 0-9, A-F, where F equals the
decimal value, 15. |
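Because each hexadecimal digit stands for four bits, conversion between the two is a digit-by-digit lookup. The helper name `nibbles` below is hypothetical.

```python
def nibbles(hex_string):
    """Return the 4-bit pattern for each hexadecimal digit."""
    return [format(int(digit, 16), "04b") for digit in hex_string]

print(int("F", 16))        # 15
print(format(255, "X"))    # FF
print(nibbles("A5"))       # ['1010', '0101']
```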
| Hierarchical Storage Management (HSM) |
Software that
automatically migrates files from on-line to near-line storage media, usually
on the basis of the age or frequency of use of the files. |
| Hollerith |
Encoded data on aperture cards or old-style
punch cards. |
| Host |
In
a network, the central computer which controls the remote computers and holds
the central databases. |
| HP-PCL
& HPGL |
Hewlett-Packard
graphics file formats. |
| HTML |
Hypertext Markup Language, developed at CERN in Geneva, Switzerland. The
document standard of choice for the Internet. (HTML+ adds support for
multi-media.) |
| Hub |
A
central unit that repeats and/or amplifies data signals being sent across a
network. |
| Icon |
In
a GUI, a picture or drawing which is activated by "clicking" a
mouse to command the computer program to perform a predefined series of
events. |
| ICR |
Intelligent
Character Recognition. The conversion of scanned images (bar codes or
patterns of bits) to computer recognizable codes (ASCII characters and files)
by means of software/programs which define the rules of and algorithms for
conversion. |
| ICR |
Intelligent Character Recognition. A software process
that recognizes handwritten and printed text as alphanumeric characters. |
| IDE |
Integrated
Drive Electronics – An engineering standard for interfacing PC's and hard
disks. |
| IEEE |
Institute
of Electrical and Electronics Engineers. An international association which
sponsors meetings, publishes a number of journals and establishes standards. |
| Image Enabling |
A software
function that creates links between existing applications and stored images. |
| Image Key |
The name of a file
created when a page is scanned in a collection. |
| Image
Processing |
To
capture an image or representation, enter in a computer and process and
manipulate it. |
| Image Processing Card (IPC) |
A board mounted in
either the computer, scanner or printer that facilitates the acquisition and
display of images. The primary function of most IPCs is the rapid compression
and decompression of image files. |
| Index |
Creating
a set of rules and data files which define scanned document sets and allow
easy and complete retrieval. |
| Index/Coding Fields |
Database fields
used to categorize and organize documents. Often user-defined, these fields
can be used for searches. |
| Indexing |
Universal term for
Coding and Data Entry |
| Interlaced |
TV
& CRT pictures must constantly be "refreshed". Interlacing refreshes
every other line on each refresh cycle. Since only half the
information displayed is updated each cycle, interlaced displays are less
expensive than "non-interlaced" displays. However, interlaced displays are
subject to jitter. The human eye/brain can usually detect flicker in displayed images
which are completely refreshed fewer than 30 times per second. |
| Internet |
A
worldwide computer network containing a broad array of services and
information available to any individual with a PC and the paid connection. |
| Internet Publishing |
Specialized
imaging software that allows large volumes of paper documents to be published
on the Internet or intranet. These files can be made available to other
departments, offsite colleagues or the public for searching, viewing and
printing. |
| IPX/SPX |
Communications
protocol used by Novell networks. |
| ISA |
Industry
Standard Architecture. |
| ISDN |
Integrated
Services Digital Network. An all digital network which can carry data,
video and voice. |
| ISIS and TWAIN Scanner Drivers |
Specialized
applications used for communication between scanners and computers. |
| ISO |
International
Organization for Standardization. |
| ISO 9660 CD Format |
The International
Organization for Standardization format for creating CD-ROMs that can be read
worldwide. |
| JMS |
Jukebox
Management Software. |
| JPEG |
Joint
Photographic Experts Group. A compression algorithm for still images; see also MPEG. |
| JPEG |
An image
compression format used for storing color photographs and images. |
| Jukebox |
A mass storage
device that holds optical disks and loads them into a drive. |
| Juke-Box |
Automated
disk changer for high-performance, centralized storage for multifunction
CD-ROM's & optical disks |
| K |
Generally
accepted as shorthand for 1,000. Actually stands for 2^10 or 1,024. |
| Kerning |
Adjusting
the spacing between two letters from the "normal" spacing. Often
done to enhance the quality of the typography – for instance in a headline. |
| Key Field |
Database fields
used for document searches and retrieval. Synonymous with “index field.” |
| Keywords |
Used in
bibliographical coding to indicate that each page in a collection must be
reviewed for certain important words and wherever they occur the database
must reference the page where they occur. |
| Kofax
Board |
The
generic term for a series of image processing boards manufactured by Kofax
Imaging Processing. These are used between the scanner and the computer, and
perform realtime image compression and decompression for faster image
viewing, image enhancement, and corrections to the input to account for
conditions such as document misalignment, "speckles," etc. |
| LAN |
Local
Area Network – usually a collection of PC's, connected by cable. |
| Landscape
Mode |
The image is represented on the page or monitor such that the width is
greater than the height. |
| Laser
Disk |
Same
as an optical CD, except 12" in diameter. |
| Latency |
The
time it takes to read a disk (or jukebox), including the time to physically
position the media under the read/write head, seek the correct address and
transfer it. |
| Leading/"Ledding" |
The
amount of space between lines of printed text. |
| Level Coding |
Used in
Bibliographical coding to indicate that certain document types will get a more
thorough extraction of data than others.
Thus they get a deeper “level” of coding. |
| Line
Screen |
The
number of half-tone dots that can be printed per inch. As a general rule,
newspapers print at 65 to 85 lpi, large city newspapers at 100 or 120 lpi; magazines
at 133 or 150 lpi; and, glossy, "coffee table" books at 175 to 200. |
| Load file |
A file that
relates to a set of scanned images and indicates where individual pages
belong together as documents. |
| Lossless compression |
Exact reconstruction of the image, bit-by-bit, with no loss of
resolution or color fidelity. |
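The bit-by-bit property can be checked with a round trip. The sketch below uses Python's standard `zlib` module, which implements DEFLATE rather than LZW or any other scheme named in this glossary; it is only a stand-in for "some lossless codec". The pixel data is a made-up example.

```python
import zlib


def round_trip(data: bytes) -> bytes:
    """Compress with zlib (DEFLATE, a lossless scheme) and decompress again."""
    return zlib.decompress(zlib.compress(data))


# Hypothetical stand-in for image data: 1000 white pixels, then 1000 black.
pixels = bytes([255] * 1000 + [0] * 1000)

restored = round_trip(pixels)
print(restored == pixels)                        # True: every bit is recovered
print(len(zlib.compress(pixels)) < len(pixels))  # True: repetitive data shrinks
```

A lossy codec would fail the first check by design, since it discards detail to save space.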
| Lossy compression |
Reduces storage size of image by reducing the resolution
and color fidelity while maintaining minimum acceptable standard for general
use. |
| LZW |
Lempel-Ziv-Welch.
A common, lossless compression standard for computer graphics –
used for the majority of TIFF files. Typical compression ratios are 4:1. |
| Magenta |
Used
in four color printing. Reflects blue & red and absorbs green. |
| Magneto-Optical Drive |
A drive that
combines laser and magnetic technology to create high-capacity erasable
storage. |
| MAPI |
Messaging Application
Programming Interface. This Windows software standard has become a popular e-mail
interface and is used by MS Exchange, GroupWise, and other e-mail packages. |
| Near-Line |
Documents stored
on optical disks or compact disks that are housed in the jukebox or CD
changer and can be retrieved without human intervention. |
| Marginalia |
Handwritten notes
in the margin of the page in documents. |
| Mastering |
Making
many copies of a CD-ROM from a single master. |
| MCA |
Micro
Channel Architecture – an IBM bus standard. |
| MDE |
Magnetic
Disk Emulation. Software that makes a jukebox look and operate like a
hard-drive such that it will respond to all the I/O commands ordinarily sent
to a hard drive. |
| Megabyte |
A unit of information or computer storage equal to approximately one million bytes. A megabyte is commonly abbreviated as MB and sometimes meg. |
| Meta Data |
The data that is
attached to files in a computerized filing system. For instance, in a word processing document, the metadata
includes: the author, date created, person and date editing the document, the
name of the document, the location stored on a hard drive, how many times and
when it has been accessed, changed or altered, etc. |
| MICR |
Magnetic
Ink Character Recognition. The process used by banks to encode checks. |
| Microfilm |
Film on which documents etc. are photographically greatly reduced
in size. |
| Microfiche |
Reduced
sized document(s) filed on sheet microfilm (4" by 6"), containing
reduced images of 270 pages or more in a grid pattern. Usually with a
human-readable title. |
| MO |
Magneto-Optical.
A disk storage technology which competes with traditional magnetic hard
disks. Form factors are 3.5", 5.25" and 12". Advantages are
that one 5.25" MO drive can store about 1.3GB (3 1/2" hold up to
230MB); media is removable and portable; and, can last for 20 years – ideal
for archival storage. The disadvantages are cost, traditionally slower disk
access and longer disk write times. The information is written on the disk by
changing the polarity with strong magnets and read by a laser by sensing the
magnetic flux changes (1's or 0's). This technology is re-usable. |
| MODEM |
Modulator/Demodulator.
A device which can take digital data from a computer, translate it into
analog signals (tones) and transmit the information over telephone lines.
Another modem at the receiving computer will receive the information,
translate it back from analog to digital and store it. Typical speeds are
from 1,200 to 14,400 bits per second. Some modems also correct any errors
which occur in the transmission process. |
| Monochrome |
Displays
capable of only two colors, usually black & white. |
| Mosaic |
A program used
for finding and reading documents on the World-Wide-Web. |
| MPEG-1
& 2 |
Two
different standards for full motion video to digital
compression/decompression techniques advanced by the Moving Picture Experts
Group. MPEG-1 compresses the bandwidth needed for 30 frames/second of
full-motion video (several hundred megabytes) down to about 1.5 Mbits/sec.
MPEG 2 only compresses to about 3 Mbits and provides for better image quality
when comparing compressed files of the same size. This industry application
competes with other compression techniques, known as JPEG, Captain Crunch,
Cinepak and Indeo. |
| MS-DOS |
Microsoft
(MS)-Disk Operating System. Used in PC's as the control system. |
| MTBF |
Mean
Time Between Failure. Average time between failures. Used to compute the
reliability of devices/equipment. |
| MTTR |
Mean
Time To Repair. Average time to repair. The higher the number, the more
costly and difficult to fix. |
| Multisynch |
Analog
video monitors which can receive a wide range of display resolutions, usually
including TV (NTSC). Color analog monitors accept separate red, green &
blue (RGB) signals. |
| NetWare Loadable Module (NLM) |
An application
that runs as part of the network operating system (NOS) of a Novell NetWare server. |
| Non-Interlace |
When
each line of the video image is scanned separately. Computer monitors use
non-interlaced video. |
| NT |
New
Technology. Refers to Microsoft Windows NT server and workstation software. |
| NTSC |
National
Television System Committee. The North American TV standard – analog, 525
lines @ 30 frames per second. TV's line scan rate is then 15,750 lines per
second (525 lines @ 30 Hz). |
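The line scan rate quoted above is a simple product of lines per frame and frames per second. The following Python sketch reproduces that arithmetic; the function name `line_scan_rate` is illustrative, and the second call uses the 625-line, 25-frame figures given in this glossary's PAL entry.

```python
def line_scan_rate(lines_per_frame: int, frames_per_second: int) -> int:
    """Return the number of lines scanned per second."""
    return lines_per_frame * frames_per_second


print(line_scan_rate(525, 30))  # 15750 (NTSC)
print(line_scan_rate(625, 25))  # 15625 (PAL)
```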
| OCR |
Optical
Character Recognition. The computer conversion of scanned input images (bar
codes or patterns of bits) to computer recognizable codes (ASCII letters,
numbers and characters). |
| OCR |
Optical Character
Recognition. A software process that recognizes printed text as alphanumeric
characters. |
| OEM |
Original
Equipment Manufacturer – Classically, a company who buys products from
another company, re-labels the products under its own name and re-sells
(usually in large quantities). Has come to define nearly any large customer
who re-sells products, branded or not. |
| Off-Line |
Archival documents
stored on optical disks or compact disks that are not connected or installed
in the computer, but instead require human intervention to be accessed. |
| OLE |
Object
Linking and Embedding. A feature in Microsoft's Windows which allows each
section of a compound document to call up its own editing tools or special
display features. This allows for combining diverse elements in compound
documents. |
| OWR |
Optical word recognition is the next generation of text conversion. Available only with certain retrieval systems (including AmDoc's iCONECTnxt offering), OWR allows users to find misspelled words. Matches are done on word-by-word basis (entire string of characters) rather than on a letter-by-letter match. This allows more accurate recognition. |
| On-Line |
Documents stored
on the hard drive or magnetic disk of a computer that are available
immediately. |
| Optical Disks |
Computer media
similar to a compact disc that cannot be rewritten. An optical drive uses a
laser to read the stored data. |
| Optical Jukebox |
See “Jukebox.” |
| PackBits |
A
compression scheme which originated with the Macintosh. Suitable only for
black & white. |
| Packet |
A
fixed block of data transmission which also contains identity and routing
information. |
| Page |
A single image of
one piece of paper. One or
several pages make up a “Document.” |
| PAL |
Phase
Alternating Line, the TV standard used in most of Europe. PAL uses 625
lines per frame and 25 frames per second – versus 30 for NTSC, resulting in
more flicker. |
| Paper
Styles & Definitions. |
a. Acid Free Paper – Won't change color (yellow) for many years.
b. Brightness – The percentage of light the paper reflects. Most white
papers reflect 60% to 90%.
c. Coated Papers – "glossy" paper, coated with clay.
d. Cotton "Rag" Paper – Premium paper with 25% to 100% cotton
fibers.
e. Laid finish – Paper surface embossed with lines to resemble handmade
paper.
f. Ream – 500 sheets.
g. Vellum finish – A less smooth version of real vellum (fine parchment).
h. Wove finish – Very smooth surface. Characteristic of the majority of
papers made
|
| Parallel |
Transmission
of all the bits (e.g. in a character) at the same time. If the character has
eight bits, there are eight wires. Faster and more expensive than serial
where the eight bits would be sent, "sideways", one at a time. |
| Pattern Recognition |
An electronic
application utilizing an algorithm that searches data for like patterns and
flags or extracts the pertinent data.
For instance, in looking for addresses, alpha characters followed by a
comma and a space followed by two capital alpha characters followed by a
space followed by five or more digits are usually the city, state and zip
code. By programming the application
to look for that pattern, the information can be electronically extracted
rather than re-keyed by human intervention. |
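The city, state and zip code pattern described above maps naturally onto a regular expression. The Python sketch below is one possible rendering, not a production address parser. The name `CITY_STATE_ZIP`, the helper `extract_addresses` and the sample sentence are all hypothetical, and the pattern deliberately handles only single-word city names.

```python
import re

# Alpha characters, a comma and a space, two capital letters, a space,
# then five or more digits. Multi-word cities are not handled here.
CITY_STATE_ZIP = re.compile(r"\b([A-Za-z]+), ([A-Z]{2}) (\d{5,})\b")


def extract_addresses(text: str) -> list:
    """Return (city, state, zip) tuples for every match found in the text."""
    return CITY_STATE_ZIP.findall(text)


sample = "Ship to Springfield, IL 62704 by Friday."
print(extract_addresses(sample))  # [('Springfield', 'IL', '62704')]
```

Because the match is purely structural, text that merely resembles an address will also be flagged, which is why the definition above says such patterns are "usually" the city, state and zip code.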
| PCI |
Peripheral
Component Interconnect. A high-speed local bus used
to support multimedia devices. Promoted by Digital among others. |
|