WeScanFiles.com Glossary

Access Rights
A security mechanism that lets the system administrator determine which objects (folders, documents, etc.) users can open. It should be possible to set access rights for both groups and individuals.

ADF
Automatic Document Feeder; the mechanism by which a scanner feeds paper documents.

Annotations
The changes or additions made to a document using sticky notes, a highlighter or other electronic tools. Document images or text can be highlighted in different colors, redacted (blacked-out) or stamped (e.g., FAXED or CONFIDENTIAL), or have electronic sticky notes attached. Annotations should be overlaid and not alter the original document.

ASCII
American Standard Code for Information Interchange; used to define computer text built on a set of 128 alphanumeric and control characters. ASCII has been a standard, non-proprietary text format since 1963.

ASP Active Server Pages
Technology that simplifies customization and integration of Web applications; ASPs reside on a Web server and contain a mixture of HTML code and server-side scripts. An example of ASP usage is having a server accept a request from a client, perform a query on a database and then return the results of the query in HTML format for v

Audit Trail
An electronic means of tracking all access to a system, document or record, including the modification, deletion and addition of documents and records.


Bar Code
A small pattern of lines, read by a laser or an optical scanner, that corresponds to a record in a database. An add-on component to document management software, bar-code recognition is designed to increase the speed with which documents can be stored or archived.

Batch Processing
The name of the technique used to input a large amount of information in a single step, as opposed to individual processes.

BMP
The abbreviation for a native file format of Windows for storing images called bitmaps.

Boolean Logic
The use of the terms AND, OR and NOT in conducting searches.
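
As a minimal sketch of how AND, OR and NOT combine in a search, the Python example below filters a few made-up documents by word. The document names, the sample text and the `search` helper are assumptions for illustration only; a real document management system would query an index rather than scan the text.

```python
# Hypothetical sample documents; names and text are invented for illustration.
documents = {
    "invoice.txt": "scanned invoice for office paper",
    "memo.txt": "internal memo about scanner maintenance",
    "report.txt": "annual report on paper usage and scanner costs",
}


def words(text):
    """Return the set of lowercase words in a piece of text."""
    return set(text.lower().split())


def search(docs, all_of=(), any_of=(), none_of=()):
    """Return the sorted names of documents matching the AND, OR and NOT terms."""
    hits = []
    for name, text in docs.items():
        found = words(text)
        if not all(term in found for term in all_of):  # AND: every term required
            continue
        if any_of and not any(term in found for term in any_of):  # OR: at least one
            continue
        if any(term in found for term in none_of):  # NOT: none may appear
            continue
        hits.append(name)
    return sorted(hits)


paper_and_scanner = search(documents, all_of=["paper", "scanner"])
invoice_or_memo = search(documents, any_of=["invoice", "memo"])
scanner_not_memo = search(documents, all_of=["scanner"], none_of=["memo"])
```

Here `paper_and_scanner` holds only the report, `invoice_or_memo` holds the invoice and the memo, and `scanner_not_memo` again holds only the report.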


Caching (of Images)
The temporary storage of image files on a hard disk for later migration to permanent storage.

CD-R
Short for CD-Recordable; a CD that can be written (or burned) only once. It can be copied as a means to distribute a large amount of data. CD-Rs can be read on any CD-ROM drive, whether on a standalone computer or a network system, which makes interchange between systems easier.

CD-ROM
Compact Disc-Read Only Memory; a disc that is written on a large scale rather than on a standard computer CD burner (CD writer).

Client-Server Architecture vs. File-Sharing
Two common application software architectures found on computer networks. With file-sharing applications, all searches occur on the workstation, while the document database resides on the server. With client-server architecture, CPU-intensive processes (such as searching and indexing) are completed on the server, while image viewing occurs on the client. File-sharing applications are easier to develop, but they tend to generate more network data traffic in document management applications. They also expose the database to corruption through workstation interruptions. Client-server applications are more difficult to develop, but dramatically reduce network data traffic and insulate the database from workstation interruptions. See also n-Tier Architecture.

COLD
Computer Output to Laser Disc is a process that outputs electronic records and printed reports to laser disc instead of a printer. It can be used to replace COM (Computer Output to Microfilm) or printed reports.

COM
Computer Output to Microfilm is a process that outputs electronic records and computer-generated reports to microfilm.


Deshading
Removing shaded areas to render images more easily recognizable by OCR.

Deskewing
The process of straightening skewed (off-center) images. Documents can become skewed when they are scanned or faxed. Deskewing is one of the image enhancements that can improve OCR accuracy.

Despeckling
The removal of isolated speckles from an image file. Speckles can develop when a document is scanned or faxed.
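
One simple way to picture despeckling is sketched below in Python, under the assumption that the image is a black-and-white grid (1 for black, 0 for white) and that a speckle is a black pixel with no black neighbors. The `despeckle` function and the sample `page` are illustrative only; production imaging software typically uses more tolerant filters.

```python
def despeckle(image):
    """Return a copy of a binary image with isolated black pixels removed.

    `image` is a list of rows; 1 is black and 0 is white.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Look at the eight surrounding pixels for another black pixel.
            has_black_neighbor = False
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width and image[ny][nx] == 1:
                        has_black_neighbor = True
            if not has_black_neighbor:
                cleaned[y][x] = 0  # a lone dot: treat it as a speckle
    return cleaned


page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # lone speckle
    [0, 0, 0, 1, 1],  # part of a pen stroke
    [0, 0, 0, 1, 1],
]
result = despeckle(page)
```

The lone pixel in the second row is cleared, while the connected block of black pixels is left untouched.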

Dithering
The process of converting grays to different densities of black dots, usually for the purposes of printing or storing color or grayscale images as black and white images.
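
The idea of turning a gray level into a density of black dots can be sketched with ordered dithering, one of several dithering techniques. The Python example below assumes gray values from 0 (black) to 255 (white) and a 2x2 threshold matrix; the function names and the sample images are assumptions for illustration.

```python
# 2x2 Bayer threshold matrix used for ordered dithering.
BAYER_2X2 = [[0, 2], [3, 1]]


def ordered_dither(gray):
    """Convert a grid of gray values (0 = black, 255 = white) to dots.

    Returns a grid of the same size where 1 is a black dot and 0 is white.
    """
    dots = []
    for y, row in enumerate(gray):
        dot_row = []
        for x, value in enumerate(row):
            # Each position in the repeating 2x2 tile has its own threshold.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            dot_row.append(1 if value <= threshold else 0)
        dots.append(dot_row)
    return dots


def black_density(dots):
    """Return the fraction of pixels that are black."""
    total = sum(len(row) for row in dots)
    return sum(sum(row) for row in dots) / total


mid_gray = [[128] * 4 for _ in range(4)]
dark_gray = [[64] * 4 for _ in range(4)]
mid_density = black_density(ordered_dither(mid_gray))
dark_density = black_density(ordered_dither(dark_gray))
```

With these inputs the medium gray becomes a checkerboard with half of its pixels black, and the darker gray ends up with three quarters of its pixels black.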

Document Management
Software used to store, manage, retrieve and distribute digital and electronic documents, as well as scanned paper documents.

Duplex Scanners vs. Double-Sided Scanning
Duplex scanners automatically scan both sides of a double-sided page, producing two images at once. Double-sided scanning uses a single-sided scanner to scan pages, scanning one collated stack of paper, then flipping it over and scanning the other side.


Flatbed Scanner
A flat-surface scanner that allows users to capture pages of bound books and other nonstandard-format documents.

Folder Browser
A system of on-screen folders (usually represented hierarchically, or stacked) used to organize documents. For example, the Windows Explorer program in Microsoft Windows is a type of folder browser that displays the directories on your disk.

Forms Processing
A specialized document management application designed for handling preprinted forms. Forms processing systems often use multiple OCR engines and elaborate data validation routines to extract hand-written or poor quality print from forms to go into a database. With this type of application, it is essential to have good quality assurance mechanisms in place, since many of the forms that are commonly scanned were never designed for imaging or OCR.

Full-Text Indexing and Search
Enables the retrieval of documents by either word or phrase content; every word in the document is indexed into a master word list with pointers to the documents and pages where each occurrence of the word appears.
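
The master word list described above is often called an inverted index. A minimal Python sketch follows, assuming each document is supplied as a list of page texts; the file names, the sample text and the `build_index` helper are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Build a word list with pointers to documents and pages.

    `documents` maps a document name to a list of page texts.
    Returns a dict mapping each word to a sorted list of (name, page) pairs.
    """
    index = defaultdict(set)
    for name, pages in documents.items():
        for page_number, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((name, page_number))
    return {word: sorted(locations) for word, locations in index.items()}


# Hypothetical two-document library, used only for illustration.
library = {
    "contract.tif": ["This lease begins in May.", "The lease ends in April."],
    "letter.tif": ["Please sign the lease."],
}
index = build_index(library)
lease_hits = index["lease"]
april_hits = index["april"]
```

Looking up `lease` returns both pages of the contract and the first page of the letter, while `april` points only to page 2 of the contract.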

Fuzzy Logic
A full-text search procedure that looks for exact matches as well as similarities to the search criteria, in order to compensate for spelling or OCR errors.
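
As a rough sketch of similarity matching, the Python example below uses the standard library's `difflib.get_close_matches` to find indexed words that resemble a search term. The vocabulary, the `fuzzy_lookup` helper and the 0.8 similarity cutoff are assumptions chosen for illustration; commercial search engines use their own matching methods.

```python
import difflib

# Hypothetical list of words taken from a full-text index.
vocabulary = ["invoice", "receipt", "contract", "statement"]


def fuzzy_lookup(term, words, cutoff=0.8):
    """Return up to three indexed words that closely resemble the search term."""
    return difflib.get_close_matches(term.lower(), words, n=3, cutoff=cutoff)


# "lnvoice" imitates a common OCR error: a lowercase L read in place of an i.
ocr_error_hits = fuzzy_lookup("lnvoice", vocabulary)
exact_hits = fuzzy_lookup("receipt", vocabulary)
no_hits = fuzzy_lookup("zzzz", vocabulary)
```

The misread term still finds `invoice`, an exact term finds itself, and an unrelated string finds nothing. Lowering the cutoff returns more candidates at the cost of more false matches.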


GIF
Graphics Interchange Format is a file format for storing images.

ICR
Intelligent Character Recognition is a software process that recognizes handwritten and printed text as alphanumeric characters.

Image Enabling
Allows for fast, straightforward manipulation of an imaging application through third-party software. For example, image enabling allows for launching the imaging client interface, displaying search results in the client and bringing up the scan dialogue box, all from within a third-party application.

Index Fields
Database fields used to categorize and organize documents; often user-defined, these fields can be used for searches.

Internet Publishing
Specialized document management software that allows large volumes of paper documents to be published on the Internet or intranet. These files can be made available to other departments, offsite colleagues or the public for searching, viewing and printing.

ISIS and TWAIN Scanner Drivers
Specialized applications used for communication between scanners and computers.


JPEG
Joint Photographic Experts Group (JPEG or JPG) is an image-compression format used for storing color photographs and images.

Key Field
A database field used for document searches and retrieval; synonymous with index field.

MFP
Multifunction Printer or Multifunctional Peripheral is a device that performs any combination of scanning, printing, faxing, or copying.

OCR
Optical Character Recognition (OCR) is a software process that recognizes printed text as alphanumeric characters. OCR enables full-text searches of documents and records.

Open Architecture
Applied to hardware or software whose design allows for a system to be easily integrated with third-party devices and applications.


Pixel
Picture Element is a single dot in an image. It can be black and white, grayscale or color.

Portable Volumes
A feature that facilitates the transfer of large volumes of documents without the need to copy multiple files. Portable volumes enable individual CDs to be easily regrouped, detached and reattached to different databases for a broader information exchange.


Raster/Rasterized (Raster or Bitmap Drawing)
A method of representing an image with a grid (or map) of dots or pixels. Typical raster file formats are GIF, JPEG, TIFF, PCX, BMP, etc.

Record
Information, regardless of medium, that constitutes evidence of an organization’s business transactions.

Redaction
A type of document annotation that provides additional security by concealing from view specific portions of sensitive documents, such as particular words or phrases. Like all annotations in a document management system, redactions should be image overlays that protect information but do not alter original document images. An area of an image file that is selected for specialized processing.

Retention Period
The length of time that a record must be kept before it can be destroyed. Records not authorized for destruction are designated for permanent retention.
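
A retention period can be turned into an earliest destruction date with simple date arithmetic. The Python sketch below assumes the period is counted in whole years from the record's creation date and that `None` stands for permanent retention; both conventions, and the function name, are assumptions, since actual retention schedules are set by policy or regulation.

```python
from datetime import date


def earliest_destruction_date(created, retention_years):
    """Return the first date a record may be destroyed, or None if permanent.

    Assumes the retention period is counted in whole years from `created`.
    """
    if retention_years is None:
        return None  # not authorized for destruction: keep permanently
    target_year = created.year + retention_years
    try:
        return created.replace(year=target_year)
    except ValueError:
        # Created on February 29 and the target year is not a leap year.
        return created.replace(year=target_year, month=3, day=1)


seven_year_record = earliest_destruction_date(date(2018, 6, 15), 7)
leap_day_record = earliest_destruction_date(date(2020, 2, 29), 7)
permanent_record = earliest_destruction_date(date(2018, 6, 15), None)
```

A record created on June 15, 2018 with a seven-year period becomes eligible on June 15, 2025, and a permanent record never receives a destruction date.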


Scalability
The capacity of a system to scale up, or expand, in terms of document capacity or number of users without requiring major reconfiguration or re-entry of data. For a document management system to be scalable, it must be easy to configure multiple servers or add storage.

Scanner
An input device commonly used to convert paper documents into computer images. Scanner devices are also available to scan microfilm and microfiche.

Security Markings or Tags
Within records management applications, a security-based metadata field intended to define and restrict access, as well as facilitate classification and retrieval.

SCSI Scanner Interface
The device used to connect a scanner with a computer.

SQL
Structured Query Language is a popular standard for running database searches (queries) and reports.
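
To show what a query and a report look like, the Python sketch below runs two SQL statements against a throwaway in-memory SQLite database. The `documents` table, its columns and the sample rows are invented for illustration and do not describe any particular product's schema.

```python
import sqlite3

# In-memory database with a hypothetical table of stored documents.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, title TEXT, department TEXT, pages INTEGER)"
)
conn.executemany(
    "INSERT INTO documents (title, department, pages) VALUES (?, ?, ?)",
    [
        ("Q1 Invoice", "Finance", 3),
        ("Hiring Memo", "HR", 1),
        ("Annual Budget", "Finance", 12),
    ],
)

# A search (query): all Finance documents, sorted by title.
finance_docs = conn.execute(
    "SELECT title, pages FROM documents WHERE department = ? ORDER BY title",
    ("Finance",),
).fetchall()

# A report: document count and total pages per department.
department_report = conn.execute(
    "SELECT department, COUNT(*), SUM(pages) FROM documents "
    "GROUP BY department ORDER BY department"
).fetchall()

conn.close()
```

The query returns the two Finance documents in title order, and the report lists two documents totaling 15 pages for Finance and one single-page document for HR.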


Thumbnails
Small versions of an image used for quick overviews that give a general idea of what an image looks like.

TIFF
Tagged Image File Format is a non-proprietary raster image format, introduced in 1986, which allows for several different types of compression. TIFFs may be either single or multipage files. A single-page TIFF is a single image of one page of a document. A multipage TIFF is a large, single file consisting of multiple document pages. Document management systems that store documents as single-page TIFFs offer significant benefits in network performance over multipage TIFF systems.

TIFF Group III (Compression)
A one-dimensional compression format for storing black and white images that is utilized by most fax machines.


Versioning
In document or records management applications, the ability to track new versions of documents after changes have been made.