New to Document Management

Don't panic! Everybody is new to start with. We have put together a brief description of the many options and terms used within our industry.

Hopefully this will make it easier for you to decide which areas of technology you might like to investigate further.

Documotive provide complete business solutions built from a number of core elements that improve the efficiency of document and process management.

Remember, once you have decided which options are best suited to your particular requirements, you can request further information, a quotation or a free initial consultation with one of our team.

Document Management Services

Document Scanning

Paper documents can be converted to industry-standard TIFF or PDF formats for electronic storage within a searchable database. Documents may be single or multi-page and can be indexed with a wide variety of search fields. Once scanned, the documents are indexed using the most appropriate references, and the flexible search criteria mean that documents can be accessed immediately from your PC workstation. The fully indexed database of information is normally transferred to CD-ROM or DVD-ROM, or can be hosted remotely for secure web-based access (please see the Hosting Services section for more details). Because CD-ROM and DVD-ROM are WORM (Write Once Read Many) media, on which records cannot be altered once written, this format of storage is legally admissible.

A full set of guidelines is available from BSI, in particular the publication BIP 0008-2:2005, Code of practice for legal admissibility and evidential weight of information communicated electronically. This code is primarily concerned with the authenticity, integrity and availability of electronically communicated information, to the demonstrable levels of certainty required by an organization. It is particularly applicable where this communicated information may be used as evidence in disputes inside and outside the legal system.


Forms Processing

This method of information capture is ideal for any organisation that receives high volumes of structured or unstructured paper forms. OCR/ICR/OMR technology allows the extraction of data directly from scanned images for output to any Line of Business (LoB) system. It is possible to read handwriting, machine-printed characters, tick boxes and bar codes to help reduce the amount of manual data entry...

Forms processing can be a very effective way of 'capturing' data from paper automatically. Forms can be read, verified, stored and processed much more efficiently, making the job of processing the information much easier.


OCR

OCR stands for Optical Character Recognition and is a method of reading machine printed characters from a scanned document image.

When a paper document is scanned it is converted to a 'dot file', or picture, of the original.

The most common format of image produced is TIFF (Tagged Image File Format).

Each character is represented by a series of dots (hence the term dots per inch), and during the OCR process the shape formed by the dots is matched to a character on your keyboard. A text file is then produced that contains the characters recognised on the page. The contents of the text file are then associated with the image so that a user can search for words that appear within the scanned document.
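As a rough illustration of matching dot shapes to characters, the Python sketch below compares a small scanned glyph against stored patterns and picks the closest. The 3x5 patterns and the function names are invented for this example; commercial OCR engines use far more sophisticated techniques.

```python
# Toy 3x5 dot patterns for three characters ('#' = dark dot, '.' = blank).
# These patterns are illustrative assumptions, not taken from any OCR engine.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def match_score(glyph, template):
    """Count the dots that agree between a scanned glyph and a template."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognise(glyph):
    """Return the character whose template agrees with the most dots."""
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))


# A slightly noisy scan of the letter "T" (one stray dot in the fourth row).
scanned = ["###", ".#.", ".#.", "##.", ".#."]
print(recognise(scanned))
```

The stray dot shows why image quality matters: the more dots that differ from the stored shape, the more likely the wrong character is chosen.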

The accuracy of the OCR process depends on which OCR 'engine' is used and the quality of the scanned image.


ICR

ICR stands for 'Intelligent Character Recognition' and is similar in process to OCR with some very clear differences.

ICR has built-in intelligence to read handwritten characters in addition to machine print. The main difference is the ability to apply intelligent rules to the processing of the character shapes. For example, a user can specify that a certain area on a document will contain a date and must therefore conform to the XX/XX/XXXX format. Additional rules for processing alphabetic, numeric or alphanumeric data can be applied to increase accuracy.

Data validation is an essential part of the process: a human is asked to validate, or check, that the data follows the rules applied.

In most cases the 'trust' level can be adjusted so that, once you are confident the system is obtaining accurate data, correctly read values are no longer flagged for checking unnecessarily.
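The field rules and trust level described above can be sketched in a few lines of Python. The field names, the patterns and the 0.90 threshold are illustrative assumptions, not settings from any particular ICR product.

```python
import re

# Hypothetical field rules: each zone on the form must match a pattern.
FIELD_RULES = {
    "date": re.compile(r"\d{2}/\d{2}/\d{4}"),  # XX/XX/XXXX (shape only)
    "account": re.compile(r"[0-9]+"),          # numeric only
    "surname": re.compile(r"[A-Za-z' -]+"),    # alphabetic only
}


def needs_review(field, value, confidence, trust_level=0.90):
    """Return True when a human should check the recognised value.

    A value is sent for review if it breaks the rule for its field or if
    the engine's confidence (0.0 to 1.0) falls below the trust level.
    """
    breaks_rule = FIELD_RULES[field].fullmatch(value) is None
    return breaks_rule or confidence < trust_level


print(needs_review("date", "12/03/2024", 0.97))   # follows the rule
print(needs_review("date", "12/O3/2024", 0.97))   # letter O read as a digit
print(needs_review("surname", "Smith", 0.80))     # low confidence
```

Lowering `trust_level` sends fewer values to the operator, which is reasonable only once the results have proved reliable.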


OMR

OMR stands for 'Optical Mark Recognition' and is the technology used in processing forms when tick boxes are used. The technology allows the automatic reading of tick boxes to output a file containing the choices selected within the form.

An example of the use of OMR technology is a form with multiple choice questions that are answered by ticking a box. Business rules can be applied to the collection of this data to allow only a single choice (e.g. 'Are you male or female?') or alternatively multiple choices (e.g. 'Which hobbies do you enjoy: football, reading, walking?').
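A minimal Python sketch of such a business rule is shown below. It assumes the OMR software has already reported which boxes were ticked; the function name and messages are invented for this example.

```python
def validate_marks(ticked, allow_multiple):
    """Check the ticked boxes for one question against its rule.

    ticked: list of option labels whose boxes were detected as ticked.
    allow_multiple: False for single-choice questions, True otherwise.
    Returns a (ok, message) tuple.
    """
    if not ticked:
        return False, "no box ticked"
    if not allow_multiple and len(ticked) > 1:
        return False, "more than one box ticked"
    return True, "ok"


# Single-choice question with two boxes ticked: rejected for checking.
print(validate_marks(["Male", "Female"], allow_multiple=False))

# Multiple-choice question: several ticks are acceptable.
print(validate_marks(["Football", "Walking"], allow_multiple=True))
```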

OMR technology is generally bundled with ICR technology applications.


Microfilm/Microfiche

A traditional format of storage that has been around for many years. Many people confuse Microfilm and Microfiche. They are essentially the same thing, although the format of presentation is different. Microfilm refers to a roll of film, usually 100 ft (30 m) long, on which approx. 2,000 - 2,500 images are stored. Microfiche is roll film cut into sections and then loaded into 'jackets'. The 'jackets' are then copied and this becomes the microfiche. Each microfiche card holds approx. 40 images. The card is indexed and therefore information can be accessed more quickly than from a roll.

Paper documents are filmed using equipment that reduces the size of the image, typically by a factor of 24 (24X reduction). The film is then developed and can be viewed using a Microfilm/Fiche Reader/Printer. Although Microfilm/Fiche is a space-efficient method of storage, it is not an ideal format for information that is constantly retrieved.

Microfilm/Fiche is a non-technology-dependent method of long-term storage.


COM

Computer Output to Microfilm is a fast-fading method of storage. It was typically used before COLD or COOL became available (see below).

This method of storage enables a computer file (e.g. invoices, statements and/or financial reports) to be taken directly from the computer output file and transferred on to microfiche. The images are normally reduced at a much greater rate than those filmed from paper documents and therefore more images can be stored per fiche.

Standard COM fiche will hold 270 A4 images per fiche, and a microfiche reader or reader/printer is required to view and/or print the images.
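To compare the capacities quoted in this section and in the Microfilm/Microfiche section, the short Python calculation below works out how many rolls or fiche an archive would need. The 10,000-image archive is a made-up figure, and the lower end of the roll film range is used.

```python
import math

IMAGES_PER_ROLL = 2000        # lower end of the 2,000 - 2,500 range
IMAGES_PER_FICHE = 40         # jacketed microfiche card
IMAGES_PER_COM_FICHE = 270    # standard COM fiche, A4 images


def media_needed(total_images, images_per_unit):
    """Number of rolls or fiche needed, rounding up for a part-filled unit."""
    return math.ceil(total_images / images_per_unit)


archive = 10_000  # hypothetical number of images to store

print("Rolls of microfilm:", media_needed(archive, IMAGES_PER_ROLL))
print("Microfiche cards:  ", media_needed(archive, IMAGES_PER_FICHE))
print("COM fiche:         ", media_needed(archive, IMAGES_PER_COM_FICHE))
```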


COLD

Computer Output to Laser Disk is the modern alternative to COM (see above). Computer files such as invoices, statements, credit notes and financial reports can be captured directly from the computer output file (normally in ASCII format - American Standard Code for Information Interchange) and stored on CD-ROM or DVD-ROM. During the capture process the information is automatically indexed using pre-set fields, in addition to full text search. The information can be displayed on a PC using a template of the original document (e.g. a blank invoice or statement), so that it appears identical to the original. This method of storage is very popular and saves space and retrieval time. Images can be accessed via any of the pre-defined search criteria or by any text values within the file.
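The Python sketch below illustrates the idea of indexing computer output by pre-set fields while still allowing a full text search. The sample output, the field labels and the use of a form feed character to separate documents are all assumptions made for this example; real output files and COLD products differ.

```python
import re

# Hypothetical ASCII print output holding two invoices, separated by a
# form feed character ("\f").
SPOOL = (
    "INVOICE NO: 10045\nCUSTOMER: ACME LTD\nTOTAL: 120.50\n\f"
    "INVOICE NO: 10046\nCUSTOMER: BRIGHT & CO\nTOTAL: 89.00\n"
)

# Pre-set index fields: field name -> pattern that captures its value.
INDEX_FIELDS = {
    "invoice_no": re.compile(r"INVOICE NO:\s*(\d+)"),
    "customer": re.compile(r"CUSTOMER:\s*(.+)"),
}


def index_spool(spool):
    """Split the output into documents and extract the index fields."""
    records = []
    for page in spool.split("\f"):
        if not page.strip():
            continue
        record = {"text": page}
        for name, pattern in INDEX_FIELDS.items():
            found = pattern.search(page)
            record[name] = found.group(1).strip() if found else None
        records.append(record)
    return records


def search(records, field=None, value=None, text=None):
    """Find records by an index field and/or by any text in the document."""
    hits = []
    for record in records:
        if field is not None and record.get(field) != value:
            continue
        if text is not None and text.lower() not in record["text"].lower():
            continue
        hits.append(record)
    return hits


records = index_spool(SPOOL)
print(search(records, field="invoice_no", value="10046")[0]["customer"])
print(len(search(records, text="acme")))
```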


COOL

Computer Output On Line is identical to COLD (see above), except that the information is not transferred to CD or DVD but is hosted on a server for access via a browser, hence the term 'On Line'.


Hosting Services

This service is the modern alternative to off-site document storage of paper records.

Web-enabled access to information is possible using a standard browser and, given the advances in communications, documents can be accessed almost instantaneously. Anyone who has access to the Internet from their PC can be granted access to information stored anywhere in the world. Security is paramount and many different options are available to ensure critical or confidential information is protected. Specialist data encryption methods are used to ensure that access is only provided to authorised users.

The 'virtual' document store has many advantages over paper-based storage in terms of access times. However, many document storage companies provide a 'scan-on-demand' service that allows clients to securely request files from storage for scanning and upload to the secure hosted service. A notification is sent to the requester when the file is available and they can access it via a secure link.


Intranet

An intranet is a private computer network that securely shares part of an organisation's information or operations with its employees. Sometimes an intranet is also referred to as a private internal website.

An intranet can be understood as "a private version of the Internet," or as a version of the Internet confined to an organisation.

Many organisations use the intranet to publish company news, documents or guidelines as well as other information.


Document Management Equipment

Scanners

There are lots of different document scanners available from many manufacturers. The quality, speed and durability differ between models as does the cost.

Some manufacturers have developed specific options and features that may be useful for specific tasks depending on your particular requirements.

When selecting a scanner it is important to take account of future developments in the use of your document management system, not just your immediate requirement.

In most cases, a few months after the initial installation of a scanning system, customers find additional types of information that would benefit from scanning, and this will have an impact on the workload of the scanner.

Many scanners offer sheet feeding, double-feed detection and varied options for scan quality and image output. In most cases standard paper documents can be scanned at between 200 and 300 dpi (dots per inch). The higher the resolution at which a document is scanned, the larger the file size. It is important to note that the scanner is only a vehicle to capture the paper document image, and other parts of the overall system play important roles. Most business documents can be scanned at 200 dpi.
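The effect of resolution on file size can be estimated with a short Python calculation. It assumes an A4 page (8.27 x 11.69 inches) and gives the uncompressed size of the image; scanned files are normally compressed, so real files are considerably smaller, but the relationship between resolution and size still holds.

```python
A4_INCHES = (8.27, 11.69)  # width, height of an A4 page in inches


def uncompressed_bytes(dpi, bits_per_pixel=1, page=A4_INCHES):
    """Approximate uncompressed image size in bytes for one page.

    bits_per_pixel: 1 for black and white, 8 for greyscale, 24 for colour.
    """
    width_px = round(page[0] * dpi)
    height_px = round(page[1] * dpi)
    return width_px * height_px * bits_per_pixel // 8


for dpi in (200, 300):
    size_kb = uncompressed_bytes(dpi) / 1024
    print(f"{dpi} dpi black and white A4: about {size_kb:.0f} KB uncompressed")

# Moving from 200 to 300 dpi multiplies the number of dots by (300/200)^2.
print(uncompressed_bytes(300) / uncompressed_bytes(200))
```

Because the dot count grows with the square of the resolution, a 300 dpi image is roughly 2.25 times the size of a 200 dpi image of the same page.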

Most scanners come with some form of software, although this may not always be used, depending on your chosen Document Management system. A scanner driver is required to enable the scanner to talk to the PC. Setting up this driver should be a simple operation from the CD supplied with the device. If the provided scanner driver is faulty or fails to load, you can normally download a replacement from the manufacturer's website (refer to the packaging for the address or search the internet).

The installation, set-up and testing of the scanning equipment are normally performed by the supplier. Sometimes it is possible to purchase the scanner directly from a manufacturer, although you need to ensure that you are getting the same level of service when choosing your supplier: they may not deliver, install and commission the device for a lower quoted price.


Software

Document Management Software breaks down into different sections. The following is a step-by-step description of the main areas of the process.


Image Capture

The software used to communicate with a document scanner is normally referred to as Image Capture software. The image capture software allows the user to select the number of pages to be scanned, select multi-page or single-page documents and use barcode separator sheets. Barcodes can also be used in some cases to identify individual documents, and features such as 'separate on barcode' allow the user to stack feed a batch of single and multi-page documents that will be separated every time a barcode is detected. The value of each barcode can also be captured and associated with each image.
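The 'separate on barcode' feature can be pictured with the Python sketch below. It assumes the capture software has already reported, for each scanned page, the barcode value found (or none); the function name and page names are invented for this example, and products differ on whether the barcode page itself is kept.

```python
def separate_on_barcode(pages):
    """Group a stack-fed batch of pages into documents.

    pages: list of (image_name, barcode_value) tuples in scan order, where
    barcode_value is None when no barcode was detected on the page.
    A page carrying a barcode starts a new document. This sketch keeps that
    page as the first page of the document; a separator sheet would
    normally be discarded instead.
    """
    documents = []
    for image_name, barcode in pages:
        if barcode is not None or not documents:
            documents.append({"barcode": barcode, "pages": []})
        documents[-1]["pages"].append(image_name)
    return documents


batch = [
    ("page1.tif", "INV-001"),
    ("page2.tif", None),
    ("page3.tif", "INV-002"),
    ("page4.tif", "INV-003"),
    ("page5.tif", None),
]

for document in separate_on_barcode(batch):
    print(document["barcode"], document["pages"])
```

Each document keeps its barcode value, so the value can be used as an index field when the images are stored.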


Forms Processing

This option is for information extraction directly from forms (as described under Document Management Services). Once the images have been captured by the scanner, the batch of documents can be processed using the Forms Processing software. During this stage the software follows pre-defined business rules on what information is to be collected and which rules to apply to the various areas of the document. It is normally possible to design forms within the application, or alternatively existing forms may be used. Forms designed within the application normally give a much greater level of accuracy as they are designed for ease of data capture.


Database

Once the information has been captured and processed, it is stored in the database. If the images have been processed using forms processing, the index will be applied to the database automatically during the import of the information. In the case of manual data entry or indexing, personnel apply the correct information to the images so that they can be retrieved using the specific fields.

In most cases databases are ODBC (Open DataBase Connectivity) compliant, which means that the information within the database can be interrogated and used in other applications. For example, a user may perform a search for certain information and then take the results of that search and use them within another application or display them in a table or spreadsheet.
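As an illustration of taking search results into another application, the Python sketch below runs a query and writes the results as CSV text that a spreadsheet could open. To keep the example self-contained it uses Python's built-in SQLite database as a stand-in; with an ODBC-compliant database the same kind of SQL query would be sent through an ODBC driver instead. The table and column names are invented for this example.

```python
import csv
import io
import sqlite3

# In-memory SQLite database standing in for the document index database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents "
    "(doc_id INTEGER, customer TEXT, doc_type TEXT, image_file TEXT)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        (1, "ACME LTD", "invoice", "0001.tif"),
        (2, "ACME LTD", "statement", "0002.tif"),
        (3, "BRIGHT & CO", "invoice", "0003.tif"),
    ],
)


def search_to_csv(connection, customer):
    """Search the index by customer and return the results as CSV text."""
    cursor = connection.execute(
        "SELECT doc_id, doc_type, image_file FROM documents "
        "WHERE customer = ? ORDER BY doc_id",
        (customer,),
    )
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor.fetchall())
    return out.getvalue()


print(search_to_csv(conn, "ACME LTD"))
```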