Digitization line with data extraction

  • Contracting authority: Komerční pojišťovna a.s.

The digitization line is located at the KPKB Jihlava workplace. The complete digitization process takes place there: sorting documents by processing type, scanning, OCR, subsequent correction of extracted values where needed, and export of the scanned images. The original paper documents are then stored in archive boxes. The software used is Teleform version 11.2 (since 9.6.2017), and the entire digitization process runs in Teleform.

The production line has its own virtual server, which also hosts the database (MSSQL Express). A separate physical workstation is reserved for extraction by the Reader module.

One workstation with its own scanner and a desktop installation of Teleform is reserved for testing. It can be accessed remotely via RDP.

Processing is divided into several task types according to the content of the processed form. Documents are scanned in small batches, usually containing forms that relate to a single contract or policyholder, so documents are often scanned one at a time. Some information about the scanned documents, such as the contract number, document receipt date, type, workflow, or archive box number, is entered before scanning. Tasks with more than one template defined have QC enabled so that forms can be identified manually if needed.
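
The pre-scan information and the QC rule can be pictured with a small Python sketch. It is only an illustration of the description above, not Teleform's data model: the class names `BatchHeader` and `Task`, all field names, and the sample values are assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BatchHeader:
    """Values entered by the operator before scanning (illustrative field names)."""
    contract_number: str
    receipt_date: date
    document_type: str
    workflow: str
    archive_box: str


@dataclass
class Task:
    """A processing task with the form templates defined for it (hypothetical model)."""
    name: str
    templates: list[str]

    @property
    def qc_enabled(self) -> bool:
        # QC is set only when more than one template is defined, because only
        # then may a form need to be identified manually.
        return len(self.templates) > 1


# Made-up sample values, for illustration only.
header = BatchHeader("1234567890", date(2017, 6, 9), "claim", "standard", "BOX-001")
single = Task("contract-number-only", ["template_a"])
multi = Task("claims", ["template_a", "template_b"])
```

Here `single.qc_enabled` is false and `multi.qc_enabled` is true, which mirrors the rule that only tasks with several templates go through QC.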

After OCR, the extracted data is corrected on the forms. Forms from which only the contract number is extracted skip correction if the extracted number matches the one entered during scanning.
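
The comparison between the extracted and the entered contract number can be sketched in Python. This is a minimal sketch and not Teleform's own logic: the function names, the `contract_number` field key, and the whitespace normalization are assumptions made for illustration.

```python
def normalize(number: str) -> str:
    """Remove all whitespace so '123 456' and '123456' compare equal (assumed convention)."""
    return "".join(number.split())


def needs_correction(extracted_fields: dict[str, str], entered_contract_number: str) -> bool:
    """Return True if the form should be routed to manual correction.

    extracted_fields maps field names to OCR results (hypothetical structure).
    """
    # Forms with any field other than the contract number always go to correction.
    if set(extracted_fields) != {"contract_number"}:
        return True
    # Only the contract number was extracted: correction is skipped when the
    # OCR result matches the number entered before scanning.
    extracted = normalize(extracted_fields["contract_number"])
    return extracted != normalize(entered_contract_number)
```

For example, `needs_correction({"contract_number": "123456"}, "123456")` returns `False`, while a mismatching number or a form with additional extracted fields returns `True`.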

Form data is exported in XML or XLS format, and scanned images are exported as TIFF files. The color depth of the output images depends on the scan profile settings.