Verba Logica Home page | BiblioTECA Home page | BiblioTECA Implementation | Previous | Next
The module contains two submodules: a) IDR sensu stricto, which implements automatic character recognition, and b) the Videocoding module, which handles word and character correction.
The data processing inside this module could be represented as follows:
Images are processed in batch. The module's functionality is organised around the MCS data model, which makes it possible to inspect, for each image, every object related to it.
The interface includes a Document Structure Dialogue for setting the language, column detection, resolution, borders and spacing between documents. IDR can also handle multi-column documents.
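As an illustration only, the parameters listed above could be grouped into a single settings object. The sketch below is hypothetical: the class name, field names, units and default values are assumptions made for this example and are not taken from the IDR interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentStructure:
    """Hypothetical container for the Document Structure Dialogue settings."""

    language: str = "en"          # language of the scanned text (assumed code format)
    detect_columns: bool = True   # enable multi-column detection
    resolution_dpi: int = 300     # scan resolution (assumed unit: dots per inch)
    border_px: int = 0            # border width to remove (assumed unit: pixels)
    document_spacing_px: int = 0  # spacing between documents (assumed unit: pixels)

    def __post_init__(self) -> None:
        # Reject values that cannot describe a real scan.
        if self.resolution_dpi <= 0:
            raise ValueError("resolution_dpi must be positive")
        if self.border_px < 0 or self.document_spacing_px < 0:
            raise ValueError("border and spacing must be non-negative")


settings = DocumentStructure(language="it", resolution_dpi=200)
```

Keeping the settings immutable (frozen) means one batch of images is always processed with one consistent set of parameters.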
During image pre-processing, borders are removed, the image quality is assessed and noise is removed. The next phase is segmentation, in which line height and inter-word and inter-letter distances are evaluated precisely, and word distances are detected dynamically.
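The segmentation measurements described above can be sketched with projection profiles, a common technique for this task. This is a minimal sketch under stated assumptions, not the IDR algorithm: the image is assumed to be a binarised list of rows (1 = ink, 0 = background), all function names are hypothetical, and the "dynamic" word gap is assumed to be the midpoint between the smallest and largest gap on the line.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start, end) indices, end exclusive


def ink_runs(profile: List[int]) -> List[Run]:
    """Return the runs of consecutive non-zero entries in a profile."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def row_profile(image: List[List[int]]) -> List[int]:
    """Ink count per row; its runs are text lines, run length is line height."""
    return [sum(row) for row in image]


def column_profile(image: List[List[int]]) -> List[int]:
    """Ink count per column; its runs are letters within one text line."""
    return [sum(col) for col in zip(*image)]


def dynamic_word_gap(col_profile: List[int]) -> float:
    """Assumed heuristic: midpoint between the narrowest and widest gap."""
    runs = ink_runs(col_profile)
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(runs, runs[1:])]
    return (min(gaps) + max(gaps)) / 2 if gaps else 1.0


def split_words(col_profile: List[int], word_gap: float) -> List[Run]:
    """Merge letter runs into words; gaps wider than word_gap separate words."""
    words: List[Run] = []
    for start, end in ink_runs(col_profile):
        if words and start - words[-1][1] <= word_gap:
            words[-1] = (words[-1][0], end)  # small gap: same word
        else:
            words.append((start, end))       # wide gap: new word
    return words


# One text line: letter gaps of 1 column, one word gap of 4 columns.
line = [1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1]
words = split_words(line, dynamic_word_gap(line))  # [(0, 5), (9, 13)]
```

Deriving the threshold from the gaps of each line, rather than fixing it globally, is what lets the same code cope with different font sizes and letter spacing.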
The Optical Character Recognition phase uses an external library with multi-font recognition capabilities.
Dictionaries help to analyse the linguistic context of the recognised text and, at the same time, provide alternatives and suggestions for correction.
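A dictionary lookup of this kind can be sketched with Python's standard difflib module. This is only an illustration of the idea: the function name, the sample word list and the similarity cutoff are assumptions, and the matching method used by IDR itself is not described here.

```python
from difflib import get_close_matches
from typing import List


def suggest(word: str, dictionary: List[str], limit: int = 3) -> List[str]:
    """Return correction candidates for a recognised word.

    A word found in the dictionary needs no correction, so the result is
    empty. Otherwise the closest dictionary entries are returned, best first.
    """
    if word in dictionary:
        return []
    # cutoff=0.6 is difflib's default similarity threshold, kept explicit here.
    return get_close_matches(word, dictionary, n=limit, cutoff=0.6)


# Hypothetical sample dictionary for the example.
sample_words = ["recognition", "segmentation", "image", "column"]
candidates = suggest("recogniton", sample_words)  # best candidate: "recognition"
```

In a correction step such as Videocoding, a list like this could be shown to the operator as alternatives for a doubtful word.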
So the main IDR features are:
A user manual has been written for the IDR module.
|Videocoding User Manual||User Manual|
|IDR User Manual||User Manual|
|IDR Algorithmic Documentation||Technical Report|
|IDR Technical Documentation||Technical Report|