Adaptive soft thresholding for intensity-based HDI binarization

Intensity based document image enhancement

Document image enhancement methods based on intensity information have recently attracted a great deal of attention, because they play an important role in other automatic analysis tasks (OCR, document recognition, etc.) and improve the readability of documents for experts such as historians and librarians. Hundreds of methods have been proposed over the years. They can be classified into two main categories: those that require access to both the recto and verso sides of the document simultaneously (double-sided enhancement methods), and those that process each side independently (one-sided enhancement methods).

One-sided document image enhancement methods

One-sided enhancement methods attempt to eliminate interfering patterns using thresholding or classification techniques. Thresholding techniques aim to find an optimal threshold (gray level) that separates the document image pixels into two classes, foreground and background. Classification-based techniques aim to assign the document image pixels to two or more classes: foreground, background, and potentially a fuzzy class. With thresholding-based enhancement techniques, a pixel is considered to be foreground if its value is above the threshold, and background otherwise. There are two categories of thresholding techniques: global and adaptive. Global thresholding techniques (Otsu, 1979; Kapur et al., 1985; Abutaleb, 1989a) are designed to find a single optimal threshold for all the document image pixels. Unfortunately, in the presence of a high level of degradation, such as severe bleed-through, simple thresholding techniques are inadequate for image enhancement, because the intensity of interfering patterns or degraded background can be very similar to that of the foreground. In such cases, global thresholding either fails to eliminate the degradation or, if it succeeds, may eliminate parts of the main text as well.
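
To make the idea of a single global threshold concrete, the following is a minimal sketch of Otsu-style threshold selection in plain Python. It is an illustration only, not the implementation used in any of the cited works: the function names `otsu_threshold` and `binarize`, the flat list of 8-bit gray levels, and the 256-level default are all assumptions. The comparison follows the convention stated above (values above the threshold are foreground); for dark ink on light paper the comparison would be flipped.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form the lower class, pixels > t the upper class.
    `pixels` is a flat list of integer gray levels in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Label pixels above t as foreground (1), the rest as background (0)."""
    return [1 if p > t else 0 for p in pixels]
```

Because one threshold is applied to every pixel, a bleed-through stroke whose gray level lies on the foreground side of `t` is necessarily labeled foreground, which is the failure mode described above and the motivation for adaptive thresholds.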

Criticism

In spite of the large number of image enhancement/restoration algorithms in the literature, there are no generic algorithms that can handle multiple types of document image degradation. Most algorithms are trained on a finite set of document images, and then tested on another set of document images in the same category, i.e. having similar characteristics (Cheriet et al., 2012). It seems that the time has not yet come to design generic frameworks for the document image enhancement problem that can at least handle a large set of degraded documents belonging to a single culture or to a specific time period. There are two main reasons for this. The first is that nonlinear degradations are unpredictable, which makes it difficult to develop robust and reliable enhancement/restoration models. Researchers are particularly interested in designing specific models that incorporate information gathered from the available data, in order to regularize the results of their algorithms. Unfortunately, they fail to consider (intentionally or unintentionally) how degradations occur. The second reason is that intensity-based information is not suitable for designing good discriminant features, especially in the case of severely degraded document images.

This is mainly due to the physical and mechanical limitations of the conventional tools used for document image acquisition. The cameras and scanners typically used to capture these images provide only a subset of the available information, combining the responses to visible radiation into three spectral images or fewer (color or gray-scale). Although the RGB color space is the most common choice for computer graphics, it is not very efficient in dealing with real-world images, because the RGB channels contain redundant luminance information. The channels are highly correlated, as each of them includes a representation of brightness (de Campos, 2006). So, if the acquisition is based on color information only, or on gray-level information only, the various document image constituents may appear similar to the human eye, which makes the process of separation difficult, or even impossible. Multispectral (MS) imaging systems seem to be a good alternative, as they offer detailed quantitative measurement of the spectral responses of the document image constituents. These systems are the subject of the next chapter.

Multispectral imaging

Multispectral (MS) imaging is used mostly to record spectral images in the visible light range and in the invisible light range (i.e. UV and IR). Thanks to the use of UV and IR sensors, MS imaging can extract information that the human eye cannot capture with its receptors for red, green and blue. Light that is visible to the human eye has wavelengths in the range of about 380 nm to 740 nm. Visible light is situated between UV light, which has shorter wavelengths (in the 10 nm to 400 nm range), and near-IR light, which has longer wavelengths (in the 700 nm to 1 mm range). A spectral image is reproduced as a gray-scale image or an RGB color image. IR spectral images can be combined into a gray-scale image, and three of them can be used to create pseudo-color RGB images. The principle underlying MS imaging systems is the concept of the spectral signature.
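
As an illustration of how three spectral images can be shown as a pseudo-color RGB image, here is a minimal sketch in plain Python. It assumes each band is a small 2-D list of numeric intensities of identical size. The function names `rescale` and `pseudo_color`, and the choice of a linear min-max stretch to the 0..255 range, are assumptions made for the example rather than part of any acquisition system described here.

```python
def rescale(band):
    """Linearly stretch a 2-D band (list of rows) to integers in 0..255."""
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # a constant band maps to all zeros
    return [[round(255 * (v - lo) / span) for v in row] for row in band]


def pseudo_color(band_r, band_g, band_b):
    """Assign three same-sized spectral bands to the R, G and B channels.

    Returns a 2-D list of (r, g, b) tuples. Which band goes to which
    channel is arbitrary, hence the term "pseudo-color".
    """
    r, g, b = rescale(band_r), rescale(band_g), rescale(band_b)
    height, width = len(r), len(r[0])
    return [
        [(r[y][x], g[y][x], b[y][x]) for x in range(width)]
        for y in range(height)
    ]
```

Each band is stretched independently, so the resulting colors show relative differences between bands rather than calibrated reflectance values.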

The main idea is that all materials emit, transmit, or absorb EM radiation according to their inherent physical structure and chemical composition, and according to the wavelength and intensity of the radiation impinging on them. The ratio of reflected to emitted radiation from the surface of an object varies with the wavelength and the angle of incidence of the radiation. The combination of emitted, reflected, and absorbed EM radiation across a range of wavelengths produces what we call a spectral signature, which is unique to that material (see Figure I-7 in Appendix I: MS imaging system, set-up and acquisition). It is therefore possible to differentiate between objects based on differences in their spectral signatures.
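
One common way to compare spectral signatures is the spectral angle between two signature vectors, a measure widely used in remote sensing. The sketch below is an illustration of that idea and is not taken from the thesis: the function names `spectral_angle` and `nearest_material` are hypothetical, and any reference signatures passed in are assumed to be per-band responses measured at the same wavelengths.

```python
import math


def spectral_angle(sig_a, sig_b):
    """Angle in radians between two signatures sampled at the same bands.

    A small angle means the signatures have a similar shape, regardless
    of overall brightness (a uniform scaling of one signature).
    """
    dot = sum(a * b for a, b in zip(sig_a, sig_b))
    norm_a = math.sqrt(sum(a * a for a in sig_a))
    norm_b = math.sqrt(sum(b * b for b in sig_b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_angle)


def nearest_material(signature, references):
    """Return the key of the reference signature with the smallest angle.

    `references` maps a material name to its reference signature.
    """
    return min(references,
               key=lambda name: spectral_angle(signature, references[name]))
```

With made-up references such as `{"ink": [0.1, 0.2, 0.8], "paper": [0.7, 0.7, 0.6]}`, a pixel with signature `[0.05, 0.1, 0.4]` is assigned to `"ink"`, since it is a scaled copy of the ink signature. These numbers are purely illustrative and are not measured data.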

Table of contents

INTRODUCTION
0.1 Context of the thesis
0.2 Problem statement
0.3 Objectives of the thesis
0.4 Outline of the thesis
CHAPTER 1 LITERATURE REVIEW
1.1 Intensity based document image enhancement
1.1.1 One-sided document image enhancement methods
1.1.2 Double-sided document image enhancement methods
1.1.3 Criticism
1.2 Multispectral Imaging based historical document image restoration
1.2.1 Electromagnetic radiation and optical properties of objects
1.2.2 Multispectral imaging
1.2.3 MS Images
1.2.4 Historical document image analysis
CHAPTER 2 METHODOLOGY AND CONTRIBUTIONS
2.1 Intensity-based binarization of historical document images
2.2 Multispectral restoration of historical document images
2.3 Reference data estimation for historical document image binarization
CHAPTER 3 ARTICLE I: A SPATIALLY ADAPTIVE STATISTICAL METHOD FOR HISTORICAL DOCUMENT IMAGE BINARIZATION
3.1 Introduction
3.2 Related work
3.3 Problem statement
3.4 Formulation
3.5 Methodology
3.5.1 Sauvola binarization algorithm
3.5.2 Spatially adaptive model
3.5.3 Computing the fields of μt, μb, and σb
3.5.4 Estimation of the σt field
3.5.4.1 Estimation of the global σt: St
3.5.4.2 Spatial adaptation of σt
3.5.5 Estimation of uBW
3.6 Experimental results and discussion
3.6.1 Subjective evaluation
3.6.2 Objective evaluation against DIBCO’09 (Gatos et al., 2009a)
3.6.2.1 Evaluation setup
3.6.2.2 Performance measures
3.6.2.3 Comparison with the state of the art
3.6.3 Computational cost and complexity of the method
3.7 Conclusions and future prospects
CHAPTER 4 ARTICLE II: DOCUMENT IMAGE RESTORATION USING MULTISPECTRAL IMAGING SYSTEM
4.1 Introduction
4.1.1 Difficulty in analyzing degraded document images
4.1.2 Objective of the paper
4.2 Related work
4.2.1 Hyperspectral remote sensing image enhancement
4.2.2 Multispectral imaging in the area of document analysis
4.3 Multispectral Image Acquisition
4.3.1 Characteristics of the MS degraded document image
4.4 Proposed restoration model
4.5 Parameter estimation and model optimization
4.5.1 Unsupervised IR band selection
4.5.2 Semi-local correction of slight degradations
4.5.3 Correction of strong degradations
4.5.3.1 Estimation of the binary mask
4.5.3.2 TV denoising and inpainting problem
4.6 Experimental result
4.6.1 Parameters setup
4.6.2 Subjective and objective evaluation
4.7 Conclusion
CHAPTER 5 ARTICLE III: REFERENCE DATA ESTIMATION
5.1 Introduction
5.2 Reference estimation methodology and its evaluation
5.2.1 General framework
5.2.2 Evaluation
5.3 Application: historical document image analysis
5.4 Conclusion
CHAPTER 6 GENERAL DISCUSSIONS
6.1 Adaptive soft thresholding for intensity-based HDI binarization
6.2 Variational method of multispectral HDI restoration
6.3 Reference data estimation in a multispectral representation space
GENERAL CONCLUSION
ANNEX I MS IMAGING SYSTEM, SET-UP AND ACQUISITION
ANNEX II AUTOMATIC FINDING OF THE THRESHOLD τ
ANNEX III EXPERIMENTAL SET-UP FOR IRR, UVR AND UVF IMAGING TECHNIQUES
BIBLIOGRAPHY