RSLDI: Restoration of single-sided low-quality document images

Publication Type:

Journal Article


Pattern Recognition, Volume 42, Issue 12, p.3355–3364 (2009)




Highlight 1: Modeling vocabulary (multi-level features as model-words)
Highlight 2: Multi-level features at four levels: pixel, local, regional, and global
Highlight 3: Combining features in nonlinear ways
Highlight 4: The Flow Field
Highlight 5: The Estimated Background (EB)
Highlight 6: PDE-based Diffusion from other sources
Highlight 7: Combining the EB and the ICA (single-source ICA)
Highlight 8: A Bayes method using the nonlinear combination of features


The key innovation of this paper was the introduction of model-words, in the form of multi-level features, in order to distribute the complexity of modeling uniformly among the main model and its dependencies. The multi-level features were categorized into four groups: pixel, local, regional, and global. The most interesting features were the Flow Field and the Estimated Background (EB). The second innovation was the generalization of PDE-based diffusion methods to include diffusion from sources outside the image (from other levels), for example reverse diffusion. Reverse diffusion was key to processing single-sided images suffering from bleed-through. Another innovation was the use of multi-scale features to enable the ICA method on single-sided images, which until then had been assumed to be impossible. Finally, using a nonlinear combination of features, a learning-based Bayesian classifier was trained for binarization. This can be considered pioneering work in learning-based binarization.
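The idea of diffusion driven by a source outside the image can be illustrated with a small sketch. This is not the formulation used in the paper: it is a plain explicit heat-diffusion step with an added term that pulls pixels toward an external layer, assuming that layer plays the role of an estimated background and that a weight map marks where bleed-through is suspected. The function name `diffuse_with_source` and the parameters `weight`, `lam`, `dt`, and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np

def diffuse_with_source(u, source, weight, n_iter=50, dt=0.2, lam=1.0):
    """Explicit heat diffusion plus a pull toward an external source image.

    u      : 2-D float array, the image being restored
    source : 2-D float array, external layer (assumed here to be an
             estimated background; hypothetical stand-in)
    weight : 2-D float array in [0, 1], where the source is allowed to act
    """
    u = u.astype(float).copy()
    for _ in range(n_iter):
        # Replicate the border so the Laplacian has zero flux at the edges.
        p = np.pad(u, 1, mode="edge")
        lap = (p[:-2, 1:-1] + p[2:, 1:-1] +
               p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u)
        # Ordinary diffusion term plus the external source term.
        u += dt * (lap + lam * weight * (source - u))
    return u

# Toy example: a faint blot (0.5) on a light page (0.8).
page = np.full((16, 16), 0.8)
page[6:10, 6:10] = 0.5
background = np.full((16, 16), 0.8)
mask = np.zeros((16, 16))
mask[6:10, 6:10] = 1.0
restored = diffuse_with_source(page, background, mask)
```

With `dt = 0.2`, `lam = 1.0`, and weights in [0, 1], each update is a convex combination of the pixel, its four neighbours, and the source value, so the result stays within the range spanned by the input and the source. Larger values of `dt` or `lam` would need a smaller step to remain stable.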

Post-Publication Critique

The Flow Field has been forgotten since this work. One aspect of the Flow Field that was supposed to be improved was its computational performance, as it involves many nonlinear function evaluations. The other item on the to-do list was to use the Flow Field in NLM denoising and to show that it is more powerful than gray values. The single-source ICA enabling technique, which can be generalized to other statistical methods, has also been forgotten since this work. The nonlinear combination of features was likewise ignored for a long time, until our recent paper on learning and optimization of binarization methods.
