Document information retrieval has attracted researchers’ attention as a means of uncovering the secrets of ancient manuscripts. To understand such documents, analyzing their layouts and segmenting their relevant features are fundamental tasks. Recent efforts have addressed unsupervised document segmentation, and its importance for ancient manuscripts provides a unique opportunity to study the problem. This paper proposes a novel collaborative deep learning architecture, trained in an unsupervised mode, that generates synthetic data to avoid uncertainties regarding document degradation. Moreover, the approach uses the generated distribution to assign labels to superpixels. The model, trained without supervision, segments the page, ornaments, and characters simultaneously, and achieves promising accuracy on the segmentation task. Experiments on degraded documents show that the proposed method can synthesize noise-free documents and enhance associations better than state-of-the-art methods. We also investigate the use of the generated samples and their effectiveness in different tasks on unlabelled historical documents.
Simultaneous Detection of Regular Patterns in Ancient Manuscripts Using GAN-Based Deep Unsupervised Segmentation