Machine learning for document understanding
To provide training data for optical shape recognition (OSR), two databases of different sizes were created in collaboration with Prof. Robert Wisnovsky (Institute of Islamic Studies, McGill University):
3. IBN SINA Ext database (with images): A database of more than 22,720 shapes with their color images (fast access with binarized images (201), fast access with color images (66), fast access to the guide (53)).
These databases have been used in two learning challenges. Details of the databases and the challenges can be found below.
Online handwritten gestures for interaction
PHIBD is the first ground-truthed Persian Heritage Image Binarization Dataset, developed using an efficient ground-truthing tool called “PhaseGT”. PHIBD 2012 contains 15 historical document images with their corresponding ground-truth binary images. The historical images in the dataset suffer from various types of degradation. The dataset is also divided into training and testing subsets for binarization methods that use learning approaches. For more information, please visit the IAPR-TC11 website: http://www.iapr-tc11.org/mediawiki/index.php/Datasets_List