Title: Optical Character Recognition of Amharic Documents
Authors: Meshesha, Million; International Institute of Information Technology
Jawahar, C V; Center for Visual Information Technology; International Institute of Information Technology, Hyderabad - 500 032, India
Date: 2007-08-13
Publisher: African Journal Of Information & Communication Technology
Source:
Type: info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Subject: Optical Character Recognition; African Scripts; Feature Extraction; Classification; Amharic Documents
Description: Around 2,500 languages are spoken in Africa, and some of them have their own indigenous scripts. Accordingly, a large body of printed documents is available in libraries, information centers, museums and offices. Digitizing these documents makes it possible to harness existing information technologies for local information needs and development. This paper presents an Optical Character Recognition (OCR) system for converting digitized documents in local languages. An extensive literature survey reveals that this is the first attempt to report the challenges of recognizing indigenous African scripts and a possible solution for the Amharic script. Research on the recognition of African indigenous scripts faces major challenges due to (i) the large number of characters used in the writing and (ii) the existence of a large set of visually similar characters. In this paper, we propose a novel feature extraction scheme using principal component and linear discriminant analysis, followed by a decision directed acyclic graph based support vector machine classifier. Recognition results are presented on real-life degraded documents such as books, magazines and newspapers to demonstrate the performance of the recognizer.
Language: English

Similar articles:

Aspects of Delay Diversity in OFDM by Bauch, Gerhard; DoCoMo Euro-Labs
UWB Electric and Magnetic Monopole Antennas by Chen, Xiaodong; Queen Mary, University of London, London E1 4NS,Liang, Jianxin,Li, Pengcheng,Chiau, Choo C.
Contextualizing ICT in Africa: The Development of the CATI model in Tanzanian Higher Education by Vesisenaho, Mikko; Department of Computer Science, University of Joensuu, Finland,Kemppainen, Jyri; Tumaini University, Iringa University College, Tanzania,Islas, Carolina; Department of Computer Science, University of Joensuu, Finland,Tedre, Matti; Department of Computer Science, University of Joensuu, Finland,Sutinen, Erkki; Department of Computer Science, University of Joensuu, Finland
"Scenario"-adaptivity for e-service management in heterogeneous networks by Iera, Antonio,Molinaro, Antonella,Pudano, Alfredo,Ursino, Domenico
Generic Model and Architecture for Cooperating Objects in Sensor Network Environments by Marron, Pedro Jose; University of Stuttgart,Minder, Daniel; University of Stuttgart,Lachenmann, Andreas; University of Stuttgart,Saukh, Olga; University of Stuttgart,Rothermel, Kurt; University of Stuttgart
A New Resource Management Scheme for Ad Hoc and Sensor Networks by de Renesse, Ronan,Friderikos, Vasilis,Aghvami, Hamid
An Applicable GSM Network Model for Networking in Rural Environments by Li, Yang; University of Cape Town,Agbinya, Johnson I; University of Technology, Sydney,Chan, H Anthony; University of Cape Town