Title: Articulatory features for robust visual speech recognition
Authors: Saenko, Ekaterina, 1976-
Date: 2005-09-27
2004
Publisher: MIT
Source:
Type: Thesis
Subject: Electrical Engineering and Computer Science.
Description: This thesis explores a novel approach to visual speech modeling. Visual speech, or a sequence of images of the speaker's face, is traditionally viewed as a single stream of contiguous units, each corresponding to a phonetic segment. These units are defined heuristically by mapping several visually similar phonemes to one visual phoneme, sometimes referred to as a viseme. However, experimental evidence shows that phonetic models trained from visual data are not synchronous in time with acoustic phonetic models, indicating that visemes may not be the most natural building blocks of visual speech. Instead, we propose to model the visual signal in terms of the underlying articulatory features. This approach is a natural extension of feature-based modeling of acoustic speech, which has been shown to increase robustness of audio-based speech recognition systems. We start by exploring ways of defining visual articulatory features: first in a data-driven manner, using a large, multi-speaker visual speech corpus, and then in a knowledge-driven manner, using the rules of speech production. Based on these studies, we propose a set of articulatory features, and describe a computational framework for feature-based visual speech recognition. Multiple feature streams are detected in the input image sequence using Support Vector Machines, and then incorporated in a Dynamic Bayesian Network to obtain the final word hypothesis. Preliminary experiments show that our approach increases viseme classification rates in visually noisy conditions, and improves visual word recognition through feature-based context modeling.
by Ekaterina Saenko.
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004.
Includes bibliographical references (p. 99-105).
Language: English
