Title: Using external sources of bilingual information for on-the-fly word alignment
Authors: Esplà Gomis, Miquel
Sánchez Martínez, Felipe
Forcada Zubizarreta, Mikel L.
Date: 2013-03-26
2012-12-07
Publisher: RUA Docencia
Source:
Type: info:eu-repo/semantics/report
Subject: Machine translation
Word-alignment
External sources
Bilingual information
Lenguajes y Sistemas Informáticos
Description: In this paper we present a new and simple language-independent method for word alignment based on external sources of bilingual information, such as machine translation systems. We show that the few parameters of the aligner can be trained on a very small corpus, yielding precision comparable to that of the state-of-the-art tool GIZA++. On other metrics, such as alignment error rate and F-measure, the parametric aligner, when trained on a very small gold standard (450 sentence pairs), provides results comparable to those produced by GIZA++ when trained on an in-domain corpus of around 10,000 sentence pairs. Furthermore, the results indicate that the training is domain-independent, which enables the trained aligner to be used on the fly on any new pair of sentences.
Work partially supported by the Spanish Ministry of Science and Innovation through project TIN2009-14009-C02-01, and by the Universitat d'Alacant through project GRE11-20.
Language: English
