Title: Impact of the initialization in tree-based fast similarity search techniques
Authors: Serrano Díaz-Carrasco, Aureo
Micó Andrés, Luisa
Oncina Carratalá, Jose
Date: 2012-01-09
2011
Publisher: Springer Berlin / Heidelberg
Source: See document
Type: info:eu-repo/semantics/bookPart
Subject: Fast similarity search techniques
Pivot selection techniques
Tree-based
Lenguajes y Sistemas Informáticos
Description: Many fast similarity search techniques rely on the use of pivots (specially selected points in the data set). Using these points, specific structures (indexes) are built that speed up the search at query time. Pivot selection techniques are usually incremental, with the first pivot chosen at random. This article explores several techniques for choosing the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity. Moreover, most pivot tree-based indexes emphasize building balanced trees. We provide experimental and theoretical support that very unbalanced trees can be a better choice than balanced ones.
The authors thank the Spanish CICyT for partial support of this work through projects TIN2009-14205-C04-C1, the IST Programme of the European Community under the PASCAL Network of Excellence (IST-2006-216886), and the program Consolider Ingenio 2010 (CSD2007-00018).
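As a rough illustration of why pivots reduce distance computations, the sketch below uses a single pivot and the triangle inequality to skip candidates during a nearest-neighbour query. It is not the tree-based index studied in the article: the flat one-pivot filter, the Euclidean distance, and the names `build_index` and `nearest` are assumptions made for this example only. The pivot is passed in by position, so different choices of first pivot can be compared by the number of distances computed.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, pivot_idx, dist=euclidean):
    """Precompute the distance from the chosen pivot to every point."""
    pivot = points[pivot_idx]
    return pivot, [dist(pivot, x) for x in points]


def nearest(query, points, index, dist=euclidean):
    """Return (position, distance, number of distance computations).

    By the triangle inequality, |d(q, p) - d(p, x)| <= d(q, x), so a
    candidate whose lower bound is not below the best distance found
    so far cannot be the nearest neighbour and is skipped.
    """
    pivot, pivot_dists = index
    d_qp = dist(query, pivot)
    computed = 1  # the query-to-pivot distance
    best, best_d = None, math.inf
    # Visit candidates in order of increasing lower bound.
    order = sorted(range(len(points)), key=lambda i: abs(d_qp - pivot_dists[i]))
    for i in order:
        if abs(d_qp - pivot_dists[i]) >= best_d:
            break  # every remaining candidate has an even larger bound
        d = dist(query, points[i])
        computed += 1
        if d < best_d:
            best, best_d = i, d
    return best, best_d, computed


points = [(0, 0), (1, 0), (5, 5), (10, 10), (3, 4)]
index = build_index(points, pivot_idx=0)  # pivot choice is a free parameter
print(nearest((0.9, 0.1), points, index))
```

In this toy data set the query is resolved after two distance computations instead of five; how much is saved in general depends on the data and on which point is used as the pivot.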
Language: English
Similar articles:
Choosing the correct paradigm for unknown words in rule-based machine translation systems, by Sánchez Cartagena, Víctor Manuel; Esplà Gomis, Miquel; Sánchez Martínez, Felipe; Pérez Ortiz, Juan Antonio
Using external sources of bilingual information for on-the-fly word alignment, by Esplà Gomis, Miquel; Sánchez Martínez, Felipe; Forcada Zubizarreta, Mikel L.