Título: Mining interesting frequent patterns in a single graph using inexact matching
Autores: MARISOL FLORES GARRIDO
Fecha: 2015-04
Publicador: INAOE
Fuente:
Tipo: info:eu-repo/semantics/doctoralThesis
Tema: info:eu-repo/classification/Patrón/Pattern
info:eu-repo/classification/Clasificación/Classification
info:eu-repo/classification/Bases de datos/Databases
info:eu-repo/classification/Teoría de algoritmos/Algorithm theory
info:eu-repo/classification/cti/1
info:eu-repo/classification/cti/12
info:eu-repo/classification/cti/1203
Descripción: Frequent graph pattern mining is an important problem in diverse research fields but, until recently, algorithms that focus on the problem had only used graph isomorphism to identify occurrences of a given pattern. In the last years, however, a few works have focused on the case where a pattern could differ from its occurrences, which is a scenario that corresponds better to reality and that could be important, for example, to analyze noisy data. Although these algorithms allow differences in labels and structural differences in edges, none of them considers structural differences in vertices. How can we identify occurrences that differ by one (or several) nodes from the pattern they represent? Our work approaches the problem of frequent graph pattern mining with three main characteristics. First, we use inexact matching, allowing structural differences in both edges and vertices. Second, we focus on the problem of mining patterns in a single graph, a problem that has been less explored than the case in which patterns are mined from a graph collection. Third, instead of mining the whole set of frequent patterns, we are interested in obtaining an interesting subset of patterns, thus lessening redundancies and facilitating the analysis of the output set. We introduce the algorithm AGraP, which, by allowing patterns that can have structural differences, in vertices as well as in edges, respect to their occurrences, is able to find patterns missed by other state of the art algorithms. Then, we introduce the algorithms CloseAFG, MaxAFG and IntAFG that, by focusing on closed, maximal or interesting patterns, respectively, are able to reduce the amount of mined patterns by AGraP, while retaining new potentially useful information found by allowing structural differences in vertices. Finally, we show, through experimental results, that the "extra" patterns found by our proposed algorithms can help in tasks of classification and clustering, thus suggesting that useful information about the input data can be captured through the patterns mined by our algorithms.
Idioma: eng