Title:	Representing functional data in reproducing Kernel Hilbert spaces with applications to clustering, classification and time series problems
Authors:	González Hernández, Javier
Date:	2010
Publisher:	Dialnet (Theses)
Source:
Type:	text (thesis)
Subject:	Statistics; Data analysis
Description:	In modern data analysis areas such as Image Analysis, Chemometrics or Information Retrieval, the raw data are often complex and their representation in Euclidean spaces is not straightforward. However, most statistical data analysis techniques are designed to deal with points in Euclidean spaces, and hence a representation of the data in some Euclidean coordinate system is always required as a preliminary step before applying multivariate analysis techniques. This process is crucial to guarantee the success of the data analysis methodologies and will be a core contribution of this thesis. In this work we develop general data representation techniques in the framework of Functional Data Analysis (FDA) for classification and clustering problems. In Chapter 1 we motivate the problems to be solved, describe the roadmap of the contributions and set up the notation of this work. In Chapter 2 we review some aspects concerning Reproducing Kernel Hilbert Spaces (RKHSs), Regularization Theory, Integral Operators, Support Vector Machines and Kernel Combinations. In Chapter 3 we propose a new methodology to obtain finite-dimensional representations of functional data. The key idea is to consider each functional curve as a point in a general function space and then project these points onto a Reproducing Kernel Hilbert Space (RKHS) with the aid of Regularization Theory. We describe the projection methods, analyze their theoretical properties and develop a strategy to select appropriate RKHSs to represent the functional data. Following the functional data analysis approach, we develop in Chapter 4 a new procedure to deal with proximity (similarity or distance) matrices in classification problems by studying the connection between proximity measures and a certain class of integral operators.
The idea is to develop a methodology able to estimate an integral operator whose associated kernel function, evaluated at the sample, approximates the sample proximity matrix of the problem. To show the broad scope of application of the methodology, we apply it to three cases: (1) classification problems where the only available information about the data is an asymmetric similarity matrix, (2) partially labeled classification problems, and (3) classification problems where several sources of information are available and can be combined to obtain the discrimination function. In Chapter 5 we propose a spectral framework for information fusion when the sources of information are given by a set of proximity matrices. Our approach is based on the simultaneous diagonalization of the original matrices of the problem, and it represents a natural way to manage the redundant information involved in the fusion process. In particular, we define a new metric for proximity matrices and propose a method that automatically eliminates the redundant information among a set of matrices when they are combined. We conclude the contributions of the thesis in Chapter 6 with a battery of simulated and real examples devoted to comparing the performance of the proposed methodologies with the state of the art in representation methods. Finally, in Chapter 7 we include a discussion of the topics described above and propose some future lines of research that we believe are the natural extensions of the work developed in this thesis.
Language:	eng

1 Síntesis de hidroxiaminoácidos conformacionalmente restringidos, análogos de aminoácidos naturales, by Fernández Recio, Miguel Angel
2 Aproximación de funciones cuya transformada de Hankel está soportada en el intervalo (0,1), by Ciaurri Ramírez, Oscar
3 Análisis de los factores explicativos del éxito empresarial: una aplicación al sector de la denominación de origen calificada Rioja, by Sáinz Ochoa, Alberto
4 Intercambio de recursos en la ficción de Percy Wyndham Lewis, by Terrazas Gallego, Melania
5 Técnicas de automatización avanzadas en procesos industriales, by Jiménez Macías, Emilio
6 Objetos localmente efectivos y tipos abstractos de datos, by Pascual Martínez Losa, María Vico
7 Mecanismos de resistencia a antibióticos macrólidos, lincosamidas y estreptograminas en streptococcus y enterococcus, by Portillo Barrio, Aránzazu
8 Algunos problemas diofánticos, by Benito Muñoz, Manuel
9 Metamodelización y formalismos para la representación del comportamiento, by Rubio García, Angel Luis
10 Aplicación de la dinámica de fluidos computacional al control de las mermas de vino en naves de crianza climatizadas, by Ruiz de Adana Santiago, Manuel María