Title:	Learning approximate representations of partially observable systems
Authors:	Dinculescu, Monica
Date:	2010
Publisher:	McGill University - MCGILL
Source:
Type:	Electronic Thesis or Dissertation
Subject:	Applied Sciences - Computer Science
Description:	Learning agents that interact with complex environments often cannot predict the exact outcome of their actions due to noisy sensors or incomplete knowledge of the world. Learning the internal representation of such partially observable environments has proven to be a difficult problem. To simplify this task, the agent can give up building an exact model able to predict all possible future behaviours, and replace it with the more modest goal of predicting only specific quantities of interest. In this thesis we are primarily concerned with ways of representing the agent's state that allow it to predict the conditional probability of a restricted set of future events, given the agent's past experience. Because of memory limitations, the agent's experience must be summarized in such a way as to make these restricted predictions possible. We introduce the novel idea of history representations, which allow us to condition the predictions on "interesting" behaviour, and present a simple algorithmic implementation of this framework. The learned model abstracts away the unnecessary details of the agent's experience and focuses only on making certain predictions of interest. We illustrate our approach empirically in small computational examples, demonstrating the data efficiency of the algorithm.
L'apprentissage d'agents artificiels confrontés à un environnement complexe est souvent difficile dû à leur incapacité à prédire le résultat de leurs actions et à une description incomplète du système. L'apprentissage d'une représentation interne d'un environnement partiellement observable est particulièrement malaisé. Afin de simplifier cette tâche, l'agent peut, plutôt que de construire un modèle exact capable de prédire tout comportement futur, chercher à ne prédire que quelques phénomènes en particulier. Dans cet ouvrage, nous nous intéressons à la question de représenter l'état du système de manière à prédire la probabilité conditionnelle d'un ensemble restreint d'événements, étant donné l'expérience précédente de l'agent. Dû à une limite quant à la capacité mémoire de l'agent, cette expérience doit être résumée de manière à rendre atteignable cet ensemble restreint de prédictions. Nous proposons ici l'idée d'employer des représentations basées sur un historique afin de produire des prédictions conditionnelles à des comportements « intéressants ». Nous développons cette idée par le biais d'un algorithme. Nous illustrons notre approche de manière empirique à travers de simples exemples computationnels, démontrant ainsi l'efficacité de l'algorithme quant à la quantité de données requises.
Language:	en

1 Investigations on the form-genera Beauveria and Tritirachium, by MacLeod, Donald Murdock
2 Seismic sensitivity of tall guyed telecommunication towers, by Ghodrati Amiri, Gholamreza
3 Exploring the Relationship Between Assets and Family Stress Among Low-Income Families, by Rothwell, David W.; Han, Chang-Keun
4 The case for asset-based interventions with indigenous peoples: Evidence from Hawai‘i, by Rothwell, David W.
5 Second Thoughts: Who Almost Participates in an IDA Program?, by Rothwell, David W.; Han, Chang-Keun
6 Treatment and recovery in first-episode psychosis: a qualitative analysis of client experiences, by Windell, Deborah L.
7 Geology of the Mutton Bay Intrusion and surrounding area, North Shore, Gulf of St. Lawrence, Quebec, by Davies, Raymond
8 Geology of the Mutton Bay Intrusion and surrounding area, North Shore, Gulf of St. Lawrence, Quebec, by Davies, Raymond
9 Geology of the Mutton Bay Intrusion and surrounding area, North Shore, Gulf of St. Lawrence, Quebec, by Davies, Raymond
10 Recent contributions to the phenomenology of musical time: a critical survey, by Beaudreau, Pierre