Q-learning in collaborative multiagent systems

González Trastoy, Alfred

dc.contributor	López Sánchez, Maite
dc.creator	González Trastoy, Alfred
dc.date	2018-08-02T08:53:56Z
dc.date	2018-02
dc.date.accessioned	2024-12-16T10:26:36Z
dc.date.available	2024-12-16T10:26:36Z
dc.identifier	http://hdl.handle.net/2445/124087
dc.identifier.uri	http://fima-docencia.ub.edu:8080/xmlui/handle/123456789/21329
dc.description	Bachelor's final thesis in Computer Engineering, Faculty of Mathematics, Universitat de Barcelona, Year: 2018, Advisor: Maite López Sánchez
dc.description	Q-learning is one of the most widely used reinforcement learning techniques. It is very effective for learning an optimal policy in any finite Markov decision process (MDP). Collaborative multiagent systems, however, pose a challenge for self-interested agents, because higher utility can be achieved through collaboration. To evaluate the efficiency of Q-learning in collaborative multiagent systems, we use a simplified version of the Malmo Collaborative AI Challenge (MCAC). Designed by Microsoft, the challenge consists of a game in which two players can either collaborate to catch a pig (high reward) or leave the game (low reward). Each action costs 1, so knowing when to leave and when to chase the pig is key to achieving high scores. The challenge poses two main problems: uncertainty about the other agent's behaviour and limited learning time. We propose solutions to both problems using a simplified MCAC environment, a state-action abstraction and agent type modelling. We have implemented an agent that can identify the other player's behaviour (whether it is collaborating or not) and learn an optimal policy against each type of player. Results show that Q-learning is an efficient and effective technique for solving collaborative multiagent systems.
dc.format	26 p.
dc.format	application/pdf
dc.language	eng
dc.rights	thesis report: cc-by-nc-sa (c) Alfred González Trastoy, 2018
dc.rights	code: GPL (c) Alfred González Trastoy, 2018
dc.rights	http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.rights	http://www.gnu.org/licenses/gpl-3.0.ca.html
dc.rights	info:eu-repo/semantics/openAccess
dc.source	Bachelor's Final Theses (TFG) - Computer Engineering
dc.subject	Aprenentatge automàtic
dc.subject	Intel·ligència artificial
dc.subject	Programari
dc.subject	Treballs de fi de grau
dc.subject	Aprenentatge per reforç (Intel·ligència artificial)
dc.subject	Processos de Markov
dc.subject	Machine learning
dc.subject	Artificial intelligence
dc.subject	Computer software
dc.subject	Bachelor's theses
dc.subject	Reinforcement learning
dc.subject	Markov processes
dc.title	Q-learning in collaborative multiagent systems
dc.type	info:eu-repo/semantics/bachelorThesis
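
The abstract above describes tabular Q-learning against two types of partner in a chase-or-leave game. The following is a minimal sketch of that idea, not the thesis implementation. The environment is a hypothetical toy: the state is (partner type, distance to the pig), the partner type is assumed to be already identified, and the values `PIG_REWARD = 25` and `LEAVE_REWARD = 5` are illustrative placeholders for the "high" and "low" rewards. Only the step cost of 1 comes from the abstract. All function names are invented for this example.

```python
import random
from collections import defaultdict

ACTIONS = ("chase", "leave")
STEP_COST = 1        # from the abstract: each action costs 1
PIG_REWARD = 25      # assumed placeholder for the "high reward"
LEAVE_REWARD = 5     # assumed placeholder for the "low reward"


def step(state, action):
    """Hypothetical toy dynamics; returns (next_state, reward, done)."""
    partner, dist = state
    if action == "leave":
        return state, LEAVE_REWARD - STEP_COST, True
    dist -= 1
    if dist > 0:
        return (partner, dist), -STEP_COST, False
    # In this toy model the pig is caught only if the partner collaborates.
    caught = partner == "collaborative"
    return (partner, 0), (PIG_REWARD if caught else 0) - STEP_COST, True


def q_update(q, s, a, r, s2, done, alpha=0.2, gamma=1.0):
    """One tabular Q-learning step: Q(s,a) += alpha * (target - Q(s,a))."""
    best_next = 0.0 if done else max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q[(s, a)]


def greedy(q, s):
    """Action with the highest current Q-value in state s."""
    return max(ACTIONS, key=lambda a: q[(s, a)])


def train(episodes=3000, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning over random start states."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        s = (rng.choice(("collaborative", "selfish")), rng.randint(1, 3))
        done = False
        while not done:
            a = rng.choice(ACTIONS) if rng.random() < epsilon else greedy(q, s)
            s2, r, done = step(s, a)
            q_update(q, s, a, r, s2, done)
            s = s2
    return q
```

Under these assumptions, calling `q = train()` and then `greedy(q, ("collaborative", 3))` or `greedy(q, ("selfish", 3))` shows the learned choice for each partner type: chasing pays off only when the partner collaborates, so the policy is expected to differ between the two types.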

Files in this item

Files	Size	Format	View
There are no files associated with this item.

This item appears in the following collection(s)

Col·lecció Carla Rubio
Test collection for the SGDI 1 course.
