A Multi-agent Cooperative Reinforcement Learning Model Using a Hierarchy of Consultants, Tutors and Workers


Abed-Alguni, B. H., Chalup, S. K., Henskens, F. A., and Paul, D. “A Multi-Agent Cooperative Reinforcement Learning Model Using a Hierarchy of Consultants, Tutors and Workers”, Vietnam Journal of Computer Science, 2:4, November 2015, pp 213-226, DOI: 10.1007/s40595-015-0045-x.

Abstract

The hierarchical organisation of distributed systems can provide an efficient decomposition for machine learning. This paper proposes an algorithm for cooperative policy construction for independent learners, named Q-learning with Aggregation (QA-learning). The algorithm is based on a distributed hierarchical learning model and utilises three specialisations of agents: workers, tutors and consultants. The consultant agent incorporates the entire system in its problem space, which it decomposes into sub-problems that are assigned to the tutor and worker agents. The QA-learning algorithm aggregates the Q-tables of worker agents into a central repository managed by their tutor agent. Each tutor's Q-table is then incorporated into the consultant's Q-table, resulting in a Q-table for the entire problem. The algorithm was tested using a distributed hunter-prey problem, and experimental results show that QA-learning converges to a solution faster than single-agent Q-learning and several well-known cooperative Q-learning algorithms.
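
The two aggregation steps described in the abstract can be sketched roughly as follows. This is an illustration only, not the paper's algorithm: the abstract does not state the aggregation rule, so the sketch assumes the tutor averages each (state, action) entry over the workers that hold a value for it, and that the consultant simply collects the tutor tables under a sub-problem identifier. The function names `aggregate_workers` and `build_consultant_table`, the dictionary layout and the sample values are all hypothetical.

```python
from collections import defaultdict


def aggregate_workers(worker_tables):
    """Tutor step: merge the Q-tables of the workers on one sub-problem.

    Assumption (not from the paper): each entry is the mean over the
    workers that have a value for that (state, action) pair.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for table in worker_tables:
        for key, q_value in table.items():
            sums[key] += q_value
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def build_consultant_table(tutor_tables):
    """Consultant step: combine tutor tables into one table for the
    whole problem, keyed by (sub_problem_id, state, action)."""
    combined = {}
    for sub_id, table in tutor_tables.items():
        for (state, action), q_value in table.items():
            combined[(sub_id, state, action)] = q_value
    return combined


# Hypothetical Q-tables of two workers assigned to the same sub-problem.
worker_a = {("s0", "left"): 1.0, ("s0", "right"): 0.0}
worker_b = {("s0", "left"): 3.0, ("s1", "left"): 2.0}

tutor_1 = aggregate_workers([worker_a, worker_b])
consultant = build_consultant_table({"region_1": tutor_1})
```

Here `tutor_1` holds one value per (state, action) pair seen by any worker, and `consultant` holds the same values tagged with the sub-problem they came from. Other aggregation rules, such as taking the maximum or weighting by visit counts, would fit the same structure.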
