Reinforcement learning has achieved recent success in a number of difficult, high-dimensional environments. However, these methods generally require millions of samples from the environment to learn optimal behaviours, limiting their real-world applicability. This work therefore aims to create a principled framework in which a lifelong agent learns essential skills and can combine them to solve new compositional tasks without further learning. To achieve this, we design suitable skill representations for each task and construct a Boolean algebra over the set of tasks and skills. This enables us to compose learned skills to immediately solve new tasks that are expressible as a logical composition of past tasks. We present theoretical guarantees for our framework and demonstrate its usefulness for lifelong learning via a number of experiments.
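The flavour of zero-shot composition described in the abstract can be illustrated with a toy NumPy sketch. This is only a hedged illustration, not the talk's actual formulation: the arrays, bounds, and helper names (`Q_A`, `Q_B`, `q_and`, `q_or`, `q_not`) are hypothetical stand-ins, assuming tabular action-value functions where conjunction and disjunction correspond to elementwise min and max, and negation reflects a value function between upper and lower value bounds.

```python
import numpy as np

# Toy sketch of Boolean skill composition (illustrative only; the
# names and the tabular setting are assumptions, not the talk's
# exact construction). Suppose we have learned action-value
# functions Q_A and Q_B for two base tasks A and B, stored as
# arrays of shape (n_states, n_actions).
rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
Q_A = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
Q_B = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

# Hypothetical upper/lower value bounds, used for negation.
Q_max = np.maximum(Q_A, Q_B)
Q_min = np.minimum(Q_A, Q_B)

def q_and(q1, q2):
    """Conjunction: elementwise minimum of value functions."""
    return np.minimum(q1, q2)

def q_or(q1, q2):
    """Disjunction: elementwise maximum of value functions."""
    return np.maximum(q1, q2)

def q_not(q):
    """Negation: reflect q between the upper and lower bounds."""
    return (Q_max + Q_min) - q

# A new task such as "A and not B" gets a value function, and hence
# a greedy policy, with no further learning:
Q_new = q_and(Q_A, q_not(Q_B))
greedy_policy = Q_new.argmax(axis=1)
```

Note that with this construction De Morgan's law holds exactly elementwise, e.g. `q_not(q_or(Q_A, Q_B))` equals `q_and(q_not(Q_A), q_not(Q_B))`, which is the sense in which the tasks form a Boolean algebra.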
Geraud Tasse is a PhD student in Computer Science at Wits and an IBM PhD fellow. Before moving to Wits for his M.Sc, he did his B.Sc in Computer Science, Pure & Applied Mathematics, and Physics at Rhodes University. His main research interests lie in reinforcement learning (RL), particularly in techniques that allow an agent to leverage existing skills to build complex, novel behaviours.
26 October 2022