Surveying research directions on AI safety


The AI community has often viewed AI safety with skepticism, questioning both its necessity and its plausibility. However, as we progress towards transformational AI systems, the urgency of this research has become apparent.

In this talk I present reasons why working on AI safety should be on your radar if you are even somewhat interested in AI. I will then discuss some pragmatic research approaches that treat AI safety as a safety problem, addressed through layered and systematic interventions that currently have wide open research problems waiting to be solved. These interventions fall into three domains: robustness, monitoring, and AI control/alignment. We will investigate open research problems within each of these domains.


Benjamin Sturgeon, based in Cape Town, used to work as a machine learning engineer. He’s now focused on studying AI safety, supported by a research grant from the Long Term Future Fund. His research currently focuses on measuring levels of agency in AI agents in RL settings. He’s also interested in large language models and how they might be used to influence the behaviour of RL agents.


26 July 2023