
Abstract
The Nash equilibrium for the one-shot Prisoner’s Dilemma is to defect. Always. But here’s a thought: what if you knew the co-player across from you thought about the world the same way you do? Would you still defect? That’s the question we’re trying to answer, except instead of people, we’re using AI agents. I’ll share some early findings from ongoing experiments, a few things that surprised us, and plenty of open questions we haven’t resolved yet. Thoughts and feedback very welcome.
Bio
Akash Kundu is a final-year Computer Science undergraduate and Cooperative AI Research Fellow with experience in technical AI Safety, focusing on evaluating and stress-testing large language models. His work has uncovered behavioural failures across a range of dimensions — including dark patterns, sycophancy, harmful reasoning, and multilingual vulnerabilities. He has collaborated with Apart Research, FAR AI, and Humane Intelligence on evaluation pipelines, adversarial prompting, and cross-cultural red-teaming.