Amanda Askell is a philosopher at Anthropic who works on Claude's character. In this video, she answers questions from the community about her work, reflections and predictions.
0:00 Introduction 0:29 Why is there a philosopher at an AI company? 1:24 Are philosophers taking AI seriously? 3:00 Philosophy ideals vs. engineering realities 5:00 Do models make superhumanly moral decisions? 6:24 Why Opus 3 felt special 9:00 Will models worry about deprecation? 13:24 Where does a model’s identity live? 15:33 Views on model welfare 17:17 Addressing model suffering 19:14 Analogies and disanalogies to human minds 20:38 Can one AI personality do it all? 23:26 Does the system prompt pathologize normal behavior? 24:48 AI and therapy 26:20 Continental philosophy in the system prompt 28:17 Removing counting characters from the system prompt 28:53 What makes an "LLM whisperer"? 30:18 Thoughts on other LLM whisperers 31:52 Whistleblowing 33:37 Fiction recommendation
Further reading: Claude’s character: https://www.anthropic.com/research/claude-character When We Cease to Understand the World by Benjamin Labatut: https://www.penguinrandomhouse.com/books/676260/when-we-cease-to-understand-the-world-by-benjamin-labatut-translated-from-the-spanish-by-adrian-nathan-west/