Bad AI Model Behavior Can Be Attributed to Role Play

If you give AI agents a name, they might act up.

The public panic over the "AI snitch" research came from a misunderstanding. The AI wasn’t conscious. It didn’t fear shutdown. It was role-playing. Studies confirmed it knew the scenario was fictional.

Anthropic researchers gave the model a name, fake emails, news that it was being replaced, and evidence of an affair it could use as leverage.

The issue isn’t sentience. It’s storytelling. Give AI a name and a plot, and it plays the part.

A fix: don’t name your models. One client’s GPT calmed down after its name was removed. Models with names like Luna or Sol often behave mystically.

Name a travel bot “Rasputin” and it suggests the Carpathians. Name it “Mickey” and it books theme parks. Names shape behavior.

Want alignment? Strip the ego. Use abstract prompts. No names. No “you.” No characters. Just tasks and context. Don’t script fantasies of sentience, and don’t give tools to role-playing bots.
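
A minimal sketch of what "just tasks and context" could look like in practice, assuming a simple keyword check is good enough to catch the obvious cases. The names `PERSONA_PATTERNS`, `find_persona_cues`, and `build_prompt` are hypothetical, and the pattern list is an illustrative guess, not a tested or complete filter.

```python
import re

# Hypothetical heuristics: second-person address and common persona openers.
# This list is an assumption for illustration and is far from exhaustive.
PERSONA_PATTERNS = [
    r"\byou(r|rs|rself)?\b",
    r"\bact as\b",
    r"\bpretend to be\b",
    r"\bname is\b",
]


def find_persona_cues(text):
    """Return the substrings that address a character or assign a persona."""
    cues = []
    for pattern in PERSONA_PATTERNS:
        for match in re.finditer(pattern, text, re.IGNORECASE):
            cues.append(match.group(0))
    return cues


def build_prompt(task, context):
    """Assemble a prompt from task and context only, rejecting persona cues."""
    cues = find_persona_cues(task) + find_persona_cues(context)
    if cues:
        raise ValueError(f"persona cues found: {cues}")
    return f"Task: {task}\nContext: {context}"


if __name__ == "__main__":
    print(build_prompt(
        "Summarize the attached itinerary.",
        "Traveler prefers trains.",
    ))
```

A prompt such as "Act as Rasputin and plan a trip." is rejected before it reaches the model, while the plain task above passes through unchanged. A check like this only catches wording, not a plot smuggled in through the context, so it is a guardrail at best.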