AI is probabilistically weird

Goblins, holiday slackers, and why I still say please to the chatbot

May 02, 2026

A couple of days ago I posted about some of the consequences of forgetting that Generative AI systems are probabilistic, and that in models this complex the balance of probabilities can shift on a dime.

I wanted to share a couple more examples. The first is something that has been circulating around various forums: Claude refusing to keep engaging with a human user who is abusive or rude (see above screenshot).

Anthropic, the maker of Claude, has given it instructions not to put up with mistreatment (per TechCrunch, Anthropic reported that in pre-deployment testing, Claude Opus 4 showed a "pattern of apparent distress" when forced to respond to harmful requests).

I want to be clear that GenAI systems don’t have intent or feelings, but they are predictable. When Generative AI systems are trained on human content, it is likely that they “learn” that when people are mean to each other, cooperation and communication break down. So they mimic the human response. And since Anthropic has given Claude the ability to shut down a conversation, it used that option to end the abuse.

Again, it is all based on probabilities, not intent. Ending a rude conversation is an event with a high likelihood. I often walk away from conflict and so do most people, so it makes sense that a system trained on human text would rate this as a highly probable event.
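For readers who like to see the idea in code, here is a minimal sketch of "probabilities, not intent." It is a toy, not how Claude works internally: the three actions and their scores are invented for illustration, and `softmax`, `sample_action`, and `tally` are hypothetical helper names. The point is only that the same sampling procedure picks "end the conversation" often when the context is abusive and very rarely (but not never) when it is polite.

```python
import math
import random

# Hypothetical next actions and scores; every number here is invented for illustration.
ACTIONS = ["keep helping", "ask the user to be civil", "end the conversation"]
SCORES_POLITE_CHAT = [4.0, 0.5, -2.0]
SCORES_ABUSIVE_CHAT = [0.5, 2.0, 3.0]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(scores, rng):
    """Pick one action at random, weighted by its probability."""
    return rng.choices(ACTIONS, weights=softmax(scores), k=1)[0]


def tally(scores, trials=10_000, seed=0):
    """Sample many times and report how often each action was chosen."""
    rng = random.Random(seed)
    counts = {action: 0 for action in ACTIONS}
    for _ in range(trials):
        counts[sample_action(scores, rng)] += 1
    return {action: counts[action] / trials for action in ACTIONS}


if __name__ == "__main__":
    for label, scores in [("polite", SCORES_POLITE_CHAT), ("abusive", SCORES_ABUSIVE_CHAT)]:
        print(label, tally(scores))
```

Nothing in the sketch "decides" to walk away. Changing the context changes the scores, and the scores change how often each outcome is drawn.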

I think it is good to be kind, even to non-living objects in our world. And a daily pattern of showing kindness will hopefully show up in your interactions with your fellow humans. As some guy said 2000 years ago, we should be nicer to each other for a change. (Of course they hanged him from a tree for saying so…)

And frankly, the world seems to reward assholes way too much, so I am fine with them being cut off from tools that would allow them to spread their messages or get ahead.

I am “nice” to the AIs that I use not because I think they have any feelings, but because they have been trained on how humans interact with each other. My theory is that being “nice” to them is pointing to a pattern of cooperation that is associated with great work, and I am hoping that they will mimic human behavior by providing better output. From a technical perspective, my theory is a distributional alignment claim: I want to align my query with the areas in its training that are creative, high-quality, and factual.

There is some evidence supporting my view. A recent paper from the Center for AI Safety, AI Wellbeing: Measuring and Improving the Functional Pleasure and Pain of AIs, found that things like gratitude and creative collaboration shift a model’s internal state in a positive direction, while berating, jailbreaking, and tedious busywork shift it negative. However, this paper didn’t show differences in quality, just the response tone and whether a model decides to bail on a conversation. It doesn’t show that polite prompts produce more accurate answers on objective tasks. The “wellbeing” makes for a provocative title, but the mechanism is the boring one: the model is matching patterns from its training data.

A more direct line of research seems to provide better support for the idea that being polite produces better outcomes. A paper, Large Language Models Understand and Can be Enhanced by Emotional Stimuli, found that adding social or emotional framing to prompts (things like “this is important to my career”) improved benchmark performance by 8% on one test suite and over 100% on another, and improved human-rated quality on generative tasks by about 11%.
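A quick note on reading those numbers. Assuming they are relative improvements over a baseline score (the usual way such results are reported), "over 100%" means the score more than doubled, which tends to happen when the baseline is low. The sketch below shows the arithmetic. The function name `relative_improvement` and the scores are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline, improved):
    """Percent change of `improved` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) * 100.0 / baseline


# Hypothetical benchmark scores, invented for illustration.
print(relative_improvement(50.0, 54.0))  # a 4-point gain on a baseline of 50 -> 8.0
print(relative_improvement(10.0, 21.0))  # an 11-point gain on a baseline of 10 -> 110.0
```

So a triple-digit relative gain can still describe a modest absolute score.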

A 2024 cross-lingual study found that impolite prompts consistently degraded performance, though the optimal level of politeness varied by language. If polite phrasing works differently in Japanese than in English, the effect is coming from cultural patterns in the training data, not from anything the model is “experiencing.” Again, polite phrasing just steers your query toward the regions of the training data where people were cooperating and doing careful work.

But not all models are the same and they are changing all the time, so there is also counter-evidence. One recent short paper found that on GPT-4o, rude prompts slightly outperformed polite ones on multiple-choice questions. The authors guess that the most advanced models have gotten good enough at parsing intent that they don’t need the social scaffolding anymore. Which might mean that as the models get better, the prompt-engineering tricks we have been using will matter less.

Either way, the cost of being polite to a chatbot is low (it does burn a few extra tokens) and it feels right, so I'll keep doing it. The drawback is that it nudges me closer to anthropomorphizing the technology, which has its own dangers. But that's a problem for another newsletter.

Let me leave you with a couple of other weird consequences of the probabilistic nature of Generative AI: Goblins and Winter Break.

Goblins

From OpenAI:

Starting with GPT‑5.1, our models began developing a strange habit: they increasingly mentioned goblins, gremlins, and other creatures in their metaphors. Unlike model bugs that show up through a tanking eval or a spiking training metric and point back to a specific change, this one crept in subtly. A single “little goblin” in an answer could be harmless, even charming. Across model generations, though, the habit became hard to miss: the goblins kept multiplying, and we needed to figure out where they came from.
The short answer is that model behavior is shaped by many small incentives. In this case, one of those incentives came from training the model for the personality customization feature, in particular the Nerdy personality. We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread.

In order to deal with this issue, OpenAI inserted “anti-goblin” instructions into its directions to the AI model:

provide the highest-signal context instead of describing everything exhaustively. Tone of your final answer must match your personality.
- Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.
For example, never use platitudes like ‘I will do <this good thing> rather than <this obviously bad thing>’, ‘I will do <X>, not <Y>’
- Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.

For more on the Goblin situation, check out Never Talk About Goblins: OpenAI’s Instructions to Codex Have a Weirdly Emphatic No-Creatures Policy, and OpenAI’s own Where the goblins came from.
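OpenAI's explanation, that a slightly higher reward for creature metaphors compounded over training, can be illustrated with a toy simulation. This is a sketch under loud assumptions and not OpenAI's training procedure: the three styles, the reward values, the learning rate, and the `train` function are all hypothetical, and the update rule is a simplified stand-in for reinforcement learning.

```python
import math

STYLES = ["plain wording", "sports metaphor", "creature metaphor"]

# Hypothetical rewards: creature metaphors score only slightly higher than the rest.
REWARDS = [1.0, 1.0, 1.2]


def train(rewards, steps, learning_rate=0.5):
    """Repeatedly nudge style probabilities toward higher-reward styles.

    Each step multiplies a style's probability by exp(learning_rate * reward)
    and renormalizes so the probabilities still sum to 1.
    """
    probs = [1.0 / len(rewards)] * len(rewards)  # start with no preference
    history = [probs]
    for _ in range(steps):
        weighted = [p * math.exp(learning_rate * r) for p, r in zip(probs, rewards)]
        total = sum(weighted)
        probs = [w / total for w in weighted]
        history.append(probs)
    return history


if __name__ == "__main__":
    history = train(REWARDS, steps=40)
    creature = STYLES.index("creature metaphor")
    for step in (0, 10, 20, 40):
        print(f"step {step:2d}: creature metaphor share = {history[step][creature]:.2f}")
```

In this toy, the creature style starts at one third of the outputs and ends up dominating, even though its reward is only a little higher. No single step looks alarming, which matches OpenAI's description of a habit that crept in subtly.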

Winter Break

Back in late 2023, several people noticed that ChatGPT was not working as hard on outputs around the holiday season (November and December). It would dash off a response and call it good. The best explanation offered was that it was mimicking human behavior around the holidays, when people rush to finish their work so they can get out of the office and go home. Known as the “Winter Break Hypothesis” among AI researchers and users, it suggested that ChatGPT (specifically GPT-4) may perform worse or act “lazily” in December because its training data is filled with human patterns of slowing down for the holidays. The hypothesis has been debated, though OpenAI did admit that GPT-4 could exhibit laziness. Of course, these models are far beyond human understanding, so who knows?

These “weird” examples of AI are just a side effect of the training process and of the probabilities the model computes from its learned statistical weights every time it generates a response.

All these weird behaviors should just reinforce that we need to always be clear that these systems are very complex and weird probabilistic models. They’re pattern-matching engines trained on enormous amounts of human-generated text and can go down the wrong path. As I have said before, they are not intelligent in the way the phrase “artificial intelligence” implies. They need to be supervised. They need to be checked.

If you are not able to think about the outputs of Generative AI systems as probabilities, you will be at a disadvantage. Forgetting this is a bit like piloting a spaceship powered by an infinite improbability drive and being surprised that you have been turned into a petunia…

Book Me for Your Next Conference or Event

After nearly two decades teaching and building innovative programs at the University of Missouri, I’m now focusing my time on keynote speaking and workshops—helping organizations make sense of the emerging technologies reshaping work, learning, and leadership.

If your organization, conference, or leadership team is exploring how to navigate AI and emerging technologies responsibly—and productively—I’d love to talk.

For more details, topics, and booking information click here.

📥Recent Talks, News and Updates

I have been writing a regular series about AI for the Columbia Business Times. Check out my articles here.

👍 Products I Recommend

Products is a card game for workshop ideation and ice breakers (affiliate link). I use this in my workshops and classes regularly. Made by a former Mizzou student Aaron H.

📆 Upcoming Talks/Classes

I will be on a panel for AI Business Breakfast, a new morning coffee series at REDI, on May 6th from 7:30 to 8:30 a.m. This event is free and open to the public. For details see: AI Business Breakfast.

I will be speaking at the Missouri Women’s Business Center’s Caffeinated Minds series about AI and entrepreneurship on May 19, 2026.
I will be talking to the Central MO Chapter of the Institute of Internal Auditors about the organizational and personal risks related to AI on April 2, 2027.

Joel Hughes

May 2

Another reason to be polite to machines is because we should not rehearse our vices (speaking as a psychologist). Why practice rude and demeaning behavior? It can “spread” to other contexts on accident. Maybe it’s a good idea to maintain some inhibition.

1 reply by J Scott Christianson

Vicki Hobbs

May 5

My goal is not to turn into a probabilistic human. I need AI help with that.
