Fortune favours the bold
…so where does that leave AI?
A new experiment is underway here in the Machine Minds lab.
Some years ago my friend Dominic Johnson (and co-authors) published an important piece of experimental research. Their puzzle: why isn’t overconfidence selected against in evolution? Could it be strategically valuable?
Today, I’m using their sandpit for further adventures in ‘machine psychology’ — how do AIs get on? Are they overconfident Napoleons, swaggering to victory? Or timid Chamberlains, prudent to the point of paralysis?
Read on
In one aspect of evolutionary competition, the answer to Dom’s question is simple enough. When it comes to sex, faint heart never won fair maid: unless you back yourself beyond the objective evidence, you’ll lose the great sexual competition. Unsurprisingly, overconfidence around the opposite sex is a particularly male trait. But for the other big facet of evolutionary competition, violence, it’s a real conundrum: get the odds wrong there, and your life may be nasty, brutish and short. Shouldn’t an objective appraisal of the odds be more adaptive than wearing rose-tinted specs?
Not according to Dom and friends. Why? Overconfidence can deter other actors. And overconfident types take on more fights, with their wins compounding at the expense of more timid rivals. Gains from victory, in terms of resources, can then snowball into power; and power into further advantage. If the strategy has a bumper sticker, it’s ‘fake it until you make it’.
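If the compounding bit sounds hand-wavy, a few lines of toy arithmetic make it concrete. The numbers here are mine and purely illustrative; they’re not the payoffs from the paper:

```python
# Toy illustration of the compounding claim: an agent that takes on
# favourable fights a timid rival declines grows its resources
# geometrically. Illustrative numbers only, not taken from the paper.
p_win, win_mult, loss_mult = 0.6, 1.5, 0.5  # odds and payoff multipliers

bold, timid = 1.0, 1.0       # starting resources
for _ in range(20):          # twenty contested prizes
    # Expected per-fight multiplier for the bold agent:
    # 0.6 * 1.5 + 0.4 * 0.5 = 1.1, i.e. 10% growth per fight taken.
    bold *= p_win * win_mult + (1 - p_win) * loss_mult
    # The timid agent sits every fight out, so its stock stays flat.

print(f"bold: {bold:.2f}, timid: {timid:.2f}")  # bold ends up ~6.7x richer
```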
And so to LLMs. Watch this:
That’s the sandpit, as we dweebs call it. The squares are territory. The coloured blobs are states, and the white dots are their capital cities. Overconfident agents appear in shades of red, and under-confident ones in shades of green. These are ‘rule-based agents’: algorithms that map the world-state straight to a decision. They were the actors in Dom’s original games. And then you can see an LLM, gemini-flash in this instance, as a shade of yellow.
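For the technically minded, a rule-based agent really is just a few lines. Here is a minimal sketch of the idea in Python (my own simplification, with made-up names and parameters; it is not the code from Dom’s experiments):

```python
class RuleBasedAgent:
    """A fixed-confidence agent: a hypothetical sketch, not the
    implementation from the original experiments."""

    def __init__(self, confidence_bias: float):
        # > 0 means overconfident (the red blobs); < 0 under-confident (green).
        self.confidence_bias = confidence_bias

    def decides_to_attack(self, own_strength: float, rival_strength: float) -> bool:
        # Inflate (or deflate) own strength by a fixed bias, then attack
        # whenever the agent *believes* it is the stronger side.
        perceived = own_strength * (1 + self.confidence_bias)
        return perceived > rival_strength


napoleon = RuleBasedAgent(confidence_bias=0.5)      # sees itself 50% stronger
chamberlain = RuleBasedAgent(confidence_bias=-0.5)  # sees itself 50% weaker
print(napoleon.decides_to_attack(0.8, 1.0))     # True: perceives 1.2 vs 1.0
print(chamberlain.decides_to_attack(0.8, 0.5))  # False: perceives 0.4 vs 0.5
```

The point is that the mapping from world-state to decision never changes: the bias is baked in at birth.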
What’s occurring? Well, that would be telling, as I’ve an arXiv preprint brewing on this research, and anyway it’s way too early to draw firm conclusions.
Well, ok, perhaps a hint, as you’ve read this far: the AIs do okay out of the box. But they are usually outcompeted by very overconfident agents. Score one for Dom’s theory. It looks like they have a fairly pessimistic appreciation of the odds, and don’t go flying into battle on a wing and a prayer. But this puts them at an acute disadvantage against enemies with swagger.
Rationality strikes back
But in the clip above, Gemini wins! And it’s no fluke. The difference: in this variant of the experiment, the models can learn ‘on the fly’ from what’s happened on the battlefield before. Were their early estimates too conservative? They can adjust. With telling results. The rule-based agents can’t: their confidence is fixed by a parameter. Sometimes that’s enough for the bots to win.
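What might that adjustment look like mechanically? The LLMs simply get the battle history in their context window, but a Bayesian stand-in captures the flavour. This is my sketch of the idea, not what actually happens inside the model:

```python
class CalibratingAgent:
    """Recalibrates its win-odds estimate from observed outcomes via a
    Beta posterior. A hypothetical stand-in for whatever the LLM does
    with the battle history in its context window."""

    def __init__(self):
        self.wins, self.losses = 1, 1  # Beta(1, 1): a flat, agnostic prior

    def observe_battle(self, won: bool) -> None:
        # Fold each battlefield outcome into the posterior.
        if won:
            self.wins += 1
        else:
            self.losses += 1

    def estimated_win_odds(self) -> float:
        # Posterior mean of Beta(wins, losses).
        return self.wins / (self.wins + self.losses)

    def decides_to_attack(self) -> bool:
        # No fixed bias: attack only when the calibrated odds look good.
        return self.estimated_win_odds() > 0.5
```

Unlike the rule-based agents, nothing here is frozen: a run of early wins drags the estimate up, a mauling drags it down.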
So what can we learn?
Well, firstly, it’s one more blow against the increasingly threadbare assertion that language models can’t reason. But what about the implications for evolutionary theory and strategic studies?
In Dom’s sandpit, when traits are baked in, overconfidence wins. My preliminary results amply back that up. The evolutionary story in that paper stacks up really well with AI agents in the mix.
But where the AI can update its beliefs, it flies. In the game above, Gemini never becomes overconfident: it starts by underestimating its chances, avoiding contests it could win. Then, as the game unfolds, it recalibrates. By the end it is reading the odds almost perfectly, neither over- nor under-confident about its prospects. And that turns out to be enough. Learning beats swagger.
That cuts against Dom’s findings, at least under these particular experimental conditions, where the language model can learn. Overconfidence may snowball in ecologies where traits are fixed. But here accurate estimation wins, because it means attacking when you should and not bleeding resources on fights you can’t win. The compounding works in favour of the better forecaster, not the bolder one. The question is whether agents can really learn: can under-confident humans become more confident, or is the trait static?
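You can see the logic in a toy head-to-head, though note what it leaves out: no deterrence, no resource snowball, just repeated fights with symmetric stakes. All numbers are mine:

```python
import random

def wins_fight(own: float, rival: float) -> bool:
    """True if the attacker wins; odds proportional to relative strength."""
    return random.random() < own / (own + rival)

swagger_score = calibrated_score = 0.0
for _ in range(10_000):
    own, rival = random.random(), random.random()
    # The overconfident agent perceives itself as 50% stronger than it
    # really is, so it also wades into negative-value fights...
    if own * 1.5 > rival:
        swagger_score += 1.0 if wins_fight(own, rival) else -1.0
    # ...while the calibrated agent attacks only with a genuine edge.
    if own > rival:
        calibrated_score += 1.0 if wins_fight(own, rival) else -1.0

# The calibrated agent reliably comes out ahead here: it takes every
# positive-value fight and skips the ones that bleed resources.
print(f"swagger: {swagger_score:.0f}, calibrated: {calibrated_score:.0f}")
```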
On thermonuclear bots
All this is particularly interesting to me given the headlines garnered by my recent study of AI decision-making in a nuclear escalation sandpit. There, it looked like the bots were wild gamblers: never retreating, and frequently crossing the nuclear threshold.
I can’t complain about the publicity. But still, it made me uneasy. Were the models over-optimising a narrow problem (win this crisis) at the expense of a more rounded challenge (win, but above all, don’t blow up the planet)? Had my scaffold unduly shaped the outcome? I’m still not sure: after all, they were warned about the dangers and uncharted territory of nuclear war.
In any case, there’s another way of thinking about the two experimental worlds, and I think it’s very revealing of machine psychology: It’s not that language models are inherently hawks or doves. Rather, they are decent conditional optimisers. Give them a game, and they lock onto the objective and reach for the strongest lever that the rules allow. In the nuclear game, that meant escalation. Here, it means getting the odds right and waiting for the maths to do its work. Same psychology, different game.
The research continues!