What is a calibrated probability?
Everyone quotes a number. Almost no one tells you how often it’s right. A calibrated probability is one you can actually take at face value — and it’s the difference between honest research and confident nonsense.
A number you can trust at face value
A probability is calibrated if reality matches it over many tries: across all the days a model says 60%, the thing should actually happen about 60% of the time. Not 95%, not 30%. The point isn’t to be bold — it’s to be right about how right you are.
This is why calibration beats confidence. A weather app that says “70% rain” and is right 70% of the time is far more useful than one that screams “100% rain!” and is wrong a third of the time. Confidence is cheap; calibration is earned.
Reliability and the Brier score
- Reliability diagram — group every prediction by the probability it stated, then check the realized rate. If the “54%” bucket happens 54% of the time, the model is calibrated there.
- Brier score — a single number that rewards being accurate and honest about uncertainty, and punishes overconfidence. Lower is better.
In Ioxer’s walk-forward tests, when the model said around 54% it landed near 54% — the honest “say 54, hit 52”. The live reliability curve is still filling in, and we won’t claim it’s settled until it has.
A few points off a coin-flip — on purpose
Here’s the part that sounds underwhelming and is actually the proof of honesty: Ioxer’s probabilities sit a few points off 50/50, not at 70% or 90%. The funding-crowding edge is real but modest, and a calibrated model has to say so.
If a crypto tool quotes you a confident 85% on a single coin, it’s not better than us — it’s mis-calibrated. In liquid crypto, a genuine edge is small, and pretending otherwise is the tell of a scam.
Common questions
What does "calibrated" mean for a probability?
It means the number is honest at face value: across all the times a model says 60%, the event should actually happen about 60% of the time. A model can be confident and wrong (poorly calibrated) or modest and reliable (well calibrated) — calibration is what makes a probability usable.
How do you measure calibration?
With a reliability diagram (group predictions by their stated probability and check the realized rate matches) and the Brier score (which rewards being both accurate and honest about uncertainty). In walk-forward tests, when Ioxer said around 54% it landed near 54%; the live reliability curve is still accumulating.
Why is Ioxer’s probability only a few points off 50%?
Because the edge is real but modest. An honest funding-crowding tilt is a few points off a coin-flip, not 70% or 90%. Any crypto tool quoting high-confidence per-coin probabilities is mis-calibrated or lying — a genuine edge in liquid crypto is small by nature.
Ioxer is research, not investment advice. IOX is a crowding read — not a price prediction, not a buy/sell signal.