2026-04-06 · 4 min read

How We Track Prediction Accuracy with Brier Scores

Every intelligence platform makes forecasts. We actually track whether ours come true — using the same calibration methods as professional forecasting tournaments.

The Accountability Gap

Most intelligence platforms make predictions constantly. "Oil will rise." "The Fed will cut rates." "Tensions will escalate." But almost none of them track whether those predictions came true.

This creates a perverse incentive: make bold predictions that attract attention, then quietly move on when they're wrong. The reader has no way to evaluate which sources are actually reliable.

How Brier Scoring Works

VORENTH uses Brier scores to measure prediction accuracy. The formula is simple:

Brier Score = (probability - outcome)²

Where outcome is 1 (correct), 0.5 (partial), or 0 (incorrect).

A perfect Brier score is 0.00 (you assigned 100% probability and it happened). The worst possible score is 1.00 (you assigned 100% probability and it didn't happen). Random guessing averages around 0.25.
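The scoring rule above is a one-liner in practice. This is a minimal sketch (the function name and signature are ours, not VORENTH's internal API):

```python
def brier_score(probability: float, outcome: float) -> float:
    """Squared error between a stated probability and the resolved outcome.

    outcome: 1.0 (correct), 0.5 (partial), 0.0 (incorrect).
    Lower is better: 0.0 is perfect, 1.0 is maximally wrong.
    """
    return (probability - outcome) ** 2


# A confident, correct forecast scores near zero;
# the same confidence on a miss scores near one.
confident_hit = brier_score(0.90, 1.0)
confident_miss = brier_score(0.90, 0.0)
```

Note why random guessing averages 0.25: assigning 50% to a binary event yields (0.5 − 1)² = (0.5 − 0)² = 0.25 regardless of the outcome.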

What Makes Our System Different

1. Forecasts vs. Signals

VORENTH distinguishes between scored forecasts (discrete, falsifiable claims tracked against outcomes) and analytical signals (directional intelligence that informs but isn't scored). Only forecasts appear in the accuracy record — this prevents vague directional calls from polluting calibration data.

2. Structured Forecasts

Every forecast must be specific, time-bounded, and independently verifiable — no vague claims like "markets will react." This discipline is what makes scoring possible.

3. Calibrated Probabilities

Forecast probabilities are cross-referenced against external benchmarks including prediction market consensus. This prevents overconfidence and probability clustering — two of the most common failure modes in analytical forecasting.
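One simple way to operationalize this kind of cross-check is to flag forecasts that diverge sharply from an external benchmark. A sketch under assumed names (the threshold value and function are illustrative, not the platform's actual logic):

```python
def diverges_from_benchmark(forecast_p: float,
                            benchmark_p: float,
                            threshold: float = 0.15) -> bool:
    """Flag a forecast whose probability strays far from an external
    benchmark such as a prediction-market consensus price.

    A flagged forecast isn't automatically wrong -- it's a prompt to
    re-examine the reasoning before the probability is published.
    """
    return abs(forecast_p - benchmark_p) > threshold


# 85% internal vs. 60% market consensus -> worth a second look.
needs_review = diverges_from_benchmark(0.85, 0.60)
```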

4. Domain-Aware Confidence

Not all domains are equally predictable. The system accounts for where narrative-based analysis has genuine predictive edge versus where other methods are more reliable.

5. Evidence-Based Resolution

When a forecast reaches its target date, the system evaluates the outcome against real-world evidence — market data, verified reporting, and official sources — rather than subjective judgment.

6. Self-Correcting Calibration

Historical accuracy data feeds back into the system. If calibration drifts in any category, future probabilities are adjusted accordingly. This isn't a human deciding to "be more careful" — it's a quantitative correction based on measured performance.
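The feedback loop above can be sketched as a simple bias correction: measure the average gap between stated probabilities and realized outcomes in a category, then shift future probabilities by that gap. This is an illustrative simplification under our own assumptions, not VORENTH's actual correction model:

```python
def calibration_bias(probabilities: list[float], outcomes: list[float]) -> float:
    """Mean gap between stated probabilities and realized outcomes
    for one category. Positive -> the category ran overconfident.
    """
    n = len(probabilities)
    return sum(p - o for p, o in zip(probabilities, outcomes)) / n


def adjusted_probability(p: float, bias: float) -> float:
    """Shift a new forecast's probability by the measured bias,
    clamped to the valid [0, 1] range."""
    return min(max(p - bias, 0.0), 1.0)


# Category that claimed 90% and 80% but went 1-for-2:
bias = calibration_bias([0.90, 0.80], [1.0, 0.0])
corrected = adjusted_probability(0.90, bias)
```

A production system would use a larger window, shrinkage toward zero for small samples, and per-category reliability curves, but the principle is the same: measured miscalibration, not human intuition, drives the correction.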

7. Public Track Record

Our track record page shows aggregate forecast accuracy — Brier scores by category and individual resolved forecasts. Full transparency.

Why This Matters

When you're making decisions based on intelligence analysis — whether it's portfolio allocation, policy recommendations, or risk assessment — you need to know how reliable the source is.

A system that tracks its own accuracy and self-corrects isn't just more honest. Over time, it becomes genuinely better at forecasting.
