The Forecaster Test

This test measures five dimensions of judgment quality that predict forecasting accuracy: Bayesian updating, regression awareness, calibration, cognitive reflection, and open-minded thinking.

Time: ~8-10 minutes

Section 1: Bayesian Update Tests

These problems test how you update probability estimates when new evidence arrives. First commit to an initial estimate, then update based on new information.

Problem 1 of 3

Stage 1: Initial Estimate

A company implements mandatory drug testing. The test has:

95% sensitivity (correctly detects 95% of drug users)
92% specificity (correctly clears 92% of non-users)

In this industry, approximately 4% of employees use drugs.

An employee tests positive. What is the probability they actually use drugs?

Stage 2: Update with Evidence

New evidence: You learn that this employee works in the warehouse division. Internal audits have found that 11% of warehouse employees use drugs (vs. the 4% baseline used in Stage 1).

Recalculate: What is the probability this employee uses drugs?

Most people spend about 30 seconds on each stage
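For readers who want to check their arithmetic after committing to both estimates, the two stages can be sketched with Bayes' rule as below. This is only an illustrative sketch: the function name is hypothetical, and it assumes the test's sensitivity and specificity are the same in the warehouse division as elsewhere, which the problem does not state explicitly.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(user | positive test) via Bayes' rule (hypothetical helper)."""
    true_positive = prior * sensitivity                # users who test positive
    false_positive = (1 - prior) * (1 - specificity)   # non-users who test positive
    return true_positive / (true_positive + false_positive)


# Stage 1: 4% base rate. Stage 2: swap in the 11% warehouse base rate.
stage1 = posterior_given_positive(0.04, 0.95, 0.92)
stage2 = posterior_given_positive(0.11, 0.95, 0.92)

print(f"Stage 1: {stage1:.1%}")  # about 33.1%
print(f"Stage 2: {stage2:.1%}")  # about 59.5%
```

Only the prior changes between the stages; the evidence (a positive test) and its error rates stay the same.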

Problem 2 of 3

Stage 1: Initial Estimate

A retailer sources products from two factories:

Factory A supplies 70% of inventory, with a 3% defect rate
Factory B supplies 30% of inventory, with a 9% defect rate

A customer returns a defective product. What is the probability it came from Factory A?

Stage 2: Update with Disconfirming Evidence

New evidence: The defect is identified as a coating flaw. Historical data shows:

15% of Factory A's defects are coating flaws
60% of Factory B's defects are coating flaws

Given the defect is a coating flaw, update your estimate for Factory A.

Most people spend about 30 seconds on each stage
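One way to check both stages of this problem is to weight each factory by how likely it is to produce the observed evidence and then normalize. The sketch below is illustrative only; the helper and variable names are hypothetical and not part of the test.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1 (hypothetical helper)."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


share = {"A": 0.70, "B": 0.30}                 # share of inventory
defect_rate = {"A": 0.03, "B": 0.09}           # P(defect | factory)
coating_given_defect = {"A": 0.15, "B": 0.60}  # P(coating flaw | defect, factory)

# Stage 1: P(factory | defect)
stage1 = normalize({f: share[f] * defect_rate[f] for f in share})

# Stage 2: P(factory | defect is a coating flaw), using Stage 1 as the prior
stage2 = normalize({f: stage1[f] * coating_given_defect[f] for f in share})

print(f"Stage 1, Factory A: {stage1['A']:.1%}")  # about 43.8%
print(f"Stage 2, Factory A: {stage2['A']:.1%}")  # about 16.3%
```

The Stage 1 posterior becomes the Stage 2 prior, so the coating-flaw evidence is applied on top of the earlier update rather than to the raw inventory shares.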

Problem 3 of 3

Stage 1: Initial Estimate

A venture fund screens startup founders. Historically:

6% of applicants are "high-potential" (will return 10x+)
The screening committee correctly advances 85% of high-potential founders
The committee also advances 20% of ordinary founders

A founder passes the screening committee. What is the probability they are high-potential?

Stage 2: Update with Mixed Evidence

New evidence: You learn two additional facts about this founder:

Signal A (positive): The committee ranked them in their top 3 picks of the quarter. Among advanced founders:

40% of high-potential founders receive top-3 ranking
8% of ordinary founders receive top-3 ranking

Signal B (negative): The founder has no prior startup experience. Among advanced founders:

30% of high-potential founders lack prior experience
65% of ordinary founders lack prior experience

Integrating both signals, update your estimate for this founder being high-potential.

Most people spend about 45 seconds on each stage
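Mixed evidence is often easiest to combine in odds form, where each signal multiplies the current odds by its likelihood ratio. The sketch below is a check under one important assumption that the problem does not state: Signal A and Signal B are treated as conditionally independent given the founder's type. The helper names are hypothetical.

```python
def to_odds(probability):
    return probability / (1 - probability)


def to_probability(odds):
    return odds / (1 + odds)


prior = 0.06                    # share of high-potential applicants
lr_advanced = 0.85 / 0.20       # passing the committee favors high-potential
lr_top3 = 0.40 / 0.08           # Signal A favors high-potential
lr_no_experience = 0.30 / 0.65  # Signal B favors ordinary

# Stage 1: update on passing the screening committee.
stage1_odds = to_odds(prior) * lr_advanced
stage1 = to_probability(stage1_odds)

# Stage 2: apply both signals (assumed conditionally independent).
stage2 = to_probability(stage1_odds * lr_top3 * lr_no_experience)

print(f"Stage 1: {stage1:.1%}")  # about 21.3%
print(f"Stage 2: {stage2:.1%}")  # about 38.5%
```

Under this assumption the positive signal (ratio 5) outweighs the negative one (ratio about 0.46), so the estimate rises rather than falls. If the two signals are correlated, the combined update would differ.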

Section 2: Intuition Check

This section tests whether you recognize common statistical patterns that often mislead people.

Performance Prediction

A regional sales team had an exceptional Q3, beating their quarterly target by 40%. This was their best quarter in 3 years.

The company's leadership is now projecting Q4 performance for this team. The team composition and market conditions are expected to remain similar.

What's most likely for Q4?

Q4 will likely match or exceed Q3 (they've found their groove)

Q4 will likely be above average but below Q3 (regression toward typical performance)

Q4 is unpredictable from Q3 data alone

Most people answer in about 20 seconds
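The statistical pattern behind this question can be illustrated with a small simulation. The model below is an assumption made purely for illustration and is not part of the test: each team's quarterly result is a stable skill level plus independent luck, both drawn from a standard normal distribution, and the function and parameter names are hypothetical.

```python
import random
from statistics import mean


def simulate(n_teams=20000, top_fraction=0.05, seed=0):
    """Compare Q3 and Q4 averages for teams selected on an exceptional Q3."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n_teams)]
    q3 = [s + rng.gauss(0, 1) for s in skill]  # skill plus luck
    q4 = [s + rng.gauss(0, 1) for s in skill]  # same skill, fresh luck
    cutoff = sorted(q3)[int(n_teams * (1 - top_fraction))]
    top = [i for i in range(n_teams) if q3[i] >= cutoff]
    return mean(q3[i] for i in top), mean(q4[i] for i in top)


q3_top, q4_top = simulate()
print(f"Top teams, Q3 average: {q3_top:.2f}")
print(f"Same teams, Q4 average: {q4_top:.2f}")
```

In this toy model the overall average is zero, so a Q4 average for the selected teams that is positive but lower than their Q3 average is what regression toward typical performance looks like: the skill persists, the luck does not.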

Section 3: Calibration Check

For each statement, indicate whether you think it's true or false, then rate your confidence in that answer. These items test how well your confidence matches your accuracy.
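One common way to quantify how well confidence matches accuracy is the Brier score: the mean squared gap between the stated confidence and what actually happened (1 if the answer was correct, 0 if not). The sketch below assumes this scoring rule for illustration; the test's actual scoring method is not described here, and the responses are made-up examples.

```python
def brier_score(responses):
    """Mean squared error between confidence and correctness.

    `responses` is a list of (confidence, was_correct) pairs, where
    confidence is the probability assigned to the chosen answer.
    Lower is better; always answering 50% scores 0.25.
    """
    return sum((conf - (1.0 if correct else 0.0)) ** 2
               for conf, correct in responses) / len(responses)


# Hypothetical responses, not real test data.
example = [(0.75, True), (0.75, False), (0.90, True), (0.60, False)]
always_guessing = [(0.50, True), (0.50, False)]

print(round(brier_score(example), 5))
print(brier_score(always_guessing))
```

Confident wrong answers are penalized heavily under this rule, which is why stating 75% on a question you are only guessing at costs more than stating 50%.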

Statement 1 of 4

More than half of major corporate mergers fail to create shareholder value (as measured by stock performance vs. industry benchmarks 3 years post-merger).

True

False

How confident are you? (50% = just guessing, 100% = certain)

Most people answer in about 15 seconds

Statement 2 of 4

Most startups (more than 50%) that raise Series A funding eventually go on to raise a Series B round.

True

False

How confident are you?

Most people answer in about 15 seconds

Statement 3 of 4

Clinical trials that show positive results are published at higher rates than those showing null or negative results.

True

False

How confident are you?

Most people answer in about 15 seconds

Statement 4 of 4

Professional economic forecasters accurately predict the direction (up or down) of annual GDP growth more than 75% of the time when forecasting 2 years ahead.

True

False

How confident are you?

Most people answer in about 15 seconds

Section 4: Cognitive Reflection

Take a moment to verify your answer.

Problem 1 of 2

A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?

Problem 2 of 2

A lily pad patch doubles in size every day. It covers the entire lake on day 48. On which day did it cover half the lake?
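A quick way to verify a candidate answer to either problem is to plug it back into the stated constraints. The sketch below shows one way to do that; the function names are hypothetical, and prices are held in whole cents to avoid floating-point rounding.

```python
def bat_ball_ok(ball_cents, total_cents=110, difference_cents=100):
    """True if this ball price satisfies both stated conditions."""
    bat_cents = ball_cents + difference_cents
    return ball_cents + bat_cents == total_cents


def coverage_on(day, full_day=48):
    """Fraction of the lake covered on `day`, given daily doubling."""
    return 0.5 ** (full_day - day)


print(bat_ball_ok(10))   # False: the intuitive answer totals $1.20
print(bat_ball_ok(5))    # True
print(coverage_on(24))   # far below one half
print(coverage_on(47))   # 0.5
```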

Section 7: Scientific Calibration

Each study below was later tested in a large, pre-registered replication attempt. Estimate the probability that the original finding successfully replicated.

Study 1 of 6

Ego Depletion (1998)

Participants who first resisted eating cookies (exerting self-control) gave up faster on a subsequent puzzle task than those who hadn't resisted temptation. The researchers concluded that willpower is a limited resource that gets depleted with use.

Probability this replicated?

Study 2 of 6

Facial Feedback (1988)

Participants who held a pen in their teeth (forcing a smile-like expression) rated cartoons as funnier than those who held the pen with their lips (preventing smiling). The researchers concluded that facial expressions can directly influence emotional experience.

Probability this replicated?

Study 3 of 6

Anchoring Effect (1974)

Participants who first saw a random number (e.g., a spun wheel landing on "65") gave higher estimates on unrelated questions (e.g., "What percentage of African nations are in the UN?") than those who saw lower random numbers. The researchers concluded that arbitrary initial values bias subsequent numerical judgments.

Probability this replicated?

Study 4 of 6

Power Posing (2010)

Participants who held "expansive" poses (arms spread, taking up space) for two minutes showed increased testosterone and decreased cortisol compared to those in "contractive" poses. The researchers concluded that body posture directly affects hormone levels and feelings of power.

Probability this replicated?

Study 5 of 6

Loss Aversion (1979)

When choosing between gambles, people required potential gains to be roughly twice as large as potential losses before they'd accept a 50/50 bet. The researchers concluded that losses loom larger than equivalent gains in decision-making.

Probability this replicated?

Study 6 of 6

Elderly Priming (1996)

Participants who unscrambled sentences containing words related to old age (e.g., "Florida," "wrinkle," "gray") walked more slowly down the hallway afterward than those exposed to neutral words. The researchers concluded that subtle exposure to concepts can unconsciously influence behavior.

Probability this replicated?

The Forecaster Test

How it works

Assessment

Your Results

Bayesian Reasoning

Diagnostic Reasoning

Cognitive Reflection

Open-Minded Thinking

Your Forecasting Profile

Leaderboard Name

Predictions

Your Predictions

Leaderboard

The Research Question

Top Forecasters (by Brier Score)

Top Performers (by Judgment Score)

About

The Good Judgment Project

Generalizable Judgment

This Project

What We Measure

Can Judgment Improve?

Privacy
