Evaluating my 2023 predictions
Yes, this is a year late. I didn't quite get round to it last year.
If you’re new to this game, here’s the rundown: at the start of the year, for a list of events, you give the probability with which you think each will happen. At the end of the year, you see how many actually happened. The goal isn’t to get them “right” in the binary sense, but for your probabilities to be well calibrated. For example, about 70% of the things that you said would happen with probability 0.7 should turn out to be true.
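As a rough sketch of what "well calibrated" means in practice, the check boils down to grouping statements by the probability you gave and seeing what fraction in each group came true. The function name `calibration_table` and the sample data here are made up for illustration; they are not the predictions from this post.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Map each stated probability to (number of statements, fraction that came true)."""
    buckets = defaultdict(list)
    for prob, outcome in predictions:
        buckets[prob].append(outcome)
    return {
        prob: (len(outcomes), sum(outcomes) / len(outcomes))
        for prob, outcomes in sorted(buckets.items())
    }


# Hypothetical data: (stated probability, did it happen?)
sample = [
    (0.7, True), (0.7, True), (0.7, False),
    (0.2, False), (0.2, False), (0.2, False), (0.2, True), (0.2, False),
]

for prob, (n, hit_rate) in calibration_table(sample).items():
    print(f"said {prob:.1f}: {n} statements, {hit_rate:.2f} came true")
# said 0.2: 5 statements, 0.20 came true
# said 0.7: 3 statements, 0.67 came true
```

With only a handful of statements per bucket, a single outcome moves the hit rate a lot, so small gaps between the stated probability and the hit rate don't mean much.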
Astral Codex Ten (ACX) used to do them, and then started running them as a prediction contest. Nowadays the contest has moved to Metaculus, with a scoring system better suited for contests[1], though I missed the boat for 2024. The ACX contest had 50 statements, relating to the US, UK and the world, but I skipped 5 of them. I then added another 55 statements about New Zealand and my personal life (and the people around me), so my exercise has 100 in total.
In case you missed it, the title is not a typo: these are predictions for 2023, that is, the year before last. Everything resolved a year ago, but I didn’t get round to this evaluation, so I’m doing it now.
Summary
Here’s the chart. The blue line shows the ideal: predictions of 0.7 happen 0.7 of the time. The red line shows the actual: for example, events I gave 0.6 happened 0.67 of the time. It’s not too bad, except at 0.4 and 0.5. Apparently I should adjust downward on things that I’m very uncertain about or lean slightly against.
Predictions of 0.7 and 0.3 are actually the same thing, just flipped, so here’s the same chart with p and 1 − p combined:
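To show what combining p and 1 − p involves, here is a minimal sketch. The helper name `fold` is hypothetical, and the counts are my own tally of the 0.6 and 0.4 statements in the Statements section (nine at 0.6 with six TRUE, fifteen at 0.4 with three TRUE), so treat them as worth rechecking rather than as the numbers behind the chart.

```python
def fold(prob, outcome):
    """Restate a prediction below 0.5 as the equivalent prediction of the opposite event."""
    if prob < 0.5:
        # Rounding to 2 places keeps e.g. 1 - 0.3 in the same bucket as a literal 0.7.
        return round(1 - prob, 2), not outcome
    return prob, outcome


# Tallied from the statements listed in this post (assumption: counted by hand).
preds = (
    [(0.6, True)] * 6 + [(0.6, False)] * 3
    + [(0.4, True)] * 3 + [(0.4, False)] * 12
)

folded = [fold(p, o) for p, o in preds]
hits = sum(outcome for _, outcome in folded)
print(f"{hits}/{len(folded)} = {hits / len(folded):.2f}")
# 18/24 = 0.75
```

Predictions of exactly 0.5 are left alone, since they don't lean either way.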
I’m pretty happy with this, even though it was probably a fluke.
Statements
Format: Event — my prediction probability — outcome (TRUE/FALSE), then ✅ if my probability was on the same side of 0.5 as the outcome and ❌ if it wasn’t (e.g. 0.3 and FALSE match, so that would get ✅). Predictions of exactly 0.5 get neither.
Remember: These relate to 2023 (where it matters, 31 December 2023). Some of these things became true in 2024, but it’s what happened in 2023 that matters.
ACX: World
Will Vladimir Putin be President of Russia? — 0.9 — TRUE ✅
Will Ukraine control the city of Sevastopol? — 0.1 — FALSE ✅
Will Ukraine control the city of Luhansk? — 0.3 — FALSE ✅
Will Ukraine control the city of Zaporizhzhia? — 0.7 — TRUE ✅
Will there be a lasting cease-fire in the Russia-Ukraine war? — 0.3 — FALSE ✅
Will the Kerch Bridge be destroyed, such that no vehicle can pass over it? — 0.4 — FALSE ✅
Will an issue involving a nuclear power plant in Ukraine require evacuation of a populated area? — 0.3 — FALSE ✅
Will a nuclear weapon be detonated (including tests and accidents)? — 0.2 — FALSE ✅
Will a nuclear weapon be used in war (ie not a test or accident) and kill at least 10 people? — 0.1 — FALSE ✅
Will China launch a full-scale invasion of Taiwan? — 0.1 — FALSE ✅
Will any new country join NATO? — 0.2 — TRUE ❌
Will Ali Khamenei cease to be Supreme Leader of Iran? — 0.2 — FALSE ✅
Will any other war have more casualties than Russia-Ukraine? — 0.1 — FALSE ✅
Will there be more than 25 million confirmed COVID cases in China? — 0.4 — FALSE ✅
ACX: US/UK politics
Will prediction markets say Joe Biden is the most likely Democratic nominee for President in 2024? — 0.7 — TRUE ✅
Will prediction markets say Gavin Newsom is the most likely Democratic nominee for President in 2024? — 0.2 — FALSE ✅
Will prediction markets say Donald Trump is the most likely Republican nominee for President in 2024? — 0.7 — TRUE ✅
Will prediction markets say Ron DeSantis is the most likely Republican nominee for President in 2024? — 0.1 — FALSE ✅
Will the Supreme Court rule against affirmative action? — 0.8 — TRUE ✅
Will there be any change in the composition of the Supreme Court? — 0.3 — FALSE ✅
Will Donald Trump make at least one tweet? — 0.7 — TRUE ✅
Will Joe Biden have a positive (approval minus disapproval) rating? — 0.2 — FALSE ✅
Will Donald Trump get indicted on criminal charges? — 0.7 — TRUE ✅
Will a major US political figure be killed or wounded in an assassination attempt? — 0.1 — FALSE ✅
Will Rishi Sunak be Prime Minister of the UK? — 0.4 — TRUE ❌
Will the UK hold a general election? — 0.1 — FALSE ✅
ACX: Business and economy
Will Elon Musk remain owner of Twitter? — 0.6 — TRUE ✅
Will Twitter's net income be higher in 2023 than in 2022? — 0.4 — FALSE ✅
Will Twitter's average monetizable daily users be higher in 2023 than in 2022? — 0.6 — FALSE ❌
Will US CPI inflation for 2023 average above 4%? — 0.7 — FALSE ❌
Will the S&P 500 index go up over 2023? — 0.6 — TRUE ✅
Will the S&P 500 index reach a new all-time high? — 0.2 — FALSE ✅
Will the Shanghai index of Chinese stocks go up over 2023? — 0.6 — FALSE ❌
Will Bitcoin go up over 2023? — 0.6 — TRUE ✅
Will Bitcoin end 2023 above $30,000? — 0.2 — TRUE ❌
Will the US unemployment rate (now 3.7%) be above 4% in November 2023? — 0.7 — FALSE ❌
Will any FAANG or Musk company accept crypto as a payment? — 0.1 — FALSE ✅
ACX: Science and tech
Will OpenAI release GPT-4? — 0.3 — TRUE ❌
Will an image model win Scott Alexander’s bet on compositionality, to Edwin Chen’s satisfaction? — 0.7 — FALSE ❌
Will COVID kill at least 50% as many people in 2023 as it did in 2022? — 0.4 — FALSE ✅
Will a new version of COVID be substantially able to escape Omicron vaccines? — 0.5 — FALSE
Will a successful deepfake attempt causing real damage make the front page of a major news source? — 0.3 — TRUE ❌
Will WHO declare a new Global Health Emergency? — 0.2 — FALSE ✅
Will AI win a programming competition? — 0.4 — FALSE ✅
Will someone release "DALL-E, but for videos"? — 0.3 — TRUE ❌
Those are the 45 I answered out of the 50 in the ACX contest. The rest are statements I came up with about New Zealand or my life and those around me.
New Zealand election
Labour leads the government formed after the general election — 0.4 — FALSE ✅
Greens win at least as many seats in the general election as they have now (10) — 0.7 — TRUE ✅
ACT wins at least as many seats in the general election as they have now (10) — 0.6 — TRUE ✅
TOP's share of the party vote is at least what it was in 2020 (1.51%) — 0.4 — TRUE ❌
New Zealand First re-enters Parliament — 0.3 — TRUE ❌
It is clear on election night (before 8am the next day) which major party will form the next government — 0.6 — TRUE ✅
An MP resigns from their party before the general election due to a conflict or scandal — 0.7 — TRUE ✅
Chris Hipkins remains leader of the Labour Party until the general election — 0.8 — TRUE ✅
National has a contested leadership vote before the general election — 0.1 — FALSE ✅
Green Party triggers a leadership contest during 2023 (before or after the election) — 0.5 — FALSE
Raf Manji wins the Ilam electorate — 0.1 — FALSE ✅
Chris Bishop wins the Hutt South electorate — 0.2 — TRUE ❌
Arena Williams wins the Manurewa electorate — 0.9 — TRUE ✅
I cast my party vote for a major party (Labour or National) — 0.3 — FALSE ✅
New Zealand politics and economy
The government abandons the RNZ–TVNZ merger — 0.6 — TRUE ✅
The government abandons the Three Waters reforms[2] — 0.2 — FALSE ✅
New Zealand makes a bivalent COVID-19 booster vaccine available to all adults before 1 July — 0.5 — TRUE
New Zealand reintroduces a pre-departure testing requirement for inbound travellers from any country — 0.2 — FALSE ✅
A public entity in New Zealand suffers a data breach or ransomware attack making headline news — 0.8 — TRUE ✅
A major scandal engulfs Wayne Brown (enough to cause serious damage to his reputation) — 0.1 — FALSE ✅
QV House Price Index annual change to 1 November 2023 is at least 0% — 0.4 — FALSE ✅
Inflation in New Zealand in the September 2023 year is at least 5% — 0.4 — TRUE ❌
New Zealand official cash rate is at least 7% at some point during the year — 0.5 — FALSE
Unemployment in New Zealand is at least 5% in at least one quarter (up to September 2023) — 0.3 — FALSE ✅
Personal
I can hold a handstand for at least 15 seconds — 0.8 — FALSE ❌
I can hold a human flag for at least 1 second — 0.6 — FALSE ❌
I run a race of at least 10km — 0.2 — TRUE ❌
I do a [redacted] — 0.7 — FALSE ❌
I start [redacted] — 0.2 — FALSE ✅
I resume doing WCS regularly — 0.2 — FALSE ✅
I attend a WSDC registry event outside Oceania and North America — 0.7 — TRUE ✅
I attend solo dance classes regularly for the whole year — 0.8 — TRUE ✅
I move out of my current house — 0.2 — FALSE ✅
I enter a relationship — 0.1 — FALSE ✅
I write at least five posts in Substack after this one — 0.4 — FALSE ✅
I coach a debating team — 0.1 — FALSE ✅
I visit the Bay Area — 0.5 — FALSE
I commit to a teacher training programme for 2024 — 0.4 — FALSE ✅
[My employer] [redacted] — 0.3 — FALSE ✅
[My team at work] [redacted] — 0.8 — FALSE ❌
I complete efforts to understand at least one education-related topic on my personal research backlog — 0.2 — FALSE ✅
Friends
I see at least one non-Kiwi friend from the Bay Area while they're visiting New Zealand — 0.1 — TRUE ❌
[friend] moves to [country] — 0.1 — FALSE ✅
[friend] moves to [country] — 0.8 — TRUE ✅
[friend] is pregnant (or gives birth) — 0.4 — FALSE ✅
[friend] is pregnant (or gives birth) — 0.4 — FALSE ✅
[friend] becomes eligible to [redacted] — 0.7 — TRUE ✅
[friend] becomes eligible to [redacted] — 0.4 — FALSE ✅
[friend] becomes eligible to [redacted] — 0.3 — TRUE ❌
[friend] earns at least one [redacted] — 0.8 — TRUE ✅
[friends] get engaged — 0.3 — FALSE ✅
[friends] get engaged — 0.2 — FALSE ✅
[friend] defends [his/her] thesis — 0.8 — TRUE ✅
[friend] defends [his/her] thesis — 0.7 — TRUE ✅
[friend] defends [his/her] thesis — 0.2 — FALSE ✅
[1] In a contest you want to reward entrants more for making more certain predictions, e.g. someone who said that Trump would be elected with probability 0.9 should get a higher score than someone who said the same with probability 0.7. But for a personal exercise, where you’re trying to get good at evaluating uncertainty in the way advocated by Julia Galef in The Scout Mindset, you don’t want to incentivise overconfidence like this.
[2] I should’ve clarified earlier, but I meant the Labour government, so this resolved as soon as they lost the election. I promise this is what I meant at the time and I’m not just making that up retrospectively.
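To put numbers on the contest-scoring point about 0.9 versus 0.7, here is a sketch using the Brier score, which is one common scoring rule (lower is better). This is an illustration only: I'm not assuming it is the rule ACX or Metaculus uses, and `brier_score` is just a name chosen for this example.

```python
def brier_score(prob, outcome):
    """Squared error between the stated probability and what happened (lower is better)."""
    actual = 1.0 if outcome else 0.0
    return (prob - actual) ** 2


for prob in (0.7, 0.9):
    if_true = brier_score(prob, True)
    if_false = brier_score(prob, False)
    print(f"said {prob}: {if_true:.2f} if it happens, {if_false:.2f} if it doesn't")
# said 0.7: 0.09 if it happens, 0.49 if it doesn't
# said 0.9: 0.01 if it happens, 0.81 if it doesn't
```

The more confident entrant does better when the event happens and is punished harder when it doesn't, which is the trade-off a contest wants and a calibration exercise can do without.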