Professor Philip Tetlock's forecasting research

Philip E. Tetlock is a Penn Integrates Knowledge (PIK) Professor at the University of Pennsylvania, cross-appointed in the School of Arts and Sciences and Wharton. This research focuses on improving probability estimates of early-warning indicators of global catastrophic and existential risks.

What problem are they trying to solve?

Professor Tetlock’s research could be valuable for anticipating and mitigating global catastrophic risks, for example, those caused by natural or engineered pathogens, artificial intelligence, nuclear weapons or extreme climate change. To do this, Tetlock and collaborators are combining traditional techniques proven effective in first-generation forecasting tournaments with more experimental methods to build a discipline of second-generation forecasting.

Accurately forecasting future events is extremely challenging in itself, but Tetlock’s earlier research developed methods that improve forecasting accuracy. Figure 1 shows two results from the IARPA geopolitical tournaments of 2011-15: (a) unweighted averaging of regular forecasters’ predictions improved accuracy well above chance; and (b) Tetlock’s research team beat unweighted averaging with four methods, namely spotting “superforecasting” talent, training, teamwork, and weighted averaging algorithms.

Figure 1 (image not reproduced)

Source: Good Judgment.
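
The contrast between unweighted and weighted averaging can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three forecasts, the weights, and the function names are hypothetical, and the weighting scheme is a plain weighted mean, not the aggregation algorithm used in the IARPA tournaments. Accuracy is measured with the binary form of the Brier score, where lower is better.

```python
from statistics import mean


def brier_score(probability: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome (lower is better)."""
    return (probability - outcome) ** 2


def unweighted_average(forecasts: list[float]) -> float:
    """Simple mean of the individual probability forecasts."""
    return mean(forecasts)


def weighted_average(forecasts: list[float], weights: list[float]) -> float:
    """Weighted mean of the forecasts; weights must sum to a positive number."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(f * w for f, w in zip(forecasts, weights)) / total


# Hypothetical forecasts for one question that resolved "yes" (outcome = 1).
forecasts = [0.9, 0.6, 0.3]
outcome = 1

# Hypothetical weights, e.g. reflecting each forecaster's past accuracy.
weights = [3.0, 2.0, 1.0]

individual_scores = [brier_score(p, outcome) for p in forecasts]
crowd_forecast = unweighted_average(forecasts)
weighted_forecast = weighted_average(forecasts, weights)

print("mean individual Brier:", mean(individual_scores))
print("unweighted crowd Brier:", brier_score(crowd_forecast, outcome))
print("weighted crowd Brier:", brier_score(weighted_forecast, outcome))
```

Because the Brier score is convex in the forecast, the averaged forecast can never score worse than the average of the individual scores. Whether a weighted average then beats the unweighted one depends entirely on how well the weights track real skill, which this toy example simply assumes.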

In a recent paper funded by Founders Pledge, “Improving Judgments of Existential Risk: Better Forecasts, Questions, Explanations, Policies,” Tetlock and collaborators lay out their vision of key challenges and solutions for improving judgments of existential risks.

Tetlock notes that traditional tournaments can help in part by “chart[ing] the ebb and flow of judgments of short-run potential precursors of long-run X-risks.” But problems remain: how do we accurately identify these precursors, and how does early warning lead to risk mitigation? To address these and other challenges, Tetlock’s second generation of tournaments seeks to do three things that first-generation tournaments fail to do:

  1. “identify early-warning indicators in a noisy, distraction-laden world;”
  2. “craft insightful explanations that assist policymakers in spotting lead indicators;”
  3. “give louder voices in policy debates to high-value contributors at each phase of the knowledge-production cycle.”

In their recent paper, Tetlock and colleagues address these challenges and broader objections. Tetlock hypothesizes that “across a surprising range of environments, X-risk tournaments are good bets to deliver value even with moderately myopic forecasters.”

What do they do?

Over the last 35 years, Professor Tetlock has pioneered the practice of forecasting, a way to make predictions about future events more accurate and useful. Professor Tetlock has been described by the economist Tyler Cowen as “one of the greatest social scientists in the world” and his papers have garnered over 55,000 citations.

Professor Tetlock’s research is particularly relevant from a long-term perspective because it has the potential to improve our ability to predict and/or mitigate global catastrophic risks. To prioritize efforts to reduce such risks, we must predict their relative probabilities, and the effect sizes of possible interventions, in advance. Of course, this is extremely difficult. Many of the most concerning risks humanity faces in the coming centuries, such as engineered pandemics or superintelligent computer systems, are unprecedented. Professor Tetlock’s forecasting methods could help us make rigorous, comparable and calibrated assessments of the likelihood of these risks and the effectiveness of efforts to reduce them. Moreover, even if forecasting proves possible only on relatively short timescales (e.g. under 5 years), the techniques would still be highly valuable for early-warning indicators and risk mitigation.

Professor Tetlock has an ambitious research agenda related to forecasting global catastrophic risks. He is also an active public figure, building support for forecasting by writing books, going on podcasts and participating in interviews. With additional funding, we expect Professor Tetlock and his collaborators, Dr. Pavel Atanasov and Dr. Ezra Karger, to expand and scale up their work on second-generation forecasting.

Why do we recommend them?

  • Open Philanthropy, our research partner, recommends Professor Tetlock’s research as one of the world’s highest-impact funding opportunities for mitigating global catastrophic risks.
  • Professor Tetlock has a strong track record of conducting innovative, actionable research.
  • With previous Founders Pledge funding, the team delivered a research product that outlined their approaches to solving the biggest challenges of forecasting global catastrophic risks.
  • With additional funding, Professor Tetlock and his collaborators could scale up high-value research directly related to forecasting global catastrophic risks, and implement several of the methods outlined in their recent paper.

Here we briefly review the history of Professor Tetlock’s forecasting research and achievements.

Pioneering forecasting work

Professor Tetlock’s earliest forecasting research grew out of his work on the National Academy of Sciences Committee for the Prevention of Nuclear War (formed during the Cold War tensions of the early 1980s). Between 1984 and 2003, Professor Tetlock ran a number of forecasting tournaments in which predictions about future events were solicited from hundreds of experts. He then analyzed which traits best predicted the accuracy of individual forecasters. The findings, published in the 2005 book Expert Political Judgment: How Good Is It? How Can We Know?, famously suggested that experts’ predictions fared poorly. Experts were generally outperformed both by simple extrapolation algorithms and by generalists who relied on a range of different sources of information.

IARPA collaboration and the Good Judgment Project

After publishing Expert Political Judgment, Professor Tetlock entered into a collaboration with the Intelligence Advanced Research Projects Activity (IARPA), the US intelligence community’s research arm, to further test different forecasting strategies. Between 2011 and 2015, IARPA sponsored a large forecasting tournament involving thousands of different forecasters and over a million forecasts. Professor Tetlock identified high-performing forecasters who consistently finished at the top of the tournaments they entered; he and his collaborators dubbed them “superforecasters.” The superforecasting team, which Professor Tetlock called the Good Judgment Project, beat teams of other experts and intelligence professionals to win the IARPA tournament. Professor Tetlock and his collaborators analyzed the significance of this result in the 2015 book Superforecasting and in 25-plus articles in peer-reviewed scientific journals.

Forecasting today

These past successes have led to the development of multiple forecasting platforms. Professor Tetlock co-founded Good Judgment, a consultancy that offers bespoke forecasting and workshops to private clients. Good Judgment also runs Good Judgment Open, an open platform for crowd-based forecasts. Metaculus and INFER-Public are similar platforms inspired by Professor Tetlock’s forecasting research. As an example of the value of forecasting, Open Philanthropy contracted Good Judgment to make public forecasts related to the development of the COVID-19 pandemic. In September 2021, the US government’s Office of the Director of National Intelligence announced that it was seeking to re-implement probabilistic forecasting in its policy process. We believe that this suggests that the path from Professor Tetlock’s basic research to ultimate policy implementation is now even clearer.

Why do we trust this organization?

For our initial recommendation of this opportunity in 2020, we were grateful to be able to utilize the in-depth expertise of, and background research conducted by, current and former staff at Open Philanthropy, the world’s largest grant-maker on global catastrophic risk. Open Philanthropy identifies high-impact giving opportunities, makes grants, follows the results and publishes its findings. (Disclosure: Open Philanthropy has made several unrelated grants to Founders Pledge.)

As indicated above, Professor Tetlock has a strong track record not only as a researcher but also as a science communicator and project manager. He has also delivered the research products required under a previous Founders Pledge grant.

Professor Tetlock is a Penn Integrates Knowledge (PIK) Professor at the University of Pennsylvania, with cross-appointments at the Wharton School and the School of Arts and Sciences. He has spent the last three decades researching the accuracy of forecasts about a wide range of geopolitical, economic and military outcomes.

Dr. Pavel Atanasov has previously worked closely with Professor Tetlock on research projects involving a variety of forecasting methods, including aggregation algorithms, prediction markets and identifying accurate forecasters. Dr. Ezra Karger is an economist in the microeconomics research group at the Federal Reserve Bank of Chicago, who also works closely with Professor Tetlock on this research and was the lead author of “Improving Judgments of Existential Risk.”

What would they do with more funding?

We expect that additional funding at this time would help Professor Tetlock and his collaborators expand their research on forecasting global catastrophic risks. They are seeking support for tackling the ten methodological challenges to X-risk forecasting outlined in their recent paper:

  1. “Managing Rigor-Relevance Trade-Offs” and finding ways to make forecasting more actionable for decision-makers.
  2. “Crafting Incisive Forecasting Questions” by building “Bayesian question clusters” and “conditional trees” to enhance question relevance.
  3. “Incentivizing Persuasive, Predictively Powerful Explanations” to ensure that forecasts are more than just a number.
  4. “Incentivizing True Reports about X-Risk Mitigation” and developing methods to measure the effect sizes of policy interventions.
  5. “Recruiting the Right Talent” for second-generation forecasting research.
  6. “Motivating the Talent”.
  7. “Picking Probability-Elicitation Tools and Scoring Rules” to craft the appropriate measures to forecast low-probability high-impact events.
  8. “Helping People Prepare for Distinctive Analytic Challenges of X-Risk Assessment”.
  9. “Benchmarking against External Standards”.
  10. “Managing Information Hazards” and making sure that predictions of risks don’t do more harm than good.

To tackle these challenges, Tetlock and his collaborators are combining traditional forecasting methods like those developed in the IARPA tournaments with new techniques, including Reciprocal Scoring, conditional trees, risk-mitigation tournaments and hybrid persuasion-forecasting tournaments.
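
To see why the choice of scoring rule (challenge 7 above) matters for low-probability, high-impact events, consider the following minimal Python sketch. It is an illustration under stated assumptions, not the researchers’ method: the two forecasts are hypothetical, and the Brier and logarithmic scores are standard textbook scoring rules chosen here for comparison. Both are written so that lower is better.

```python
import math


def brier_score(probability: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome."""
    return (probability - outcome) ** 2


def log_score(probability: float, outcome: int) -> float:
    """Negative log of the probability assigned to what actually happened."""
    assigned = probability if outcome == 1 else 1.0 - probability
    if assigned <= 0.0:
        return math.inf  # the forecaster ruled out the outcome that occurred
    return -math.log(assigned)


# Hypothetical forecasts of the same rare event: 1% versus 0.01%.
cautious = 0.01
dismissive = 0.0001

# Extra penalty the dismissive forecaster pays if the rare event occurs.
brier_gap_if_event = brier_score(dismissive, 1) - brier_score(cautious, 1)
log_gap_if_event = log_score(dismissive, 1) - log_score(cautious, 1)

# Extra penalty the cautious forecaster pays if the event does not occur.
brier_gap_if_no_event = brier_score(cautious, 0) - brier_score(dismissive, 0)

print("Brier gap if event occurs:", brier_gap_if_event)
print("Log-score gap if event occurs:", log_gap_if_event)
print("Brier gap if no event:", brier_gap_if_no_event)
```

In this toy case the two forecasts differ by a factor of 100, yet their Brier scores differ by less than 0.02 whichever way the question resolves, while the logarithmic score separates them by ln(100) if the event occurs. That sensitivity comes at a cost: the logarithmic score becomes infinite for a forecaster who assigns zero probability to what actually happens.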

Since Founders Pledge’s last grant to Professor Tetlock, he and his collaborators have received additional grants to support related work, including from the Long-Term Future Fund. However, the researchers could productively absorb additional funding to work on the challenges outlined above, to upgrade their forecasting platform, to turn part-time analysts into full-time employees of the project, and to scale up this research generally.

What are the major open questions?

One concern we have about this research is that questions bearing on global catastrophic risks do not seem well-suited to forecasting tournaments like those Professor Tetlock has previously carried out. For example, questions used for the Expert Political Judgment tournaments were chosen for clear resolution criteria, had mutually exclusive answers, and were amenable to base-rate forecasting. Some questions relevant to global catastrophic risk will very likely lack one or all of these characteristics. For this reason, as well as the inherent difficulty of making tangible progress on reducing global catastrophic risks, we see this project as part of our portfolio of high-risk, high-reward research. Successfully inventing new ways to forecast global catastrophic risk would be a hugely valuable achievement; however, the most likely result of this project is that it will make marginal progress, if any, towards this goal. For now, the solutions that the researchers have outlined to the challenges above remain largely theoretical. We will continue to follow up with the researchers while they collect essential empirical data from second-generation forecasting tournaments.

Another concern is the usefulness of forecasting “black swan events.” In a 2022 article, Tetlock responds to Nassim Taleb’s critique of forecasting tournaments by proposing adversarial collaborations in which each side pre-registers its hypotheses about when it expects efforts to improve judgment to falter as we move deeper into the domain of rare, high-impact events.

If Professor Tetlock’s success in attracting funding to this area continues, we also believe it may cease to be a neglected cause area, such that other funding opportunities provide a better “bang for our buck.” From conversations with the research team, we do not believe that we have reached this point yet.

Message from the organization

“We recommend supporting a sustained effort to test the methodological innovations that Tetlock and collaborators propose in ‘Improving Judgments of Existential Risk’. High-priority research questions include:

  1. How feasible is it to improve probabilistic forecasts of increasingly rare, high-impact events? How fast do we reach the point of diminishing marginal predictive returns? Which side will prove ‘closer to right’ in the long-standing Tetlock-Taleb debate (Tetlock et al. 2022)?
  2. How useful is it to supplement objective gold-standard metrics of accuracy with silver-standard metrics for X-risk questions that will not resolve for decades or that require making counterfactual judgments that can never be objectively resolved? Will this combination of metrics allow us to reduce noise and bias in X-risk forecasts?
  3. Can we adapt tournament competitions so that they incentivize not just accurate forecasting but also crafting more incisive forecasting questions and compelling explanations?
  4. What can we learn by testing methods for improving policy debates in games or simulations where we can rerun history and objectively measure tail risk judgments and the value-added of questions and rationales?”
  • Professor Philip Tetlock and collaborators

More resources

Disclaimer: We do not have a reciprocal relationship with any charity, and recommendations are subject to change based on our ongoing research. Christian Ruhl has previously worked on forecasting research at the University of Pennsylvania unrelated to this grant, and is a member of a working group to bridge the gap between academia and government on probabilistic forecasting.

Notes

  1. Mellers et al., 2015; Tetlock & Gardner, 2015.

  2. ‘Good Judgment Science’, 8 April 2018.

  3. ‘Conversations with Tyler’, “Philip E. Tetlock on Forecasting and Foraging as a Fox (Ep. 93)”.

  4. Luke Muehlhauser, “How Feasible Is Long-Range Forecasting”, Open Philanthropy blog.

    About the author

    Aidan Goth

    Former Researcher

    Aidan is a former Researcher at Founders Pledge. Previously, he studied at the University of Oxford. He graduated with a First Class MMathPhil in Mathematics and Philosophy, specialising in formal epistemology, decision theory and ethics.