Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- I NONSTANDARD MARKETS
- II CONTRACTS
- III DECISION THEORY
- IV COMMUNICATION/ORGANIZATIONS
- V FOUNDATIONS: EPISTEMICS AND CALIBRATION
- 12 Strategies and Interactive Beliefs in Dynamic Games
- 13 Calibration: Respice, Adspice, Prospice
- 14 Discussion of “Strategies and Interactive Beliefs in Dynamic Games” and “Calibration: Respice, Adspice, Prospice”
- VI PATENTS: PROS AND CONS FOR INNOVATION AND EFFICIENCY
- Name Index
- Miscellaneous Endmatter
13 - Calibration: Respice, Adspice, Prospice
Published online by Cambridge University Press: 05 May 2013
Summary
Introduction
Suppose we are asked to forecast the probability of rain on successive days. How should we assess the accuracy of the forecast? If we forecast a 25 percent chance of rain and it rains, was the forecast in error?
A popular criterion for judging the effectiveness of a probability forecast is called calibration. Dawid (1982) offered the following intuitive definition of calibration:
Suppose that, in a long (conceptually infinite) sequence of weather forecasts, we look at all those days for which the forecast probability of precipitation was, say, close to some given value ω and (assuming these form an infinite sequence) determine the long run proportion p of such days on which the forecast event (rain) in fact occurred. The plot of p against ω is termed the forecaster's empirical calibration curve. If the curve is the diagonal p = ω, the forecaster may be termed (empirically) well calibrated.
We note that the calibration criterion relies only on the realized forecasts and outcomes to make a determination. It assumes that the data will speak for themselves.
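Dawid's definition can be made concrete for a finite record of forecasts and outcomes: group the days whose forecast probability was close to a given value ω and compute the proportion p of those days on which the event occurred. The following sketch (an illustrative binning scheme of our own, not from the chapter) computes such an empirical calibration curve:

```python
# Empirical calibration check: bin the forecasts, then compare each
# bin's representative forecast value omega with the observed
# frequency p of the event among days falling in that bin.
# Illustrative sketch only; bin count and sample data are hypothetical.

def calibration_curve(forecasts, outcomes, n_bins=10):
    """Return (omega, p, count) triples for each nonempty bin.

    forecasts: probabilities in [0, 1]
    outcomes:  0/1 indicators (1 = event occurred, e.g. rain)
    """
    bins = [[] for _ in range(n_bins)]
    for prob, y in zip(forecasts, outcomes):
        i = min(int(prob * n_bins), n_bins - 1)  # clamp prob == 1.0 into last bin
        bins[i].append(y)
    curve = []
    for i, ys in enumerate(bins):
        if ys:  # skip bins with no forecasts
            omega = (i + 0.5) / n_bins     # representative forecast value
            p = sum(ys) / len(ys)          # long-run proportion in this bin
            curve.append((omega, p, len(ys)))
    return curve

# A forecaster who always says 25 percent and sees rain on 1 day in 4
# is well calibrated at that value (p equals omega):
print(calibration_curve([0.25] * 8, [1, 0, 0, 0, 1, 0, 0, 0]))
# [(0.25, 0.25, 8)]
```

A forecaster is well calibrated when the resulting (ω, p) pairs lie near the diagonal p = ω, matching Dawid's empirical calibration curve in the limit of a long sequence.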
The calibration criterion is used, for example, to assess the accuracy of prediction markets (Page and Clemen 2012). Tetlock (2005) used it in his comprehensive analysis of pundits. We quote from a blog entry by Tetlock (2006):
Between 1985 and 2005, boomsters made 10-year forecasts that exaggerated the chances of big positive changes in both financial markets (e.g., a Dow Jones Industrial Average of 36,000) and world politics (e.g., tranquility in the Middle East and dynamic growth in sub-Saharan Africa). They assigned probabilities of 65 percent to rosy scenarios that materialized only 15 percent of the time.
Chapter in Advances in Economics and Econometrics: Tenth World Congress, pp. 423-442. Cambridge University Press, 2013.