The Brier Measure is not Strictly Proper (as Epistemologists have come to use that term)

In recent years, formal epistemologists have gotten interested in measures of the accuracy of a credence function. One famous measure of accuracy is the one suggested by Glenn Brier. Given a (finite) set $\Omega =$ { $\omega_1, \omega_2, \dots, \omega_N$ } of possible states of the world, the Brier measure of the accuracy of a credence function $c$ at the state $\omega_i$ is

$ \mathfrak{B}(c, \omega_i) = - (1-c(\{ \omega_i \}))^2 - \sum_{j \neq i} c(\{ \omega_j \})^2 $
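To make the definition concrete, here is a minimal Python sketch of $\mathfrak{B}$. The function name `brier_accuracy`, the list representation of the singleton credences, and the three-state example are my own illustrative assumptions, not anything from Brier.

```python
from fractions import Fraction

def brier_accuracy(singleton_credences, i):
    """Brier accuracy at state i (0-indexed).

    singleton_credences[j] plays the role of c({omega_j}); this list
    representation is an assumption made for illustration.
    """
    # Penalty for falling short of credence 1 in the true state.
    total = -(1 - singleton_credences[i]) ** 2
    # Penalty for any credence placed on the false states.
    total -= sum(x ** 2 for j, x in enumerate(singleton_credences) if j != i)
    return total

# Hypothetical credences over three states.
creds = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
print(brier_accuracy(creds, 0))  # -3/8
```

Notice that the function only ever looks at the credences given to singletons.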

And formal epistemologists usually say that a measure of accuracy $\mathfrak{A}$ is strictly proper iff every probability function expects itself (and only itself) to have the highest $\mathfrak{A}$-value.

Strict Propriety
A measure of accuracy $\mathfrak{A}$ is strictly proper iff, for every probability function $p$ and every credence function $c \neq p$, the $p$-expectation of $p$'s $\mathfrak{A}$-accuracy is strictly greater than the $p$-expectation of $c$'s $\mathfrak{A}$-accuracy. That is: for every probability $p$ and every credence $c \neq p$,

$ \sum_{i = 1}^N p(\{ \omega_i \}) \cdot \mathfrak{A}(p, \omega_i) \,\, > \,\, \sum_{i = 1}^N p(\{ \omega_i \}) \cdot \mathfrak{A}(c, \omega_i) $

(‘Weak propriety’ is the property you get when you swap out ‘$>$’ for ‘$\geq$’.)

The point of today’s post is that, contrary to what I once thought (and perhaps contrary to what some others have thought as well, though this could be a confusion localized to my own brain), the Brier measure is not strictly proper.

First, a bit of background: Given a (finite) set $\Omega =$ { $\omega_1, \omega_2, \dots, \omega_N$ } of possible states of the world, we can call any set of states in $\Omega$ a ‘proposition’. And I’ll call a set of propositions, $\mathscr{F}$, a ‘field’. Given a pair $(\Omega, \mathscr{F})$, with $\mathscr{F} \subseteq \wp(\Omega)$, a credence function, $c$, is just any function from $\mathscr{F}$ to the unit interval, $[0, 1]$.

A credence function $c$ is a probability function if it additionally satisfies the following two constraints:

  1. $c(\Omega) = 1$.
  2. For all $A, B \in \mathscr{F}$ such that $A \cap B = \emptyset$, $c(A \cup B) = c(A) + c(B)$.
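The two constraints can be checked mechanically. The following is a rough sketch, assuming a credence function is stored as a dict from `frozenset` propositions to numbers; the name `is_probability` and the example values are hypothetical. Pairs whose union is not in the field are skipped, which is my own simplification.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def is_probability(credence, omega):
    """Check constraints 1 and 2 for a credence function.

    credence: dict mapping frozenset propositions to values (assumed layout).
    """
    # Constraint 1: c(Omega) = 1.
    if credence.get(frozenset(omega)) != 1:
        return False
    # Constraint 2: additivity over disjoint A, B (including A = B = empty set).
    for a, b in combinations_with_replacement(list(credence), 2):
        if a & b:
            continue  # not disjoint
        union = a | b
        if union in credence and credence[union] != credence[a] + credence[b]:
            return False
    return True

omega = ["w1", "w2"]
w1, w2 = frozenset({"w1"}), frozenset({"w2"})
additive = {frozenset(): 0, w1: Fraction(1, 3), w2: Fraction(2, 3), frozenset(omega): 1}
non_additive = {frozenset(): 0, w1: Fraction(1, 2), w2: Fraction(1, 4), frozenset(omega): 1}
print(is_probability(additive, omega), is_probability(non_additive, omega))  # True False
```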

To see that the Brier measure $\mathfrak{B}$ is not strictly proper, consider the set of states $\Omega =$ { $\omega_1, \omega_2$ } and the field $\mathscr{F} =$ { $\emptyset,$ { $\omega_1$ }, { $\omega_2$ }, $\Omega$ }. Then, consider the probabilistic $p$ and the non-probabilistic $c$, both defined over the field $\mathscr{F}$.

| $A \in \mathscr{F}$ | $p(A)$ | $c(A)$ |
|---|---|---|
| $\emptyset$ | $0$ | $1$ |
| { $\omega_1$ } | $1/2$ | $1/2$ |
| { $\omega_2$ } | $1/2$ | $1/2$ |
| $\Omega$ | $1$ | $0$ |

The $p$-expected Brier accuracy of $p$ is

$\begin{aligned} \mathbb{E}_p \left[ \mathfrak{B}(p) \right] &= p(\{ \omega_1 \}) \cdot \mathfrak{B}(p, \omega_1) \,\,+ \,\, p(\{ \omega_2 \}) \cdot \mathfrak{B}(p, \omega_2) \\ &= 1/2 \cdot \left[ -(1-1/2)^2 - (1/2)^2 \right] \,\,+ \,\, 1/2 \cdot \left[ -(1-1/2)^2 - (1/2)^2 \right] \\ &= - 1/2 \end{aligned}$

And the $p$-expected Brier accuracy of $c$ is likewise

$ \begin{aligned} \mathbb{E}_p \left[ \mathfrak{B}(c) \right] &= p(\{ \omega_1 \}) \cdot \mathfrak{B}(c, \omega_1) \,\,+ \,\, p(\{ \omega_2 \}) \cdot \mathfrak{B}(c, \omega_2) \\ &= 1/2 \cdot \left[ -(1-1/2)^2 - (1/2)^2 \right] \,\,+ \,\, 1/2 \cdot \left[ -(1-1/2)^2 - (1/2)^2 \right] \\ &= - 1/2 \end{aligned}$

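So there is a probabilistic $p$ and a credence function $c \neq p$ such that $p$ expects $c$ to be just as Brier accurate as $p$ is itself. The reason is that $\mathfrak{B}$ only consults the credences given to the singleton propositions { $\omega_i$ }, and $p$ and $c$ differ only on $\emptyset$ and $\Omega$. So the Brier measure of accuracy is not strictly proper.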
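If you would rather not trust my arithmetic, the counterexample can be checked with exact fractions. This is only a sketch: the state labels `"w1"`, `"w2"`, the dict-of-`frozenset` encoding, and the helper names `brier` and `expected` are assumptions of mine.

```python
from fractions import Fraction

half = Fraction(1, 2)
states = ["w1", "w2"]

# p and c agree on the singletons and differ only on the empty set and Omega.
p = {frozenset(): 0, frozenset({"w1"}): half, frozenset({"w2"}): half, frozenset(states): 1}
c = {frozenset(): 1, frozenset({"w1"}): half, frozenset({"w2"}): half, frozenset(states): 0}

def brier(credence, state):
    """Brier accuracy of `credence` at `state`; reads singleton credences only."""
    total = -(1 - credence[frozenset({state})]) ** 2
    for other in states:
        if other != state:
            total -= credence[frozenset({other})] ** 2
    return total

def expected(prob, measure, credence):
    """prob-expectation of the accuracy of `credence` under `measure`."""
    return sum(prob[frozenset({s})] * measure(credence, s) for s in states)

e_p = expected(p, brier, p)
e_c = expected(p, brier, c)
print(e_p, e_c, c != p)  # -1/2 -1/2 True
```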

Some have used the term ‘strict propriety’ differently than I defined it above. In the first place, Brier himself did not intend his measure to apply to credence functions, which are functions from arbitrary propositions to the unit interval, but rather to forecasts, which he treated as assignments of real numbers from the unit interval to each individual state $\omega_i \in \Omega$. (Brier even required these numbers to sum to 1.) If you are in a context where you are evaluating, not credence functions, but forecasts, then you might want to define the notion of strict propriety like this:

Strict Propriety for Forecasts
A measure of accuracy $\mathfrak{A}$ is strictly proper for forecasts iff, for every probabilistic forecast $p$ and every forecast $f \neq p$, the $p$-expectation of $p$'s $\mathfrak{A}$-accuracy is strictly greater than the $p$-expectation of $f$'s $\mathfrak{A}$-accuracy. That is: for every probabilistic forecast $p$ and every forecast $f \neq p$,

$$\sum_{i = 1}^N p(\{\omega_i\}) \cdot \mathfrak{A}(p, \omega_i) \,\,>\,\, \sum_{i = 1}^N p(\{\omega_i\}) \cdot \mathfrak{A}(f, \omega_i) $$

And the Brier measure is strictly proper for forecasts. It’s just not strictly proper as epistemologists have been using that term, applied to arbitrary credence functions.
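A brute-force check is not a proof, but it may help to see this at work. The sketch below fixes one probabilistic forecast $p$ (a value I picked arbitrarily), runs over a grid of rival forecasts $f$ (not required to sum to 1), and confirms that $p$ expects itself to do strictly better than each of them. It also confirms, on this grid, that the gap in expected accuracy equals $\sum_j (p(\{\omega_j\}) - f(\{\omega_j\}))^2$, which is positive whenever $f \neq p$. The grid step and the function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def brier(forecast, i):
    """Brier accuracy of a forecast (tuple of numbers) at state i."""
    return -(1 - forecast[i]) ** 2 - sum(
        x ** 2 for j, x in enumerate(forecast) if j != i
    )

def expected_brier(p, forecast):
    """p-expectation of the forecast's Brier accuracy."""
    return sum(p_i * brier(forecast, i) for i, p_i in enumerate(p))

# An arbitrary probabilistic forecast over two states.
p = (Fraction(3, 10), Fraction(7, 10))

# Rival forecasts on a grid over [0, 1] x [0, 1], excluding p itself.
grid = [Fraction(k, 10) for k in range(11)]
rivals = [f for f in product(grid, repeat=2) if f != p]

p_wins_everywhere = all(expected_brier(p, p) > expected_brier(p, f) for f in rivals)
gap_is_squared_distance = all(
    expected_brier(p, p) - expected_brier(p, f)
    == sum((a - b) ** 2 for a, b in zip(p, f))
    for f in rivals
)
print(p_wins_everywhere, gap_is_squared_distance)
```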

What is a strictly proper measure of accuracy for credence functions is this quadratic measure, which is also sometimes called the Brier measure (though it’s not the measure Brier himself explicitly endorsed):

$$\mathfrak{Q}(c, \omega) = - \sum_{A \in \mathscr{F}} ( \chi_A(\omega) - c(A) )^2 $$

(Here, ‘$\chi_A(\omega)$’ is the characteristic function for the proposition $A$, which maps a state $\omega$ to $1$ if $A$ is true in that state and $0$ otherwise.)
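To see the difference on the example from earlier, here is a small sketch that scores the same $p$ and $c$ with $\mathfrak{Q}$ instead of $\mathfrak{B}$. As before, the state labels, the dict encoding, and the helper names are assumptions of mine.

```python
from fractions import Fraction

half = Fraction(1, 2)
states = ["w1", "w2"]
field = [frozenset(), frozenset({"w1"}), frozenset({"w2"}), frozenset(states)]

# Same p and c as in the counterexample: they differ on the empty set and Omega.
p = dict(zip(field, [0, half, half, 1]))
c = dict(zip(field, [1, half, half, 0]))

def quadratic(credence, state):
    """Quadratic accuracy: sums squared error over every proposition in the field."""
    # (1 if state in a else 0) is the characteristic function of a at state.
    return -sum(((1 if state in a else 0) - credence[a]) ** 2 for a in field)

def expected(prob, credence):
    """prob-expectation of the quadratic accuracy of `credence`."""
    return sum(prob[frozenset({s})] * quadratic(credence, s) for s in states)

print(expected(p, p), expected(p, c))  # -1/2 -5/2
```

Because $\mathfrak{Q}$ also penalizes $c$ for its credences in $\emptyset$ and $\Omega$, $p$ now expects itself to be strictly more accurate than $c$.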