Exchangeability

In many applications, the order of the observations does not matter. For example, permuting the training instances does not change their joint probability, and the words of a document in the bag-of-words model can be permuted freely.

Formally,

Definition Random variables $(y_1, \dots, y_N)$ are exchangeable if for all permutations $\tau$, $P(y_1, \dots, y_N) = P(y_{\tau(1)}, \dots, y_{\tau(N)})$.

Definition A sequence $\{y_i\}_\infty$ is infinitely exchangeable if every finite subsequence is exchangeable.
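To make the definition of exchangeability concrete, here is a minimal sketch using a Pólya urn, a standard example of draws that are exchangeable but not independent. The urn setup (one red and one black ball to start, one extra ball of the drawn colour added after each draw) and the function name `polya_sequence_prob` are assumptions chosen for illustration; they do not come from the notes.

```python
from fractions import Fraction
from itertools import permutations


def polya_sequence_prob(seq, red=1, black=1):
    """Exact probability of a 0/1 draw sequence from a Polya urn.

    1 means a red ball was drawn, 0 means a black ball. After each draw
    the ball is returned together with one extra ball of the same colour.
    """
    prob = Fraction(1)
    for y in seq:
        total = red + black
        if y == 1:
            prob *= Fraction(red, total)
            red += 1
        else:
            prob *= Fraction(black, total)
            black += 1
    return prob


seq = (1, 1, 0, 1, 0)

# Collect the probability of every reordering of the same observations.
probs = {polya_sequence_prob(p) for p in permutations(seq)}
print(probs)  # a set with a single element: the order does not matter

# Exchangeable does not mean independent: P(red, red) != P(red)^2.
p_first_red = polya_sequence_prob((1,))
p_both_red = polya_sequence_prob((1, 1))
print(p_first_red, p_both_red)
```

Exact fractions are used so that the equality of the permuted probabilities is checked without floating-point tolerance.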

De Finetti's Theorem

If $y_1, y_2, \dots$ are infinitely exchangeable, then their joint distribution $P(y_1, y_2, \dots)$ has a succinct description, given by the following theorem.

Theorem (De Finetti) If $\{y_i\}_\infty$ is infinitely exchangeable where $y_i \in Y$, then there exists some parameter space $\Phi$ and density function $P(\phi)$ such that for any $N$ observations, $P(y_1, \dots, y_N) = \int_\Phi P(\phi) \prod_{i=1}^N P(y_i \mid \phi)\, d\phi$.

In other words, we can always draw a graphical model:

[Figure: plate diagram. The parameter $\phi$ points to $y_i$, and $y_i$ sits inside a plate repeated $N$ times.]
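As a sanity check of the theorem's mixture form in the simplest case, the sketch below assumes binary observations, a uniform density $P(\phi)$ on $[0,1]$, and $P(y_i = 1 \mid \phi) = \phi$. These modelling choices, and the helper names `mixture_prob` and `closed_form`, are assumptions made for illustration. The integral is approximated with a plain midpoint rule and compared with the known Beta-integral value $k!\,(N-k)!/(N+1)!$, where $k$ is the number of ones.

```python
from math import factorial


def mixture_prob(seq, grid=20_000):
    """Midpoint-rule approximation of the integral over [0, 1] of
    prod_i P(y_i | phi), with P(phi) uniform and P(y_i = 1 | phi) = phi."""
    k = sum(seq)          # number of ones
    n = len(seq)          # number of observations
    total = 0.0
    for j in range(grid):
        phi = (j + 0.5) / grid
        total += phi**k * (1 - phi) ** (n - k)
    return total / grid


def closed_form(seq):
    """Exact value of the same integral: k! (N - k)! / (N + 1)!."""
    k = sum(seq)
    n = len(seq)
    return factorial(k) * factorial(n - k) / factorial(n + 1)


seq = (1, 1, 0, 1, 0)
approx = mixture_prob(seq)
exact = closed_form(seq)
print(approx, exact)

# The integrand depends on seq only through the count of ones,
# so any reordering of the observations gives the same probability.
reordered = mixture_prob((0, 0, 1, 1, 1))
print(reordered)
```

Because the integrand depends on the data only through the counts, the mixture is automatically permutation-invariant, which is the easy direction of the theorem.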

If $Y$ is finite, then $\Phi$ has finite dimension, so the model can be described by finitely many parameters. (For example, if $Y = \{0, 1\}$, so that each $y_i$ is Bernoulli, a single parameter $\phi \in [0,1]$ suffices.)
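For a slightly larger finite case, the following worked form assumes $Y = \{1, \dots, K\}$ with a categorical likelihood; the symbols $K$, $\Delta^{K-1}$ and $n_k$ are introduced here for illustration and do not appear in the notes. The parameter is a probability vector, so $\Phi$ is a simplex of dimension $K - 1$.

```latex
\Phi = \Delta^{K-1}
     = \Big\{ \phi \in \mathbb{R}^K : \phi_k \ge 0,\ \sum_{k=1}^K \phi_k = 1 \Big\},
\qquad
P(y_i = k \mid \phi) = \phi_k,

P(y_1, \dots, y_N)
  = \int_{\Delta^{K-1}} P(\phi) \prod_{k=1}^K \phi_k^{\,n_k} \, d\phi,
\qquad
n_k = \#\{\, i : y_i = k \,\}.
```

Setting $K = 2$ recovers the Bernoulli case with one free parameter.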

However, if $Y$ is infinite, $\Phi$ is in general infinite-dimensional and cannot be described by a finite number of parameters.

This motivates nonparametric methods, which do not fix a finite number of parameters in advance.

Exported: 2021-01-02T21:26:11.922355