James–Stein estimator

The James–Stein estimator is an estimator of the mean $\theta$ of a multivariate normally distributed random vector.

It arose sequentially in two main published papers. The earlier version of the estimator was developed in 1956, when Charles Stein reached the surprising conclusion that while the then-usual estimate of the mean, the sample mean, is admissible when $m \le 2$, it is inadmissible when $m \ge 3$ (where $m$ is the dimension of the mean vector). Stein proposed a possible improvement to the estimator that shrinks the sample means towards a more central mean vector $\nu$ (which can be chosen a priori or commonly as the "average of averages" of the sample means, given all samples share the same size). This observation is commonly referred to as Stein's example or paradox. In 1961, Willard James and Charles Stein simplified the original process.
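In the simplest setting, assuming a single observation $y \sim N_m(\theta, \sigma^2 I)$ with known variance $\sigma^2$ and shrinkage towards the origin ($\nu = 0$), the simplified 1961 form of the estimator can be written as

$$\hat{\theta}_{JS} = \left(1 - \frac{(m-2)\,\sigma^2}{\lVert y \rVert^2}\right) y .$$

The shrinkage factor pulls the observation towards the origin, and it does so more strongly when $\lVert y \rVert^2$ is small relative to the noise level.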

It can be shown that the James–Stein estimator dominates the "ordinary" least squares approach, in the sense that the James–Stein estimator has a lower mean squared error than the "ordinary" least squares estimator for every value of $\theta$, provided $m \ge 3$. This is possible because the James–Stein estimator is biased, so the Gauss–Markov theorem does not apply.
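The domination result can be illustrated with a short Monte Carlo simulation. The following is a minimal sketch, assuming unit-variance Gaussian observations and shrinkage towards the origin; the dimension, seed, and trial count are illustrative choices, not part of the original result.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 10            # dimension; the James-Stein estimator dominates only when m >= 3
sigma2 = 1.0      # noise variance, assumed known
theta = rng.normal(size=m)          # an arbitrary true mean vector
n_trials = 100_000

# Draw n_trials observations Y ~ N(theta, sigma2 * I)
y = theta + rng.normal(scale=np.sqrt(sigma2), size=(n_trials, m))

# "Ordinary" (least squares / maximum likelihood) estimate: the observation itself
mse_ols = np.mean(np.sum((y - theta) ** 2, axis=1))

# James-Stein estimate: shrink each observation towards the origin
shrink = 1.0 - (m - 2) * sigma2 / np.sum(y ** 2, axis=1)
y_js = shrink[:, None] * y
mse_js = np.mean(np.sum((y_js - theta) ** 2, axis=1))

print(f"MSE, ordinary estimator : {mse_ols:.3f}")   # approximately m * sigma2 = 10
print(f"MSE, James-Stein        : {mse_js:.3f}")    # strictly smaller, whatever theta is
```

Whatever true mean vector is used, the simulated risk of the James–Stein estimate comes out below that of the ordinary estimate, in line with the domination result.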

Similar to Hodges' estimator, the James–Stein estimator is superefficient and non-regular at $\theta = \nu$.