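| Yule–Simon | |
|---|---|
| Probability mass function | Plot: Yule–Simon PMF on a log-log scale. (The function is defined only at integer values of $k$; the connecting lines do not indicate continuity.) |
| Cumulative distribution function | Plot: Yule–Simon CDF. (The function is defined only at integer values of $k$; the connecting lines do not indicate continuity.) |
| Parameters | $\rho > 0$ shape (real) |
| Support | $k \in \{1, 2, \dotsc\}$ |
| PMF | $\rho \operatorname{B}(k, \rho+1)$ |
| CDF | $1 - k \operatorname{B}(k, \rho+1)$ |
| Mean | $\frac{\rho}{\rho-1}$ for $\rho > 1$ |
| Mode | $1$ |
| Variance | $\frac{\rho^2}{(\rho-1)^2(\rho-2)}$ for $\rho > 2$ |
| Skewness | $\frac{(\rho+1)^2\sqrt{\rho-2}}{(\rho-3)\,\rho}$ for $\rho > 3$ |
| Excess kurtosis | $\rho + 3 + \frac{11\rho^3 - 49\rho - 22}{(\rho-4)(\rho-3)\,\rho}$ for $\rho > 4$ |
| MGF | does not exist |
| CF | $\frac{\rho}{\rho+1}\,{}_2F_1(1, 1; \rho+2; e^{it})\,e^{it}$ |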
In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert A. Simon. Simon originally called it the Yule distribution.
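The probability mass function (pmf) of the Yule–Simon(ρ) distribution is

$$f(k;\rho) = \rho \operatorname{B}(k, \rho+1),$$

for integer $k \geq 1$ and real $\rho > 0$, where $\operatorname{B}$ is the beta function. Equivalently, the pmf can be written in terms of the rising factorial as

$$f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{k^{\overline{\rho+1}}}, \qquad k^{\overline{\rho+1}} = \frac{\Gamma(k+\rho+1)}{\Gamma(k)},$$

where $\Gamma$ is the gamma function. Thus, if $\rho$ is an integer,

$$f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.$$
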
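As an illustration, the pmf can be evaluated numerically in a few lines. The following is a minimal sketch using only the Python standard library; the function names `yule_simon_pmf` and `yule_simon_pmf_integer_rho` are hypothetical and chosen for this example. It works with the logarithm of the gamma function to avoid overflow for large $k$, and it includes the factorial form for integer $\rho$ as a cross-check.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Evaluate f(k; rho) = rho * B(k, rho + 1) for integer k >= 1, rho > 0."""
    if k < 1:
        raise ValueError("k must be an integer >= 1")
    if rho <= 0:
        raise ValueError("rho must be positive")
    # B(k, rho + 1) = Gamma(k) * Gamma(rho + 1) / Gamma(k + rho + 1),
    # computed on the log scale so that large k does not overflow.
    log_f = (
        math.log(rho)
        + math.lgamma(k)
        + math.lgamma(rho + 1.0)
        - math.lgamma(k + rho + 1.0)
    )
    return math.exp(log_f)


def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Factorial form rho * rho! * (k - 1)! / (k + rho)!, valid for integer rho."""
    numerator = rho * math.factorial(rho) * math.factorial(k - 1)
    return numerator / math.factorial(k + rho)


if __name__ == "__main__":
    # Compare the two forms for an integer shape parameter.
    for k in (1, 2, 3, 10):
        print(k, yule_simon_pmf(k, 2.0), yule_simon_pmf_integer_rho(k, 2))
```

For integer $\rho$ the two functions should agree up to floating-point rounding, and the partial sums of `yule_simon_pmf` approach 1 as the upper limit grows.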
The parameter $\rho$ can be estimated using a fixed point algorithm.
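The text does not say which fixed-point scheme is meant, so the following is only one possible sketch, not necessarily the referenced algorithm. It assumes maximum-likelihood estimation from observations $k_1, \dots, k_N$: setting the derivative of the log-likelihood to zero gives $\rho = N / \sum_i \sum_{j=1}^{k_i} \frac{1}{\rho + j}$, and the code simply iterates this map. The function name `fit_rho_fixed_point`, the starting value, and the stopping rule are all assumptions made for this example.

```python
import math
from typing import Sequence


def fit_rho_fixed_point(
    data: Sequence[int],
    rho0: float = 1.0,
    tol: float = 1e-10,
    max_iter: int = 10_000,
) -> float:
    """Iterate rho <- N / sum_i sum_{j=1..k_i} 1 / (rho + j) until it stabilizes."""
    n = len(data)
    if n == 0 or any(k < 1 for k in data):
        raise ValueError("data must be a non-empty sequence of integers >= 1")
    if all(k == 1 for k in data):
        # The likelihood increases without bound in rho when every k equals 1.
        raise ValueError("no finite maximum-likelihood estimate when all k == 1")

    rho = rho0
    for _ in range(max_iter):
        # The inner sum equals digamma(k + rho + 1) - digamma(rho + 1),
        # written out explicitly so that only the standard library is needed.
        s = sum(1.0 / (rho + j) for k in data for j in range(1, k + 1))
        new_rho = n / s
        if abs(new_rho - rho) <= tol * max(1.0, rho):
            return new_rho
        rho = new_rho
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    sample = [1, 1, 1, 2, 1, 3, 1, 1, 2, 5]  # illustrative data, not from the source
    print(fit_rho_fixed_point(sample))
```

The explicit double sum costs time proportional to the total of the observations; for large counts a digamma implementation (for example from SciPy) would be a natural replacement.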
The probability mass function $f$ has the property that for sufficiently large $k$ we have

$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$$

This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: $f(k;\rho)$ can be used to model, for example, the relative frequency of the $k$th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of $k$.
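The large-$k$ behaviour can be checked numerically by comparing the exact pmf $\rho \operatorname{B}(k, \rho+1)$ with the power law $\rho\,\Gamma(\rho+1)/k^{\rho+1}$. The sketch below is illustrative only; the function name `tail_ratio` and the choice $\rho = 1.5$ are assumptions made for this example. It computes the ratio of the two expressions on the log scale, and the ratio should approach 1 as $k$ grows.

```python
import math


def tail_ratio(k: int, rho: float) -> float:
    """Return [rho * Gamma(rho + 1) / k**(rho + 1)] / f(k; rho).

    The factors rho * Gamma(rho + 1) cancel, leaving
    Gamma(k + rho + 1) / (Gamma(k) * k**(rho + 1)), evaluated via lgamma.
    """
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and rho > 0")
    log_ratio = (
        math.lgamma(k + rho + 1.0)
        - math.lgamma(k)
        - (rho + 1.0) * math.log(k)
    )
    return math.exp(log_ratio)


if __name__ == "__main__":
    rho = 1.5  # illustrative shape parameter
    for k in (10, 100, 1_000, 10_000):
        # The printed ratio moves toward 1 as k increases.
        print(k, tail_ratio(k, rho))
```

For small $k$ the power law noticeably overestimates the pmf, which is why the Zipf-like behaviour is a statement about the tail only.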