Statistical population

In statistics, a population is a set of similar items or events which is of interest for some question or experiment. A statistical population can be a group of existing objects (e.g. the set of all stars within the Milky Way galaxy) or a hypothetical and potentially infinite group of objects conceived as a generalization from experience (e.g. the set of all possible hands in a game of poker). A population with finitely many values $N$ in the support of the population distribution is a finite population with population size $N$. A population with infinitely many values in the support is called an infinite population.

A common aim of statistical analysis is to produce information about some chosen population. In statistical inference, a subset of the population (a statistical sample) is chosen to represent the population in a statistical analysis. For the inference to be valid, the sample must be unbiased and must accurately model the population. The ratio of the size of the sample to the size of the population is called the sampling fraction. Given such a sample, the population parameters can be estimated using the appropriate sample statistics.
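As an illustration, the following is a minimal Python sketch of drawing a sample, computing its sampling fraction, and using the sample mean to estimate the population mean. The synthetic population (10,000 normally distributed values), the sample size of 500, and the helper name `sampling_fraction` are assumptions chosen for the example and do not come from any particular data set or library.

```python
import random
import statistics


def sampling_fraction(sample_size, population_size):
    """Ratio of the sample size to the population size."""
    return sample_size / population_size


# Hypothetical finite population: 10,000 synthetic measurements.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]

# Simple random sample of 500 items, drawn without replacement.
sample = rng.sample(population, 500)

fraction = sampling_fraction(len(sample), len(population))

# The sample mean (a sample statistic) estimates the population mean
# (a population parameter).
population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"sampling fraction: {fraction:.2%}")
print(f"population mean:   {population_mean:.2f}")
print(f"sample mean:       {sample_mean:.2f}")
```

In practice the population mean is unknown; it is computed here only so that the estimate can be compared against it.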

For finite populations, samples are typically drawn without replacement, so each sampled value is removed from the population. This violates the usual assumption that observations are independent and identically distributed (i.i.d.), so sampling from finite populations requires "finite population corrections" (which can be derived from the hypergeometric distribution). As a rough rule of thumb, if the sampling fraction is below 10%, then finite population corrections can approximately be neglected.
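As an illustration, a commonly used form of the finite population correction for the standard error of a sample mean is the factor $\sqrt{(N-n)/(N-1)}$, where $N$ is the population size and $n$ is the sample size. The Python sketch below assumes that form; the function names are hypothetical and chosen only for this example.

```python
import math


def finite_population_correction(population_size, sample_size):
    """Factor sqrt((N - n) / (N - 1)) for sampling without replacement."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample size must be between 1 and the population size")
    if population_size == 1:
        return 0.0  # the single item is fully observed, so no sampling error
    return math.sqrt((population_size - sample_size) / (population_size - 1))


def corrected_standard_error(sample_sd, population_size, sample_size):
    """Standard error of the sample mean, scaled by the correction factor."""
    uncorrected = sample_sd / math.sqrt(sample_size)
    return uncorrected * finite_population_correction(population_size, sample_size)


# How the factor changes with the sampling fraction for N = 10,000.
for n in (100, 500, 1_000, 5_000, 10_000):
    fpc = finite_population_correction(10_000, n)
    print(f"sampling fraction {n / 10_000:>6.1%}: correction factor {fpc:.4f}")
```

Under these assumptions, the factor stays close to 1 for small sampling fractions (roughly 0.95 at a sampling fraction of 10%), which is consistent with the rule of thumb above, and it falls to 0 when the whole population is sampled.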