Universal approximation theorem

In the mathematical theory of artificial neural networks, universal approximation theorems are theorems of the following form: Given a family of neural networks, for each function $f$ from a certain function space, there exists a sequence of neural networks $\phi _{1},\phi _{2},\dots$ from the family, such that $\phi _{n}\to f$ according to some criterion. That is, the family of neural networks is dense in the function space.

The most popular version states that feedforward networks with non-polynomial activation functions are dense in the space of continuous functions between two Euclidean spaces, with respect to the compact convergence topology.
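
Spelled out, and taking compact convergence to mean uniform convergence on compact sets, this density statement can be written as follows. The symbols $\mathcal{N}$ (the family of networks), $d$ and $m$ (the input and output dimensions), and $K$ are labels introduced here for illustration:

$$\forall f\in C(\mathbb{R} ^{d},\mathbb{R} ^{m}),\ \forall K\subset \mathbb{R} ^{d}{\text{ compact}},\ \forall \epsilon >0,\ \exists \phi \in \mathcal{N}:\quad \sup _{x\in K}\|f(x)-\phi (x)\|<\epsilon .$$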

Universal approximation theorems are existence theorems: they state only that such a sequence $\phi _{1},\phi _{2},\dots \to f$ exists, and do not provide any way to actually find it. Nor do they guarantee that any particular method, such as backpropagation, will find such a sequence. Any method for searching the space of neural networks, including backpropagation, may or may not find a converging sequence (for example, backpropagation might get stuck in a local optimum).

Universal approximation theorems are limit theorems: they state only that for any $f$ and any closeness criterion $\epsilon >0$ , there exists a neural network with sufficiently many neurons that approximates $f$ to within $\epsilon$ . The number of neurons required depends on $f$ and $\epsilon$ ; there is no guarantee that any size fixed in advance, say, 10000 neurons, is enough.
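
As a concrete illustration of the role of network size, the following minimal sketch treats one special case only: a scalar function on $[0,1]$ and a one-hidden-layer ReLU network whose weights are written down directly by piecewise-linear interpolation. It is not the general theorem or its proof. The names `build_relu_net`, `sup_error`, `smooth`, and `wiggly` are hypothetical, and the error is only estimated on a finite grid rather than computed as a true supremum.

```python
import math


def relu(z):
    return max(0.0, z)


def build_relu_net(f, n):
    """One-hidden-layer ReLU network with n hidden neurons on [0, 1].

    The network equals the piecewise-linear interpolant of f at the
    knots 0, 1/n, ..., 1 (an explicit construction for this special case).
    """
    knots = [i / n for i in range(n + 1)]
    values = [f(x) for x in knots]
    # Slope of the interpolant on each interval [i/n, (i+1)/n].
    slopes = [(values[i + 1] - values[i]) * n for i in range(n)]
    # Output weight of neuron i is the change in slope at knot i.
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n)]
    bias = values[0]

    def net(x):
        return bias + sum(w * relu(x - k) for w, k in zip(weights, knots[:-1]))

    return net


def sup_error(f, net, samples=2001):
    """Estimate sup |f - net| over [0, 1] on a uniform grid."""
    grid = [j / (samples - 1) for j in range(samples)]
    return max(abs(f(x) - net(x)) for x in grid)


def smooth(x):
    return math.sin(2 * math.pi * x)


def wiggly(x):
    # Higher-frequency target, used to show that a fixed width can fail.
    return math.sin(2 * math.pi * 8 * x)


# Error for the same target as the number of hidden neurons grows.
errors = {n: sup_error(smooth, build_relu_net(smooth, n)) for n in (4, 16, 64)}

# Error for a different target at a fixed width of 16 neurons.
fixed_width_error = sup_error(wiggly, build_relu_net(wiggly, 16))

if __name__ == "__main__":
    for n, err in errors.items():
        print(f"smooth target, {n:3d} neurons: estimated error {err:.4f}")
    print(f"wiggly target,  16 neurons: estimated error {fixed_width_error:.4f}")
```

In this sketch the estimated error for `smooth` shrinks as neurons are added, while the 16-neuron network that handles `smooth` reasonably well misses `wiggly` almost entirely, because every knot falls on a zero of `wiggly`. This matches the point above that the required size depends on both $f$ and $\epsilon$ .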