Monte Carlo methods use repeated random sampling to determine the properties of some phenomenon.
Although random number generation is not part of the Monte Carlo methodology, the latter feeds from and has helped the development of the former.
Types of random numbers in decreasing order of randomness:
Monte Carlo methods use random or, more commonly, pseudorandom numbers. Quasi-Monte Carlo methods use quasi-random numbers.
A true random number generator (TRNG) or hardware random number generator is a physical device that generates random numbers from a physical stochastic process, such as thermal noise, the photoelectric effect involving a beam splitter, and other quantum phenomena.
Random number can also be generated from macroscopic processes, justified by unstable dynamic systems and chaos theory. Common practices use tools such as yarrow, coin, dice, roulette wheel, and lottery machines.
Hardware RNGs have limited random bits per second. As a trade-off between randomness and speed, TRNGs typically serve as the entropy source (seed provider) for CSPRNGs or PRNGs.
A pseudorandom number generator (PRNG) or deterministic random bit generator (DRBG) is an algorithm using deterministic polynomial time computable functions to generating numbers computationally indistinguishable from true random numbers. Symbolically, given polynomial time computable functions \(G_k : \{0,1\}^{k}\to \{0,1\}^{p(k)}\) where polynomial \(p: p(k) > k, \forall k \in \mathbb{Z}_+ \), for any probabilistic polynomial time binary classifier \(A\) and \(x, r\) chosen randomly from \( \{0,1\}^k \) and \( \{0,1\}^{p(k)} \) respectively, exists some negligible function \(\mu\) such that:
\[ \left| \Pr {A(G_k(x)) = 1} - \Pr {A(r) = 1} \right| < \mu(k) \]
Randomness tests are statistical tests on whether a binary sequence is patternless; a widely used collection of tests is the Diehard Battery of Tests {Marsaglia1996}. TestU01 {L’Ecuyer and Simard, 2007} is a popular randomness test suite written in ANSI C.
Classes of PRNGs in increasing order of quality:
The first fast and high-quality PRNG is Mersenne-Twister (MT, 梅森旋转) {Matsumoto and Nishimura, 1998}, which is a twisted generalised linear feedback shift register (GFSR) of rational normal form with period of a Mersenne prime. Mersenne-Twister is by far the most widely used general-purpose PRNG, and is the default PRNG in may software systems such as GLib, GMP, GSL, R and Python. MT19937 is the most commonly used version of Mersenne Twister, which has a 624-dimensional state space of 32-bit integers, a period of \( M_19937 = 2^19937-1 \) and is equi-distributed in up to 623 dimensions of 32-bit integers. Disadvantages of MT19937 include: large state buffer (2496 Bytes), slow, does not pass all randomness tests (linear complexity), very slow recovery from state spaces with a large number of zeros.
A PRNG is cryptographically secure (CSPRNG) if it is considered suitable for use in cryptography, such as key generation. Specifically, a PRNG \( G_k(s_i) = (s_{i+1}, y_i) \) with state length \(k\) and output block length \(t(k)\) is forward-secure if given a random initial state \(s_1\), \( (y_1, y_2, \dots, y_i, s_{i+1}) \) and \( (r_1, r_2, \dots, r_i, s_{i+1}) \) are computationally indistinguishable for any \(i \in \mathbb{Z}_+ \).
For a set \( P = \{x_1, \dots, x_N\} \), let $\lambda_s$ be the \(s\)-dimensional Lebesgue measure, \(A(B;P)\) the number of points in \(P\) that fall into \(B\), and \(J\) the set of \(s\)-dimensional closed-below-open-above boxes, its discrepancy is defined as:
\[ D_N (P) = \sup_{B \in J} \left| \frac{A(B;P)}{N} - \lambda_s (B) \right| \]
The discrepancy of a sequence is low if the proportion of points in the sequence falling into an arbitrary set is nearly proportional to the measure of the set. Low discrepancy is a property of uniformly distributed random sequences. But low discrepancy sequences typically look more even than pseudo-random sequences.
Transform uniformly distributed pseudo-random numbers into numbers distributed according to a given probability distribution.
Sampling from continuous distributions:
To sampling from a discrete distribution, take uniform pseudo-random numbers and search for the index \(i\) of the enclosing interval, whose width is equal to the probability \(P(i)\). Search algorithms: linear search, O(n); binary search, O(log(n)); indexed search (aka cutpoint method); Alias method, O(1);
multivariate integration with irregular boundaries:
In numerical integration, quasi-Monte Carlo methods normally have a faster order of convergence than Monte Carlo methods.
Stochastic evolution equations
Mean field particle Monte Carlo: