Random Number Generators: Design and Applications

Introduction

A random number generator, or RNG, is a device intended to create a sequence of numbers that does not have a discernible or calculable pattern. Random number generators can be physical, drawing their output from the flip of a coin, a toss of the dice, the I Ching’s yarrow sticks or a deck of cards. Many random number generators are computational. Some generators do not generate truly random numbers, but instead generate statistically or perceptually random numbers. These are known as pseudo-random number generators. Most computational random number generators are actually pseudo-random number generators, due to the deterministic nature of computers and the difficulty of obtaining a truly random number from them.

Random number generators require an input to produce a number as an output. Computerized number generators generally use one of two sources for these inputs, which can either be used directly to create the random number or can be used as a seed to algorithmically determine it (Schneier, 2000, p 99). One group uses seemingly random physical observations to create its numbers. These can include Geiger counters, microphone input, white noise receivers, visual input, air turbulence across the disk drives, gyroscope input or network packet arrival. The second group uses user actions, such as keystroke sequence or timing, mouse movements or microphone input, to generate its numbers. Depending on the implementation, the random number generator may use either group’s input directly or as a seed.

Computer-based pseudorandom number generators are simpler, and often use facilities such as the computer’s system clock or the date as a seed to generate a seemingly random, but reproducible, number. This is less demanding in terms of both system resources and hardware reliability than the true random number generator, but it is also less secure: the algorithm is deterministic, so given the same, easily determined, seed it will produce the same result.

Computer programming techniques make many uses of random number generators in game design, computer simulation and statistical sampling. However, the most important use of random number generators in computing is in cryptography. Random number generators, rather than the simpler pseudorandom number generators, were at one time preferred for cryptographic applications due to their complexity and the difficulty of reproduction. However, the Luby-Rackoff construction, which formally defined the block cipher construction, described a manner in which a pseudorandom permutation generator using four rounds of the Feistel permutation could increase the computational randomness of a pseudorandom generator to close to the limits of feasible computability, allowing for the use of pseudorandom number generators in cryptographic applications. A number of improvements on this technique have allowed for distributed, or threshold, pseudorandom permutation generation, use of non-invertible pseudorandom generations and other improvements. Cryptographic security of a random number generator depends on a single factor: the computational infeasibility of distinguishing its output from a truly random sequence. Statisticians are also heavy users of random number generators, relying on them for modeling, Monte Carlo simulations, analysis of stochastic processes and resampling (Gentle, 2003, p 1).

History of Random Numbers

Gentle (2003, p 1) discussed the history of random numbers. He remarked that before the advent of computer-based random number generators, statistics texts and other sources that required random numbers were often supplied with reference tables of “random” numbers that could be used for analysis. The largest of these, published around 1955, had over one million numbers listed (Bewersdorff, 2005, p 97). While some statisticians still continue to use computerized versions of these charts, most statistics programs now implement a direct random or pseudorandom number generation routine.

One of the earliest discussions of random number generators in the literature of computing was in 1968, at a time when random number generators were already an established procedure. George Marsaglia described a method by which random numbers were generated at the time, known as the multiplicative congruential generator. This model, described by D.H. Lehmer, used a simple arithmetical formula,

$I' = K I \bmod M$

where I = current random integer,

I’ = new random integer,

K = constant multiplier,

M = modulus (I’ is the remainder after division by M, which on a binary computer is typically the remainder after overflow)

An alternate formula, $I' = (K I + C) \bmod M$ with an additive constant C, was used in some applications.
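The multiplicative recurrence can be sketched in a few lines of Python. This is only an illustration and not code from Marsaglia or Lehmer: the function name `lehmer_sequence` and the default parameter values (K = 16807, M = 2^31 - 1, a widely published choice) are assumptions made for the example.

```python
def lehmer_sequence(seed, count, k=16807, m=2**31 - 1):
    """Return `count` successive values of I' = (K * I) mod M, starting from `seed`."""
    values = []
    i = seed
    for _ in range(count):
        i = (k * i) % m  # new integer is the remainder of K * I after division by M
        values.append(i)
    return values


# The same seed always reproduces the same sequence.
print(lehmer_sequence(1, 3))
print(lehmer_sequence(1, 3) == lehmer_sequence(1, 3))
```

The reproducibility shown by the second line is what makes such a generator pseudorandom rather than random.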

Marsaglia noted that this naïve method of generating random numbers, although seemingly effective for many applications, was not suitable for Monte Carlo simulations because successive n-tuples of results fall on a small number of parallel hyperplanes; in other words, the results are not random at all, but are arrayed in a regular, crystal-like lattice. “Furthermore,” Marsaglia noted (1968, p 25), “there are many systems of parallel hyperplanes which contain all of the points; the points are about as randomly spaced in the unit n-cube as the atoms in a perfect crystal at absolute zero.” Marsaglia demonstrated that a very limited number of planes contain all n-tuples, so the n-tuples produced by one of the above equations cover only a small, highly structured part of the unit n-cube. He pointed out that there are many Monte Carlo simulations which do not work with “random” numbers that are regular in n-space; more insidiously, this algorithm had been in use for twenty years without detection of this flaw, leading to the possibility that many Monte Carlo simulations had been run and accepted without understanding that the results were badly flawed due to the n-space regularity of the seemingly random numbers. As Gentle (2003, p 2) noted, both the randomness of the random numbers generated by an algorithm and the distribution of the numbers generated must be understood in order to make the numbers useful.
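The hyperplane structure can be checked numerically. The sketch below uses the RANDU generator (K = 65539, M = 2^31) as a stand-in; RANDU is not named in the sources cited here and is chosen only because its lattice relation is simple to state. Since 65539^2 ≡ 6·65539 − 9 (mod 2^31), every three consecutive outputs satisfy 9x − 6y + z ≡ 0 (mod 2^31), so all triples lie on at most 15 parallel planes.

```python
def randu(seed, count):
    """Multiplicative congruential generator with K = 65539 and M = 2**31 (RANDU)."""
    values, x = [], seed
    for _ in range(count):
        x = (65539 * x) % 2**31
        values.append(x)
    return values


xs = randu(1, 10000)
triples = list(zip(xs, xs[1:], xs[2:]))

# Every consecutive triple satisfies 9x - 6y + z = c * 2**31 for an integer c.
on_lattice = all((9 * a - 6 * b + c) % 2**31 == 0 for a, b, c in triples)

# The distinct values of c index the parallel planes that contain all the points.
planes = {(9 * a - 6 * b + c) // 2**31 for a, b, c in triples}

print(on_lattice, len(planes))
```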

Modern pseudorandom number generators use a number of techniques to generate numbers. These include pseudorandom functions, pseudorandom permutations and other pseudorandom objects that can be used to generate a computationally indistinguishable sequence of numbers. Some pseudorandom number generators use a complex approach of external and internal random systems, with the external “shell” systems using the output from the internal systems as input for their own calculations (Maurer, 2002, p 110). These systems are among the most secure of the generation methods available, because distinguishing their output from a random sequence is computationally infeasible.

Random and Pseudorandom Numbers Defined

Random Numbers

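Knuth (1981, pp??) defined random numbers in the following manner. Consider a sequence $\langle X_n \rangle = X_0, X_1, X_2, \ldots$ in which each $X_n$ takes one of the values $0, 1, 2, \ldots, b-1$ for some integer $b \ge 2$ (a b-ary sequence). A random sequence is

Equidistributed – for all positive integers $k$, the sequence is $k$-distributed.

The numbers must be evenly distributed over the $b^k$ possible blocks of $k$ consecutive values. This can be expressed using the generalized equation

$\Pr(X_n X_{n+1} \cdots X_{n+k-1} = x_1 x_2 \cdots x_k) = 1/b^k$ for all b-ary digits $x_1, \ldots, x_k$.

A sequence that satisfies this equation for every $k$ is said to be $\infty$-distributed. Giulani (1998, p 3) stated that all $\infty$-distributed sequences would pass any correlative statistical test. However, sequences that are not $\infty$-distributed will not pass correlative statistical tests.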

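Computable – Giulani (1998, p 5) defined a computable sequence rule as a sequence of computable functions $\langle f_i(x_1, \ldots, x_i) \rangle$ where $f_i(x_1, \ldots, x_i) \in \{0, 1\}$ for all i. This defines a unique subsequence of the b-ary sequence $\langle X_n \rangle$, where $X_i$ is in the subsequence if and only if $f_i(X_0, \ldots, X_{i-1}) = 1$.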

Knuth (1981) defined randomness as

A b-ary sequence $\langle X_n \rangle$ is said to be random if, for every effective algorithm which computes an infinite sequence of positive integers $\langle s_n \rangle$ as a function of n and $X_{s_0}, \ldots, X_{s_{n-1}}$,

Every infinite subsequence of $\langle X_{s_n} \rangle$ defined by a computable sequence rule is 1-distributed.

Giulani (1998, p 5) remarked on a number of aspects of this definition. First, he noted that the definition implies that no effective rule for choosing a subsequence can produce a subsequence that is not equidistributed. Although the original sequence must be $\infty$-distributed, it is sufficient for the subsequence computed by the subsequence rule to be 1-distributed. Note, however, that as Knuth remarked, even perfectly random sequences may appear to demonstrate local non-randomness.

A further refinement of Knuth’s definition is required in order to make random numbers useful for statistical analysis. Giulani (1998, p 5) provides the following:

Let $X_1, \ldots, X_n$ be a b-ary sequence. This sequence is said to be random if each $X_i$ is independent of the other $X_j$, i.e. if P is the standard probability measure then any “effective algorithm” which, given $X_j$ for all $j \ne i$, outputs a guess $Y$ for $X_i$ satisfies $P(Y = X_i) = 1/b$.

Marsaglia’s examination of the congruential generator’s sequential output provides an excellent example of what is not a random number sequence. The output from the functions discussed appeared to be statistically random. However, on examination of the distribution of the sequences, Marsaglia found that they were in fact arrayed in a crystalline lattice. This meant that the numbers from these sequences were computationally feasible to predict given a high enough number of queries. In order for random numbers to be considered random today, they must be both statistically random and computationally infeasible to predict.

Computational Modeling of Randomness

There are a number of techniques that can be used to produce random numbers. The most obvious techniques are simple mechanical processes such as flipping coins or tossing dice. Slightly more sophisticated techniques involve the bit sampling of the output of Geiger counters or Zener diodes (which leak electrons in a seemingly random fashion). However, bit-sampling methods are prone to local non-randomness, outputting long strings of 0 or 1 in sequence; Giuliani (1998, p 6) noted that this is particularly a problem if the samples are taken too close together in time. This makes raw bit-sampling methods inadequate for producing statistically random numbers.

Because the methods of computation usable by computers are limited, there is necessarily a limit on the production of random sequences by computers. Most computationally generated random sequences are in fact pseudorandom sequences. These sequences are generated from less-random sequences using a number of different methods.

Giulani (1998, p 6) described one such computation method based on von Neumann’s “biased coin flip approach”. Given a source which generates a sequence of independent bits $X_1, X_2, \ldots$ with the property that $P(X_i = 1) = p$ for all i and for some fixed $0 < p < 1$, von Neumann described an algorithm to generate a random sequence by obtaining successive pairs of bits from the source, returning 0 if the bits are 01 and 1 if the bits are 10, and discarding the pairs 00 and 11. The pairs 01 and 10 each occur with probability $p(1-p)$, so the output bits are unbiased.
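The von Neumann procedure is short enough to sketch directly. In the example below the function name `von_neumann_extract` and the simulated biased source (p = 0.8, driven by Python's `random` module) are assumptions made for illustration; pairs 00 and 11 are discarded, as is standard for this technique.

```python
import random


def von_neumann_extract(bits):
    """Map non-overlapping pairs 01 -> 0 and 10 -> 1; discard 00 and 11."""
    out = []
    for first, second in zip(bits[0::2], bits[1::2]):
        if first != second:
            out.append(first)  # the first bit of the pair is the output bit
    return out


# Simulated biased source: each bit is 1 with probability 0.8 (an assumption).
rng = random.Random(0)
biased = [1 if rng.random() < 0.8 else 0 for _ in range(100000)]
unbiased = von_neumann_extract(biased)

print(sum(biased) / len(biased))      # close to 0.8
print(sum(unbiased) / len(unbiased))  # close to 0.5
```

The price of removing the bias is throughput: on average only $2p(1-p)$ output bits are produced per pair of input bits.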

Giulani (1998, p6) described a second model of random number generation attributed to Santha and Vazirani. This model states

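A source which produces pseudorandom sequences $X_1, X_2, \ldots$ is said to be an S.V. source if for some fixed real number $\delta$ with $0 \le \delta \le 1/2$,

$\delta \le P(X_i = 1 \mid X_1, \ldots, X_{i-1}) \le 1 - \delta$ for all i.

If $\delta = 1/2$, the sequence is random. Otherwise, the sequence is said to be slightly random. However, if $\delta = 0$, the condition places no constraint on the source, so the sequence cannot be relied upon to be random.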

Giulani (1998, p 6) further defines quasi-random sequences, a notion which examines the randomness of the sequence as a whole rather than the randomness of the single bits. Quasi-random sequences are in many cases indistinguishable from random sequences. Quasi-random sequences can be defined as follows.

Considering the pseudo-random n-bit sequence , where . This source is quasi-random if for and

While random or quasi-random sequences can be generated in this fashion using a physical generator, purely computational methods typically produce pseudo-random sequences. This can be accomplished by using the output of the physical generator as a “seed” to produce a longer sequence of pseudorandom numbers. Giulani (1998, p 11) defines a pseudorandom bit generator as follows:

Let l, k be positive integers with $l \ge k + 1$, where l is a specified polynomial function of k. A (k,l)-pseudorandom bit generator (PRBG) is a polynomial time (in k) function $f: \{0,1\}^k \to \{0,1\}^l$. The input $s_0 \in \{0,1\}^k$ is called the seed and the output $f(s_0) \in \{0,1\}^l$ is called a pseudorandom bit string.

These sequences can be generated computationally rather than relying on the physical input from a bit-sampled stream. While, as Giulani (1998, p 13) stated, no pseudorandom generator will produce truly random numbers even given random seeds, this is not a relevant consideration for most applications of pseudorandom number generators, which do not require a truly random sequence. Additionally, the seeds chosen do not need to be truly random; seemingly random results can be produced with only quasi-random seeds.

A special application of pseudorandom number generators is to cryptography, or the conversion of plaintext data to ciphertext. Although the above generation method for pseudorandom numbers is acceptable for statistical applications, in order to provide a stronger cipher, cryptographic pseudorandom number generators must display a considerably higher degree of randomness. Cryptographic applications do not aim for complete indiscoverability, but instead for computational infeasibility; that is, the output of a random number generator can be reverse-engineered, but only at great computational expense. As Giulani (1998, p 14) remarked, the limits of computational feasibility today are probabilistic polynomial time; anything greater than that will be computationally infeasible. Giulani defined computational feasibility as follows.

“A deterministic function $B: D \to \{0,1\}$ where D is a subset of $\{0,1\}^k$ for some positive integer k is called a predicate. B is said to be unapproximable if it is computationally infeasible to predict B(x) with probability $\ge 1/2 + \epsilon$. A deterministic permutation $f: D \to D$ is called a friendship function for an unapproximable function B if for all $x \in D$ it is computationally feasible to compute both f(x) and B(f(x)).” (Giulani, 1998, p 15).

Using this model, the strength of the encryption algorithm is directly related to the computational feasibility of B. If B is easily computed, given a long enough subsequence generated from the algorithm it would be possible to use previous-bit prediction to compute B and break the cipher.

Encryption algorithms depend on strong random or pseudorandom number generators to adequately protect the secret and significant bit at the heart of the algorithm. In order to ensure that this requirement is met, the random number generator must undergo statistical analysis to ensure it supplies an adequate degree of computational infeasibility; that is, that it is strong enough to withstand examination for long enough to make it infeasible to work out by a brute-force method.

Design of Random and Pseudorandom Number Generators

Hastad (1999, p 2) defined a pseudorandom number generator in a number of different ways. The first definition was that of an intuitive understanding of the pseudorandom number generator. Hastad stated that the intuitive definition of a pseudorandom number generator was “a polynomial time computable function g that stretches a short random string x into a long string g(x) that “looks” like a random string to any feasible algorithm… that is allowed to examine g(x).”

There are a number of different random and pseudorandom number generators that have been described. The first type that was used computationally, the multiplicative congruential generator, was proven by Marsaglia to be ineffective. Further designs have been developed, and are described below. We focus on the pseudorandom number generators here because true random number generators do not allow for computability.

Blum, Blum, Shub (BBS) Generator

The BBS generator is based on the computational infeasibility of the quadratic residue problem. It is one of the simplest and most often used cryptographically strong pseudorandom number generators. This was described by Giulani (1998, p 16) as follows:

Let n be a positive integer. An integer a is said to be a quadratic residue modulo n if there is an integer x so that $x^2 \equiv a \pmod{n}$. This is denoted $a \in \mathrm{QR}(n)$.

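The BBS generator is designed using two k/2-bit primes p and q with $p \equiv q \equiv 3 \pmod{4}$, where k is large enough that factoring is infeasible, and let n = pq. Then, given a seed $s_0$ that is a quadratic residue modulo n, the generator computes $s_{i+1} = s_i^2 \bmod n$ and outputs the bits $z_i = s_i \bmod 2$ for $i = 1, \ldots, l$.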

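Because recovering p and q from n (that is, factoring n) is believed to be computationally infeasible, and the quadratic residue problem modulo n is believed to be infeasible without p and q, the BBS generator is considered to be cryptographically strong.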
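A toy version of the squaring recurrence is sketched below. It assumes the standard BBS form (primes congruent to 3 modulo 4, seed obtained by squaring a value coprime to n, least significant bit as output); the function name `bbs_bits` and the tiny primes 11 and 23 are illustrative only and offer no security.

```python
from math import gcd


def bbs_bits(p, q, x, count):
    """Return `count` output bits of a toy Blum-Blum-Shub generator.

    p and q must be primes congruent to 3 mod 4; x must be coprime to n = p * q.
    The seed s_0 = x^2 mod n is a quadratic residue modulo n.
    """
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q must both be congruent to 3 mod 4")
    n = p * q
    if gcd(x, n) != 1:
        raise ValueError("x must be coprime to n")
    s = (x * x) % n            # s_0
    bits = []
    for _ in range(count):
        s = (s * s) % n        # s_{i+1} = s_i^2 mod n
        bits.append(s % 2)     # z_i = s_i mod 2
    return bits


print(bbs_bits(11, 23, 3, 5))
```

With these parameters the internal states are 81, 236, 36, 31 and 202, so the printed bits are their parities.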

RSA/Rabin Generator

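The RSA/Rabin generator was also described by Giulani (1998, p 17). Like the BBS generator, the RSA/Rabin generator is considered to be a cryptographically strong pseudorandom number generator. The RSA/Rabin generator is set up with two k/2-bit primes p and q, with n = pq, and a number e such that $\gcd(e, \phi(n)) = 1$. A number d is then calculated using the equation $ed \equiv 1 \pmod{\phi(n)}$, where $\phi(n) = (p-1)(q-1)$ is the Euler function.

After the numbers are chosen, the RSA/Rabin generator follows the pattern $s_{i+1} = s_i^e \bmod n$, $z_i = s_i \bmod 2$ for $i = 1, \ldots, l$, where the seed $s_0$ is relatively prime to n and the $z_i$ are the output bits.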
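The exponentiation pattern can be illustrated in the same way as the BBS generator. The sketch assumes the usual form of the RSA generator (repeated exponentiation by e modulo n, least significant bit as output); the name `rsa_bits` and the small parameters are hypothetical and chosen only so that the arithmetic can be followed by hand.

```python
from math import gcd


def rsa_bits(p, q, e, seed, count):
    """Return `count` output bits of a toy RSA generator: s_{i+1} = s_i^e mod n."""
    n = p * q
    phi = (p - 1) * (q - 1)    # Euler function of n for distinct primes p, q
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    if gcd(seed, n) != 1:
        raise ValueError("seed must be coprime to n")
    s = seed
    bits = []
    for _ in range(count):
        s = pow(s, e, n)       # modular exponentiation
        bits.append(s % 2)     # z_i = s_i mod 2
    return bits


print(rsa_bits(11, 23, 3, 5, 4))
```

Note that the private exponent d is never needed to produce output; it only matters to someone trying to run the recurrence backwards.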

Kaliski Elliptic Curve Generator

Giulani (1998, p 18) describes the Kaliski elliptic curve generator as follows. Consider an elliptic curve E, defined as the set of solutions to the equation $y^2 = x^3 + Ax + B$ over the field $\mathbb{F}_p$, along with an additional point $\mathcal{O}$ (the point at infinity), where p is prime and $p > 3$. The elliptic curve discrete logarithm problem is: given P and Q such that $Q = aP$ for some integer a, determine a. Thus, the elliptic curve pseudorandom number generator can be described as

where P=aG

Described in this fashion, and supplied with P, Q and p but not a, the elliptic curve problem is intractable, so the elliptic curve pseudorandom number generator is considered to be cryptographically strong.

Generation of Random Primes

Some random number generators are optimized to generate numbers that are probably prime. Distinguishing prime numbers from composite numbers is a problem which is related to, although not identical with, random number generation. It can be solved in probabilistic polynomial time using a Las Vegas algorithm (that is, if an answer is obtained, it is correct) (Beauchamin, 1999, p 2). Detection of prime numbers can be handled by the Rabin algorithm, which is a very effective method of recognizing prime numbers where they occur. However, as Beauchamin (1999, p 9) noted, generation of an even distribution of prime numbers is vital to the study of cryptography. Beauchamin suggested using Rabin’s algorithm to determine the probable primality of the numbers output by a random number generator rather than attempting to generate primes directly, as the direct generation of prime numbers is too slow for practical use. The algorithm suggested is

function GenPrime(l, k)

{l is the size of the prime to be produced;

k is a bounding parameter which specifies the degree to which the number must be assured to be prime}

repeat

    n ← randomly selected l-digit odd integer

until RepeatRabin(n, k) = “prime”

return n

This produces a transformation of the output of a given random or pseudorandom number generator to an equidistributed subsequence of probably-prime numbers. The degree of assurance that a number is prime can be raised by increasing the parameter k.
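The GenPrime pseudocode translates almost line for line into Python. The sketch below is one possible reading, not Beauchamin's own code: it assumes that RepeatRabin means k independent rounds of the Miller-Rabin test with random bases, and that “l-digit” means l decimal digits. The names `rabin_round`, `repeat_rabin` and `gen_prime` are hypothetical.

```python
import random


def rabin_round(n, a):
    """One Miller-Rabin round for odd n > 3: True if n passes for base a."""
    d, r = n - 1, 0
    while d % 2 == 0:          # write n - 1 = d * 2^r with d odd
        d //= 2
        r += 1
    x = pow(a, d, n)
    if x in (1, n - 1):
        return True
    for _ in range(r - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False               # a is a witness that n is composite


def repeat_rabin(n, k, rng):
    """Run k rounds with random bases; 'prime' means 'probably prime'."""
    if n < 2:
        return "composite"
    if n in (2, 3):
        return "prime"
    if n % 2 == 0:
        return "composite"
    for _ in range(k):
        if not rabin_round(n, rng.randrange(2, n - 1)):
            return "composite"
    return "prime"


def gen_prime(l, k, rng=None):
    """Repeat: draw a random l-digit odd integer until it passes repeat_rabin."""
    rng = rng or random.SystemRandom()
    while True:
        n = rng.randrange(10 ** (l - 1), 10 ** l) | 1   # force the integer odd
        if repeat_rabin(n, k, rng) == "prime":
            return n


print(gen_prime(6, 20, random.Random(2024)))
```

Each round lets a composite number pass with probability at most 1/4, so k rounds bound the error by $4^{-k}$; this is the sense in which k controls the degree of assurance.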

Generalization of Pseudorandom Number Generators

The random number generators described above are based on specific functions or permutations, which has the potential to detract from their indistinguishability. Hastad et al (1999, p 2) discussed the generalization of pseudorandom number generators; that is, the construction of a pseudorandom number generator from any one-way function. A one-way function is a function that is easy to compute but difficult to invert, for example, the function underlying quadratic residuosity (as discussed above). These functions provide an element of incomputability that is required in order to provide random and undiscoverable numbers.

Pseudorandom number generators built on a specific one-way function rely on one piece of the equation remaining hidden and meaningful. Hastad defined hidden as not being discoverable or computable from the output or knowledge of the other bits within a reasonable amount of time; meaningful is defined as being predictable from output bits if there are no time restrictions.

Hastad’s pseudorandom number generation methodology was based on the concept of computational entropy (1999, p 4). Hastad stated that the computational entropy of g(X) is at least the Shannon entropy of Y if g(X) and Y are indistinguishable. If g(X) is a pseudorandom generator, the computational entropy of g(X) is greater than the Shannon entropy of the same function; in effect g amplifies entropy and increases the amount of randomness. Hastad formally defined computational entropy:

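Let f(x) be an (n; m(n))-standard function. Then, f has R-secure entropy s(n) if there is a standard function $f'$ such that f(X) and $f'(Y)$ are R-secure computationally indistinguishable and $H(f'(Y)) \ge s(n)$, where X and Y are uniformly distributed over the input domains of f and $f'$ respectively.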

To utilize the computational entropy of a one-way function in order to generate pseudorandom numbers, Hastad (1999, p 7-15) performed the following steps. First, a hidden and meaningful bit is constructed from the chosen one-way function. The simplest method proposed by Hastad (1999, p 13) for construction of a hidden and meaningful bit is

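If f is an R-secure one-way function, then the inner product bit $x \odot r$ is R'-secure hidden given $\langle f(x), r \rangle$, where r is a random string of the same length as x and R' is a security bound derived from R.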

Hastad’s second step (1999, p 14) is the conversion of a one-way permutation into a pseudorandom generator. Hastad provided the following proposition.

Let f(x) be an R-secure one-way permutation. Let $x, r \in \{0,1\}^n$ and define the standard function $g(x, r) = \langle f(x), r, x \odot r \rangle$. Then g is an R'-secure pseudorandom generator, where R' is a security bound derived from R.
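The shape of this construction, a permutation applied to x, a public random string r, and one extra bit formed as the inner product of x and r modulo 2, can be sketched as follows. This is an assumption-laden illustration: the inner-product form of the hidden bit is the standard Goldreich-Levin construction, and `toy_permutation` (x ↦ 3^x mod 17 on the values 1 to 16) is a hypothetical placeholder that is a permutation but is in no way one-way at this size.

```python
def inner_product_bit(x, r):
    """Inner product of the bit strings x and r modulo 2 (integers used as bit vectors)."""
    return bin(x & r).count("1") % 2


def toy_permutation(x):
    """Placeholder permutation of {1, ..., 16}; 3 is a primitive root modulo 17."""
    return pow(3, x, 17)


def g(x, r, f=toy_permutation):
    """Stretch the input (x, r) by one bit: output (f(x), r, x . r mod 2)."""
    return (f(x), r, inner_product_bit(x, r))


print(g(5, 0b0101))
print(g(11, 0b0110))
```

The output is one bit longer than the input, which is the minimal stretch; longer outputs are obtained by iterating the permutation and collecting one hidden bit per step.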

Hastad (1999, p 15) then described methods of entropy manipulation via hash functions. Hastad used a universal hash function in his description of a generalized pseudorandom generator. A universal hash function is defined as

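Let h(x,y) be an (n, l(n); m(n))-standard function. Then h(x,y) is a (pairwise independent) universal hash function if, for all $x, x' \in \{0,1\}^n$ with $x \ne x'$ and for all $a, a' \in \{0,1\}^{m(n)}$,

$\Pr[h(x, Y) = a \text{ and } h(x', Y) = a'] = 1/2^{2m(n)}$, where $Y \in_U \{0,1\}^{l(n)}$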
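The pairwise independence condition can be verified exhaustively for a small family. The sketch below does not use Hastad's bit-string family; as a stand-in it assumes the textbook family $h_{a,b}(x) = (ax + b) \bmod p$ over the integers modulo a prime p, with the key (a, b) playing the role of Y. The function names are hypothetical.

```python
from itertools import product


def h(x, key, p):
    """Hash x under key (a, b): (a * x + b) mod p."""
    a, b = key
    return (a * x + b) % p


def is_pairwise_independent(p):
    """Check that for all x != x2 and all targets (c, c2), exactly one of the
    p^2 keys maps x to c and x2 to c2, i.e. the probability is 1 / p^2."""
    keys = list(product(range(p), repeat=2))
    for x, x2 in product(range(p), repeat=2):
        if x == x2:
            continue
        for c, c2 in product(range(p), repeat=2):
            hits = sum(1 for key in keys
                       if h(x, key, p) == c and h(x2, key, p) == c2)
            if hits != 1:
                return False
    return True


print(is_pairwise_independent(5))   # prime modulus
print(is_pairwise_independent(4))   # composite modulus
```

The check succeeds for a prime modulus because the two linear equations in (a, b) then have exactly one solution; it fails for a composite modulus, where x − x2 may not be invertible.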

Hastad (1999, p 18) then described two lemmas for smoothing distributions. The first lemma is expressed as

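Let $\mathcal{D}$ be a distribution on $\{0,1\}^n$ that has Renyi entropy at least m(n), and let $X \in_{\mathcal{D}} \{0,1\}^n$. Let e(n) be a positive integer valued parameter. Let h be an $(n, l(n); m(n) - 2e(n))$-universal hash function. Let $Y \in_U \{0,1\}^{l(n)}$ and let $Z \in_U \{0,1\}^{m(n) - 2e(n)}$. Then, the statistical distance between $\langle h(X, Y), Y \rangle$ and $\langle Z, Y \rangle$ is at most $2^{-(e(n)+1)}$.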

This lemma provides a smooth or uniform distribution of random bits while retaining the original supply of random bits Y.

Hastad (1999, p 18) then provided a way to convert the pseudo-entropy generator described above to a pseudorandom number generator. The method described involved creating several copies of the pseudo-entropy generator (in order to equalize the Shannon entropy and the min-entropy), then using the hash function described above to convert the min-entropy to uniform entropy.

Distinguishability of Random Number Generators

The mathematical distinguishability of random number generators is a measure of how closely a pseudorandom number generator mimics the behaviour of a random number generator. In order for a pseudorandom number generator to be useful for cryptographic applications, it must be computationally indistinguishable from a true random number generator. Maurer (2002) discussed the problem of distinguishability of random systems, including functions and permutations. He noted that the security of cryptographic systems rests on the ability to reduce any means of breaking the system to an efficient distinguisher for the pseudorandom function from a uniform random function (p 110).

This distinguishability rests on the fact that although a random function is stateless (that is, it does not depend on any of its variables), a pseudorandom function, although it may appear to be stateless under statistical analysis, is in fact stateful (its output depends on one or more of its variables). Maurer (2002, p 117) considered the random system rather than the more specific subcategories of the random function, random permutation or random number because distinguishers can also be considered to be random systems, allowing for the generalization of technique across both defender and attacker of the cryptographic system. Two different distinguishing strategies can be used: adaptive and non-adaptive. As Maurer (2002, p 120) noted, non-adaptive strategies, or “brute force” methods, are easier to study because their behaviour does not change in response to the outputs they receive. However, this cannot be relied upon because many distinguishers are designed to be adaptive, that is, to change the method of distinguishing between two constructs based on the outputs received from their queries.

Maurer (2002, p 110) stated that the simplest definition of distinguishability was between two random variables. However, indistinguishability of two interactive systems (notated F and G by Maurer) is more complicated because the distinguisher can adapt its queries, or change them depending on the output it receives from previous queries (Maurer, 2002, p 111). Maurer defined a distinguisher D as a pair of complex random experiments, one from a query of F and one from a query of G. A security proof has to prove an upper bound for every D for the probability of a corresponding event between the security pairs, which is a very complex probability problem (Maurer, 2002, p 112).

In order to prove the security of any random system S for use in cryptography, Maurer (2002, p 112) noted that the following steps must be performed.

1. Define the attacker’s capabilities, including the number and type of queries to S, specification of requirements to break S and definition of a perfect system P which is trivially secure.
2. Consider the idealized system I, which can be derived from S by replacing the pseudorandom function at the heart of S with a truly random function. Prove that I and P are computationally indistinguishable; that is, there is no adaptive computationally unbounded distinguishing algorithm D that can gain a statistically significant advantage without the execution of a computationally infeasible number of queries.
3. S is computationally indistinguishable from I if the underlying function is truly pseudorandom (that is, indistinguishable from a random function by any computationally feasible means). Thus, S is also computationally indistinguishable from P. P is unbreakable by definition; ergo S is also unbreakable, because there is no way in which one can computationally distinguish P and S (Maurer, 2002, p 113).

While this appears to be a simple proof, the creation of a cryptographic construction that satisfies these criteria is far from straightforward. Maurer (2002, p 113) explored the problem of how to construct the quasi-random system S, containing a high enough degree of local randomness that it is computationally indistinguishable from a random system, and using as few random bits as possible for the greatest efficiency.

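In order to examine the question at hand more effectively, Maurer (2002, p 114) proposed a formal definition of a random function as follows:

A random function $\mathcal{X} \to \mathcal{Y}$ is a random variable which takes as values functions $\mathcal{X} \to \mathcal{Y}$. A deterministic system with state space $\Sigma$ is called an $(\mathcal{X}, \mathcal{Y})$-automaton, and is described by an infinite sequence $f_1, f_2, \ldots$ of functions, with $f_i: \mathcal{X} \times \Sigma \to \mathcal{Y} \times \Sigma$, where $(Y_i, S_i) = f_i(X_i, S_{i-1})$. $S_i$ is the state at time i, and an initial state $S_0$ is fixed. An $(\mathcal{X}, \mathcal{Y})$-random automaton F is like an automaton but $f_i: \mathcal{X} \times \Sigma \times R \to \mathcal{Y} \times \Sigma$ (where R is the space of the internal randomness), together with a probability distribution over $R \times \Sigma$ specifying the internal randomness and the initial state. (Maurer, 2002, p 117).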

Maurer (2002, p 117) then defined a random system in the following manner.

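An $(\mathcal{X}, \mathcal{Y})$-random system F is an infinite sequence of conditional probability distributions $P^F_{Y_i | X^i Y^{i-1}}$ for $i \ge 1$. Two random automata F and G are equivalent, denoted $F \equiv G$, if they correspond to the same random system, i.e. if $P^F_{Y_i | X^i Y^{i-1}} = P^G_{Y_i | X^i Y^{i-1}}$ for $i \ge 1$.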

In order to simplify analysis, Maurer (2002) considered only monotone events, or those in which once the condition of F under consideration failed it continued to fail and did not self-heal or use polymorphism to change its pattern. However, many random systems contain more than one such condition, defined as a binary variable with one of two states (success or failure) (Maurer, 2002, p 117). Two separate monotone event sequences, A and B, were defined by Maurer for consideration in F. $A \wedge B$ was given as the monotone event sequence defined by $(A \wedge B)_i = A_i \wedge B_i$ for $i \ge 1$; $A \vee B$ was given as the monotone event sequence defined by $(A \vee B)_i = A_i \vee B_i$.

Maurer (2002, p 118) discussed the concept of a cascade of random systems, that is, the invocation of an internal random system by the external random system. Maurer defined this construct as the $(\mathcal{X}, \mathcal{Z})$-random system FG, formed from the $(\mathcal{X}, \mathcal{Y})$-random system F and the $(\mathcal{Y}, \mathcal{Z})$-random system G by applying F to the initial input sequence and then applying G to the output of F.

Determining a distinguisher D between random systems F and G is a complex problem. Maurer (2002, p 118) defined D as a $(\mathcal{Y}, \mathcal{X})$-random system that, using an initial value $X_1$, outputs a binary decision value after at most k queries to F or G. Maurer denoted the maximal advantage for distinguishing F and G of any D using k queries by $\Delta_k(F, G)$.

It is possible to apply distinguishability techniques of random systems to quasi-random functions (or those functions for which the random system F meets the condition $F \equiv G$ for some random function G), permutations, beacons and oracles (Maurer, 2002, p 124) and to use the general security proof methods listed above to prove the security of these quasi-random objects. Maurer’s framework for a security proof based on deterministically stating the indistinguishability of the pseudorandom construct can be applied to any form of pseudorandom construct.

Verification of Random Number Generators

In order for a random or pseudorandom number generator to be useful in a statistical or cryptographic application, it must be verified to be producing random or pseudorandom numbers. In most cases, standard statistical tests are used to determine the randomness of a random or pseudorandom number generator’s output. Maurer (1992) described a novel method of verifying the statistical randomness of a random bit generator that is intended for use in a cryptographic application. Because the strength of a cryptographic cipher is dependent on the computational infeasibility of predicting the random number generator it is built on, it is particularly important for a random number generator used in a cryptographic application to be verifiably random.

According to Maurer (1992, p 2), the most commonly used statistical tests to determine randomness are the frequency test, poker test, serial test, autocorrelation test and run test. Maurer’s statistical test addressed issues related to these tests. First, Maurer’s test was designed to detect any defect that could be modeled by an ergodic stationary source with limited memory, a class which includes all known potential defects in a random generator, and includes all defects detected by the above tests. Second, Maurer’s test was designed to measure the cryptographic significance of the defect by measuring the per-bit entropy of the source, which affects the running time of a key search strategy in an attack based on a known defect in the random generator.

Maurer (1992, p 11) described the proposed test as follows. Three positive integer parameters, L, Q and K, are specified. The sequence output by the generator is divided into adjacent but non-overlapping blocks of length L. The total length of the chosen sample sequence should be $N = (Q + K)L$, with K representing the number of steps of the test and Q the number of initialization steps. Let $b_n(s^N) = [s_{Ln}, \ldots, s_{Ln+L-1}]$, for $0 \le n \le Q + K - 1$, describe the n-th block of length L in the sample sequence. For iterations $Q \le n \le Q + K - 1$ the sequence is scanned for the most recent earlier occurrence of the n-th block. This detects repetitions in the blocks that indicate a lack of true randomness.
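The block-scanning procedure can be sketched in Python. The sketch assumes the usual form of Maurer's test statistic, the average over the K test blocks of the base-2 logarithm of the distance back to the previous occurrence of the same block, with the block index itself used as the distance when the block has not occurred before. The function name and the parameter choices are illustrative, and no significance threshold is computed.

```python
import math
import random


def maurer_statistic(bits, L, Q, K):
    """Average log2 distance to the previous occurrence of each test block."""
    if Q < 1 or len(bits) < (Q + K) * L:
        raise ValueError("need Q >= 1 and at least (Q + K) * L bits")

    def block(n):
        return tuple(bits[n * L:(n + 1) * L])

    last_seen = {}
    for n in range(Q):                 # initialization blocks 0 .. Q-1
        last_seen[block(n)] = n
    total = 0.0
    for n in range(Q, Q + K):          # test blocks Q .. Q+K-1
        b = block(n)
        previous = last_seen.get(b)
        distance = n - previous if previous is not None else n
        total += math.log2(distance)
        last_seen[b] = n
    return total / K


periodic = [0, 1] * 20
print(maurer_statistic(periodic, L=2, Q=4, K=16))    # every block repeats at distance 1

rng = random.Random(0)
sample = [rng.getrandbits(1) for _ in range((160 + 16000) * 4)]
print(maurer_statistic(sample, L=4, Q=160, K=16000))
```

A strongly patterned source gives a small statistic because blocks recur quickly, while for L = 4 a good source should give a value near the published expectation of about 3.31.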

Cryptographic Applications of Random and Pseudorandom Number Generators

There are two models of cryptography currently in use. The first model, symmetric or secret-key cryptography, relies on the computational infeasibility of the significant bit to maintain its security (Maurer, 1992, p 2). The second cryptographic model, public-key cryptography, exemplified by the RSA security model, relies on the strength of the statistical randomness of the source of its pseudorandom number generator (Maurer, 1992, p 2). Maurer (1997, p 14) noted that there are several applications of randomness to cryptography. First, randomness is used in the attempt to create ciphers that are secure against attackers with unlimited time and computational resources. More commonly, randomness is used to create ciphers which are secure using a reasonable key length against attackers with bounded resources. Random numbers are also used in the generation of block ciphers.

Bellare (1997) discussed the implications of a weak pseudorandom number generator in cryptographic algorithms. The authors studied the DSS signature generation program and determined that the use of a linear congruential pseudorandom number generator allowed an attacker to recover the secret key after examination of only a few signatures. This was extended to any pseudorandom number generator that used a modular linear equation as its source.

As the authors noted (p 4), linear congruential generators (LCGs), which use linear equations of the form $X_{n+1} = (aX_n + b) \bmod M$, where a, b and M are chosen at random and then fixed, are known to be weak; this was first described by Marsaglia (1968). While these algorithms are fast, they are highly predictable, and breaking most of them falls easily within computational feasibility. Knuth recommended a truncated LCG as an additional security method, but these generators are also statistically weak. Cryptographic algorithms which rely on these generators, such as DSS, are required to keep the generation method secret in order to avoid compromise of the algorithm and key discovery.
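The predictability of a linear congruential generator is easy to demonstrate. The sketch below is not the DSS attack studied by Bellare; it is a much simpler illustration that assumes the attacker knows the modulus M and sees three consecutive full outputs. The constants and the helper names `lcg` and `recover_parameters` are illustrative assumptions.

```python
def lcg(a, b, m, seed):
    """Yield x_{n+1} = (a * x_n + b) mod m forever."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x


def recover_parameters(x0, x1, x2, m):
    """Recover (a, b) from three consecutive outputs when m is known.

    Assumes (x1 - x0) is invertible modulo m, which always holds for a
    nonzero difference when m is prime.
    """
    a = ((x2 - x1) * pow((x1 - x0) % m, -1, m)) % m
    b = (x1 - a * x0) % m
    return a, b


M = 2**31 - 1  # illustrative prime modulus
gen = lcg(a=48271, b=12345, m=M, seed=42)
x0, x1, x2, x3 = (next(gen) for _ in range(4))

a, b = recover_parameters(x0, x1, x2, M)
predicted = (a * x2 + b) % M
print(a, b, predicted == x3)  # 48271 12345 True
```

Once a and b are known, every later output follows from the last one observed, so nothing about the sequence remains secret. Truncating the outputs makes the algebra harder but, as the text notes, does not remove the underlying weakness.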

Luby-Rackoff

The Luby-Rackoff pseudorandom permutation generator is a commonly used pseudorandom number generation technique. This generator was designed for the DES encryption standard; it described a method whereby a pseudorandom permutation generator could be constructed from any pseudorandom function generator, in effect creating a generalized pseudorandom number generator (Maurer, 1992, p 2). Naor and Reingold (1997) describe the Luby-Rackoff construction as a formalization of cryptographic block cipher methods. Block ciphers are defined by Naor and Reingold (p 2) as “private key encryption schemes such that the encryption of every plaintext-block is a single ciphertext-block of the same length.” DES is a common block cipher encryption standard. The application of pseudorandom permutations to block ciphers is a boon because it protects against a chosen-plaintext attack, in which an attacker determines part of the contents of the block and translates the rest of the block on the basis of its content. Strong pseudorandom permutations are secure against a number of different types of attack, including adaptive, chosen-plaintext and chosen-ciphertext attacks (wherein an attacker has the ability to decrypt ciphertexts of its choice but still cannot distinguish between random and pseudorandom) (Naor & Reingold, 1997, p 2).

The construction of the Luby-Rackoff strong pseudorandom permutation involves four rounds of Feistel permutations (or three, for a pseudorandom permutation) of a function defined by the key. A Feistel permutation for a pseudorandom function f is given as $(L, R) \mapsto (R, L \oplus f(R))$, where L and R are the left and right parts of the input and $\oplus$ is bitwise XOR or any other group operation on $\{0,1\}^n$ (Maurer, 2003, 544).
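A four-round Feistel cascade of this kind can be sketched as follows. The round function here is a truncated HMAC-SHA256, which is only a convenient stand-in for the pseudorandom function f; the half-block size, the key values and all function names are assumptions made for illustration, and the sketch is not a vetted cipher.

```python
import hashlib
import hmac

HALF = 16  # bytes per half-block (illustrative choice)


def round_function(key, data):
    """Stand-in pseudorandom function f: truncated HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()[:HALF]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def feistel(key, left, right):
    """One Feistel permutation: (L, R) -> (R, L xor f(R))."""
    return right, xor(left, round_function(key, right))


def feistel_inverse(key, left, right):
    """Undo one round: (R, L xor f(R)) -> (L, R)."""
    return xor(right, round_function(key, left)), left


def encrypt(keys, block):
    left, right = block[:HALF], block[HALF:]
    for key in keys:
        left, right = feistel(key, left, right)
    return left + right


def decrypt(keys, block):
    left, right = block[:HALF], block[HALF:]
    for key in reversed(keys):
        left, right = feistel_inverse(key, left, right)
    return left + right


keys = [b"round-key-%d" % i for i in range(4)]  # four independent round keys
plaintext = b"thirty-two byte example block!!!"
assert len(plaintext) == 2 * HALF

ciphertext = encrypt(keys, plaintext)
print(len(ciphertext) == len(plaintext))      # True: block length is preserved
print(decrypt(keys, ciphertext) == plaintext) # True: the cascade is invertible
```

Note that the round function itself is never inverted. Each round is undone by recomputing f on the half that passed through unchanged, which is what lets any function, invertible or not, yield a permutation.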

Maurer presented a simplified and generalized description of the Luby-Rackoff construction. Maurer discussed the concept of local randomness, which he defined as follows: “a family of functions is locally random of degree k if for every set of at most k arguments, the function values for these arguments for a randomly (from the family) chosen function are independent and uniformly distributed” (Maurer, 1997, p 2). Maurer noted that a locally random or pseudorandom sequence generator could be obtained from a locally random function by simply “reading out” the function values for the arguments; however, the converse is not true because the sequence generator does not need to demonstrate that arbitrary digits can be accessed.

Maurer defined a random function as a function that assigns all arguments independent and completely random values (Maurer, 1997, p 8). He remarked that random functions are computationally expensive and in some cases the time and memory requirements for their computation are infeasible. Locally random or pseudorandom functions are an acceptable substitute for most applications. Maurer defined a locally random function as:

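A family $\{f_z : z \in Z\}$ of functions $\{0,1\}^n \to \{0,1\}^m$ is an (n,m,k) locally random function (LRF) with key space Z if for every subset $\{x_1, \dots, x_k\}$ of $\{0,1\}^n$, the values $f_z(x_1), \dots, f_z(x_k)$ are uniformly distributed over $\{0,1\}^m$ and jointly statistically independent, when z is randomly selected from Z. (Maurer, 1997, 8).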

In order to generalize the Luby-Rackoff construction, Maurer made two amendments. The first relaxed the condition of true randomness in any k function values by introducing a fourth parameter c to provide almost-randomness in k function values. The second was designed to make the definition of locally random functions consistent with other asymptotic definitions in complexity theory (Maurer, 1997, p 10). This extended the simplified Luby-Rackoff treatment to locally random as well as random functions, and by extension to pseudorandom number generators. Maurer concluded that this was of value to cryptography because it created a less computationally expensive way in which to provide randomness.

Bellare (1996) also discussed the Luby-Rackoff construction. Bellare remarked that block ciphers are by their nature invertible, which weakens the security of a cryptographic algorithm relying on them. Bellare (1996, p 3) defined a block cipher as “a function that transforms an n-bit message block x into an n-bit string y under the control of a k-bit key.” This function, used in the DES, triple-DES and RC5 encryption algorithms, is invertible because it is a permutation. The author noted that after a sufficiently large number of blocks have been encrypted the algorithm becomes insecure because partial information about the message can begin to leak.

Maurer (2003) discussed the security of the Luby-Rackoff construction. He noted that the only non-trivial step in the security proof of most cryptographic algorithms based on pseudorandom functions was the analysis of the idealized construction in which the pseudorandom function is replaced by a uniform random function. Using information-theoretic techniques, this idealized construction is shown to be indistinguishable, for a limited (usually large) number of queries, from a uniform random permutation, which constitutes the ideal degree of randomness. Maurer stated that the ideal test would include a number of queries close to the information-theoretic upper bound; if the expected entropy from these queries was greater than the internal randomness of the construction, then a distinguisher exists (545). Maurer defined this relation more formally as:

The r-round Luby-Rackoff construction of a permutation $\{0,1\}^{2n} \to \{0,1\}^{2n}$ (hereafter denoted by $\Psi_r$) consists of a cascade of r Feistel permutations involving r independent URF’s [uniform random functions] $\{0,1\}^n \to \{0,1\}^n$, where a Feistel permutation for a URF f is defined as $(L, R) \mapsto (R, L \oplus f(R))$. Here L and R are the left and right part of the input and $\oplus$ denotes bitwise XOR or, in this paper, any other group operation on $\{0,1\}^n$ (Maurer, 2003, 546).

Maurer (2003, 545) attempted to prove that the r-round Luby-Rackoff construction was indistinguishable from a uniform random permutation by any computationally unbounded adaptive distinguisher using a certain number of queries. The main theorem presented by Maurer (546) was:

Any computationally unlimited distinguisher, making at most k chosen plaintext/ciphertext queries has advantage at most in distinguishing from a random uniform permutation (URP). As a corollary we get that any distinguisher (as above) making at most queries has exponentially small (in n) advantage in distinguishing (here r+1 must be a multiple of 6) from a URP if . This beats the best known bound for where we get (Maurer, 2003, 546).

Maurer considered only monotone conditions; that is, once the condition failed to hold it would continue to fail (2003, 547). He proved that the number of queries required to distinguish a uniform random permutation from the r-round Luby-Rackoff construction, for a computationally unbounded adaptive distinguisher making plaintext/ciphertext queries, approaches the upper bound as r increases (Maurer, 2003, 559).

Bellare suggested (1996, p 4) that rather than using a block cipher as described above, a non-invertible pseudorandom function should be used to generate pseudorandom numbers for use in cryptographic keys. The pseudorandom function generator provides both easier analysis and greater security of the block text (Bellare, 1996, p 5). However, rather than performing a Luby-Rackoff transformation of a pseudorandom function generator into a pseudorandom permutation generator, Bellare described the transformation of a pseudorandom permutation generator as used in block ciphers into a pseudorandom function generator.

Block ciphers are a major basis for modern cryptography, and the Luby-Rackoff formalization of the block cipher has greatly increased the security of these ciphers. Alterations of Luby-Rackoff, including Bellare’s reversal of the random permutation generator into a random function generator that produces non-invertible functions, have further increased the security of modern cryptography.

One modification of the Luby-Rackoff construction is the use of the construction to create a threshold permutation, which is used in threshold cryptography, or the sharing of crypto operations across a number of servers (Dodis, 2006, 1). While these distributed protocols are available for a number of different cryptographic applications, including public key encryption techniques, digital signatures, pseudo-random functions and so on, pseudorandom permutations as described by Luby-Rackoff were not yet distributable at the time of Dodis’ discussion. Dodis (2006, 2) noted that the availability of distributed algorithms for pseudorandom permutations allowed such applications as remotely keyed authenticated encryption and CBC encryption, which had previously been infeasible due to their dependence on a pseudorandom permutation.

Dodis (2006, 3) noted that the application of generic techniques for facilitating distribution of these permutations was infeasible because the techniques had a number of problems with computation time; they required a linear number of communication rounds, or ran in rounds but used zero-knowledge proofs for each gate, making them computationally infeasible. Dodis presented an algorithm which evaluated in rounds in a distributed fashion.

In order to implement the distributed pseudorandom function generator, Dodis defined a distributed Feistel transformation using a distributed exponentiation protocol followed by conversion into bit shares, allowing computation of a Feistel transformation across multiple machines in time (2006, 12). The servers involved in the distributed generation share four separate secret keys; these are either generated by the servers themselves or distributed from a centralized, trusted party. The user who wants to evaluate the pseudorandom function sends the input x to the servers, which convert x into bit shares and run the distributed Feistel permutation for four rounds as described in the classic Luby-Rackoff construction. Dodis (2006, 13) presented the algorithm for this task as follows:

for to 2l (in parallel)

end

return

Dodis noted that the security profile of the distributed Luby-Rackoff construct is slightly different from a single-source construct. He noted that due to the security property of distributed PRP, the information received from any given server is not enough to break the security of the algorithm or construct (2006, p 13).

There are a number of different applications for the threshold pseudorandom permutation algorithm. The first such application is CCA-secure symmetric encryption, which allows for distributed encryption and decryption (Dodis, 2006, p 17). The threshold pseudorandom permutation algorithm also allows for cipher block chaining (to encrypt messages of lengths greater than 2l) and variable-length cipher blocks, both of which extend the usefulness of the block cipher paradigm beyond its somewhat rigid fixed-length structure (Dodis, 2006, 18).

Cryptographic uses of random systems and random number generators have special requirements above the norm of random number generator designs. Although cryptographic systems are no longer constrained to the use of random number generators based on physical processes, pseudorandom systems must be cryptographically strong in order to be usable. Cryptographic strength is defined as computational infeasibility of examination of the means of generating random numbers, in addition to indistinguishability of the pseudorandom system from a truly random system. This indistinguishability can be examined using Maurer’s (2002) method of security proof. Threshold random permutation generators and random function generators can be used to provide greater cryptographic strength.

Other Uses of Random Number Generators

Random number generators are used in a wide variety of computing applications. While cryptography is one of the most complex and strict applications of random number generators, other applications include statistical analysis and computer game design. Neither statistical analysis nor game design requires the same degree of rigour in indistinguishability or other measures of cryptographic strength that cryptography does, and sometimes weaker methods of generation are used for these techniques.

Statistical Analysis

Random numbers and random number generators are both studied by and used in statistical analysis. As described above by Maurer (1997), random number generators are subjected to statistical tests including autocorrelation tests and others to determine the extent of their randomness. In addition to this analysis, random number generators are used to provide inputs for a number of statistical tests, including Monte Carlo tests, simulation and resampling (Gentle, 2003, p 1). Gentle discussed the application of random number generators to Monte Carlo tests.

Gentle considered that an important quality of a random sequence for statistical analysis was equidistribution, that is, a predictable distribution of the sequence overall, one of the initial components of the definition provided by Knuth. Other qualities were the predictability of the sequence and the generation method. Many uses of random or pseudorandom number sequences in statistical analysis rely on the sequence mimicking some real-world situation; therefore, constraints are often placed on the sequence and outliers discarded; this method generates so-called quasi-random sequences (Gentle, 2003, p 93). Quasi-random sequences are known to correspond to samples from a U(0,1) distribution (Gentle, 2003, p 95). A number of different quasi-random sequences have been described. Among them are Halton sequences, which are formed by reversing the digits in the representation of a sequence of integers in a given base (Gentle, 2003, p 94), and Sobol’ sequences, which are based on a set of direction numbers chosen to satisfy a recurrence relation based on the coefficients of a primitive polynomial in the Galois field (Gentle, 2003, p 97).
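The digit-reversal idea behind Halton sequences can be illustrated with a short sketch. It computes the radical inverse of each index (the base-b digits mirrored about the radix point) and pairs bases 2 and 3 to form two-dimensional points. Exact fractions are used so that the pattern is visible; the function names and the choice of bases are illustrative assumptions.

```python
from fractions import Fraction


def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point."""
    result = Fraction(0)
    scale = Fraction(1, base)
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result


def halton(count, bases=(2, 3)):
    """Return the first `count` Halton points, one coordinate per base."""
    return [
        tuple(radical_inverse(i, b) for b in bases)
        for i in range(1, count + 1)
    ]


for point in halton(4):
    print(", ".join(str(c) for c in point))
# 1/2, 1/3
# 1/4, 2/3
# 3/4, 1/9
# 1/8, 4/9
```

Each new point falls into the largest remaining gap in its coordinate, which is why these sequences cover the unit interval more evenly than independent random draws.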

Monte Carlo simulation is one of the primary uses of random numbers in statistical analysis. Gentle (2003, p 229) defined Monte Carlo simulation as “the use of experiments with random numbers to evaluate mathematical expressions.” Madras (2002, p 1) remarked that there are two types of Monte Carlo simulation: direct simulation of a naturally random system and addition of artificial randomness to a system. Gentle noted that very large and sparse equation systems, and problems such as the evaluation of integrals over high-dimensional domains, can be effectively computed using Monte Carlo methods. Monte Carlo methods can also be used to simulate physical processes, and are often used in statistical physics to solve such problems as evaluation of models with large numbers of particles (Gentle, 2003, p 229).
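As a small illustration of evaluating an integral over a multi-dimensional domain, the sketch below estimates the volume of the unit ball in four dimensions by sampling points uniformly in the enclosing cube and counting the fraction that land inside. The dimension, sample count, seed and function name are arbitrary choices made for this example.

```python
import math
import random


def ball_volume_estimate(dim, samples, rng):
    """Estimate the volume of the unit ball in `dim` dimensions."""
    inside = 0
    for _ in range(samples):
        point = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if sum(c * c for c in point) <= 1.0:
            inside += 1
    cube_volume = 2.0 ** dim  # the cube [-1, 1]^dim encloses the ball
    return cube_volume * inside / samples


rng = random.Random(12345)
estimate = ball_volume_estimate(dim=4, samples=200_000, rng=rng)
exact = math.pi ** 2 / 2  # known closed form for the 4-dimensional unit ball

print(f"estimate = {estimate:.4f}, exact = {exact:.4f}")
```

The error of such an estimate shrinks in proportion to the inverse square root of the number of samples regardless of the dimension, which is what makes the approach attractive when grid-based integration becomes impractical.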

The Monte Carlo method uses Markov chains to simulate a solution to a problem rather than attempt to directly solve the problem (Fishman, 1996, p 3). Some forms of Monte Carlo simulation introduce upper bounds of sampling, known as variance-reducing techniques (Fishman, 1996, p 4). This decreases the processing time and increases the accuracy of the simulation.

Game Design and Game Theory

There are many applications of random number generators within computer science. In addition to the aforementioned cryptographic seeding and statistical modeling, random number generators and pseudorandom number generators are used within computer game design to create simulations of real-world phenomena. Bewersdorff (2005) discussed the application of random numbers to simulation of a number of different games, including blackjack, poker, snakes and ladders and Monopoly (121). Monte Carlo simulation, as discussed above, is often used for examination of game theory. Limited or quasi-random number generation is used in these simulations to provide a real-world picture of the decision paths chosen to create a given game.

Discussion

Random number generators are the workhorse of the statistical and cryptology fields. The algorithms that define them in cryptographic systems lie at the bottom of layer upon layer of overlapping safeguards, in an attempt to keep the secret bits secret. Block ciphers, reverse block ciphers and other methods of secure random number or function generation are tested statistically to ensure computational infeasibility. However, a weak random number generator can still lead to a cryptographic standard being broken, as happened with DSS in the late 1990s. A number of different methods have increased the security level of secret key cryptography algorithms: threshold cryptography (distribution of the cryptographic calculation among several unique servers), use of random functions rather than random permutations to provide non-invertible pseudorandom sequences, and an increase in the number of permutations used in block cipher algorithms. Some cryptographic standards rely on the length of the key to protect the information; however, this creates an arms race in which the keys must continually get longer as computing methods get faster and more efficient in order to maintain computational infeasibility.

For statistical analysis, secrecy is not key to a random number generator. However, the generator must produce equidistributed results which can be arrayed into a normal distribution in order to maintain real-world accuracy in simulations. If a random number generator does not maintain this, there is a chance that simulations will fail or, worse, succeed but give undetectably inaccurate results, as Marsaglia discovered in his examination of linear congruential generators in the 1960s. Methods of examination must continue to improve as the detectability of non-equidistribution increases.

Conclusion

Random number generators are a vital component of many practical mathematics applications, including cryptography and Monte Carlo statistical tests. True random number generators are based on physical processes, such as radioactive decay or noise from a transistor or other electronic component, which serve as the source for number generation. However, true random number generators are expensive because of their physical nature. For many applications, use of a pseudorandom number generator is sufficient. These generators simulate the output of the physical process, taking a small random sequence and spinning it out into a longer sequence. While a pseudorandom number generator is not truly random, a good one is indistinguishable from a random number generator in most statistical tests.

The first computational random number generators, multiplicative congruential generators, were used from the advent of computing up to the 1960s. In the 1960s, George Marsaglia made an observation regarding the output of these generators, which cast into doubt their use up to that point. He noted that rather than being random, these generators produced output that was easily arrayed into hyperplane crystalline structures. This lack of randomness caused some tests that required true randomness to fail for unknown reasons. More perniciously, some tests succeeded with this non-random random sequence, producing results of unknown and indeterminable quality.
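The structure Marsaglia observed can be made concrete with a well-known multiplicative congruential generator. The sketch below uses RANDU (multiplier 65539, modulus 2^31), which is chosen here purely as an illustrative example and is not named in the discussion above. Because 65539^2 is congruent to 6 * 65539 - 9 modulo 2^31, every output is a fixed linear combination of the two before it, so consecutive triples lie on a small number of parallel planes.

```python
def randu(seed, count):
    """Multiplicative congruential generator x_{k+1} = 65539 * x_k mod 2**31."""
    modulus = 2**31
    x = seed
    outputs = []
    for _ in range(count):
        x = (65539 * x) % modulus
        outputs.append(x)
    return outputs


xs = randu(seed=1, count=1000)

# For every consecutive triple, x_{k+2} - 6*x_{k+1} + 9*x_k is a multiple of 2**31.
residuals = {
    (xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2**31
    for k in range(len(xs) - 2)
}
print(residuals)  # {0}
```

A simulation that consumes such numbers three at a time is sampling from those planes rather than from the full cube, which is one way a test can appear to succeed while producing results of unknown quality.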

A large number of random and pseudorandom number generators have been developed, including the BBS generator, the elliptic curve generator and the RSA/Rabin generator. Additionally, Maurer described a method whereby a unique pseudorandom number generator could be designed using any one-way function, that is, a function that is easy to compute given all components but difficult to invert if one component is missing. Generation of random primes is a special application of random number generation. These numbers are used in cryptographic applications and can also be used to study prime numbers. This generation is accomplished by generating a stream of random numbers and applying the Rabin algorithm to test the primality of the output numbers. The Rabin algorithm is highly effective at finding prime numbers, but it is not perfect: there is a high degree of assurance that the outputs are prime, but the assurance is not absolute. The degree of assurance can be tuned by using a higher value for the parameter that controls the confidence in primality.
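The generate-and-test procedure for probable primes can be sketched as below, using the Miller-Rabin form of the Rabin test. The `rounds` parameter plays the role of the assurance parameter: each additional round cuts the chance of accepting a composite by at least a factor of four. The function names are hypothetical, and Python’s `random` module is used only to keep the example reproducible; it is not a cryptographically secure source, so real key generation would need a stronger generator.

```python
import random


def is_probable_prime(n, rounds=20, rng=random):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def random_probable_prime(bits, rounds=20, rng=random):
    """Draw random odd candidates of exactly `bits` bits until one passes."""
    while True:
        # Force the top bit (exact size) and the low bit (odd).
        candidate = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate, rounds, rng):
            return candidate


rng = random.Random(7)
p = random_probable_prime(128, rng=rng)
print(p.bit_length(), is_probable_prime(p))
```

The output of the search is only as unpredictable as the generator feeding it, which ties the quality of generated primes directly back to the quality of the underlying random number source.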

Random number generators must be tested for strength if they are to be used in cryptographic applications. Because the strength of a cryptographic algorithm rests in the security of a significant and secret bit (or piece of the equation which is not revealed), the function used to generate the numbers must be highly random and display high computational infeasibility. Statistical tests including common frequency tests, autocorrelation tests and poker tests have been used to determine the randomness of the output of random number generators for cryptographic use. Maurer described a more general statistical test which accounted for all practical defects in a cryptographic random number generator and also determined the cryptographic implications of the given defect. This statistical test was not applied to pseudorandom number generators, which are known to have a number of statistical defects despite their appearance of randomness.

Random permutation generators and random function generators, which produce random output, as well as other random systems, are commonly used in cryptography in the creation of block ciphers, which produce pieces of ciphertext the same length as the originating plaintext. These generators are exemplified by the Luby-Rackoff random permutation generator, which creates a random or pseudorandom permutation using four rounds of Feistel permutations (or three for the weaker pseudorandom permutation). Luby-Rackoff can also be reversed for greater security by creating a random function generator from the random permutation generator.

The security of random systems as used in cryptographic applications is determined by one factor: the computational infeasibility of distinguishing a random system’s output or determining how it creates the output. The randomness and distribution of random number generation can be verified statistically. However, in order for a cryptographic system to be truly secure it must have a distinguishability limit close to the information-theoretic upper limit in regard to the number of queries by the distinguisher. The number of queries considered should also be near the upper limit of computational feasibility in order to provide the maximum level of security. Maurer (2002) suggested a general framework for security proofs of random systems which included defining an ideal system corresponding to the random system under examination, but with the pseudorandom or quasi-random generation mechanism replaced by a random system. Examination of the distinguishability of the random and pseudorandom generation mechanisms then yields a proof of the security of the random system. This can be applied to any random system, including random functions, random permutations and so on.

Monte Carlo simulators are an example of a statistical application that requires random number generation. This modeling method takes a novel approach to solving problems: rather than attempting to deterministically solve the problem, it simply runs simulations until an answer is found. Monte Carlo simulation is used in many areas of inquiry including physics, chemistry, game design and chaotic modeling. These simulation and modeling applications work in one of two ways: either by relying on the inherent randomness in observed data sets or by incorporating randomness produced by a random number generator into the simulation in order to observe its effects. In some instances, random number generators for statistical applications are limited, producing a so-called quasi-random sequence which falls along the normal distribution for use in simulations which require representation of a typical data set.

Random number generation techniques continue to improve, and pseudorandom number generators gain higher degrees of randomness as calculation methods improve. The Luby-Rackoff random permutation generator, designed for the DES encryption standard, was the first step in a cascade of improvements in the block cipher algorithm which allowed for the use of a pseudorandom number generator in a strong cryptographic technique. A number of improvements have been made to the Luby-Rackoff construction, including simplification of the technique, use of a pseudorandom function generator rather than Feistel permutations in order to prevent inversion of the output, and the creation of a threshold random permutation generator using the technique. The threshold random permutation generator adds additional functionality to the block cipher paradigm by allowing for variable length cipher blocks and cipher block chaining, or a stateful recognition of the correct transmission order of cipher blocks in order to assemble a message of greater than 2l length.

Random number generators are also being used in mathematical analysis. The Rabin algorithm, used in conjunction with a pseudorandom or random number generator, can efficiently find very large numbers that are probably prime. These can be used for further analysis or for seeding a cryptographic standard algorithm which relies on prime numbers for its operation, such as DSS.

These are just some of the applications and generation methods of random and pseudorandom numbers in use today. Numerous other methods exist, including physical methods using atomic clocks, noise from leaky transistors and even, most whimsically, cameras aimed at an array of functioning lava lamps. Pseudorandom number generators continue to increase in strength and randomness as well, becoming more useful for applications such as cryptography that require strongly random sequences. Both random and pseudorandom number generators continue to grow in importance as computational power becomes less expensive and more complicated simulation scenarios are designed.

Future horizons for random number generation include cheaper random number generation from physical sources as detection and sampling methods improve as well as increasingly secure techniques for random systems. However, as cryptographic applications grow more complex in an effort to maintain their security, attacking systems also become more complex. In order to maintain a balance and preserve the integrity of cryptographic applications, constant development and refinement of techniques must be used to keep ahead of attacking techniques.

References

Beauchemin, Pierre, Brassard, Gilles, Crepeau, Claude, Goutier, Claude & Pomerance, Carl (1999). The Generation of Random Numbers That are Probably Prime.

Bellare, Mihir, Goldwasser, Shafi & Micciancio, Daniele (1997). “Pseudo-Random” Number Generation within Cryptographic Algorithms: the DSS Case. Advances in Cryptology – Crypto 97 Proceedings. Lecture Notes in Computer Science, 1294.

Bellare, Mihir, Krovetz, Ted & Rogaway, Phillip (1998). Luby-Rackoff Backwards: Increasing Security by Making Block Ciphers Non-invertible. Advances in Cryptology – Eurocrypt 98 Proceedings, Lecture Notes in Computer Science Vol. 1403, K. Nyberg ed, Springer-Verlag, 1998.

Bewersdorff, Jorg (2005). Luck, Logic and White Lies: The Mathematics of Games. Wellesley, MA (USA): AK Peters, Ltd.

Dodis, Yevgeniy, Yampolskiy, Aleksandr, & Yung, Moti (2006). Threshold and Proactive Pseudo-Random Permutations. Presented at TCC 2006, Columbia University, New York, NY (USA). Retrieved electronically from < http://www.informatik.uni-trier.de/~ley/db/conf/tcc/tcc2006.html >

Ferguson, Niels & Schneier, Bruce (2003). Practical Cryptography. Indianapolis, IN (USA): Wiley Publishing.

Fishman, George (1996). Monte Carlo: Concepts, Algorithms and Applications. New York, NY (USA): Springer Publishing.

Gentle, James E. (2003). Random Number Generation and Monte Carlo Methods. New York, NY (USA): Springer Publishing.

Giuliani, Kenneth (1998). Randomness, Pseudorandomness and its applications to Cryptography.

Håstad, Johan, Levin, Leonid, Luby, Michael & Impagliazzo, Russell (1999). Construction of a pseudo-random generator from any one-way function. SIAM Journal on Computing 28.4:1364-1396.

Knuth, Donald (1997). The Art of Computer Programming, Volume 1: The Fundamental Algorithms. Harlow, England: Addison-Wesley Publishing.

Knuth, Donald (1997). The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Harlow, England: Addison-Wesley Publishing.

Madras, Neal Noah (2002). Lectures on Monte Carlo Methods. Providence, Rhode Island (US): American Mathematical Society.

Marsaglia, G. (1968). Random Numbers Fall Mainly in the Planes. Proceedings of the National Academy of Sciences, USA.

Maurer, Ueli (1992). A Universal Statistical Test for Random Bit Generators. Journal of Cryptology 5.2:69-105.

Maurer, Ueli (1992). A Simplified and Generalized Treatment of Luby-Rackoff Pseudorandom Permutation Generators. Advances in Cryptology – Eurocrypt ’92 Lecture Notes in Computer Science, Berlin: Springer-Verlag, 658:239-255.

Maurer, Ueli (May 2002). Indistinguishability of Random Systems. Lecture Notes in Computer Science vol 2332, pp 110-132.

Maurer, Ueli & Pietrzak, Krzysztof (May 2003). The Security of Many-Round Luby-Rackoff Pseudo-Random Permutations. Lecture Notes in Computer Science, pp 544-561.

Naor, Moni & Reingold, Omer (1997). On the Construction of Pseudo-Random Permutations: Luby-Rackoff Revisited. Proc. 29th Ann. ACM Symp. on Theory of Computing, pp. 189-199.

Schneier, Bruce (2000). Secrets & Lies: Digital Security in a Networked World. Indianapolis, IN (USA): Wiley Publishing.
