|
|
|
|
|
|
|
|
|
Understand randomness and you
could win a Nobel prize, or clean up big time at the casino. So the stakes
are high indeed for a tiny band of mathematicians who reckon they are beginning
to crack it, says Dana Mackenzie
ONE DAY, while wasting time and money in your local casino,
you notice that a computer-animated craps game is behaving in a strange way.
For 100 consecutive rolls of the dice, the machine comes up with an odd number.
Visions of a big payout dance in your head. Should you start betting on odd
numbers? Or would that be superstitious
nonsense?
Your dilemma reflects a problem that has mystified gamblers and mathematicians
alike: how do you tell whether an individual
event is random or has an underlying pattern. Randomness is all around
us, in the motion of stock prices and atoms;
yet it is maddeningly elusive. If you succeed in trapping it, it's no longer
random.
Scientists are forever limited to living on the few islands of order in
the vast sea of randomness, because they can
only study what they can describe. A truly random event has no pattern to
it and hence is indescribable. Or so it would seem, according to classical
theories of randomness. But a revisionist view, called Kolmogorov
complexity, is now beginning to turn that conventional wisdom upside
down. Using this theory, mathematicians have found a curious thing: an area
of their world where the normal roles are reversed, where randomness is entirely
surrounded by order, rather like a lake surrounded
by land.
Such a find is a huge breakthrough for mathematicians. Computer scientists
even compare the discovery to the breakthroughs that led to
quantum physics. It means that even if this lake
of randomness is strictly off limits, it is possible to
map its shores. Armed with such a
map, they should be able to better explore the nature
of randomness and to tackle hitherto unsolvable problems. And in the real
world computer scientists, financial analysts and even gamblers could
benefit.
The idea that makes all this possible dates back to the mid-1960s when the
Soviet mathematician Andrei Kolmogorov and the Americans Roy Solomonoff and
Gregory Chaitin hit upon a way of defining
randomness. Their idea was that most numbers or
sequences of digits (they amount to more or
less the same thing in mathematical terms) are random in the sense that they
contain no pattern. The big question was how
to spot them, how to distinguish these numbers that by their very nature
lack any distinguishing features.
Kolmogorov, Chaitin and Solomonoff defined the notion of the complexity
of a number. Roughly speaking, the complexity of a sequence of digits
is the length of the shortest computer program that prints that sequence
and then stops running. A computer program can be written for any sequence
of digits simply by issuing the command "PRINT" followed by the numbers in
the sequence. So the Kolmogorov complexity of a sequence, as it later became
known, is never more than the length of the sequence plus the number of bits
it takes to encode the instruction "PRINT".
Of course, some nonrandom sequences can be printed by much shorter programs.
The first billion digits of the number pi are not random because a short
program can be written to generate them. And a sequence of one hundred
is can be compressed into a program such as "PRINT 1 100 TIMES" -precisely
because this sequence has a pattern. In fact, any number you can compress
can't be random because it contains a pattern.
This leads to a straightforward definition of randomness. Kolmogorov and
his colleagues defined a number as random if the shortest program for calculating
its digits turns out to be about the same length as the number itself. Random
numbers cannot be compressed. But as simple and appealing as this notion
seems, determining the Kolmogorov complexity of a number is fraught with
difficulty. Suppose you have a long sequence of digits and a program to generate
it that is about the same length. Does that prove that your number is random?
Absolutely not, since there could be a shorter program that also does the
job that you don't know about and the number could be compressible in some
hidden way. In fact, it turns out that you can never know whether you have
the shortest program or not. Not only is it practically impossible to find
the shortest program that computes a number, it's also theoretically
impossible.
That leaves mathematicians in a fix. It means they can only play with the
small proportion of numbers that have some kind of pattern. The rest-the
random numbers-are forever hidden since it is impossible to prove they really
are random.
For this reason, Kolmogorov complexity remained more of a curiosity than
a practical mathematical tool. But in the last few years Paul Vitanyi, an
information theorist at the Centre for Mathematics and Computer Science and
the University of Amsterdam, both in the Netherlands, and his long-time
collaborator Ming Li at the University of Waterloo in Canada, have made huge
progress in using Kolmogorov complexity. Their stunning result concerns a
famous geometric problem set out in the first half of this century by the
German mathematician Hans Heilbronn. If a number of pebbles (n) are
placed inside a square and triangles are drawn between them, what arrangement
of pebbles makes the smallest triangle as large as possible? (see Diagram).
Mathematicians call the largest possible size of the smallest triangle the
nth Heilbronn number.
The smallest triangle
Even with a small number of pebbles, the problem is extremely challenging.
With five pebbles, the Heilbronn number turns out to be 0.l924... (meaning
that the smallest triangle in this arrangement is 0.1924... times the size
of the square). The sixth Heilbronn number is the size of the square. But
even for as few as seven pebbles, the Heilbronn number is too difficult to
calculate. Heilbronn suspected that for large numbers of pebbles, the Heilbronn
number would be roughly proportional to 1/ n2, meaning
that the smallest triangle in a configuration of 1000 pebbles would have
an area no bigger than one-millionth the size of the square. But in 1982,
experts in geometry proved that Heilbronn's guess was wrong and that the
real power of n is somewhere between the numbers 8/7 and 2. But to
this day, nobody knows exactly how to work out the correct power.
There is another version of this problem that has traditionally been even
more difficult to crack. This is the one that Vitanyi and his colleagues
have chosen to tackle. Instead of looking for the very best configuration
of n pebbles, they asked what would happen if the pebbles fall at
random in the square-how big is the smallest triangle formed in this case?
This version has the added problem of randomness built in. If you decide
to study an arrangement of pebbles, how do you know that it really is random?
Enter Li, Vitanyi and Tao hang of McMaster University in Hamilton, Ontario.
They reasoned that although the answer itself must involve the notion of
randomness, any tiny deviation away from this answer would contain some order
and could therefore be spotted and eliminated using the ideas of Kolmogorov
complexity. In this way, they could approach the answer from above and below
by eliminating all the nonrandom possibilities until
what is left must be random. In a sense, they would be mapping the shores
of randomness without ever getting their feet wet.
'Mathematicians can only play with the small proportion of numbers that have
some kind of pattern. The rest, the random numbers, are forever hidden since
it is impossible to prove that they really are random'
At the heart of their approach is the idea that the positions
of the pebbles can be encoded using a coordinate system inside the square.
This means that any arrangement of pebbles can be represented by a sequence
of numbers. If you can compress this sequence by writing a shorter program
that produces the same sequence, then the arrangement cannot be random.
As in many mathematical proofs, the first step
is to guess an answer and then try to prove it right. Vitanyi and his colleagues
picked an answer and showed that if the spacing between the pebbles were
larger than this, the pebbles would have to adopt a regular pattern as they
were added to the square. Likewise, they showed that if the spacing were
smaller, at least three pebbles would fall in a straight line. This extra
information would allow the sequence of coordinates to be compressed and
so it cannot be random. Having proved that the answer cannot be larger or
smaller than the guess-having mapped the edge of randomness-the only other
option is that the guess is correct.
Using this method, the team showed that the area of the smallest triangle
is proportional to 1/n3 -no more, no less. So with 1000
pebbles randomly arranged in a square, the smallest triangle will have an
area roughly one-billionth of the square's. In February they published the
result on the Internet and it is currently being reviewed for
publication.
The work represents an extraordinary achievement. Mathematicians now have
a powerful way to make an exact statement about randomness. And not
just about one random number "Random objects are all interchangeable since
by definition they have no special properties that can be used to effectively
select a proper subset of them. They are like a sea of
water molecules which cannot be distinguished,"
says Vitanyi.
So the answer holds true for every random arrangement that it is possible
to generate. In one giant leap, mathematicians have gone from complete ignorance
of every random sequence to being able to say something about all of them.
And since the vast majority of numbers are random, this is an astonishing
achievement. For this reason, Vaughan Pratt, a leading expert on the complexity
of computer systems at Stanford University and one of the founders of Sun
Microsystems, calls Kolomogorov complexity a bulk theory of random numbers
that is comparable to the discovery of wave-particle duality in physics.
When physicists stopped thinking of subatomic particles as points in space
and started thinking of them as waves that spread throughout space, they
were better able to predict their behaviour. Similarly, by treating random
numbers as objects that cannot be pinned down, mathematicians have a powerful
new way to solve problems. "It lets you wrap your mind around the suite of
all possible computations," he says.
The significance goes beyond pure mathematics. Vitanyi's group and other
mathematicians are working on ways to apply their new found skills to problems
in computer science. One important problem is determining the average running
time of a given computer program. Essentially, this problem boils down to
the task of finding an average" number with which to test the program. An
average number is one that does not have any special sequence or pattern
that could make the program run especially quickly or slowly. If this sounds
familiar, there is a good reason. In mathematical terms, an average number
is very similar to a random number.
Computer scientists are painfully aware that finding a truly random number
is tough. Instead, they usually confine themselves to finding best and worst-case
scenarios, a problem that equates to finding numbers with a special sequences
that make the program run as quickly or as slowly as possible. Of course,
the worst case scenario may not bear any relation to how fast the program
runs under average conditions which is why the method is frustrating.
But a solution may be in sight. Vitanyi and his colleagues recently applied
the idea of mapping randomness to the problem of determining the average
running time of a list-sorting program called Shellsort, a problem that has
gone unsolved for some forty years. And the idea is spreading. According
to Xiaolu Wang, a mathematician and computer scientist in New Jersey, the
work may point the way to a better solution to one of Wall Street's trickiest
problems: how to determine the fair market value of derivatives. These are
essentially IOUs, promises to buy or sell a given stock by a certain date,
or when it reaches a certain price. Their value depends on how likely the
stock is to go up or down by the prescribed amount, and that in turn depends
on predicting the seemingly random fluctuations of the market.
Because the stock market is so complex, analysts like Wang do not use
mathematical formulae to price a derivative. (While economists
Fischer Black and Myron Scholes recently
won a Nobel prize for such a formula, its assumptions on how investors react
are now thought by many analysts to be too simplistic.) Instead, they simulate
the stock market many times on powerful computers and work out an average
estimate of how it will behave, a trick known as the Monte Carlo method.
"It's powerful and easy to apply to a wide range of problems," says Wang.
"The drawback is that its convergence [the time it takes to arrive at a reliable
estimate] can be agonisingly slow."
The inefficiency of the Monte Carlo method comes right back to Heilbronn's
problem, Wang says. When computers simulate random events, like the random
dots in the square, they produce too many "clusters" of points. "If you have
a cluster in a given region, you will emphasise that region too much," Wang
says. He has founded a financial consulting company called Advanced
Analytics that sells a proprietary method for generating sequences that
are more evenly distributed, although not random. So Vitanyi, Li and Jiang's
result acts like a mathematical guarantee of the efficacy of Wang's method.
By describing precisely what the degree of clustering is in random simulations,
it should allow Wang to determine just how much better he can do with more
uniform simulations.
And for the gambler playing computer animated craps, there is also hope.
In the example at the beginning of this story; betting on an odd number is
the right idea. A hundred odd numbers in a row is more than just a lucky
streak-it is a pattern than can easily be compressed and so is immensely
unlikely to be the result of a random process. If you haven't worked it out
already, the machine is almost certainly broken and you should bet on it
before the management finds out.
|
|
|
|
|
|
|
|
|
|