What Is Random

Thought experiment: How would you pick a random number between zero and positive infinity? Think carefully...

Very relevant link:

http://www.plosone.org/article/fetchArticle.action?articleURI=info:doi/10.1371/journal.pone.0000443

In ComputerScience randomness is important because computers can't do it! Computers are inherently deterministic, and as such they have a hard time being random - everything a computer does is entirely at the mercy of the instructions it is given. This is a problem when we want to use computers in statistical applications that demand randomness. It also raises PhilosophicalQuestions? such as AreWeCode - do we have FreeWill or not, and if so, will we ever be able to build a computer with FreeWill, or one that acts as a human does? It also raises issues for DataCompressor?s - the more random data is, the harder it is to compress.

Defining randomness

The first question, then, is: how do we define randomness? Our typical understanding of randomness is that it is something without plan or cause, spontaneous in its manifestation. Our mathematical understanding of randomness comes from statistics.

Take an unbiased coin flip. When we flip the coin we can say that the probability of heads is 50% and the probability of tails is 50%. We cannot be sure what the outcome is but we know it must be one or the other.

Now suppose we have another coin and we want to test it against the unbiased one to see if there is any bias; that is, we want to know whether the probabilities of heads and tails are the same for this coin. The question becomes: how can we use the unbiased coin to see what the bias of the other coin is? The answer is that we reason about what we should expect to see in an infinite number of coin flips.

Our unbiased coin says that heads and tails are equally likely. This means that if we perform an infinite number of flips then we should expect the flip to have been tails just as often as it was heads - 50% of the time. "Ah", so we reason, "if there was a bias in our other coin then in an infinite number of flips we should see that either heads or tails are favoured so that one side is more likely to occur than the other - 60% heads, 40% tails for example."

"But," I hear the pragmatist cry, "I can't sit around here all-day and make an infinite number of coin flips! How can I tell that such a thing as an unbiased coin really exists anyway? How do I know there isn't really some underlying process at work here so that if I could figure it out then I would always know what the next flip is? If I could figure out a system I could win big in the casinos!"

This is where things get interesting because we are left to reason about the properties of finite portions of an infinite number of flips - we have to measure how close reality gets to our idealised notions of what should happen in an infinite number of trials.

So take a small sequence of coin flips:

THTT

What can we discern from the coin from this small portion of trials?

Entropy is a measure of the disorder of a system. The more disordered, and hence random, a system is, the higher its entropy. Entropy in InformationTheory allows us to discern how much information would be required to store a sequence of symbols. Take our coin flip example: if we equate heads to 0 and tails to 1 then it is easy to see that each flip of an unbiased coin requires 1 bit of information to store. Therefore we need 4 bits to store our example sequence above, right?

Wrong - if there is a bias to our coin then there is more order in the system and the entropy drops. If the entropy has dropped then InformationTheory says that we can use fewer bits to store our sequence. A totally biased coin, one that only ever flipped tails for example, would require 0 bits of information to store any sequence - this is because any finite sequence we care to take from the infinite one can be recreated simply by writing out the required number of T symbols.

Now our coin sequence above is clearly not just T symbols, but we can also see that the sequence does have a bias towards T. Hence we know that, strictly speaking, fewer than 4 bits are needed to store the information in this system. It is this principle that DataCompressor?s rely on - using fewer bits to encode a larger number of bits.
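To make the entropy argument concrete, here is a minimal sketch in Python (the function name and the shortcut of treating observed frequencies as probabilities are assumptions made for illustration):

 import math
 from collections import Counter

 def empirical_entropy(sequence):
     # Shannon entropy in bits per symbol, estimated from observed frequencies.
     counts = Counter(sequence)
     total = len(sequence)
     return -sum((c / total) * math.log2(c / total) for c in counts.values())

 flips = "THTT"
 print(empirical_entropy(flips))               # ~0.811 bits per flip (3 T : 1 H)
 print(empirical_entropy(flips) * len(flips))  # ~3.245 bits in total - fewer than 4

An unbiased sample such as "THTH" scores a full 1 bit per flip; it is exactly the bias towards T that lets the total dip below 4 bits.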

"So entropy tells us about the disorder of a system but," the gambler interjects, "that doesn't mean that there isn't some system at work that determines which side the coin comes up on each flip!" This is where the concept of determinism comes in.

In probability theory we can talk about the probability of one event given the occurrence of another event. For example, in our small coin flip sequence we can say that P(T|H) is 100%, since a tail always follows a head; P(T|T) and P(H|T) are both 50%, since TH and TT both occur in the sequence; and P(H|H) is 0%, since HH never occurs. We can also talk about longer sequences: P(T|HT) and P(TT|TH) are both 100%, since the sequence HT always leads to a T and the sequence TH always leads to TT. When the probability is 100% we can say the event is deterministic; that is to say that, in the given sequence, TH determines that TT follows. Now, as computers are deterministic, we can make use of this in our DataCompressor?s: we just need to find a set of 'sequence given sequence' probabilities that are 100% and that allow us to completely determine the sequence from the beginning. If we can do this and encode this information in fewer bits than the entropy of the sequence then we have achieved our aim of compression. This problem relates to the KolmogorovComplexity measure, which tries to ascertain just what the smallest sequence that can describe another one is.
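To illustrate those 'sequence given sequence' probabilities, the following Python sketch tallies which symbol follows each context in a finite sequence (the function name and the dictionary layout are illustrative choices, not anything fixed by the discussion above):

 from collections import Counter, defaultdict

 def conditional_probs(sequence, context_len=1):
     # Estimate P(next symbol | preceding context) from a finite sequence.
     following = defaultdict(Counter)
     for i in range(len(sequence) - context_len):
         context = sequence[i:i + context_len]
         following[context][sequence[i + context_len]] += 1
     return {ctx: {sym: n / sum(cnt.values()) for sym, n in cnt.items()}
             for ctx, cnt in following.items()}

 print(conditional_probs("THTT"))     # {'T': {'H': 0.5, 'T': 0.5}, 'H': {'T': 1.0}}
 print(conditional_probs("THTT", 2))  # {'TH': {'T': 1.0}, 'HT': {'T': 1.0}}

The 100% entries are precisely the deterministic ones a compressor can exploit.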

From entropy and determinism we can see that randomness is essentially non-determinism: if a finite sequence has maximum entropy then we cannot build a compressor for it, and hence we cannot take advantage of any determinism in the finite sequence (that is to say, the minimum amount of information we need to determine the entire sequence is the entire sequence). For infinite sequences this means we can say that the sequence is non-deterministic - that is to say, we cannot construct a deterministic encoding that describes the entire sequence.

This also gives us another property of random sequences - they are acyclic. Any infinite sequence that is deterministic will repeat itself eventually whereas no such global repetition can exist in a non-deterministic infinite sequence.

Pseudo-random number generators

These are deterministically based - they attempt to produce finite sequences that appear random but, because of the cyclical property, all such random number generators will eventually repeat themselves.
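A minimal sketch, using a deliberately tiny linear congruential generator so the repetition is easy to see (real generators use the same recurrence with enormous moduli, but the cycling is just as inevitable):

 def lcg(seed, a=5, c=3, m=16):
     # x' = (a*x + c) mod m; with at most m distinct states, it must cycle.
     state = seed
     while True:
         state = (a * state + c) % m
         yield state

 gen = lcg(seed=7)
 print([next(gen) for _ in range(20)])
 # [6, 1, 8, 11, 10, 5, 12, 15, 14, 9, 0, 3, 2, 13, 4, 7, 6, 1, 8, 11]

After sixteen outputs the state returns to the seed and the whole sequence repeats.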

True random number generators

These are non-deterministically based - generally using some natural phenomenon that is assumed to be random as the source, such as atmospheric noise, radioactive decay or some other application of quantum mechanics.
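For example, operating systems gather physical noise (interrupt timings, device jitter, sometimes dedicated hardware) into an entropy pool, which Python exposes through the standard library. Strictly speaking, most systems stretch the pool through a cryptographic PRNG, so 'true' is a matter of degree:

 import os
 import secrets

 print(os.urandom(8))         # 8 bytes drawn from the OS entropy source
 print(secrets.randbits(32))  # an unpredictable 32-bit integer from the same source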


You limit the function of the compressor a great deal. You can compress data that has a less-than-deterministic nature... e.g. if a particular sequence has a non-deterministic 99% probability, then you can still compress by encoding the 1% exceptions as things that break up the monotony of the 99% normal state. Then you can repeat the effort if, in the result, there is a next common sequence.

[Ah no, you see you missed the point. The whole point of a compressor is to solve the problem: given a sequence X, find a sequence Y that can deterministically describe it in fewer symbols than X. Now if we were non-deterministically generating sequence X then we would just use the probabilities for each symbol. Of course, I'm not sure people would be very happy if their decompressors only had probabilities to go on - you don't really want to have to decompress repeatedly until you happen to get your original data back. As such, if the sequence is highly deterministic then we can compress more - we need fewer symbols in Y to describe X. If the sequence is highly non-deterministic, as in your example, then we need more symbols in Y to describe X. What we essentially get from our compressor is a measure of just how deterministic our data is. The most deterministic data is a sequence of the same symbol. And the least deterministic... well, that's when we know we have random symbols.]

[Perhaps you misunderstood, or you haven't studied the same forms of compression as me. If there is a sequence of As and Bs where P(A|B) = 99%, P(A|A) = 99%, P(B|A) = 1%, P(B|B) = 1%, then there are no instances of the 100% deterministic sequences you mention. Compression, in this case, can be performed by creating symbols that represent long strings of As. E.g. there is only a 26% chance that a string of 30 sequential characters contains a B. Thus, if you create a symbol C to represent 30 As, you can compress every sequence of 30 As into just one C. Quite a few compression schemes on domain-specific data work by this mechanism or something similar (e.g. an alternative would be to introduce a three-character sequence CXA, where X represents a precise number of As in some finite range... like 4 to 200+).]

[You have to consider all the run lengths. In the worst case you have P(S|NULL) = 100%; that is, the probability of the sequence you are trying to encode given nothing is 100%, meaning we can only determine the sequence given the sequence itself. We want something that can take advantage of runs. In your example we know that there are going to be a lot of runs of As with a few Bs dotted about. As such we can analyse all the smaller run lengths and encode P(S|NULL) in terms of sequences smaller than S. In your example C encodes 30 As, so we have P(C|some other determinant) = 100%, in this case the other determinants being all the encodings which come before a run of 30 As, which can reconstruct the entire sequence from the start.]

[That isn't required at all. You can even use C to start compressing arbitrary-length streams of data without ever examining them long enough to learn that C always or never follows any particular determinant. In the long run, P(C) = ~74%, completely independently of what came before it, replacing a sequence of 30 sequential As. You can decompress the sequence "...AAAAABCCAAAAAAAAB..." without ever bothering to learn what the "..." sequences consist of. That this can be done without ever learning P(C|Whatever) is sufficient proof that neither said examination nor proof is necessary. This is all probability driven. Consider creating a substitution for, say, "BBB => X". It would only have a one-in-a-million chance of being used, and would compress two characters. Total gain: (2ppm - cost of transmitting "X => BBB" - cost of adding X to the alphabet). The substitution "30*A => C", however, has a ~74% chance of being useful and compresses 29 characters. Total gain: (~71% - cost of transmitting "C => 30*A" - cost of adding C to the alphabet). These are probability-driven compression schemes and do not require determinism. However, study is required to determine where such compression is profitable, especially when adding the cost of extending the alphabet (which may require using escape characters or sequences and might levy a percentage cost on the original stream).]
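A minimal sketch of the "30*A => C" substitution described above, assuming the symbol C does not already occur in the stream:

 RUN = "A" * 30

 def substitute(text):
     # Replace each run of thirty As with the new alphabet symbol C.
     return text.replace(RUN, "C")

 def desubstitute(text):
     return text.replace("C", RUN)

 sample = "A" * 95 + "B" + "A" * 40
 packed = substitute(sample)
 assert desubstitute(packed) == sample
 print(len(sample), "->", len(packed))  # 136 -> 20

No 100% conditional probability is consulted anywhere; the scheme pays off purely because long runs of As are probable.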

You also missed the fact that pseudo-random number generators do not need to repeat themselves... but to avoid that entirely, they must not be finite state machines. Being cyclic is not a necessary property of deterministic sequences. E.g. the sequence of decimal digits describing pi or the inverse natural log is acyclic and seemingly random, yet entirely deterministic. A compressor that cannot identify the fact that the sequence being compressed is a sequence of 1 million digits of pi following the one millionth digit of pi might think it random... yet it can be encoded in less than a sentence: "One million digits of pi following the one millionth digit of pi".

[On reflection you could be right about using pi in this manner.]
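The pi example can even be made concrete: Gibbons' unbounded spigot algorithm (a published, well-known method; this Python transcription is a sketch) deterministically streams the decimal digits of pi one at a time, forever, without cycling:

 def pi_digits():
     # Gibbons' unbounded spigot: yields 3, 1, 4, 1, 5, 9, ...
     q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
     while True:
         if 4 * q + r - t < n * t:
             yield n
             q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
         else:
             q, r, t, n, k, l = (q * k, (2 * q + r) * l, t * l,
                                 (q * (7 * k + 2) + r * l) // (t * l), k + 1, l + 2)

 gen = pi_digits()
 print([next(gen) for _ in range(15)])  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]

A sequence can thus be deterministic, acyclic and statistically random-looking all at once.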

You do not touch on the random source here... and, more importantly, on inputs to the process that affect the outputs. That seems a critical missing piece before you go much further. If you focus only on the output, then you lose the fact that, perhaps, the coin you flipped that seems random was in fact entirely deterministic... e.g. it lands whichever face you had up when you flicked it off your thumb.

[ It is an idealised coin flip experiment. Unbiased coins don't exist either. Don't take the analogy too far now. ]

We can call something with inputs and outputs a process. A pure generator is a particular type of process... one without inputs (beyond its initial construction); it consumes no inputs. Any pure generator that operates by deterministic laws can be entirely described by one of its states and the fixpoint of the rules that determine its future states (i.e. including the rules that determine the rules that determine its future states).

Viewing generators and their outputs is rather limiting to any meaningful study of the subject of WhatIsRandom.

[It is, in fact, all we have to go on. The whole problem of randomness essentially relates to the problem of infinity.] [I'd say that a study of what randomness means to a process is worth pursuing.]


A more useful question is: does random even exist?

[I think it's related to whether or not you think infinity can exist. In a strictly finite universe any sequence of events can be deterministically described, as I demonstrated above. In an infinite universe you can't compute it.]

[A finite universe only allows you to deterministically describe a sequence of events if those events are determined. It is often assumed, but not a given, that the past is determined after we experience it and that the present is determined after we observe it. The future is still largely unknown, but if you postulate deterministic physical laws as directing the universe (a relatively modern idea, but not a weak one) then you can posit that the future might also be determined for the finite universe. If the amount of information required to describe the universe in its broadest concept ('uber-verse' below) is finite, and the rules governing the universe are deterministic, then information entropy cannot exist... so long as you have all the information to start with. If the rules are non-deterministic, then the universe definitely possesses some form of information entropy... though whether it is patterned in a manner we'd call truly "random" is an open question. The other possibility... that the information required to describe the universe or its rules is infinite, but the rules to describe the universe are deterministic... I'm honestly not sure how to properly approach that one at the moment.]

If the universe is a pure deterministic generator, then its every state can be wholly described by its initial state and the rules that govern it. In such a universe, randomness cannot truly exist... not even of the sort discussed above. There is no information entropy, even if we can't see that from our rather limited perspective.

There are two possible means by which the universe is not a pure deterministic generator. One is that the universe receives inputs from... someplace else. This leads to arguments as to whether this "someplace else" should be considered part of our universe, or whether we should view the universe as being part of a multi-verse. In either case, one can ultimately expand the question to this larger 'uber'-verse that is constituted by our universe and everything that has a direct or indirect influence upon it.

The other possibility is that there are non-deterministic rules that govern our uber-verse (or non-deterministic rules that govern the rules that govern the universe, not forgetting that rules can change too).

If random is to exist in the universe, it must be because of the latter... because a random-source is part of the rules governing either our universe or one of those influencing it directly or indirectly.

How this random-source might exist is an open question - one left, at the moment, in the hands of philosophers more than computer scientists. If we are to possess Free Will, then it must be an inherent part of what we consider 'self'. If quantum physicists are right, then perhaps it's part of the underlying fabric of space... or perhaps a weird combination that somehow brings observation into question (Schroedinger's cat style). It is possible that multiple such sources exist, or that none do.


It's worth noting that any definition of random that derives purely from observations on the output of a process, rather than from this ultimate source of "randomness", is not contradictory to free will. This is because such definitions do not preclude patterned output or output that is influenced by the process (or the will). Instead, random is merely being used as another word to describe the fact that the output of such a will is not deterministically compressible. This, combined with the fact that once you add process inputs, random cannot truly exist in a deterministic universe even though information entropy (from a limited perspective, or in an infinite universe) can, brings into question whether the initially offered definition of "random" should be dismissed as inadequate.

It was originally stated that "Our typical understanding of randomness is that it is something without plan or cause, spontaneous in its manifestation. Our mathematical understanding of randomness comes from statistics."

This disparity leads too easily to the EquivocationFallacy. If Person A says that something "is neither non-deterministic nor random", that person is using the typical understanding of randomness; by "random" that person means something that is without plan or cause, and spontaneous in its manifestation. When Person B yells back: "non-deterministic and non-random? No such thing exists!", Person B is using the mathematical understanding of randomness, which essentially means "information entropy" on the output sequence. Person B has committed the EquivocationFallacy, albeit perhaps as a consequence of Person A not properly communicating his or her intent.

[The problem is that if you are going to use the mathematical term non-determinism with a non-mathematical understanding of randomness then you cannot really make sensible conclusions.]

[You didn't properly define the mathematical term non-determinism (having only offered an informal description by way of analogy), but there is also a typical understanding of non-determinism. EquivocationFallacy could also happen on that issue. You can look up determinism in a dictionary or philosophy text; when Person A says that something is non-deterministic he or she effectively means the opposite of that definition (also known as indeterminism). An event or decision is non-deterministic if it lacks sufficient cause for its exact nature. I.e. a thrown baseball is non-deterministic in its path only if having all the laws of nature and all the inputs still cannot determine the exact path of the baseball. This is more the Western philosophic traditional view of Determinism (things get really funky in the Eastern philosophic tradition, where sufficient cause is the volition of sentient beings within the universe that ultimately drives that which appears to us as physical and karmic law).]

It is probably worth exploring what that typical understanding of random really entails.


Random^H^H^H^H^H^H Spontaneous^H^H^H... A thought: if information entropy regarding outputs from an infinite system can exist even if its rules are entirely deterministic, would such a possibility not describe a system that is Deterministic AND Random?

If our universe is not discrete in space or time, then infinity exists in any given volume or space-time, much as there are infinite possibilities between 0.0 and 1.0. It can already be proven that it takes infinite digital information to fully describe some analog signals, and that it takes infinite digital information to fully describe arbitrary probabilities rather than approximations thereof, so if the universe is analog or based in quantum probability...

Which brings up a related thought: is it possible that it takes infinite information to fully describe the will of a person? Whether it be materialistic (i.e. bound to the brain or body) or otherwise?


As discussed above, any form of non-determinism must ultimately arise from a non-deterministic rule in the universe. What would such a rule entail? This deserves at least a little consideration. Perhaps we need to first figure out WhatIsaRule?. However, from context it can be said that a rule, at the very least, describes the progression of a generator from one state to another (a phrasing that implies states are quantum) or aids one in describing a future state based on a past state given some variables (a phrasing that implies states are bound to progression in one dimension of time). I'm having some difficulty finding a physics-neutral definition for rule. Perhaps "a description that leads you from one condition to another given some description of direction". Adequate? Good enough, I think.

(exploratory definition) WhatIsaRule?: A description that leads from one condition to another, given some description of direction.

We can call the condition entering the rule the "input condition", and the condition returned by the rule the "output condition". It seems that the natural direction is "next condition", but that again implies a one-dimensional timeline... something that isn't much implied even by modern physics. In a complex generator, rules can change too. Rules that might change can be called "conditional rules", and it is possible that a system has two identical physical states except that the conditional rule sets are entirely different. Conditional rules would also need to be part of any output "condition". I.e. the big "rule" used to lead from one condition to another must be able to utilize and integrate over conditional rules.

The description of a rule need not be digital in nature. It is possible that rules can only be described by analog or quantum descriptors, or something else entirely. In fact, there is no need that a rule be describable within the system it describes... no more than the rules of Peano arithmetic need be describable using just the Peano language. By analogy, for computers or humans there is an evaluation engine, and that engine must know the rules, not the system itself. Even conditional rules need not be found within the system... they can be tacked on, like a two-piece description (conditional rules, condition state). Rules can be separate from state even if the two interact like a badly directed game of Nomic.

Given this description of rule (which is quite generic, but specific enough to make some conclusions), which possibilities allow for non-determinism?

It seems that there are at least two parts of a rule you can focus upon.

One is the direction. In a simple system there is only one "forward" direction (so the direction parameter can essentially be optimized down to "in the future", whether time is discrete or not; discrete time would imply a single "next" step, while non-discrete time would imply that a state exists at every possible time-descriptor, and that we experience only fuzzy snapshots of moments). In a less simple rules system, there are multiple directions... at least two. And, if there is nothing determining which direction is taken from a given condition, it might be worth conceptualizing them all as being taken. If time is three-dimensional and discrete, there are 2^3 paths from any given condition, though it is possible that some directional paths converge (such that R(R(C0 (+1,0,0)) (0,+1,+1)) == R(C0 (+1,+1,+1))). Output condition being independent of path is not a necessary property, only a possible one. Following all paths does not lead to a non-deterministic universe... but it does mean that we, who experience time only on one dimension (but who may have an infinite or finite but absurdly large number of... twins... on other paths) may experience the universe as non-deterministic since we can't ever see which direction we took in the path of time. If not all paths are followed, it would imply that something is sitting outside our universe and giving direction... but that requires that we now expand our universe back to the uber-verse so it grabs that something as part of it.

One can also look at the other side... the output of the rules on a generator. There are many possibilities here. The simple possibility is that there is exactly one output condition and it can always be calculated exactly from the rule, the initial condition, and the direction in a functional manner. A slightly more complicated possibility is that the output of the (rule condition direction) is a union of possible conditions (Ca|Cb|Cc|Cd|...|Cn), possibly not even a bounded number. This situation indicates that the rule is ambiguous, not necessarily non-deterministic. This would imply that there are many possible "futures" even given a single direction in which to apply that future. This, again, leads to the "all are followed" hypothesis, since nothing outside the uber-verse exists to pick just one direction, and again leads to the illusion of non-determinism for those within the uber-verse (since those denizens of the uber-verse will maximally learn enough about the universe to learn that the future will be (Ca|Cb|Cc|...) and they won't know which until they get there). Another possibility is that you are returned a single condition Cx, but that this output condition cannot be determined prior to the rule's application; the same (rule condition direction) will not always return the same output condition... i.e. same as the simple possibility, but with no referential transparency. This would imply that something in the uber-verse, whatever it is, introduces either judgement or (non-judgemental) chance (or both) into determining the path taken by the universe, and that Cx is one of many possible paths prior to selection by chance or judgement. In the case of judgement, this choice might depend on more than just the current condition of the universe, but also on the possibilities introduced by the next ten, next hundred, next million steps. This is true non-determinism since (by description) you cannot determine the precise result prior to asking for the result.
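A toy model of the three possibilities just described; the function names and the use of integers as stand-ins for 'conditions' are purely illustrative assumptions:

 import random

 def precise_rule(condition):
     # Exactly one successor; the same input always yields the same output.
     return condition + 1

 def ambiguous_rule(condition):
     # A union of possible successors (Ca|Cb|...); not single-valued,
     # but the set itself is perfectly determined.
     return {condition + 1, condition + 2}

 def nondeterministic_rule(condition):
     # One successor, but which one cannot be determined before application;
     # no referential transparency.
     return condition + random.choice([1, 2])

 print(precise_rule(0), precise_rule(0))                    # always equal
 print(ambiguous_rule(0))                                   # the whole set, every time
 print(nondeterministic_rule(0), nondeterministic_rule(0))  # may differ between calls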

"Random", to us denizens of the universe, would be the path we experience among many possible paths, whether that path is selected by chance or there are a finite but large number of 'twins' experiencing various other paths. Thus, if the rules of the universe are ambiguous, or there are multiple directions the universe can take from any given condition, then our experience in the universe is non-deterministic even if the rules of the universe are not.

If there is judgement, however, then there is something that is both non-deterministic and not truly random... not by that typical understanding of the term used by Person A.


Free will and the Universe

For free will to exist, it must somehow exert itself upon which possible paths are experienced by the denizens of the universe, e.g. by eliminating paths where a denizen possessing free will would perform an action contrary to that will. If the path from a given condition is not ambiguous, this means that the free will of every denizen possessing it must be part of what determines the future path of the universe. If the path is ambiguous, that means that the free will of every denizen possessing it must be part of what constrains the future path of the universe. In a sense, free will acts much like that "judgement" mentioned above; if there is something like a god, individuals with free will must be part of it. Such a will must, itself, not be determined entirely by the condition of the universe... though it may be influenced by it (like the rules in a game of Nomic).

Such a thing seems magical, but is neither more nor less so than any true random "chance" that would select the future paths of a universe from a set of possibilities.

Occam's razor cuts free will away... it isn't needed to explain anything we've observed thus far. Free will won't be required until evidence exists to introduce it as a necessary hypothesis. How could such evidence be obtained? Probably not from within the universe defined by a single state, unfortunately.


"If the path is ambiguous, that means that the free will of every denizen possessing it must be part of what constrains the future path of the universe. In a sense, free will acts much like that "judgement" mentioned above; if there is something like a god, individuals with free will must be part of it."

I don't understand your reasoning that the denizen must be part of the god constraining the path. Can you expound on that? -- BrucePennington

Ambiguity opposes precision. If the rules governing our universe are ambiguous with regards to, say, the exact path taken through space by a thrown baseball, then no matter how much you know about the rules of our universe and its initial conditions, it remains impossible to predict the exact path observed by a thrown baseball. It might or might not be true that the exact path of the baseball is ambiguous in our universe; we cannot yet model such things as quantum fluctuations regarding energy in a vacuum well enough to assert that they are precise in nature and, indeed, modern thinking outside string theory is much the opposite. Such fluctuations can conceivably modify the path of the baseball to a (very) small degree. In any case, if the rules governing the path of a baseball are ambiguous, the most you can do is constrain a prediction regarding its path to an (admittedly small) volume of possibilities. That's as precise as it is possible to get.

The difference between a precise universe and an ambiguous one is only that, in a precise universe, every single observation can be reduced to one (and exactly one) possibility. I.e. the path of the baseball would be reducible to exactly one path. You might need a higher-order-of-infinity amount of digital information to make that prediction, or your own exact replica of our universe and everyone and everything within it running a few seconds ahead, but it is conceptually doable.

However, ambiguity does not oppose determinism. An ambiguous yet deterministic universe might take every path. We (being limited to observations on one dimension of time) would only observe one path, and it will probabilistically be a common one. An ambiguous non-deterministic universe might truly take one path of the many possibilities at random. Again, it will probabilistically be a common one; we cannot, with our current technologies or understandings, physically distinguish these two possibilities. It's worth keeping in mind that the ambiguous, non-deterministic universe also does not imply the existence of free will.

If free will is to exist in an ambiguous universe, it can only mean that either (a) we choose which of many possible paths we individually observe, or (b) we constrain the set of possible paths we'll experience to a smaller set than what is physically possible. Both (a) and (b) are really the same thing: (a) can be correctly viewed as constraining oneself to exactly one path. That is what choosing is all about: taking a set of possibilities, and eliminating some of them (often all but one). If multiple denizens are to share free will in an experienced universe, then they must collectively choose or constrain the set of possible paths; those that do not choose or do not constrain are not, in any way, exercising free will.

In a sense, the ability to choose or constrain the path of the universe is like unto a god. It is in this sense that the phrase you quoted above was uttered: if there is something like a god, individuals with free will must be part of it.

Of course, the mechanism for constraining the path taken by the universe can be very limited (and very un-god-like) and still allow for free will. Free will does imply power to move the universe, but does not imply the presence of a lever long enough or someplace to stand. I.e. if all you are free to do is pick your nose or not at some particular times, you still have free will... just not much power to leverage it. Perhaps if you attached yourself to a great war machine and communicated to it by picking your nose, you could freely rule the world. More likely, you'll just be rid of an itch for a few seconds before it returns. Generally, though, when 'free will' is discussed with regards to humans, one means the ability to choose certain high-level actions that are within our physical limitations and our intellectual comprehension... e.g. perhaps to aim the gun, possibly to pull the trigger, but not necessarily then to hit the target. One sometimes also means conscious decision, but that, too, is not necessary (conscious thought is limited to a very small section of the brain and overall motor decision). If all your free will allowed was consciously choosing when to pick your nose, it'd be hard to determine whether you were responsible for the shooting; perhaps you should have picked your nose twice more at the gun shop? Hmmm...


CategoryDefinition

