Goedel Numbering

Since the page was accreting incorrect statements, I've tried to add a more detailed explanation:

Goedel invented GoedelNumbering as a way of assigning a unique positive integer to every possible formula in a mathematical system, so that mathematical statements could be translated into numbers, which could then appear within other statements, allowing mathematical statements to become FirstClass values, which he then used to prove his two famous theorems.

The clever method he used for this encoding was to take advantage of the Fundamental Theorem of Arithmetic: every positive integer has a unique factorization. In GoedelNumbering, the factorization of an integer yields the statement that the integer encoded, and does so uniquely.

To do so, every symbol in the alphabet to be used for statements is assigned a unique number, e.g. "a"=100, "b"=200, "c"=300 (large values picked for clarity of exposition) etc., and every possible position in the statement is assigned a PrimeNumber, for instance, position 1 = 2, position 2 = 3, position 3 = 5, position 4 = 7, position 5 = 11, position 6 = 13, etc. The encoding is performed by taking the position encoding and raising it to the power of the symbol encoding at that position, and cumulatively multiplying together each such encoded position/symbol.

For instance, the message "bc" has "b" in position 1; "b" is represented by 200, position 1 is represented by the prime 2, so the first position/value pair is uniquely encoded by 2^200. The second symbol is "c", which is represented by 300, while position 2 is represented by the prime 3, so the second position value pair is uniquely encoded by 3^300, and the whole message is equal to 2^200 * 3^300, multiplied out in full:

 21997612563412852819736854443172936461863101231358274561146636833168
 58315086690275754329270589426383652921493514482045416477153318639590
 74292125411622867441287958061576051884137642597711028147158438117376

The original message can be recovered (in principle) by finding the unique factorization of that number, which has only one unique answer, 2^200 * 3^300, which tells us which symbols were encoded at which positions.

This example uses 200 to encode "b" and 300 to encode "c" merely because use of 2 and 3 would be visually confusingly similar to the encodings of positions 1 and 2, but arithmetically would cause no problems and would yield the correct answers. However, even with minimum encodings, even small statements tend to have very large GoedelNumbers?, frequently too large to be practical to work with, although factoring them is comparatively easy in principle because the factors come from a known "pool", so the factorization is tedious but straightforward.

Regardless, the purpose of GoedelNumbering is to allow statements to be made about statements.

A simple GoedelNumbering for a program written in ASCII text is to treat the entire program as a big base-256 number

I wonder if there is a parallel in GoedelNumbering with the concept of PrimeNumbers; do prime goedel's constitute new 'primitive' operations? -- WilliamUnderwood Is math on GoedelNumberings ever useful? Other than the obvious case of concatenating two programs with a (very large) arithmetic operation; I can't think of any mathematical transformations that wouldn't result in a file containing utter rubbish. Because these numbers are a mathematical vehicle you could dispense with them altogether in any practical application. Just keep in mind, that whatever representation you use, it could be translated into one number (in a way you can see this in any number program: consider the whole memory as one large number in base-256 notation - for the bytes).

But "any mathematical transformations" would include factorization, which doesn't lead to "utter rubbish". When dealing with large numbers, be tolerant of large amounts of arithmetic.


Hmmm. This bears some evident relation to the computation of CRC values. In computing a CRC, the entire message is treated as one large number (string-o'-bits) which is then divided (using digital slight of hand) by an appropriate prime polynomial, yielding a (nearly unique) remainder. One is never really interested in the "value" of the message as a number, but only in whether the math (performed again later, on retrieval) produces the same remainder.

Not directly a numerical interpretation, but evidently the consequence of this property.

Do I get an "amen" on this?

It bears some relation, but CRC values are not unique - it's quite possible to alter a file and keep the same CRC as a similar file

While this is true, it can be pretty darn hard to keep the length and essential content the same (given adequate CRC size), and the "similar" file would have to be really contrived -- which explains why checksums aren't worthless to begin with. No, the CRCs are not unique, but for a given length of text with substantially the same content, they pretty much are.

We're not describing self-correcting encoding schemes and Hamming distances. The point was that the numeric nature of the message can be exploited this way. The fact that it submits to math in any sort of useful way seems to relate to the "essentially unique number" property of otherwise non-numeric data.


GoedelNumbering is explained among other things in the book GoedelEscherBach.


EditText of this page (last edited November 3, 2005) or FindPage with title or text search