Three Pointers In One Word And One Bit

Read TwoPointersInOneWord for background on this topic.

How about three pointers in one word and one bit? At one time, I owned assembler code (which I did not write) that used a variation of TwoPointersInOneWord to add yet a third pointer. This code was put into production in the late 1970's (or early 80's?) and is still in use today on hundreds of thousands of machines - in fact I run this code virtually every work day millions of times.

The data structure using these pointers implements a binary tree which is tucked inside a each page of a larger B-tree like structure. Since each page is no larger than 64K and aligned on a 64K boundary, only 16 bit addressing (i.e offsets) are needed. The three pointers encoded are parent node, left child, and right child. The XOR technique is used, but the additional trick that makes it work is that each node also has a bit which indicates if it is a left child or a right child. On top of that, the two children of any parent are always allocated at the same time, and occupy contiguous memory. Each node is only three bytes, so the cost of adding two more offsets would have increased storage requirements by 133%!

The XOR relationship is this: each node's stored "pointer" is the XOR of its parent's offset and its left child's offset. The right child is always immediately follows the left child in memory. Starting with the offset of any parent/child pair lets you do traversals of the the tree. (The root is considered its own parent to get started.) As you hit leaf nodes and back up, the left/right bit assists in knowing whether to find the next node by simply adding three to get to the right child, or whether to traverse back up another node the tree until you find another "left" node. There is more complexity but this gives the idea. (Leaf nodes were encoded differently and used the three bytes differently.)

The tree can be traversed left to right, or right to left.

The application is the the kernel level indexing ("machine index") for the IBM System/38 (1980-88) IBM AS/400 (1988-2000) and IBM eServer iSeries (2001-?) machines. This indexing support is used in storage management (paging directories), DB2 index support (yes, IBM DB2/400 is implemented largely in the kernel), security objects, directory objects, and numerous other places. The data structure's design has been basically unchanged since its inception. However, it has since been re-written in C++ with the move to the PowerPc based processor. A new version was also written to handle longer index keys and more of them. -- JohnVriezen

[The above method of counting would, of course, allow an unlimited number of pointers per word.] I'm not sure what you mean by this. Can you elaborate a bit?

For example, try FivePointersInOneWordAndTwoBits: instead of having just left&right children, have four children, and have 2 bits of indexing instead of one.


This should come with a disclaimer: Professional driver on closed course, do not attempt. :-)


CategoryPointer


EditText of this page (last edited November 8, 2014) or FindPage with title or text search