Two Pointers In One Word

Consider the case where information arrival and lifetime suggest the use of a linked list, and access patterns indicate the need for double linking, but the per-node storage overhead of two pointers per node is unacceptable given acute memory shortage. You must encode two pointers worth of bits into the space of one by factoring out some information that can be held elsewhere.

Therefore: ExclusiveOr? the two pointers together and store the result in one word. Make sure that you always have pointers to at least two adjacent nodes. Add header or trailer nodes if necessary. Exclusive-or the address of one node with the data in the adjacent node to reconstruct one pointer, do the reverse to reconstruct the other. Add and subtract can be used in place of exclusive-or with careful attention to overflow.

This tip was offered by an undergraduate algorithms instructor at Purdue University in the early '70s under the name Two pointers for the price of one plus tax. I considered it largely pedagogical, though I would not ever hesitate to use it should the need arise. It hasn't yet. -- WardCunningham


If memory is so tight that adding a second pointer to a data structure is unacceptable, I would suggest not using linked lists at all (or dynamic memory allocation for that matter). This is often the case in embedded systems and they usually make do with fixed size tables. Exclusive-Or-ing of pointers is interesting, but not a practical solution. --WayneMack

One thing I wonder is how much larger the code segment gets when you do this trick. You may save some of the heap, but if you sacrifice more on the code segment, you aren't necessarily winning. Then again, using a singly-linked list instead may also create more code depending on the algorithm. Moreover, there's a balance between cycles and memory. A lot of skill on embedded devices doesn't come from being ThreeStarProgrammers (though, that helps), but from knowing how to make that trade off. -- SunirShah

I'm speculating here, but it seems that the trick will cost you dozens of bytes in terms of code (I'll try it ASAP). The trick hurts understanding of the code, but the magic can be encapsulated in iterators. What worse, acting this way you loose the ability to come to some element of the list (say, through some lookup of pointer to it) and then traverse to any direction. You're limited to traversing from the head or from the tail. But for me this is just plain fun. Never come to situation where I've to do such tricks :)) -- PavelPerikov

If you use this trick and then want to build a conservative garbage collector into your program, you will regret it. :-) Anyway, I agree with what Wayne said above: the main use for doubly, as opposed to singly, linked lists is the ability to grab any node and walk in either direction from it. You lose that when you do the xor-links trick.

However, note that now the same code will either walk the list from begin to end, or from end to begin. So if you need to do both, you save one routine and code size might actually decrease.

-- StephanHouben

Can someone give a simple example? I fail to see how this XOR technique saves storage at all. Surely, neither XOR nor addition/subtraction is magically a method of compressing addresses, whether for a list or any other purpose.

        #include <stdio.h>
        #include <stdlib.h>
        #include <assert.h>

/* pray that a long is the size of a pointer */ typedef unsigned long pointer;

typedef struct { pointer next_prev; int payload; } link;

link* add_data(int payload, link* list) { link *new_link = malloc(sizeof(link)); assert(new_link); new_link->next_prev = (pointer)list; new_link->payload = payload; if (list != NULL) list->next_prev ^= (pointer)new_link; return new_link; }

void walk_list(link *list) { pointer prev = 0; while (list != NULL) { pointer next = prev ^ list->next_prev; printf("%d ", list->payload); prev = (pointer)list; list = (link*)next; } printf("\n"); }

int main(void) { /* create a list */ link *l2 = add_data(2, NULL); /* add something to the front ... */ link *l1 = add_data(1, l2); /* add something to the back ... */ link *l3 = add_data(3, l2);

/* walk from front to back */ walk_list(l1); /* walk from back to front */ walk_list(l3);

return 0; }

I said "simple". Suppose three linked records, stored at A, B and C are to hold a zero value each (plus linking info). Assuming one opts for the add/subtract version, what are the link values? On second thought, I follow the method, but the original description is a bit confusing. Moreover, the external links need to be doubled since one needs two (consecutive) links to start traversing. Things get slightly awkward if one finds a record in one list, but then needs to access the adjacent nodes of a second list containing the same record. Where does one get the extra link needed so that one can use the second list?

You'd have to iterate through the second list, just to get the link. See Pavel's contribution above. -- ThomasHolenstein?

You're right, and I perceive the duplication. But the whole point of wanting double linking in the first place is to avoid searching through the second list. If that weren't the case, one would make the secondary lists singly linked. Moreover, where does one hold any (root) address of the second list, when there are numerous such lists?

Never mind about the duplication. Furthermore, you are completely right about your second point: It would be a bad idea to use this pattern if you have numerous such lists. (As a side note: I should mention I am not the one who wrote the nice code above) -- ThomasHolenstein?

I was reading rather fast initially. Maybe the intro could have said, "Store xptr at each node, where xptr is the Xor of the pointers to the two adjacent nodes; either pointer is then obtainable as the Xor of the other with xptr." Adding 'at each node' and avoiding 'do the reverse' seems to help. When I first used double linking, it was thought it might assist debugging, which turned out to be correct. Thorough testing would also help, of course, especially if one builds in recovery from 'out of memory'.


See also ThreePointersInOneWordAndOneBit.


It goes without saying that this sorta thing is technically UndefinedBehavior, and thus not portable. (This goes for PointerSwizzling? in general). In addition, if you use a GarbageCollector with your C code, it will probably break things (if there are no visible pointers to a heap object in the RootSet? or anything reachable from it, the GC may delete the object in question).

It might used if they are using offsets into a file rather than memory pointers, though.


There is an excercise in TheArtOfComputerProgramming to determine how to achieve the linkage with only room for one address. This is the answer, and I did not come up with it. (It's perfectly well defined in MIX, of course.)

I've been surprised by other magical techniques in there, like using a single 4-word data structure for heterogenous data types, and not allocating space for the unused portion. This lead to several lines of assembly that each contained a pointer to three lines before them, where there was nothing sensible. It took me longer than is reasonable to see that those were each a one-element circular list.

--JesseMillikan


See citation in ThreeStarProgrammer, and FunWithExclusiveOr


CategoryCoding CategoryPointer


EditText of this page (last edited November 8, 2014) or FindPage with title or text search