Little Endian

little-endian

CPU instructions to use multi-byte numbers can take a BigEndian approach, where the most significant bytes of a multi-byte number have the lowest memory addresses. Or the approach can be little-endian with the least significant bytes first (example: ExEightySix).

A related but distinct problem is the approach chosen for indexing a bit in a register. Some CPU instructions have a 3-bit field that designate what bit of a byte is to be operated on. On some processors, a zero in this field accesses the most significant bit of the byte (as determined by the CPU's other instructions for arithmetic operations). Other processors use 7 to address the most significant bit.


One of the reasons little-endian was chosen for many processors a few decades ago was that a pointer to a little endian number can be more usefully treated as a pointer to multiple data types. For example, consider the number

01 02 03 04

as a little endian number (two hexadecimal digits per byte). This then has the value

4*2^12 + 3*2^8 + 2*2^4 + 1

when treated as a 32-bit value. When treated as a 16-bit value, it has the value

2*2^4 + 1

i.e., the 16-bit truncation of the former value. BigEndian numbers do not have this property. --AdamBerger

Pardon the ThreadMode, but what does that have to do with usefulness? Consider that the 16-bit interpretation of the number is a different value than the 32-bit. I think the real underlying reason is that Intel chose LittleEndian to reduce the number of transistors needed for manipulation, and Motorola chose BigEndian to reduce the burden of the programmer looking at memory.

Regardless of the motivation of the respective cpu architects, Adam is correct that LittleEndian and BigEndian have mutually exclusive useful properties -- and algorithms that make use of those properties are the ones that cause problems in porting programs between the two kinds of architectures.

Adam is saying that LittleEndian architectures allow storage of 16 bit integers simply by truncating the 4-byte stream that would be emitted for a full 32 bit integer.

Another handy trick on LittleEndian widely used in interpreted languages (and first pioneered by Lisp) is to use the low order bits of a word to encode the type of the word's value (e.g. a union of a char, int, and word pointer). The equivalent on BigEndian encodes the info in the high order bits, but depending on the scheme, one architecture or the other can be noticeably more efficient due to need for shift versus only a mask, etc. Alas, when porting such algorithms to a cpu with the opposite byte order, one may have to completely redesign the encoding scheme.

Obviously people have long advocated avoiding such tricks and just storing things in network byte order (BigEndian), but it is not uncommon for these issues to arise in inner loops where using network byte order can actually significantly slow down a program (I have personally seen 30% and even 300% slowdowns from using non-native byte order in common programs such as interpreters).


Contrast with BigEndian. See also OnHolyWarsAndaPleaForPeace.


EditText of this page (last edited July 28, 2006) or FindPage with title or text search