Insertion Sort

See also SortingAlgorithms. For more complete reference material, see http://www.nist.gov/dads/HTML/sort.html


The Insertion Sort works by picking an element, figuring out where it should go, then putting it there.

Using BinarySearch to figure where the item should go gives the following runtime analysis:

                Worst case  Average case  Best case
 Comparisons    O(n log n)   O(n log n)   O(n log n)
 Swaps            O(n^2)       O(n^2)        O(1)

If you use LinearSearch to find the item location this changes to:

                Worst case  Average case  Best case
 Comparisons      O(n^2)       O(n^2)        O(n)
 Swaps            O(n^2)       O(n^2)        O(1)

If you expect your data to be mostly sorted the LinearSearch version might be faster, but BubbleSort might work better anyway. Beware of BinarySearch - it is subtle and error-prone, even when using versions from the published literature.

If comparisons are cheap and swaps expensive you'd be better using SelectionSort.

InsertionSort, SelectionSort and MergeSort are all StableSorts, whereas the famous QuickSort is not.


Code example in C

Note that this code is pretty severely broken. Correct examples welcome.

    void
    insertion_sort (int a[], int n /* the size of array */)
    {
        int i;
        for (i = 1; i < n; i++)
        {
            /* Assume items before a[i] are sorted. */
            /* Pick an number */
            int b = a[i];

/* Do binary search to find out the point where b is to be inserted. */

int low = 0, high = i - 1, k;

while (high - low > 1) { int j = (high + low) / 2;

if (b <= a[j]) high = j; else low = j; }

/* Shift items between high and i by 1 */ for (k = i; k > high; k--) a[k] = a[k - 1];

a[high] = b; } }


Code using upper_bound of STL

  /* Algorithm from stl_algo.h in the SGI stl library */
  int
  upper_bound (int a[], int low, int high, int value)
  {
int len = high - low;
while (len > 0)
{
int half = len / 2;
if (value < a[low + half])
len = half;
else
{
low = low + half + 1;
len = len - (half + 1);
}
}
return low;
  }

void insertion_sort (int a[], int n /* the size of array */) { int i; for (i = 1; i < n; i++) { /* Assume items before a[i] are sorted. */

/* Pick up a number */ int b = a[i];

int up, k; up = upper_bound (a, 0, i, b);

for (k = i; k > up; k--) a[k] = a[k - 1];

a[up] = b; } }

Is this still too long? --TakuyaMurata


Couldn't resist:

  void
  insertion_sort (int a[], int n /* the size of array */)
  {
int i;
for (i = 1; i < n; i++)
{
/* Assume items before a[i] are sorted. */

int high = findInsertionPoint(a, i);

int b = a[i];/* OK, so this was added during refactoring ... */ shiftUp(a, high, i); a[high] = b; } }

int findInsertionPoint (int a[], int i) { /* Pick a number */ int b = a[i];

/* Do binary search to find out the point where b is inserted at. */

int low = 0, high = i - 1;

while (high - low > 1) { int j = (high + low) / 2;

if (b <= a[j]) high = j; else low = j; } return high; }

int shiftUp(int a[], int high, int i) { /* Shift items between high and i by 1 */ int k; for (k = i; k > high; k--) a[k] = a[k - 1]; }

I think this is an example of YAGNI YouArentGonnaNeedIt. -- TakuyaMurata

I don't. See the comment in SelectionSort. It also means you can UnitTest the binary search, which is notoriously difficult to get right first time.

I agree that binary search is rather difficult to implement correctly. -- TakuyaMurata

Aren't most things notoriously difficult to get right first time? In the above case, it's very easy to find a correct algorithm in a book or internet site.

Well, yes, but I think the point about abstracting for clarity still stands. In this case the real YAGNI benefit comes from using a library routine and not writing or looking up any algorithm

Sure, we can use upper_bound of STL though you have to implement it in C. -- TakuyaMurata

I meant that you shouldn't code a sort at all. Are there any languages for which this isn't in the standard library or the language itself now?

Many languages have implementations of sort, but not all implement a stable version when it's possible.

And if you want one of those, you probably shouldn't choose InsertionSort. A copy of Knuth to hand would be a good idea.


Not only is a binary search difficult to implement correctly, it doesn't gain you anything in this case. Every element in the already-sorted set greater than the element being inserted will have to be accessed in order to make room for the element being inserted, so you may as well combine the movement of the data with the comparison:

 void insertion_sort(int a[], int elementCount) {
int i;
for (i = 1; i < elementCount; i++) {
int insertee = a[i], j;

for (j = i - 1; j >= 0 && insertee > a[j]; j--) a[j+1] = a[j]; a[j] = insertee; } }

Incidentally, this approach is also stable, unlike the binary-search version. It's also about half the size.


I think I should design and write a test case that drives my code with the kind of data that I think demands a special kind of sort, and that confirms that the sort routine answers within whatever performance criteria my customer requires. In a call center, for example, the initial redirect to an agent must take place within five seconds or the caller will hang up (this has been confirmed empirically with decades of experience). I think that any sort routine that passes my test is appropriate, regardless of algorithm or origin. If I think I really need an insertion sort, I should find a test case that demonstrates the need.

May we discuss an algorithm without worrying about how we would do it while extreme programming? Pretty please, can we just discuss the algorithm?

Why? It's not as if sort algorithms were a poorly-understood area. Any decent algorithms text will cover them adequately, and there's always Knuth if you want all the details

I think a library of test cases that illustrate the key differences of the various sorting algorithms might be a useful component of any modern instructional text.

So, the recipe (in favor of ExtremeProgramming) should be:

In other words, if you are a computer scientist student, want to kill spare time and want to have fun (like me), write sorting code. -- TakuyaMurata


Moved here from SortingAlgorithms (RefactorMe):

InsertionSort: assume the first section of your array is sorted, take the next element and put it where it has to go. You can use a binary search to find the right place to put it, but on average when you do insert it you have to move half the items already sorted. Time-complexity is either (n-1)*(n-2)/2 comparisons if you use linear search to find the right place to put your item, or a much smaller and more complex expression if you use binary search (roughly log2(n)), and an expected (n-1)*(n-2)/2 moves. Can easily be stable.


CategoryAlgorithm


EditText of this page (last edited April 22, 2008) or FindPage with title or text search