Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

Google   


Insertion sort

Old News

See Also Recommended Books Recommended Links Donald Knuth
Animations Test suits History Humor Etc

There are two major modifications of insertion sort: linear and binary. They can be distinguished by the method of finding a place to insert the current element. It's natural to expect that for large sets binary search can  speed insertions several times and thus achieve higher sorting speed at the cost of some additional complexity.

But linear insertion is easier to explain and let's start with it. We will assume that we are sorting an array in ascending order. 

Insertion sort always maintains two zones in the array to be sorted: sorted and unsorted. At the beginning the sorted zone consist of one element. On each step the algorithms expand it by one element inserting the first element from the unsorted zone in the proper place in the sorted zone and shifting all larger elements one slot down. It is an algorithms that many people intuitively use for sorting cards and it is very easy to illustrate on the deck of cards.  Here is an example (sorted zone is in blue, unsorted is in red):

5 | 3 1 7 0 -> 3 5 | 1 7 9 -> 1 3 5 | 7 9 ->  1 3 5 7 | 9 -> 1 3 5 7 9

Since multiple keys with the same value are placed in the sorted array in the same order that they appear in the input array, insertion sort is stable.

If input data contains already sorted fragments insertion sort works slightly faster, but unless the sorted fragment is found of the very beginning of the data set, it moved keys one by one.

If the input data are sorted in the opposite direction, insertion soft cannot take advantage of this.

Out of three elementary sort algorithms insertion sort uses fewer comparisons, so it might be faster when comparisons are expensive, for example long similar strings. 

C implementation usually contains two nested loops:

Please note that C language control structures are not very well suited for precise implementation of the optional control graph of sorting or searching algorithms one should avoid temptation to adjust algorithm to the capabilities of the language, the error that is typical in so many textbooks.

Here is sample non-optimal insertion sort implementation that contains the flaws that I mentioned:

void insertionSort(int a[], int array_size)
    {
      int i, j, current;

      for (i=1; i < array_size; i++)
      {
        current = a[i];
        j = i; # index of the end of sorted region
        while ((j > 0) && (a[j-1] > current))
        {
          a[j] = a[j-1];
          j = j - 1;
        }
        a[j] = current;
      }
    }

Here is a slightly better implementation that avoids using the second for loop that I prepared for my algorithms class many years ago (note that the author is no Knuth and algorithms is somewhat skewed to use available in C control structures; this is not how this algorithm should be programmed in assembler):

/* insert sort */
/* Nikolai Bezroukov, 1996 */

#include <stdio.h>
#include <stdlib.h>

typedef int itemtype;     // type of item to be sorted
int total_comp=0;
int total_moves=0;

void insertSort(
itemtype *a, // array to be sorted
int n  // size of the array; note that (n-1) is the upper index of the array
)
{
    itemtype t;
    int i, j;

   /* Outer "boundary definition" loop */
    for (i = 1; i < n; i++) {
         if ( a[i-1]<=a[i] ) { total_comp++; continue;}

        t = a[i]; total_moves++;

        /* inner loop: elements shifted down until insertion point found  */
        for (j = i-1; j >= 0; j--) {
             total_comp++;
             if ( a[j] <= t ) { break; }   
             a[j+1] = a[j]; total_moves++;
        }
        /* insert */
        a[j+1] = t; total_moves++;
    }

}


 


Notes:
  • Those pages are written by people for whom English is not a native language. Some amount of grammar and spelling errors should be expected.
  • This is a Spartan WHYFF (We Help You For Free) site. It cannot replace the best teachers and the best books.
  • The site contain some obsolete pages as it develops like a living tree... Some links on older pages are broken. Please try to use Google, Open directory, etc. to find a replacement link (see HOWTO search the WEB for details). We would appreciate if you can mail us a correct link.

Search Amazon by keywords:

Google   
Open directory

Research Index

 

Recommended Links

Insertion Sort (ciips.ee.uwa.edu.au)

Insertion Sort (nice description)

Insertion Sort (www.personal.kent.edu)

Insertion Sort simple animation

CCAA - Sorting Algorithms

binary insertion sort -- an optimization of the insertion sort where the proper location is found by binary search.

Binary Insertion Sort

bender...004librarysort.ps PS.gz  PS  PDF  Image  Update  Help

Insertion Sort (linux.wku.edu)

The insertion sort works just like its name suggests - it inserts each item into its proper place in the final list. The simplest implementation of this requires two list structures - the source list and the list into which sorted items are inserted. To save memory, most implementations use an in-place sort that works by moving the current item past the already sorted items and repeatedly swapping it with the preceding item until it is in place.

Like the bubble sort, the insertion sort has a complexity of O(n2). Although it has the same complexity, the insertion sort is twice as efficient as the bubble sort.

Pros: Relatively simple and easy to implement.
Cons: Inefficient for large lists.

 

Abstract: Traditional INSERTION SORT runs in O(n&sup2;) time because each insertion takes O(n) time. When people run INSERTION SORT in the physical world, they leave gaps between items to accelerate insertions. Gaps help in computers as well. This paper shows that GAPPED INSERTION SORT has insertion times of O(log n) with high probability, yielding a total running time of O(n log n) with high probability. (Update)

insertionsort

Dichotomic insertion consists in using a dichotomic search to determine where to insert the new item, taking advantage of the fact the sub array is already sorted. This yield the most efficient algorithm in terms on number of comparisons, with a guaranteed bound of sum(log2 i) for i in [1..n-1], or log2 (n-1)!, which is strictly less than n log2 n. Unfortunately, this algorithm is not efficient in practice, because, to the opposite of Selection sort, it does on average of n*n exchanges. There are two variants, depending if we assume the array to be almost sorted (DichoSort) or random or inverted (DichoSort2).


Copyright © 1996-2007 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: February 28, 2008