Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

Google   


Heapsort

News

See Also Recommended Books Recommended Links Recommended Papers Donald Knuth
Animations Test suits Reference Presentations Humor Etc

Heapsort is a really cool sorting algorithm. It has an extremely important advantage of worst-case O(n log n) runtime while being an in-place algorithm. Heapsort is not a stable sort. It mainly competes with mergesort, which has the same time bounds, but requires Ω(n) auxiliary space, whereas heapsort requires only a constant amount.

As worst case comparison estimate is lower that in quicksort, heapsort is safer then the latter. And it actually works pretty well in most practically important cases like already sorted arrays (quicksort requires O(n2) comparisons in this case).

Heapsort does not have a locality of reference advantage and thus might runs quicker then alternatives on CPU architectures with small or slow data caches. On the other hand, mergesort has several advantages over heapsort:

Notes:
  • Those pages are written by people for whom English is not a native language. Some amount of grammar and spelling errors should be expected.
  • This is a Spartan WHYFF (We Help You For Free) site. It cannot replace the best teachers and the best books.
  • The site contain some obsolete pages as it develops like a living tree... Some links on older pages are broken. Please try to use Google, Open directory, etc. to find a replacement link (see HOWTO search the WEB for details). We would appreciate if you can mail us a correct link.

Search Amazon by keywords:

Google   
Open directory

Research Index

 

Old News ;-)

The External Heapsort

The memory model assumed for this discussion is a paged memory. The N elements to be sorted are stored in a memory of M pages, with room for S elements per page. Thus N=MS.

The heapsort algorithms seen so far are not very well suited for external sorting. The expected number of page accesses for the basic heapsort in a memory constrained environment (e. g. 3 pages of memory) is tex2html_wrap_inline1347 where N is the number of records and M is the number of disk blocks. We would like it to be tex2html_wrap_inline1353 which typically is orders of magnitude faster. (example: 8 byte element, 4KB page, translates into 512 elements pr page).

NOTE: Be aware that the words 'page access' does not necessarily imply that disk accesses are controlled by the virtual memory subsystem. It may be completely under program control.

One may wonder if the d-heaps described in chapter 5 can be used effeciently as an external sorting method. If we consider the same situation as above, it is seen that if we use the number of elements per page S as the fanout, and align the heap on external memory so that all siblings fit in one page, the depth of the tree will drop with a factor tex2html_wrap_inline1357 and thus the number of page accesses will drop accordingly. As observed in the chapter on d-heaps, the cost of a page access should be weighted against the cost of S comparisons needed to find the maximal child. Anyway, because the cost of a disk access is normally much higher than the time it takes to search linearly through one page of data, this idea is definitely an improvement over the traditional heap.

The drawback of the idea is twofold. Firstly it only improves the external cost with a factor tex2html_wrap_inline1357 , not the desired factor S, and secondly it increases the internal cost with a factor tex2html_wrap_inline1365 . Thus it actually makes sorting much slower in the situation where all data fits in memory. Considering that gigabyte main memories are now commonplace and that terabyte memories will eventually be affordable, it is reasonable to require a good internal performance even of external methods.

The External Heapsort

Heapsort is an internal sorting method which sorts an array of n records in place in O(n log n) time. Heapsort is generally considered unsuitable for external random-access sorting. By replacing key comparisons with merge operations on pages, it is shown how to obtain an in-place external sort which requires O(m log m) page references, where m is the number of pages which the file occupies. The new sort method (called Hillsort) has several useful properties for advanced database management systems. Not only does Hillsort operate in place, i.e., no additional external storage space is required assuming that the page table can be kept in core memory, but accesses to adjacent pages in the heap require one seek only if the pages are physically contiguous. The authors define the Hillsort model of computation for external random-access sorting, develop the complete algorithm and then prove it correct. The model is next refined and a buffer management concept is introduced so as to reduce the number of merge operations and page references, and make the method competitive to a basic balanced two-way external merge. Performance characteristics are noted such as the worst-case upper bound, which can be carried over from Heapsort, and the average-case behavior, deduced from experimental findings. It is shown that the refined version of the algorithm which is on a par with the external merge sort.

Assumptions and Implications of the Heapsort Example

[PDF] Heapsort / Treesort
File Format: PDF/Adobe Acrobat - View as HTML
This is what the procedure Heapify (the key procedure in Heapsort),. in effect,
does. We can describe Heapsort in template form as follows; ...
https://www.cs.tcd.ie/courses/ ba/2ba2/14_heapsort/heapsort.pdf - Similar pages

HEAPSORT
There are three procedures involved in HeapSort: Heapify, Buildheap, and HeapSort
itself, ... Now at last we describe Heapsort, beginning with Heapify. ...
www.cs.umd.edu/class/spring2005/cmsc351/notes/heapsort/ - 17k - Cached - Similar pages

Heapsort
In the figure you see the function heapsort within the functional structure of the
... The function heapsort is not called by any function of the algorithm. ...
olli.informatik.uni-oldenburg.de/ heapsort_SALA/english/Funktionen/heapsort.html - 11k -

Class 1 Introduction

Recommended Links

Citations Algorithm 232 Heapsort - Williams (ResearchIndex)

Princeton Computer Science Technical Reports

Analysis of Heapsort (Thesis)
Authors: Schaffer, Russel W.
Date: June 1992
Pages: 85
Download Formats:

Sorting, Part 2

[PDF] 8.3 Heapsort

Eppstein's paper Sorting by divide-and-conquer

Priority Queues and Heapsort in Java

NIST heapsort A sort algorithm that builds a heap, then repeatedly extracts the maximum item. Run time is O(n log n). A kind of in-place sort. Specialization (... is a kind of me.)adaptive heap sort, smoothsort. See also selection sort.

Note: Heapsort can be seen as a variant of selection sort in which sorted items are arranged (in a heap) to quickly find the next item.

Author: PEB

Lang's definitions, explanations, diagrams, Java applet, and pseudo-code (C), (Java), (Java), Worst-case behavior annotated for real time (WOOP/ADA), including bibliography, (Rexx).

demonstration. Comparison of quicksort, heapsort, and merge sort on modern processors.

Robert W. Floyd, Algorithm 245 Treesort 3, CACM, 7(12):701, December 1964.
J. W. J. Williams, Algorithm 232 Heapsort, CACM, 7(6):378-348, June 1964.
Although Williams clearly stated the idea of heapsort, Floyd gave a complete, efficient implementation nearly identical to what we use today.

Heapsort - Wikipedia, the free encyclopedia

Handbook of Algorithms and Data Structures

Heapsort Introduction

Heap Sort - CS245 Project By Joseph George

The heapsort algorithm uses the data structure called the heap. A heap is defined as a complete binary tree in which each node has a value greater than both its children (if any). Each node in the heap corresponds to an element of the array, with the root node corresponding to the element with index 0 in the array. Considering a node corresponding to index i, then its left child has index (2*i + 1) and its right child has index (2*i + 2). If any or both of these elements do not exist in the array, then the corresponding child node does not exist either. Note that in a heap the largest element is located at the root node. The code for the algorithm is:

    Heapify(A, i){
        l <- left(i)
        r <- right(i)
        if l <= heapsize[A] and A[l] > A[i]
            then largest <- l
            else largest <- i
        if r <= heapsize[A] and A[r] > A[largest]
            then largest <- r
        if largest != i
            then swap A[i] <-> A[largest]
                Heapify(A, largest)
    }

    Buildheap(A){
        heapsize[A] <- length[A]
        for i <- |length[A]/2| downto 1
            do Heapify(A, i)
    }

    Heapsort(A){
        Buildheap(A)
        for i <- length[A] downto 2
            do swap A[1] <-> A[i]
            heapsize[A] <- heapsize[A] - 1
            Heapify(A, 1)
    }

 Heap Sort operates by first converting the initial array into a heap. The HeapSort algorithm uses a process called Heapify to accomplish its job. The Heapify algorithm, as shown above, receives a binary tree as input and converts it to a heap. The root is then compared with its two immediate children, and the larger child is swapped with it. This may result in one of the left or right subtrees losing their heap property. Consequently, the Heapify algorithm is recursively applied to the appropriate subtree rooted at the the node whose value was just swapped with the root and the process continues until either a leaf node is reached, or it is determined that the heap property is satisfied in the subtree.

    The entire HeapSort method consists of two major steps:

    STEP 1: Construct the initial heap using adjust (N/2) times on all non-leaf nodes.
    STEP 2: Sort by swapping and adjusting the heap (N-1) times.

    In step 2, since the largest element of the heap is always at the root, after the initial heap is constructed, the root value is swapped with the lowest, rightmost element. This node corresponds to the last element of the array and so after the swapping the righmost element of the array has the largest value, which is what we want it to have. Consequently, the rightmost element is correctly place, and so the corresponding node becomes a light gray in the tree display depicting the heap. Also, the elements (nodes) chosen for swapping at this point of the algorithm are shown as yellow nodes.

    Thereafter, since the value of the lowest, rightmost node has moved up to the root, the heap property of the rest of the heap (minus the gray nodes) has been disturbed. Consequently, now the Heapify process is applied to the rest of the heap (treating the light gray nodes as though they were no longer a part of the heap). Once the rest of the heap is again converted to a proper heap, the swapping process as described in the previous paragraph, followed by the Heapify process is applied again. This continues until the root node of the heap turns light gray - i.e., the root has its correct value. Then, the array is sorted.

 At each step of the Heapify algorithm, a node is compared to its children and one of them is chosen as the next root. One level of the tree is dropped at each step of this process. Since the heap is a complete binary tree, there are atmost log(N) levels. Thus, the worst case time complexity of Heapify is O(log(N)). In the Heapsort algorithm, Heapify is used (N/2) times to construct the initial heap and (N-1) times to sort. Thus the Heapsort algorithm has a worst case time complexity of O(N*log(N)).

 

Heapsort

 

Priority Queues and Heapsort Lecture 18

Heapsort

Lecture 4 - heapsort

Data Structures and Algorithms Heap Sort

 

Presentations

34-heapsort

PDF] • Implementation • HeapSort • Bottom-Up Heap Construction • Locators
File Format: PDF/Adobe Acrobat - View as HTML

Animation

Heapsort-Animation

Java Sorting Animation Page - Heap Sort

Data Structures and Algorithms Heap Sort with nice animation

Heap sort visualization: Java applet
Developed by Mike Copley, UH for ICS 665 (Prof. J. Stelovsky)

Animation30

Reference

GNU Scientific Library -- Reference Manual - Sorting

Function: void gsl_heapsort (void * array, size_t count, size_t size, gsl_comparison_fn_t compare)
 

This function sorts the count elements of the array array, each of size size, into ascending order using the comparison function compare. The type of the comparison function is defined by,

int (*gsl_comparison_fn_t) (const void * a,
                            const void * b)

A comparison function should return a negative integer if the first argument is less than the second argument, 0 if the two arguments are equal and a positive integer if the first argument is greater than the second argument.

For example, the following function can be used to sort doubles into ascending numerical order.

int
compare_doubles (const double * a,
                 const double * b)
{
    if (*a > *b)
       return 1;
    else if (*a < *b)
       return -1;
    else
       return 0;
}

The appropriate function call to perform the sort is,

gsl_heapsort (array, count, sizeof(double), 
              compare_doubles);

Note that unlike qsort the heapsort algorithm cannot be made into a stable sort by pointer arithmetic. The trick of comparing pointers for equal elements in the comparison function does not work for the heapsort algorithm. The heapsort algorithm performs an internal rearrangement of the data which destroys its initial ordering.

 

HeapSort (Docs for JDSL 2.1) Java

Less Popular Downloadable Utilities

Heapsort implementation unit for Turbo Pascal

Index of -ftp-benchmarks-heapsort

Heapsort in C see heapsort for the source

Heapsort.c results are included below.

The program (heapsort.c) and latest results (heapsort.tbl) can be obtained via anonymous ftp from 'ftp.nosc.mil' in directory 'pub/aburto'. The ftp.nosc.mil IP address is: 128.49.192.51

Please send new results (new machines, compilers, compiler options) to: aburto@nosc.mil. I will keep the results up-dated and post periodically to 'comp.benchmarks'. Any comments or advice or whatever greatly appreciated too.

Heapsort.c Test Results: MIPS is relative to unoptimized gcc 2.1 (gcc -DUNIX) instruction count using assembly output for 80486 (80386 code). The instruction count is divided by the average runtime (for 1 loop) to obtain the RELATIVE MIPS rating for memory sizes from 8000 bytes up to 2048000 bytes. These will not necessarily be like other MIPS results because the instruction mix and weightings are different! This is just a heap sort test program which I have attempted to calibrate to a relative MIPS rating. I suppose we can call this a 'HeapMIPS' rating :)

PLEASE NOTE: The Sun system results can be quite erratic unless the '-Bstatic'option is used (Sun C 2.0/3.0 comment).

Heapsort data

HeapSort - A compact routine that sorts data in place in Basic

Manual Page For heapsort(3) BSD and Apple

heapsort.python

Code-Source Code (Heapsort)

 


Copyright © 1996-2007 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: February 28, 2008