|
Softpanorama |
May the source be with you, but remember the KISS principle ;-)
|
| See Also | Recommended Books | Recommended Links | Lecture Notes and E-books | Donald Knuth | TAOCP Volume3 | Animations | |
| Linear Search | Binary Search | Hashing | Trees | AVL trees | Fibonacci search | Humor | Etc |
Searching algorithms are closely related to the concept of dictionaries. Dictionaries are data structures that support search, insert, and delete operations. One of the most effective representations is a hash table. Typically, a simple function is applied to the key to determine its place in the dictionary. Another efficient search algorithms on sorted tables is binary search.
If the dictionary is not sorted then heuristic methods of dynamic reorganization of the dictionary are of great value. One of the simplest are cache-based methods: several recently used keys are stored a special data structure that permits fast search (for example is always sorted). For example keeping the last N recently found values at the top of the table (or list) dramatically improved performance. Other cache-based approach are also possible. In the simplest form the cache can be merged with the dictionary:
move-to-front method A heuristic that moves the target of a search to the head of a list so it is found faster next time.
transposition method Search an array or list by checking items one at a time. If the value is found, swap it with its predecessor so it is found faster next time.
In the paper
On self-organizing sequential search heuristics, Communications of the ACM, v.19
n.2, p.63-67, Feb. 1976 , R.L. Rivest presents a set of methods for
dynamically reordering a sequential list containing
|
About: C Minimal Perfect Hashing Library is a portable LGPL library to create and to work with minimal perfect hashing functions. The library encapsulates the newest and more efficient algorithms available in the literature in an easy-to-use, production-quality, fast API. The library is designed to work with big entries that cannot fit in the main memory. It has been used successfully for constructing minimal perfect hashing functions for sets with billions of keys.
Changes: This version adds the internal memory bdz algorithm and utility functions to (de)serialize minimal perfect hash functions from mmap'ed memory regions. The new bdz algorithm for minimal perfect hashes requires 2.6 bits per key and is the fastest one currently available in the literature.
This research is all concerned with the development of an analysis and simulations of the effect of mixed deletions and insertions in binary search trees. Previously it was believed that the average depth of a node in a tree subjected to updates was decreased towards an optimal O(log n) when using the usual algorithms (See e.g. Knuth Vol. 3 [8] ) This was supported to some extent by a complete analysis for trees of only three nodes by Jonassen and Knuth [7] in 1978. Eppinger [6] performed some simulations showing that this was likely false, and conjectured, based on the experimental evidence, that the depth grew as O(log^3 n), but offered no analysis or explanation as to why this should occur. Essentially, this problem concerning one of the most widely used data structures had remained open for 20 years.Both my MSc [1] and Ph.D. [2] focused on this problem. This research indicates that in fact the depth grows as O(n^{1/2}). A proof under certain simplifying assumptions was given in the theoretical analysis in Algorithmica [3], while a set of simulations was presented together with a less formal analysis of the more general case, in the Computer Journal [4] , intended for a wider audience. A complete analysis for the most general case is still open.
More recently P. Evans [5] has demonstrated that asymmetry may be even more dangerous in smaller doses! Algorithms with only occasional asymmetric moves tend to develop trees with larger skews, although the effects take much longer.
Publications
- Joseph Culberson. Updating Binary Trees. MSc Thesis, University of Waterloo Department of Computer Science, 1984. (Available as Waterloo Research Report CS-84-08.) Abstract
- Joseph Culberson. The Effect of Asymmetric Deletions on Binary Search Trees. Ph. D. Thesis University of Waterloo Department of Computer Science, May 1986. (Available as Waterloo Research Report CS-86-15.) Abstract
- Joseph Culberson J. Ian Munro. Analysis of the standard deletion algorithms in exact fit domain binary search trees. Algorithmica, vol 6, 295-311, 1990. Abstract
- Joseph Culberson J. Ian Munro. Explaining the behavior of binary search trees under prolonged updates: A model and simulations. The Computer Journal, vol 32(1), 68-75, February 1989. Abstract
- Joseph C. Culberson and Patricia A. Evans Asymmetry in Binary Search Tree Update Algorithms Technical Report TR94-09
Other References
- Jeffery L. Eppinger. An empirical study of insertion and deletion in binary trees. Communications of the ACM vol. 26, September 1983.
- Arne T. Jonassen and Donald E. Knuth A trivial algorithm whose analysis isn't. Journal of Computer and System Sciences, 16:301-322, 1978.
- D. E. Knuth Sorting and Searching Volume III, The Art of Computer Programming Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1973.
In [3], R.L. Rivest presents a set of methods for dynamically reordering a sequential list containing
N records in order to increase search efficiency. The methodA (fori i between 1 andN ) performs the following operation each time that a recordR has been successfully retrieved: MoveR forwardi positions in the list, or to the front of the list if it was in a position less thani . The methodA 1 is called the transposition method, and the methodA is called the move-to-front method. N -1
Linear search - encyclopedia article about Linear search.
In the previous labs we have already dealt with linear search, when we talked about lists. Linear Search in Scheme is probably the simplest example of a storage/retrieval datastructure due to the number of primitive operations on lists that we can use. For instance, creation of the datastructure just requires defining a null list.
There are two types of linear search we will examine. The first is search in an unordered list; we have seen this already in the lab about cockatoos and gorillas. All of the storage retrieval operations are almost trivial: insertion is just a single cons of the element on the front of the list. Search involves recurring down the list, each time checking whether the front of the list is the element we need. Deletion is similar to search, but we cons the elements together as we travel down the list until we find the element, at which point we return the cdr of the list. We also did analysis on these algorithms and deduced that the runtime is O(n).
C++ Examples - Linear Search in a Range
On the Competitiveness of Linear Search
C++ Notes: Algorithms: Linear Search
CSC 108H - Lecture Notes
... Linear Search (Sequential). Start
with the first name, and continue looking until
x is found. Then print corresponding phone number. ... Linear
Search (on a Vector). ...
Linear Search: 1.1 Description/Definition
The simplest of these searches is the Linear
Search. A Linear Search is simply searching
through a line of objects in order to find a particular item. ...
Linear Search
Linear Search. No JDK 1.3 support, applet not shown.
www.cs.usask.ca/.../csconcepts/2002_7/static/ tutorial/introduction/linearsearchapp/linearsearch.html - 2k -Cached - Similar pages
Move to the front (self-organizing) modification
Linear Search the Transpose Method
Here the heuristic is slightly different from the Move-To-Front method : once found, the transition is swapped with the immediately preceding one, performing an incremental bubble sort at each access.
Of course, these two techniques provide a sensitive speed-up only if the probabilities for each transition to be accessed are not uniform.
These are the seven representations that caught our attention at first but that set of containers will be broadened when a special need is to be satisfied.
What is obvious here is that each of these structures has advantages and drawbacks, whether on space complexity or on time complexity. All these constraints have to be taken in account in order to construct code adapted to particuliar situations.Linear Search : the Move-To-Front Method
Self-organizing linear search
... Self-organizing linear search. Full text, pdf formatPdf (1.73 MB). ... ABSTRACT Algorithms
that modify the order of linear search lists are surveyed. ...
portal.acm.org/citation.cfm?id=5507 - Similar pages
Definition: Search a sorted array by repeatedly dividing the search interval in half. Begin with an interval covering the whole array. If the value of the search key is less than the item in the middle of the interval, narrow the interval to the lower half. Otherwise narrow it to the upper half. Repeatedly check until the value is found or the interval is empty.
Generalization (I am a kind of ...)
dichotomic search.Aggregate parent (I am a part of or used in ...)
binary insertion sort, ideal merge, suffix array.Aggregate child (... is a part of or used in me.)
divide and conquer.See also linear search, interpolation search, Fibonaccian search, jump search.
Note: Run time is O(ln n). The search implementation in C uses 0-based indexing.
Author: PEB
Implementation
search (C) n is the highest index of the 0-based array; possible key indexes, k, are low < k <= high in the loop. (Scheme), Worst-case behavior annotated for real time (WOOP/ADA), including bibliography.
en.wikipedia.org/wiki/Binary_search
Binary Search Trees http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/niemann/s_bin.htm
Binary Search Tree http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/binarySearchTree.htm
(see also Libraries -- there are a several libraries that implementat AVL trees)
(see The Fibonacci Numbers and Golden section in Nature for information about Fibinacci, who introduced the decimal number system into Europe and first suggested a problem that lead to Fibonacci numbers)
Here is the method for the continuous case:
The fibonacci search method minimizes the maximum number of evaluations needed to reduce the interval of uncertainty to within the prescribed length. For example, it will reduce the length of a unit interval [0,1] to 1/10946 (= .00009136) with only 20 evaluations. In the case of a finite set, fibonacci search finds the maximum value of a unimodal function on 10,946 points with only 20 evaluations, but this can be improved -- see lattice search.
For very large N, the placement ratio (g_N) approaches the golden mean, and the method approaches the golden section search. Here is a comparison of interval reduction lengths for fibonacci, golden section and dichotomous search methods. In each case N is the number of evaluations needed to reduce length of the interval of uncertainty to 1/F_N. For example, with 20 evaluations dichotomous search reduces the interval of uncertainty to .0009765 of its original length (with separation value near 0). The most reduction comes from fibonacci search, which is more than an order of magnitude better, at .0000914. Golden section is close (and gets closer as N gets larger).
Evaluations Fibonacci Dichotomous Golden section
N 1/F_N 1/2^|_N/2_| 1/(1.618)^(N-1)
============================================================
5 .125 .25 .146
10 .0112 .0312 .0131
15 .00101 .00781 .00118
20 .0000914 .000976 .000107
25 .00000824 .0002441 .0000096
=============================================================
Here is a biography of Fibonacci.
N F_N
==============
0 1
1 1
2 2
3 3
4 5
5 8
6 13
7 21
8 34
9 55
10 89
20 10946
50 2.0E10
100 5.7E20
==============
Named after the 13th century mathematician who discovered it, this sequence has many interesting properties, notably for an optimal univariate optimization strategy, called fibonacci search
Copyright © 1996-2007 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: May 06, 2008