Information Retrieval

By C. J. van Rijsbergen
Introduction

Information retrieval is a wide, often loosely defined term, but in these pages I shall be concerned only with automatic information retrieval systems: automatic as opposed to manual, and information as opposed to data or fact. Unfortunately the word information can be very misleading. In the context of information retrieval (IR), information, in the technical meaning given in Shannon's theory of communication, is not readily measured (Shannon and Weaver). In fact, in many cases one can adequately describe the kind of retrieval by simply substituting 'document' for 'information'. Nevertheless, 'information retrieval' has become accepted as a description of the kind of work published by Cleverdon, Salton, Sparck Jones, Lancaster and others. A perfectly straightforward definition along these lines is given by Lancaster: 'Information retrieval is the term conventionally, though somewhat inaccurately, applied to the type of activity discussed in this volume. An information retrieval system does not inform (i.e. change the knowledge of) the user on the subject of his inquiry. It merely informs on the existence (or non-existence) and whereabouts of documents relating to his request.' This specifically excludes question-answering systems as typified by Winograd and those described by Minsky. It also excludes data retrieval systems such as those used by, say, the stock exchange for on-line quotations....
The structure of the book
The introduction presents some basic background material, demarcates the subject and discusses loosely some of the problems in IR. The chapters that follow cover topics in the order in which I would think about them were I about to design an experimental IR system. They begin by describing the generation of machine representations for the information, and then move on to an explanation of the logical structures that may be arrived at by clustering. There are numerous methods for representing these structures in the computer, or in other words, there is a choice of file structures to represent the logical structure, so these are outlined next. Once the information has been stored in this way we are able to search it, hence a discussion of search strategies follows. The chapter on probabilistic retrieval is an attempt to create a formal model for certain kinds of search strategies. Lastly, in an experimental situation all of the above will have been futile unless the results of retrieval can be evaluated. Therefore a large chapter is devoted to ways of measuring the effectiveness of retrieval. In the final chapter I have indulged in a little speculation about the possibilities for IR in the next decade.
The two major chapters are those dealing with automatic classification and evaluation. I have tried to write them in such a way that each can be read independently of the rest of the book (although I do not recommend this for the non-specialist).
Outline
  • Chapter 2: Automatic Text Analysis - contains a straightforward discussion of how the text of a document is represented inside a computer. This is a superficial chapter but I think it is adequate in the context of this book.
  • Chapter 3: Automatic Classification - looks at automatic classification methods in general and then takes a deeper look at the use of these methods in information retrieval.
  • Chapter 4: File Structures - here we try and discuss file structures from the point of view of someone primarily interested in information retrieval.
  • Chapter 5: Search Strategies - gives an account of some search strategies when applied to document collections structured in different ways. It also discusses the use of feedback.
  • Chapter 6: Probabilistic Retrieval - describes a formal model for enhancing retrieval effectiveness by using sample information about the frequency of occurrence and co-occurrence of index terms in the relevant and non-relevant documents.
  • Chapter 7: Evaluation - here I give a traditional view of the measurement of effectiveness followed by an explanation of some of the more promising attempts at improving the art. I also attempt to provide foundations for a theory of evaluation.
  • Chapter 8: The Future - contains some speculation about the future of IR and tries to pinpoint some areas of research where further work is desperately needed.
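The kind of probabilistic weighting sketched for Chapter 6 can be illustrated with the classic relevance weight built from occurrence counts in relevant and non-relevant documents. This is only an illustrative Python sketch of the general idea; the smoothing constants and example numbers are mine, not the book's.

```python
import math

def relevance_weight(r, n, R, N):
    """Relevance weight for one index term, from sample occurrence counts.

    r: relevant documents containing the term
    n: documents containing the term
    R: relevant documents in the collection
    N: documents in the collection
    The 0.5 terms are the usual smoothing to avoid zero counts.
    """
    return math.log(((r + 0.5) / (R - r + 0.5)) /
                    ((n - r + 0.5) / (N - n - R + r + 0.5)))

# A term seen in 8 of 10 relevant documents but only 20 of 1000 documents
# overall gets a strongly positive weight; a term common everywhere but
# rare among relevant documents gets a negative one.
w = relevance_weight(r=8, n=20, R=10, N=1000)
```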
Read More/Download

Computational Modeling and Complexity Science

By Allen Downey

This book is about data structures and algorithms, intermediate programming in Python, complexity science and the philosophy of science.
Data structures and algorithms: A data structure is a collection that contains data elements organized in a way that supports particular operations. For example, a dictionary organizes key-value pairs in a way that provides fast mapping from keys to values, but mapping from values to keys is generally slower. An algorithm is a mechanical process for performing a computation. Designing efficient programs often involves the co-evolution of data structures and the algorithms that use them. For example, the first few chapters are about graphs, a data structure (nested dictionaries) that makes a good implementation of a graph, and several graph algorithms that use this data structure.
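Both points can be made concrete in a few lines of Python (a minimal sketch; the names are illustrative, not Downey's):

```python
# A dictionary gives fast key -> value lookup; going from a value back
# to its key requires scanning every entry.
ages = {"alice": 30, "bob": 25}
fast = ages["alice"]                               # hash lookup
key = next(k for k, v in ages.items() if v == 25)  # linear scan

# Nested dictionaries as a graph, in the spirit of the book's first
# chapters: outer keys are vertices, inner dicts map neighbors to edges.
graph = {
    "a": {"b": 1, "c": 1},
    "b": {"a": 1},
    "c": {"a": 1},
}
neighbors_of_a = list(graph["a"])
```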
Python programming: This book picks up where Think Python leaves off. I assume that you have read that book or have equivalent knowledge of Python. As always, I will try to emphasize fundamental ideas that apply to programming in many languages, but along the way you will learn some useful features that are specific to Python.
Computational modeling: A model is a simplified description of a system that is useful for simulation or analysis. Computational models are designed to take advantage of cheap, fast computation.
Philosophy of science: The models and results I will present raise a number of questions relevant to the philosophy of science, including the nature of scientific laws, theory choice, realism and instrumentalism, holism and reductionism, and Bayesian epistemology.
There are two kinds of computational models:
Continuous: Many computational models compute discrete approximations of equations that are continuous in space and time. For example, to compute the trajectory of a planet, you could describe planetary motion using differential equations and then compute a numerical approximation of the position of the planet at discrete points in time. The fields of numerical methods and scientific computing tend to focus on these kinds of models.
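The discrete-approximation idea reduces to stepping the equations forward at discrete points in time. Here is a minimal Euler-method sketch for a simpler system in the same spirit, a falling body; the constants and step size are illustrative:

```python
# dx/dt = v, dv/dt = -g: approximate the continuous equations at
# discrete points in time with fixed-step Euler integration.
g = 9.81           # m/s^2
dt = 0.01          # time step in seconds
x, v = 100.0, 0.0  # initial height and velocity

positions = []
for _ in range(100):   # simulate one second
    x += v * dt
    v += -g * dt
    positions.append(x)
# The closed-form answer after 1 s is 100 - 0.5*g ≈ 95.1 m; the
# discrete approximation lands close to it, and closer as dt shrinks.
```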
Discrete: Discrete models include graphs, cellular automata, and agent-based models. They are often characterized by structure, rules and transitions rather than by equations. They tend to be more abstract than continuous models; in some cases there is no direct correspondence between the model and a physical system. Complexity science is an interdisciplinary field, at the intersection of mathematics, computer science and physics, that focuses on these kinds of models.
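A cellular automaton shows what "rules and transitions rather than equations" means in practice. This is a minimal elementary-CA sketch; the rule number and grid size are illustrative:

```python
def step(cells, rule=30):
    """One update of an elementary cellular automaton (wrap-around edges).

    Each cell's next state is read off from the rule number, indexed by
    the 3-bit neighborhood (left, center, right).
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right
        out.append((rule >> index) & 1)
    return out

row = [0] * 5 + [1] + [0] * 5  # a single live cell in the middle
for _ in range(3):
    row = step(row)
```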

And that’s what this book is about.
Read More/Download

Global Optimization Algorithms – Theory and Application

By Thomas Weise
This e-book is devoted to global optimization algorithms, which are methods to find optimal solutions for given problems. It especially focuses on Evolutionary Computation by discussing evolutionary algorithms, genetic algorithms, Genetic Programming, Learning Classifier Systems, Evolution Strategy, Differential Evolution, Particle Swarm Optimization, and Ant Colony Optimization. It also elaborates on other metaheuristics like Simulated Annealing, Extremal Optimization, Tabu Search, and Random Optimization. This is not a book in the conventional sense: because of frequent updates and changes, it is not really intended for sequential reading, but rather serves as a kind of material collection, encyclopedia, or reference work where you can look things up, find the correct context, and be provided with the fundamentals. With this book, two major audience groups are addressed:
  • It can help students, since we try to describe the algorithms in an understandable, consistent way and, maybe even more importantly, include much of the background knowledge needed to understand them. Thus, you can find summaries on stochastic theory and theoretical computer science in Part IV on page 455. Additionally, application examples are provided which give an idea of how problems can be tackled with the different techniques and what results can be expected.
  • Fellow researchers and PhD students may find the application examples helpful too. For them, in-depth discussions on the single methodologies are included that are supported with a large set of useful literature references.
If this book contains something you want to cite or reference in your work, please use the citation suggestion provided in Chapter D on page 591.

Topics
  • Evolutionary Algorithms
  • Genetic Algorithms
  • Genetic Programming
  • Learning Classifier Systems
  • Hill Climbing
  • Simulated Annealing
  • Example Applications
  • Sigoa – Implementation in Java
  • Background (Mathematics, Computer Science)
Read More/Download

Fast Fourier Transforms

By C. Sidney Burrus
This book focuses on the discrete Fourier transform (DFT), discrete convolution, and particularly, the fast algorithms to calculate them. These topics have been at the center of digital signal processing since its beginning, and new results in hardware, theory and applications continue to keep them important and exciting.
As far as we can tell, Gauss was the first to propose the techniques that we now call the fast Fourier transform (FFT) for calculating the coefficients in a trigonometric expansion of an asteroid's orbit in 1805. However, it was the seminal paper by Cooley and Tukey in 1965 that caught the attention of the science and engineering community and, in a way, founded the discipline of digital signal processing (DSP).
The impact of the Cooley-Tukey FFT was enormous. Problems could be solved quickly that were not even considered a few years earlier. A flurry of research expanded the theory and developed excellent practical programs as well as opening new applications. In 1976, Winograd published a short paper that set a second flurry of research in motion. This was another type of algorithm that expanded the data lengths that could be transformed efficiently and reduced the number of multiplications required. The groundwork for this algorithm had been laid earlier by Good and by Rader. In 1997 Frigo and Johnson developed a program they called FFTW (the fastest Fourier transform in the west), which is a composite of many of the ideas in other algorithms as well as new results that give a robust, very fast system for general data lengths on a variety of computer and DSP architectures. This work won the 1999 Wilkinson Prize for Numerical Software.
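The divide-and-conquer idea at the heart of the Cooley-Tukey FFT fits in a few lines. This is only a teaching sketch in Python, for power-of-two lengths, nothing like the tuned general-length implementations the book and FFTW discuss:

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])   # DFT of even-indexed samples
    odd = fft(x[1::2])    # DFT of odd-indexed samples
    # Combine the half-length transforms with twiddle factors e^{-2*pi*i*k/n}.
    twiddled = [cmath.exp(-2j * cmath.pi * k / n) * odd[k]
                for k in range(n // 2)]
    return ([even[k] + twiddled[k] for k in range(n // 2)] +
            [even[k] - twiddled[k] for k in range(n // 2)])
```

Splitting into even and odd halves at every level is what turns the O(n²) DFT into O(n log n).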
It is hard to overemphasize the importance of the DFT, convolution, and fast algorithms. With a history that goes back to Gauss, and a compilation of references on these topics that by 1995 already ran to over 2400 entries, the FFT may be the most important numerical algorithm in science, engineering, and applied mathematics. New theoretical results are still appearing, advances in computers and hardware continually restate the basic questions, and new applications open new areas for research. It is hoped that this book will provide the background, references, programs and incentive to encourage further research and results in this area, as well as provide tools for practical applications.
Studying the FFT is not only valuable in understanding a powerful tool, it is also a prototype or example of how algorithms can be made efficient and how a theory can be developed to define optimality. The history of this development also gives insight into the process of research where timing and serendipity play interesting roles.
Read More/Download

Data Structures and Algorithms: Annotated Reference with Examples

By Granville Barnett and Luca Del Tongo

Every book has a story as to how it came about, and this one is no different, although we would be lying if we said its development had not been somewhat impromptu. Put simply, this book is the result of a series of emails sent back and forth between the two authors during the development of a library for the .NET framework of the same name (with the omission of the subtitle, of course!). The conversation started off with something like, "Why don't we create a more aesthetically pleasing way to present our pseudocode?" After a few weeks this new presentation style had in fact grown into pseudocode listings with chunks of text describing how the data structure or algorithm in question works and various other things about it. At this point we thought, "What the heck, let's make this thing into a book!" And so, in the summer of 2008 we began work on this book side by side with the actual library implementation. When we started writing this book the only things that we were sure about with respect to how the book should be structured were:
  1. always make explanations as simple as possible while maintaining a moderately fine degree of precision to keep the more eager minded reader happy; and
  2. inject diagrams to demystify problems that are even moderately challenging to visualize (. . . and so we could remember how our own algorithms worked when looking back at them); and finally
  3. present concise and self explanatory pseudo code listings that can be ported easily to most mainstream imperative programming languages like C++, C#, and Java.
A key factor of this book and its associated implementations is that all algorithms (unless otherwise stated) were designed by us, using the theory of the algorithm in question as a guideline (for which we are eternally grateful to their original creators). Therefore they may sometimes turn out to be worse than the "normal" implementations, and sometimes not. We are two fellows of the opinion that choice is a great thing. Read our book, read several others on the same subject, and use what you see fit from each (if anything) when implementing your own version of the algorithms in question. Through this book we hope that you will see the absolute necessity of understanding which data structure or algorithm to use for a certain scenario. In all projects, especially those that are concerned with performance (here we apply an even greater emphasis on real-time systems), the selection of the wrong data structure or algorithm can be the cause of a great deal of performance pain.
Therefore it is absolutely key that you think about the run time complexity and space requirements of your selected approach. In this book we only explain the theoretical implications to consider, but this is for a good reason: compilers are very different in how they work. One C++ compiler may have some amazing optimisation phases specifically targeted at recursion, another may not, for example. Of course this is just an example, but you would be surprised by how many subtle differences there are between compilers. These differences may make a fast algorithm slow, and vice versa. We could also factor in the same concerns about languages that target virtual machines, leaving all the actual various implementation issues to you, given that you will know your language's compiler much better than us...well, in most cases. This has resulted in a more concise book that focuses on what we think are the key issues.
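The "performance pain" of a wrong data structure choice is easy to demonstrate. A small illustration (in Python here, though the point is language-independent): the same membership test is O(n) on a list but expected O(1) on a hash set.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

# Searching for an absent element forces the list to scan all n items,
# while the set answers with a single hash probe.
t_list = timeit.timeit(lambda: -1 in as_list, number=100)
t_set = timeit.timeit(lambda: -1 in as_set, number=100)
# t_list is typically orders of magnitude larger than t_set.
```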
One final note: never take the words of others as gospel; verify all that can be feasibly verified and make up your own mind. We hope you enjoy reading this book as much as we have enjoyed writing it.
Read More/Download

Optimization Algorithms on Matrix Manifolds

Written by P.-A. Absil, R. Mahony & R. Sepulchre

Foreword by Paul Van Dooren
Constrained optimization is quite well established as an area of research, and there exist several powerful techniques that address general problems in that area. In this book a special class of constraints is considered, called geometric constraints, which express that the solution of the optimization problem lies on a manifold. This is a recent area of research that provides powerful alternatives to the more general constrained optimization methods. Classical constrained optimization techniques work in an embedded space that can be of a much larger dimension than that of the manifold. Optimization algorithms that work on the manifold therefore have lower complexity and quite often also have better numerical properties (see, e.g., the numerical integration schemes that preserve invariants such as energy). The authors refer to this as unconstrained optimization in a constrained search space.
The idea that one can describe difference or differential equations whose solution lies on a manifold originated in the work of Brockett, Flaschka, and Rutishauser. They described, for example, isospectral flows that yield time-varying matrices which are all similar to each other and eventually converge to diagonal matrices of ordered eigenvalues. These ideas did not get as much attention in the numerical linear algebra community as in the area of dynamical systems because the resulting difference and differential equations did not lead immediately to efficient algorithmic implementations.
An important book synthesizing several of these ideas is Optimization and Dynamical Systems (Springer, 1994), by Helmke and Moore, which focuses on dynamical systems related to gradient flows that converge exponentially to a stationary point that is the solution of some optimization problem. The corresponding discrete-time version of this algorithm would then have linear convergence, which seldom compares favorably with state-of-the-art eigenvalue solvers.
The formulation of higher-order optimization methods on manifolds grew out of these ideas. Some of the people that applied these techniques to basic linear algebra problems include Absil, Arias, Chu, Dehaene, Edelman, Eldén, Gallivan, Helmke, Hüper, Lippert, Mahony, Manton, Moore, Sepulchre, Smith, and Van Dooren. It is interesting to see, on the other hand, that several basic ideas in this area were also proposed by Luenberger and Gabay in the optimization literature in the early 1980s, and this without any use of dynamical systems.
In the present book the authors focus on higher-order methods and include Newton-type algorithms for optimization on manifolds. This requires a lot more machinery, which cannot currently be found in textbooks. The main focus of this book is on optimization problems related to invariant subspaces of matrices, but this is sufficiently general to encompass well the two main aspects of optimization on manifolds: the conceptual algorithm and its convergence analysis based on ideas of differential geometry, and the efficient numerical implementation using state-of-the-art numerical linear algebra techniques.
The book is quite deep in the presentation of the machinery of differential geometry needed to develop higher-order optimization techniques, but it nevertheless succeeds in explaining complicated concepts with simple ideas. These ideas are then used to develop Newton-type methods as well as other superlinear methods such as trust-region methods and inexact and quasi-Newton methods, which precisely put more emphasis on the efficient numerical implementation of the conceptual algorithms.
This is a research monograph in a field that is quickly gaining momentum. The techniques are also being applied to areas of engineering and robotics, as indicated in the book, and it sheds new light on methods such as the Jacobi-Davidson method, which originally came from computational chemistry. The book makes a lot of interesting connections and can be expected to generate several new results in the future.
Read More/Download

Algorithms and Data Structures

Algorithms and Data Structures is written by N. Wirth. This book starts with a chapter on data structures for two reasons. First, one has an intuitive feeling that data precede algorithms: you must have some objects before you can perform operations on them. Second, and this is the more immediate reason, this book assumes that the reader is familiar with the basic notions of computer programming. Traditionally and sensibly, however, introductory programming courses concentrate on algorithms operating on relatively simple structures of data. Hence, an introductory chapter on data structures seems appropriate....
The second chapter treats sorting algorithms. It displays a variety of different methods, all serving the same purpose. Mathematical analysis of some of these algorithms shows the advantages and disadvantages of the methods, and it makes the programmer aware of the importance of analysis in the choice of good solutions for a given problem. The partitioning into methods for sorting arrays and methods for sorting files (often called internal and external sorting) exhibits the crucial influence of data representation on the choice of applicable algorithms and on their complexity. The space allocated to sorting would not be so large were it not for the fact that sorting constitutes an ideal vehicle for illustrating so many principles of programming and situations occurring in most other applications. It often seems that one could compose an entire programming course by choosing examples from sorting only....
The third chapter discusses recursive algorithms, the fourth deals with dynamic information structures, and the fifth covers hashing.
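As a taste of the "straight insertion" method treated in the sorting chapter, here is a sketch in Python rather than the Oberon the book itself uses; the sample data is illustrative:

```python
def straight_insertion_sort(a):
    """Sort list a in place by straight insertion."""
    for i in range(1, len(a)):
        x = a[i]                 # next element to insert
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]      # shift larger elements one place right
            j -= 1
        a[j + 1] = x             # drop x into its position

data = [44, 55, 12, 42, 94, 18, 6, 67]
straight_insertion_sort(data)
```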
Contents

1 Fundamental Data Structures
1.1 Introduction
1.2 The Concept of Data Type
1.3 Primitive Data Types
1.4 Standard Primitive Types
1.4.1 Integer types
1.4.2 The type REAL
1.4.3 The type BOOLEAN
1.4.4 The type CHAR
1.4.5 The type SET
1.5 The Array Structure
1.6 The Record Structure
1.7 Representation of Arrays, Records, and Sets
1.7.1 Representation of Arrays
1.7.2 Representation of Records
1.7.3 Representation of Sets
1.8 The File (Sequence)
1.8.1 Elementary File Operators
1.8.2 Buffering Sequences
1.8.3 Buffering between Concurrent Processes
1.8.4 Textual Input and Output
1.9 Searching
1.9.1 Linear Search
1.9.2 Binary Search
1.9.3 Table Search
1.9.4 Straight String Search
1.9.5 The Knuth-Morris-Pratt String Search
1.9.6 The Boyer-Moore String Search
Exercises
2 Sorting
2.1 Introduction
2.2 Sorting Arrays
2.2.1 Sorting by Straight Insertion
2.2.2 Sorting by Straight Selection
2.2.3 Sorting by Straight Exchange
2.3 Advanced Sorting Methods
2.3.1 Insertion Sort by Diminishing Increment
2.3.2 Tree Sort
2.3.3 Partition Sort
2.3.4 Finding the Median
2.3.5 A Comparison of Array Sorting Methods
2.4 Sorting Sequences
2.4.1 Straight Merging
2.4.2 Natural Merging
2.4.3 Balanced Multiway Merging
2.4.4 Polyphase Sort
2.4.5 Distribution of Initial Runs
Exercises
3 Recursive Algorithms
3.1 Introduction
3.2 When Not to Use Recursion
3.3 Two Examples of Recursive Programs
3.4 Backtracking Algorithms
3.5 The Eight Queens Problem
3.6 The Stable Marriage Problem
3.7 The Optimal Selection Problem
Exercises
4 Dynamic Information Structures
4.1 Recursive Data Types
4.2 Pointers
4.3 Linear Lists
4.3.1 Basic Operations
4.3.2 Ordered Lists and Reorganizing Lists
4.3.3 An Application: Topological Sorting
4.4 Tree Structures
4.4.1 Basic Concepts and Definitions
4.4.2 Basic Operations on Binary Trees
4.4.3 Tree Search and Insertion
4.4.4 Tree Deletion
4.4.5 Analysis of Tree Search and Insertion
4.5 Balanced Trees
4.5.1 Balanced Tree Insertion
4.5.2 Balanced Tree Deletion
4.6 Optimal Search Trees
4.7 B-Trees
4.7.1 Multiway B-Trees
4.7.2 Binary B-Trees
4.8 Priority Search Trees
Exercises
5 Key Transformations (Hashing)
5.1 Introduction
5.2 Choice of a Hash Function
5.3 Collision handling
5.4 Analysis of Key Transformation
Exercises
Appendices
A The ASCII Character Set
B The Syntax of Oberon
Index
You can download or read this data structures and algorithm book from the following link.
Read More/Download
Related Data structures and Algorithms ebooks
  1. Online & downloading versions of Data Structure and Algorithms EBooks
  2. Free Data Structures and Algorithms Ebooks
Buy Data Structures and Algorithms Books
  1. Algorithms and Data Structures by N. Wirth
  2. Algorithms in a Nutshell (In a Nutshell (O'Reilly))
  3. Data Structures and Algorithms in Java
  4. Algorithm Design

Download free Data Structures and Algorithms ebooks

This posting provides links to free data structures and algorithms ebooks, which teach data structures through different languages and cover various algorithms and their complexities, algorithm directories, graph theory, and related topics.
The following free ebook downloads are written by different authors.
  1. Algorithmic Information Theory By G J Chaitin
  2. Algorithms and Complexity by Herbert S. Wilf
  3. Algorithms and Data Structures in VLSI Design By Christoph Meinel and Thorsten Theobald
  4. Algorithms by S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani
  5. Algorithms for Modular Elliptic Curves By J. E. Cremona
  6. Algorithms for programmers By Jorg Arndt
  7. Art of Programming Contest - C Programming Tutorials Data Structures Algorithms
  8. Average Case Analysis of Algorithms on Sequences by Wojciech Szpankowski
  9. Combinatorial Algorithms By Jeff Erickson
  10. Computer Programming Algorithms Directory
  11. Data Structure in Java By Sandra Andersen
  12. Data Structures And Algorithms - CSEd Tools
  13. Data Structures and Algorithms By Alison Cawsey
  14. Data Structures and Algorithms By John Morris
  15. Data Structures and Algorithms in C++ By Michael T. Goodrich, Roberto Tamassia and David M. Mount
  16. Data Structures and Algorithms with Object-Oriented Design Patterns in C# By Bruno R.Preiss
  17. Data Structures and Algorithms with Object-Oriented Design Patterns in C++ By Bruno R. Preiss
  18. Data Structures and Algorithms with Object-Oriented Design Patterns in Java By Bruno R. Preiss
  19. Data Structures By Peter M Williams
  20. Data Structures: All Chapters From Wikibooks, the open-content textbooks collection
  21. Dictionary of Algorithms and Data Structures
  22. Efficient Algorithms for Sorting and Synchronization By Andrew Tridgell
  23. Foundations of Computer Science By Lawrence C Paulson
  24. Graph-Theoretic Algorithms By Therese Biedl
  25. Java Data Structures (2nd edition)
  26. Object Oriented Datastructures using Java By Nell Dale, Daniel T. Joyce and Chip Weems
  27. Planning Algorithms By Steven M. LaValle
  28. Problems on Algorithms By William Gasarch and Ian Parberry
  29. Sorting and Searching Algorithms: A Cookbook By Thomas Niemann
  30. The FXT library: Fast transforms and low level algorithms

Computer Programming Algorithms Directory

Encryption Algorithms
  • Advanced Encryption Standard (AES), Data Encryption Standard (DES), Triple-DES and Skipjack Algorithms - Offers descriptions of the named encryption algorithms.
  • Blowfish - Describes the Blowfish encryption algorithm. Offers source code for a variety of platforms.
  • KremlinEncrypt - Cryptography site provides an overview of cryptography algorithms and links to published descriptions where available.
  • PowerBASIC Crypto Archives - Offers PowerBASIC source code for many algorithms including:
    Hashing - RIPEMD-160, MD5, SHA-1, SHA-256, CRC-16, CRC-32, Adler-32, FNV-32, ELF-32
    Encryption - RSA-64, Diffie-Hellman-Merkle Secure Key Exchange, Rijndael, Serpent, Twofish, CAST-128, CAST-256, Skipjack, TEA, RC4, PC1, GOST, Blowfish, Caesar Substitutional Shift, ROT13
    Encoding - Base64, MIME Base64, UUEncode, yEnc, Neuronal Network, URLEncode, URLDecode
    Compression - LZ78, LZSS, LZW, RLE, Huffman, Supertiny
    Pseudo-Random Number Generation (PRNG) - Mersenne Twister Number Generator, Cryptographic PRNG, MPRNG, MOAPRNG, L'Ecuyer LCG3 Composite PRNG, W32.SQL-Slammer
  • TEA - Tiny Encryption Algorithm - Describes the TEA encryption algorithm with C source code.
  • xICE - Has links towards the bottom of the page to the description of the xice encryption algorithm as well as the xice software development kit which contains the algorithm's full source code in C++, ASP, JScript, Ruby, and Visual Basic 6.0.

Genetic Algorithms

  • Artificial Life - Offers executable and source for ant food collection and the travelling salesman problems using genetic algorithms
  • Genetic Ant Algorithm - Source code for a Java applet that implements the Genetic Ant Algorithm based upon the model given in Koza, Genetic Programming, MIT Press
  • Introduction to Genetic Algorithms - Introduces fundamentals, offers Java applet examples
  • Jaga - Offers a free, open source API for implementing genetic algorithms (GA) and genetic programming (GP) applications in Java
  • SPHINcsX - Describes a methodology to perform a generalized zeroth-order two- and three-dimensional shape optimization utilizing a genetic algorithm

GIS (Geographic Information Systems) Algorithms

  • Efficient Triangulation Algorithm Suitable for Terrain Modelling - Describes algorithm and includes links to source code for various languages
  • Prediction of Error and Complexity in GIS Algorithms - Describes algorithms for GIS sensitivity analysis
  • Point in Polygon Algorithm - Describes algorithm

Sorting Algorithms

  • Andrew Kitchen's Sorting Algorithms - Describes parallel sorting algorithms:
    Odd-Even Transposition Sort has a worst case time of O(n), running on n processors. Its absolute speed up is O(log n), so its efficiency is O((log n)/n)
    Shear Sort has a worst case time of O(n½ log n), running on n processors. Its absolute speed up is O(n½), so its efficiency is O(1/n½)
  • Ariel Faigon's Library of Sorting Algorithms - C source code for a variety of sorting algorithms including Insertion Sort, Quick Sort, Shell Sort, Gamasort, Heap Sort and Sedgesort (Robert Sedgewick quicksort optimization)
  • Flash Sort - Describes the FlashSort algorithm which sorts n elements in O(n) time
  • Michael Lamont's Sorting Algorithms - Describes common sorting algorithms:
    O(n²) Sorts - bubble, insertion, selection and shell sorts
    O(n log n) Sorts - heap, merge and quick sorts
  • Sequential and Parallel Sorting Algorithms - Describes many sorting algorithms:
    Quicksort
    Heapsort
    Shellsort
    Mergesort
    Sorting Networks
    Bitonic Sort
    Odd-Even Mergesort
    LS3-Sort
    4-way Mergesort
    Rotate Sort
    3n-Sort
    s^2-way Mergesort
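The odd-even transposition sort mentioned above (O(n) time on n processors) can be simulated sequentially; each of the n phases does compare-exchanges on disjoint pairs, which on a real parallel machine would all run at once. A minimal sketch:

```python
def odd_even_transposition_sort(a):
    """Sequential simulation of the parallel odd-even transposition sort."""
    n = len(a)
    for phase in range(n):
        # Even phases compare pairs (0,1),(2,3),...; odd phases (1,2),(3,4),...
        start = phase % 2
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

Because the pairs within one phase never overlap, all of a phase's exchanges are independent, which is exactly what makes the algorithm parallelizable.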

Search Algorithms

  • Exact String Matching Algorithms - Details 35 exact string search algorithms.
  • Finding a Loop in a Singly Linked List - Outlines several methods for identifying loops in a singly linked list.
  • Fibonaccian search - Describes an O(log n) search algorithm for sorted arrays that is faster than a binary search for very large arrays.
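One standard method for the linked-list loop problem listed above is Floyd's tortoise-and-hare: advance one pointer by one node and another by two, and they meet if and only if there is a loop. A minimal sketch (the Node class is illustrative):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def has_loop(head):
    """Floyd's cycle detection: O(n) time, O(1) extra space."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next          # one step
        fast = fast.next.next     # two steps
        if slow is fast:          # the pointers can only meet inside a loop
            return True
    return False
```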

Tree Algorithms

  • B-Trees: Balanced Tree Data Structures - Introduction to B-Trees. Describes searching, splitting and inserting algorithms.

Computational Geometry Algorithms

  • CGAL - Offers open source C++ library of computational geometry algorithms.
  • FastGEO - Offers source code for a library of computational geometry algorithms such as geometrical primitives and predicates, hull construction, triangulation, clipping, rotations and projections using the Object Pascal language.
  • Wykobi - FastGEO library ported to C++.

Phonetic Algorithms

  • Lawrence Philips' Metaphone Algorithm - Describes an algorithm which returns the rough approximation of how an English word sounds. Offers a variety of source code listings for the algorithm.
  • Soundex Algorithms - Describes the NYSIIS VS Soundex and R. C. Russell's soundex algorithms.

Project Management Algorithms

  • Calculations for Critical Path Scheduling - Describes the algorithms for calculating critical paths with both ADM and PDM networks.
  • Resource Leveling Using the Minimum Moment Heuristic - Offers a Windows 3.1 download that includes a .PDF document describing the algorithm.
  • Resource-Constrained Project Scheduling - (.PDF) Describes several algorithms for resource leveling:
    Basic Single Mode RCPSP
    Basic Multi-Mode RCPSP
    Stochastic RCPSP
    Bin Packing related RCPSP
    Multi-resource constrained project scheduling problem (MRCPSP)
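
The forward-pass/backward-pass arithmetic behind critical-path calculation fits in a short sketch. The activity network below is an invented example (activity-on-node, finish-to-start links only); real ADM/PDM tools add lags, calendars, and other relationship types:

```python
from collections import defaultdict

# Hypothetical network: name -> (duration, list of predecessors).
activities = {
    "A": (3, []),
    "B": (2, ["A"]),
    "C": (4, ["A"]),
    "D": (1, ["B", "C"]),
}

def topo_order(activities):
    """Order activities so every predecessor comes before its successors."""
    order, seen = [], set()
    def visit(n):
        if n not in seen:
            seen.add(n)
            for p in activities[n][1]:
                visit(p)
            order.append(n)
    for n in activities:
        visit(n)
    return order

def critical_path(activities):
    order = topo_order(activities)
    # Forward pass: earliest start (ES) and earliest finish (EF).
    es, ef = {}, {}
    for name in order:
        dur, preds = activities[name]
        es[name] = max((ef[p] for p in preds), default=0)
        ef[name] = es[name] + dur
    project_end = max(ef.values())
    # Backward pass: latest finish (LF) and latest start (LS).
    succs = defaultdict(list)
    for name, (_, preds) in activities.items():
        for p in preds:
            succs[p].append(name)
    ls, lf = {}, {}
    for name in reversed(order):
        dur, _ = activities[name]
        lf[name] = min((ls[s] for s in succs[name]), default=project_end)
        ls[name] = lf[name] - dur
    # Activities with zero total float (ES == LS) are critical.
    critical = [n for n in order if ls[n] == es[n]]
    return project_end, critical

print(critical_path(activities))  # (8, ['A', 'C', 'D'])
```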

Miscellaneous Algorithms

  • AI Horizon - Has a variety of algorithms, from basic computer science data structures such as 2-3 trees to AI-related algorithms such as minimax and a discussion of machine learning algorithms.
  • Hash Algorithms - Overview and source code (in C, Pascal and Java) for many general purpose hashing algorithms.
  • Porter Stemming Algorithm - Describes a process for removing the commoner morphological and inflexional endings from words in English. Its main use is as part of a term normalisation process that is usually done when setting up Information Retrieval systems.
  • Rubik's Cube - Solves a Rubik's Cube using the BestFast search algorithm and profile tables.
  • Simulated Annealing - The fundamental idea is to allow moves resulting in solutions of worse quality than the current solution (uphill moves) in order to escape from local minima. The probability of doing such a move is decreased during the search.
  • The Stony Brook Algorithm Repository - Offers a collection of algorithm implementations for over seventy of the most fundamental problems in combinatorial algorithms:
  1. Data Structures - Dictionaries, Priority Queues, Suffix Trees and Arrays, Graph Data Structures, Set Data Structures, Kd-Trees
  2. Numerical Problems - Solving Linear Equations, Bandwidth Reduction, Matrix Multiplication, Determinants and Permanents, Linear Programming/Simplex Method, Random Number Generation, Factoring and Primality Testing, Arbitrary Precision Arithmetic, Knapsack Problem, Discrete Fourier Transform
  3. Combinatorial Problems - Sorting, Searching, Median and Selection, Permutations, Subsets, Partitions, Graphs, Calendrical Calculations, Job Scheduling, Satisfiability
  4. Graph Problems - Polynomial Time Problems (Connected Components, Topological Sorting, Minimum Spanning Tree, Shortest Path, Transitive Closure and Reduction, Matching, Eulerian Cycle / Chinese Postman, Edge and Vertex Connectivity, Network Flow, Drawing Graphs Nicely, Drawing Trees, Planarity Detection and Embedding)
  5. Graph Problems - Hard Problems (Clique, Independent Set, Vertex Cover, Traveling Salesman Problem, Hamiltonian Cycle, Graph Partition, Vertex Coloring, Edge Coloring, Graph Isomorphism, Steiner Tree, Feedback Edge/Vertex Set)
  6. Computational Geometry - Robust Geometric Primitives, Convex Hull, Triangulation, Voronoi Diagrams, Nearest Neighbor Search, Range Search, Point Location, Intersection Detection, Bin Packing, Medial-Axis Transformation, Polygon Partitioning, Simplifying Polygons, Shape Similarity, Motion Planning, Maintaining Line Arrangements, Minkowski Sum
  7. Set and String Problems - Set Cover, Set Packing, String Matching, Approximate String Matching, Text Compression, Cryptography, Finite State Machine Minimization, Longest Common Substring, Shortest Common Superstring
  • NIST Dictionary of Algorithms and Data Structures - Some entries have links to implementations.
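
The simulated-annealing entry above fits in a short sketch: accept uphill moves with probability exp(-delta/T) and shrink T as the search proceeds, so bad moves become rare over time. The objective function, move size, and cooling schedule below are arbitrary illustrative choices:

```python
import math
import random

def anneal(f, x0, temp=10.0, cooling=0.95, steps=2000, seed=0):
    """Minimize f by simulated annealing: always accept downhill moves,
    accept uphill moves with probability exp(-delta/temp), and cool the
    temperature so uphill moves become rare as the search converges."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for _ in range(steps):
        cand = x + rng.uniform(-1, 1)          # random local move
        delta = f(cand) - fx
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x, fx = cand, fx + delta
            if fx < fbest:
                best, fbest = x, fx
        temp *= cooling
    return best, fbest

# An objective with many local minima; its global minimum is near x = 0.
f = lambda x: x * x + 3 * abs(math.sin(3 * x))
x, fx = anneal(f, x0=5.0)
print(round(x, 2), round(fx, 2))
```

With a fixed seed the run is reproducible; in practice the cooling schedule and move distribution have to be tuned to the problem.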

Dictionary of Algorithms and Data Structures

This web site is hosted in part by the Software Quality Group of the Software Diagnostics and Conformance Testing Division, Information Technology Laboratory.
This is a dictionary of algorithms, algorithmic techniques, data structures, archetypal problems, and related definitions. Algorithms include common functions, such as Ackermann's function. Problems include traveling salesman and Byzantine generals. Some entries have links to implementations and more information. Index pages list entries by area and by type. The two-level index is about 1/20 the download size of this page.
Don't use this site to cheat. Teachers, contact us if we can help.
To define or correct terms, please contact Paul E. Black. We need help in automata theory, combinatorics, parallel or randomized algorithms, heuristics, and quantum computing. We do not include algorithms particular to business data processing, communications, operating systems or distributed algorithms, programming languages, AI, graphics, or numerical analysis: it is tough enough covering "general" algorithms and data structures. However, if you want to tackle one of these areas, we'll consider including them.
Some terms with a leading variable, such as n-way, m-dimensional, or p-branching, are under k-. You may find terms dealing with hardware, the computer industry, slang, etc., in the Free On-Line Dictionary Of Computing or in A Glossary of Computer Oriented Abbreviations and Acronyms.

Data Structures and Algorithms Course

By John Morris

This course will focus on data structures and algorithms for manipulating them. Data structures for storing information in tables, lists, trees, queues and stacks will be covered. Some basic graph and discrete transform algorithms will also be discussed.
Good Programs

There are a number of facets to good programs: they must
  • run correctly
  • run efficiently
  • be easy to read and understand
  • be easy to debug and
  • be easy to modify.

The first of these is obvious - programs which don't run correctly are clearly of little use. "Efficiently" is usually understood to mean in the minimum time - but occasionally there will be other constraints, such as memory use, which will be paramount. As will be demonstrated later, better running times will generally be obtained from use of the most appropriate data structures and algorithms, rather than through "hacking", i.e. removing a few statements by some clever coding - or even worse, programming in assembler!
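
That point can be made concrete: replacing a linear scan with a hash-based membership test turns a duplicate check from O(n²) into roughly O(n), a far larger win than shaving statements off the inner loop. A small Python illustration (not from the course itself):

```python
def has_duplicates_list(items):
    """O(n^2): 'x in seen' scans the whole list of items seen so far."""
    seen = []
    for x in items:
        if x in seen:          # linear scan on every iteration
            return True
        seen.append(x)
    return False

def has_duplicates_set(items):
    """O(n) on average: 'x in seen' is a constant-time hash lookup."""
    seen = set()
    for x in items:
        if x in seen:          # hash-table membership test
            return True
        seen.add(x)
    return False

data = list(range(2000)) + [0]   # one duplicate, hidden at the end
print(has_duplicates_list(data), has_duplicates_set(data))  # True True
```

Both functions are "correct", and neither is cleverly coded; the only difference is the choice of data structure, which dominates the running time as the input grows.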

This course will focus on solving problems efficiently: you will be introduced to a number of fundamental data structures and algorithms (or procedures) for manipulating them.

Data Structures and Algorithms

By Alison Cawsey
This course will have two main sections. The first, shorter section will introduce inheritance and polymorphism in object-oriented programming. This material makes the implementation of some data structures more elegant and more reusable. However, this section of the course is fairly independent of the next. The second, longer section will look at different sorts of algorithms, in particular string processing algorithms and graph algorithms.
There are various concepts from Data Structures & Algorithms which will be assumed. The basic idea of an abstract datatype is important. You also need to be familiar with pointers for some of the first section. You should also be able to implement and use a stack and a queue (with or without the help of a textbook).
Why should you want to know about algorithms? There are a number of possible reasons:
  • To avoid re-inventing the wheel. As you might expect, for many programming problems, someone has already developed a good algorithm to solve that problem. For many of these algorithms, people have formally analysed their properties, so you can be confident in their correctness and efficiency. For example, we know that merge sort and quick sort both work correctly, and have average case complexity O(n log n), so we can straightforwardly use either algorithm in our programs. We might be able to further choose between these two algorithms, depending on properties of our data (is it already almost sorted?).
  • To help when developing your own algorithms. Many of the principles of data abstraction and algorithm design, illustrated by some of the algorithms discussed here, are important in all programming problems. Not all tasks have off-the-shelf algorithms available, so you will sometimes have to develop your own. Merge sort illustrates a widely applicable technique: split the problem in two, solve each separately, then combine the results. A knowledge of well known algorithms will provide a source of ideas that may be applied to new tasks.
  • To help understand tools that use particular algorithms, so you can select the appropriate tool to use. For example, documentation of various compression programs will tell you that pack uses Huffman coding, compress uses LZW, and Gzip uses the Lempel-Ziv algorithm. If you have at least some understanding of these algorithms you will know which is likely to be better and how much reduction in file size you might expect for a given type of file (e.g., does the file involve a lot of repetition of common sequences of characters?).
  • Because they're neat. Many surprisingly simple and elegant algorithms have been invented. If you have even a slightly Mathematical disposition you should find them interesting in their own right.

For all these reasons it is useful to have a broad familiarity with a range of algorithms, and to know what they may be used for.
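
Merge sort, cited above both as a correct O(n log n) algorithm and as the model divide-and-conquer technique, shows the split/solve/combine pattern directly. A short Python sketch:

```python
def merge(left, right):
    """Combine two already-sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])     # at most one of these
    out.extend(right[j:])    # still has elements left
    return out

def merge_sort(a):
    """Split the problem in two, solve each half, then combine the results."""
    if len(a) <= 1:          # a list of 0 or 1 elements is already sorted
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

print(merge_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

The same split/solve/combine skeleton reappears in quicksort, binary search, and many of the divide-and-conquer algorithms listed elsewhere on this page.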

Algorithms for programmers

By Jorg Arndt

This is a draft of a book about selected algorithms. The intended audience is programmers who are interested in the treated algorithms and actually want to create and understand working and reasonably optimized code.
The style varies somewhat, which I do not consider bad per se: while some topics (such as fast Fourier transforms) need a clear and explicit introduction, others (like the bit-wizardry chapter) seem best presented by basically showing the code with just a few comments.
The pseudo-language Sprache is used when I see a clear advantage to do so, mainly when the corresponding C++ does not appear to be self-explanatory. Larger pieces of code are presented in C++. C programmers need not be shocked by the ‘++’ as only a rather minimal set of the C++ features is used. Some of the code, especially in part 3 (Arithmetical algorithms), is given in the pari/gp language, as the use of other languages would likely bury the idea in technicalities.

Algorithms for Modular Elliptic Curves

By J. E. Cremona
This book is in three sections. First, we describe in detail an algorithm based on modular symbols for computing modular elliptic curves: that is, one-dimensional factors of the Jacobian of the modular curve $X_0(N)$, which are attached to certain cusp forms for the congruence subgroup $\Gamma_0(N)$. In the second section, various algorithms for studying the arithmetic of elliptic curves (defined over the rationals) are described. These are for the most part not new, but they have not all appeared in book form, and it seemed appropriate to include them here. Lastly, we report on the results obtained when the modular symbols algorithm was carried out for all $N\le1000$. In a comprehensive set of tables we give details of the curves found, together with all isogenous curves (5113 curves in all, in 2463 isogeny classes). [In the first edition, these numbers were given as 5089 and 2447 respectively, as the curves of conductor 702 were inadvertently omitted.] Specifically, we give for each curve the rank and generators for the points of infinite order, the number of torsion points, the regulator, the traces of Frobenius for primes less than 100, and the leading coefficient of the $L$-series at $s=1$; we also give the reduction data (Kodaira symbols and local constants) for all primes of bad reduction, and information about isogenies.
Contents of Second Edition
  • Introduction
  • Modular symbol algorithms
  • Modular Symbols and Homology
  • The upper half-plane, the modular group and cusp forms
  • The duality between cusp forms and homology
  • Real structure
  • Modular symbol formalism
  • Rational structure and the Manin-Drinfeld Theorem
  • Triangulations and homology
  • M-symbols and $\Gamma_0(N)$
  • Conversion between modular symbols and M-symbols
  • Action of Hecke and other operators
  • Working in $H^+(N)$
  • Modular forms and modular elliptic curves
  • Splitting off one-dimensional eigenspaces
  • $L(f,s)$ and the evaluation of $L(f,1)/\period(f)$
  • Computing Fourier coefficients
  • Computing periods I
  • Computing periods II: Indirect method
  • Computing periods III: Evaluation of the sums
  • Computing $L^{(r)}(f,1)$
  • Obtaining equations for the curves
  • Computing the degree of a modular parametrization
  • Modular Parametrizations
  • Coset representatives and Fundamental Domains
  • Implementation for $\Gamma_0(N)$
  • Appendix to Chapter II. Examples
  • Example 1. N=11
  • Example 2. N=33
  • Example 3. N=37
  • Example 4. N=49
  • Elliptic curve algorithms
  • Terminology and notation
  • The Kraus--Laska--Connell algorithm and Tate's algorithm
  • The Mordell--Weil group I: finding torsion points
  • Heights and the height pairing
  • The Mordell--Weil group II: generators
  • The Mordell--Weil group III: the rank
  • The period lattice
  • Finding isogenous curves
  • Twists and complex multiplication
  • The tables
  • Elliptic curves
  • Mordell--Weil generators
  • Hecke eigenvalues
  • Birch--Swinnerton-Dyer data
  • Parametrization degrees
  • Bibliography

Average Case Analysis of Algorithms on Sequences

by Wojciech Szpankowski
A timely book on a topic that has witnessed a surge of interest over the last decade, owing in part to several novel applications, most notably in data compression and computational molecular biology. It describes methods employed in average case analysis of algorithms, combining both analytical and probabilistic tools in a single volume.
  • Tools are illustrated through problems on words with applications to molecular biology, data compression, security, and pattern matching.
  • Includes chapters on algorithms and data structures on words, probabilistic and analytical models, inclusion-exclusion principles, first and second moment methods, subadditive ergodic theorem and large deviations, elements of information theory, generating functions, complex asymptotic methods, Mellin transform and its applications, and analytic poissonization and depoissonization.
  • Written by an established researcher with a strong international reputation in the field.


Planning Algorithms

By Steven M. LaValle
This book presents a unified treatment of many different kinds of planning algorithms. The subject lies at the crossroads between robotics, control theory, artificial intelligence, algorithms, and computer graphics. The particular subjects covered include motion planning, discrete planning, planning under uncertainty, sensor-based planning, visibility, decision-theoretic planning, game theory, information spaces, reinforcement learning, nonlinear systems, trajectory planning, nonholonomic planning, and kinodynamic planning.
Planning to Plan
Planning is a term that means different things to different groups of people. Robotics addresses the automation of mechanical systems that have sensing, actuation, and computation capabilities (similar terms, such as autonomous systems are also used). A fundamental need in robotics is to have algorithms that convert high-level specifications of tasks from humans into low-level descriptions of how to move. The terms motion planning and trajectory planning are often used for these kinds of problems. A classical version of motion planning is sometimes referred to as the Piano Mover’s Problem. Imagine giving a precise computer-aided design (CAD) model of a house and a piano as input to an algorithm. The algorithm must determine how to move the piano from one room to another in the house without hitting anything. Most of us have encountered similar problems when moving a sofa or mattress up a set of stairs. Robot motion planning usually ignores dynamics and other differential constraints and focuses primarily on the translations and rotations required to move the piano. Recent work, however, does consider other aspects, such as uncertainties, differential constraints, modeling errors, and optimality. Trajectory planning usually refers to the problem of taking the solution from a robot motion planning algorithm and determining how to move along the solution in a way that respects the mechanical limitations of the robot.

Data Structures And Algorithms

CSEd Tools
This book covers some commonly used data structures (such as arrays, linked-lists, trees, and graphs) and algorithms (such as sorting, searching, and various graph algorithms) that are used in Computer Science. The book is intended primarily for the readers who are already familiar with these data structures and algorithms, but would like to get some information on how to perform some advanced or non-trivial operations on/using them. Hence, for each data structure and algorithm, I have generally given a lighter treatment to the basic issues (since they are generally already covered elsewhere) and given more attention to more advanced issues related to them.
The book is actually still in writing, and I will add more material to it. So make sure to check this Web site from time to time to see the latest updates.
Note: This book contains several pseudocodes. Implementing these pseudocodes using a programming language will require rewriting them appropriately using language-specific constructs. The pseudocodes may also make some simplifying assumptions, may not check some boundary conditions such as nil input values, and may not check for invalid input values. The solutions provided here may not be the best possible solutions. In fact, simpler and more practical solutions have often been preferred over theoretically better but more complicated solutions.

Object Oriented Datastructures using Java

By Nell Dale, Daniel T. Joyce and Chip Weems
Welcome to the first edition of Object-Oriented Data Structures using Java. This book has been written to present the algorithmic, programming, and structuring techniques of a traditional data structures course in an object-oriented context. You'll find that all of the familiar topics of lists, stacks, queues, trees, graphs, sorting, searching, Big-O complexity analysis, and recursion are still here, but covered from an object-oriented point of view using Java. Thus, our structures are defined with Java interfaces and encapsulated as Java classes. We use abstract classes and inheritance, as appropriate, to take advantage of the relationships among various versions of the data structures. We use design aids, such as Class-Responsibility-Collaborator (CRC) cards and Unified Modeling Language (UML) diagrams, to help us model and visualize our classes and their interrelationships. We hope that you enjoy this modern and up-to-date approach to the traditional data structures course.
Abstract Data Types

Over the last 16 years, the focus of the data structures course has broadened considerably. The topic of data structures now has been subsumed under the broader topic of abstract data types (ADTs)—the study of classes of objects whose logical behavior is defined by a set of values and a set of operations.
The term abstract data type describes a domain of values and set of operations that are specified independently of any particular implementation. The shift in emphasis is representative of the move towards more abstraction in computer science education. We now are interested in the study of the abstract properties of classes of data objects in addition to how the objects might be represented in a program.
The data abstraction approach leads us, throughout the book, to view our data structures from three different perspectives: their specification, their application, and their implementation. The specification describes the logical or abstract level. This level is concerned with what the operations are and what they do. The application level, sometimes called the user level, is concerned with how the data type might be used to solve a problem. This level is concerned with why the operations do what they do. The implementation level is where the operations are actually coded. This level is concerned with the how questions.
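
The three perspectives can be shown on a tiny stack ADT. The book works in Java; this sketch uses Python for brevity, with the docstring playing the role of the specification (what), the method bodies the implementation (how), and the code at the bottom the application level (why/how it is used):

```python
class Stack:
    """Specification: a last-in, first-out collection of items.
    push(x) places x on top; pop() removes and returns the top item;
    is_empty() reports whether the stack holds no items."""

    def __init__(self):
        self._items = []           # implementation detail: a Python list

    def push(self, x):
        self._items.append(x)      # top of stack = end of list

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items

# Application level: reverse a word using only the specified operations,
# without caring how the stack is represented internally.
s = Stack()
for ch in "java":
    s.push(ch)
reversed_word = ""
while not s.is_empty():
    reversed_word += s.pop()
print(reversed_word)  # avaj
```

Because the application code touches only push, pop, and is_empty, the list could be swapped for a linked representation without changing a line at the user level, which is exactly the point of separating the three perspectives.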
Using this approach, we stress computer science theory and software engineering principles, including modularization, data encapsulation, information hiding, data abstraction, stepwise refinement, visual aids, the analysis of algorithms, and software verification methods. We feel strongly that these principles should be introduced to computer science students early in their education so that they learn to practice good software techniques from the beginning.
An understanding of theoretical concepts helps students put the new ideas they encounter into place, and practical advice allows them to apply what they have learned. To teach these concepts we consistently use intuitive explanations, even for topics that have a basis in mathematics, like the analysis of algorithms. In all cases, our highest goal has been to make our explanations as readable and as easily understandable as possible.

Data Structures in Java - A Laboratory Course

By Sandra Andersen

To learn a subject such as computer science, you need to immerse yourself in it -- learning by doing rather than by simply observing. Through the study of several classic data structures and algorithms, you will become a better informed and more knowledgeable computer science student and programmer. To be able to professionally choose the best algorithm and data structure for a particular set of resource constraints takes practice.

An emphasis on learning by doing is used throughout Data Structures in Java: A Laboratory Course. In each laboratory, you explore a particular data structure by implementing it. As you create an implementation, you learn how the data structure works and how it can be applied. The resulting implementation is a working piece of software that you can use in later laboratories and programming projects.


Algorithms and Data Structures in VLSI Design

By Christoph Meinel and Thorsten Theobald
One of the main problems in chip design is the huge number of possible combinations of individual chip elements, leading to a combinatorial explosion as chips become more complex. New key results in theoretical computer science and in the design of data structures and efficient algorithms can be applied fruitfully here. The application of ordered binary decision diagrams (OBDDs) has led to dramatic performance improvements in many computer-aided design projects. This textbook provides an introduction to the foundations of this interdisciplinary research area, with an emphasis on applications in computer-aided circuit design and formal verification.
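
The canonicity that makes OBDDs so effective for verification can be sketched in miniature: build a reduced diagram bottom-up over a fixed variable order, and two circuits compute the same switching function exactly when they reduce to the same diagram. The toy Python sketch below (not one of the packages discussed in the book) expands a function by Shannon's expansion, which is exponential in the number of variables but enough to show the equivalence test:

```python
VARS = ["x1", "x2", "x3"]      # the fixed variable order

TRUE, FALSE = "1", "0"

def mk(var, lo, hi):
    """Reduction rule: skip a test whose two branches are identical.
    Nodes are plain tuples, and equal tuples compare equal, so
    structurally identical subgraphs are shared automatically."""
    if lo == hi:
        return lo
    return (var, lo, hi)

def build(fn, i=0, env=None):
    """Shannon expansion fn = (not x)*fn[x=0] + x*fn[x=1] over the order.
    Real OBDD packages instead combine diagrams with apply/ITE."""
    env = env or {}
    if i == len(VARS):
        return TRUE if fn(env) else FALSE
    lo = build(fn, i + 1, {**env, VARS[i]: 0})
    hi = build(fn, i + 1, {**env, VARS[i]: 1})
    return mk(VARS[i], lo, hi)

f = lambda e: (e["x1"] and e["x2"]) or e["x3"]
g = lambda e: not ((not e["x1"] or not e["x2"]) and not e["x3"])  # same function, by De Morgan
h = lambda e: e["x1"] and not e["x3"]                             # a different function

print(build(f) == build(g), build(f) != build(h))  # True True
```

Canonicity turns formal equivalence checking into a structural comparison; the price, as the book's later chapters explore, is that the diagram's size depends heavily on the chosen variable order.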
Contents
1. Introduction
2. Basics
2.1 Propositions and Predicates - 2.2 Sets, Relations, and Functions - 2.3 Graphs - 2.4 Algorithms and Data Structures - 2.5 Complexity of Algorithms - 2.6 Hashing - 2.7 Finite Automata and Finite State Machines - 2.8 References
Part I: Data Structures for Switching Functions
3. Boolean Functions
3.1 Boolean Algebras - 3.2 Boolean Formulas and Boolean Functions - 3.3 Switching Functions - 3.3.1 Switching Functions with at most Two Variables - 3.3.2 Subfunctions and Shannon's Expansion - 3.3.3 Visual Representation - 3.3.4 Monotone Switching Functions - 3.3.5 Symmetric Functions - 3.3.6 Threshold Functions - 3.3.7 Partial Switching Functions - 3.4 References
4. Classical Representations
4.1 Truth Tables - 4.2 Two-Level Normal Forms - 4.3 Circuits and Formulas - 4.3.1 Circuits - 4.3.2 - Formulas - 4.4 Binary Decision Trees and Diagrams - 4.4.1 Binary Decision Trees - 4.4.2 Branching Programs - 4.4.3 Read-Once Branching Programs - 4.4.4 Complexity of Basic Operations - 4.5 References
5. Requirements on Data Structures in Formal Circuit Verification
5.1 Circuit Verification - 5.2 Formal Verification of Combinational Circuits - 5.3 Formal Verification of Sequential Circuits - 5.4 References
Part II: OBDDs: An Efficient Data Structure
6. OBDDs - Ordered Binary Decision Diagrams
6.1 Notation and Examples - 6.2 Reduced OBDDs: A Canonical Representation of Switching Functions - 6.3 The Reduction Algorithm - 6.4 Basic Constructions - 6.5 Performing Binary Operations and the Equivalence Test - 6.6 References
7. Efficient Implementation of OBDDs
7.1 Key Ideas - 7.1.1 Shared OBDDs - 7.1.2 Unique Table and Strong Canonicity - 7.1.3 ITE Algorithm and Computed Table - 7.1.4 Complemented Edges - 7.1.5 Standard Triples - 7.1.6 Memory Management - 7.2 Some Popular OBDD Packages - 7.2.1 The OBDD Package of Brace, Rudell, and Bryant - 7.2.2 The OBDD Package of Long - 7.2.3 The CUDD Package: Colorado University Decision Diagrams - 7.3 References
8. Influence of the Variable Order on the Complexity of OBDDs
8.1 Connection Between Variable Order and OBDD Size - 8.2 Exponential Lower Bounds - 8.3 OBDDs with Different Variable Orders - 8.4 Complexity of Minimization - 8.5 References
9. Optimizing the Variable Order
9.1 Heuristics for Constructing Good Variable Orders - 9.1.1 The Fan-In Heuristic - 9.1.2 The Weight Heuristic - 9.2 Dynamic Reordering - 9.2.1 The Variable Swap - 9.2.2 Exact Minimization - 9.2.3 Window Permutation - 9.2.4 The Sifting Algorithm - 9.2.5 Block Sifting and Symmetric Sifting - 9.3 Quantitative Statements - 9.4 Outlook - 9.5 References
Part III: Applications and Extensions
10. Analysis of Sequential Systems
10.1 Formal Verification - 10.2 Basic Operators - 10.2.1 Generalized Cofactors - 10.2.2 The Constrain Operator - 10.2.3 Quantification - 10.2.4 The Restrict Operator - 10.3 Reachability Analysis - 10.4 Efficient Image Computation - 10.4.1 Input Splitting - 10.4.2 Output Splitting - 10.4.3 The Transition Relation - 10.4.4 Partitioning the Transition Relation - 10.5 References
11. Symbolic Model Checking
11.1 Computation Tree Logic - 11.2 CTL Model Checking - 11.3 Implementations - 11.3.1 The SMV System - 11.3.2 The VIS System - 11.4 References
12. Variants and Extensions of OBDDs
12.1 Relaxing the Ordering Restriction - 12.2 Alternative Decomposition Types - 12.3 Zero-Suppressed BDDs - 12.4 Multiple-Valued Functions - 12.4.1 Additional Sinks - 12.4.2 Edge Values - 12.4.3 Moment Decompositions - 12.5 References
13. Transformation Techniques for Optimization
13.1 Transformed OBDDs - 13.2 Type-Based Transformations - 13.2.1 Definition - 13.2.2 Circuit Verification - 13.3 Linear Transformations - 13.3.1 Definition - 13.3.2 Efficient Implementation - 13.3.3 Linear Sifting - 13.4 Encoding Transformations - 13.5 References
Bibliography
Index
