|--------------------------------------------------------------------------|
| Pascal programming language sorting demonstrations |
| SORTDEMO.EXE, SWSDEMO.EXE/.PAS |
| Copyright 1996 Robert Manning, South Bay Computer Assistance |
|--------------------------------------------------------------------------|
INTRODUCTION
------------
These sorting demonstrations are meant for programmers using the Pascal
language, who would like a visual demonstration of what a sorting
algorithm does. The main program SORTDEMO.EXE illustrates several
common sorting algorithms using a text-based bar array, which
represents an array of data being sorted.
This visual representation can help those learning programming to
grasp what's actually happening when they program a sort on an array
of data. You can slow down the program to observe how a QuickSort
partitions data, or how the offset of a Shell sort makes that routine
work.
Any sorting routine will work well with a small amount of data, or
small files. The real differences between sorting routines will
appear when you try to use them on large files, or files with large
records. If you only have say, a few dozen records to sort, you will
notice little difference in any of these sorting routines, on a modern
machine. But this is rarely the case! What sort routine you need is
dependent on several factors, but the faster, the better.
THE SORT ROUTINES
-----------------
A description of the sorting routines used:
BUBBLE SORT
-----------
Bubble sorts in general work by looping through the array of data
comparing adjacent elements in the array. When it finds two that
are out of order (less than or greater than), it swaps them. The
drawback to bubble sorts is that we end up looping through the
array N-squared times, where N is the size of the array. Thus, it
takes quite a lot of moving through the array and moving of data.
This version of a bubble sort uses a method which detects when the
data is sorted, thus taking less time to sort than a brute force
bubble sort. Depending on how unsorted the data is, this version
can take just as much time as the brute force bubble sort. The
time for sorting is reduced when the data is partly sorted when
the algorithm starts.
BRUTE FORCE BUBBLE SORT
-----------------------
The brute force version of the bubble sort is similar to the one
mentioned above, but does not use the method for determining when
the file is sorted. Thus, it moves through the data
N^2 = 45^2 = 2025
times! It doesn't matter how unsorted or how sorted the data
starts out. This routine will always make the same number of
iterations through the data. It is an easy sort to code, but that
is about all that it has going for it. For small amounts of data,
this isn't a big deal. But the time required for this algorithm
to complete rises to the square of the number of items it has
to sort!
SHAKER SORT
-----------
A shaker sort is similar to a bubble sort, but it 'bubbles' in
both directions through the array. A bubble sort moves through
data in one direction. The shaker sort moves back and forth from
the beginning to the end, then back to the beginning of the data.
The sorting steps decrease by one at each end of the data, so there
is less comparing and moving of data as the algorithm works.
The time to complete a shaker sort isn't a lot different than a
regular bubble sort, but it is a slight improvement.
INSERTION SORT
--------------
The insertion sort works by finding the proper place for an item
to be within a sorted array. When we find the right place for an
item, we move all the data after the item one item down in the
list, then insert the item into that position.
Starting at the top of the array, we look at the first items. If
they're in order, we keep looking. If they are not, we save the
data in a temporary place, move all the data in the list down a
notch, and insert the saved data into the spot we vacated for it.
This sort works well for data that is partly sorted already, or
for small files with small records. There is still quite a bit of
data movement for an unsorted file, and it ends up that this sort
method isn't much better than a bubble sort in most cases.
SELECTION SORT
--------------
The selection sort works by looping through the data and finding
the smallest item in the array. When we find the smallest item,
we swap that item with the current data we're comparing it with.
The primary advantage to this sort method is reduced movement of
data. We still must loop through the file almost N-squared times,
but we don't swap data unless we need to. This feature makes this
sort very nice for files with very large records, as we reduce
the amount of data movement quite a bit.
It turns out that in this sorting demo, the selection sort is the
fastest performing sort. This isn't necessarily the case! Please
see the topic Comparing Sort Routines below for an explanation.
SHELL SORT
----------
The shell sort starts comparing elements that are far apart,
separated by the value of an offset, which is initially half
the distance between the first and last element. We loop through
the first half of the data, comparing an item with an item that
is (i + offset) records away. If the items need to be swapped,
we swap them, saving the index of the array (in the first half
of the data) that was swapped.
At the end of the first loop in the first half of the data, we
reset how much of the array we loop through to the difference
between where the last item was swapped and the initial offset.
This results in a reduced amount of data to search. We continue
this process until no more data is swapped.
When no more data is swapped, we divide the initial offset by 2,
then repeat the process. The entire process is continued until
the initial offset is divided to zero.
Depending on how unordered the data is, this sort method can
be somewhat slow, but it is much faster than any of the other
types of bubble sorts. It does resemble a bubble sort in that
we perform nested loops through the data, but in that respect
alone. There is more swapping of data than a selection sort,
and it appears from this demo that the selection sort is faster,
but with partly sorted data, this method is significantly faster.
Indeed, it can be faster than a quick sort in some cases.
QUICK SORT
----------
The quick sort is probably one of the best known sort methods.
It is a recursive procedure, which means that it invokes itself
during processing. It works very well for even large files, though
when the file is quite unsorted, it can slow down a lot. This sort
is very memory-intensive (due to recursion).
The idea is to rearrange and divide the data where there is one
item in the logical 'middle' of the range of all data. At this
point, we can subdivide the data, and repeat the process for the
smaller divisions of data. All data below this middle item will
be 'less than' it (depending on the sort field), and the data above
this middle item will be equal to or greater than the middle
item. The data below the middle is not yet sorted; it's just less
than the middle item. Also, the data above the middle is not yet
sorted either, it's just equal to or greater than the middle.
We then recursively invoke the quick sort on each of the lower
and upper sections of data. In turn, these recursive calls will
create their own 'middle' items. Eventually, all data is sorted.
The quick sort uses two procedures; one is recursive, the other
is not. The non-recursive portion calculates this 'middle' item
in the range of data given to it, regardless of the range. It
'rearranges' the data within the range it is given, so that the
value it returns resides in about the logical center of the
range. When the range is reduced to 1 item, the data is sorted
and the recursive calls end.
The quick sort is the method of choice for most sort applications.
It is the fastest sort in this demo, mathematically speaking, as
it does not take N-squared trips through the data to sort all
items, but N(log N) trips through the data. It can slow down
depending on how unsorted the data is, and it isn't much different
when we're sorting small amounts of data. But for large files or
large amounts of data, this is the sort method to use.
COMPARING SORT ROUTINES
-----------------------
In creating this demo, one of the ideas was to allow you to see how
sorts worked visually. The other was to allow you to see how fast
one sort is versus another. This sort comparison isn't as good as it
could be, as there are certain factors at work in the program.
First, I tried to observe the caveat that all redrawing of the text
bars was to take place after some data was swapped in the array. This
is probably the most fair method to use, but it doesn't give a very
accurate end comparison of the true speed difference of all of these
sorts.
For example, the selection sort benefits the most from this caveat.
Since it only swaps data when it needs to, it also only redraws the
bars when it needs to. So, you don't see the extra looping through
the array to find the smallest item in it. This extra looping slows
the sort down. In reality, the selection sort takes almost as many
trips through the array as a bubble sort. The difference as mentioned
above, is that it doesn't move the data around so much. So, it does
appear that it is the fastest sort method. This is not the case.
The fastest sort by far is the quick sort. Due to my caveat about
redrawing the bars, the quick sort loses some of it's speed because
it keeps creating partitions in the recursive 'middles' of the data
ranges, and during the process of creating these 'middles', it
moves data. So, it also gets to redraw the bars a lot. The benefit
to this, is you really get to see what a quick sort is doing. But
it also looks slow - depending on the initial array, maybe the third
fastest of this demo! This is an illusion, trust me.
The two bubble sorts, and the shaker sort are all about the same
speed, with the shaker slightly faster. The insertion sort and the
selection sort can be about the same speed (both faster than the
bubble/shaker methods), depending on how large the data to be moved
is, and how unsorted the original file is. These are middle of the
road sorts, so to speak. Depending on what needs to be done, these
can be perfectly fine sort methods to use. The shell sort can be as
slow as a bubble sort, but again, if the data is partly sorted, this
sort method can truly show a dramatic increase in speed. It is slower
than a quick sort in practise. If you have lots of memory, lots of
data, and want to sort it in a hurry, use the quick sort.
SWSDEMO.EXE/.PAS PROGRAM
------------------------
I've enclosed part of the SORTDEMO program as the SWSDEMO program
(ShareWare SortDEMO). Here, you can see how I set up the program
for drawing the bars, and how I'm capturing the time of start and
finish times, initializing the array, and so forth. I've put in the
brute force bubble sort and the improved bubble sort methods for
you to have fun with.
The rest of the sort method source code is available if you would
like to register (purchase) the sort demo program.
PROGRAM REGISTRATION
--------------------
If you'd like to see how the complete program is coded, please
send $10 check or money order in U.S. Dollars only to:
Robert Manning, PO Box 2011, Lomita, CA 90717 USA.
Please add $5 if the disk is to be mailed outside of the USA. Also,
if you'd like to receive the registered version by email, please
mention that with your registration.
I don't mean to offend anyone by reserving the source code for the
registered version ... it's just that I have a family to support!
I'm sure you understand. And if you don't, that's OK too; perhaps
one day you too will have a family to support! ;-)
Robert Manning, author
South Bay Computer Assistance
PO Box 2011, Lomita, CA 90717 USA
[[Email Removed]]
[[Email Removed]]
http://members.aol.com/robertm782/public/sbcapage.htm
|--------------------------------------------------------------------------|
| Pascal programming language sorting demonstrations |
| SORTDEMO.EXE, SWSDEMO.EXE/.PAS |
| Copyright 1996 Robert Manning, South Bay Computer Assistance |
|--------------------------------------------------------------------------|