Contents

Sorting Algorithms Comparison

So far we met several sorting algorithms:

Though we have not cover HeapSort itself in our problems, but the algorithm consist simply of filling the heap from array and then taking them back to array from it, already in order.

What is the difference between these methods?

Let us regard most important properties:

  1. Time Complexity is the relation between the number of elements in input data and time consumed by algorithm. For example to search for something in unordered array we need to check all elements which is written as O(N) where N stands for number of elements. But if array is sorted we can use binary search with roughly log2(N) comparisons, so it is written as O(log(N)).

  2. Extra space is the measure of how much additional memory is required. Usually it is also expressed in comparison to the size of input data.

  3. Stability tells whether relative order of equal elements could be changed or not - for example if we have a group of people John, Ian, Jake, Mary, David, Catherine then stable sort should preserve the order John-Jake-Mary even after sorting.


Quadratic sorts

Three simplest methods - Bubble-sort, Selection sort and Insertion sort have time complexity of O(N*N). This is poor quality since it means the time of execution grows very far with growing of input size. Suppose that 1000 elements can be sorted in a second, then we can calculate the following table:

elements:   1000   10000    100000    1 million    10 millions

sort time:   1s     100s    3 hours   12 days        4 years

However they are still suitable (and often used) for small arrays, perhaps as a part of more complex algorithms.

What is the difference between them three?

We can see that Bubble-sort and Selection sort both work by choosing maximum (or minimum) element at each round and then putting it to the end of unsorted part.

Both of them need to perform the same amount of comparisons between elements of still unsorted part, however Bubble-sort also performs many swaps (each consisting of 3 operations), while Selection sort only does assignment when it finds new candidate-maximum and only single swap at the end of round.

That is why the Selection sort usually at least 2 times faster than Bubble-sort.

Now what about Insertion sort? Since it needs to shift the part of already sorted array to insert new element, it also requires significant amount of assignments.

On the other hand it does less comparisons - and if binary search is implemented, the amount of comparisons is roughly proportional to N rather than to N*N.

So we usually should decide which operation is more time consuming - comparison or assignment. If the comparison is costly (for example, comparison of strings) then Insertion sort should be preferred over Selection sort.


Logarithmic sorts

Any sort based on comparing elements could not have time complexity better than O(N*log(N)) (see wiki). However there are several algorithms which reach this limit.

QuickSort is one of the most popular. It uses partitioning algorithm to split an array into two unsorted halves (one strictly greater than other) and then sort them recursively.

In optimal case array is split into equal chunks each time so after log2(N) splits we reach chunks of 1 element. This explains why in such case algorithm works in O(N*log(N)).

However there is a disadvantage. In some poor cases (for example, sorting reversely ordered array without choosing the pivot) array is always split extremely unequally - one part have a single element and another - all remaining. This makes it work for O(N*N) in such cases (and it is often used to break other's solutions in programming competitions).

Another important point is that algorithm needs to track the list of ends of all chunks "from top to bottom" - this means that we need to use from O(log2(N)) to O(N) additional memory. In recursive version of algorithm this memory is reserved for local variables on the stack.

Heap-sort is just filling the heap from array and then filling array back from heap. If we take elements from the array in order and grow the heap using the same array we can avoid using extra memory!

Time complexity of the algorithm is the complexity of putting and removing N elements to heap, which means also O(N*log(N)) since operation on heap take O(log(N)) time.

Generally it is slightly slower than QuickSort. However it will always consume the same time, neither less nor more - the initial order of elements and choosing the pivot is not important.

Another difference is that Heap-sort is not stable. Equal elements may change their relative placement in array after sorting and we need some additional care if we want to preserve such sub-ordering.