I finished the min-heap module with little sense of what it really was, so I went back for another look. A min-heap is an efficient method for creating a list in such a way that the minimum element is always instantly accessible without a search, and in popping that minimum, the next minimum rises (also efficiently) to the top.
It occurred to me that this could be used in a sorting algorithm: from the list you want sorted, append each item to the heap, then pop each item into a new list (or the same one, if you want to sort in place.)
Since the heap is based on a binary tree, for a list of n elements, if n is equal to 2^k for some k, the tree is k elements high. Thus the maximum number of swaps to go up or down the tree is k or log(n).
Since you need to go through the list twice (once for append and once for pop), and since each step can involve a maximum of log(n) additional steps, the time for my sort should increase as approximately 2nlog(n).
So … I decided to compare with an implementation of merge sort (known to sort in n*log(n) ) from a different course, and compare the times.
The heap_sort() function (sort-in-place version) is:
def heap_sort(lst):
min_heap = MinHeap()
while lst:
min_heap.add(lst.pop())
# remove minimum until min_heap is empty
while min_heap.count != 0:
lst.append(min_heap.retrieve_min())
… and when run alongside merge_sort() using the timeit module for timing, we see:
You can see that heap_sort follows n*log(n) closely, running linearly slower (up to 3x) than merge_sort. Interestingly, it looks as though somewhere between a list length of 10^5 and 10^6, heap_sort becomes faster than merge_sort, probably (I say with absolutely no authority!) due to inefficiencies inherent in recursion.