Sorting Algorithms (Merge Sort) - How does it work?

Hi, I wanted to ask this question in the course related topic but topic is dead with no replies for months. The following is the solution code for the merge sort algorithm where you split the list into individual components and merge them and I can’t understand how it works when the recursion comes in :frowning:

def merge_sort(items):
  if len(items) <= 1:
    return items

  middle_index = len(items) // 2
  left_split = items[:middle_index]
  right_split = items[middle_index:]

  left_sorted = merge_sort(left_split)
  right_sorted = merge_sort(right_split)

  return merge(left_sorted, right_sorted)

def merge(left, right):
  result = []

  while (left and right):
    if left[0] < right[0]:
      result.append(left[0])
      left.pop(0)
    else:
      result.append(right[0])
      right.pop(0)

  if left:
    result += left
  if right:
    result += right

  return result

My understanding is

  1. The list gets inputted into the merge_sort() function
  2. List gets split into half.
  3. It goes into the merge function. The merge function takes the smaller first element from the two lists and appends it to result.
  4. If both left/right still has elements, add it to the end of the results list
  5. The process repeats through the recursion somehow? This is the part where I no longer understand the code

Could anyone shed light how it works after that? Thanks a bunch.

Maybe some prints or loggin will help you track the flow, I’d suggest going through with pen and pencil for a short input first (maybe 3 to 4 elements) then checking if your logic matches the observed output. Something like the following would do it (may be a little over the top) but add/remove this hacky method as you will-

def merge_sort(items, count=0):

    count += 1
    print(f"Entering merge_sort, on level = {count}")

    if len(items) <= 1:
        return items
    middle_index = len(items) // 2

    left_split = items[:middle_index]
    right_split = items[middle_index:]

    print(f"Calling merge sort on left_split, {left_split}")
    left_sorted = merge_sort(left_split, count)

    print(f"Calling merge sort on right_split, {right_split}")
    right_sorted = merge_sort(right_split, count)

    print(f"Now leaving merge_sort, level {count}")
    return merge(left_sorted, right_sorted, count)
  • The list gets inputted into the merge_sort() function
  • List gets split into half until there is only one item recursively submitted to merge_sort(). This causes the first return from the function, which answers these calls for the first time:
 left_sorted = merge_sort(left_split)    #recursive left call
 right_sorted = merge_sort(right_split)  #recursive right call

 return merge(left_sorted, right_sorted) #does not run until merge_sort(left_split) and 
                                         #merge_sort(right_split) are called with one item
  • It goes into the merge function. The merge function takes the smaller first element from the two lists and appends it to result until left or right are empty inside the while loop.
  • After the while loop, if left or right still has elements, add it to the end of the results list
1 Like

Hi, thanks for the reply. I still have some confusion though… Why is it that after the first return when len(items) <= 1, where they return items, the function still returns a second merge(left_sorted, right_sorted)? Doesn’t a function end when it reaches the first return?

Also, if the function merge() splits the list into its individual elements in their own lists before going into merge_sort(), wouldn’t there be more than two lists (assuming if the input list has a length greater than 2)? Then how does it select which is the smaller one from there? Sorry, I’m very clueless :frowning:

This left_sorted = merge_sort(left_split) calls the function many times, so if your input list was 4, this would call the function a second time with list left_split of length 2. This would split again so that you called the function with a list of 1, each of these functions that were called are waiting for their return. Once the merge_sort(left_split) returns a single number, then then function called with 2 numbers will call right_sorted = merge_sort(right_split) with one number and receive its return. It then runs return merge(left_sorted, right_sorted) returning a list of 2 numbers to the original function call, which will then run right_sorted = merge_sort(right_split) with the right 2 numbers.

This is just a visual representation of function calls for merge_sort([1,2,3,4])

merge_sort([1,2,3,4])
merge_sort([1,2])  merge_sort([3,4])
merge_sort([1]) merge_sort([2]) merge_sort([3]) merge_sort([4])

The function calls left first, so [1] returns to [1,2], then [2] to [1,2], then [1,2] returns to [1,2,3,4], then [3] to [3,4], [4] to [3,4], [3,4] to [1,2,3,4] and finally the sorted list returns. 7 functions called, and so 7 returns required.

1 Like