FAQ: Recursion: Python - Recursive Data Structures

This community-built FAQ covers the “Recursive Data Structures” exercise from the lesson “Recursion: Python”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Learn Recursion: Python

FAQs on the exercise Recursive Data Structures



I found the big O calculation pretty confusing on this one.

In the other exercises, working out the big O efficiency for a recursive algorithm just meant working out how many times the algorithm was called. This algorithm gets called 2N + 1 times, which would imply O(N).

(I modified the code to keep track of the number of function calls and tried it from N = 0 to N = 19. The tests confirmed that the number of function calls is definitely 2N + 1.)

But instead, the hint talks about the number of elements that are copied at each level. If I’m reading it right, it ignores the number of actual function calls, and the calculation adds up all the elements that are copied from one “level” down to the next.

We could use better advice on what operations we’re actually supposed to be counting.

The hint also says that the number of elements is reduced by 1 at each level, and that that’s a constant factor. But it’s actually 1 element gone at the first level, 2 more at the next, 4 at the level after that… so I’m really confused.


@giga4089513387, you are correct that there are 2N + 1 function calls. However, it is not just the number of calls to the function that is important, but also the work which is done within the function at each call…

But, as far as I can tell, the work in this case involves finding the value of a list at a certain index (constant time), setting a key:value pair in a dictionary (constant time), and calling a function twice (constant time). So I don’t see anything there that would increase the time complexity beyond O(N).

To see how the time varies with input, I used the Python module timeit. Here are the results I got, using sorted lists of lengths increasing from 10 to 10**7 by a factor of 10 each time (source code below).

build_bst Time for n = 10 is 0.00011 (10 runs)
    Current time / prior time = 0.00010709999999997111
build_bst Time for n = 100 is 0.00104 (10 runs)
    Current time / prior time = 9.702147525679695
build_bst Time for n = 1000 is 0.01082 (10 runs)
    Current time / prior time = 10.415359445673952
build_bst Time for n = 10000 is 0.12183 (10 runs)
    Current time / prior time = 11.256648125219485
build_bst Time for n = 100000 is 1.16860 (10 runs)
    Current time / prior time = 9.59236108488979
build_bst Time for n = 1000000 is 12.27396 (10 runs)
    Current time / prior time = 10.503123264751892
build_bst Time for n = 10000000 is 121.69054 (10 runs)
    Current time / prior time = 9.914530073437673

It seems pretty evident that the average running time for each list length is very nearly 10 * the time for the previous list; i.e., the function certainly appears to run in linear time.

I’d like to see some more comment on this one!!

The above data are the output from the code below:

#  https://docs.python.org/3/library/timeit.html#examples

# Here is the function being tested

def build_bst(my_list):  
  if my_list == []:
    return "No Child"
  middle_idx = len(my_list)//2
  middle_value = my_list[middle_idx]
#  print("Middle index: {}".format(middle_idx))
#  print( "Middle value: {}".format(middle_value))
  tree_node = {'data': middle_value}
  tree_node['left_child'] = build_bst(my_list[:middle_idx])
  tree_node['right_child'] = build_bst(my_list[middle_idx + 1:])  
  return tree_node

# Here is the function that will be passed to timeit
f1 = build_bst
f1_name = "build_bst"

# Here is the timing run
if __name__ == "__main__":
    import timeit
    import random
    import math
    runs = 10
    t_2 = 1
    base = 10
    for i in range(1, 8):
        in_lst = [j for j in range(base**i)]
        t_1 = timeit.timeit("f1(in_lst)", number = runs, globals = globals())
        print("{} Time for n = {} is {:.5f} ({} runs)".format(f1_name, base**i, t_1, runs))
        print("    Current time / prior time = {}".format(t_1 / t_2))
        t_2 = t_1

Making a bst from a sorted list SHOULD be O(N).
Copying the lists is …wrong.

The total number of copies made (easiest way is to count while doing it) does scale superlinearly:

N          copies
1          0
10         19
100        480
1000       7987
10000      113631
100000     1468946
1000000    17951445
10000000   213222809

Is that O(NlogN)? Idk, maybe, plausible.
The timings look linear because it’s close enough to linear that the large constant factors from Python drown it out, and the copying is done by C code, which is probably freakishly fast, possibly even cheating a bit with clever tricks.
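Those counts are consistent with the recurrence you’d expect from the slicing: a call on n elements copies n - 1 of them (everything except the middle) into the two slices, then recurses on both halves. A quick sketch (the function name is mine):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def total_copies(n):
    # Total elements copied by build_bst on a list of length n:
    # the two slices copy n - 1 elements, then each half is processed.
    if n == 0:
        return 0
    m = n // 2
    return (n - 1) + total_copies(m) + total_copies(n - m - 1)

print(total_copies(10), total_copies(100))  # 19 480
```

The small cases match the table of measured copy counts, which supports the O(N logN) reading: roughly N elements copied per level across about log2(N) levels.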


Uh. A sorted list is already a bst, so O(1), really. But copying N elements into another structure is O(N).


That is the point of @giga4089513387’s post. The course material claims otherwise:

N is the length of our input list.

Our tree will be logN levels deep, meaning there will be logN times where a new parent-child relationship is created.

If we have an 8 element list, the tree is 3 levels deep: 2**3 == 8.

Each recursive call is going to copy approximately N elements when the left and right halves of the list are passed to the recursive calls. We’re reducing by 1 each time (the middle_value), but that’s a constant factor.

Putting that together, we have N elements being copied across logN levels, for a big O of N*logN.
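For concreteness, here’s a sketch that counts the copies each level actually makes for an 8-element list; the level argument and counter dict are my additions, not part of the exercise:

```python
from collections import defaultdict

copies_per_level = defaultdict(int)

def build_bst(my_list, level=0):
    if my_list == []:
        return "No Child"
    middle_idx = len(my_list) // 2
    # The two slices below copy len(my_list) - 1 elements in total.
    copies_per_level[level] += len(my_list) - 1
    tree_node = {'data': my_list[middle_idx]}
    tree_node['left_child'] = build_bst(my_list[:middle_idx], level + 1)
    tree_node['right_child'] = build_bst(my_list[middle_idx + 1:], level + 1)
    return tree_node

build_bst(list(range(8)))
print(dict(copies_per_level))  # {0: 7, 1: 5, 2: 1, 3: 0}
```

So each level actually copies noticeably fewer than N elements (every level loses one element per subtree, and the number of subtrees doubles), but the total across the logN levels is still bounded by N*logN.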


What do you mean? Which lists?


They split the input list into two lists + 1 element.
They do this by copying them.
There is no need to copy; it should be more of a “view”, and slicing a view is a constant-time operation.

But since they do copy, it may indeed be O(n log n). I’m not entirely convinced by their use of “constant”, because it’s not all that constant, but I don’t… math.

And since the real algorithm for this is to do absolutely nothing (is O(0) a thing, or is that O(1)?), it’s… idk, moving on, better things to spend brain cycles on.


Perhaps the author had in mind that converting an unsorted list to a bst would be NlogN, but just… well, skipped a step?

As for the final brain cycle of the day, to clarify: I don’t see any list copying in the exercise, just slicing. Where is copying called for?


List slicing copies.

An easy fix would be to send in the whole list + a range; slicing a range is a constant-time operation (you get a new range back).

Or use numpy arrays, whose slices are views. Or maybe some other library that isn’t as heavyweight… but then again np imports pretty quickly.
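The same idea works without ranges or numpy: pass index bounds into the recursion so nothing is ever copied. A sketch (build_bst_view is a hypothetical name, not the exercise’s solution):

```python
def build_bst_view(my_list, lo=0, hi=None):
    # Operate on the half-open window [lo, hi) of my_list instead of
    # slicing, so each call does O(1) work and copies no elements.
    if hi is None:
        hi = len(my_list)
    if lo >= hi:
        return "No Child"
    middle_idx = (lo + hi) // 2
    tree_node = {'data': my_list[middle_idx]}
    tree_node['left_child'] = build_bst_view(my_list, lo, middle_idx)
    tree_node['right_child'] = build_bst_view(my_list, middle_idx + 1, hi)
    return tree_node

print(build_bst_view([1, 2, 3]))
```

This builds the same tree as the slicing version, since (lo + hi) // 2 picks the same middle as lo + (hi - lo) // 2, but the total work is one constant-time call per node plus the base cases, i.e. O(N).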


Ah, I didn’t know that. Thanks!

my_lst = [1, 2, 3, ['a', 'b', 'c']]
new_lst = my_lst[2:]
print("new_lst = ", new_lst)

new_lst[0] = 25
print("my_lst  = ", my_lst)
print("new_lst = ", new_lst)

new_lst[1][0] = 'x'
print("my_lst  = ", my_lst)
print("new_lst = ", new_lst)


new_lst =  [3, ['a', 'b', 'c']]
my_lst  =  [1, 2, 3, ['a', 'b', 'c']]
new_lst =  [25, ['a', 'b', 'c']]
my_lst  =  [1, 2, 3, ['x', 'b', 'c']]    # Changed! (the slice is a shallow copy)
new_lst =  [25, ['x', 'b', 'c']]

What is log? Some python syntax?


It’s the mathematical function that inverts exponentiation: since 2**2 = 4, log2(4) = 2.
Check YouTube for great 5-ish minute videos explaining the concept.
As for programming, logN and NlogN are running times commonly seen in big O notation.
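In Python you can play with this using the math module (a quick illustration, not part of the exercise):

```python
import math

# log2 is the inverse of raising 2 to a power:
assert 2 ** 3 == 8
assert math.log2(8) == 3.0

# A balanced BST built from 8 elements is about log2(8) = 3 levels deep.
# The base of the logarithm only changes the value by a constant factor,
# which is why big O notation usually writes just "logN" with no base:
assert abs(math.log(8, 2) - math.log(8) / math.log(2)) < 1e-9
print(math.log2(1024))  # 10.0
```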

Can anyone confirm that O(N*logN), as discussed in the last exercise (and above), is correct? Every outside resource I found says it should be O(N) to create a bst from a sorted array. I don’t understand why the number of levels created would be relevant for big O. Since we have to touch each array element once to create its node, O(N) makes sense to me.

A sorted array is a bst. It takes 0 time to convert.

I noticed something in the example for getting half of a list (I check hints even if I work something out). It says:

pets = ["dogs", "cats", "lizards", "parrots", "giraffes"]
middle_idx = len(pets) // 2
# 2
first_half_pets = pets[:middle_idx + 1]
# ["dogs", "cats", "lizards"]
last_half_pets = pets[middle_idx + 1:]
# ["parrots", "giraffes"]

If I do as it asks, I get infinite recursion. Instead I have to do:

tree_node["left_child"] = build_bst(my_list[:middle_idx])
tree_node["right_child"] = build_bst(my_list[middle_idx + 1:])

I understand we don’t want the middle value, since otherwise we’ll duplicate nodes. Is the suggestion wrong, or is it just a different use case?


Maybe that hint was meant to be that way, for us to take a step further on our own. I did the same as you mentioned: I just dropped the “+ 1” from the slice for the first (left) half.

Yes, it also confused me a little, because I was searching for the base of the logarithm. My first guess was that maybe it was a natural logarithm (ln), with base e (e ≈ 2.718).
But it turned out to be the standard practice in programming of assuming a base of 2 (and in big O notation the base doesn’t matter anyway, since logarithms in different bases differ only by a constant factor).


There is a mistake:
Please try to get the Codecademy team, or someone else who can edit the material, to fix this error.