A question about logic: for loops and (probably) scope of variables

Hi,

I’m doing the Python Dictionaries Challenge Project (https://www.codecademy.com/paths/data-science/tracks/dscp-python-fundamentals/modules/dscp-python-dictionaries-challenge-projects/projects/hurricane-analysis) and there’s something that I don’t get and I just can’t put my finger on what that is.

The following is the desired code. (The exercise asks us to: Write a function that constructs a dictionary made out of the provided lists, where the keys of the dictionary are the names of the hurricanes, and the values are dictionaries themselves containing a key for each piece of data (Name, Month, Year) about the hurricane. Thus the key “a” would have the value: {'Name': 'a', 'Month': 'jan', 'Year': 1}.)

names1 = ['a', 'b', 'c']
months1 = ['jan', 'feb', 'mar']
years1 = [1, 2, 3]

def orga_by_name(name, month, year):
  dict_by_name = {}
  for a, b, c in zip(name, month, year):
    dict_record = {}
    dict_record['Name'] = a
    dict_record['Month'] = b
    dict_record['Year'] = c
    dict_by_name[a] = dict_record
  return dict_by_name

And this is how I wrote the code at my (unsuccessful) first attempt: (In short, this assigns the value of the last iteration to all keys, instead of assigning each value to the corresponding key.)

def orga_by_name(name, month, year):
    dict_by_name = {}
    dict_record = {}
  for a, b, c in zip(name, month, year):
    dict_record['Name'] = a
    dict_record['Month'] = b
    dict_record['Year'] = c
    dict_by_name[a] = dict_record
    #print(dict_by_name[a])
    #print(dict_by_name)
  return dict_by_name

In the second version, print(dict_by_name[a]) shows me that the dictionary for each hurricane is built correctly. However, print(dict_by_name) shows me that every dictionary created (dict_record) ends up assigned to all keys of the dict_by_name dictionary. Therefore, my assumption was that dict_by_name[a] designates all keys of the dict_by_name dictionary, so that at each iteration the dictionary created (dict_record) is assigned to every key of the main dictionary (dict_by_name).
But if that were correct, why wouldn’t dict_by_name[a] in the first version also designate all keys of the dict_by_name dictionary, and so still assign dict_record to every key of the main dictionary, just like in this version?

Most important of all, why does the first code work?

I think it’s just the scoping rules that I don’t get. I would really appreciate it if someone could clarify this. If I was unclear, please leave me questions about the situation.

Thanks

I’m not sure if it’s copy/paste or not but the indentation in that second piece of code seems a little off.

The first thing that pops out is that x = {} creates an entirely new dictionary, and in the original code it sits inside the loop (look at where it is located!). In your version it sits before the loop, so you keep using the same dictionary, dict_record, and simply overwrite the same keys of that dictionary.

Most importantly this is the same object. It’s like the following where the same key is being overwritten each time-

a = {}
a["name"] = "red"
a["name"] = "orange"
a["name"] = "yellow"
print(a)

Out: {'name': 'yellow'}

So your outer dictionary dict_by_name simply has the same object assigned to multiple different keys. Every time through the loop you update that one object, so these keys all point to the same updated object (which is changed on every iteration of your loop). Does that make sense?
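If it helps, here is a small sketch that checks this directly with Python's `is` operator and `id()`. It reuses the lists from the first post; the module-level variable names `names`, `months`, `years` and `same_object` are just my own choices for the illustration.

```python
# Rebuild the first-attempt logic with one shared inner dictionary.
names = ['a', 'b', 'c']
months = ['jan', 'feb', 'mar']
years = [1, 2, 3]

dict_by_name = {}
dict_record = {}  # created once, before the loop

for a, b, c in zip(names, months, years):
    dict_record['Name'] = a
    dict_record['Month'] = b
    dict_record['Year'] = c
    dict_by_name[a] = dict_record  # stores a reference to dict_record, not a copy

# All three keys refer to one and the same dictionary object.
same_object = dict_by_name['a'] is dict_by_name['b'] is dict_by_name['c']
print(same_object)                                    # True
print(len({id(v) for v in dict_by_name.values()}))    # 1 distinct object
```

So the assignment dict_by_name[a] = dict_record only ever touches one key per iteration; the other keys appear to change because they already refer to the object being modified.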

Hi!
Thanks for your reply, but I am not sure that I completely get it.

For every key of dict_by_name to have the same value dict_record, the key a (in dict_by_name[a]) must therefore designate all of the keys. This will indeed become the case when the code goes through each iteration. However, at each iteration the key a designates a different key, so why would Python update all of the former keys a as well? Or I guess it’s just the way it is when the variable (dict_record) is accessible?

(And yes, you’re right, the indentation was not intended.)

Thanks in advance!

The keys of dict_record are “Name”, “Month”, “Year”. On each iteration you use the same key which modifies the actual dict_record object. So dict_by_name might have different keys for each iteration but dict_record does not; it uses the same keys.

You assign the same object, dict_record to every single key in your outer dictionary dict_by_name. But on each iteration you change the value of dict_record. Bear in mind that you do NOT create a new object, you simply alter the same one.

Perhaps an example might help-

from pprint import pprint

names = "Aida", "Bartosz", "Catherine",
colours = "red", "orange", "yellow",
outer = {}
record = {}

for name, colour in zip(names, colours):
    record["Name"] = name
    record["Colour"] = colour
    outer[name] = record

pprint(outer)
Out: {
    'Aida': {'Colour': 'yellow', 'Name': 'Catherine'},
    'Bartosz': {'Colour': 'yellow', 'Name': 'Catherine'},
    'Catherine': {'Colour': 'yellow', 'Name': 'Catherine'}
}
# Outer dictionary KEYS are different
# But the values reference the SAME OBJECT...
# To make this absolutely clear-
record["Name"] = "Dymas"
pprint(outer)
Out: {
    'Aida': {'Colour': 'yellow', 'Name': 'Dymas'},
    'Bartosz': {'Colour': 'yellow', 'Name': 'Dymas'},
    'Catherine': {'Colour': 'yellow', 'Name': 'Dymas'}
}
# We have changed every single value entry!
# How? They are all the SAME object!

Aside from making Chris Martin very happy (or rather sad perhaps) we have shown that every key in the outer dictionary is bound to the same object. Can you now see why the solution code deliberately creates a new dictionary on every iteration?

Thank you so much for making this very clear!

In the solution code in your first post, there is a for loop. In the first iteration of the loop, a new empty dictionary is created and assigned to the variable dict_record. Then the dictionary is populated with data and finally an entry is made in your master dictionary dict_by_name in which this dictionary is assigned as the value for a key (the key being the name of the hurricane). In the next iteration (here is the important part), you create another new empty dictionary and assign it to the variable dict_record. This dictionary is completely separate and unrelated to the dictionary from the previous iteration. Yes, we are using the same variable name, but in the first iteration this variable points at a specific dictionary while in the next iteration this variable points at an entirely new and separate dictionary.
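To see that separation in action, here is a quick check run against the solution code from the first post. The variable name `result` and the value 1999 are only there for the demonstration.

```python
def orga_by_name(name, month, year):
    dict_by_name = {}
    for a, b, c in zip(name, month, year):
        dict_record = {}  # a brand-new dictionary on every pass through the loop
        dict_record['Name'] = a
        dict_record['Month'] = b
        dict_record['Year'] = c
        dict_by_name[a] = dict_record
    return dict_by_name

result = orga_by_name(['a', 'b', 'c'], ['jan', 'feb', 'mar'], [1, 2, 3])

# Each key holds its own object, so the identity check fails...
print(result['a'] is result['b'])   # False

# ...and changing one record leaves the others untouched.
result['a']['Year'] = 1999
print(result['b']['Year'])          # 2
```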

Whereas in your code, you have created a new dictionary and assigned it to the variable dict_record BEFORE you enter the loop. Within the loop, you update this dictionary with data and then make an entry in your master dictionary dict_by_name with this dictionary being the value of a key. That works fine for the first iteration. But in the next iteration, you aren’t creating a new dictionary. You are just modifying the existing dictionary dict_record. Dictionaries are mutable. They can be modified. During your second iteration when you are modifying the dictionary, the changes you make are going to affect the entries from the previous iterations in your master dictionary dict_by_name because the values of the keys in your master dictionary are pointing to the same dictionary dict_record.
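For what it's worth, here are two other ways the same fix could be written. These are only sketches, not the project's official solution, and the function names `orga_by_name_literal` and `orga_by_name_copy` are made up for this example. The first builds a fresh dictionary literal inside the loop; the second keeps dict_record outside the loop but stores a shallow copy made with `dict.copy()`.

```python
def orga_by_name_literal(name, month, year):
    dict_by_name = {}
    for a, b, c in zip(name, month, year):
        # The literal creates a new dictionary object on each iteration.
        dict_by_name[a] = {'Name': a, 'Month': b, 'Year': c}
    return dict_by_name

def orga_by_name_copy(name, month, year):
    dict_by_name = {}
    dict_record = {}  # shared scratch dictionary, as in the first attempt
    for a, b, c in zip(name, month, year):
        dict_record['Name'] = a
        dict_record['Month'] = b
        dict_record['Year'] = c
        # Store a separate shallow copy instead of the shared object.
        dict_by_name[a] = dict_record.copy()
    return dict_by_name

names1 = ['a', 'b', 'c']
months1 = ['jan', 'feb', 'mar']
years1 = [1, 2, 3]

print(orga_by_name_literal(names1, months1, years1)['a'])
print(orga_by_name_copy(names1, months1, years1)['a'])
```

A shallow copy is enough here because the stored values are plain strings and integers; if the records held nested lists or dictionaries, those inner objects would still be shared.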