FAQ: Hash Maps: Python - Review

This community-built FAQ covers the “Review” exercise from the lesson “Hash Maps: Python”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Computer Science

Complex Data Structures

FAQs on the exercise Review

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply (reply) below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply (reply) below!

Agree with a comment or answer? Like (like) to up-vote the contribution!

Need broader help or resources? Head here.

Looking for motivation to keep learning? Join our wider discussions.

Learn more about how to use this guide.

Found a bug? Report it!

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

Hello!

I was just copying down the code in my notes and found something that I don’t understand.

def retrieve(self, key):
array_index = self.compressor(self.hash(key))
possible_return_value = self.array[array_index]

if possible_return_value is None:
  return None

if possible_return_value[0] == key:
  return possible_return_value[1]

retrieval_collisions = 1

while (possible_return_value != key):
  new_hash_code = self.hash(key, retrieval_collisions)
  retrieving_array_index = self.compressor(new_hash_code)
  possible_return_value = self.array[retrieving_array_index]

  if possible_return_value is None:
    return None

  if possible_return_value[0] == key:
    return possible_return_value[1]

  number_collisions += 1

return

This is the retrieval method but it ends with += 1 for number_collisions and not retrieval_collisions. This is the solution that Codecademy gives, so I think it must be right, only I don’t understand why we +1 number_collisions which is in the Assign method. Can anyone explain? Thanks!

i think so it’s a typo, the correct solution would be retrieval_collisions+=1

3 Likes

Hi guys. I had a question regarding the open addressing technique used in this lesson. It doesn’t seem to account for how there might be a key-value we want to assign but every possible place we can assign it to is already taken by another key-value pair. The code looks like it would just go into an infinite loop where it checks the same indexes again and again trying to assign it. It looks like that same issue can arise when trying to retrieve a value that isn’t assigned to the list but every possible index it could’ve been assigned to that the code checks is already taken by another key-value pair. In this case, again, the retrieve method would go in an infinite loop checking the same indexes again and again never returning None. How should the code account for this?

4 Likes

Hi guys. I had a question regarding the open addressing technique used in this lesson. It doesn’t seem to account for how there might be a key-value we want to assign but every possible place we can assign it to is already taken by another key-value pair. The code looks like it would just go into an infinite loop where it checks the same indexes again and again trying to assign it. It looks like that same issue can arise when trying to retrieve a value that isn’t assigned to the list but every possible index it could’ve been assigned to that the code checks is already taken by another key-value pair. In this case, again, the retrieve method would go in an infinite loop checking the same indexes again and again never returning None. How should the code account for this?

Note that the final question in the exercise is

What should your hash map do if a key-value is added and the array is full? How does this hash map handle that?

Clearly, as you note, this hash map does not handle it.

From the Wikipedia article on Hash Maps, under the section on open addressing:

A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array. In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so. For many applications, these restrictions mandate the use of dynamic resizing, with its attendant costs.

In order to play around with this a bit, I created a slightly longer list of rock types:

hash_map = HashMap(25)
rocks = {
'Adakite': 'Igneous', 'Alkali': 'Igneous', 'Anorthosite': 'Igneous', 'Aplite': 'Igneous',
'gabbro': 'Igneous', 'Argillite': 'Sedimentary',  'Arkose': 'Sedimentary',  'sandstone': 'Sedimentary',
'Breccia': 'Sedimentary',  'Calcarenite': 'Sedimentary',  'Chalk': 'Sedimentary', 
'Blueschist': 'Metamorphic' , 'Cataclasite': 'Metamorphic',  'Gneiss': 'Metamorphic', 
'Granulite': 'Metamorphic',  'Greenschist': 'Metamorphic',  'Hornfels': 'Metamorphic'
 }

for key, value in rocks.items():
  hash_map.assign(key.lower(), value.lower())

There are 17 entries. Put in some print() statements to watch the collisions. With an array of length 25, there are from 10 - 20 collisions; with 30, often none; with 16, obviously, an infinite loop.

5 Likes

I will share for you my “delete” method, following the same procedure as the getter and setter with open addressing:

  def delete(self, key):
    array_index = self.compressor(self.hash(key))
    possible_pair = self.array[array_index]
    
    if possible_pair == None:
      return print("\"{0}\" is not in this hash map".format(key))
    
    if possible_pair[0] == key:
      print("value for this key before deletion:\n {0}".format(possible_pair[1]))
      self.array[array_index] = None
      return
    
    del_collisions = 1
    while possible_pair[0] != key:
      new_array_index = self.compressor(self.hash(key, del_collisions))
      possible_pair = self.array[new_array_index]
      
      if possible_pair == None:
        return print("\"{0}\" is not in this hash map".format(key))
      
      if possible_pair[0] == key:
        print("value for this key before deletion:\n {0}".format(possible_pair[1]))
        self.array[new_array_index] = None
        return
      
      del_collisions += 1
    
    return

I was thinking about calling retrieve then assigning None to the array index, but this implies modifying retrieve and assign methods.

5 Likes

Agree.
We can test the given solution:

keys_range = range(1000)
keys = []
for x in keys_range:
  keys.append(str(x))

values = range(5000, 6000)
for x, y in zip(keys, values):
  hash_map.assign(x, y)
print(hash_map.retrieve('99'))

Will get this error:
UnboundLocalError: local variable 'number_collisions' referenced before assignment

Change number_collisions+=1 to retrieval_collisions+=1 solves the issue.

what is the main things i need to think about when i try to build a function that delete from hash map?
like lets say i had colision once? , and the first item i deleted was the one before the colision, and now i want to delete the one after the colision?
is it mean that to delete i need to search items one by one? and i cant use the adressing system?
and i guess i need to change the retrieve function either?

class HashMap:

  def __init__(self, array_size):

    self.array_size = array_size

    self.array = [None for item in range(array_size)]

  def hash(self, key, count_collisions=0):

    return sum(key.encode()) + count_collisions

  def compressor(self, hash_code):

    return hash_code % self.array_size

  def assign(self, key, value, number_collisions = 0):

    current_array_value = self.array[self.compressor(self.hash(key, number_collisions))]

    if current_array_value is None:

      self.array[array_index] = [key, value]

      return

    if current_array_value[0] == key:

      self.array[array_index] = [key, value]

      return

    number_collisions += 1

    self.assign(self, key, value, number_collisions)

    return

  def retrieve(self, key, retrieval_collisions = 0):

    array_index = self.compressor(self.hash(key, retrieval_collisions))

    possible_return_value = self.array[array_index]

    if possible_return_value is None:

        return None

    if possible_return_value[0] == key:

        return possible_return_value[1]

    retrieval_collisions += 1

    return self.reretrieve(key, retrieval_collisions)

hash_map = HashMap(15)

hash_map.assign("gabbro", "igneous")

hash_map.assign("sandstone", "sedimentary")

hash_map.assign("gneiss", "metamorphic")

print(hash_map.retrieve("gabbro"))

print(hash_map.retrieve("sandstone"))

print(hash_map.retrieve("gneiss"))

I think it is a bit better with recursion than repeating the same process in a while loop for the assign() and retrieve() methods.

1 Like

I’ve tried to make code more simle for understanding with adding functions set_value_if_key_is_none_or_found and get_value_if_key_is_none_or_found.
Who knows is it good way?

hi

You probably copied it wrongly or something. Here is how I have it on my own Notepad;

if possible_return_value[0] == key:

  return possible_return_value[1]

retrieval_collisions = 1

while (possible_return_value != key):

  new_hash_code = self.hash(key, retrieval_collisions)

  retrieving_array_index = self.compressor(new_hash_code)

  possible_return_value = self.array[retrieving_array_index]

  if possible_return_value is None:

    return None

  if possible_return_value[0] == key:

    return possible_return_value[1]

  retrieval_collisions += 1

return

he didn’t copied wrong, the solution really gives the wrong var

Hi,

Hope I understand well this lesson…
I created

  • a delete method: I just simple added another optional parameter to the assign method because the logic to assign or delete is exactly same… (is it true?)
  • and a helper method, get_array_value, to decrease repeated code.
My solution
class HashMap:
  def __init__(self, array_size):
    self.array_size = array_size
    self.array = [None for item in range(array_size)]

  def hash(self, key, count_collisions = 0):
    key_bytes = key.encode()
    hash_code = sum(key_bytes)
    return hash_code + count_collisions

  def compressor(self, hash_code):
    return hash_code % self.array_size

  def get_array_value(self, key, count_collisions = 0):
    array_index = self.compressor(self.hash(key))
    return self.array[array_index]

  def assign(self, key, value, delete = False):
    array_index = self.compressor(self.hash(key))
    current_array_value = self.array[array_index]

    if current_array_value is None or current_array_value[0] == key:
      if delete:
        self.array[array_index] = None
      else:
        self.array[array_index] = [key, value]
      return

    # Collision!
    number_collisions = 1

    while(current_array_value[0] != key):
      current_array_value = self.get_array_value(key, number_collisions)

      if current_array_value is None:
        if delete:
          self.array[array_index] = None
        else:
          self.array[new_array_index] = [key, value]
        return

      if current_array_value[0] == key:
        self.array[new_array_index] = [key, value]
        return

      number_collisions += 1

    return

  def retrieve(self, key):
    possible_return_value = self.get_array_value(key)

    if possible_return_value is None:
      return None

    if possible_return_value[0] == key:
      return possible_return_value[1]

    retrieval_collisions = 1

    while (possible_return_value != key):
      possible_return_value = self.get_array_value(key, retrieval_collisions)

      if possible_return_value is None:
        return None

      if possible_return_value[0] == key:
        return possible_return_value[1]

      retrieval_collisions += 1

    return
  
  def delete(self, key):
    self.assign(key, "", True)

hash_map = HashMap(15)
hash_map.assign("gabbro", "igneous")
hash_map.assign("sandstone", "sedimentary")
hash_map.assign("gneiss", "metamorphic")

print(hash_map.retrieve("gabbro"))
print(hash_map.retrieve("sandstone"))
print(hash_map.retrieve("gneiss"))

hash_map.delete("gneiss")

print(hash_map.retrieve("gneiss"))

1 Like

def dele(self, key):

deleting_index = self.compressor(self.hash(key))
del_array_value = self.array[deleting_index]

if self.array[deleting_index] is None:
  return 

if del_array_value[0] == key:
  self.array[deleting_index] = None

index_to_del = self.find_key_indx(key)

if index_to_del is None:
  return None
else:
  self.array[index_to_del] = None

def find_key_indx(self, key):

col = 1
while col != key:

    new_hash_code = self.hash(key, col)
    indx = self.compressor(new_hash_code)
    return_value = self.array[indx]

    if return_value is None:
      return None

    if return_value[0] == key:
      return  indx

    col += 1

This is how i did the del and made another method to make the code efficent, just asking if it is valid?