Final project - Python 3 adaptation : got an empty list as a result


#1

I am doing the final project in Python. I had to adapt the code a little because I am running Python 3.5.2.

I started by adapting the Markov chain code that Codecademy gave us. I am not fetching text from the web yet, so for now I use a simple .txt file instead. I used the lyrics of a song as the text, and the program is supposed to generate a list of words from the song.

When I run my run.py code, I only get an empty list as a result: []
So I suppose that I made a mistake while adapting the code for Python 3. If anybody has a clue on how to fix that, it would be a great help. Thanks!

So here is my run.py (my cc_markov file is in the same dir as my run.py)


from cc_markov import MarkovChain

mc = MarkovChain()

mc.add_file("lyrics2.txt")

print (mc.generate_text(15))

And here is my adaptation of the cc_markov.py file


import re
import random
from collections import defaultdict, deque

"""
Codecademy Pro Final Project supplementary code

Markov Chain generator
  This is a text generator that uses Markov Chains to generate text
  using a uniform distribution.

  num_key_words is the number of words that compose a key (suggested: 2 or 3)
"""

class MarkovChain:

  def __init__(self, num_key_words=2):
    self.num_key_words = num_key_words
    self.lookup_dict = defaultdict(list)
    self._punctuation_regex = re.compile(r'[,.!;?:\-\[\]\n]+')
    self._seeded = False
    self.__seed_me()

  def __seed_me(self, rand_seed=None):
    if self._seeded is not True:
      try:
        if rand_seed is not None:
          random.seed(rand_seed)
        else:
          random.seed()
        self._seeded = True
      except NotImplementedError:
        self._seeded = False

  """
  " Build Markov Chain from data source.
  " Use add_file() or add_string() to add the appropriate format source
  """
  def add_file(self, file_path):
    content = ''
    with open(file_path, 'r') as fh:
      self.__add_source_data(fh.read())

  def add_string(self, str):
    self.__add_source_data(str)

  def __add_source_data(self, str):
    clean_str = self._punctuation_regex.sub(' ', str).lower()
    tuples = self.__generate_tuple_keys(clean_str.split())
    for t in tuples:
      self.lookup_dict[t[0]].append(t[1])

  def __generate_tuple_keys(self, data):
    if len(data) < self.num_key_words:
      return

    for i in range(len(data) - self.num_key_words):
      yield [ tuple(data[i:i+self.num_key_words]), data[i+self.num_key_words] ]

  """
  " Generates text based on the data the Markov Chain contains
  " max_length is the maximum number of words to generate
  """
  def generate_text(self, max_length=20):
    context = deque()
    output = []
    if len(self.lookup_dict) > 0:
      self.__seed_me(rand_seed=len(self.lookup_dict))

      idx = random.randint(0, len(self.lookup_dict)-1)
      chain_head = list(self.lookup_dict [idx])
      context.extend(chain_head)

      while len(output) < (max_length - self.num_key_words):
        next_choices = self.lookup_dict[tuple(context)]
        if len(next_choices) > 0:
          next_word = random.choice(next_choices)
          context.append(next_word)
          output.append(context.popleft())
        else:
          break
      output.extend(list(context))
    return output


#2

Hi, @marisestp ,

In the generate_text function, change this ...

chain_head = list(self.lookup_dict [idx])

... to this ...

chain_head = list(list(self.lookup_dict.keys())[idx])

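In Python 3.x, this is necessary because the result returned by the keys method cannot be indexed directly; it first has to be converted into a list. As posted, self.lookup_dict [idx] looks up the integer idx as a key, and because lookup_dict is a defaultdict(list), that lookup quietly returns a new empty list, which is why the result is [].

In Python 3.x, unlike Python 2.x, where keys returns a list, the keys method returns what is known as a view. A view can be used to iterate through the keys, but it does not hold its own copy of them and does not support indexing. Passing the view of the keys to the built-in list function returns the list of keys that you need.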
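
To make the explanation above concrete, here is a minimal sketch rather than the project's actual code. The dictionary contents and the helper name pick_chain_head are made up for illustration; defaultdict, random.randint, and list are the same pieces used in the posted cc_markov.py. It shows that indexing a defaultdict(list) with an integer quietly yields an empty list, that a keys view rejects indexing in Python 3.x, and that converting the view to a list gives a usable chain head.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for MarkovChain.lookup_dict: word-pair keys -> next words.
lookup_dict = defaultdict(list)
lookup_dict[("never", "gonna")].append("give")
lookup_dict[("gonna", "give")].append("you")
lookup_dict[("give", "you")].append("up")

# 1. Indexing the defaultdict with an integer looks that integer up as a KEY.
#    The key is missing, so defaultdict(list) hands back a fresh empty list.
probe = defaultdict(list, lookup_dict)  # work on a copy so lookup_dict stays clean
empty_head = list(probe[0])

# 2. A Python 3.x keys view cannot be indexed.
try:
    lookup_dict.keys()[0]
    view_is_indexable = True
except TypeError:
    view_is_indexable = False

# 3. Converting the view to a list makes indexing work.
def pick_chain_head(d):
    """Hypothetical helper: return one randomly chosen key of d as a list of words."""
    idx = random.randint(0, len(d) - 1)
    return list(list(d.keys())[idx])

chain_head = pick_chain_head(lookup_dict)

print(empty_head)         # the empty list that the original code started from
print(view_is_indexable)  # False in Python 3.x
print(chain_head)         # one of the word pairs, as a list
```

If only a random key is needed, random.choice(list(d.keys())) should be an equivalent and shorter way to write the helper.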


#3

Thank you very much, appylpye! This does solve the problem. I vaguely understood that I could not use the keys the way I would use a list, but I had no idea how to fix it, due to my lack of Python knowledge. Thank you for your explanations!


#4

Hi @marisestp ,

Following is some example code that illustrates how, in Python 3.x, altering a dictionary also changes the view of the dictionary, even when the view was created prior to the changes made to the dictionary. See the comments within the code, and compare the output when executing it in Python 2.x to that from executing it in Python 3.x ...

# Menu_Dictionary.py
# Variant on https://www.codecademy.com/en/courses/python-beginner-en-pwmb1/2/2

# Demonstrate the dict keys() method
# Compare output in Python 2.x and Python 3.x

# August 28, 2016

menu = {} # Empty dictionary

# In Python 2.x, we get a list
# In Python 3.x, we get a view
menu_keys = menu.keys()

# If this is executing in Python 3.x,
# notice that altering the menu dictionary object
# also alters the menu_keys view object.

print("menu_keys: {:s}".format(str(menu_keys)))
menu['Chicken Alfredo'] = 11.50 # Adding new key-value pair
print("menu_keys: {:s}".format(str(menu_keys)))
# Your code here: Add some dish-price pairs to 'menu'
menu['Spam'] = 2.39
menu['Eggs'] = 1.79
menu['Ham'] = 9.49
menu['Apple Pie'] = 2.99
menu['Salad Kaek'] = 3.49
menu['Caramelized Onions'] = 1.49
menu['Cole Slaw'] = 2.49
menu['Chicken Bun'] = 9.99
menu['Huge Bag of New Jersey Onions'] = 770.00
print("There are " + str(len(menu)) + " items on the menu.")
menu_list = sorted(menu.keys())
for entree in menu_list:
    print("{:s} -> ${:0.2f}".format(entree, menu[entree]))
print("menu_keys: {:s}".format(str(menu_keys)))

Output from Execution in Python 3.x ...

menu_keys: dict_keys([])
menu_keys: dict_keys(['Chicken Alfredo'])
There are 10 items on the menu.
Apple Pie -> $2.99
Caramelized Onions -> $1.49
Chicken Alfredo -> $11.50
Chicken Bun -> $9.99
Cole Slaw -> $2.49
Eggs -> $1.79
Ham -> $9.49
Huge Bag of New Jersey Onions -> $770.00
Salad Kaek -> $3.49
Spam -> $2.39
menu_keys: dict_keys(['Chicken Alfredo', 'Chicken Bun', 'Caramelized Onions', 'Huge Bag of New Jersey Onions', 'Cole Slaw', 'Ham', 'Apple Pie', 'Eggs', 'Spam', 'Salad Kaek'])
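
For readers who do not have Python 2.x at hand, the contrast can be approximated inside Python 3.x. In the short sketch below, list(menu.keys()) stands in for what keys() returned in Python 2.x, a one-time copy, while menu.keys() is the live view. The variable names snapshot and live_view are my own, and the two menu items are borrowed from the example above.

```python
menu = {}  # Empty dictionary

snapshot = list(menu.keys())  # fixed copy, like the list Python 2.x returned
live_view = menu.keys()       # Python 3.x view, tied to the dictionary

# Both were created before any item was added.
menu['Spam'] = 2.39
menu['Eggs'] = 1.79

print("snapshot:  {}".format(snapshot))           # snapshot:  []
print("live_view: {}".format(sorted(live_view)))  # live_view: ['Eggs', 'Spam']
```

The snapshot stays empty because it was copied before the additions, whereas the view reflects them.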

Have fun with the Markov chain project.


#5

Thanks a lot for all this additional info. It is very clear to me now. Next, I will try the first part of the project: fetching data from the web instead of using a doc.txt.

Thanks a lot! Marise