FAQ: Natural Language Parsing with Regular Expressions - Review

Community FAQs on Codecademy Exercises

This community-built FAQ covers the “Review” exercise from the lesson “Natural Language Parsing with Regular Expressions”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Natural Language Processing

FAQs on the exercise Review

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply below!

Agree with a comment or answer? Like it to up-vote the contribution!


I did the project at https://www.codecademy.com/paths/data-science/tracks/natural-language-processing-dsp/modules/parsing-with-regular-expressions-dsp/projects/nlp-regex-parsing-project
Can someone tell me whether I got it right? There is no reference video for it.

from nltk import pos_tag, RegexpParser
from tokenize_words import word_sentence_tokenize
from chunk_counters import np_chunk_counter, vp_chunk_counter

# import text of choice here
text = open("dorian_gray.txt", encoding='utf-8').read().lower()

# sentence and word tokenize text here
word_tokenized_text = word_sentence_tokenize(text)

# store and print any word tokenized sentence here
single_word_tokenized_sentence = word_tokenized_text[100]
#print(single_word_tokenized_sentence)

# create a list to hold part-of-speech tagged sentences here
pos_tagged_text = []

# create a for loop through each word tokenized sentence here
for word_tokenized_sentence in word_tokenized_text:
  # part-of-speech tag each sentence and append to list of pos-tagged sentences here
  pos_tagged_text.append(pos_tag(word_tokenized_sentence))

# store and print any part-of-speech tagged sentence here
single_pos_sentence = pos_tagged_text[100]
print(single_pos_sentence)

# define noun phrase chunk grammar here

np_chunk_grammar = "NP: {<DT>?<JJ>*<NN>}"

# create noun phrase RegexpParser object here
np_chunk_parser = RegexpParser(np_chunk_grammar)

# define verb phrase chunk grammar here

vp_chunk_grammar = "VP: {<DT>?<JJ>*<NN><VB.*><RB.?>?}"

# create verb phrase RegexpParser object here
vp_chunk_parser = RegexpParser(vp_chunk_grammar)

# create a list to hold noun phrase chunked sentences and a list to hold verb phrase chunked sentences here
np_chunked_text = []
vp_chunked_text = []

# create a for loop through each pos-tagged sentence here
for pos_tagged_sentence in pos_tagged_text:
  # chunk each sentence and append to lists here
  np_chunked_text.append(np_chunk_parser.parse(pos_tagged_sentence))
  vp_chunked_text.append(vp_chunk_parser.parse(pos_tagged_sentence))

# store and print the most common NP-chunks here
most_common_np_chunks = np_chunk_counter(np_chunked_text)
print(most_common_np_chunks)

# store and print the most common VP-chunks here
most_common_vp_chunks = vp_chunk_counter(vp_chunked_text)

print(most_common_vp_chunks)

I want to download the pickle files from somewhere so I can run my code. What’s the way to do that specifically for this portal’s code?

@victoria_dr @lilybird Sorry for tagging you directly, but no one seems to respond on this course (anyone who can help, please feel free to respond). I am currently doing the AlienBot Chatbot Project
https://www.codecademy.com/paths/build-chatbots-with-python/tracks/rule-based-chatbots/modules/rule-based-chatbots/projects/python-chatbot-alienbot
I am at instruction #26, struggling to get past step 1 because somehow the int() and float() functions just never work for me. Maybe I don’t understand the syntax. I COPIED AND PASTED the exact code shown in the hints, even though it was the same as what I had in the first place, which is this:

def cubed_intent(self, number):
  number = int(number)
  cubed_number = number * number * number
  return ("The cube of {number} is {cubed_number}. Isn't that cool?".format(number = number, cubed_number = cubed_number))

But I get the same TypeError every time:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'AlienBot'

I feel like I’ve tried everything at this point, and I didn’t come this far to give up over one silly TypeError. Please help me, guys, I am really frustrated!

This error gives us a lot of information. First, it points out that the error is occurring when we use int(). It also tells us that the argument we supply to int() is not of the correct type. We are currently passing an object of type AlienBot to the function rather than a string, bytes-like object, or number.

You’re using int() on this line, so this is where the error occurs. Make sure you’re passing in a valid data type to the int() function (AlienBot is not a valid data type for the function).
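
To make the error message concrete, here is a minimal sketch that reproduces it outside the project. `Bot` is a hypothetical stand-in class, not the project's AlienBot; the point is only to show which kinds of argument int() accepts and what happens with an arbitrary object.

```python
class Bot:
    """Hypothetical stand-in for AlienBot; it defines no __int__ method."""
    pass

# Types int() can convert: numeric strings, bytes-like objects, and numbers.
from_string = int("42")    # a string of digits
from_bytes = int(b"42")    # a bytes-like object
from_float = int(3.9)      # a float is truncated toward zero
print(from_string, from_bytes, from_float)

# An arbitrary object is rejected with a TypeError that names its class.
try:
    int(Bot())
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```

The exact wording of the message varies slightly between Python versions, but it always ends with the name of the offending class, which is how the traceback points at AlienBot.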


@victoria_dr @lilybird
Thank you Victoria, but I already understand that part of the situation. The data being passed in comes from a statement that the user will input. If there is a number in their input, then the program will run the cubed_intent() function/method (I don’t really know the difference).

The cubed_intent() function will then return the cube of the number entered by the user, or at least it’s supposed to. If the number = int(number) line has a TypeError, then I don’t know how to address it. I’ve looked on other online forums and nothing helpful comes up about this specific TypeError.
I’m obviously a beginner, but I think that the entire reply is being passed into the number parameter of cubed_intent(self, number), as I showed above. How do I stop this, assuming that this is even the issue? I don’t want to give up.
You may look at the entire programme below:

# importing regex and random libraries
import re
import random

class AlienBot:
  # potential negative responses
  negative_responses = ("no", "nope", "nah", "naw", "not a chance", "sorry")
  # keywords for exiting the conversation
  exit_commands = ("quit", "pause", "exit", "goodbye", "bye", "later")
  # random starter questions
  random_questions = (
    "Why are you here? ",
    "Are there many humans like you? ",
    "What do you consume for sustenance? ",
    "Is there intelligent life on this planet? ",
    "Does Earth have a leader? ",
    "What planets have you visited? ",
    "What technology do you have on this planet? "
  )

  def __init__(self):
    self.alienbabble = {'describe_planet_intent': r'.*\s*your planet.*',
                        'answer_why_intent': r'.*why.*are.*you.*',
                        'cubed_intent': r'.*cube.*(\d+)'
                        }

  # Define .greet() below:
  def greet(self):
    self.name = input("Hi! What's your name?")
    will_help = input(f"Hi {self.name}, I'm Etcetera. I'm not from this planet. Will you help me learn about your planet? ")
    if will_help in self.negative_responses:
      print("Ok, have a nice Earth day!")
      return
    self.chat()

  # This makes the user exit if they don't want to continue the convo. Define .make_exit() here:
  def make_exit(self, reply):
    for exit_command in self.exit_commands:
      if exit_command in reply:
        print("Ok, have a nice Earth day!")
        return True

  # This continues the conversation by testing if the user's response does not indicate that they want to exit. Define .chat() next:
  def chat(self):
    reply = input(random.choice(self.random_questions)).lower()
    while not self.make_exit(reply):
      input(self.match_reply(reply))

  # This uses regular expressions defined in the dictionary above to determine what kind of response the user will receive to their reply. Define .match_reply() below:
  def match_reply(self, reply):
    for key, value in self.alienbabble.items():
      intent = key
      regex_pattern = value
      found_match = re.match(regex_pattern, reply)
      if found_match and intent == 'describe_planet_intent':
        return self.describe_planet_intent()
      elif found_match and intent == 'answer_why_intent':
        return self.answer_why_intent()
      elif found_match and intent == 'cubed_intent':
        return self.cubed_intent(self)

  # Define .describe_planet_intent():
  def describe_planet_intent(self):
    responses = ("My planet is a utopia of diverse organisms and species. ", "I am from Opidipus, the capital of the Wayward Galaxies. ")
    return random.choice(responses)

  # Define .answer_why_intent():
  def answer_why_intent(self):
    responses = ("I come in peace. ", "I am here to collect data on your planet and its inhabitants. ", "I heard the coffee is good. ")
    return random.choice(responses)

  # This is the one I'm having issues with please help:(
  #Define .cubed_intent():
  def cubed_intent(self, number):
    number = int(number)
    cubed_number = number * number * number
    return ("The cube of {number} is {cubed_number}. Isn't that cool?".format(number = number, cubed_number = cubed_number))

  # Define .no_match_intent():
  def no_match_intent(self):
    return "Inside .no_match_intent()"

# Create an instance of AlienBot below:
my_bot = AlienBot()
my_bot.greet()

Just two quick things before I delve into the problem. First, you might want to stop tagging lilybird in your replies. Second, please format your code in future replies according to this topic to make it easier to read. Thanks!


Right now, .cubed_intent() takes in 2 parameters: self and number. self is always automatically the object that .cubed_intent() is being called on; we don’t need to explicitly pass this in when we called our method. We do, however, need to pass in a value for number.

You call .cubed_intent() near the bottom of the .match_reply() method like so.

return self.cubed_intent(self)

What this method call does is it calls .cubed_intent() with the arguments self and self. I’ve illustrated this in the diagram below. self is passed into the self parameter because it is the object we are calling this instance method on. self is also passed into the number parameter because it is what we’ve included between the parentheses of our method call.

def cubed_intent(self, number)
                  ^     ^
       ___________|     |
      |                 |
    self.cubed_intent(self)

Because self is an instance of your class AlienBot, you are essentially using the following expression inside .cubed_intent().

number = int(self)

Therefore, you are using the int() function on an object of an invalid data type (which is AlienBot). You need to find a way to get an object with a valid data type that int() can operate on (such as a string, float, etc.) and pass that as the argument for the number parameter.
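
The binding shown in the diagram can also be checked directly. This is a small sketch with a hypothetical `Demo` class (not part of the project): the method simply hands back whatever its two parameters received, so you can see that Python fills in `self` on its own and that anything between the parentheses lands in `number`.

```python
class Demo:
    """Hypothetical class used only to show how arguments bind to parameters."""

    def show_args(self, number):
        # Return what each parameter received so it can be inspected.
        return self, number

demo = Demo()

# Calling through the instance: Python supplies `self` automatically.
first, second = demo.show_args("3")
print(first is demo, second)

# The same call written out explicitly through the class.
first, second = Demo.show_args(demo, "3")
print(first is demo, second)

# Putting the instance between the parentheses binds it to `number` too,
# which mirrors self.cubed_intent(self).
first, second = demo.show_args(demo)
print(first is demo, second is demo)
```

In the last call both parameters refer to the same object, which is why int(number) ends up being applied to the bot itself.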


This might be a helpful article to help you understand the difference between functions and methods in Python.

@victoria_dr @lilybird I’m still trying to figure it out. Can you look at the code and tell me where I went wrong? I just want to move on with understanding. I’m doing exactly what the instructions say, and the “incorrect line of code” is the one suggested in the hints.

It only works when I manually type the number into the method call:
elif found_match and intent == 'cubed_intent':
  return self.cubed_intent(3)

How is that even automated??? Anyone please assist. I don’t care if you work for Codecademy or not anything helps.

THANK YOU SO MUCH VICTORIA! This really helped a lot. It confirmed what I thought was the problem.

I still need to figure out how to separate the number from the rest of the string entered by the user and pass it as the argument of the function. It works perfectly when I call it with a hard-coded number, but then that number and its output stay the same even when the user enters other numbers.

UPDATE:
VICTORIA I DID IT! :laughing:
So I parsed the reply for the number using the search function. I then saved that as a variable and passed it into the function as an argument.
elif found_match and intent == 'cubed_intent':
  result = re.search(r'(\d+)', reply).group(0)
  return self.cubed_intent(result)
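
One detail that may be worth knowing about this fix: the pattern stored for cubed_intent, r'.*cube.*(\d+)', has a greedy .* right before the capture group, so reusing found_match.group(1) would keep only the last digit of a multi-digit number. The sketch below compares that with the re.search approach above. The sample reply and the `cube_text` helper are made up for illustration and are not part of the project.

```python
import re

reply = "what is the cube of 123"  # made-up sample input

# Pattern from the alienbabble dictionary: the greedy ".*" before the group
# takes as many characters as it can, leaving a single digit for (\d+).
intent_match = re.match(r'.*cube.*(\d+)', reply)
captured = intent_match.group(1)

# Searching the reply for the first run of digits returns the whole number.
searched = re.search(r'\d+', reply).group(0)

print(captured)   # 3
print(searched)   # 123

def cube_text(number):
    """Hypothetical helper mirroring cubed_intent without the class."""
    number = int(number)
    return f"The cube of {number} is {number ** 3}."

print(cube_text(searched))  # The cube of 123 is 1860867.
```

If you would rather take the number from the intent match itself, making the second wildcard lazy, as in r'.*cube.*?(\d+)', should let the group capture all of the digits.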

That’s great! Nice to see that you solved the issue; happy coding!
