Censor Dispenser Challenge Project (Python)

gerardomtz_h · April 8, 2020, 6:17am

https://gist.github.com/codecademydev/735dfcf26a48d91048f32d48599c2e03

script.py

#The open() function is opening the text file that the emails are contained in and the .read() method is allowing us to save their contexts to the following variables:
email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()

def censor_word(email, word):
  email = email.casefold()
  word = word.casefold()
  if word in email:

This file has been truncated.

text2239032983 · April 8, 2020, 12:39pm

Hi there

Call the censoring functions and pass the corresponding email variable as the argument; it holds the text that should be censored.

Also, you and everyone else: do not forget to comment your code, so that either you later or anyone else can understand what you were trying to do.

Happy learning!

gerardomtz_h · April 8, 2020, 8:10pm

You are right about commenting the code. Thanks for the advice!
It seems very logical to do, but I wasn’t considering it.

But I still can’t modify the text of the email. Do you have any suggestions?
This happens even if I just write (without passing the email to the function):

email_one.replace('learning algorithms', '***')
print(email_one)

wanjapm · April 8, 2020, 8:43pm

https://github.com/wanjapm/censor_dispenser/blob/master/censor_dispenser.py
Very interesting challenge.

I took a bit of time trying to get a good final version.
Advantages of this code (after comparing with solution):

- Commonly used lines of code have been made into smaller reusable functions, so the code is easy to read and use.
- Friendly variable names.
- The paragraphs are maintained, so the text is censored neatly (Questions 4 and 5).

gerardomtz_h · April 9, 2020, 8:03am

Working Code

https://gist.github.com/codecademydev/f6eae6c61194a9580969063035333b6d

script.py

# These are the emails you will be censoring. '''Censors emails, to keep valuable information hidden'''
#The open() function is opening the text file that the emails are contained in and the .read() method is allowing us to save their contexts to the following variables:
email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()

def censor_word(email, word):
  '''Censors a single word/phrase and returns email'''
  censor = ''

This file has been truncated.

bit9931108677 · April 9, 2020, 12:09pm

Here is my final solution. I’m really happy with the first two functions (redact_proprietary and redact_negative), they handle censoring quite elegantly and don’t care about case. Word ‘subsets’ (e.g. ‘danger’ is subset of ‘dangerous’) do give it some trouble, but even so I think it’s ok.

The third solution redact_all works, even if it is not optimized. Here, problems arise when terms are phrases instead of single words, as .split() doesn’t really handle those well.
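The 'subset' problem mentioned above (for example 'danger' inside 'dangerous') can also be sidestepped by sorting the terms by length instead of reordering the list by hand. The following is only a minimal sketch of that idea: `redact_terms` is a hypothetical helper, not the function from the gist, and it is case-sensitive to keep the example short.

```python
def redact_terms(text, terms, replacer="[REDACTED]"):
    """Replace every term in text, trying longer terms before shorter ones.

    Hypothetical helper for illustration; not taken from the gist above.
    """
    # Longest first: "dangerous" is replaced before "danger", so the
    # shorter term can no longer split the longer one.
    for term in sorted(terms, key=len, reverse=True):
        text = text.replace(term, replacer)
    return text


sample = "This is dangerous. The danger is real."
print(redact_terms(sample, ["danger", "dangerous"]))
# This is [REDACTED]. The [REDACTED] is real.
```

Because the sort happens inside the function, the order in which the terms are written in the list stops mattering.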

https://gist.github.com/codecademydev/77da1523c1031ea0e5f67f537caa7941

script.py

# These are the emails you will be censoring. The open() function is opening the text file that the emails are contained in and the .read() method is allowing us to save their contexts to the following variables:
email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()

redact_replacer = "[REDACTED]"
proprietary_terms = ["she", "personality matrix", "sense of self", "self-preservation", "learning algorithm", "her", "herself"]
negative_words = ["concerned", "behind", "dangerous", "danger",  "alarming", "alarmed", "out of control", "help", "unhappy", "bad", "upset", "awful", "broken", "damaging", "damage",  "dismal", "distressed", "distressing", "concerning", "horrible", "horribly", "questionable"]
# Note that the order of negative words has been changed such that terms that are 'subset' of other

This file has been truncated.

dyrits · April 9, 2020, 1:20pm

You can find mine on my GitHub : https://github.com/Dyrits/Python-CC-ChallengeProjects/tree/master/Censor%20Dispenser

The hardest part was to exclude the punctuation.

goananda · April 9, 2020, 4:36pm

Finally I made one universal function for all the cases:

https://gist.github.com/codecademydev/4f217aad83bd7a23cf0b0c90a2b08a3e

script.py

def censor_text(text, phrases, limit=0, near=False, letters="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'"):

  # Words and word positions
  words = []
  word_positions = []
  letter_inds = [i for i in range(len(text)) if text[i] in letters]
  noletter_inds = [i for i in range(len(text)) if not i in letter_inds]
  point = -1
  while [i for i in letter_inds if i > point]:
    first_letter = [i for i in letter_inds if i > point][0]

This file has been truncated.

dyrits · April 9, 2020, 8:29pm

I’ve tried to test many projects from others, and nearly none of them had the same result I got. So, I don’t know if I am the one being wrong or not, or if I misunderstood the instructions.

text2239032983 · April 9, 2020, 9:06pm

replace does not edit the string itself; it just returns an edited string. So you should assign the result of ‘email.replace(…)’ to a new variable and print that new variable.

e.g.
new_variable = email_one.replace('learning algorithms', '*')
print(new_variable)

Also, if you are struggling with punctuation, try using the ‘in’ keyword in your if statements rather than ‘==’; then you won’t have to remove punctuation.
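To make the difference concrete, here is a small sketch comparing the two checks on tokens produced by .split(). Both function names are made up for this example. Note that `in` also matches longer words that merely contain the term, so stripping punctuation before comparing with `==` is shown as an alternative.

```python
import string


def contains_term(token, term):
    """The 'in' check: true whenever term appears anywhere inside token."""
    return term.lower() in token.lower()


def equals_term(token, term):
    """The '==' check, after stripping punctuation from both ends of token."""
    return token.strip(string.punctuation).lower() == term.lower()


for token in ["help", "help!", "Help,", "helpful"]:
    print(token, contains_term(token, "help"), equals_term(token, "help"))
# help True True
# help! True True
# Help, True True
# helpful True False
```

Which check is right depends on whether words like "helpful" should count as a match.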

array4101605555 · April 9, 2020, 9:18pm

So far my code works, but it’s longer than the River Nile. Does Python allow you to index variable names? Seems like this whole project could be done in a couple of loops if you could pass modified strings from the bottom of a loop back to the top with a modified name. E.g., censor “she” out of the original text, (“text0”), and call that new text “text1.” Then push “text1” through the loop to censor out “her”, and that becomes “text2,” and so on till you’ve got all the proprietary words, negative words, and whatnot filtered out. Indexing variables was straightforward in the few languages I used in the Precambrian Era, but the topic hasn’t been covered here yet. Thanks.
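One common way to get the effect described above, without numbered variable names, is to reassign a single variable inside the loop, or to keep the intermediate texts in a list and index into it. This is a minimal sketch under those assumptions; `censor_all` and the sample sentence are invented for illustration.

```python
def censor_all(text, terms, mask="***"):
    """Feed the result of each replacement into the next pass of the loop."""
    for term in terms:
        # Reassigning `text` plays the role of text0 -> text1 -> text2.
        text = text.replace(term, mask)
    return text


print(censor_all("she said her name", ["she", "her"]))
# *** said *** name

# If the intermediate versions are needed, a list gives indexed access.
versions = ["she said her name"]          # versions[0] is the original text
for term in ["she", "her"]:
    versions.append(versions[-1].replace(term, "***"))
print(versions[1])
# *** said her name
```

A list (or a dictionary) is the usual Python answer when a program seems to need variable names with a number in them.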

goananda · April 10, 2020, 11:03pm

I improved my solution, making it simpler and shorter.
Additionally, you can set how many words before and after a word from the censor list you want to censor.

https://gist.github.com/codecademydev/eb9beef415aa9be4db5db923b4276ad2

script.py

def censor_text(text, phrases, limit=0, del_near=0, letters="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'"):
  
  # Words and word positions
  words = []
  word_positions = []
  text += "."
  letter_inds = [i for i in range(len(text)) if text[i] in letters]
  noletter_inds = [i for i in range(len(text)) if not i in letter_inds]
  point = -1
  while [i for i in letter_inds if i > point]:

This file has been truncated.

8314154157 · April 12, 2020, 2:46am

I have not been able to get a good hang of the project, which is okay. The most important thing is that it is a very good learning process. I have looked through Codecademy’s solution. I am writing this to appeal to Codecademy to please do a breakdown of the 'whys' and 'hows' of their solution. This would be very helpful, especially for students at the same level as me or below who do not yet really understand what’s going on under the hood. Plus, it would be a helpful and motivating factor to try other projects. I am sure you won’t like us to be demotivated.

Thank you

marquis_in_lv · April 13, 2020, 10:37pm

Yes, I would like to see this too. I peeked at the solution for email two and I would never have landed there on my own. Having some explanation or notes in the file would help us extrapolate some of the functionality to future lessons and uses.

For instance, in the email two solution, can someone explain what those for loops are doing:

for word in censored_list:
    censored_word = ""
    for x in range(0, len(word)):
      if word[x] == " ":
        censored_word = censored_word + " "
      else:
        censored_word = censored_word + "X"

Is it evaluating if each character of the strings in censored_list is a space? I don’t understand fully why it would need to do that.

I’m just having difficulty parsing the syntax on this one…

bit9931108677 · April 14, 2020, 8:48am

I’m by no means an expert, but here’s my understanding.

for word in censored_list: # Loop through individual words (strings) in the to-be-censored list
    censored_word = "" # Create a variable that will replace the word being censored
    for x in range(0, len(word)): # Create a list that is the same length of as the word being checked
      if word[x] == " ": # If the index of the string is a space, replace it with space
        censored_word = censored_word + " "
      else: # Else replace the index of the string with an X
        censored_word = censored_word + "X"

Also note that in reality, string indexes are not being replaced, as strings are immutable. Instead, the code builds a new string (censored_word) to take the place of the old one.
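The same mask can be built without indexes by walking over the characters directly. This is only a sketch of an equivalent approach, with `mask_phrase` as a made-up name; it keeps spaces so that a multi-word phrase still looks like separate words after censoring.

```python
def mask_phrase(phrase, mask_char="X"):
    """Return a new string with every non-space character replaced."""
    # Strings are immutable, so the mask is assembled as a fresh string.
    return "".join(" " if ch == " " else mask_char for ch in phrase)


original = "learning algorithms"
masked = mask_phrase(original)
print(masked)
# XXXXXXXX XXXXXXXXXX
print(original)
# learning algorithms
```

The original string is untouched after the call, which is the immutability point made above.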

marquis_in_lv · April 14, 2020, 8:23pm

Thank you for this. I am going to chew on this for a bit and see if this helps me tackle email three and four.

One question: for the line below, is it creating a list or a string?

for x in range(0, len(word)): # Create a list that is the same length of as the word being checked

goananda · April 14, 2020, 10:38pm

It creates a loop where x iterates through the integers 0, 1, 2, …, len(word) - 1.
So the next line of code (if word[x] == " ") checks whether each character of the string variable word (word[0], word[1], etc.) is a space.
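A quick way to check this is to inspect the range directly. The sketch below assumes Python 3, where range() produces a range object rather than a list; the sample word is arbitrary.

```python
word = "to do"

# range() yields the positions 0 .. len(word) - 1 one at a time.
indexes = range(0, len(word))
print(type(indexes).__name__)
# range
print(list(indexes))
# [0, 1, 2, 3, 4]

# Looking up word[x] for each position visits the same characters
# as looping over the string directly.
by_index = [word[x] for x in indexes]
by_char = [ch for ch in word]
print(by_index == by_char)
# True
```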

liagroza · April 17, 2020, 3:22pm

Hi everybody!

I’ve successfully done this challenge project. Starting it, I decided to take a more general approach such that any input text will be successfully censored. Therefore the utility of the algorithms increases.
There are a few things this code does that you should take into consideration when you start this project:
- The censor should be case-insensitive (for instance, “lEArNing AlgoRITHms” should be censored; that way, Mocking SpongeBob will be proud of you).
- The original text should be preserved (meaning that spaces, newlines, and punctuation should be neither lost nor added).
- Words that merely contain words from the given lists should not be censored in any way (we can clearly see that in the second email “researchers” would otherwise turn into “researc***s” just because it contains “her”).
- The censor should preserve the length of the initial word and substitute only the alpha characters.
There are two things this algorithm fails to do:
- Treating short forms such as “can’t” and “won’t” as single words: “can” and “won” are considered instead.
- Censoring both the word before and the word after when two words from the lists are consecutive. For instance, “Send help, Helena!” turns into “Send ****, ******!” instead of “**** ****, ******!” because Helena (from proprietary_terms) and the word before it are censored first, so “help” is no longer seen as a negative word.
That being said, I hope my code will be helpful. Have fun coding!
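The four requirements listed above can be sketched with the standard `re` module. This is not the code from the linked repository, only an illustration under stated assumptions: `censor_whole_words` is a hypothetical name, word boundaries are approximated with `\b`, and the short-form limitation remains (the “can” in “can’t” still counts as a whole word).

```python
import re


def censor_whole_words(text, terms, mask_char="*"):
    """Mask whole-word, case-insensitive matches, keeping length and layout."""
    # Longest terms first so "herself" is tried before "her".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )

    def mask(match):
        # Replace letters only; spaces, hyphens and punctuation stay as they are.
        return "".join(mask_char if ch.isalpha() else ch for ch in match.group())

    return pattern.sub(mask, text)


sample = "The researchers said HER Learning Algorithms work."
print(censor_whole_words(sample, ["her", "learning algorithms"]))
# The researchers said *** ******** ********** work.
```

Here “researchers” is left alone because the “her” inside it is not surrounded by word boundaries.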

github.com

LiaGroza/Censor-Dispenser/blob/master/censor_dispenser.py

'''
Created by Iulia Groza, April 2020
'''

# These are the emails you will be censoring. The open() function is opening the text file that the emails are contained in and the .read() method is allowing us to save their contexts to the following variables:
email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()


#censoring the phrase by maintaining its length and spaces
def censored(phrase):
    censor = ""
    for char in phrase:
        if char == " ":
            censor += " "
        else:
            censor += "*"
    return censor

This file has been truncated.

ani1238 · April 20, 2020, 12:05pm

I don’t understand how to censor words that have capitals in them.
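One possible approach, sketched here under the assumption that the emails are plain English text, is to search a lowercased copy of the email and then cut the matching positions out of the original. The name `censor_ignore_case` is made up for this example.

```python
def censor_ignore_case(text, term, mask_char="*"):
    """Mask every occurrence of term in text, whatever its capitalization.

    Assumes lowercasing does not change the length of the text, which
    holds for plain English but not for every Unicode character.
    """
    lowered_text = text.lower()
    lowered_term = term.lower()
    if not lowered_term:
        return text

    pieces = []
    start = 0
    while True:
        index = lowered_text.find(lowered_term, start)
        if index == -1:
            pieces.append(text[start:])       # keep the remainder as-is
            break
        pieces.append(text[start:index])      # untouched text before the match
        pieces.append(mask_char * len(term))  # the mask replaces the match
        start = index + len(term)
    return "".join(pieces)


print(censor_ignore_case("Her plan, HER idea", "her"))
# *** plan, *** idea
```

Only the search is done on the lowercased copy, so the capitalization of everything that is not censored is preserved.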

5289371177 · April 20, 2020, 4:58pm

github.com

SyrovatkaA/Codecademy-Censor_Dispenser/blob/master/censor_dispenser_solution.py

# These are the emails you will be censoring. The open() function is opening the text file that the emails are contained in and the .read() method is allowing us to save their contexts to the following variables:
email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()

def censor_one(email, censor):
  censorer = ''
  for letter in censor:
    if letter == ' ':
      censorer += ' '
    else:
      censorer += '█'
  email_censored = email.replace(censor, censorer)
  return email_censored

print(censor_one(email_one, 'learning algorithms'))
print('\n\n')

proprietary_terms = ["she", "personality matrix", "sense of self", "self-preservation", "learning algorithm", "her", "herself"]

This file has been truncated.