Censor Dispenser Challenge Project (Python)

Not the most elegant solution, I believe, but it worked with everything I threw at it :slight_smile:
cheers guys <3


It took me a while to complete this project :thinking:. Please let me know if there's anything I can improve, and hopefully my code will help you in some way. :grinning:

Hello, everybody

Here is my take on this challenge. I still have work to do, like handling upper- and lowercase and better identifying words within words; however, it fulfills the tasks.

I updated the code to include upper, lower, and title case handling.
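For anyone curious, here is a minimal sketch of the case-handling idea (made-up names, not my exact code): generate the common case variants of each term and replace them all.

def censor_term(text, term):
    # cover the usual spellings: as given, lowercase, UPPERCASE, Title Case
    for variant in {term, term.lower(), term.upper(), term.title()}:
        text = text.replace(variant, "*" * len(variant))
    return text

print(censor_term("She knows her own Sense Of Self.", "sense of self"))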

Hello,
One question about your code, in code block 2, the function for email_two.
What is the purpose of the following syntax? Especially this element, "[0]":
objectLst = ['']
objectLst[0] = textToScan

I may be "off base" here, but I placed the string textToScan in the list as the first element because strings in Python are immutable, while a list element can be re-bound to a new string. The element objectLst[0] refers to the empty string at index 0 of the list, the one I defined with objectLst = ['']. Not to suggest you weren't aware, but just in case: list indices begin at zero, so in a list with five elements the indices run from 0 to 4, with 0 representing the first element and 4 the last. As to why I didn't just write objectLst = [textToScan], well, I'm not a great lateral thinker.
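For what it's worth, a minimal sketch (made-up strings) of the difference in play here: the string itself is immutable either way, but the list slot can be re-bound to a brand-new string.

text = "learning algorithms are here"
objectLst = ['']             # start with an empty string in slot 0
objectLst[0] = text          # re-bind slot 0 to the email text

# replace() never changes the original string; it builds a new one,
# and we store that new string back into the same list slot.
objectLst[0] = objectLst[0].replace("learning algorithms",
                                    "*" * len("learning algorithms"))
print(objectLst[0])   # the phrase is starred out, the rest of the text is unchanged
print(text)           # learning algorithms are here  (the original string is untouched)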

Thank you for the response.


Hi. This is my first study of coding. I'm just past 25% of the Computer Science path, and I'm trying to use only what has been shown to me so far.

My question is about the "her", "herself", "researchers" problem. I checked most of the solutions presented here, and they still leave researc***s because of the "her" in that word… or ***self because of the "her" in "herself".

I solved the her/herself issue by slicing the original list of words and reintroducing them with "herself" first. This way it comes first in the iteration, and "her" is checked only after "herself" has already been changed to *******.

But I am having a hard time thinking about the her/researchers problem. Any suggestions?
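The general form of my herself-first reordering would be to sort the whole list longest-first (terms.sort(key=len, reverse=True)) before looping, so longer terms are always censored before the shorter terms hiding inside them. That doesn't help with researchers, though, since there we don't want "her" censored at all. One possible approach, if the re module is an option, is whole-word matching with \b word boundaries, for example:

import re

text = "The researchers said her results were fine."
for term in ["herself", "her", "she"]:
    # \b matches only at word boundaries, so "her" inside "researchers" is left alone
    text = re.sub(r"\b" + re.escape(term) + r"\b", "*" * len(term), text)
print(text)   # The researchers said *** results were fine.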


I also noted that the solution proposed at the top of this topic has these same difficulties, leaving results like XXXXXs.

"herself" was not eliminated; it ended up as xxxself.

And the solution for email three censored only the name Helena when I ran it.

One last thing. I might have misunderstood the point of step 4: "Write a function that can censor any occurrence of a word from the "negative words" list after any "negative" word has occurred twice, (…)"

But no word from the list occurs more than once in email three, and most simply don't occur at all.

PS: the last function works well for emails three and four, but one and two still contain some of the words, like "learning algorithm".

Am I running this wrong? (I used Jupyter to run it.) I made it work for emails one and two, but I stopped at three because of this confusion. There is no way to know whether the requested function works, since the email does not contain more than two occurrences of any of the terms…

Could somebody tell me if I misunderstood?
And sorry for bringing so many questions and unknowns without proposing solutions.


I don't know how, but a week after starting the project, I got there. I decided to bite off more than I could chew by using OOP techniques. I've done my best to use comments to explain what does what. Hopefully it's all pretty clear.

OK, here is mine:

# These are the emails you will be censoring. The open() function opens the text file each email is contained in, and the .read() method lets us save its contents to the following variables:
email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()

def blur(word):
  temp = ['*' for i in range(len(word))]
  blurred = ""
  for char in temp:
    blurred = blurred + char
  return blurred

def censor(text, phrase):
  if text.find(phrase) > -1:
    return text.replace(phrase, blur(phrase))
  return text

def censor_phrases(text, phrases):
  censored = text
  phrases.sort(key=len, reverse=True)
  for phrase in phrases:
    if censored.find(phrase) > -1:
      censored = censor(censored, phrase)
  return censored

def censor_negative_words(text, words):
  return censor_phrases(text, words)
  
def censor_all(text, negatives, terms):
  censored = censor_phrases(text, terms)
  censored = censor_negative_words(censored, negatives)
  censors = censored.split()
  if len(censors) < 2:
    return censored
  # If the words two positions apart are both censored, blur the word between them.
  for i in range(len(censors)):
    if i >= 2:
      if (censors[i].find('***') > -1) and (censors[i-2].find('***') > -1):
        censors[i-1] = blur(censors[i-1])
  return ' '.join(censors)

censored = censor(email_one, "learning algorithms")
print(censored)

proprietary_terms = ["she", "personality matrix", "sense of self", "self-preservation", "learning algorithm", "her", "herself"]

censored = censor_phrases(email_two, proprietary_terms)
print(censored)

negative_words = ["concerned", "behind", "danger", "dangerous", "alarming", "alarmed", "out of control", "help", "unhappy", "bad", "upset", "awful", "broken", "damage", "damaging", "dismal", "distressed", "distressed", "concerning", "horrible", "horribly", "questionable"]
censored = censor_negative_words(email_three, negative_words)
print(censored)

censored = censor_all(email_four, negative_words, proprietary_terms)
print(censored)

Not super pretty, but the code works OK. I learned a ton working on it. I kept coming across hints to use regular expressions, but avoided them for the sake of learning. Anyway, done, and time to move on to the next project.

https://gist.github.com/codecademydev/3d39d3ad65a1616cd6ef7a09f2ec892d

# These are the emails you will be censoring. The open() function opens the text file each email is contained in, and the .read() method lets us save its contents to the following variables:

email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()

# Define lists of words
remove_word = "learning algorithms"
proprietary_terms = ["she", "personality matrix", "sense of self", "self-preservation", "learning algorithm", "her", "herself"]
negative_words = ["concerned", "behind", "danger", "dangerous", "alarming", "alarmed", "out of control", "help", "unhappy", "bad", "upset", "awful", "broken", "damage", "damaging", "dismal", "distressed", "distressed", "concerning", "horrible", "horribly", "questionable"]
combined_list = proprietary_terms + negative_words

# Original solution. Did not take into account replacing words with # of the same length.

def redact_words(original_word):
    if len(original_word) < 1:
        return 
    original_cases = [] # List of all versions of redacted words.
    original_cases.append(original_word) # Normal spelling of the word.
    original_cases.append(original_word.upper()) # UPPERCASE spelling of the word.
    original_cases.append(original_word.lower()) # lowercase spelling of the word.
    original_cases.append(original_word.title()) # Title Case spelling of the word.    
    
    for word in original_cases:
        redact_phrase = ""
        for i in word:
            if i != " ":
              redact_phrase += "#" # Will end as same length as word or phrase. 
            elif i == " ":
                redact_phrase += ' ' # Will skip blank spaces.
        return redact_phrase
    
#print(redact_words("out of control"))

def censor_single_word(single_word, email):
    censored_email = email
    returned_redaction = redact_words(single_word)
    censored_email = censored_email.replace(single_word, returned_redaction)
    return censored_email

#print(censor_single_word(remove_word, email_one))

def censor_proprietary(lst_of_words, email):
    censored_email_two = email
    for word in lst_of_words:
        word1 = " " + word + " " # Checks for words on both sides.
        returned_redaction = redact_words(word1)
        censored_email_two = censored_email_two.replace(word1, returned_redaction)
        
        word2 = " " + word # Checks for words with space before
        returned_redaction = redact_words(word2)
        censored_email_two = censored_email_two.replace(word2, returned_redaction)   
        
        word3 =  word + " " # Checks for words with space after
        returned_redaction = redact_words(word3)
        censored_email_two = censored_email_two.replace(word3, returned_redaction) 
        
        word4 =  "#s" # Checks for pluralized words.
        returned_redaction = redact_words(word4)
        censored_email_two = censored_email_two.replace(word4, returned_redaction)              
               
    return censored_email_two

# print(censor_proprietary(proprietary_terms, email_two))

def censor_negative(lst_of_words, email):
    censored_email_two = email
    censored_words = []
    
    # Create list of censored words in the email and sort from longest to shortest.
    for term in lst_of_words: 
        if email.find(term) >= 0:
            censored_words.append(term)
    censored_words.sort(key=len, reverse = True)     
    # print(censored_words)
    
    # Create list of beginning index number for each word.
    index=[]
    for term in censored_words:
        index.append(email.find(term))  
    
    
    # Combining the two lists to one, with index number first. Sorting by index number. Using the third index to begin our censoring.
    censored_index_list = [[i, j] for i, j in zip(index, censored_words)]    
    censored_index_list.sort()
    #print(censored_index_list)
    new_list = [item[0] for item in censored_index_list]
    #print(new_list)
    
    #Removing first item if list is greater than 3
    nu = 0
    mu = 0
    for c in new_list:
        if len(new_list) > 3:
            while mu < 1:
                if new_list[nu] != new_list[nu + 1]:
                    del censored_index_list[0]
                nu += 1
                mu += 1     
    print(censored_index_list)
    
    # Updated list skipping the first occurrence. 
    new_censored_list = [item[1] for item in censored_index_list]
    new_censored_list.sort(key=len, reverse = True)  
    print(new_censored_list)
    
    for word in new_censored_list:
        word1 = " " + word + " " # Checks for words on both sides.
        returned_redaction = redact_words(word1)
        censored_email_two = censored_email_two.replace(word1, returned_redaction)
        
        word2 = " " + word # Checks for words with space before
        returned_redaction = redact_words(word2)
        censored_email_two = censored_email_two.replace(word2, returned_redaction)   
        
        word3 =  word + " " # Checks for words with space after
        returned_redaction = redact_words(word3)
        censored_email_two = censored_email_two.replace(word3, returned_redaction) 
        
        word4 =  "#s" # Checks for pluralized words.
        returned_redaction = redact_words(word4)
        censored_email_two = censored_email_two.replace(word4, returned_redaction)              
               
    return censored_email_two    

#print(censor_negative(negative_words, email_three))

def censor_it_all (lst_of_words, email):
    censored_email_two = email
    #print(lst_of_words)
    censored_words = []   
    
    # Create list of censored words in the email and sort from longest to shortest.
    for term in lst_of_words: 
        if email.find(term) >= 0:
            censored_words.append(term)
    censored_words.sort(key=len, reverse = True)     
    #print(censored_words)
    
    # Create list of words in all possibly spelling configurations
    original_cases_all = [] # List of all versions of redacted words.
    for word in censored_words:
        if len(word) < 1:
            return
        original_cases_all.append(word) # Normal spelling of the word.
        original_cases_all.append(word.upper()) # UPPERCASE spelling of the word.
        original_cases_all.append(word.lower()) # lowercase spelling of the word.
        original_cases_all.append(word.title()) # Title Case spelling of the word.      
    print(original_cases_all)    
         
    for term in original_cases_all:
        word1 = " " + term + " " # Checks for words with space on both sides.
        returned_redaction = redact_words(word1)
        censored_email_two = censored_email_two.replace(word1, returned_redaction)
        
        word2 = " " + term # Checks for words with space before
        returned_redaction = redact_words(word2)
        censored_email_two = censored_email_two.replace(word2, returned_redaction)   
        
        word3 =  term + " " # Checks for words with space after
        returned_redaction = redact_words(word3)
        censored_email_two = censored_email_two.replace(word3, returned_redaction) 
        
        word4 =  "#s" # Checks for pluralized words.
        returned_redaction = redact_words(word4)
        censored_email_two = censored_email_two.replace(word4, returned_redaction)              
               
    
    split_list_lines = censored_email_two.split('\n')
    final_string = ""
    for line in split_list_lines:
        split_list_words = line.split()
        #print(split_list_words)
        
        for word in split_list_words:
            if '#' in word and word != split_list_words[0]:
                index_of_word = split_list_words.index(word) - 1
                if "#" not in split_list_words[index_of_word]:
                    censor_phrase_before = ""
                    before = split_list_words[index_of_word]
                    for i in before:
                        censor_phrase_before += "#" # Will end as same length as word or phrase.    
                    split_list_words[index_of_word] = censor_phrase_before  

        for word in split_list_words:
            if '#' in word and word != split_list_words[-1]:
                index_of_word = split_list_words.index(word) + 1
                if "#" not in split_list_words[index_of_word]:
                    censor_phrase_after = ""
                    after = split_list_words[index_of_word]            
                    for i in after:
                        censor_phrase_after += "#" # Will end as same length as word or phrase.    
                    split_list_words[index_of_word] = censor_phrase_after  
        
        
        #print(split_list_words)  
        new_string =  " ".join(split_list_words)
        final_string = final_string + "\n" + new_string

        # new_string2 = "".join(new_string)
        # print(new_string2)


    return final_string
print(censor_it_all(combined_list, email_four))   


The OP solution posted Jan 18, 2020 does not fully solve the problem, leaving ‘personality matrix’ and ‘Helena’ (next to a banned word) intact.

Here is my solution. Words censored because they appear in the two lists provided are marked with '+', and those censored because they neighbor a censored word are marked with '-', to make it clearer what the code has accomplished.

Hey everybody, I have a simple, stupid question -_-; honestly, I don't understand the challenge yet.
What do they want? Can anyone answer me, please? I need to solve this to get to the next step in my path.

Looking forward to feedback: https://gist.github.com/a1d700b0385ac37e28770fb4b320d073


I don't understand how to fix this. It prints out a new email with the appropriate censoring for each term individually. How can I consolidate this so that all the censored words appear in a single email?

Shown are the first email and part of the second: the first has all instances of "she" replaced, the second all instances of "personality matrix" (but not "she"), and so on.
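In case it helps: the usual cause of that symptom is calling replace() on the original email inside the loop, so each term starts from a fresh copy. Keeping one working copy and replacing into it accumulates all of the censoring in a single result. A minimal sketch (hypothetical names):

def censor_terms(email, terms):
    censored = email                        # one working copy of the original
    for term in terms:
        # replace into the same string, so earlier censoring is kept
        censored = censored.replace(term, "*" * len(term))
    return censored                         # a single email with every term censored

print(censor_terms(email_two, proprietary_terms))   # e.g. with the lists defined earlier in the thread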

Hello Guys,

Greetings from Riccardo! I started coding just a few weeks ago… and when I feel blocked, I look on the web to find solutions to my problems.

Attached you can find my Jupyter notebook containing my Censor Dispenser Challenge Project.

I used a lot of "manual" coding, so I would be very happy if someone could help me make my code more "automatic", handling punctuation and uppercase and lowercase letters as well.

I used the enumerate() function, but I don't really know how it works, so if somebody can also help me understand it, and why I used [:-3] as the slice bound, I will be very happy too!

Greetings from Italy!

Thanks in advance!
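In case it helps with the enumerate() question: enumerate() just pairs each element with its index while you loop, and a slice like lst[:-3] is everything except the last three elements. A tiny example:

words = ["the", "personality", "matrix", "is", "broken"]
for i, word in enumerate(words):
    print(i, word)        # 0 the / 1 personality / 2 matrix / 3 is / 4 broken

print(words[:-3])         # ['the', 'personality'] -- everything except the last three elements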

Hey colleagues, I'm finally done; it took me a couple of hours of trying various things.

Here is my github project: https://github.com/komnen0v1c/CC-Censor-Dispenser

Regards!

Hello everyone!

I've been working on Python 3 for a few weeks now, and finished the course today. As such, I figured I'd try this project to learn some more. I've got the first two emails down, but unfortunately I'm running into problems on the third. My code is shared below; I have some questions regarding it.

https://gist.github.com/d51b3f553e035db8ba7443215eac412d

Questions:

  1. The negative_counter in redact does not seem to work correctly. The counter should be incremented the first and second time a word from negative_words is found (negative_counter < 2), but instead the code moves to the elif statement and replaces the word. Why is this?
  2. If I were to use this slicing method of replacing words, how would I go about handling letter cases? I would want to do it without calling .lower() on the whole document, as that makes the output less readable. (A rough sketch follows below.)
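Without seeing the gist it's hard to say why the elif fires too early, but here is a rough sketch (made-up names, word-by-word only, multi-word phrases ignored) of the counter behaviour described in question 1. It also shows one answer to question 2: lowercase each word only for the comparison, and write the replacement back into the original words, so the rest of the email keeps its casing.

def censor_after_two(email, negative_words):
    words = email.split()
    seen = 0
    for i, word in enumerate(words):
        # lowercase (and strip trailing punctuation) only for the comparison
        if word.lower().strip(".,;:!?") in negative_words:
            seen += 1
            if seen > 2:                      # let the first two occurrences through
                words[i] = "*" * len(word)    # censor the third and later ones
    # note: split()/join collapses the original spacing and newlines
    return " ".join(words)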

How can I check whether my functions have successfully modified the emails?

I see that the emails are opened in read mode and saved to variables. Are the variables copies of the files that can be modified, or does printing them just show me the original file?

email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()
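.read() returns the file's contents as a plain string, so each of these variables is an in-memory copy: you can build censored versions from it freely, and nothing you do to the string touches the .txt file on disk. To check whether a censoring function did anything, print or compare its return value, for example:

email_one = open("email_one.txt", "r").read()        # a str copy of the file's contents

censored = email_one.replace("learning algorithms",
                             "*" * len("learning algorithms"))

print(censored == email_one)   # False if anything was actually replaced
print(censored)                # the censored copy; email_one.txt itself is unchanged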