Censor Dispenser Challenge Project (Python)

So you’d have an original string. Make replacements. You’d now have a new string. To the new string, make replacements. Repeat until you’re through with all your replacements. Hand the last result to whoever asked for it.

It’s only discarded because you’re not doing anything with it.
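
A minimal sketch of that loop, using a made-up sentence and word list (the function name `censor_all_terms` is hypothetical too): each `replace` result is assigned back to the same name, so the next replacement starts from the previous result instead of the original.

```python
def censor_all_terms(text, terms):
    # Start from the original string.
    result = text
    for term in terms:
        # str.replace returns a NEW string; keep it by reassigning,
        # otherwise the censored copy is thrown away.
        result = result.replace(term, "X" * len(term))
    # Hand the last result back to whoever asked for it.
    return result

print(censor_all_terms("the cat sat on the mat", ["cat", "mat"]))
```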

I see, thanks ionatan

Can I ask please, how do we know when to use 1,2 or 3+ variables in a function?

I’ve been writing the functions according to Codecademy’s instructions, but when given no specific instructions I am not sure how many variables I would need to use.

def censor_two(input_text, censored_list):
  for word in censored_list:
    censored_word = ""
    for x in range(0,len(word)):
      if word[x] == " ":
        censored_word = censored_word + " "
      else:
        censored_word = censored_word + "X"
    input_text = input_text.replace(word, censored_word)
  return input_text

Here they have used 2. How do I know to use 2 without them telling me so?

There are like… 6 local variables in that function. Maybe you mean parameters, in which case you’d need to ask yourself what the input is and how many things that is. You get that by considering what the function is actually supposed to do.
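
To make the distinction concrete, here is a small illustration (both function names are made up for this example): parameters are the inputs the caller has to supply, and anything else the function creates along the way is just a local variable.

```python
# One input (the text), so one parameter.
def count_words(text):
    words = text.split()    # local variable, not a parameter
    return len(words)

# Two inputs (the text AND the word to hide), so two parameters.
def censor_word(text, word):
    mask = "X" * len(word)  # local variable, not a parameter
    return text.replace(word, mask)

print(count_words("a b c"))
print(censor_word("hello world", "world"))
```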

Not the most elegant solution I believe, but it worked with everything I threw at it :slight_smile:
cheers guys <3

Took me a while to complete this project :thinking:. Please let me know if there’s something I can improve, and hopefully my code will help you in some way. :grinning:

Hello, everybody

Here is my take on this challenge. Still, I have work to do like handle upper & lowercase and have better identification of words within words; however, it fulfills the tasks.

If there are any comments, it would be great to have feedback.

I updated the code including upper, lower and Title case handling

Hello,
One question about your code.
In code block 2, in the function for email_two:
What is the purpose of the following syntax, especially the element “[0]”?
objectLst = ['']
objectLst[0] = textToScan

I may be “off base” here, but I placed the string textToScan in the list as the first element because in Python a string itself is immutable, whereas a list is mutable, so the element at index 0 can be replaced with a new string. objectLst[0] refers to the empty string at index 0 of the list, the one I defined with objectLst = ['']. Not to suggest you weren’t aware, but just in case: list indices begin at zero, so in a list with five elements the indices are 0 to 4, with 0 representing the first element and 4 the last. As to why I didn’t just write objectLst = [textToScan], well, I’m not a great lateral thinker.
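
A short sketch of what is and isn't mutable here (the snake_case names are placeholders, not the ones from the posted solution). A string can never be changed in place, whether it sits in a variable or in a list; what a list adds is a slot that can be pointed at a different string, which plain reassignment of a variable achieves as well.

```python
text_to_scan = "hello"

# Strings are immutable: assigning to one character raises TypeError.
try:
    text_to_scan[0] = "J"
except TypeError:
    print("strings cannot be changed in place")

# A list is mutable, so slot 0 can be pointed at a different string.
object_list = [""]
object_list[0] = text_to_scan
object_list[0] = object_list[0].replace("h", "J")  # new string stored in slot 0

# The same effect is available by reassigning a plain variable.
text_to_scan = text_to_scan.replace("h", "J")
print(object_list[0], text_to_scan)
```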

Thank you for the response.

Hi. I’m on my first study in coding here. Just out of 25% of the Computer Science path. I’m trying to use only what has been shown to me so far.

My question is about the “her”, “herself”, “researchers” problem. I checked most of the solutions presented here, and they still leave researc***s because of the “her” in this word, or ***self because of the “her” in “herself”.

I solved the her - herself by slicing the original list of words and reintroducing them with “herself” first. This way it happens first in the iteration and the “her” is checked after the “herself” is already changed to *******.

But I am having a hard time thinking about the her - researchers problem. Any suggestions?
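
One possible approach, sketched without regular expressions: only censor a match when the characters immediately before and after it are not letters. The function name `censor_whole_word` and the sample sentence are made up for illustration, and the match is case-sensitive to keep the sketch short.

```python
def censor_whole_word(text, word, mask_char="*"):
    if not word:
        return text  # nothing to search for
    result = ""
    i = 0
    while i < len(text):
        end = i + len(word)
        if text[i:end] == word:
            # The neighbours on both sides must not be letters,
            # otherwise the match is only part of a longer word.
            before_ok = i == 0 or not text[i - 1].isalpha()
            after_ok = end >= len(text) or not text[end].isalpha()
            if before_ok and after_ok:
                result += mask_char * len(word)
                i = end
                continue
        # No whole-word match here: copy one character and move on.
        result += text[i]
        i += 1
    return result

print(censor_whole_word("her researchers saw her.", "her"))
```

With this check, "her" inside "researchers" or "herself" is left alone, so the order of the word list no longer matters for that particular problem.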

I also noted that the solution proposed in the heading of this topic has these same difficulties. XXXXXs

herself was not eliminated, it ended up like xxxself.

And the solution for email three has censored only the name Helena when I ran it.

One last thing. I might have misunderstood the point of step 4. “Write a function that can censor any occurance of a word from the “negative words” list after any “negative” word has occurred twice, (…)”

But no word from the list occurs more than once in email three, and most simply don’t occur.

PS: the last function works well for email three and four, but one and two still have some of the words, like learning algorithm.

Am I running this wrong? (I used Jupyter to run it.) I made it work for emails one and two, but I stopped at three because of this confusion. There is no way to know if the requested function works, since the email does not contain more than two occurrences of any of the terms…

Could somebody tell me if I misunderstood?
And sorry for bringing so many questions and unknowns without proposing solutions.

I don’t know how, but a week after starting the project, I got there. I decided to bite off more than I could chew by using OOP techniques. I’ve done my best to use comments to explain what does what. Hopefully it’s all pretty clear.

Ok. Here is mine

# These are the emails you will be censoring. The open() function is opening the text file that the emails are contained in and the .read() method is allowing us to save their contexts to the following variables:
email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()

def blur(word):
  temp = ['*' for i in range(len(word))]
  blurred = ""
  for char in temp:
    blurred = blurred + char
  return blurred

def censor(text, phrase):
  if text.find(phrase) > -1:
    return text.replace(phrase, blur(phrase))
  return text

def censor_phrases(text, phrases):
  censored = text
  phrases.sort(key=len, reverse=True)
  for phrase in phrases:
    if censored.find(phrase) > -1:
      censored = censor(censored, phrase)
  return censored

def censor_negative_words(text, words):
  return censor_phrases(text, words)
  
def censor_all(text,negatives, terms):
  censored = censor_phrases(text, terms)
  censored = censor_negative_words(censored, negatives)  # build on the previous result, not the original text
  censors = censored.split() 
  if len(censors) < 2:
    return censored
  for i in range(len(censors)):
    if i >= 2:
      if (censors[i].find('***') > -1) and (censors[i-2].find('***') > -1):
        censors[i-1] = blur(censors[i-1])
  return ' '.join(censors)    

censored = censor(email_one, "learning algorithms")
print(censored)

proprietary_terms = ["she", "personality matrix", "sense of self", "self-preservation", "learning algorithm", "her", "herself"]

censored = censor_phrases(email_two, proprietary_terms)
print(censored)

negative_words = ["concerned", "behind", "danger", "dangerous", "alarming", "alarmed", "out of control", "help", "unhappy", "bad", "upset", "awful", "broken", "damage", "damaging", "dismal", "distressed", "distressed", "concerning", "horrible", "horribly", "questionable"]
censored = censor_negative_words(email_three, negative_words)
print(censored)

censored = censor_all(email_four, negative_words,proprietary_terms)
print(censored)

Not super pretty. The code works okay. I learned a ton working on it. Kept coming up on hints to use regular expressions, but avoided it for the learning. Anyways, done, and time to move on to the next project.
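
For comparison, a rough sketch of what the regular-expression route could look like. This is not the code from the post above; the function name `censor_with_regex` and the sample sentence are assumptions for illustration. It relies on `\b` word boundaries and `re.IGNORECASE` from the standard `re` module.

```python
import re

def censor_with_regex(text, terms, mask_char="#"):
    # Longest terms first so "herself" is tried before "her".
    ordered = sorted(terms, key=len, reverse=True)
    # re.escape guards against characters with special regex meaning.
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b"
    # Replace each match with a mask of the same length
    # (spaces inside a phrase are masked as well).
    return re.sub(pattern, lambda m: mask_char * len(m.group()),
                  text, flags=re.IGNORECASE)

print(censor_with_regex("She told her researchers herself.",
                        ["she", "her", "herself"]))
```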

https://gist.github.com/codecademydev/3d39d3ad65a1616cd6ef7a09f2ec892d

# These are the emails you will be censoring. The open() function is opening the text file that the emails are contained in and the .read() method is allowing us to save their contexts to the following variables:

email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()

# Define lists of words
remove_word = "learning algorithms"
proprietary_terms = ["she", "personality matrix", "sense of self", "self-preservation", "learning algorithm", "her", "herself"]
negative_words = ["concerned", "behind", "danger", "dangerous", "alarming", "alarmed", "out of control", "help", "unhappy", "bad", "upset", "awful", "broken", "damage", "damaging", "dismal", "distressed", "distressed", "concerning", "horrible", "horribly", "questionable"]
combined_list = proprietary_terms + negative_words

# Original solution. Did not take into account replacing words with # of the same length.

def redact_words(original_word):
    # Build a mask the same length as the word or phrase:
    # '#' for every character, with blank spaces kept as they are.
    if len(original_word) < 1:
        return ""  # return an empty string so str.replace() never receives None
    redact_phrase = ""
    for i in original_word:
        if i != " ":
            redact_phrase += "#"
        else:
            redact_phrase += " "
    return redact_phrase
    
#print(redact_words("out of control"))

def censor_single_word(single_word, email):
    censored_email = email
    returned_redaction = redact_words(single_word)
    censored_email = censored_email.replace(single_word, returned_redaction)
    return censored_email

#print(censor_single_word(remove_word, email_one))

def censor_proprietary(lst_of_words, email):
    censored_email_two = email
    for word in lst_of_words:
        word1 = " " + word + " " # Checks for words on both sides.
        returned_redaction = redact_words(word1)
        censored_email_two = censored_email_two.replace(word1, returned_redaction)
        
        word2 = " " + word # Checks for words with space before
        returned_redaction = redact_words(word2)
        censored_email_two = censored_email_two.replace(word2, returned_redaction)   
        
        word3 =  word + " " # Checks for words with space after
        returned_redaction = redact_words(word3)
        censored_email_two = censored_email_two.replace(word3, returned_redaction) 
        
        word4 =  "#s" # Checks for pluralized words.
        returned_redaction = redact_words(word4)
        censored_email_two = censored_email_two.replace(word4, returned_redaction)              
               
    return censored_email_two

# print(censor_proprietary(proprietary_terms, email_two))

def censor_negative(lst_of_words, email):
    censored_email_two = email
    censored_words = []
    
    # Create list of censored words in the email and sort from longest to shortest.
    for term in lst_of_words: 
        if email.find(term) >= 0:
            censored_words.append(term)
    censored_words.sort(key=len, reverse = True)     
    # print(censored_words)
    
    # Create list of beginning index number for each word.
    index=[]
    for term in censored_words:
        index.append(email.find(term))  
    
    
    # Combining the two lists to one, with index number first. Sorting by index number. Using the third index to begin our censoring.
    censored_index_list = [[i, j] for i, j in zip(index, censored_words)]    
    censored_index_list.sort()
    #print(censored_index_list)
    new_list = [item[0] for item in censored_index_list]
    #print(new_list)
    
    #Removing first item if list is greater than 3
    nu = 0
    mu = 0
    for c in new_list:
        if len(new_list) > 3:
            while mu < 1:
                if new_list[nu] != new_list[nu + 1]:
                    del censored_index_list[0]
                nu += 1
                mu += 1     
    print(censored_index_list)
    
    # Updated list skipping the first occurrence. 
    new_censored_list = [item[1] for item in censored_index_list]
    new_censored_list.sort(key=len, reverse = True)  
    print(new_censored_list)
    
    for word in new_censored_list:
        word1 = " " + word + " " # Checks for words on both sides.
        returned_redaction = redact_words(word1)
        censored_email_two = censored_email_two.replace(word1, returned_redaction)
        
        word2 = " " + word # Checks for words with space before
        returned_redaction = redact_words(word2)
        censored_email_two = censored_email_two.replace(word2, returned_redaction)   
        
        word3 =  word + " " # Checks for words with space after
        returned_redaction = redact_words(word3)
        censored_email_two = censored_email_two.replace(word3, returned_redaction) 
        
        word4 =  "#s" # Checks for pluralized words.
        returned_redaction = redact_words(word4)
        censored_email_two = censored_email_two.replace(word4, returned_redaction)              
               
    return censored_email_two    

#print(censor_negative(negative_words, email_three))

def censor_it_all (lst_of_words, email):
    censored_email_two = email
    #print(lst_of_words)
    censored_words = []   
    
    # Create list of censored words in the email and sort from longest to shortest.
    for term in lst_of_words: 
        if email.find(term) >= 0:
            censored_words.append(term)
    censored_words.sort(key=len, reverse = True)     
    #print(censored_words)
    
    # Create list of words in all possible spelling configurations
    original_cases_all = [] # List of all versions of redacted words.
    for word in censored_words:
        if len(word) < 1:
            return
        original_cases_all.append(word) # Normal spelling of the word.
        original_cases_all.append(word.upper()) # UPPERCASE spelling of the word.
        original_cases_all.append(word.lower()) # lowercase spelling of the word.
        original_cases_all.append(word.title()) # Title Case spelling of the word.      
    print(original_cases_all)    
         
    for term in original_cases_all:
        word1 = " " + term + " " # Checks for words with space on both sides.
        returned_redaction = redact_words(word1)
        censored_email_two = censored_email_two.replace(word1, returned_redaction)
        
        word2 = " " + term # Checks for words with space before
        returned_redaction = redact_words(word2)
        censored_email_two = censored_email_two.replace(word2, returned_redaction)   
        
        word3 =  term + " " # Checks for words with space after
        returned_redaction = redact_words(word3)
        censored_email_two = censored_email_two.replace(word3, returned_redaction) 
        
        word4 =  "#s" # Checks for pluralized words.
        returned_redaction = redact_words(word4)
        censored_email_two = censored_email_two.replace(word4, returned_redaction)              
               
    
    split_list_lines = censored_email_two.split('\n')
    final_string = ""
    for line in split_list_lines:
        split_list_words = line.split()
        #print(split_list_words)
        
        for word in split_list_words:
            if '#' in word and word != split_list_words[0]:
                index_of_word = split_list_words.index(word) - 1
                if "#" not in split_list_words[index_of_word]:
                    censor_phrase_before = ""
                    before = split_list_words[index_of_word]
                    for i in before:
                        censor_phrase_before += "#" # Will end as same length as word or phrase.    
                    split_list_words[index_of_word] = censor_phrase_before  

        for word in split_list_words:
            if '#' in word and word != split_list_words[-1]:
                index_of_word = split_list_words.index(word) + 1
                if "#" not in split_list_words[index_of_word]:
                    censor_phrase_after = ""
                    after = split_list_words[index_of_word]            
                    for i in after:
                        censor_phrase_after += "#" # Will end as same length as word or phrase.    
                    split_list_words[index_of_word] = censor_phrase_after  
        
        
        #print(split_list_words)  
        new_string =  " ".join(split_list_words)
        final_string = final_string + "\n" + new_string

        # new_string2 = "".join(new_string)
        # print(new_string2)


    return final_string
print(censor_it_all(combined_list, email_four))   

The OP solution posted Jan 18, 2020 does not fully solve the problem, leaving ‘personality matrix’ and ‘Helena’ (next to a banned word) intact.

Here is my solution. Words censored because they appear in the two lists provided are indicated by ‘+’, and those censored because they are neighboring words are indicated by ‘-’ for extra clarity of what the code has accomplished.

Hey everybody, I have a simple, stupid question -_- ; honestly, I don’t understand the challenge yet.
What do they want? Can anyone answer me, please? I need to solve this to get to the next step in my path.

Looking forward to feedback: https://gist.github.com/a1d700b0385ac37e28770fb4b320d073

I don’t understand how to fix this. It prints out a new email with the appropriate censorships for each term individually. How can I consolidate this so that all censored words appear in a single email?