Struggling with Censor Dispenser project

Hi all,

I’ve spent 15+ hours on the censor dispenser project and am not making any progress beyond part three. I have referred to previous lessons and googled everything I can think of. I have not looked at the solution code.

I’d really appreciate some advice on what to do when you grind to a halt on a project and have exhausted every idea you had and searched extensively for answers.

Hi @foureightyeight

I’d say you’ve already hit the right answer - you’ve come to the forums and asked for help! :slight_smile:

It’d be easier for us to help you, though, if we could see the code you’ve got so far and if you explained how you were stuck. For example, is there a function you’re struggling to get to work or is something not behaving in the way you were expecting?

When you post your code here for us to look at, please make sure to use the </> (code) button in the editor.

This will insert a new block into your post, where you can paste your code and the forum will keep all the formatting (whitespace etc.) correct so we can copy it and run it for troubleshooting. (This is especially important in Python, because otherwise the forum won’t keep your indenting and that’s a big no-no for Python!)

Once we’ve got the code and we know more exactly what your problem is, we’ll try and help. :slight_smile:

Hi @thepitycoder

This is what I have at the moment:

def censor_three(email):
    for word in proprietary_terms:
        email = email.replace(word, 'X' * len(word))
    email = email.split()

    count = 0
# Doesn't censor anything
    for word in negative_words:
        for split_word in email:
            if split_word == word:
                split_word = '*' * len(word)
    return ' '.join(email)

The second for loop is where I’m having trouble. It isn’t censoring anything. At the moment, I’m trying to censor all of the negative words before attempting to censor negative words when count > 2.

split_word is just a name bound to each element of the list in turn. Why do you expect an assignment to the split_word variable to update the list?

So I can’t iterate through email and replace any split_word which matches word with '*' * len(word)? How do I assign the updated split_word to email?

You can, but not with your current approach. As you have discovered, nothing gets replaced.

If we want to update an element in the list, we have to assign to it by index:

the_list[index] = "new value"
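
To illustrate the difference, here is a minimal sketch on an unrelated list (the `colours` data is made up for the example). Assigning to the loop variable only rebinds that name, while assigning through an index changes the list itself:

```python
colours = ["red", "green", "blue"]

# Rebinding the loop variable: the list itself is not modified.
for colour in colours:
    if colour == "green":
        colour = "GREEN"
after_loop = list(colours)  # snapshot: still the original three strings

# Assigning through an index: the list is modified in place.
colours[1] = "GREEN"
```

How to find the right index for each matching word is left open here.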

I think I understand. I need to iterate through email and get the index of split_words which match word, then replace?

that sounds about right :slight_smile:

I left that bit open on purpose. Strange as it might sound, the less I help you, the better it is for you. Putting together the pieces yourself will teach you more than someone just giving you the answer.

Thank you for your help. Now I just need to figure out how to write this!

let me know if you need more help.

just for myself: project url, so i can find the lesson more easily.

I’ve successfully produced the expected result when censoring all negative_words and am now having trouble implementing a count so that the censoring begins when count > 2. This is what I have so far:

def censor_three(email):
    for word in proprietary_terms:
        email = email.replace(word, 'X' * len(word))
    email_split = email.split()
    index_of_bad_words = []

    for word in negative_words:
        for index, string in enumerate(email_split):
            if string == word:
                index_of_bad_words += [index]

    count = 0

    for index1 in index_of_bad_words:
        for index2 in range(len(email_split)):
            if index1 == index2 and count > 2:
                email_split[index2] = '*'
            else:
                email_split[index2] = email_split[index2]
                count += 1
    return ' '.join(email_split)

Your implementation is not there yet; it has several flaws.

You will need to loop over negative_words and, within that loop, loop over email_split to check whether the current word is a negative word. If it is, increase the counter, then check whether the counter is greater than 2; if so, start censoring:

# pseudo code
for negative_word in negative_words:
    for word in email_split:
        if word equals negative_word:
            increase counter
            check if counter greater than 2

The additional challenge here is that `out of control` is a negative word (several words, really), so you need extra code to handle that.

Keep in mind that your program isn’t a black box. You can print out information at any time to observe what is being done, in what order, see if operations are having the desired effect.

For readability’s sake, instead of iterating through indices which you aren’t actually interested in, put all the output into another list instead, you’d only need to append.
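
As a rough sketch of that idea (the function name `censor_all` and the sample data are made up here, and it deliberately ignores the count > 2 requirement), building the output only needs `append`:

```python
def censor_all(words, bad_words):
    """Return a new list with every bad word replaced by asterisks."""
    output = []
    for word in words:
        if word in bad_words:
            output.append('*' * len(word))
        else:
            output.append(word)
    return output


result = censor_all(['what', 'a', 'horrible', 'day'], ['horrible', 'awful'])
# result == ['what', 'a', '********', 'day']
```

No index appears anywhere, and the input list is left untouched.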

You can also break things out into separate functions. The less code there is, and especially, the less nested code there is, the easier it is to reason about.

def censor_after_two(words, badwords):
    seen = {}

    def keep(word):
        # should this word be kept?
        # is it bad? how many times have I seen it?
        # this function doesn't need to care about lists at all
        # it just needs to make a decision for a single word.
        return True

    # actually, filter won't do the right thing because it would omit
    # things instead of censoring. you could make your own version
    # which censors or doesn't depending on whether the function
    # says it should be kept
    return filter(keep, words)

The above takes an approach that you’re probably not used to, but you might notice that problems might really be multiple subproblems, and if you solve those problems completely separately then you can compose a solution out of the subsolutions.
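
One possible way to fill in that outline is sketched below. `censor_where` is a made-up helper standing in for “your own version” of filter, and the sketch assumes the rule is “leave the first two negative words alone, censor every one after that”, counted across all negative words rather than per word. Treat it as one reading of the task, not the official solution.

```python
def censor_where(keep, words):
    """Like filter, but replaces rejected words with asterisks instead of dropping them."""
    return [word if keep(word) else '*' * len(word) for word in words]


def censor_after_two(words, badwords):
    seen = 0  # how many bad words have been encountered so far (assumed: one shared count)

    def keep(word):
        # Decides for a single word; it knows nothing about lists.
        nonlocal seen
        if word not in badwords:
            return True
        seen += 1
        return seen <= 2  # the first two occurrences are kept

    return censor_where(keep, words)


example = censor_after_two(['bad', 'ok', 'awful', 'bad', 'awful'], ['bad', 'awful'])
# example == ['bad', 'ok', 'awful', '***', '*****']
```

The decision (`keep`) and the list handling (`censor_where`) are solved separately and then composed.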

Another suggestion, or, the same really, is to separate out the splitting/joining from the censoring logic.

The censoring functions should probably only be operating on lists of words with no punctuation or spaces, that’s a completely separate problem.

I believe there may also be mention of keeping the original non-alpha characters (spaces, punctuation), in that case what can be done is to have a function looking like this:

def blah(email):
    everything = group by whether letter
    # everything = ['abc', ', !?', 'def', '\n ', 'blah']
    clean_words = just the words from everything
    censored_clean_words = censor(clean_words)
    put all the words back between the other stuff
    concatenate the results into a single string

Overall that’s a bit of a circus, but the individual problems there aren’t so bad.

The general idea of that function being to take the input apart, do logic on the parts that are interesting, then put it back together.
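
The “take the input apart” step could look something like this sketch, which uses `itertools.groupby` from the standard library. The name `split_by_letter` is made up, and “word” is assumed to mean a run of letters, so something like an apostrophe would split a word in two.

```python
from itertools import groupby


def split_by_letter(text):
    """Split text into alternating runs of letters and non-letters."""
    return [''.join(run) for _, run in groupby(text, key=str.isalpha)]


everything = split_by_letter('abc, !?def\n blah')
# everything == ['abc', ', !?', 'def', '\n ', 'blah']

# Every run is non-empty, so checking its first character is enough.
clean_words = [part for part in everything if part[0].isalpha()]
# clean_words == ['abc', 'def', 'blah']
```

Joining `everything` back together with `''.join(everything)` gives the original string, so nothing is lost by splitting this way.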

Taking things apart and putting them back together is a pattern that fits a lot of problems. Even something like iterating through a list fits this: take it apart (iterate), do something (loop body), put it back together (append the results to a new list).

Is there any value in just looking at the solution and working backwards from there?

Probably not.

Read the problem, internalize it, ask yourself how you’d do it manually, analyze what actions that involves, implement those actions, and write code to run them in the appropriate order.

I think my suggestion above is worth exploring.

I’ve already split up the problem, and I think you’ll find those subproblems approachable and a whole lot cleaner than Codecademy’s solution code.

What makes it cleaner is that the censoring logic doesn’t need to deal with word boundaries, that’s already done by some other part. It only needs to care about words, already in a nice list.

The part of putting it all back together is admittedly a little bit tricky to code. But not that much. If you look at the parts that should be put together, you can easily match them up manually; if you analyze what you’d be doing, you can write the code for it too.

input:          Jimmy walked her dog!
split it:       ['Jimmy', ' ', 'walked', ' ', 'her', ' ', 'dog', '!']
just words:     ['Jimmy', 'walked', 'her', 'dog'] <- suitable input to censoring function
censor results: ['Jimmy', 'walked', '***', 'dog']
(alternatively results could be booleans for keep/censor)

put the results back into the split list (word: replace, non-word: keep as is), concatenate
Jimmy walked *** dog!
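
The matching-up step for this example might be sketched as follows. `put_back` is a made-up name, and it assumes the censoring function returned exactly one result per word, in the same order.

```python
def put_back(parts, censored_words):
    """Replace each word in parts with the next censored word; keep non-words as is."""
    replacements = iter(censored_words)
    result = []
    for part in parts:
        if part[0].isalpha():
            result.append(next(replacements))  # word: take the censored version
        else:
            result.append(part)                # spaces/punctuation: keep as is
    return ''.join(result)


parts = ['Jimmy', ' ', 'walked', ' ', 'her', ' ', 'dog', '!']
censored = ['Jimmy', 'walked', '***', 'dog']
rebuilt = put_back(parts, censored)
# rebuilt == 'Jimmy walked *** dog!'
```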

I can see how this would tackle single words, but what about bad phrases with spaces in them?

I’m close to giving up on this project, as I feel like I’m lacking some fundamental understanding of the underlying concepts.

For looking at multiple words this provides an even bigger advantage. Having a list of just the words makes it easy to look at multiple locations at once, if all the spaces and commas and such were still there this would be nightmarish.
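
For example, with a plain list of words a phrase can be matched by comparing a slice of the list against the phrase’s words. This is only a sketch: `censor_phrase` is a made-up name, the phrase is assumed to be already split into words, and counting is ignored.

```python
def censor_phrase(words, phrase_words):
    """Return a copy of words with each occurrence of the phrase censored word by word."""
    result = list(words)
    n = len(phrase_words)
    for start in range(len(words) - n + 1):
        # Look at n neighbouring words at once.
        if words[start:start + n] == phrase_words:
            for i in range(start, start + n):
                result[i] = '*' * len(words[i])
    return result


words = ['this', 'is', 'out', 'of', 'control', 'now']
censored = censor_phrase(words, ['out', 'of', 'control'])
# censored == ['this', 'is', '***', '**', '*******', 'now']
```

Because spaces and punctuation live elsewhere, the comparison is a single list equality check.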

So I’ve made some good progress by following your suggestion for how to split up the problem. This is what I came up with, with your help and this post on Stack Overflow.

def censor_three(email):
    # Censors proprietary_terms
    for word in proprietary_terms:
        email = email.replace(word, '*' * len(word))
    # Separates punctuation from letters, then splits by ' '
    separated_punctuation = ''
    for char in email:
        if char in '.,:;!?()"':
            output_char = ' ' + char    # equivalent to output_char = ' %s' % char
        else:
            output_char = char
        separated_punctuation += output_char
    separated_punctuation_split = separated_punctuation.split()

    # Censors any negative_words found in the split email (separated_punctuation_split)
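    for index in range(len(separated_punctuation_split)):
        if separated_punctuation_split[index] in negative_words:
            # Length of the matched word itself, not `word` left over from the proprietary_terms loop
            separated_punctuation_split[index] = '*' * len(separated_punctuation_split[index])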
    # Joins punctuation to words and joins words with spaces between them
    joined_string = ''
    for index in range(len(separated_punctuation_split)):
        if separated_punctuation_split[index] in '.,:;!?':
            joined_string += '' + separated_punctuation_split[index]
        elif index == 0:
            joined_string += '' + separated_punctuation_split[index]
        else:
            joined_string += ' ' + separated_punctuation_split[index]
    return joined_string

It’s probably much longer than it could be, but I’ll come to that later. I just need to add functionality to censor phrases like out of control or learning algorithms, as well as a counter which will prevent censoring until the count is greater than 2.

Thank you for making things clearer!