This discussion is about remove_duplicates


#1



list_ = [1, 2, 3 ,4, 3, 2, 1]
if 1 in list_:
    print True
else:
    print False

print len(list_)
list_[3] = 5
print list_

def remove_duplicates(list_):
    list2 = list_
    length = len(list_)
    range_ = range(length)
    for x in range_:
        if list_[x] in list_[x+1:]:
            return 1

Why does this code work for remove_duplicates?


#2

Are you sure it works? Can you show us the output for,

['a','b','b','a','c','d','d','c']

?

as in,

print (remove_duplicates(['a','b','b','a','c','d','d','c']))

#3

It doesn't give me the output I want. It's just that when I click Save & Submit, it says "Way to go!" instead of asking me to revise because the code failed on some example.


#4

That's a lenient lesson check, then, perhaps? The code from what I can ascertain will not run as expected. What output did you get, by the way?

Something to keep in mind...

Your code uses the same name for the parameter as for the object in global scope. Inside the function the parameter shadows the global, so the argument that is passed in is the one that gets used. Even so, sharing a name makes it easy to lose track of which list is which. Avoid the confusion by giving function parameters names that differ from the global ones.
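
To see which object a function actually works with when the names match, here is a minimal sketch. The helper show_first and its argument are made up for illustration and are not part of the lesson.

```python
list_ = [1, 2, 3, 4, 3, 2, 1]      # global list, as in the first post

def show_first(list_):             # hypothetical helper; parameter has the same name as the global
    return list_[0]                # refers to the parameter, not the global

print(show_first(['a', 'b']))      # a
print(list_[0])                    # 1 (the global is untouched)
```

The name inside the function refers to whatever was passed in, which is why a distinct parameter name mostly helps the reader rather than the interpreter.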


#5

I just get 1, because a return inside a for loop ends the function the first time it runs.
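
A small sketch of that behavior, using a made-up function name (first_repeat) and returning the index instead of 1 so the early exit is visible:

```python
def first_repeat(items):                 # hypothetical name, for illustration only
    for x in range(len(items)):
        if items[x] in items[x + 1:]:
            return x                     # leaves the function at the first match
    return None                          # reached only when nothing repeats

print(first_repeat(['a', 'b', 'b', 'a']))   # 0
print(first_repeat([1, 2, 3]))              # None
```

The second 'b' is never examined in the first call, because the function has already returned.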


#6

And we solve that, how? Is there a specific reason why there should be a return value of 1?


#7

I was just using that to play. Even when I use triple quotes to comment the whole section out, it still works.


#8

I just typed return 1 because I wanted to play with another idea I got. Oftentimes when I can't think of the command I want, I quickly fill in the one I'm working on so that it doesn't result in an error. Then I can type out a few quick statements to figure out which command I am looking for.


#9

In testing, use print in your function and watch the display. When the outputs are what you expect, then remove the print statement. Using return is not a good way to test a running function.

Consider:

>>> def remove_duplicates(list_):
    list2 = list_
    length = len(list_)
    range_ = range(length)
    for x in range_:
        if list_[x] in list_[x+1:]:
            print (x, list_[x], list_[x+1])

            
>>> print (remove_duplicates(['a','b','b','a','c','d','d','c']))
0 a b
1 b b
4 c d
5 d d
None
>>>

This gives four results, which do not make a lot of sense in themselves but they raise a flag of merit... We expected four results. Something must be working correctly.


#10

To discuss your code a little further, consider the role that list2 plays. At present it plays none. It is needed, though not as a second name for list_: it should start as an empty list and be built up from the larger list, so that it ends up containing no duplicates.

On a side note, it would appear there is no conflict in using the same variable name, but I would still recommend avoiding said practice.

list2 = []

Be sure not to walk away from this until you have a working solution that you are satisfied is correct and fully functional. Your return value will be a list that contains no duplicates.


#11

As I said, the return 1 was just filler so I could play with a command; my end goal is to remove all duplicates in the list. What I've now done is:

if list_[x] in list_[x+1:]:
list2 .remove(list2[x])
return list2

It's tripping up a little bit, but we can talk about that in a second.
Question: how do I get it to go into the actual code, like in that box thing?


#12

Read the pinned article here, https://discuss.codecademy.com/, for help with posting code samples.

Consider whether or not list2 is a copy of list_ or just a reference. The answer is the latter. It is not a copy. When you mutate list2, you are also mutating the list that it references, list_. This will mess with your range in the loop.

It's okay to remove, just as long as it is not from the list being iterated. That's where a true copy will come into play. It will still take some finagling, though.

list2 = list_.copy()    # Python 3.3 and later only

or

list2 = list_[:]
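
A minimal sketch of the difference between a second name and a true copy. The variable names alias and copy_ are made up for illustration.

```python
list_ = ['a', 'b', 'b']
alias = list_            # second name for the same list object
copy_ = list_[:]         # new list holding the same items

alias.remove('b')        # mutates the one shared object

print(list_)             # ['a', 'b']
print(copy_)             # ['a', 'b', 'b']
print(alias is list_)    # True
print(copy_ is list_)    # False
```

Removing through alias changes list_ as well, while copy_ keeps all three items.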

#13

There are two ways to go about this problem:
1. Adding every item that doesn't appear later in the list to another list.
2. Removing every item that does appear later from the list.

I'm trying to go about it the latter way.


#14

Option 1 is much simpler, for sure. In option 2, as we remove items from list2, its indices shift. That means we have to track every removal so that the next match found in the main list is removed from the correct position in the secondary list. Tricky.
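
To make the index problem concrete, here is a minimal and deliberately flawed sketch. The name naive_remove is made up, and it deletes by position with del, which is an assumption about one way to attempt option 2. It removes from the copy while the loop still counts positions in the original.

```python
def naive_remove(list_):               # hypothetical and deliberately flawed
    list2 = list_[:]
    for x in range(len(list_)):
        if list_[x] in list_[x + 1:]:
            del list2[x]               # list2 shrinks, but x still counts list_'s positions
    return list2

# No error here, but the wrong items are deleted.
print(naive_remove([1, 2, 3, 4, 3, 2, 1]))      # [2, 4, 2, 1]

# Here list2 runs out of items before the loop is finished.
try:
    result = naive_remove(['a', 'b', 'b', 'a', 'c', 'd', 'd', 'c'])
except IndexError:
    result = None
    print("IndexError: position no longer exists in list2")
```

Either outcome comes from the same cause: each deletion moves every later item one position to the left.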


#16

You're right, I see how the indices get messed up, but I don't see an easier way for option 1.
Would I use .append() to add the list_[x] to list2?


#17

I've got it now, thanks for your help. I'd have been up all night trying to go about it the other way. I didn't realize what was happening with the indices, and I was getting frustrated because it kept telling me the index was out of range and I couldn't understand why.

The code I've settled on is a simple one:

def remove_duplicates(list_):
    list2 = []
    length = len(list_)
    range_ = range(length)
    for x in range_:
        if not list_[x] in list_[x+1:]:
            list2 .append(list_[x])
    return list2

It passes Codecademy's lesson check, although my first code did as well. Any suggestions as to what I should test it with?


#18

Yes, that would be the approach.

>>> def remove_duplicates(list_):
    list2 = list_[:]
    length = len(list_)
    range_ = range(length)
    for x in range_:
        if list_[x] in list_[x+1:]:
            list2.remove(list_[x+1])
    return list2

>>> print (remove_duplicates(['a','b','b','a','c','d','d','c']))
['a', 'a', 'c', 'c']
>>>

#19

This is still not fully tested, but it works with the test sample:

>>> def remove_duplicates(list_):
    list2 = list_[:]
    offset = 0
    for x in range(len(list_)):
        if list_[x] in list_[x+1:]:
            list2.remove(list2[x + offset])
            print (list2)
            offset -= 1
    return list2

>>> print (remove_duplicates(['a','b','b','a','c','d','d','c']))
['b', 'b', 'a', 'c', 'd', 'd', 'c']
['b', 'a', 'c', 'd', 'd', 'c']
['b', 'a', 'd', 'd', 'c']
['b', 'a', 'd', 'c']
['b', 'a', 'd', 'c']     # return value
>>> print (remove_duplicates([1, 2, 3 ,4, 3, 2, 1]))
[2, 3, 4, 3, 2, 1]
[3, 4, 3, 2, 1]
[4, 3, 2, 1]
[4, 3, 2, 1]             # return value
>>>
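
A possible variant, checked only against the same two samples: since the offset already gives the position in list2, the item can be deleted by position with del rather than looked up by value with remove. This is a sketch of an alternative, not a replacement for testing.

```python
def remove_duplicates(list_):
    list2 = list_[:]                   # true copy, safe to mutate
    offset = 0                         # how far list2's positions lag behind list_'s
    for x in range(len(list_)):
        if list_[x] in list_[x + 1:]:
            del list2[x + offset]      # delete by position instead of by value
            offset -= 1
    return list2

print(remove_duplicates(['a', 'b', 'b', 'a', 'c', 'd', 'd', 'c']))   # ['b', 'a', 'd', 'c']
print(remove_duplicates([1, 2, 3, 4, 3, 2, 1]))                      # [4, 3, 2, 1]
```

Both versions keep the last occurrence of each item, so the order of the result follows the final appearances in the input.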

#20

Oh, that's cool. I would never have figured something like that out on my own. Thank you for all your time!


#21

That's where intermediate test printing comes into play. It lets us see each step and study the logic for anomalies and discrepancies. We rarely write pristine code in our first drafts, and ideas can be played out many ways. Expect mistakes and logic errors in the early going on any project. That's why we must stick with it and not walk away (although a break now and then is good, it lets us get a fresh focus).

Did you write the alternate method and get it to work correctly? Belay, belay. I just saw your alternate code above. Only one thing to fix...

The white space before the dot should be removed, so that the line reads list2.append(list_[x]).
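
As for what to test it with, a few edge cases seem worth trying. This is only a sketch: the inputs are suggestions rather than anything the lesson requires, and the function is restated (with the space removed and "not in" used) so the snippet runs on its own.

```python
def remove_duplicates(list_):
    list2 = []
    for x in range(len(list_)):
        if list_[x] not in list_[x + 1:]:   # keep only the last occurrence
            list2.append(list_[x])
    return list2

original = [1, 2, 3, 4, 3, 2, 1]

print(remove_duplicates([]))            # []  (empty input)
print(remove_duplicates([7]))           # [7]  (single item)
print(remove_duplicates([5, 5, 5]))     # [5]  (all items the same)
print(remove_duplicates(original))      # [4, 3, 2, 1]
print(original)                         # [1, 2, 3, 4, 3, 2, 1]  (input not modified)
```

The last line checks that the function leaves the list it was given unchanged, which is easy to break when experimenting with the removal approach.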