Remove duplicates: however I copy the original list, it still gets modified


#1

Hi,
Another one that has me puzzled. (Sorry!)
This is my code. It seems to do what is asked, but it still changes the original list, even though I make all modifications to a copy of it!
Is my copying faulty?
Many thanks

def remove_duplicates(temp_list):
    new_list =[]
    new_list = temp_list
    for i in temp_list:
        new_list.sort
        if new_list.count(i) > 1:
            new_list.remove(i)
    return new_list

I tried declaring the new list outside the function, inside the function, and inside the loop, but to no avail!
Any help welcome (and most needed).
Many thanks,
DD


#2

I am puzzled by this too. I added a print statement and made the function return temp_list instead of new_list, and temp_list is getting altered for a reason I don't understand. Here is the code:

def remove_duplicates(temp_list):
    new_list =[]
    new_list = temp_list
    for i in temp_list:
        new_list.sort
        if new_list.count(i) > 1:
            new_list.remove(i)
    return temp_list
print remove_duplicates([4,5,5,4])

As you can see, temp_list gets altered.


#3

Hi,

I wonder if this issue isn't something I just learned about when helping someone else with a problem recently:

From the Python documentation:

It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy.

In the case I looked at, we were able to get the iteration to work properly when we made a temporary copy of the list by adding [:], like this:

words = list(text)
for letter in words[:]:
    if letter in vowels:
        words.remove(letter)

That seemed to get it to function as expected. I guess it's just a weird behavior in Python.
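Applied to the remove_duplicates function from the first post, the same idea might look like the sketch below. One thing worth noting: `new_list = temp_list` does not copy anything, it only gives the same list a second name, which would explain why the original keeps changing. The slice `temp_list[:]` builds a separate (shallow) copy. The names `original`, `alias` and `copied` are just made up for the demonstration, and this is only one way to do it, not necessarily the exercise's intended solution.

```python
def remove_duplicates(temp_list):
    # temp_list[:] creates a new list object; "new_list = temp_list"
    # would only create a second name for the same list.
    new_list = temp_list[:]
    # Iterate over the untouched original, modify only the copy.
    for item in temp_list:
        if new_list.count(item) > 1:
            new_list.remove(item)  # drops the first matching occurrence
    return new_list


original = [4, 5, 5, 4]
result = remove_duplicates(original)  # [5, 4]; original is unchanged

# Assignment versus slicing:
alias = original      # same object under a second name
copied = original[:]  # new object with equal contents
```

Because remove() always drops the first match, the last occurrence of each value is the one that survives, so the result here comes out as [5, 4] and not [4, 5].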

Good luck!


#4

Thank you joeb. Either I am very unlucky or... could there be something wrong with my browser or the interpreter on my side?
I get strange results. Many are down to my bad programming and lack of knowledge, but others I end up questioning.
Case in point: I can't add an elif or else to the following function without getting a syntax error, yet the syntax is exactly the same as in other examples the program accepts.

num_len = 0
num_sorted = []
def median(num):
    num_sorted = sorted(num)
    num_len == len(num)
    if num_len == 1:
        return num_sorted(0)
        break
    elif num_len % 2 == 0:
        par1 == num_len/2
        par2 == par1 + 1
        added == num_sorted(par1) + num_sorted(par2)
        addedanddivided == added/2.0
    return num_sorted(addedanddivided)
 
    elif num_len %2 != 0:
        median_odd_minus1 == num_len - 1
        median_odd == median_odd_minus1 + 1
    return numeros_sorted(median_odd)



I get on the console:
    File "python", line 16
        elif num_len %2 != 0:
           ^
    SyntaxError: invalid syntax

#5

Please ignore my earlier post.

I have tried starting again. I checked some other suggestions and came up with something that looks to me very similar to solutions given here, yet it doesn't work and I don't know why.

def median(num):
    in_order = sorted(num)
    l = len(in_order)
    if l % 2 == 0:
        n1 = l / 2
        n2 = n1 + 1
        return ((n1 + n2) /2.0)
    elif l % 2 != 0:
        n_middle = ((l+1)/2.0)
        return n_middle

Oops, try again. median([4, 5, 5, 4]) returned 2.5 instead of 4.5


#6

Hi DD,

I think I've got a possible solution, and you're on the right track. In a list like [1,2,3,4] we want to get the average of 2 & 3, and in a list like [1,2,3] we just need the middle integer.

The problem comes when we get successfully to the middle of the list: how do we access those values?

The variables n1 and n2 above are positions worked out from the length of the list, and the function returns their average instead of anything from the list itself. We've got to use them as indexes to access the values in the list. I think it's especially confusing when you're dealing with integers and index numbers that are around the same values.

Try this and see what you think (and let me know if I made a mistake!):

def median(num):
    in_order = sorted(num)
    l = len(in_order)
    if l % 2 == 0:
        n1 = l / 2
        n2 = n1 - 1
        return (in_order[n1] + in_order[n2]) / 2.0
        
    elif l % 2 != 0:   # I think we could also just use "else:" with no conditions here
        n_middle = ((l+1)/2)
        return float(in_order[n_middle - 1])
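
In case it helps, here is a sketch of the same function written so that it behaves the same on Python 2 and Python 3 (I'm assuming we can't be sure which interpreter is in use). The `//` operator gives an integer result when both operands are integers, and list indexes must be integers, which is why a value like 2.0 gets rejected as an index. It doesn't handle an empty list.

```python
def median(num):
    in_order = sorted(num)
    l = len(in_order)
    mid = l // 2  # floor division: an int, so it is safe to use as an index
    if l % 2 == 0:
        # Even length: average the two values on either side of the middle.
        return (in_order[mid - 1] + in_order[mid]) / 2.0
    # Odd length: with indexes starting at 0, the middle value is at l // 2.
    return float(in_order[mid])
```

For [4, 5, 5, 4] the sorted list is [4, 4, 5, 5], mid is 2, and the result is (4 + 5) / 2.0.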

Good luck!


#7

Thanks a lot joeb.

My latest attempt (I'm going step by step) is:

def median(num):
    in_order = sorted(num)
    l = len(in_order)
    if l % 2 == 0:
        pos1 = l / 2
        num1 = in_order[int(pos1)]
        pos2 = (pos1 + 1)
        num2 = in_order[int(pos2)]
        return (num1 + num2) / 2
    elif l % 2 != 0:
        n_middle = int((l + 1) / 2.0)
        return in_order[n_middle]
    elif l == 1:
        return in_order[int(l)]

If I don't convert all the index values to integers, the program won't accept it.
The first two branches seem to work, but it fails when it receives [1] and throws:

Oops, try again. median([1]) resulted in an error: list index out of range.

I'm trying to figure out why and to understand what is going on, so I really appreciate your patience and dedication.
When I get back I will try your code and see what happens. I will let you know.
Many thanks again,
DD


#8

I would swap your elif l % 2 != 0: and elif l == 1: branches.

Think about what happens when l = 1:

elif 1 % 2 != 0:

is true, so your elif l == 1: will never get executed.
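
A tiny sketch of that ordering effect, using two made-up helper functions (`which_branch` and `which_branch_swapped` are only for illustration): Python checks the conditions from top to bottom and runs only the first branch whose condition is true.

```python
def which_branch(l):
    if l % 2 == 0:
        return "even"
    elif l % 2 != 0:
        return "odd"
    elif l == 1:
        # Unreachable: for l == 1 the test above is already true.
        return "one"


def which_branch_swapped(l):
    if l % 2 == 0:
        return "even"
    elif l == 1:
        # Checked before the general odd case, so it can now run.
        return "one"
    else:
        return "odd"
```

With l = 1, the first version reports "odd" and the swapped version reports "one".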


#9

Thanks joeb.
Your code works.
I'm still a bit confused about why I have to make all my returns integers.
But thank you very much!


#10

Thanks a lot stetim94!
I tried, but swapping the order of the elif statements did not solve the problem and it still returns

Oops, try again. median([1]) resulted in an error: list index out of range

Ahhhh!
Finally I worked out why it all went pear-shaped! The major reason bugging my efforts was that list indexes start at 0, and my calculations pointed right at the middle ONLY IF THE INDEXES HAD STARTED AT 1. When my code pointed at four in a list of 7 items, that is correct if you start counting at 1, but if you start counting at 0 then the middle is index 3!
The problem is that when you think your logic is right you try to change everything but...
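
For anyone hitting the same thing, here is a small sketch of the off-by-one (the list `items` is just made-up sample data):

```python
items = [10, 20, 30, 40, 50, 60, 70]  # 7 items

position = (len(items) + 1) // 2  # 4: the middle when counting from 1
index = len(items) // 2           # 3: the middle when counting from 0

middle = items[index]             # 40, the true middle value
also_middle = items[position - 1] # 40, same value after shifting by one
off_by_one = items[position]      # 50, one step past the middle

# For a one-item list, (1 + 1) // 2 is 1, but the only valid index is 0,
# which would produce "list index out of range".
single = [1]
single_position = (len(single) + 1) // 2
```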

Thanks a lot joeb and stetim94!!