[better way?] Can I avoid having to make a new empty list every time?


#1
def remove_duplicates(x):
    new=[]
    for char in x:
        if char not in new:
            new.append(char)
    return new

See the "new = []"? Is there a way the pros do it so they don't have to do that?


#2

We'll need a guru to weigh in. Learning to write algorithms is fundamental, but eventually we expand our programming scope to include Python built-in functions.

def remove_duplicates(x):
    return list(set(x))

Run it...

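x = [2,4,6,8,10,5,6,3,2,4,8,7,9,10]
# The function returns a new list; x itself is left unchanged, so print the result.
print remove_duplicates(x)        # [2, 3, 4, 5, 6, 7, 8, 9, 10] (set order is not guaranteed)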

#3

A caveat with mtf's solution is that the original order of the elements is lost.

An alternative:
http://stackoverflow.com/a/7961425
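
For reference, one common order-preserving approach looks roughly like the sketch below. It is not necessarily what the linked answer does, and the name remove_duplicates_ordered is made up for this example. It assumes every element is hashable (numbers, strings, tuples), and it runs on both Python 2 and 3.

```python
def remove_duplicates_ordered(items):
    seen = set()      # values we have already met (fast membership test)
    result = []       # unique values, in first-seen order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(remove_duplicates_ordered([2, 4, 6, 8, 10, 5, 6, 3, 2, 4, 8, 7, 9, 10]))
# [2, 4, 6, 8, 10, 5, 3, 7, 9]
```

Note that it still starts from an empty list; the set is only there to make the "have I seen this?" check cheap.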


#4

There's a problem with the code in OP. It's slow. Don't worry, it's fine for this purpose.

It searches through new every time it considers a value, comparing that value with each one already stored. For a large list this gets out of hand, because each check takes longer the more values new already holds. In the worst case the total work grows roughly with the square of the list's length.

Sets don't need to compare against every other element to see whether a value is already present, which makes them faster for de-duplicating large lists. The short explanation is that a set computes where the value would be stored if it were in the set, and then only needs to look in that place (it's a hash table).
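
To make the "computes where the value would be" idea concrete, here is a deliberately simplified toy. ToyHashSet is a made-up name, and Python's real set is implemented differently (it resizes itself and handles collisions another way), so treat this only as an illustration of the principle.

```python
class ToyHashSet(object):
    """Simplified illustration of a hash table; not how the built-in set is written."""

    def __init__(self, num_buckets=8):
        # A fixed number of small lists ("buckets").
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, value):
        # hash() turns the value into an integer; the remainder picks a bucket.
        return self.buckets[hash(value) % len(self.buckets)]

    def add(self, value):
        bucket = self._bucket(value)
        if value not in bucket:      # only this one bucket is searched
            bucket.append(value)

    def __contains__(self, value):
        return value in self._bucket(value)

s = ToyHashSet()
s.add(5)
s.add(5)         # duplicate, ignored
s.add(13)        # lands in the same bucket as 5, because 13 % 8 == 5
print(5 in s)    # True
print(6 in s)    # False
```

However many values are stored, a lookup only inspects one bucket, which is why the cost per element stays roughly constant as long as the buckets stay short.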


#5

Thanks for your reply, but I don't really get how it cycles through the whole list. Also, what is the list() thing? Does it refer to a list? I don't understand the brackets after it. I'm sorry I'm such a newbie, thanks :smile:


#6

Thx, I'll keep this in mind :smile:


#7

set and list are classes (types of values).

A list is an instance of the list class; 5 is an instance of the int class.

Calling a class creates an instance of it.

print int() # 0

When given no arguments, int gives 0 as a default.

You can also give it a string:

print int('5') # 5

And the int class converts it.

Calling list produces an empty list:

print list() # []

And set..

print set() # set([])

Container types accept an iterable (anything you can loop over) as the argument to their constructors, and typically they add one element for each item in it.
If you loop through a string, you get each character.

print list('hello') # ['h', 'e', 'l', 'l', 'o']
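
Putting those pieces together, list(set(x)) can be read as two separate calls. The sketch below just splits it into steps; the variable names are invented for the example, and it runs on both Python 2 and 3.

```python
x = [2, 4, 2, 6, 4]

unique = set(x)          # set loops over x itself and keeps each value once
as_list = list(unique)   # list loops over the set and adds one element per value

print(sorted(as_list))   # [2, 4, 6]
```

The loop has not gone away; it has moved inside the set and list constructors. sorted is used here only because a set does not promise any particular order.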

#8

All these people helped me out a lot and have taught me tons, but to answer my own question: a new variable just needs to be set to a list first before calling .append() etc. on it (at least to the furthest of my knowledge).


#9

append is a method that lists have; it's not something you can do to any name.

[].append(5)

The above is rather pointless though, since there's no reference to that list, it just disappears.

You can create a type that defines an append method:

class Herpaderp:
    def append(self, value):
        print "totally adding %r here." % (value,)

Herpaderp().append('meow') # prints: totally adding 'meow' here.

And you can add a constructor too..

class Herpaderp:
    def __init__(self, seq):
        for elem in seq:
            self.append(elem)

    def append(self, value):
        print "totally adding %r here." % (value,)

Herpaderp('meow') # it'll tell you it's appending each letter in 'meow'
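
If the class should actually keep what it is given, it needs somewhere to put it. A minimal sketch, with the hypothetical names Bag and items chosen only for this example:

```python
class Bag(object):
    """Hypothetical container, for illustration only."""

    def __init__(self, seq=()):
        self.items = []              # the empty list is still created, just inside the class
        for elem in seq:
            self.append(elem)

    def append(self, value):
        self.items.append(value)     # delegate to the real list's append

bag = Bag('meow')
print(bag.items)   # ['m', 'e', 'o', 'w']
```

So built-in constructors such as list('meow') do not avoid creating an empty container; they just do it for you behind the call.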

#10

Here is my flatten function. I wrote it for my Battleship game so that I could flatten the different kinds of input my AI algorithm receives from several sources, and also to sanitize output when needed.

def __flatten(self, a_list):
    """
    Flattens a list containing arbitrarily nested lists/items into a single list with all values.
    :param a_list: The list you wish to flatten. No nested list will be left within it. Order is maintained.
    :return: The flattened list. The function calls itself on the partially
    flattened list for as long as any element is still a list.
    """
    hold = []
    for item in a_list:
        if isinstance(item, list):
            # Unpack one level of nesting.
            hold.extend(item)
        else:
            hold.append(item)
    # Each pass removes one level of nesting; repeat until nothing is nested.
    return self.__flatten(hold) if any(isinstance(part, list) for part in hold) else hold

This function is fast enough that it gets called over 1,700 times a second without overloading the CPU. Python imposes an inherent limit on recursion depth, so there is that, but it does what it is supposed to do.

Here is a link to a generator version of flatten too; it is simple and does what is needed.

The nice thing about generators is they do not consume very much memory so they are good to work on databases and such with huge data sets.
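
This is not the linked code, just a minimal sketch of what a generator-based flatten could look like. It is written as a plain function (no self), treats only list objects as nested, and avoids "yield from" so that it runs on both Python 2 and 3.

```python
def flatten(items):
    """Yield the values of an arbitrarily nested list one at a time, in order."""
    for item in items:
        if isinstance(item, list):
            # Recurse into the nested list and pass its values along.
            for sub in flatten(item):
                yield sub
        else:
            yield item

print(list(flatten([1, [2, [3, 4]], 5, [[6]]])))   # [1, 2, 3, 4, 5, 6]
```

Because values are produced one by one, the caller can loop over flatten(...) directly without ever building the full flat list; list(...) is used above only to show the result.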


#11

Just to kick this can around a little bit more...

def remove_dupes(x):
    y = []
    for a in x:
        y.append(a) if a not in y else False
    return y
    
old_list = [2,4,6,8,10,5,6,3,2,4,8,7,9,10]
new_list = remove_dupes(old_list)

print new_list     # [2, 4, 6, 8, 10, 5, 3, 7, 9]

However, we're back to the empty list.
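
For what it's worth, one way to keep the order without writing the empty list yourself is collections.OrderedDict from the standard library. A sketch, assuming all elements are hashable; the function name remove_dupes_ordered is made up for this example.

```python
from collections import OrderedDict

def remove_dupes_ordered(x):
    # fromkeys keeps one key per distinct value, in first-seen order;
    # list() then collects those keys.
    return list(OrderedDict.fromkeys(x))

old_list = [2, 4, 6, 8, 10, 5, 6, 3, 2, 4, 8, 7, 9, 10]
print(remove_dupes_ordered(old_list))   # [2, 4, 6, 8, 10, 5, 3, 7, 9]
```

An empty container is still created along the way; it just happens inside OrderedDict and list rather than in your own code.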