Removing Duplicates: Simplest way in 3 lines of code


#1

Guys, we have an amazing library in the Python language that should be used whenever possible! Do not reinvent the wheel! Although understanding different notions is surely great, it might make code more complex in the long run. Instead, try figuring out the simplest and laziest approach, since using convention is what Python aims for.

Do notice that by no means am I saying to not find out new ways, the beauty of programming comes from knowing that there are multiple solutions for every problem.

As such, and since I understand how to solve this problem without my simple method, I will discuss the simple method here.

# 3 lines of code method
def remove_duplicate(n): 
  # pass as a set, sets cannot have duplicates
  nSet = set(n)
  # convert to a list now
  nList = list(nSet)
  # return that
  return nList

Simple right?

Here is how you would go about doing this step by step using your own method:

def part_two(n):
    # create a list to hold the unique items
    nList = []
    # iterate through n and, for each item, check whether it is already
    # in the new list
    for i in n:
        if i not in nList:
            nList.append(i)
    return nList

I prefer the second version, solely because it helps with understanding a core fundamental principle regarding programming: data can be manipulated in whatever way you want.

As a programmer, you guys are data rockstars, you mold and shift data in whatever way you can and want. Use your powers wisely and create awesome things!

I will add solutions by other users in here:

This one is from zeziba. It is similar to my first example, although here there is no need for intermediate assignments. Pretty cool example.

def remove_duplicates(item_to_reduce):
    return list(set(item_to_reduce))

#2

try figuring out the simplest and laziest approach

Really? For many years I was thinking that we have to write code that is readable and optimal in matter of its time / space complexity. Stupid me.

Your first code is a bit cryptic. Converting a list to a set to remove duplicates might not be obvious to others, but it's not a big problem because you commented this operation.

Ok, but what about time and space complexity of these two versions?


I am not trying to be mean, but words like simpler, better, faster are very dangerous in computer science. Try to show all positive sides of your version without using these words.

At this moment I see only one positive aspect: 3 lines instead of 6 lines. But I think that the comment in the first version is necessary, so it's more like 4 lines for me. And I can write this function in 1 or 2 lines, so I really don't see why I should use your version.


#3

Cryptic? This is a fairly subjective opinion you have over what is cryptic. How is a function being called on a variable and then poised to an assignment cryptic?

What is so hard to understand from:

nSet = set(n)

If this is cryptic then you are definitely going to love c++, c and perl.

I do not think you are being mean, I think you are being ignorant. Nowhere does it say in my post that you should use my solution, none of you for that matter, it was meant to demonstrate to beginners that there are multiple solutions, SIMPLER solutions and they can feel safe implementing whatever they want.

I did not tell anyone to use my code because what I want is people to know that they have choices.

Really? For many years I was thinking that we have to write code that is readable and optimal in matter of its time / space complexity. Stupid me.

In Python that is called being pythonic and since you missed it I suggest googling for it. How in the world is assigning the result of a function to a variable cryptic will continue to be beyond me.


#4

That is not the simplest way to remove duplicates at all....

def remove_duplicates(item_to_reduce):
    return list(set(item_to_reduce))

There is no need to create anything other than use the built in functions that are already present.


#5

There is no need, but again and just to make sure everyone is reading:

THIS IS MEANT TO HELP BEGINNERS UNDERSTAND THINGS


#6

Even then I will add your solution to the list, just please keep in mind, this is only to show that there are multiple ways in which things can be solved. What I want is for beginners to look at solutions and then figure out which way works best for them, everything from scratch? Library use? That is what I want, for beginners to see and understand choice so that they do not feel overwhelmed.


#7

Cryptic? This is a fairly subjective opinion you have over what is cryptic. How is a function being called on a variable and then poised to an assignment cryptic?

What is so hard to understand from:

nSet = set(n)

If this is cryptic then you are definitely going to love c++, c and perl.

You are right, I love C and Perl. Most of the time I spend on creating artificial neural networks in C.

Ok, why do I think that your method is a bit cryptic? You have converted a list to a set, two completely different data structures, to remove duplicates. You should warn users about one big downside: the order of the elements is not preserved.

From your comment, I guess that you program in C, C++ and Perl. Great. As you probably know, every data structure has different characteristics. Good programmers use the data structures that are best suited for the current task. So if you work at a software company and have to write a function that removes duplicates from a list, you usually want to keep the original order of elements.

That is why I think your code is cryptic: the effect of converting a list to a set goes well beyond removing duplicates.
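To make the point concrete, here is a small sketch that runs the two functions from the first post side by side. The sample lists are made up for illustration. It shows two ways the set-based version behaves differently: the order of the result is not guaranteed, and the items must be hashable.

```python
# Sketch only: both functions come from the first post,
# the sample data is hypothetical.

def remove_duplicate(n):
    # set-based version: the order of the result is not guaranteed
    return list(set(n))

def part_two(n):
    # loop-based version: keeps the first occurrence of each item, in order
    nList = []
    for i in n:
        if i not in nList:
            nList.append(i)
    return nList

words = ["pear", "apple", "pear", "fig", "apple"]

print(part_two(words))          # ['pear', 'apple', 'fig']
print(remove_duplicate(words))  # the same three items, in no particular order

# Sets need hashable items, so a list of lists breaks the set-based version.
pairs = [[1, 2], [1, 2], [3, 4]]
print(part_two(pairs))          # [[1, 2], [3, 4]]
try:
    remove_duplicate(pairs)
except TypeError as err:
    print("set-based version failed:", err)
```

So the two functions only agree when the items are hashable and the caller does not care about order.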

I do not think you are being mean, I think you are being ignorant.

Ok, I can live with that.

Nowhere does it say in my post that you should use my solution, none of you for that matter, it was meant to demonstrate to beginners that there are multiple solutions, SIMPLER solutions and they can feel safe implementing whatever they want.

Your code does not have the same result as the code provided in exercises. It's not a different method, it's a completely different function.

Showing beginners that they can just convert a list to a set to remove duplicates without explaining what is going on under the hood? Dangerous simplification.

Your solution is not simpler. Not if the order of elements is important, and if someone uses a list I believe they want to keep the original order.

I did not tell anyone to use my code because what I want is people to know that they have choices.

Great. That is why in my first comment I was trying to push you to present a comparison of both methods. What is the time complexity? Does the function preserve order? Which method is more readable?

You have chosen to say that I am ignorant. :smile:

In Python that is called being pythonic and since you missed it I suggest googling for it. How in the world is assigning the result of a function to a variable cryptic will continue to be beyond me.

Fortunately, I already explained this. About the pythonic code - don't worry, your code is not pythonic. I am a big fan of style guides in programming. :smile:


#8

The instructions for the program itself said that the order for which the list items are returned is itself not important, they even mentioned an example for this in which what they want is for duplicated items to be removed, that is it, it does not say that the order is important hence why the program I wrote is relevant.

The point from my original post was to demonstrate that there are multiple solutions for this particular problem. You wanted to make the issue at hand more detailed and you are most definitely right when you say that order is important, then again it might not be important depending on the task at hand, for this particular task it is not.

You want to conclude on my code not being Pythonic, according to you it is not idiomatic to what Python stands for? If anything, it is a matter of you not liking the solution for your stated reasons, which in themselves are not entirely valid based on the task given by Codecademy.

Now regarding time complexity, you mentioned that a beginner might have an issue with understanding the conversion between sets and lists and vice versa, but you want me to go into why one method of solving the problem would be more efficient than the other? Don't get me wrong, algorithm efficiency is definitely a subject that must not be skipped, but let's go by levels, man: first let them understand that they have choices, then we teach them why a particular choice is more efficient than the other.

I like how you do know what you are talking about. I apologize for calling you ignorant on the matter.

As a parting note, care to share some examples as to what other solutions might they have for this particular problem set? And I am saying that in the nicest non hostile way.

Cheers



#9

The instructions for the program itself said that the order for which the list items are returned is itself not important, they even mentioned an example for this in which what they want is for duplicated items to be removed, that is it, it does not say that the order is important hence why the program I wrote is relevant.

You are right! Yes, according to the instructions you don't have to care about order preservation. In my opinion that's a stupid requirement; I have already explained why I think so.

I am really not trying to offend you. In fact, I appreciate that you are helping others. I just believe that we should try to maintain a higher standard of explanations than Codecademy. In your first post here you said "Use your powers wisely and create awesome things!". Exactly! Every method has some pros and cons, and if you want to create a discussion that aggregates different functions, you should present their positive and negative aspects.

This is my opinion. You don't have to agree with me. Really.

The point from my original post was to demonstrate that there are multiple solutions for this particular problem. You wanted to make the issue at hand more detailed and you are most definitely right when you say that order is important, then again it might not be important depending on the task at hand, for this particular task it is not.

And here I see my mistake. I think that Codecademy is not a good learning resource, and I thought that you wanted to aggregate functions that remove duplicates from a list, not list solutions to this particular exercise. The difference is big, right? :smile:

You want to conclude on my code not being Pythonic, according to you it is not idiomatic to what Python stands for? If anything, it is a matter of you not liking the solution for your stated reasons, which in themselves are not entirely valid based on the task given by Codecademy.

Python is similar to Ruby in this matter. I think there are three levels of code quality in Python:

  1. Poorly written code.
  2. Well written code.
  3. Pythonic code.

Your code in my opinion is poorly written, it reminds me of this code:

def is_odd(n):
    if n % 2 == 1:
        result = True
    else:
        result = False
    return result

Is this code pythonic? Nope. It's a poorly written snippet.

The code provided by @zeziba is not a different solution. It's a well-written version of your function.

Now regarding time complexity, you mentioned that a beginner might have an issue with understanding the conversion between sets and lists and vice versa, but you want me to go into why one method of solving the problem would be more efficient than the other? Don't get me wrong, algorithm efficiency is definitely a subject that must not be skipped, but let's go by levels, man: first let them understand that they have choices, then we teach them why a particular choice is more efficient than the other.

No, I mentioned that a beginner might not be aware of the fact that your method destroys the order of elements.

You have said that they should use the laziest approach. They should not. They should be aware of the pros and cons of each method so they can choose the one best suited for the given task (also in the future, not only in this single exercise).

There is a great, simple way to show the time complexity of all methods: a timing benchmark. Use 4 random data sets: 500 short lists in the first set, 500 longer lists in the second set, and so on. The essence of time complexity, don't you think?
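A rough sketch of such a benchmark, using only the standard `random` and `timeit` modules. The names `with_set`, `with_loop`, `make_data_set` and `benchmark` are made up for this example, and the list counts and lengths are smaller than the 500 suggested above so that it finishes quickly. The loop version checks `item not in result` against a list, which is a linear scan, so its running time should grow roughly quadratically with the list length. The set version should grow roughly linearly on average.

```python
import random
import timeit

def with_set(items):
    # set-based: roughly linear on average, order not guaranteed
    return list(set(items))

def with_loop(items):
    # list-based: `item not in result` scans the list on every iteration
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result

def make_data_set(list_count, list_length, seed=0):
    # hypothetical helper: list_count random integer lists of list_length items
    rng = random.Random(seed)
    return [[rng.randrange(list_length) for _ in range(list_length)]
            for _ in range(list_count)]

def benchmark(func, data_set):
    # seconds needed to run func once over every list in the data set
    return timeit.timeit(lambda: [func(lst) for lst in data_set], number=1)

for length in (10, 100, 1000):
    data = make_data_set(20, length)
    print(length, "items per list:",
          "set", round(benchmark(with_set, data), 4),
          "loop", round(benchmark(with_loop, data), 4))
```

The absolute numbers depend on the machine. What matters is how quickly each column grows as the lists get longer.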

I like how you do know what you are talking about. I apologize for calling you ignorant on the matter.

No need to apologize :smile: We are two random guys. You can call me whatever you want, and I really mean this. I am here to share some thoughts about programming, not to argue with you about my humble person.

As a parting note, care to share some examples as to what other solutions might they have for this particular problem set? And I am saying that in the nicest non hostile way.

Just to clarify - I don't think that you are mean.

Sure, I would love to!

My favourite solution is:

def remove_duplicates(l):
    seen = set()
    return [x for x in l if x not in seen and not seen.add(x)]

This is the fastest order-preserving solution that I know. I think this function was written by Dave Kirby, but, unfortunately, I am not sure. I will try to find a source later.
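In case the `not seen.add(x)` part looks odd: `set.add` always returns `None`, so `not seen.add(x)` is always true and is only there for its side effect of recording `x`. Because `and` short-circuits, the `add` only runs when `x` has not been seen yet. Here is the same logic written out as a plain loop. The name `remove_duplicates_explicit` is just for this sketch.

```python
def remove_duplicates(l):
    seen = set()
    return [x for x in l if x not in seen and not seen.add(x)]

def remove_duplicates_explicit(l):
    # same idea, spelled out step by step
    seen = set()
    result = []
    for x in l:
        if x not in seen:
            seen.add(x)       # remember the item
            result.append(x)  # keep its first occurrence
    return result

sample = [3, 1, 3, 2, 1]
print(set().add(3))                        # None
print(remove_duplicates(sample))           # [3, 1, 2]
print(remove_duplicates_explicit(sample))  # [3, 1, 2]
```

Both versions need hashable items, since they store what they have seen in a set.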

If I did not care about order I would probably use this function:

def remove_duplicates(l):
    return {}.fromkeys(l).keys()
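A note for anyone running this on Python 3: there `keys()` returns a view object rather than a list, so you would wrap the result in `list()`. Also, since Python 3.7 dicts keep insertion order, so on those versions the dict-based approach happens to preserve order as well. A minimal sketch, assuming Python 3.7 or later:

```python
def remove_duplicates(l):
    # dict keys are unique; on Python 3.7+ they also keep insertion order
    return list(dict.fromkeys(l))

print(remove_duplicates([3, 1, 3, 2, 1]))  # [3, 1, 2]
print(remove_duplicates("banana"))         # ['b', 'a', 'n']
```

As with the set-based versions, the items have to be hashable.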

#10

the code below worked for me

def remove_duplicates(x):
    # p collects each item the first time it is seen
    p = []
    for i in x:
        if i not in p:
            p.append(i)
    return p


#11

this is supposed to be duplicates


#12

A question from a beginner:
Why is the "is_odd" example bad code, and what would be a better option? Just trying to understand the difference between bad and good.


#13

Not sure what the best way to do it is, but in my opinion:

def is_odd(n):
    return n % 2 == 1

would be a better option.

It gets rid of the result variable and the if/else without being harder to read.

Also, it uses only one return statement. Some people advise against having multiple return statements.
I think it's fine to have multiple return statements, but only if there is a good reason to have them, like making the code more readable.


#14

It does illustrate better the options available to be honest. As a complete beginner it would be so easy for me to get lost in the banter of seasoned coders and not see the wood for the trees, so it is useful to read exchanges such as this one in a forum :wink:


#15