15/15 Is there a better way?


#1
def median(numbers):
    numbers = sorted(numbers)
    total = 0
    for num in numbers:
        total += 1
    indexnum = total / 2 - 1
    indexnum2 = total / 2
    if total % 2 == 0:
        dev = numbers[indexnum2] - numbers[indexnum]
        k = float(dev) / 2
        return numbers[indexnum] + k
    else:
        return numbers[indexnum2]

That's how I solved it. I wonder if there's a simpler way.


#2

There are multiple ways, but what counts as better? You could reduce the number of lines, but does the code stay nicely readable? Anyway, speaking purely in number of lines, I was curious how much I could shrink it. This is what I came up with:

def median(x):
    x = sorted(x)
    return x[len(x)/2] if len(x)%2==1 else (x[len(x)/2] + x[len(x)/2-1])/2.0
    
print median([4,5,5,4])

I used something that I am not sure has been covered yet, a one-line if statement (a conditional expression):

value_when_true if condition else value_when_false

but i wanted to return the value:

return value_when_true if condition else value_when_false

so, then i checked if the list contains odd or even number of items:

if len(x)%2==1

if that is the case, return:

x[len(x)/2]

Let's say the list contains 5 items. Then len(x) = 5 and 5 / 2 = 2 (integer division in Python 2), which is the middle item (items are indexed 0, 1, 2, 3, 4, so 2 is the middle one).
Then the else branch, which is a bit more tricky. Where you use indexnum and indexnum2, I do that in one line:

(x[len(x)/2] + x[len(x)/2-1])

Then I divide that by 2.0 (dividing an int by a float returns a float).
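For reference, here are the same steps spelled out with named intermediate values. This is only a sketch: the names median_spelled_out, ordered, n, and mid are mine, and it uses // (floor division) so the index arithmetic also works on Python 3, where / always returns a float.

```python
def median_spelled_out(numbers):
    ordered = sorted(numbers)   # sorted() returns a new list
    n = len(ordered)
    mid = n // 2                # floor division keeps the index an int
    if n % 2 == 1:
        # Odd length: exactly one middle item.
        return ordered[mid]
    # Even length: average the two middle items; 2.0 forces a float result.
    return (ordered[mid] + ordered[mid - 1]) / 2.0


print(median_spelled_out([4, 5, 5, 4]))     # 4.5
print(median_spelled_out([1, 2, 3, 4, 5]))  # 3
```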

Hope this helps. Don't feel bad that you didn't come up with this solution; I have been coding a bit longer. It is a fun challenge to try to reduce the number of lines of code you have, and it will make you a better programmer. Maybe use some of the tricks I showed you here? Good luck!


#3

Oh that's pretty nice.
I haven't learned about one-line if statements yet, but I can see the logic just by reading it. It could be useful for shrinking several lines down into one.

You said at the start that the code might be small but harder to read. Does code with more lines take longer to process?
What are the benefits of having shrunk code as opposed to multiple lines like the ones I posted?


#4

Here is the thing: seeing a solution and understanding it is step 1. Step 2 is to actually come up with an efficient solution and write it yourself.

Yes, if you have more lines of code you have more possibilities to include comments, so when you look back at your code later it is easier to understand. In a Codecademy exercise this most likely won't be a problem.

Fewer lines does not always mean the code runs faster. But I am afraid my knowledge ends at this point. I have read about cases where more lines were used but the code was faster. Those examples involve the Linux kernel, assembly, and C, though, so I think they would only be confusing.

For example, with my solution you could argue that the length is calculated 3 times, which means the len function is called 3 times. This is unnecessary, because once we have sorted the list inside the function its length is fixed, so you could do:

def median(x):
    x = sorted(x)
    y = len(x)
    return x[y/2] if y%2==1 else (x[y/2] + x[y/2-1])/2.0
    
print median([4,5,5,4])

Now I have more lines, but maybe it is faster (I don't know; I would have to measure it, and I am sure that is possible).
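One way to measure it would be the standard library's timeit module, which runs a call many times in one process. The sketch below is only an illustration: the function names and the time_call helper are made up for this example, // is used so it also runs on Python 3, and the numbers it prints will differ from machine to machine.

```python
import timeit


def median_repeated_len(x):
    x = sorted(x)
    return x[len(x) // 2] if len(x) % 2 == 1 else (x[len(x) // 2] + x[len(x) // 2 - 1]) / 2.0


def median_cached_len(x):
    x = sorted(x)
    y = len(x)  # look the length up once and reuse it
    return x[y // 2] if y % 2 == 1 else (x[y // 2] + x[y // 2 - 1]) / 2.0


def time_call(func, data, number=10000):
    # Total seconds for `number` calls; keep the best of three rounds.
    return min(timeit.repeat(lambda: func(data), number=number, repeat=3))


if __name__ == "__main__":
    data = [4, 5, 5, 4]
    for func in (median_repeated_len, median_cached_len):
        total = time_call(func, data)
        print("%s: %.3g seconds per call" % (func.__name__, total / 10000))
```

For a list this small, sorting and the function call itself likely dominate, so any difference between the two may be hard to see.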


#5

Ooh I have three contributions.

slight rewrite of @stetim94's version:

def median(x):
 y=len(x)
 return x.sort()or y%2and x[y/2]or(x[y/2]+x[y/2-1])/2.0

My attempt at a low amount of characters:

median=lambda x:sum(sorted(x)[(len(x)-1)/2:len(x)/2+1])/(2.0-len(x)%2)
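To unpack that one-liner: the slice picks one middle item for an odd-length list and two for an even-length list, and the divisor (2.0 minus the remainder) is 1.0 or 2.0 accordingly. A rough sketch with made-up helper names (middle_slice, median_by_slice), using // so it also runs on Python 3, and dividing by the slice length, which has the same value as that divisor:

```python
def middle_slice(values):
    s = sorted(values)
    n = len(s)
    # Odd n (e.g. 5):  s[2:3] -> one item.
    # Even n (e.g. 4): s[1:3] -> two items.
    return s[(n - 1) // 2 : n // 2 + 1]


def median_by_slice(values):
    middle = middle_slice(values)
    # Dividing by the slice length matches 2.0 - len(x) % 2 in the lambda.
    return sum(middle) / float(len(middle))


print(middle_slice([5, 1, 4, 2, 3]))   # [3]
print(middle_slice([4, 5, 5, 4]))      # [4, 5]
print(median_by_slice([4, 5, 5, 4]))   # 4.5
```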

How I would actually do it:

from numpy import median

..But then again numpy has to be available. It's a common package, but it doesn't ship with CPython and it would be silly to have a dependency just to be able to compute a median

CPython does ship with pip though, and from the command line, one can install numpy with:

pip install numpy

Oh wait, there's a built-in one too! But only in Python 3 (3.4 and later):

from statistics import median
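A quick usage sketch of the standard library version (Python 3.4+ only, so it will not run on the Python 2 interpreter used elsewhere in this thread):

```python
from statistics import median

print(median([4, 5, 5, 4]))  # 4.5 (mean of the two middle values)
print(median([3, 1, 2]))     # 2   (the single middle value)
```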

#6

@stetim94 I see. The new Visual Studio can calculate the amount of time it takes for code to run, and it also supports Python. I'll check there when I have the time.
Thank you for the answers, they really cleared things up for me.

@ionatan I haven't learned about lambda functions yet.
I also didn't quite understand how the first piece of code works.
I can see how importing a function from a module would be simpler, but how can I tell whether there's a module with something I need? Is it just about fiddling around with modules until you find something useful that you then remember for later, or is there a better way?


#7

You need Visual Studio to calculate it? Anyway, I was curious, so I measured it myself, though I doubt the test is fair. I modified the code:

import time
start_time = time.time()
def median(x):
    x = sorted(x)
    return x[len(x)/2] if len(x)%2==1 else (x[len(x)/2] + x[len(x)/2-1])/2.0
            
median([4,5,5,4])
print "--- %s seconds ---" % (time.time() - start_time)

Then I told bash to run that 1000 times and store the output in a file:

for i in `seq 1 1000`; do python2.7 python.py; done | awk '{ print $2 }' > tempstore

Then I copied that file to my clipboard and calculated the average in Excel. It is:
1.45*10^-5 (so 0.0000145 seconds)

But this number is so small, i doubt the average is reliable.


#8

There's nothing special about lambda. Mine creates a function that takes one argument, x, and returns the result of the expression to the right of the colon.

Normally, a function is created with a function (def) statement. Statements have side effects; in the case of a def statement, the side effect is that a function gets created and the name you specified will refer to it.

A lambda expression is not a statement, so no side effect, instead the result of the expression is a function.
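A small sketch of the difference (the names double_def and double_lambda are just for illustration): the def statement binds a name as a side effect, while the lambda expression simply evaluates to a function that can be assigned or used on the spot.

```python
# Statement: creates the function AND binds the name double_def.
def double_def(x):
    return x * 2


# Expression: the result is a function; binding it to a name is a separate step.
double_lambda = lambda x: x * 2

print(double_def(21))          # 42
print(double_lambda(21))       # 42
print((lambda x: x * 2)(21))   # 42, used directly without ever naming it
```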

Likewise, a ternary expression (if-else in an expression) is not a statement; it has a result. An if-statement, on the other hand, has no result but a side effect, which is that other statements are executed.

My first piece of code is intentionally a bit difficult to understand. It's just exploiting that the logical operators and/or are short-circuited, meaning that if the left hand side (lhs) of the or-operator is truthy, then that's the result, and the rhs doesn't need to be evaluated at all.
(A value is truthy if converting it to bool evaluates to True.)
For example, this will not print anything:

from __future__ import print_function # Make print a function instead of statement
5 or print('this will not print because 5 is truthy')

list.sort always returns None, which is falsy, so the result of the first or is that of the rhs which is the rest of the expression. So this remains: y%2and x[y/2]or(x[y/2]+x[y/2-1])/2.0

The and operator will not evaluate its rhs if the lhs is falsy. So if y%2 is 1 (truthy), then x[y/2] also gets evaluated, and its result will probably be truthy, making that the final result, because the last or will then ignore its rhs.

I say probably, because the result might be 0 which.. means that there's a scenario where my function will calculate an incorrect median!
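A minimal sketch of that failure case. The function names are made up for this example, // is used so it also runs on Python 3, and the list is copied first so the caller's list is not sorted in place:

```python
def median_and_or(values):
    x = list(values)  # copy, because list.sort() sorts in place
    y = len(x)
    # x.sort() returns None (falsy), so evaluation continues to the right.
    return x.sort() or y % 2 and x[y // 2] or (x[y // 2] + x[y // 2 - 1]) / 2.0


def median_conditional(values):
    x = sorted(values)
    y = len(x)
    # A conditional expression tests only y % 2, never the median value itself.
    return x[y // 2] if y % 2 else (x[y // 2] + x[y // 2 - 1]) / 2.0


print(median_and_or([1, 2, 3]))        # 2
print(median_and_or([-1, 0, 1]))       # -0.5, wrong: the middle value 0 is falsy
print(median_conditional([-1, 0, 1]))  # 0
```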

Obviously, code should be easy to read. The only useful thing about code like that is understanding how those operators work.

As for modules, just google for what you want to do. If the module exists, somebody else will have asked about it as well and it'll turn up as one of the search results.

Regarding time, computing a median requires sorting, and then it's just looking up one or two values which takes a negligible amount of time.

With radix sort, sorting numbers takes time proportional to k * n, where k is the length of the longest number (in number of digits) and n is the length of the list.
But Python's sort takes n * log(n) time because it sorts by comparing values to each other (radix sort can only be used for numbers). So those median functions in numpy and statistics just might be using radix sort instead, which would be a significant improvement for large lists. That would be more difficult to implement, particularly for floats.
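To make the k * n idea concrete, here is a rough sketch of a least-significant-digit radix sort. It is my own illustration, handles non-negative integers only, and is not a claim about what numpy or statistics actually do internally:

```python
def radix_sort(numbers, base=10):
    """Sort non-negative integers with one bucketing pass per digit."""
    items = list(numbers)
    if not items:
        return items
    largest = max(items)
    place = 1  # 1, 10, 100, ... : the digit position handled in this pass
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for n in items:
            # Appending keeps earlier order within a bucket (stable pass).
            buckets[(n // place) % base].append(n)
        items = [n for bucket in buckets for n in bucket]
        place *= base
    return items


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Each pass touches all n items once, and there is one pass per digit of the largest number, which is where k * n comes from.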