Median


#1

This is the code I came up with that succeeded, but I was wondering if there were any obvious spots I could make more efficient, because I’m sure there are.

def median(x):
    med = 0
    sort = sorted(x)
    halflen = (len(sort) / 2)
    if len(sort) == 1:
        med = x
    elif len(sort) % 2 == 0:
        med = (sort[halflen - 1] + sort[halflen]) / 2.0
    else:
        med = sort[int(halflen + 0.5)]
    return med

Median 15/15
#2

This line:

med = 0

does nothing useful: every branch of the if/elif/else assigns med before it is returned, so it can be removed.

This line:

med = sort[int(halflen + 0.5)]

Why + 0.5? Say the list has a length of 3: halflen is then 1 (3 / 2 = 1 with Python 2's integer division). For a length of 5, halflen is 2, and so on. Those are exactly the middle indexes, so you can just use sort[halflen] to get the median for lists of odd length.

Once that is fixed, this:

if len(sort) == 1:
    med = x

is no longer needed either. Note that it also returns the whole list x rather than its single element. The else clause can handle a length of 1 (feel free to do the math).
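
To do that math concretely, here is a small sketch. The helper name middle_index is made up for illustration, and it uses // so that it behaves the same way as Python 2's integer division:

```python
def middle_index(n):
    # Hypothetical helper: floor division, like int / int in Python 2.
    return n // 2

for n in (1, 3, 5, 7):
    indexes = list(range(n))
    below = indexes[:middle_index(n)]
    above = indexes[middle_index(n) + 1:]
    # The middle index has as many indexes below it as above it,
    # and that includes the single-element case (n = 1, index 0).
    assert len(below) == len(above)
```

So a list of length 1 lands on index 0 in the else clause, with no special case needed.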


#3

Oh right. For some reason I was thinking of halflen as a float, where I’d have 1.5, 2.5, etc. and need to add 0.5 to bring it up to a whole number. I can see now why it was throwing an error: I was using a float as an index, and then unknowingly fixed my own error by converting it to int, haha. Thanks for your help, really appreciate it!


#4

Nope, in Python 2, division of two integers floors (rounds down), so 5 / 2 = 2, not 2.5.

In Python 3, however, you would be right: the division results in a float (5 / 2 = 2.5).

But if the length of the list is 5, the indexes are 0, 1, 2, 3 and 4. Adding 0.5 would give 3, which is not the middle index; you would want to subtract 0.5 instead.

Ideally, you would just use floor division (//), which gives the same integer result in both versions.
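
A quick sketch of the difference, assuming it is run under Python 3 (the variable names are only for illustration):

```python
n = 5                              # indexes 0, 1, 2, 3, 4; the middle is 2

true_div = n / 2                   # 2.5 in Python 3 (Python 2 would give 2)
plus_half = int(true_div + 0.5)    # 3, one past the middle
minus_half = int(true_div - 0.5)   # 2, the middle
floor_div = n // 2                 # 2 in both Python 2 and Python 3

print(true_div, plus_half, minus_half, floor_div)
```

Only the // version gives the middle index without depending on which Python version is running.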


#5

Look for repeated patterns or phrases and cache, rather than repeat.

Avoid using the names of built-in functions, even if it doesn’t hurt anything, just so you don’t get in the habit.

s = sorted(x)
n = len(s)

We do not need a special case for length of 1, that will fall under ‘odd’. Consider what the middle index will be…

m = int(n / 2)

Now when we divide n by 2, we get the index of the median for any odd length sample.

With those three pieces of information we have enough to proceed with our return values.

if n % 2:         # will be True for odd
    return s[m]
return float(s[m - 1] + s[m]) / 2

Above, and earlier, we used explicit type casting so the reader has no question in their mind what we are after. Notice that only the numerator above is cast to a float, and we leave the counting number in the denominator untouched.
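
Put together, the pieces above might look like the sketch below. It assumes a non-empty list of numbers, and it swaps int(n / 2) for n // 2, which is a substitution on my part rather than something from the snippets above:

```python
def median(x):
    s = sorted(x)          # cache the sorted copy
    n = len(s)             # cache the length
    m = n // 2             # middle index (assumed swap for int(n / 2))
    if n % 2:              # truthy for odd lengths, including 1
        return s[m]
    # Even length: average the two middle values as a float.
    return float(s[m - 1] + s[m]) / 2

print(median([1]))
print(median([3, 1, 2]))
print(median([4, 5, 5, 4]))
```

An empty list is not handled here; it would raise an IndexError on the last line.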

