5. Factorial - 2 Different Solutions with performance differences


#1



For the factorial exercise, I created two versions of the solution.
My first thought was to break everything down into chunks - e.g. how would I build this in excel? Once I figured that out, I looked for ways to simplify this.

You can use much larger numbers, but expect to wait for a while.
Would be interested to learn more about performance testing for sure...

VERSION 1

def factorial(x):
    x = abs(x)							# accomodate negative numbers
    factor_list = []					# blank list
    factor_calc = 1						# prevent hitting 0, which results in final value of 0

    for i in range(1, x+1):				# from each number in 1 to x (including x)
        factor_list.append(i)			# append this number to the list, factor_list

    for i in factor_list:				# within each number in the list, factor_list
        factor_calc = i * factor_calc   # take current value of the number in the loop and multiply by previous total

    return (factor_calc)


# print(factorial(55221))

import cProfile							# this imports the performance testing library
cProfile.run('factorial(55221)')		# this outputs the performance metrics of the function

This outputs:

         55226 function calls in 0.963 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.963    0.963 <string>:1(<module>)
        1    0.958    0.958    0.963    0.963 test_one.py:1(factorial)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.abs}
        1    0.000    0.000    0.963    0.963 {built-in method builtins.exec}
    55221    0.004    0.000    0.004    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


[Finished in 1.0s]

**VERSION 2

the seemingly more efficient solution**

def factorial_b(x):
    x = abs(x)					# accomodate for negative numbers
    calc = 1					# prevent multiply by 0, which results in final value of 0

    for i in range(1, x+1):		# from each number in 1 to x (including x)
        calc = calc * i 	    # calc = current calc value * current value of i in range
    return calc


# print (factorial_b(55221))		 # uncomment this to see the function's output

import cProfile					 	 # this imports the performance testing library
cProfile.run('factorial_b(55221)')	 # this outputs the performance metrics of the function

This outputs:

         5 function calls in 0.847 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.847    0.847 <string>:1(<module>)
        1    0.847    0.847    0.847    0.847 test_two.py:1(factorial_b)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.abs}
        1    0.000    0.000    0.847    0.847 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


[Finished in 0.9s]

#2

You can speed it up significantly by multiplying them pairwise repeatedly until there's only one number left - that way the numbers being multiplied are going to have far less digits (and we're talking a whole lot of digits here)
It's also somewhat significant how the numbers are paired, ideally large numbers should be paired with small ones instead of other large ones. However, the main gain seems to be in moving away from re-using the current product

The builtin math.factorial has similar time to your functions in Python2, but in Python3 it's far faster due to being written in C instead of in Python which is the case in Python2, and also using a better algorithm (though faster ones exist)

Your two versions are arguably the same thing, but one of them puts the numbers in a list first, and then takes them out again, while the other skips that part (doesn't do anything)
I think the time difference has mostly to do with how the timing was done, they shouldn't differ as much as 10%

The timeit module is more suitable than profile for benchmarking, though I find pytest-benchmark to be a fair bit more convenient (likely depends on use case) - that can be installed and used like so:

$ python -m pip install --user pytest-benchmark
$ python -m pytest factorial.py

import math


def factorial_a(x):
    x = abs(x)							# accomodate negative numbers
    factor_list = []					# blank list
    factor_calc = 1						# prevent hitting 0, which results in final value of 0

    for i in range(1, x+1):				# from each number in 1 to x (including x)
        factor_list.append(i)			# append this number to the list, factor_list

    for i in factor_list:				# within each number in the list, factor_list
        factor_calc = i * factor_calc   # take current value of the number in the loop and multiply by previous total

    return (factor_calc)


def factorial_b(x):
    x = abs(x)					# accomodate for negative numbers
    calc = 1					# prevent multiply by 0, which results in final value of 0

    for i in range(1, x+1):		# from each number in 1 to x (including x)
        calc = calc * i 	    # calc = current calc value * current value of i in range
    return calc


def factorial_c(x):
    xs = range(1, x+1)
    while len(xs) > 1:
        pairs = zip(xs[:len(xs)//2], xs[len(xs)//2:])
        xs = [] if len(xs) % 2 == 0 else [xs[-1]]
        for a, b in pairs:
            xs.append(a * b)

    return xs[0]


def factorial_d(x):
    return product(range(1, x+1))


def product(xs):
    if len(xs) <= 3:
        result = 1
        for x in xs:
            result *= x
        return result
    return product(xs[:len(xs)//2]) * product(xs[len(xs)//2:])


def test_a(benchmark):
    result = benchmark(factorial_a, 55221)
    assert result == math.factorial(55221)


def test_b(benchmark):
    result = benchmark(factorial_b, 55221)
    assert result == math.factorial(55221)


def test_c(benchmark):
    result = benchmark(factorial_c, 55221)
    assert result == math.factorial(55221)


def test_d(benchmark):
    result = benchmark(factorial_d, 55221)
    assert result == math.factorial(55221)


def test_builtin(benchmark):
    benchmark(math.factorial, 55221)

Which produces the following:

platform linux -- Python 3.6.0, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
--------------------------------------------------------------------------- benchmark: 5 tests ---------------------------------------------------------------------------
Name (time in ms)          Min                 Max                Mean            StdDev              Median                IQR            Outliers(*)  Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_builtin           55.0256 (1.0)       58.4306 (1.0)       56.2277 (1.0)      1.0249 (1.45)      56.3166 (1.0)       1.6498 (1.78)             7;0      18           1
test_c                 80.8702 (1.47)      83.1000 (1.42)      81.7541 (1.45)     0.7088 (1.0)       81.8070 (1.45)      0.9293 (1.0)              6;0      13           1
test_d                 99.7435 (1.81)     108.6869 (1.86)     102.2997 (1.82)     2.9211 (4.12)     101.0099 (1.79)      3.1089 (3.35)             2;1      10           1
test_b                760.0894 (13.81)    777.7967 (13.31)    766.1644 (13.63)    6.8665 (9.69)     763.4134 (13.56)     6.7690 (7.28)             1;0       5           1
test_a                768.4352 (13.97)    784.2266 (13.42)    775.1470 (13.79)    7.9873 (11.27)    770.9598 (13.69)    14.9967 (16.14)            2;0       5           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

pytest runs the functions starting with test_ and because I've given them parameters named benchmark, pytest provides a benchmark object for me to use (provided by pytest-benchmark which is a plugin)
I'm also asserting that they're getting the same result as python's own math.factorial, otherwise it doesn't matter how fast they are.. (pytest reports errors if there are any, that's.. what it's usually used for)

..oh, and factorial isn't defined for negative numbers!


#3

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.