Is it okay to use a data set to reduce the load?

projecteuler

#1

This process assumes the solution triplet to be a multiple of one of a small set of known triplets. Starting with the first five.

from math import sqrt, trunc

def f_of_x(a, b, x):
    return trunc(sqrt((a * x)**2 + (b * x)**2)), (a + b) * x

knowns = [(3,4), (5,12), (7,24), (8,15), (20,21)]

for k in knowns:    
    x = 0                                         # literally, x, the independent
    r = (0,0)
    
    while r[0] + r[1] < 1000:        
        x += 1        
        r = f_of_x(k[0], k[1], x)                 # return tuple includes h(x)

    if r[0] + r[1] == 1000:                       # r[0] is f(x), r[1] is h(x)

        a, b, c = k[0], k[1], r[0]

        f = a * b * c * x**2                      # solution
        
        g = [(a * x, b * x, c), x, (k, c // x)]   # report
        
        break
        
print (f, g)

31875000 [(200, 375, 425), 25, ((8, 15), 17)]

The assumption is correct, in this instance. The solution is a multiple of the fourth in the set of knowns.This whole process is based on finite solutions without the use of brute force, while still trial and error. And it yields data. From a full on efficiency perspective it is probably wasteful owing to memory usage. A brute force approach will arrive at the same conclusion with less memory usage.

If my job depends upon it, brute force it is. If exploration is at hand, then break it down. Step by step, and let the code explain it to the reader.


#2

If you can motivate why that would be true without knowing the solution first, sure.

You've got so very many single-letter variable names, reading any part of the code requires full understanding of the whole thing, which makes it more difficult to understand any one part and to piece it together. I can't write code that way, I get lost.

Brute force is not super great here if it can be avoided. The input data is small, but already takes a noticeable amount of time to execute, O(n2)

#!/usr/bin/env python3

"""
Brute-force solution to euler 9
"""

from itertools import combinations


SQUARES = [n**2 for n in range(1000)]


def is_solution(ab):
    a, b = ab
    return a + b < 1000 and SQUARES[1000 - a - b] == SQUARES[a] + SQUARES[b]


def present(ab):
    a, b = ab
    return a * b * (1000 - a - b)


for solution in filter(is_solution, combinations(range(1, 1000), 2)):
    print(present(solution))

I haven't tested if precomputing the squares is actually faster, probably insignificant even though there are only 1000 of them and they are looked up around 1.5m times

There are a lot of properties of pythagorean triples that can be used for pruning, this solution is really naive.

Euclid's formula can probably be used for a much faster solution. Don't know what time complexity that would be or how to best choose input to the formula.


#3

My reasoning is that there is a finite number of small pythagorean triplets, where at some point we begin to encounter triplets that trace back to one of the small, or primitive ones. It is a hypothesis, which needs bearing out on a larger scale. Given the limited range I felt it was worth a try.

Single letter variables are one of my worst habits. Were I to be working in the field that habit would have been hammered out by now. In the case above, it relates more to algebraic symbols, in a literal sense. A, B, C, X.

a and b are always constants, taken from the respective tuple in the knowns list. c in the solution and report is a computed value. x is as in algebra, the independent variable, a multiplier.

I am impressed, as always, by your brute force solution @ionatan.

One question:. Do we need 1000 squares? Would 500 be enough to satisfy the algo?


#4

You only need to go as far as to the sixth primitive triple to show that you haven't checked all triples in that range. If you generate all primitive triples in that range then you're in business.

If you just copy them into your code and multiply them, then your code isn't doing much. That code would be doing about as much as mine does. Which would you consider the biggest cheat?

Yes. There is much pruning that can be done in the range I'm checking. When is the right time to stop pruning? There might be a lot of properties of those triples that can be used for pruning.


#5

Yes, I see that by not generating triplets it is disingenuous. This time I got lucky. Have spent some time in the Wikipedia page on triplets and settling on which way to go. It'll be a couple days of fumbling before I get a handle on this. Leaning to the Dickson's method approach since it has one variable, factoring r**2 / 2 to come up with candidate pairs.

A test for coprimes is the main requirement, to eliminate from the candidate list. After that it's just a matter of adding 'em up. Eventually the solution will be reached.


#6

A little pruning, gets it down to 80k iterations down from 500k (both search all triples, not stopping at the solution)
Moved the code into a function because functions have faster variables than the global scope. (global is a dict, function is more like how a compiled language would use variables) .. The previous one would have benefited as well since it has a loop that does 500k iterations in global scope.

#!/usr/bin/env python3

"""
Brute-force solution to euler 9
"""


def main():
    SQUARES = [n**2 for n in range(1000)]

    for a in range(1, 1000):
        for b in range(a + 1, 1000):
            c = 1000 - a - b
            comparison = SQUARES[c] - (SQUARES[a] + SQUARES[b])
            # c is too great, keep increasing b
            if comparison > 0:
                continue
            # c is too small and will only get smaller by increasing b.
            # prune by breaking loop
            if comparison < 0:
                break

            # c**2 == b**2 + a**2, solution found.
            print(a * b * c)


main()

I like the other one better.


#7

I still think we only need the first 500 squares, but haven't got to that area of testing. Still bent on the valid pair solution.

Came up with this as a coprimeness test...

def coprimes(a,b):
    p = prime_factors(a)
    q = prime_factors(b)
    r = [x == y for x in p for y in q]
    #print(not r[0])
    return not r[0]

Still needs testing but seems to give correct results.

It uses a piece of code I dredged off an old drive that I must have gleaned years ago...

def prime_factors(n):
    if n <= 1: return []
    prime = next((x for x in range(2, int(ceil(sqrt(n))+1)) if n%x == 0), n)
    return [prime] + prime_factors(n//prime)

I understood this code when I saved it, and still do. I just don't remember where it came from. The ole brain is not what it used to be.

Edit. Of course right out of the gate it fails on 1,1. Giminy crickets!

Update:

Did I fix it?

def coprimes(a,b):
    if a == 1 or b == 1: return True