Prime Number Printer - thought experiment and test cases


#1

It would be fair to discount this discourse from the submission slate. More of a thought experment using a lot of brute force tactics that are pricey.

from math import sqrt
from itertools import permutations

Prime number sieve from Basic submission...

# function to generate a prime number sieve
def gen_sieve(lim):
  # function to retrieve next prime index in sieve
  def nextPrime(n):
    n += 1
    while not s[n]:
        n += 1
        continue
    return n
  i = u = lim + 1
  j = k = 0
  s = [1] * i
  s[0] = 0
  s[1] = 0
  while k < sqrt(u):
    j = nextPrime(k)
    k = j
    while j + k < u:
      j += k
      s[j] = 0
  return s

Not sure how much memory is given back after this...

# function to extract number digits from a string
def extract_nums(test_str):
  return list(filter(lambda x: not x.isalpha(), test_str))

We end up with a list of strings, which, as it were turns out to be a better way to venture forward as it does get resolved to integers down the way.

Next we look for odd number combinations (last member in list).

def filter_odd_nums(nums):
    return list(filter(lambda x: int(x[len(x)-1]) % 2, nums))

The next step uses the most resources.

#function to generate permutations of given r value
def generate_perms(nums, r):
  return list(permutations(nums, r))

With a list of tuples that contain only lower order odd numbers, we can next combine the tuples into numeric values in a list...

# function to join tuples of number strings into an integer
def combine_nums(nums):
  return [int(''.join(x)) for x in nums]

Next to see if the number combination is found in the test string...

# function to filter number in test string
def filter_in_str(nums, test_str):
    return list(filter(lambda x: str(x) in test_str, nums))

Those that do get tested for primeness...

# filter to track primes
def filter_primes(nums):
  s = gen_sieve(max(nums))
  return list(filter(lambda x: not not s[x], nums))

The outcome should rightly be a list of primes contained in the string

Given there are any number of elements in the numbers list, things can feasibly go off the rails in this realm. Finding an effecient way to explore this is the ultimate goal. Test case to follow..


[Challenge] Prime Number Printer
#2

Test case # 1

input_str = 'abc123def45gh67ijk89lm'
num_str = extract_nums(input_str)
nP1 = filter_primes([int(x) for x in num_str])
primes = nP1
del nP1
nP2 = generate_perms(num_str, 2)
nP2 = filter_odd_nums(nP2)
nP2 = combine_nums(nP2)
nP2 = filter_in_str(nP2, input_str) if len(nP2) else nP2
nP2 = filter_primes(nP2)
primes += nP2
del nP2
nP3 = generate_perms(num_str, 3)
nP3 = filter_odd_nums(nP3)
nP3 = combine_nums(nP3)
nP3 = filter_in_str(nP3, input_str)
nP3 = filter_primes(nP3) if len(nP3) else nP3
primes += nP3
del nP3

print (sorted(list(set(primes))))

[2, 3, 5, 7, 23, 67, 89]

Since the test string has no number longer than three digits, I stopped at r = 3.


#3

Looking back over the challenge stipulations, this one rings loudly...

Example: primeNumberPrinter("abc2134kd31"), will output an array [2,13,3,3,31] (in the order they appear in the string).

My brute force test case doesn't even consider that. One more thing to sort out.

I'm still scratching my head over how to break apart the string so I can determine r-max. It's not a difficult problem, but one to solve and refine. More to the needs of a less brute force approach, one thinks.

Haven't studied any of the submissions so as to keep a clear head, but surely to handle a very large integer this brute force approach would be a farce, even with a restraint, like Python 2's maxint,

9223372036854775807

I know the sieve can handle 1E6 - 1, but it must suck up a huge amount of RAM in magnitudes greater than 6. Plus the sieve is destroyed and must be rebuilt on every test value. Even a small number of large numbers would weigh down the system, I suspect.

For that reason, it is difficult to envision how far down the road this approach is feasible. I'm puzzling over the limit to set on n. With just n = 9 numbers in the digit list, the middle term in the polynomial (nP5) will be huge, in the order of 15 000 or more. That's a lot of tuples to have to chew through with the filters, and whole lot of memory. nP4 and nP6 are not much more than an hundred so they are not all that daunting.

335_221_286_400

That's 19P10.

Edit:

Here's what I came up with for extracting the numbers intact, and in order...

def number_extractor(test_str):
  num_list = []
  ns = test_str
  while len(ns):
    if ns[0].isalpha():
      ns = ns[1:]
    else:
      num = ''
      while not ns[0].isalpha():
        num += ns[0]
        ns = ns[1:]
      num_list.append(num)
  return num_list

print (number_extractor('abc123def45gh67ijk89lm'))

['123', '45', '67', '89']

#4

I'd say if you wanted to limit yourself to smaller numbers trial division is the most efficient approach. But if you wanted big numbers, find the biggest number there, run the sieve up to that number but save the contents of the sieve in a list such as "primes" and then check if each number is in primes or not, that way the sieve is only run once.


#5

To solve the ordering problem to match the stipulated results expected (permutations gave too many results) I switched to combinations and that seemed to solve part of the problem. Using the above number extractor (with a couple of tweaks) rather than the earlier digit extractor shortened the field considerably. I haven't tested my submission extensively, but it shows promise. Have do idea yet what the limits will be.

Once the number list is extracted, each one is subjected to the combination generator and those results filtered down to. Since the filtering is more deterministic than brute force, odd numbers are not filtered.

Edit:

Had time to do a little more testing and discovered a major flaw in the extractor... isalpha() lets way too much junk past. It occurred to me earlier but I must have had blinders on. Fixed it with isdigit().

# function to extract whole numbers in order
def number_extractor(test_str):
  num_list = []
  ns = test_str
  while len(ns):
    if not ns[0].isdigit():
      ns = ns[1:] if len(ns) else ns
    else:
      num = ''
      while ns and ns[0].isdigit():
        num += ns[0]
        ns = ns[1:]
      num_list.append(num)

  return num_list