Euler #7: 10001st Prime

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10001st prime number?

There are many prime-finding sieves. The ancient prototype is the Sieve of Eratosthenes, discovered circa 250 BC. There are other more modern and faster sieves such as the Sieve of Sundaram or the Sieve of Atkin. Since it’s an oldie but goodie, let’s go with a modified version of Eratosthenes.

Normally, a list is generated then multiples of numbers are crossed off leaving behind the primes. Here, let’s try something different. The idea is to iterate through numbers without crossing off future multiples. As a new number is encountered, we check our list of found primes to see if any of them divides the new number. If not, the new number is prime and is added to a second list of primes.

def find_nth_prime(n):
    """Modified version of the Sieve of Eratosthenes.  Instead of crossing off
    future multiples (except for multiples of 2), use found primes to check 
    for divisibility.  Return the nth prime once it is found.
    primes = [2] # So we can skip by 2s.
    test_number = 3
    while len(primes) != n:
        for prime in primes:
            has_loop_fully_completed = False
            if test_number % prime == 0:
                # We have found a composite number
            has_loop_fully_completed = True
        if has_loop_fully_completed:
            # Tried all found primes and found no clean division.
            # Therefore test_number is prime.
        test_number += 2
    return primes[-1]

Running find_nth_prime(10001) returns 104743, which is indeed the 10001st prime.

Now, how does this algorithm scale up? I think it’d be fun to graphically determine the algorithm’s time complexity. The idea is to plot a graph of average execution time versus n.

Average execution time of find_nth_prime(n).  Looks polynomial.

A little jagged, but the pattern is clear. This algorithm has terrible time complexity! That curve suggests a time complexity of \mathcal{O}(n^2). With large n, the algorithm will run at a snail’s pace. That’s a far cry from the optimal implementation’s complexity of \mathcal{O}(n\log{\log{n}}).

There’s a flaw with this algorithm. Prime numbers larger than \frac{n}{2} will never be able to divide n cleanly. However, accounting for this will only end up halving the complexity. It’ll still be polynomial of degree two.

My slow algorithm will suffice for now. But as I encounter more difficult prime-finding problems, I’ll be revisiting sieve algorithms and making improvements on them.