Archive for the ‘ Project Euler ’ Category

Project Euler Problem 8

Problem 8

“Find the greatest product of five consecutive digits in the 1000-digit number.

73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450″

The first step here is to find a representation for that fairly humungous number. Obviously it’s not going to fit into a paltry 32-bit int…but then we don’t need it to. The problem description requires us to think in terms of smaller (5-digit) numbers, not one giant 1000-digit number.

So, it is sufficient for us to consider the number as an enumerable stream of single digits, which we can conveniently represent as IEnumerable. I could use a macro to convert the number into a collection initialiser, but it’s much easier to treat the string as an IEnumerable<char> and let LINQ do the heavy lifting.

var nums = Enumerable.AsEnumerable(
        "73167176531330624919225119674426574742355349194934" +
        "96983520312774506326239578318016984801869478851843" +
 
        // ..... etc etc ......
 
        "05886116467109405077541002256983155200055935729725" +
        "71636269561882670428252483600823257530420752963450"
        ).Select(x => Convert.ToInt32(x.ToString()));

This gives us an IEnumerable<int> containing every digit in the 1000-digit number. Now, the ‘obvious’ way to solve the problem is to iterate through the collection, and at each index multiply the value against the next four indexes. A simple loop should deal with it:

private static int SimpleSolver(int[] ints)
{
    int max = 0;
    for (int i = 0; i < ints.Length - 4; i++)
    {
        int tmp = ints[i] * ints[i + 1] * ints[i + 2]
            * ints[i + 3] * ints[i + 4];
        max = Math.Max(max, tmp);
    }
 
    return max;
}

As ever, though, that’s pretty ugly – the loop condition and product calculation is tied to the sequence size of 5, and messing with an index variable is tedious.

An alternative approach is to take advantage of LINQ’s Skip and Take methods to split the problem domain into overlapping ‘slices’. Similar to the for loop above, the core of the approach is to iterate through the digits, and at each digit grab a number of subsequent digits and calculate the product.

Lets look at the 5-digit slices available from the first 10 digits:

  7   3   1   6   7   1   7   6   5   3
|       73167       |
    |       31671       |
        |       16717       |
            |       67176       |
                |       71765       |
                    |       17653       |

We can use Skip to progressively move the starting index forward, and Take to grab the 5 digits we need. So, starting with i=0, each successive slice can be sliced from the whole with:

var slice = ints.Skip(i++).Take(5);

To calculate the product of the digits in the slice, we can use the Aggregate operation:

slice.Aggregate(1, (curr, next) => curr*next);

We’ve met Aggregate before – it’s basically a fold, which collapses a sequence to a single item by repeatedly applying an operation to an accumulating result.

This can all be wrapped up as an iterator block, like so:

private static IEnumerable<int> EnumerateSlices(
        IEnumerable<int> ints, int sliceSize)
{
    int i = 0;
    while (true)
    {
        var slice = ints.Skip(i++).Take(sliceSize);
 
        if (slice.Count() < sliceSize)
            yield break; // end
 
        yield return slice.Aggregate(1,
                (curr, next) => curr*next);
    }
}

Note the termination condition – when we have enumerated every slice, our next slice will contain only 4 elements (3, 4, 5, and 0 from the end of the sequence) – that’s our cue to exit the loop.

Also note that this approach makes the algorithm trivial to parameterize – it will work just as well with slice sizes other than 5.

This iterator will produce an IEnumerable containing the products of all slices, so the final step is to select the largest:

var result = EnumerateSlices(nums, 5).Max();
  • Share/Bookmark

Project Euler Problem 7

Problem 7

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10001st prime number?

Ah, what a nice, straightforward, unambiguous spec! If only business software specifications were so precise.

Way back in problem 3, I took a bit of a wander off-topic and built a prime generator in .Net using the Sieve of Eratosthenes. Armed with this, problem 7 should be easy, right? The sieve implementation generates an IEnumerable, which is non-indexable (i.e. I can’t just say Primes()[10001]), but I can take the first 10,001 and then ask for the last element, which will be the answer to the problem.

There’s a problem with this, however. The sieve requires an upper bound during initialisation. This means it’s great for solving problems like “generate all the primes less than 10,001″, but not so great at answering questions like “what is the 10,001st prime number?”, since it requires foreknowledge of the upper bound.

To illustrate the problem, I’ll take a wild guess at the upper bound. I’m going to guess that the 10,001st prime number is less than 99,999. What happens?

var sieve = new SieveOfEratosthenes(99999);
var result = sieve.Primes().Take(10001).Last();

This generates an answer of 99,991. If I enter this into the Project Euler website, however, it tells me the answer is wrong. Gah! What went wrong? A simple test reveals the problem:

var sieve = new SieveOfEratosthenes(99999);
var primes = sieve.Primes().Take(10001);
var count = primes.Count();

There’s only 9,592 primes generated! As the docs for Take() state (emphasis mine):

Take<TSource> enumerates source and yields elements until count elements have been yielded or source contains no more elements.

Damn. So, looks like my 99,999 guess was too small – with that as an upper bound, the sieve only finds 9,592 primes, and I need the 10,001st. OK, I’ll bump it up by an order of magnitude:

var sieve = new SieveOfEratosthenes(999999);
var result = sieve.Primes().Take(10001).Last();

This gives me the correct answer. Not exactly a wonderful solution though; the idea of having to guess the upper bound is pretty horrendous, and if this was real code it wouldn’t be particularly maintainable – what if the requirements changed and we had to find the nth prime, which happened to be >99,999? We’d have to guess again. Ugh.

Worse, the sieve algorithm precomputes all the primes up to the specified upper bound, meaning that in the above approach I’ve asked the sieve to generate primes up to 999,999 (all 78,498 of them!) despite only needing 10,001. Not very efficient.

Fortunately, the upper bound can be calculated separately. Where n>8601, as in this case, we can use the following equation:

p(n) < n (loge n + loge loge n - 0.9427)

where p(n) is the nth prime number.

Alternatively, for flexibility in handling n<8601, we can use the less accurate

p(n) < n loge loge n

which works for n>5. We can easily precompute the answers for n<=5, or simply calculate on demand.

The formula can be implemented on the sieve class, with a factory method to help when we want to use it:

public static SieveOfEratosthenes
    CreateSieveWithAtLeastNPrimes(int n)
{
    return new SieveOfEratosthenes((long)
            Math.Ceiling(UpperBoundEstimate(n)));
}
 
private static double UpperBoundEstimate(int n)
{
    return n * Ln(n) + n * (Ln(Ln(n)));
}
 
private static double Ln(double n)
{
    return Math.Log(n, Math.E);
}

This leaves us with an overall solution like so:

var sieve = SieveOfEratosthenes.CreateSieveWithAtLeastNPrimes(10001);
var result = sieve.Primes().Take(10001).Last();

This generates a total 10,018 primes, cutting the wasted effort from almost 70,000 superfluous primes to just 17, and takes around 20ms to execute on my machine. Plenty fast enough, I think.

  • Share/Bookmark

Project Euler Problem 6

Onwards to…

Problem 6

The sum of the squares of the first ten natural numbers is,

12 + 22 + … + 102 = 385

The square of the sum of the first ten natural numbers is,

(1 + 2 + … + 10)2 = 552 = 3025

Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

Bit of a disappointment, problem 6; it’s too easy. It’s rated as the third-easiest, i.e. easier than problems 3, 4, and 5 which I’ve already covered. In fact, for my money it’s easier than problem 2 as well. Ah well, the difficulty ramps up soon enough, trust me.  Here’s the very simple python solution:

sum_sq = sum([ x*x for x in xrange(1, 101)])
sq_sum = sum(xrange(1, 101)) ** 2
 
print sq_sum - sum_sq

As you can see, it’s pretty intuitive. You sum the squares, square the sum, and calculate the difference. The answer is basically in the description, you just have to scale up a little.

There’s not much else to say about this one. Even if I abandon the functional approach and write a straightforward imperative solution it’s still very straightforward. In (deliberately non-idiomatic, so don’t whine at me) ruby:

sum_of_squares = 0
sum = 0
 
1.upto 100 do |x|
    sum_of_squares += x * x
    sum += x
end
 
p (sum * sum) - sum_of_squares
  • Share/Bookmark

Project Euler Problem 5

On to the next Project Euler problem (after a bit of a hiatus)…

Problem 5

2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.

What is the smallest number that is evenly divisible by all of the numbers from 1 to 20?

In common with many of the other Euler problems, there’s a brute-force way to solve this, and a clean algorithmic way. And in common with my other Euler posts so far, I’ll start with the brute-force way ;-)

This problem can be tackled head-on with the following approach: Start from n=1 and increment in a loop. Test each value of n by attempting to divide it by all numbers m from 1 to 20. The first number to pass the test (i.e. n mod m is 0 for all values of m) is the answer.

private static long BruteForceSolver()
{
    long result;
    for (result = 1; !Check(result); ++result)
        ;
    return result;
}

private static bool Check(long result)
{
    for (int i = 1; i <= 20; ++i)
    {
        if (result % i != 0)
            return false;
    }

    return true;
}

This works, but it takes >12 seconds to execute on my PC, so it’s not what you’d call efficient (though it is well within the Euler execution time guidelines).

Some speed gains can be achieved by exploiting the information provided in the question itself. We are told that 2520 is the lowest number evenly divisible by all numbers from 1 to 10. Since the problem space (1 to 20) includes all these numbers, the answer must also be evenly divisible by 2520. This allows much bigger increments each loop – rather than incrementing by 1, why not increment by 2520? And since the answer must be greater than or equal to 2520, why not start the loop there instead of 1? Finally, since we already know that 1 to 10 divide evenly into 2520, each inner loop only needs to check numbers 11 to 20.

That should speed things up a bit:

private long BruteForceSolver()
{
    long result;
    for (result = 2520; !Check(result); result += 2520)
        ;
    return result;
}

private bool Check(long result)
{
    for (int i = 11; i <= 20; ++i)
    {
        if (result % i != 0)
            return false;
    }

    return true;
}

And indeed, on my machine this is now down to 150ms or so. It’s still not a very nice way to tackle the problem, though.

Thinking about it from a different angle yields an altogether smarter approach. Imagine we are looking for the lowest number evenly divisible by the numbers 1 to 2.

[1, 2]

Well that’s easy; since there are only two numbers we just find the lowest common multiple (LCM), which in this case is 2 (since 2 % 2 == 0, and 2 % 1 == 0). If we call this sequence s1, we can say that LCM(s1) = 2.

OK, now imagine we are solving the same problem for s2, which contains the numbers 1 to 3.

[1, 2, 3]

You’ll notice that s2 contains s1 in its entirety. LCM(s2) must therefore be a multiple of LCM(s1), so we can rewrite s2 as [LCM(s1), 3], or [2, 3] (since we know LCM(s1) = 2). Now we are down to two numbers again, so we can calculate the LCM of 2 and 3, which is 6, so LCM(s2) = 6.

OK, now we solve the problem for the first 4 numbers (s3).

[1, 2, 3, 4]

This sequence contains s2, therefore LCM(s3) is a multiple of LCM(s2). We can rewrite s3 as [LCM(s2), 4], or [6, 4]. Thus, LCM(s3) = 12.

This can be repeated as many times as necessary. Generally, we have sn = [LCM(sn-1), n+1] where n > 0.

This looks recursive, but a better way to think of it is as an excellent example of a fold. A fold is one of the fundamental tools of functional programming. In fact, it is perhaps the most fundamental, since map, filter etc can be implemented as right folds[1].

I won’t inflict my pitiful Photoshop skills on anyone by trying to graphically represent a fold – try looking at this Wikipedia article if you want to try and visualise it.

Broadly, the behaviour of a fold is to apply a combining function to elements in a list (or other data structure) and accumulate the results. That’s exactly what we want here – our combining function is LCM, and our accumulating value is the LCM of the whole list. Effectively, for list s3 above, we have

LCM(s3) = LCM(LCM(LCM(1,2),3),4) = 12

Note how the result of the innermost LCM (applied to values 1 and 2) becomes a parameter to the next LCM, which in turn becomes a parameter to the outermost LCM which returns the result we want.

By using a fold, we can generalise. In Haskell, the whole problem is a one-liner:

foldl lcm 1 [1..20]

The 1 passed in as a parameter represents the terminating value to use when the end of the list is reached. It is common for this value to be the first element of the list, so Haskell provides a convenience function that removes the need to specify it as a parameter:

foldl1 lcm [1..20]

Not all languages and platforms provide an LCM function right out of the box, so to take this neat Haskell solution and port it to .Net, the LCM function needs to be implemented. This is easily done in terms of the greatest common divisor (GCD) like so:

.Net doesn’t provide a GCD function either, so I’ll implement it using Euclid’s Algorithm as an extension method on long ints:

public static long GCD(this long a, long b)
{
    while (b != 0)
    {
        long tmp = b;
        b = a % b;
        a = tmp;
    }

    return a;
}

With GCD defined, LCM can be implemented as above:

public static long LCM(this long a, long b)
{
    return (a * b) / a.GCD(b);
}

With this in place, it’s a simple matter to use .Net’s equivalent of fold – a method on IEnumerable called Aggregate – to get the answer[2]:

return LongEnumerable.Range(1, 20)
    .Aggregate(1L, (curr, next) => curr.LCM(next));

And indeed, the same basic pattern can be used to solve the problem in a number of languages. In F#, given implementations of LCM and GCD as above, we have:

List.fold_left lcm 1 [1..20]

And in ruby:

require 'rational'
(1..20).inject { |c, n| c.lcm n }

Given that the right algorithm makes this problem a fairly trivial expression in all these languages, it’s pretty hard to identify which is the nicest. I think overall I’ll give the nod to Haskell, however, for not making me implement LCM and because I find ruby’s ‘inject’ a less intuitive function name than foldr (but that’s probably because I learned the technique in Haskell in the first place and am set in my ways…)


[1]For example, in F#:

let filter p lst =
  List.fold_right (fun x xs -> if p x then x::xs else xs) lst []

let map f lst =
  List.fold_right (fun x xs -> f x :: xs) lst []

[2]Note that in this code LongEnumerable is just a very simple partial reimplementation of Enumerable, using longs instead of ints

  • Share/Bookmark

Project Euler Problem 4: Extra

Couple of things to add to yesterday’s post about problem 4. As is so often the case in life, no sooner had I finished the article than I realised there was an obvious additional step I could make, which I’d somehow failed to spot.

Regarding the C# solution, an easy win having implemented the Reverse extension method would be to add an IsPalindrome extension to the string class too. The implementation is straightforward:

public static bool IsPalindrome(this string s)
{
    return s == s.Reverse();
}

With this done, the where clause in the LINQ query is more readable, and we have a couple of handy reusable string extensions into the bargain.

var result = (from product in AllProducts.From(100).To(999)
              where product.ToString().IsPalindrome()
              select product).Max();

Also, Sol commented that the C code could have a direct implementation of a palindrome function, rather than messing about with strrev, since the implementations are very similar. Whilst this series isn’t really focussed on the performance benefit of this approach, it does also make the code more expressive, so I’ll include it:

int strpalindrome(char* s) {
    char *s1, *s2;

    s1 = s2 = s;
    while (*s2)
        s2++;

    while (s1 < s2) {
        if (*(s1++) != *(--s2))
            return 0;
    }

    return 1;
}

The loop now looks like this:

    for (i = 100; i < 1000; ++i) {
        for (j = i; j < 1000; ++j) {
            int sum = i * j;
            sprintf(s1, "%d", sum);
            if (strpalindrome(s1) && sum > largest) {
                largest = sum;
            }
        }
    }

If you are interested in the Project Euler problems but craving more detailed analysis, Joel Neely is working through at a similar rate to me, but focusing his efforts on Scala and studying each problem and its solution in greater depth rather than flitting from language to language. Highly recommended.

  • Share/Bookmark