<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/rss2full.xsl" type="text/xsl" media="screen"?><?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/itemcontent.css" type="text/css" media="screen"?><!-- generator="wordpress/2.3.1" --><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Basildon Coder</title>
	<link>http://basildoncoder.com/blog</link>
	<description>Incoherent and disjointed opinionated drivel from somewhere near London</description>
	<pubDate>Sun, 26 Oct 2008 21:59:27 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.1</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/BasildonCoder" type="application/rss+xml" /><item>
		<title>Project Euler Problem 7</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/432954693/</link>
		<comments>http://basildoncoder.com/blog/2008/10/26/project-euler-problem-7/#comments</comments>
		<pubDate>Sun, 26 Oct 2008 21:58:14 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[Coding]]></category>

		<category><![CDATA[Mathematics]]></category>

		<category><![CDATA[Project Euler]]></category>

		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/10/26/project-euler-problem-7/</guid>
		<description><![CDATA[Problem 7
By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.
What is the 10001st prime number?
Ah, what a nice, straightforward, unambiguous spec! If only business software specifications were so precise.
Way back in problem 3, I took a bit of a wander off-topic [...]]]></description>
			<content:encoded><![CDATA[<p><em><strong><a href="http://projecteuler.net/index.php?section=problems&amp;id=7">Problem 7</a></strong></em></p>
<blockquote><p>By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6<sup>th</sup> prime is 13.</p>
<p>What is the 10001<sup>st</sup> prime number?</p></blockquote>
<p>Ah, what a nice, straightforward, unambiguous spec! If only business software specifications were so precise.</p>
<p><a href="http://basildoncoder.com/blog/2008/04/07/project-euler-problem-3/">Way back in problem 3</a>, I took a bit of a wander off-topic and built a prime generator in .Net using the <a href="http://en.wikipedia.org/wiki/Sieve_of_eratosthenes">Sieve of Eratosthenes</a>. Armed with this, problem 7 should be easy, right? The sieve implementation generates an IEnumerable&lt;long&gt;, which is non-indexable (i.e. I can&#8217;t just say Primes()[10001]), but I can take the first 10,001 and then ask for the last element, which will be the answer to the problem.</p>
<p>There&#8217;s a problem with this, however. The sieve requires an upper bound during initialisation. This means it&#8217;s great for solving problems like &#8220;generate all the primes less than 10,001&#8221;, but not so great at answering questions like &#8220;what is the 10,001st prime number?&#8221;, since it requires foreknowledge of the upper bound.</p>
<p>To illustrate the problem, I&#8217;ll take a wild guess at the upper bound. I&#8217;m going to guess that the 10,001st prime number is less than 99,999. What happens?</p>
<pre>
<span class="Type">var</span> sieve = <span class="Statement">new</span> SieveOfEratosthenes(<span class="Constant">99999</span>);
<span class="Type">var</span> result = sieve.Primes().Take(<span class="Constant">10001</span>).Last();</pre>
<p>This generates an answer of 99,991. If I enter this into the Project Euler website, however, it tells me the answer is wrong. Gah! What went wrong? A simple test reveals the problem:</p>
<pre><span class="Type">var</span> sieve = <span class="Statement">new</span> SieveOfEratosthenes(<span class="Constant">99999</span>);
<span class="Type">var</span> primes = sieve.Primes().Take(<span class="Constant">10001</span>);
<span class="Type">var</span> count = primes.Count();</pre>
<p>There are only 9,592 primes generated! As the <a href="http://msdn.microsoft.com/en-us/library/bb503062.aspx">docs for Take()</a> state (emphasis mine):</p>
<blockquote><p>Take&lt;TSource&gt; enumerates source and yields elements until count elements have been yielded <em>or source contains no more elements</em>.</p></blockquote>
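<p>For what it&#8217;s worth, this isn&#8217;t a .Net quirk. Here&#8217;s a rough Python 3 sketch of the same trap. The primes_below function is a throwaway stand-in I&#8217;ve made up for illustration (it is not my sieve class), and itertools.islice plays the part of Take(); it also stops quietly when the source runs dry:</p>

```python
from itertools import islice


def primes_below(limit):
    """Yield the primes below limit, using a simple Sieve of Eratosthenes."""
    is_prime = [True] * limit
    is_prime[0:2] = [False, False]
    for i in range(2, limit):
        if is_prime[i]:
            yield i
            # Cross off every multiple of i, starting at i squared.
            for multiple in range(i * i, limit, i):
                is_prime[multiple] = False


# Ask for 10,001 primes from a source that cannot supply that many.
taken = list(islice(primes_below(99999), 10001))
print(len(taken))  # 9592: no error, just fewer items than requested
print(taken[-1])   # 99991: the last prime found, not the 10,001st
```

<p>No exception, no warning, just a shorter sequence than I asked for, so calling Last() on it happily returns the wrong prime.</p>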
<p>Damn. So, looks like my 99,999 guess was too small - with that as an upper bound, the sieve only finds 9,592 primes, and I need the 10,001st. OK, I&#8217;ll bump it up by an order of magnitude:</p>
<pre><span class="Type">var</span> sieve = <span class="Statement">new</span> SieveOfEratosthenes(<span class="Constant">999999</span>);
<span class="Type">var</span> result = sieve.Primes().Take(<span class="Constant">10001</span>).Last();</pre>
<p>This gives me the correct answer. Not exactly a wonderful solution though; the idea of having to guess the upper bound is pretty horrendous, and if this were real code it wouldn&#8217;t be particularly maintainable - what if the requirements changed and we had to find the <em>n</em>th prime, which happened to be &gt;999,999? We&#8217;d have to guess again. Ugh.</p>
<p>Worse, the sieve algorithm precomputes all the primes up to the specified upper bound, meaning that in the above approach I&#8217;ve asked the sieve to generate primes up to 999,999 (all 78,498 of them!) despite only needing 10,001. Not very efficient.</p>
<p>Fortunately, the upper bound can be calculated separately. <a href="http://primes.utm.edu/howmany.shtml">Where <em>n</em>&gt;8601, as in this case, we can use the following equation</a>:</p>
<pre>p(<em>n</em>) <u>&lt;</u> <em>n</em> (log<sub>e</sub> <em>n</em> + log<sub>e</sub> log<sub>e</sub> <em>n</em> - 0.9427)</pre>
<p>where p(<em>n</em>) is the <em>n</em>th prime number.</p>
<p>Alternatively, for flexibility in handling <em>n</em>&lt;8601, we can use the looser bound</p>
<pre>p(<em>n</em>) &lt; <em>n</em> (log<sub>e</sub> <em>n</em> + log<sub>e</sub> log<sub>e</sub> <em>n</em>)</pre>
<p><a href="http://en.wikipedia.org/wiki/Prime_number_theorem#Approximations_for_the_nth_prime_number">which works for <em>n</em>&gt;5</a>. We can easily precompute the answers for <em>n</em>&lt;=5, or simply calculate on demand.</p>
<p>The formula can be implemented on the sieve class, with a factory method to help when we want to use it:</p>
<pre><span class="Type">public</span> <span class="Type">static</span> SieveOfEratosthenes
    CreateSieveWithAtLeastNPrimes(<span class="Type">int</span> n)
{
    <span class="Statement">return</span> <span class="Statement">new</span> SieveOfEratosthenes((<span class="Type">long</span>)
            Math.Ceiling(UpperBoundEstimate(n)));
}

<span class="Type">private</span> <span class="Type">static</span> <span class="Type">double</span> UpperBoundEstimate(<span class="Type">int</span> n)
{
    <span class="Statement">return</span> n * Ln(n) + n * (Ln(Ln(n)));
}

<span class="Type">private</span> <span class="Type">static</span> <span class="Type">double</span> Ln(<span class="Type">double</span> n)
{
    <span class="Statement">return</span> Math.Log(n, Math.E);
}</pre>
<p>This leaves us with an overall solution like so:</p>
<pre><span class="Type">var</span> sieve = SieveOfEratosthenes
    .CreateSieveWithAtLeastNPrimes(<span class="Constant">10001</span>);
<span class="Type">var</span> result = sieve.Primes().Take(<span class="Constant">10001</span>).Last();</pre>
<p>This generates a total of 10,018 primes, cutting the wasted effort from almost 70,000 superfluous primes to just 17, and takes around 20ms to execute on my machine. Plenty fast enough, I think.</p>
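<p>If .Net isn&#8217;t your thing, the whole approach boils down to something like the following Python 3 sketch. It uses the looser n (ln n + ln ln n) bound, the same one UpperBoundEstimate implements, and all the names (upper_bound_estimate, primes_up_to, nth_prime, FIRST_FIVE) are made up for illustration rather than lifted from my sieve class:</p>

```python
import math

# The bound only holds from n = 6 onwards, so the first five are precomputed.
FIRST_FIVE = {1: 2, 2: 3, 3: 5, 4: 7, 5: 11}


def upper_bound_estimate(n):
    """Upper bound on the n-th prime: n * (ln n + ln ln n), rounded up."""
    return math.ceil(n * (math.log(n) + math.log(math.log(n))))


def primes_up_to(limit):
    """Sieve of Eratosthenes: return every prime no larger than limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            # Zero out the multiples of i, starting at i squared.
            count = len(range(i * i, limit + 1, i))
            is_prime[i * i :: i] = bytearray(count)
    return [i for i in range(limit + 1) if is_prime[i]]


def nth_prime(n):
    """Return the n-th prime (1-based) without guessing the sieve size."""
    if n in FIRST_FIVE:
        return FIRST_FIVE[n]
    return primes_up_to(upper_bound_estimate(n))[n - 1]


print(nth_prime(6))  # 13, matching the example in the problem statement
```

<p>Calling nth_prime(10001) then does the job with no guesswork about the sieve size.</p>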

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/10/26/project-euler-problem-7/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/10/26/project-euler-problem-7/</feedburner:origLink></item>
		<item>
		<title>Look Before You Look Before You Leap</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/430776034/</link>
		<comments>http://basildoncoder.com/blog/2008/10/24/look-before-you-look-before-you-leap/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 14:32:36 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[.Net]]></category>

		<category><![CDATA[Coding]]></category>

		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/10/24/look-before-you-look-before-you-leap/</guid>
		<description><![CDATA[Generally, I try to avoid turning this blog into some sort of snark-fest about other programmers or blogs. I&#8217;ve disagreed with Jeff Atwood once or twice though, and so by posting this I&#8217;m probably straying a little close to the edge&#8230;but what the hell.
A couple of days ago Coding Horror carried a fluff piece about [...]]]></description>
			<content:encoded><![CDATA[<p>Generally, I try to avoid turning this blog into some sort of <a href="http://www.google.co.uk/search?q=define%3Asnark">snark</a>-fest about other programmers or blogs. I&#8217;ve disagreed with Jeff Atwood <a href="http://basildoncoder.com/blog/2008/02/22/code-can-be-beautiful/">once</a> or <a href="http://basildoncoder.com/blog/2008/01/29/freedom-zero-the-all-or-nothing-fallacy/">twice</a> though, and so by posting this I&#8217;m probably straying a little close to the edge&#8230;but what the hell.</p>
<p>A couple of days ago Coding Horror carried a <a href="http://www.codinghorror.com/blog/archives/001177.html">fluff piece</a> about how all developers should be marketers too. Predictably, the article soon got posted to <a href="http://www.reddit.com/r/programming/">proggit</a> where it was <a href="http://www.reddit.com/r/programming/comments/78vq3/jeff_atwood_finally_jumps_the_shark/">ripped on by reddit&#8217;s resident Jeff-haters</a>, and even more predictably the comments were a mix of interesting insight and barely-concealed hate.</p>
<p>Apparently some of them got up Jeff&#8217;s nose a bit, and today he <a href="http://www.codinghorror.com/blog/archives/001178.html">responded</a>. The core of his rebuttal seems to be that you shouldn&#8217;t trust what you read on blogs, and should verify everything yourself. True enough, I guess, if perhaps a bit impractical given the sheer amount of information out there.</p>
<p>Then, however, Jeff goes on to give an example by referencing a <a href="http://blog.madskristensen.dk/post/Compression-and-performance-GZip-vs-Deflate.aspx">compression benchmark</a> he&#8217;d read on a blog and providing counter-analysis to show that the benchmark was wrong in claiming Deflate is faster than gzip. In doing so, much knowledge was gained.</p>
<p>Or so we are told.</p>
<p>The comment thread quickly becomes a goldmine of humour. Bugs in Jeff&#8217;s benchmarking code (not resetting the stopwatch) meant that the durations were cumulative, not independent, with inevitable distortion of the results. Another commenter pointed out that <a href="http://en.wikipedia.org/wiki/Gzip">gzip</a> cannot possibly be faster than Deflate, since the gzip algorithm IS the Deflate algorithm plus some additional computation.</p>
<blockquote><p>“gzip” is often also used to refer to the gzip file format, which is:</p>
<ul>
<li>a 10-byte header, containing a magic number, a version number and a timestamp</li>
<li>optional extra headers, such as the original file name,</li>
<li>a body, containing a DEFLATE-compressed payload</li>
<li>an 8-byte footer, containing a CRC-32 checksum and the length of the original uncompressed data</li>
</ul>
<p align="right"><a href="http://en.wikipedia.org/wiki/Gzip">http://en.wikipedia.org/wiki/Gzip</a></p>
</blockquote>
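<p>You don&#8217;t have to take Wikipedia&#8217;s word for it, either. Here&#8217;s a small Python 3 sketch that pulls a gzip blob apart using only the standard library. It assumes gzip.compress writes none of the optional headers (so the header is exactly 10 bytes), and the sample data is just something I made up:</p>

```python
import gzip
import zlib

data = b"look before you look before you leap " * 100

blob = gzip.compress(data)
header, body, footer = blob[:10], blob[10:-8], blob[-8:]

# The header starts with the gzip magic number.
print(header[:2] == b"\x1f\x8b")

# The body is a raw DEFLATE stream; wbits=-15 tells zlib to expect no wrapper.
print(zlib.decompress(body, wbits=-15) == data)

# The footer holds the CRC-32 of the original data, then its length,
# both stored little-endian.
print(int.from_bytes(footer[:4], "little") == zlib.crc32(data))
print(int.from_bytes(footer[4:], "little") == len(data))
```

<p>All four checks print True, which is the point: the gzip format is a thin wrapper around a Deflate payload, plus the cost of computing a checksum.</p>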
<p>With the benchmarking code fixed, we see that Deflate is indeed slightly faster than gzip.</p>
<p>All of which leads to repeated quotations from Jeff about the community being smarter than him, and some drastic toning down of language in post-publication edits to the article. I read a cached version of the RSS feed, which is markedly different to the article currently live on codinghorror.com - &#8220;on my box, GZip is twice as fast as Deflate&#8221; becomes &#8220;on my box, GZip is just as fast as Deflate&#8221;, &#8220;Deflate is way slower. It&#8217;s not even close&#8221; becomes &#8220;Deflate is nowhere near 40% faster&#8221;, etc.</p>
<p><img src="/images/codinghorror01.png" align="middle" /></p>
<p>Anyone who&#8217;s tackled a major performance problem will likely agree that profiling is a tremendously valuable technique that should always be applied before attempting to optimise (i.e. look before you leap). I think this little episode has highlighted a couple of important things to bear in mind, however:</p>
<ol>
<li>Profiling isn&#8217;t a magic wand - if you use buggy profiling code, you are leading yourself up the garden path.</li>
<li>Profiling is less useful when you can reason (in the mathematical sense) about the code. That involves <em>understanding the algorithms you are dealing with</em>. Gzip is Deflate plus a bit more processing - so unless that extra processing has a negative duration gzip must necessarily take longer. You don&#8217;t need a profiler to work that out. Look <em>before </em>you look before you leap.</li>
</ol>
<p>Anyway, enough hatcheting from me, normal service will be resumed shortly.</p>

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/10/24/look-before-you-look-before-you-leap/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/10/24/look-before-you-look-before-you-leap/</feedburner:origLink></item>
		<item>
		<title>Project Euler Problem 6</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/364637502/</link>
		<comments>http://basildoncoder.com/blog/2008/08/14/project-euler-problem-6/#comments</comments>
		<pubDate>Thu, 14 Aug 2008 00:00:26 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[Coding]]></category>

		<category><![CDATA[Mathematics]]></category>

		<category><![CDATA[Project Euler]]></category>

		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/08/14/project-euler-problem-6/</guid>
		<description><![CDATA[Onwards to&#8230;
Problem 6
The sum of the squares of the first ten natural numbers is,
1&#178; + 2&#178; + &#8230; + 10&#178; = 385
The square of the sum of the first ten natural numbers is,
(1 + 2 + &#8230; + 10)&#178; = 55&#178; = 3025
Hence the difference between the sum of the squares of the first ten [...]]]></description>
			<content:encoded><![CDATA[<p class="problem_content">Onwards to&#8230;</p>
<p><em><strong><a href="http://projecteuler.net/index.php?section=problems&amp;id=6">Problem 6</a></strong></em></p>
<blockquote><p>The sum of the squares of the first ten natural numbers is,</p>
<p style="text-align: center">1<sup>2</sup> + 2<sup>2</sup> + &#8230; + 10<sup>2</sup> = 385</p>
<p>The square of the sum of the first ten natural numbers is,</p>
<p style="text-align: center">(1 + 2 + &#8230; + 10)<sup>2</sup> = 55<sup>2</sup> = 3025</p>
<p>Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 <img src="http://projecteuler.net/images/symbol_minus.gif" style="vertical-align: middle" border="0" width="9" height="3" /> 385 = 2640.</p>
<p>Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.</p></blockquote>
<p>Bit of a disappointment, problem 6; it&#8217;s too easy. <a href="http://projecteuler.net/index.php?section=problems&amp;sort=difficulty">It&#8217;s rated as the third-easiest</a>, i.e. easier than problems <a href="http://basildoncoder.com/blog/2008/04/07/project-euler-problem-3/">3</a>, <a href="http://basildoncoder.com/blog/2008/04/21/project-euler-problem-4/">4</a>, and <a href="http://basildoncoder.com/blog/2008/06/10/project-euler-problem-5/">5</a> which I&#8217;ve already covered. In fact, for my money it&#8217;s easier than problem <a href="http://basildoncoder.com/blog/2008/03/22/project-euler-problems-1-and-2/">2</a> as well. Ah well, the difficulty ramps up soon enough, trust me. Here&#8217;s the very simple Python solution:</p>
<pre>
sum_sq = sum([ x*x <span class="Statement">for</span> x <span class="Statement">in</span> xrange(1, 101)])
sq_sum = sum(xrange(1, 101)) ** 2

<span class="Statement">print</span> sq_sum - sum_sq</pre>
<p>As you can see, it&#8217;s pretty intuitive. You sum the squares, square the sum, and calculate the difference. The answer is basically in the description, you just have to scale up a little.</p>
<p>There&#8217;s not much else to say about this one. Even if I abandon the functional approach and write a plain imperative solution, it&#8217;s still very straightforward. In (deliberately non-idiomatic, so don&#8217;t whine at me) Ruby:</p>
<pre>
sum_of_squares = <span class="Constant">0</span>
sum = <span class="Constant">0</span>

<span class="Constant">1</span>.upto <span class="Constant">100</span> <span class="Statement">do</span> |<span class="Identifier">x</span>|
    sum_of_squares += x * x
    sum += x
<span class="Statement">end</span>

p (sum * sum) - sum_of_squares</pre>
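<p>For completeness, the loops aren&#8217;t strictly necessary at all. The sum of the first <em>n</em> natural numbers is <em>n</em>(<em>n</em>+1)/2 and the sum of their squares is <em>n</em>(<em>n</em>+1)(2<em>n</em>+1)/6, so the answer can be computed directly. A minimal Python 3 sketch (the function name is mine, purely for illustration):</p>

```python
def square_of_sum_minus_sum_of_squares(n):
    """Closed-form answer for the first n natural numbers."""
    total = n * (n + 1) // 2                      # 1 + 2 + ... + n
    sum_of_squares = n * (n + 1) * (2 * n + 1) // 6  # 1^2 + 2^2 + ... + n^2
    return total * total - sum_of_squares


print(square_of_sum_minus_sum_of_squares(10))  # 2640, as in the problem statement
```

<p>Both divisions are exact, so integer division is safe here, and the cost no longer depends on how far you scale up.</p>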

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/08/14/project-euler-problem-6/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/08/14/project-euler-problem-6/</feedburner:origLink></item>
		<item>
		<title>Magic Numbers and Other Numerical Nightmares</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/363864346/</link>
		<comments>http://basildoncoder.com/blog/2008/08/13/magic-numbers-and-other-numerical-nightmares/#comments</comments>
		<pubDate>Wed, 13 Aug 2008 12:48:18 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[Coding]]></category>

		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/08/13/magic-numbers-and-other-numerical-nightmares/</guid>
		<description><![CDATA[There are many coding practices that are near-universally regarded as &#8216;bad&#8217;, yet somehow keep cropping up over and over again. Conditional-branch abuse (including, yes, gotos). Deep nesting. Cryptic variable names. Global variables. Tight coupling. Entangled business/presentation logic. I could go on.
Why do we keep doing it? Convenience? Laziness? Tiredness? Is unreadable spaghetti code some sort [...]]]></description>
			<content:encoded><![CDATA[<p>There are many coding practices that are near-universally regarded as &#8216;bad&#8217;, yet somehow keep cropping up over and over again. Conditional-branch abuse (including, yes, gotos). Deep nesting. Cryptic variable names. Global variables. Tight coupling. Entangled business/presentation logic. I could go on.</p>
<p>Why do we keep doing it? Convenience? Laziness? Tiredness? Is unreadable spaghetti code some sort of steady-state/equilibrium for code? Is it a natural consequence of the vague and squidgy limitations of our evolved monkey-brains? Or is well-designed code abhorred like a vacuum and naturally atrophies into the sort of shambles you dread seeing on your first day at a new job, unless well-intentioned and dedicated people actively work to clean and polish it, like the <a href="http://en.wikipedia.org/wiki/Forth_Railway_Bridge#Maintenance">Forth Bridge</a>?</p>
<p>I don&#8217;t have the time or wit to give this subject the treatment it deserves, but I do want to rant a bit about another symptom of this disease, which has given me a couple of sleepless nights recently. I refer, as the title might suggest, to <a href="http://en.wikipedia.org/wiki/Magic_number_(programming)">magic numbers</a>.</p>
<p>Magic numbers are constants, <a href="http://en.wikipedia.org/wiki/Magic_number_(programming)#Unnamed_numerical_constant">unnamed</a> in the most pathological cases, that represent an assumption or a limit in a piece of code. They often cause problems because soon they are forgotten about or their meaning is lost - and then something happens to invalidate the assumption, the code breaks, and all hell breaks loose.</p>
<p>Magic numbers, to stretch the definition a bit, can also be implicit. If you are using a 32-bit integer, your magic number is 2,147,483,647 - that&#8217;s the biggest number you can store in that type. Often, movement up to and beyond these ranges can trigger long-dormant bugs that are no fun at all to diagnose.</p>
<p>Three times in recent history I&#8217;ve been bitten by bugs of this class, triggered by auto-incrementing sequences in databases. These are they:</p>
<ol>
<li>A table in a database had a 32-bit integer primary key. At the time this seemed like a perfectly reasonable default, but insanely fast growth in usage of the system meant that the ~2.1 billion upper limit of that data type was quickly reached. The DB column was switched to a 64-bit integer, but some of the client applications reading that table were not identified as at-risk. When the sequence generator left the 32-bit range, those applications overflowed. This happened at 4:30pm on a Friday afternoon. Saturdays were peak-times for system usage. You can imagine the frantic hacking that ensued.</li>
<li>A sequence generator for a particular entity was started at 20,000,000, so as not to clash with the ID sequence of a related entity (that had started at 0 a good few years earlier). The similarity between the entities and the need to not have the IDs overlap had valid business justification, but the magic number was selected arbitrarily and promptly forgotten. Inevitably, the latter sequence surpassed that number, causing bizarre and difficult-to-trace entity relationship corruption that manifested as strangely-disappearing data on the front-end.</li>
<li>A stored procedure parameter was incorrectly declared as an OracleType.Float, when it should have been an OracleType.Int32. This resulted in the value being cast from an integer to a floating-point and back again. For the first 16,777,216 integers, this happens to work OK. For the value 16,777,217, however, the loss in precision means that the number changes during casting. This simple bit of (heavily contrived) code shows the problem:
<pre>
<span class="Type">static</span> <span class="Type">void</span> Main(<span class="Type">string</span>[] args)
{
    <span class="Statement">for</span> (<span class="Type">int</span> i = <span class="Constant">0</span>; i &lt; <span class="Constant">17000000</span>; ++i)
    {
        <span class="Statement">if</span> (i != (<span class="Type">int</span>)(<span class="Type">float</span>)i)
        {
            Console.WriteLine(<span class="Constant">&#8220;{0} != {1}&#8221;</span>,
                i, (<span class="Type">int</span>)(<span class="Type">float</span>)i);
            <span class="Statement">break</span>;
        }
    }
}</pre>
<p>There are many numbers above 16,777,217 that have this characteristic; 16,777,217 just happens to be the first, for reasons you can probably divine if you think the IEEE floating-point spec is a riveting read. A couple of weeks after the launch of a fairly major internal application, this time-bomb exploded due to a sequence reaching the magic number. The bug was nothing to do with the new application, but of course fingers were pointed at it since a long-running and stable system had mysteriously choked very shortly after deployment of the new application.</li>
</ol>
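<p>The float problem in the third case is easy to poke at without .Net. This Python 3 sketch round-trips integers through a 32-bit IEEE 754 float using the struct module; the helper name through_float32 is my own invention, and I only look at a handful of values either side of 2<sup>24</sup> rather than looping from zero:</p>

```python
import struct


def through_float32(i):
    """Round-trip an integer through a 32-bit IEEE 754 float and back."""
    packed = struct.pack("f", float(i))
    return int(struct.unpack("f", packed)[0])


# A 32-bit float has a 24-bit significand, so trouble starts just past 2**24.
for i in range(2**24 - 2, 2**24 + 4):
    survived = through_float32(i) == i
    print(i, through_float32(i), "ok" if survived else "CHANGED")
```

<p>Everything up to 16,777,216 survives the trip; 16,777,217 comes back as 16,777,216, which is exactly the sort of silent change that corrupts an ID.</p>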
<p>Now, unquestionably, all these problems are avoidable, and a strong argument could be made that none of them should ever have been allowed to happen. Yet, for many reasons, they do. For example, first-mover advantage can mean the opportunity cost of taking the time to do things right first time is greater than the cost of fixing problems later.</p>
<p>Also, people make assumptions. The issue underlying the <a href="http://en.wikipedia.org/wiki/Millennium_bug">Millennium Bug</a> hysteria was caused by well-meaning developers who knew that two-digit dates wouldn&#8217;t work after 1999 (effectively another magic number), but assumed the software would have been replaced or upgraded by then. No doubt that seemed a totally reasonable assumption in the 1970s, and it had genuine technical benefits (storage space was so tight that every byte saved was a battle won).</p>
<p>Anyway, I don&#8217;t have a magic bullet solution for this, I&#8217;m just venting spleen. Unit tests can help, but won&#8217;t magically eliminate this class of bug (no matter what some of the more extreme TDD fanatics might tell you), so I suppose the lesson to take from this is the importance of being able to recognise and diagnose potential magic number issues. Pay close attention to data types, type conversions, and current values of sequences in your database. Keeping a sacrificial goat on hand might pay dividends too, in case any blood-thirsty deities with a head for binary arithmetic are watching.</p>

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/08/13/magic-numbers-and-other-numerical-nightmares/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/08/13/magic-numbers-and-other-numerical-nightmares/</feedburner:origLink></item>
		<item>
		<title>Ubuntu, Xmonad, and an Ode to Apt</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/361396326/</link>
		<comments>http://basildoncoder.com/blog/2008/08/10/ubuntu-xmonad-and-an-ode-to-apt/#comments</comments>
		<pubDate>Sun, 10 Aug 2008 22:34:42 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/08/10/ubuntu-xmonad-and-an-ode-to-apt/</guid>
		<description><![CDATA[This weekend I finally got around to updating my main Linux box from Ubuntu 7.10 to 8.04 (yes, I know, 4 months late - but moving fast!). The highly excellent xmonad has made it into the main Ubuntu repositories, so I discarded my own build and grabbed the packaged version - which promptly didn&#8217;t work [...]]]></description>
			<content:encoded><![CDATA[<p>This weekend I finally got around to updating my main Linux box from Ubuntu 7.10 to 8.04 (yes, I know, 4 months late - but moving fast!). The highly excellent <a href="http://xmonad.org/">xmonad</a> has made it into the main Ubuntu repositories, so I discarded my own build and grabbed the packaged version - which promptly didn&#8217;t work as expected on my dual-head setup. Gah.</p>
<p><a href="https://bugs.launchpad.net/debian/+source/haskell-x11/+bug/203594">A bit of googling suggested</a> that the problem lay with the upstream Debian package, which contained a build of libghc6-x11-dev that was compiled without xinerama support. This left me with a choice of either waiting for the package to get sorted out or doing the build myself again. I decided to do my own build rather than live without xmonad, but rather than mucking about with tarballs I could at least now get the source from the package repository.</p>
<p>The appropriate steps, for anyone interested or having the same problem, are:</p>
<ol>
<li>Make sure libxinerama-dev is installed</li>
<li>Recompile libghc6-x11-dev and install it</li>
<li>Recompile libghc6-xmonad-dev and libghc6-xmonad-contrib-dev against the new X11 lib</li>
</ol>
<p>The apt-get incantations are:</p>
<pre>
sudo apt-get install libxinerama-dev
<span class="Statement">cd</span> /tmp
sudo apt-get source <span class="Special">&#8211;compile</span> libghc6-x11-dev
sudo dpkg <span class="Special">-i</span> libghc6-x11-dev_1.<span class="Constant">4</span>.<span class="Constant">1</span>-1_i386.deb
sudo apt-get build-dep libghc6-xmonad-dev
sudo apt-get source <span class="Special">&#8211;compile</span> libghc6-xmonad-dev
sudo dpkg <span class="Special">-i</span> libghc6-xmonad-dev
sudo apt-get build-dep libghc6-xmonad-contrib-dev
sudo apt-get source <span class="Special">&#8211;compile</span> libghc6-xmonad-contrib-dev
sudo dpkg <span class="Special">-i</span> libghc6-xmonad-contrib-dev_0.<span class="Constant">6</span>-4_i386.deb</pre>
<p>A quick alt-q restart, and all is well.</p>
<p>I only mention all this because it&#8217;s so easy in this day and age to take something like apt for granted, and every so often it&#8217;s worth taking a moment to appreciate just how spectacularly good it really is. Where I work, deployments are an endless source of headaches and grief, yet the complexity of those deployments absolutely pales against the task of updating literally millions of systems, all slightly different to each other, thousands of times a day. It&#8217;s just a joy to be able to say to apt &#8220;hey, go get me everything I need to build package x, then build package x, then install it for me. And get it right first time!&#8221;.</p>
<p>In most cases, it does just that. It&#8217;s an astonishing piece of software.</p>

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/08/10/ubuntu-xmonad-and-an-ode-to-apt/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/08/10/ubuntu-xmonad-and-an-ode-to-apt/</feedburner:origLink></item>
		<item>
		<title>Dynamic Async Batching with PFX</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/359565809/</link>
		<comments>http://basildoncoder.com/blog/2008/08/08/dynamic-async-batching-with-pfx/#comments</comments>
		<pubDate>Fri, 08 Aug 2008 16:41:04 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[.Net]]></category>

		<category><![CDATA[Coding]]></category>

		<category><![CDATA[Patterns]]></category>

		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/08/08/dynamic-async-batching-with-pfx/</guid>
		<description><![CDATA[The PFX Team blog has been posting some excellent articles recently on the subject of task batching using the June 2008 CTP release of the Task Parallel Library. It&#8217;s really cool to see some of these techniques abstracted properly in .Net, and I hope it eventually becomes part of the core libraries.
I&#8217;ve been playing around [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://blogs.msdn.com/pfxteam/default.aspx">PFX Team blog</a> has been posting some excellent articles recently on the subject of <a href="http://blogs.msdn.com/pfxteam/archive/2008/08/05/8835612.aspx">task batching</a> using the June 2008 CTP release of the Task Parallel Library. It&#8217;s really cool to see some of these techniques abstracted properly in .Net, and I hope it eventually becomes part of the core libraries.</p>
<p>I&#8217;ve been playing around a bit recently with the June CTP in the context of batching up web service calls, as that&#8217;s something I do quite a lot. One particular problem that comes up occasionally is a two-stage series of requests to download a complete set of paged data. I might do this if I wanted to download an entire discussion thread, for instance, or a large account statement from my online bank.</p>
<p>Typically in this situation the web service will limit the number of records I can retrieve in one request, and allow me to specify start and count parameters to the request. The response will also include a total record count, so I know how much data there is.</p>
<p>The normal use case for this is to request the first page of data, and use the total record count to display a list of page links that my user can click on to navigate the data or jump to any page. In my case, however, I want ALL the data as quickly as possible.</p>
<p>So, imagine a situation where I am using a service that lets me download a maximum of 200 records per request. My first step is to request the maximum 200 records starting from index 0, i.e. the first page of data. In the response will be a total record count - if that number is equal to the number of records I got back (i.e. &lt;= 200) I&#8217;ve got everything in one hit and can stop. But what if the total record count is, say, 1000? I need to make four more requests (since I&#8217;ve already got records 1-200, I have 800 more to get in batches of 200 each).</p>
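<p>The arithmetic is easy to get wrong by one page, so here is a minimal sketch of it on its own. The class and method names (<code>Paging</code>, <code>RemainingStarts</code>) are made up for illustration and are not part of any service API:</p>
<pre>
using System;
using System.Collections.Generic;

static class Paging
{
    // Start offsets of the pages still to fetch once the first page
    // (offset 0) has already been retrieved. Hypothetical helper.
    public static List&lt;int&gt; RemainingStarts(int totalRecordCount, int pageSize)
    {
        var starts = new List&lt;int&gt;();
        for (int start = pageSize; start &lt; totalRecordCount; start += pageSize)
        {
            starts.Add(start);
        }
        return starts;
    }

    static void Main()
    {
        // 1000 records at 200 per page leaves offsets 200, 400, 600 and 800.
        foreach (int start in RemainingStarts(1000, 200))
        {
            Console.WriteLine(start);
        }
    }
}</pre>
<p>If the total is 200 or less the list is empty, which is the &#8220;got everything in one hit&#8221; case.</p>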
<p>Naturally I want to do this asynchronously, using as few resources as I can. This means all webservice calls should be using the APM pattern (thus using IO completion ports, and not consuming worker threads from the thread pool or creating my own threads) and, preferably, not blocking anywhere except when I actually need some data before continuing.</p>
<p>The two-stage process can be successfully captured asynchronously by combining a future and a continuation. I encapsulate the initial request in a Future object (which is a subclass of Task), and handle the check-record-count-and-get-more-records-if-required logic in the continuation. The code for this basically looks as follows:</p>
<pre>
<span class="Type">public</span> Future&lt;List&lt;Item&gt;&gt; GetAllItemsAsync()
{
    <span class="Type">var</span> f = Create&lt;GetItemsResponse&gt;(
            ac =&gt; Service.BeginGetItems(<span class="Constant">0</span>, ac, <span class="Constant">null</span>),
            Service.EndGetItems);

    <span class="Type">var</span> start = <span class="Constant">200</span>;

    <span class="Type">var</span> resultFuture = f.ContinueWith(
        r =&gt;
            {
                <span class="Comment">// Batch retrieval here&#8230;</span>
            });

    <span class="Statement">return</span> resultFuture;
}</pre>
<p>In order to support the APM pattern neatly, I&#8217;m using the following method <a href="http://blogs.msdn.com/pfxteam/archive/2008/03/16/8272833.aspx">from the PFX blog</a>:</p>
<pre>
<span class="Type">private</span> <span class="Type">static</span> Future&lt;T&gt; Create&lt;T&gt;(
        Action&lt;AsyncCallback&gt; beginFunc,
        Func&lt;IAsyncResult, T&gt; endFunc)
{
    <span class="Type">var</span> f = Future&lt;T&gt;.Create();
    beginFunc(iar =&gt;
        {
            <span class="Statement">try</span>
            {
                f.Value = endFunc(iar);
            }
            <span class="Statement">catch</span> (Exception e)
            {
                f.Exception = e;
            }
        });
    <span class="Statement">return</span> f;
}</pre>
<p>This could be coded as an extension method, though I haven&#8217;t bothered yet as I&#8217;m hopeful this immensely useful snippet will be integrated into the library itself.</p>
<p>Now I need to make a number of calls to get the rest of the data, so I loop until I&#8217;ve made the required number of async service calls:</p>
<pre>
<span class="Type">var</span> resultFuture = f.ContinueWith(r =&gt;
    {
        <span class="Type">var</span> items = <span class="Statement">new</span> ConcurrentQueue&lt;Item&gt;();
        <span class="Type">var</span> handles = <span class="Statement">new</span> List&lt;WaitHandle&gt;();

        <span class="Statement">while</span> (start &lt; r.Value.TotalRecordCount)
        {
            <span class="Type">var</span> asyncResult = Service.BeginGetItems(start,
                ar =&gt; Service.EndGetItems(ar).Items
                    .ForEach(items.Enqueue), <span class="Constant">null</span>);

            handles.Add(asyncResult.AsyncWaitHandle);
            start += <span class="Constant">200</span>;
        }

        handles.ForEach(h =&gt; h.WaitOne());
        <span class="Statement">return</span> items.ToList();
    });</pre>
<p>I&#8217;m about 85% happy with this as an approach. I&#8217;m not completely happy, however, because of the WaitOne calls, which mean that I&#8217;m blocking on a threadpool thread until all the calls complete. Given that this is all wrapped up in a future, I may not actually need to access the data until well after the calls have completed, in which case I am wastefully consuming a threadpool thread for some period of time. So the $64,000 question is, how do I get rid of it? I&#8217;m sure there&#8217;s a way to do it, but my brain has gone on a protest march about all the time I&#8217;m forcing it to spend thinking about this stuff.</p>

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/08/08/dynamic-async-batching-with-pfx/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/08/08/dynamic-async-batching-with-pfx/</feedburner:origLink></item>
		<item>
		<title>Comment Discontent</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/350617524/</link>
		<comments>http://basildoncoder.com/blog/2008/07/30/comment-discontent/#comments</comments>
		<pubDate>Wed, 30 Jul 2008 16:01:07 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/07/30/comment-discontent/</guid>
		<description><![CDATA[There seems to have been a recent outbreak in blog posts about code commenting. As is so often the case with topics such as this, everyone has an opinion and they all seem to be different. It&#8217;s quite an eye-opener seeing some of the explanations, justifications, and outright haranguing used in defence of all sorts [...]]]></description>
			<content:encoded><![CDATA[<p>There seems to have been a recent <a href="http://blog.uncommons.org/2008/07/25/no-your-code-is-not-so-great-that-it-doesnt-need-comments/">outbreak</a> <a href="http://www.carlcrowder.com/blog/?p=34">in</a> <a href="http://www.codinghorror.com/blog/archives/001150.html">blog</a> <a href="http://steve-yegge.blogspot.com/2008/02/portrait-of-n00b.html">posts</a> about <a href="http://en.wikipedia.org/wiki/Comment_(computer_programming)">code commenting</a>. As is so often the case with topics such as this, everyone has an opinion and they all seem to be different. It&#8217;s quite an eye-opener seeing some of the explanations, justifications, and outright haranguing used in defence of all sorts of weird and wonderful stances.</p>
<p>I got a wry smile from <a href="http://steve-yegge.blogspot.com/2008/02/portrait-of-n00b.html">stevey&#8217;s post</a>, as I recognise only too well the tendency to write narrative comments. I&#8217;m sure there&#8217;s plenty of code from early in my career still floating around in various company codebases where the code/comment ratio is something embarrassing. I&#8217;ve mostly shaken that off now, though I sometimes have to fight my inner raconteur when writing something I think is neat or clever.</p>
<p>Jeff Atwood, as is so often the case recently, contradicted his <a href="http://www.codinghorror.com/blog/archives/000749.html">own previous post</a> on the matter (replacing the statement &#8220;comments can never be replaced by code alone&#8221; with &#8220;if you feel your code is too complex to understand without comments, your code is probably just bad&#8221;) and endearingly veered wildly to and fro across a sensible medium<sup>[1]</sup>, without ever quite hitting it. Coding Horror, indeed.</p>
<p>So far, so blah; every time an argument on comments flares up we see the same thing. Something I&#8217;ve not noticed before though, either because I wasn&#8217;t paying attention or because it&#8217;s a new thing, is a trend amongst the I-don&#8217;t-need-comments crowd to advocate very long and detailed method names as an alternative.</p>
<p>As neophyte coders we all have it drilled into us that we must use descriptive names. Programming gospel, as handed down in sacred tomes such as <a href="http://www.amazon.co.uk/Code-Complete-Practical-Handbook-Construction/dp/1556154844">Code Complete</a>, tells us not to use names like &#8216;i&#8217; and &#8216;tmp&#8217; except in very specific circumstances (e.g. loop indexes and tempfile handles). And, without question, this is good solid advice. Take heed, young Padawan, etc.</p>
<p>But can you take it too far? It&#8217;s not something I&#8217;ve really come up against, but it seems to be increasingly popular. <a href="http://blog.uncommons.org/2008/07/25/no-your-code-is-not-so-great-that-it-doesnt-need-comments/">One response to Jeff&#8217;s post</a> suggested (only in passing, to be fair) using a function name like <code>newtonRaphsonSquareRoot</code>. A <a href="http://digg.com/programming/No_your_code_is_not_so_great_that_it_doesn_t_need_comments">digg comment</a> (OK, OK, not exactly the fount of all knowledge) vehemently defended the virtue of the frankly-scary <code>RunEndOfMonthReportsUnlessTheMonthStartsOnAFridayIn<br />
WhichCaseRunTheWeeklyReportInstead</code> (!)</p>
<p>The argument is that with names like these, you don&#8217;t need comments, since it is perfectly clear what the function does. Is it perfectly clear at the wrong level though? Function names like this, in my opinion, are so &#8216;clear&#8217; that they leak. These are function names that violate the principle of <a href="http://en.wikipedia.org/wiki/Encapsulation_(classes_-_computers)">encapsulation</a>.</p>
<p>If I write a square root function, why do I need to burden all my clients with information about how I&#8217;ve implemented it? By naming it <code>newtonRaphsonSquareRoot</code>, that&#8217;s exactly what I&#8217;m doing. Unless there are specific performance implications/requirements that favour <a href="http://en.wikipedia.org/wiki/Newton%27s_method">Newton-Raphson</a>, in most cases my clients just want a damn square root calculated to within a specified tolerance and don&#8217;t care whether I used Newton&#8217;s method or one of the <a href="http://en.wikipedia.org/wiki/Methods_of_computing_square_roots">army of alternatives</a>. The implementation should be private to the method, and no-one else&#8217;s business.</p>
<p>Worse, what if a requirements change means a switch to <a href="http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Reciprocal_of_the_square_root">Walsh&#8217;s fast reciprocal method</a>? Uh-oh, now my method name is completely misleading, so I have to change it. Oops, now I have to change all the client code that calls it! I&#8217;d better hope no-one has exposed this with <code>[WebMethodAttribute]</code> since I wrote it, otherwise there could be thousands of client applications out there relying on it. My funky rename refactoring can&#8217;t save me now.</p>
<p>If every tiny change propagates through the system requiring hundreds of source files to change, and possibly external apps as well, you may as well just copy &#8216;n&#8217; paste the code everywhere it&#8217;s needed and do away with the function completely. Hell, who needs abstraction anyway?</p>
<p>We all do, of course, which is why I think names like this are a bad smell. The same goes for <code>RunEndOfMonthReportsUnless...</code> - what happens when the requirements change? This method name couples the public interface (method name) to the private implementation, which is exactly what you&#8217;re not supposed to do. <code>RunEndOfMonthReports</code> is probably sufficient. Separate interface and implementation. This is programming 101, people, it shouldn&#8217;t be beyond our grasp.</p>
<p><sup>[1]</sup>I agree with <a href="http://blog.uncommons.org/2008/07/25/no-your-code-is-not-so-great-that-it-doesnt-need-comments/">Dan Dyer</a> that the best choice is as follows:</p>
<pre>
<span class="Comment">/**</span>
<span class="Comment"> *  Approximate the square root of n, to within the specified</span>
<span class="Comment"> *  tolerance, using the Newton-Raphson method.</span>
<span class="Comment"> */</span>
<span class="Type">private</span> <span class="Type">double</span> approximateSquareRoot(<span class="Type">double</span> n, <span class="Type">double</span> tolerance)
{
    <span class="Type">double</span> root = n / <span class="Constant">2</span>;
    <span class="Statement">while</span> (abs(root - (n / root)) &gt; tolerance)
    {
        root = <span class="Constant">0.5</span> * (root + (n / root));
    }
    <span class="Statement">return</span> root;
}</pre>
<p>The function name is descriptive and clear whilst remaining general enough to allow an alternative implementation. Anyone who cares enough about the implementation (for performance reasons, or simply curiosity) can find enough information in the comment to start their investigation, without having the details jammed in their face every time they call it.</p>

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/07/30/comment-discontent/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/07/30/comment-discontent/</feedburner:origLink></item>
		<item>
		<title>These…Are Not The Hammer</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/340987706/</link>
		<comments>http://basildoncoder.com/blog/2008/07/20/theseare-not-the-hammer/#comments</comments>
		<pubDate>Sun, 20 Jul 2008 22:40:16 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/07/20/theseare-not-the-hammer/</guid>
		<description><![CDATA[So, yesterday saw the end of a weird week of Whedon wonderfulness when the third and final episode of Dr Horrible&#8217;s Sing-Along Blog was released. All three episodes are now free to view until the end of the day, so run, don&#8217;t walk, over there now. After all, how often do you get the chance [...]]]></description>
			<content:encoded><![CDATA[<p>So, yesterday saw the end of a weird week of Whedon wonderfulness when the third and final episode of <a href="http://www.drhorrible.com/index.html">Dr Horrible&#8217;s Sing-Along Blog</a> was released. All three episodes are now free to view until the end of the day, so run, don&#8217;t walk, over there now. After all, how often do you get the chance to see a musical comedy about a supervillain who video blogs? With Doogie Howser and Malcolm Reynolds!</p>

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/07/20/theseare-not-the-hammer/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/07/20/theseare-not-the-hammer/</feedburner:origLink></item>
		<item>
		<title>Lexical Closures in C# 3.0</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/324176894/</link>
		<comments>http://basildoncoder.com/blog/2008/07/01/lexical-closures-in-c-30/#comments</comments>
		<pubDate>Tue, 01 Jul 2008 16:49:36 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[.Net]]></category>

		<category><![CDATA[Coding]]></category>

		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/07/01/lexical-closures-in-c-30/</guid>
		<description><![CDATA[What does this code print...How does this change if you tweak it like this...And most importantly, why?]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s a <a href="http://dobbscodetalk.com/index.php?show=The-next-big-programming-language-feature-after-closures.html">slightly weird article</a> up on <a href="http://dobbscodetalk.com/">Dobbs Code Talk</a> this week, speculating that aggregate functions are &#8220;the next big programming language feature&#8221; after closures. The slight weirdness comes from the fact that both features have been around for decades, and not just in dusty academic languages either.</p>
<p>Still, there&#8217;s some interesting discussion in the comments about whether .Net&#8217;s closures are proper first-class lexically-scoped closures. The answer is yes - but with a fun twist.</p>
<p>The twist has been around for a long time - <a href="http://blogs.msdn.com/brada/default.aspx">Brad Abrams</a> blogged about it <a href="http://blogs.msdn.com/brada/archive/2004/08/03/207164.aspx">way back in 2004</a>, for instance - but it&#8217;s probably worth going over it again, since the recent arrival of LINQ and lambda syntax in C# 3.0 will presumably lead to more people being bitten by this as the use of closures becomes more mainstream.</p>
<p>A key thing to remember is that C# lambdas are just anonymous delegates in skimpy syntax. Behind the scenes the compiler turns them into classes - if you were looking at disassembled MSIL you wouldn&#8217;t be able to tell whether the code was written with lambda syntax or anonymous delegate syntax. Therefore, the issue discussed by Brad has not gone anywhere.</p>
<p>Let&#8217;s revisit the problem, with a 2008 sheen applied (i.e. I&#8217;ll use lambda syntax rather than anonymous delegate syntax). What does the following code display?</p>
<pre>
Func&lt;<span class="Type">int</span>&gt;[] funcs = <span class="Statement">new</span> Func&lt;<span class="Type">int</span>&gt;[<span class="Constant">10</span>];
<span class="Statement">for</span> (<span class="Type">int</span> i = <span class="Constant">0</span>; i &lt; <span class="Constant">10</span>; ++i)
{
    funcs[i] = () =&gt; i * i;
}

Array.ForEach(funcs, f =&gt; Console.WriteLine(f()));</pre>
<p>If you answered something along the lines of &#8220;prints the square of every number between 0 and 9&#8221; you&#8217;d be&#8230;wrong. Really, try it out. See?</p>
<p>Now, a lexical closure is supposed to capture its environment, meaning that the lambda stored on the first loop would capture i when i==0, the second loop would capture i when i==1, and so on. If this happened, then executing all the lambdas would indeed result in the squares of the numbers 0-9 being printed. So what gives?</p>
<p>The problem stems from the fact that the lambda is binding itself to a variable that is accessible outside the closure, which is being changed in every iteration of the loop. The closure doesn&#8217;t capture the value of i, it captures a reference to i itself, which is mutable.</p>
<p>You could actually make a case that this is bad code anyway, since it gives two responsibilities to the loop index - control the loop, and act as data in the closures. If we were being pedantic, we could split the responsibilities by creating a new variable, j, to be the closure data each iteration, and let i concentrate on being an index:</p>
<pre>
<span class="Statement">for</span> (<span class="Type">int</span> i = <span class="Constant">0</span>; i &lt; <span class="Constant">10</span>; ++i)
{
    <span class="Type">int</span> j = i;
    funcs[i] = () =&gt; j * j;
}</pre>
<p>Lo and behold, the code now works! Pedantry rules! Take a look with Reflector or ildasm to see what&#8217;s going on here. The executive summary is that the compiler captures the environment (i in the first example, j in the second) by creating a member variable within the class it generates for the closure. Previously, since the same instance of i lived for the entire duration of the loop, only one instance of the generated class was created and shared. Now, however, a new instance of the generated class is created in each iteration of the loop (since j is scoped within the loop body and thus we have a new j every time round). Thus, the data is not shared and we get the expected output.</p>
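<p>To make the shared-instance versus instance-per-iteration difference concrete, here is a hand-written approximation of the two shapes. It is only a sketch: <code>DisplayClass</code> and <code>Captured</code> are names I have invented, and the class the compiler really emits has a generated name and differs in detail.</p>
<pre>
using System;

// Hand-written stand-in for the compiler-generated closure class.
// The names here are invented for illustration.
class DisplayClass
{
    public int Captured;
    public int Square() { return Captured * Captured; }
}

static class ClosureSketch
{
    // Roughly the first example: one environment object shared by every delegate.
    public static Func&lt;int&gt;[] Shared()
    {
        var funcs = new Func&lt;int&gt;[10];
        var env = new DisplayClass();
        for (env.Captured = 0; env.Captured &lt; 10; ++env.Captured)
        {
            funcs[env.Captured] = env.Square;
        }
        return funcs;
    }

    // Roughly the second example: a fresh environment object per iteration.
    public static Func&lt;int&gt;[] PerIteration()
    {
        var funcs = new Func&lt;int&gt;[10];
        for (int i = 0; i &lt; 10; ++i)
        {
            var env = new DisplayClass();
            env.Captured = i;
            funcs[i] = env.Square;
        }
        return funcs;
    }

    static void Main()
    {
        Console.WriteLine(Shared()[3]());       // 100, the loop left Captured at 10
        Console.WriteLine(PerIteration()[3]()); // 9
    }
}</pre>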
<p>There are two important points to consider here:</p>
<ol>
<li>The problem goes away if you write code more declaratively. Do away with the clunky for loop and everything works OK.
<pre>
Enumerable.Range(<span class="Constant">0</span>, <span class="Constant">10</span>).Select(x =&gt; x * x);</pre>
</li>
<li>It isn&#8217;t always bad that multiple closures can capture a reference - since one closure can &#8216;see&#8217; updates made to the shared data by another closure, you could use this as a coordination mechanism.</li>
</ol>
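<p>As a small illustration of the second point, here is a sketch in which two lambdas capture the same local and one observes the other&#8217;s updates. The names are arbitrary, and nothing here is thread-safe, so treat it as a single-threaded demonstration only:</p>
<pre>
using System;

static class SharedCaptureSketch
{
    // Returns the value read back through the second closure.
    public static int Run()
    {
        int count = 0;

        // Both lambdas capture the same variable, not a copy of its value.
        Action increment = () =&gt; { count++; };
        Func&lt;int&gt; read = () =&gt; count;

        increment();
        increment();
        return read();
    }

    static void Main()
    {
        Console.WriteLine(Run()); // 2
    }
}</pre>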
<p>This is not an issue that&#8217;s going to crop up every day - the example above is fairly contrived - but knowing about it will save some painful debugging sessions when inevitably you do run into it. The fix is always to take a local copy of the mutable data to coerce the compiler into generating code that creates multiple instances of the class generated to represent the closure.</p>
<p>Simple, yes? <img src='http://basildoncoder.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /></p>

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/07/01/lexical-closures-in-c-30/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/07/01/lexical-closures-in-c-30/</feedburner:origLink></item>
		<item>
		<title>Project Euler Problem 5</title>
		<link>http://feeds.feedburner.com/~r/BasildonCoder/~3/308778328/</link>
		<comments>http://basildoncoder.com/blog/2008/06/10/project-euler-problem-5/#comments</comments>
		<pubDate>Tue, 10 Jun 2008 11:47:27 +0000</pubDate>
		<dc:creator>russ</dc:creator>
		
		<category><![CDATA[Coding]]></category>

		<category><![CDATA[Mathematics]]></category>

		<category><![CDATA[Project Euler]]></category>

		<category><![CDATA[Software Engineering]]></category>

		<guid isPermaLink="false">http://basildoncoder.com/blog/2008/06/10/project-euler-problem-5/</guid>
		<description><![CDATA[On to the next Project Euler problem (after a bit of a hiatus)&#8230;
Problem 5
2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.
What is the smallest number that is evenly divisible by all of the numbers from 1 to 20?
In common with many of [...]]]></description>
			<content:encoded><![CDATA[<p>On to the next Project Euler problem (after a bit of a hiatus)&#8230;</p>
<p><em><strong><a href="http://projecteuler.net/index.php?section=problems&amp;id=5">Problem 5</a></strong></em></p>
<blockquote><p>2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.</p>
<p>What is the smallest number that is <dfn title="divisible with no remainder">evenly divisible</dfn> by all of the numbers from 1 to 20?</p></blockquote>
<p>In common with many of the other Euler problems, there&#8217;s a brute-force way to solve this, and a clean algorithmic way. And in common with my other Euler posts so far, I&#8217;ll start with the brute-force way <img src='http://basildoncoder.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>This problem can be tackled head-on with the following approach: Start from <em>n</em>=1 and increment in a loop. Test each value of <em>n</em> by attempting to divide it by all numbers <em>m</em> from 1 to 20. The first number to pass the test (i.e. <em>n</em> mod <em>m</em> is 0 for all values of <em>m</em>) is the answer.</p>
<pre>
<span class="Type">private</span> <span class="Type">static</span> <span class="Type">long</span> BruteForceSolver()
{
    <span class="Type">long</span> result;
    <span class="Statement">for</span> (result = <span class="Constant">1</span>; !Check(result); ++result)
        ;
    <span class="Statement">return</span> result;
}

<span class="Type">private</span> <span class="Type">static</span> <span class="Type">bool</span> Check(<span class="Type">long</span> result)
{
    <span class="Statement">for</span> (<span class="Type">int</span> i = <span class="Constant">1</span>; i &lt;= <span class="Constant">20</span>; ++i)
    {
        <span class="Statement">if</span> (result % i != <span class="Constant">0</span>)
            <span class="Statement">return</span> <span class="Constant">false</span>;
    }

    <span class="Statement">return</span> <span class="Constant">true</span>;
}</pre>
<p>This works, but it takes &gt;12 seconds to execute on my PC, so it&#8217;s not what you&#8217;d call efficient (though it is well within the Euler execution time guidelines).</p>
<p>Some speed gains can be achieved by exploiting the information provided in the question itself. We are told that 2520 is the lowest number evenly divisible by all numbers from 1 to 10. Since the problem space (1 to 20) includes all these numbers, the answer must also be evenly divisible by 2520. This allows much bigger increments each loop - rather than incrementing by 1, why not increment by 2520? And since the answer must be greater than or equal to 2520, why not start the loop there instead of 1? Finally, since we already know that 1 to 10 divide evenly into 2520, each inner loop only needs to check numbers 11 to 20.</p>
<p>That should speed things up a bit:</p>
<pre><span class="Type">private</span> <span class="Type">long</span> BruteForceSolver()
{
    <span class="Type">long</span> result;
    <span class="Statement">for</span> (result = <span class="Constant">2520</span>; !Check(result); result += <span class="Constant">2520</span>)
        ;
    <span class="Statement">return</span> result;
}

<span class="Type">private</span> <span class="Type">bool</span> Check(<span class="Type">long</span> result)
{
    <span class="Statement">for</span> (<span class="Type">int</span> i = <span class="Constant">11</span>; i &lt;= <span class="Constant">20</span>; ++i)
    {
        <span class="Statement">if</span> (result % i != <span class="Constant">0</span>)
            <span class="Statement">return</span> <span class="Constant">false</span>;
    }

    <span class="Statement">return</span> <span class="Constant">true</span>;
}</pre>
<p>And indeed, on my machine this is now down to 150ms or so. It&#8217;s still not a very nice way to tackle the problem, though.</p>
<p>Thinking about it from a different angle yields an altogether smarter approach. Imagine we are looking for the lowest number evenly divisible by the numbers 1 to 2.</p>
<p>[1, 2]</p>
<p>Well that&#8217;s easy; since there are only two numbers we just find the lowest common multiple (LCM), which in this case is 2 (since 2 % 2 == 0, and 2 % 1 == 0). If we call this sequence s<sub>1</sub>, we can say that LCM(s<sub>1</sub>) = 2.</p>
<p>OK, now imagine we are solving the same problem for s<sub>2</sub>, which contains the numbers 1 to 3.</p>
<p>[1, 2, 3]</p>
<p>You&#8217;ll notice that s<sub>2</sub> contains s<sub>1</sub> in its entirety. LCM(s<sub>2</sub>) must therefore be a multiple of LCM(s<sub>1</sub>), so we can rewrite s<sub>2</sub> as [LCM(s<sub>1</sub>), 3], or [2, 3] (since we know LCM(s<sub>1</sub>) = 2). Now we are down to two numbers again, so we can calculate the LCM of 2 and 3, which is 6, so LCM(s<sub>2</sub>) = 6.</p>
<p>OK, now we solve the problem for the first 4 numbers (s<sub>3</sub>).</p>
<p>[1, 2, 3, 4]</p>
<p>This sequence contains s<sub>2</sub>, therefore LCM(s<sub>3</sub>) is a multiple of LCM(s<sub>2</sub>). We can rewrite s<sub>3</sub> as [LCM(s<sub>2</sub>), 4], or [6, 4]. Thus, LCM(s<sub>3</sub>) = 12.</p>
<p>This can be repeated as many times as necessary. Generally, we have s<em><sub>n</sub></em> = [LCM(s<sub><em>n</em>-1</sub>), <em>n</em>+1] where <em>n</em> &gt; 0.</p>
<p>This looks recursive, but a better way to think of it is as an excellent example of a fold. A fold is one of the fundamental tools of functional programming. In fact, it is perhaps the most fundamental, since map, filter etc can be implemented as right folds<sup>[1]</sup>.</p>
<p>I won&#8217;t inflict my pitiful Photoshop skills on anyone by trying to graphically represent a fold - try looking at <a href="http://en.wikipedia.org/wiki/Fold_(higher-order_function)">this Wikipedia article</a> if you want to try and visualise it.</p>
<p>Broadly, the behaviour of a fold is to apply a combining function to elements in a list (or other data structure) and accumulate the results. That&#8217;s exactly what we want here - our combining function is LCM, and our accumulating value is the LCM of the whole list. Effectively, for list s<sub>3</sub> above, we have</p>
<p>LCM(s<sub>3</sub>) = LCM(LCM(LCM(1,2),3),4) = 12</p>
<p>Note how the result of the innermost LCM (applied to values 1 and 2) becomes a parameter to the next LCM, which in turn becomes a parameter to the outermost LCM which returns the result we want.</p>
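<p>Spelled out as a loop, a left fold is nothing more than an accumulator that gets fed each element in turn. The following is a minimal sketch of that idea, with <code>Fold</code>, <code>Gcd</code> and <code>Lcm</code> as illustrative helper names rather than anything from a library:</p>
<pre>
using System;
using System.Collections.Generic;

static class FoldSketch
{
    // Left fold: combine the running result with each element in order.
    public static TAcc Fold&lt;T, TAcc&gt;(IEnumerable&lt;T&gt; items, TAcc seed, Func&lt;TAcc, T, TAcc&gt; combine)
    {
        TAcc acc = seed;
        foreach (T item in items)
        {
            acc = combine(acc, item);
        }
        return acc;
    }

    // Euclid's algorithm.
    public static long Gcd(long a, long b)
    {
        while (b != 0)
        {
            long tmp = b;
            b = a % b;
            a = tmp;
        }
        return a;
    }

    // Divide before multiplying to keep the intermediate value small.
    public static long Lcm(long a, long b)
    {
        return a / Gcd(a, b) * b;
    }

    static void Main()
    {
        // Starting from 1, this computes LCM(LCM(LCM(LCM(1,1),2),3),4).
        long result = Fold(new long[] { 1, 2, 3, 4 }, 1L, (acc, n) =&gt; Lcm(acc, n));
        Console.WriteLine(result); // 12
    }
}</pre>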
<p>By using a fold, we can generalise. In Haskell, the whole problem is a one-liner:</p>
<pre>
foldl lcm <span class="Constant">1</span> [<span class="Constant">1</span><span class="Statement">..</span><span class="Constant">20</span>]</pre>
<p>The 1 passed in as a parameter is the initial value of the accumulator, which is also what the fold returns if the list is empty. It is common for this value to be the first element of the list, so Haskell provides a convenience function that removes the need to specify it as a parameter:</p>
<pre>
foldl1 lcm [<span class="Constant">1</span><span class="Statement">..</span><span class="Constant">20</span>]</pre>
<p>Not all languages and platforms provide an LCM function right out of the box, so to take this neat Haskell solution and port it to .Net, the LCM function needs to be implemented. This is easily done in terms of the greatest common divisor (GCD) like so:</p>
<p style="text-align: center"><a href="http://en.wikipedia.org/wiki/Least_common_multiple#Calculating_the_least_common_multiple"><img src="http://upload.wikimedia.org/math/4/d/2/4d244548521249d1b5a71941506d5f41.png" height="48" width="183" /></a></p>
<p>.Net doesn&#8217;t provide a GCD function either, so I&#8217;ll implement it using <a href="http://en.wikipedia.org/wiki/Euclidean_algorithm">Euclid&#8217;s Algorithm</a> as an extension method on long ints:</p>
<pre>
<span class="Type">public</span> <span class="Type">static</span> <span class="Type">long</span> GCD(<span class="Statement">this</span> <span class="Type">long</span> a, <span class="Type">long</span> b)
{
    <span class="Statement">while</span> (b != <span class="Constant">0</span>)
    {
        <span class="Type">long</span> tmp = b;
        b = a % b;
        a = tmp;
    }

    <span class="Statement">return</span> a;
}</pre>
<p>With GCD defined, LCM can be implemented as above:</p>
<pre>
<span class="Type">public</span> <span class="Type">static</span> <span class="Type">long</span> LCM(<span class="Statement">this</span> <span class="Type">long</span> a, <span class="Type">long</span> b)
{
    // Divide before multiplying so the intermediate value is less likely to overflow.
    <span class="Statement">return</span> a / a.GCD(b) * b;
}</pre>
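<p>As a quick sanity check of the formula, the same pair of functions can be sketched in Ruby. The names my_gcd and my_lcm are made up for this example (Ruby&#8217;s Integer already provides gcd and lcm), and the sketch assumes positive inputs.</p>

```ruby
# Hand-rolled versions for illustration only; assumes positive integers.
def my_gcd(a, b)
  until b.zero?
    a, b = b, a % b
  end
  a
end

def my_lcm(a, b)
  # lcm(a, b) = a * b / gcd(a, b), dividing first to keep the intermediate value small.
  a / my_gcd(a, b) * b
end

small = my_lcm(4, 6)   # 12, since gcd(4, 6) is 2
other = my_lcm(21, 6)  # 42, since gcd(21, 6) is 3
```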
<p>With this in place, it&#8217;s a simple matter to use .Net&#8217;s equivalent of fold - a method on IEnumerable&lt;T&gt; called Aggregate - to get the answer<sup>[2]</sup>:</p>
<pre>
<span class="Statement">return</span> LongEnumerable.Range(<span class="Constant">1</span>, <span class="Constant">20</span>)
    .Aggregate(<span class="Constant">1L</span>, (curr, next) =&gt; curr.LCM(next));</pre>
<p>And indeed, the same basic pattern can be used to solve the problem in a number of languages. In F#, given implementations of LCM and GCD as above, we have:</p>
<pre>
<span class="PreProc">List</span>.fold_left lcm <span class="Constant">1</span> <span class="Statement">[</span><span class="Constant">1</span>..<span class="Constant">20</span><span class="Statement">]</span></pre>
<p>And in Ruby:</p>
<pre>
<span class="PreProc">require</span> <span class="Special">'</span><span class="Constant">rational</span><span class="Special">'</span>
(<span class="Constant">1</span>..<span class="Constant">20</span>).inject { |<span class="Identifier">c</span>, <span class="Identifier">n</span>| c.lcm n }</pre>
<p>Given that the right algorithm reduces this problem to a fairly trivial expression in all these languages, it&#8217;s hard to say which is the nicest. Overall I&#8217;ll give the nod to Haskell, for not making me implement LCM and because I find Ruby&#8217;s &#8216;inject&#8217; a less intuitive function name than foldr (but that&#8217;s probably because I learned the technique in Haskell in the first place and am set in my ways&#8230;)</p>
<hr /> <p><sup>[1]</sup>For example, in F#:</p>
<pre>
<span class="Statement">let</span> filter p lst <span class="Statement">=</span>
  <span class="PreProc">List</span>.fold_right <span class="Statement">(fun</span> x xs <span class="Statement">-&gt;</span> <span class="Statement">if</span> p x <span class="Statement">then</span> x<span class="Statement">::</span>xs <span class="Statement">else</span> xs<span class="Statement">)</span> lst <span class="Constant">[]</span>

<span class="Statement">let</span> map f lst <span class="Statement">=</span>
  <span class="PreProc">List</span>.fold_right <span class="Statement">(fun</span> x xs <span class="Statement">-&gt;</span> f x <span class="Statement">::</span> xs<span class="Statement">)</span> lst <span class="Constant">[]</span></pre>
<p><sup>[2]</sup>Note that in this code LongEnumerable is just a very simple partial reimplementation of Enumerable that uses longs instead of ints.</p>

]]></content:encoded>
			<wfw:commentRss>http://basildoncoder.com/blog/2008/06/10/project-euler-problem-5/feed/</wfw:commentRss>
		<feedburner:origLink>http://basildoncoder.com/blog/2008/06/10/project-euler-problem-5/</feedburner:origLink></item>
	</channel>
</rss>
