Archive for February, 2008

C# 3.0, Parallel LINQ, And The Betfair API - An Introduction

Saturday, February 23rd, 2008

My pal Jan has a habit of waxing lyrical about the wonders of Parallel LINQ (PLINQ) as soon as you make the mistake of mentioning multithreading within earshot. I’ve been playing around with .Net 3.5 recently, and I write a lot of async code day-to-day to keep desktop webservice clients responsive while they make lots of webservice calls, so I thought it high time I took a closer look.

The Problem

A key goal for the kind of async work I do is to batch multiple calls up, so that I get all the responses at once. This is important for keeping the rest of the code clean. To illustrate, imagine you are writing an application against the Betfair API, and you have a screen that displays a market, your current profit and loss on that market, and your unmatched bets on that market. To populate this screen will require four API calls - getMarket(), getMarketPrices(), getMarketProfitAndLoss(), and getCurrentBets().

Now, the worst (though easiest) thing to do is make the four calls sequentially on the UI thread. The problem with this is it’s slow, and the UI freezes during the process (since you’re blocking on the UI thread), which is a lousy user experience.

A slightly better approach is to spin off a thread, and make the four calls there, raising an event on completion. This gets all the work off the UI thread and therefore keeps the application responsive, but it’s still slow as the calls are still sequential.

To speed it up, you can create a thread per call (so four threads in this case). There’s a whole lot of complexity around working out the optimum number of threads to use (depending on how many processors you have, how many simultaneous connections you are allowed to open, etc) but that’s a bit beyond the scope of this post, so for now we’ll go with the one-thread-per-task approach and assume it’s optimal.

So, each thread makes one webservice call, and raises an event to signify that it’s finished. Simple, right? Unfortunately, this can lead to some real headaches in collating the data.

Imagine a user has hundreds of bets on the market, and therefore the getCurrentBets() call takes a bit longer to execute than the other three. The user clicks on a market, and the threads responsible for getting market data and P&L raise their events quickly, so you display the screen with the data you have and plan to display the bets as and when they arrive.

Before the bets are received, however, the user clicks on another market. Again, the market data and P&L come back quickly and you display them. Then, finally, the original getCurrentBets() call completes. But wait! You’ve moved onto another market now, so you don’t care about those bets any more! So you have to write some code to make sure that each piece of data received is still relevant. This can become very onerous very quickly, as you struggle to determine your UI state and work out what data you want and what should be discarded.

Now imagine that your application has timers firing all over the place to update prices and P&L on the market every second or two, so you have events being raised all the time.

I’ve worked with code that ventured down this path, and believe me, you don’t want to go there.

The Solution

The best approach is to batch these calls up, so that each happens on a separate thread, but only one event is raised - when all of the data has been received. That way, you can be sure that when you handle the event, all the data is consistent.

Since this is one of the things that PLINQ does for you, it seems like a good candidate for kicking the tyres, so to speak. First, though, I’ll do a quick run through of how to do this without PLINQ, for comparison’s sake. The task will be to display a list of all the Premiership matches available on Betfair at the time the code runs.

Take Out The Old

Betfair list Premiership matches grouped by fixture date, under the Barclays Premiership node in the event tree. It looks something like this:

Soccer
    English Soccer
        Barclays Premiership
            Fixtures 23 February
                Fulham v West Ham
                Liverpool v Middlesbrough
                ...
            Fixtures 24 February
                Blackburn v Bolton
                Reading v Aston Villa
            Fixtures 25 February
                Man City v Everton

The Barclays Premiership event node has an ID that doesn’t change (2022802), so I can jump straight to that node and save myself the bother of having to navigate the Soccer and English Soccer parent nodes.

I’ll assume you already know how to create Service References for Betfair’s global WSDL, and skip straight on to creating some useful helper methods. I need to be able to call getEvents(), obviously:

private GetEventsResp GetEvents(int parentEventID)
{
    return m_global.getEvents(
            MakeEventRequest(parentEventID)).Result;
}

private getEventsIn MakeEventRequest(int parentEventID)
{
    return new getEventsIn(new GetEventsReq()
        {
            header = new APIRequestHeader()
            {
                sessionToken = m_sessionToken
            },
            eventParentId = parentEventID
        });
}

If you’re not used to C# 3.0, this is taking advantage of object initialisers to create nested objects without having to create a bunch of extra local variables. You can write the exact same method without object initialisers like this:

private getEventsIn MakeEventRequest(int parentEventID)
{
    APIRequestHeader header = new APIRequestHeader();
    header.sessionToken = m_sessionToken;
    GetEventsReq req = new GetEventsReq();
    req.header = header;
    req.eventParentId = parentEventID;
    return new getEventsIn(req);
}

The first thing I need to do is get a list of fixture nodes. I can do this by asking for child events of the Premiership node, and filtering for the events whose names start with the word ‘Fixtures’. This can be achieved with a simple regex and a bit of normal LINQ:

private List<BFEvent> GetPremiershipFixtureEvents()
{
    return GetEvents(PREMIERSHIP).eventItems.Where(
        (ev, idx) => Regex.IsMatch(ev.eventName, "^Fixtures.*")
        ).ToList();
}

Assume PREMIERSHIP is a const int with the value 2022802. The Where() method works as a filter - you pass it a delegate, and it executes that delegate against each member of the list and returns a new sequence containing only the elements for which the delegate returned true. The ToList() call then copies that sequence into a list.

In this case, I’m creating the delegate with a lambda expression, which returns true for elements with an event name that is matched by the regex.
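
To see the filter in isolation, here is a minimal sketch that swaps the Betfair events for a plain array of strings. The sample names and the FilterFixtures() helper are made up for illustration. Note that Where() is lazy, so nothing is actually filtered until ToList() enumerates the sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FixtureFilterDemo
{
    // Keeps only the names that begin with "Fixtures".
    public static List<string> FilterFixtures(IEnumerable<string> eventNames)
    {
        return eventNames
            .Where(name => Regex.IsMatch(name, "^Fixtures"))
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical event names standing in for BFEvent.eventName values.
        string[] names =
        {
            "Fixtures 23 February",
            "Winner 2007/08",
            "Fixtures 24 February",
            "Top Goalscorer"
        };

        foreach (string name in FilterFixtures(names))
        {
            Console.WriteLine(name);
        }
    }
}
```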

Now I’ve got the fixture events, I need to get the child events of each, which correspond to the actual matches. I want each call to be asynchronous so that they happen in parallel, rather than sequentially. I also want to wait for all calls to complete before continuing, so I use the WaitHandle.WaitAll() method:

private List<BFEvent> GetMatchEvents(
    List<BFEvent> fixtureDateEvents)
{
    List<BFEvent> matchEvents = new List<BFEvent>();
    var callbacks = (
        from ev in fixtureDateEvents
        select StartGetEvents(ev.eventId, matchEvents)
        ).ToList();
    WaitHandle.WaitAll(callbacks.ConvertAll(
                ar => ar.AsyncWaitHandle).ToArray());
    return matchEvents;
}

Here, the LINQ expression and the ConvertAll() method call are doing similar things - converting all elements of a list into another type. In the case of the LINQ expression, I am effectively obtaining a list of IAsyncResult objects by calling StartGetEvents() on each event in my list and storing the return value of each call. In the case of the ConvertAll() call, I am obtaining a list of WaitHandle objects by accessing the AsyncWaitHandle property of each IAsyncResult object in the list.

It is perfectly possible to replace the LINQ expression with a call to ConvertAll(), or the ConvertAll() call with another LINQ expression. Which one you use in cases like this is largely a matter of preference.
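
As a rough illustration of that equivalence, the sketch below performs the same projection both ways on a list of strings. The class and method names are mine, purely for demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProjectionDemo
{
    // Projection written as a LINQ query expression.
    public static List<int> LengthsWithLinq(List<string> words)
    {
        return (from w in words select w.Length).ToList();
    }

    // The same projection written with List<T>.ConvertAll().
    public static List<int> LengthsWithConvertAll(List<string> words)
    {
        return words.ConvertAll(w => w.Length);
    }

    public static void Main()
    {
        List<string> words = new List<string> { "Fulham", "Liverpool", "Reading" };

        // Both calls produce the same list of lengths.
        bool same = LengthsWithLinq(words).SequenceEqual(LengthsWithConvertAll(words));
        Console.WriteLine(same);
    }
}
```

One practical difference: ConvertAll() is only available on List<T> (and arrays, via Array.ConvertAll()), whereas the LINQ form works on any IEnumerable<T>.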

The StartGetEvents() method needs to make an asynchronous webservice call and append the results to the provided list. Since multiple threads are accessing the list, the write must be protected with a lock:

private IAsyncResult StartGetEvents(int parentEventID,
    List<BFEvent> matchEvents)
{
    return m_global.BegingetEvents(MakeEventRequest(parentEventID),
        delegate(IAsyncResult ar)
        {
            lock (matchEvents)
            {
                matchEvents.AddRange(
                    m_global.EndgetEvents(ar).Result.eventItems);
            }
        },
        m_global);
}

I am using an anonymous delegate for the callback here. All it does is lock the list and add the events contained in the response. Note that in production code you’d want to be more diligent about locking strategies and so on. For one thing, an IAsyncResult’s wait handle can be signalled before its callback has finished running, so WaitAll() may return before every result has been added to the list. I’ve written the code like this for conciseness, not for production-grade correctness.
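
One way to tighten this up is to signal completion from the callback itself, once the results are safely in the list, rather than waiting on the per-call wait handles. The sketch below shows the idea with the thread pool standing in for the webservice calls. FetchAll() and the fetch delegate are hypothetical names, not part of the Betfair API, and error handling is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class BatchDemo
{
    // Runs every fetch on the thread pool and returns only once all
    // results have been added to the shared list.
    public static List<string> FetchAll(IList<int> parentIds, Func<int, string[]> fetch)
    {
        List<string> results = new List<string>();
        if (parentIds.Count == 0)
        {
            return results;
        }

        int pending = parentIds.Count;
        using (ManualResetEvent allDone = new ManualResetEvent(false))
        {
            foreach (int id in parentIds)
            {
                int capturedId = id; // copy, so each delegate sees its own value
                ThreadPool.QueueUserWorkItem(delegate
                {
                    string[] items = fetch(capturedId);
                    lock (results)
                    {
                        results.AddRange(items);
                    }
                    // Count down only after the results are stored; the
                    // last task to finish releases the waiting caller.
                    if (Interlocked.Decrement(ref pending) == 0)
                    {
                        allDone.Set();
                    }
                });
            }
            allDone.WaitOne();
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical stand-in for a webservice call: two matches per fixture date.
        Func<int, string[]> fakeFetch = delegate(int id)
        {
            Thread.Sleep(50); // simulate network latency
            return new string[] { "Match " + id + "a", "Match " + id + "b" };
        };

        List<string> matches = FetchAll(new int[] { 1, 2, 3 }, fakeFetch);
        Console.WriteLine(matches.Count + " matches received");
    }
}
```

Waiting on a single event also avoids the limit on how many handles WaitHandle.WaitAll() will accept, at the cost of a little extra bookkeeping.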

Now the whole shebang can be invoked very simply:

var fixtures = GetPremiershipFixtureEvents();
GetMatchEvents(fixtures).ForEach(
        e => Console.WriteLine(e.eventName));

Note that the calling code is very clean and simple, and doesn’t care about threads or anything like that - all that async plumbing is nicely contained in the GetMatchEvents() and StartGetEvents() methods.

Bring In The New

So how can PLINQ help with this? Well, it lets me get rid of those GetMatchEvents() and StartGetEvents() methods, which contain all the fiddly async code and are easily the most complex methods in the code above.

First, I’ll create a simple task class which represents the task of getting events for a particular ID:

public class GetEventsTask
{
    private int m_parentEventID;
    private string m_sessionToken;

    public GetEventsTask(string sessionToken,
            int parentEventID)
    {
        m_sessionToken = sessionToken;
        m_parentEventID = parentEventID;
    }

    public List<BFEvent> GetEvents()
    {
        BFGlobalService svc = new BFGlobalServiceClient();
        APIRequestHeader header = new APIRequestHeader()
            { sessionToken = m_sessionToken };
        return new List<BFEvent>(svc.getEvents(
            new getEventsIn(new GetEventsReq()
            {
                eventParentId = m_parentEventID,
                header = header
            })).Result.eventItems);
    }
}

Once I’ve created an instance of this class, a call to GetEvents() will get me all the child events for the specified parent node.

To use PLINQ, all I have to do is create an array of these task objects - one per fixture date - and use the AsParallel() extension method to specify that I want the task processing done in parallel:

    GetEventsTask[] tasks = (
            from ev in fixtureDateEvents
            select new GetEventsTask(m_sessionToken, ev.eventId)
            ).ToArray();
    var taskResults = (
            from t in tasks.AsParallel()
            select t.GetEvents()
            ).ToList();

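Neat, eh? Note that PLINQ also takes care of deciding how many threads to use, neatly sidestepping the work I alluded to earlier (though its choice is geared towards the number of processors, which isn’t necessarily ideal for calls that spend most of their time waiting on the network).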

One wrinkle is that my PLINQ statement results in a list of lists, so I need to flatten it out before returning.

List<BFEvent> matchEvents = new List<BFEvent>();
taskResults.ForEach(results => matchEvents.AddRange(results));
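
An alternative to the ForEach()/AddRange() pair is LINQ’s SelectMany() operator, which does the flattening in one step. A minimal sketch, using lists of strings in place of lists of BFEvent (the Flatten() helper and sample data are illustrative only):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlattenDemo
{
    // Collapses a sequence of lists into a single list, preserving order.
    public static List<string> Flatten(IEnumerable<List<string>> lists)
    {
        return lists.SelectMany(list => list).ToList();
    }

    public static void Main()
    {
        // Hypothetical per-fixture-date results.
        List<List<string>> taskResults = new List<List<string>>
        {
            new List<string> { "Fulham v West Ham", "Liverpool v Middlesbrough" },
            new List<string> { "Blackburn v Bolton", "Reading v Aston Villa" },
            new List<string> { "Man City v Everton" }
        };

        foreach (string match in Flatten(taskResults))
        {
            Console.WriteLine(match);
        }
    }
}
```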

Obviously this is only scratching the surface, not only of PLINQ but of LINQ itself. Much more powerful expressions can be created with a little tweaking of the objects generated from the Betfair WSDL - but that’s a topic for another article.

Code CAN Be Beautiful

Friday, February 22nd, 2008

In his review of Beautiful Code, Jeff Atwood decides that no, actually it isn’t. He’s fairly adamant about it too:

Ideas are beautiful. Algorithms are beautiful. Well executed ideas and algorithms are even more beautiful. But the code itself is not beautiful. The beauty of code lies in the architecture, the ideas, the grander algorithms and strategies that code represents.

I just can’t agree with this. It’s effectively saying that a representation cannot be beautiful; only the underlying thing that’s being represented can be beautiful. Worse, he extends this argument to literature and art as well, citing a reader review from Amazon that quotes a little Russian poetry and rhetorically asks whether any non-Russian-speaking reader can see beauty in it.

This drives me nuts, it really does. Of course the representation can be beautiful, and it can also be ugly. And the beauty of the representation can have an amplifying effect on the subject of the representation. Form and content are related. A non-Russian-speaker may not appreciate Russian poetry, but that doesn’t mean that form itself has no value - it means that, in this case at least, the value of form is dependent on the content. If you don’t understand the content, you don’t appreciate the form.

This isn’t an absolute, though. In literature, there are many techniques for adding value to form. Alliteration, assonance, metre, and many more techniques are all structural techniques for beautifying form. I’d argue that pretty much anyone can appreciate the compact and succinct beauty of the phrase veni, vidi, vici without understanding what it means (“I came, I saw, I conquered”).

There are countless other examples. You don’t need to understand Italian to enjoy opera, for instance. In fact, I’ve even heard it argued that understanding the content of an opera can diminish the experience, since the actual meaning is often fairly bland and distracts from the simple appreciation of the complex sounds and interplay of the language in the hands (or lungs) of a world-class performer.

So what’s the equivalent in software? I think expressiveness and elegance are key. In particular, code that is able to express ideas without adding a lot of noise. I’m very partial to Haskell for this sort of thing - for instance the canonical quicksort implementation is wonderfully precise:

quicksort []        = []
quicksort (x:xs)    = quicksort less ++ [x] ++ quicksort greater
    where less      = [ y | y <- xs, y < x ]
          greater   = [ y | y <- xs, y >= x ]

If you know the quicksort algorithm, then the 2nd line of code there is about as precise an expression of the underlying concept as you could hope for. If you write the same algorithm in C or Visual Basic, I believe that you can objectively distinguish the ‘beauty’ of these representations of the same underlying concept. This is only possible if the representations do indeed have the quality of beauty.

Another, perhaps even better, example is the naive-recursive Fibonacci generator in the same language, which is remarkably close to the mathematical definition:

(from literateprograms.org)

fib n
    | n == 0    = 0
    | n == 1    = 1
    | n > 1     = fib(n-1) + fib(n-2)

Note I haven’t read the actual book under review here, and I have no reason to doubt the assertions that the book doesn’t deliver. I do, however, take umbrage at the statement that code (or language) cannot be beautiful.

Extending the Technical Debt Metaphor

Thursday, February 21st, 2008

A few months ago, the inestimable Steve McConnell (he of Code Complete fame) wrote about technical debt. McConnell looks to extend the metaphor beyond the simple idea of ‘code that is going to be a liability in the future’, identifying two main types of technical debt (deliberate and accidental), and identifying further correlations between the worlds of financial debt and technical debt.

For instance, based on the technical debt already accumulated, one team may have a worse ‘credit rating’ than another:

Different teams will have different technical debt credit ratings. The credit rating reflects a team’s ability to pay off technical debt after it has been incurred.

(McConnell, 2007)

There is a lot of insight in McConnell’s article, and I recommend you nip over and read it right now if you haven’t already. Technical debt is indeed a useful and rich analogy for communicating a particular class of technical problem to non-technical users.

I wonder, however, if McConnell hasn’t extended the metaphor in slightly the wrong direction. When considering technical debt, I like to think of the product managers as the debtors, and the development team as the creditors. The actual underlying concept remains the same, it’s just a shift in responsibilities.

Why?

As a developer, I don’t always get to make the decisions about whether something should be done in a quick ‘n’ dirty hack, or a properly-architected solution. Of course, I’m likely to recommend the latter where I can, but it’s a fact of life that I will often be overruled, and rightly so. There are occasions when incurring technical debt is the right thing to do. McConnell lists a few examples, e.g.:

Time to Market. When time to market is critical, incurring an extra $1 in development might equate to a loss of $10 in revenue. Even if the development cost for the same work rises to $5 later, incurring the $1 debt now is a good business decision.

(McConnell, 2007)

This is a key issue. Software development considerations are not the be-all and end-all, no matter how much I (or any other developer) would like them to be. It’s the product teams that make these business decisions, however, and therefore it should be the product teams that incur the debt.

As developers, we are the ones who give the product guys what they want, and we take on the risk of that debt not being repaid, and that’s why we are the creditors.

So what does this mean? It means that, when considering whether to create some additional technical debt, it’s the product team that should have a credit rating. Have they been making quick-win decisions excessively over the last six months? Well then, maybe they’re at their credit limit, and cannot incur any more debt until they have used some of their budget on a project that reduces debt.

How about if a product manager hasn’t incurred any debt recently, but made a load of bandito decisions on a major project a year ago, and now the codebase is starting to feel the impact? Charge them interest on the debt, so that now it will cost more of their budget to pay off their debt. This is entirely fair, since with a longstanding debt it is often the case that more code has been built on top of it in the interim, which may have been written well but is inherently unstable due to the shaky foundations. Paying off the debt in full will involve refactoring this new code, too.

Of course, you need a fairly enlightened product team if this metaphor is to be accepted, not to mention significant buy-in from senior management if you are seriously at risk of jeopardising the product roadmap by sticking to your guns. However, since the technical debt metaphor is something of a meme at the moment, why not suggest it? If the technical debt metaphor really does improve understanding on the part of non-technical stakeholders, maybe it isn’t a hopeless daydream that they’ll also accept the logical extensions of the idea.

Reporting on NCover Exclusions

Wednesday, February 20th, 2008

On a recent project, my team was set the task of achieving 100% unit test pass-rate and code coverage. If you’ve ever been in this position, you’ll know it’s a double-edged sword - whilst it’s great when the Powers That Be embrace quality instead of fixating, limpet-like, on the next deadline, it can be a nightmare when that percentage figure on the weekly summary becomes the new focus for managerial concentration, especially given how difficult it can be to hit 100%.

The problem is that achieving the magical 100% is, in many cases, neither practical nor particularly useful. It can even be a problem, if the warm fuzzy feeling you get when you see “Coverage: 100%” leads to complacency. Even with 100% coverage and pass-rate, you don’t necessarily have quality software.

Our high-level project architecture involved a .Net client talking to a suite of web services written in Java. The .Net client, as an application with a GUI and a web service proxy, contained a great deal of generated code and was my main concern when the targets were set.

Now, it’s my belief that in most cases there’s no benefit to writing tests for generated code (unless you also wrote the generator). Unless you have a very, very good reason not to, you should trust that the tools are doing their job and generating sane code. That’s what they’re there for. If the tools are flaky, you probably shouldn’t use them at all - though I suppose that if you sometimes fell foul of a particular bug you could write a test to detect it[1].

The cause of my concern was that the UI and web reference code accounted for about 30-35% of the SLOC in the application, and so any coverage report that covered the whole app would be way short of the targets we were set. There are a number of ways to deal with this:

  1. Bite the bullet and write tests for everything. That includes InitializeComponent(), drag ‘n’ drop handlers, and the sync and async versions of every web service stub. Best of luck, and see you in 2017[2].
  2. Explain patiently that some code does not need testing (or at least, is on the wrong side of the productivity bell curve and subject to massively diminishing returns in terms of effort/value). Of course, then you’ll be asked to prove that you’re not pulling a fast one and that the delta of your target and actual coverage percentage can be accounted for entirely by generated code. This will be tricky if you count SLOC for the generated code and use decision points for your test coverage, and maintaining this is another administrative task that you probably don’t want to do.
  3. Separate your code such that some assemblies contain only generated code, and the rest contain only business logic. Then exclude the former from your test suite so they don’t show on the coverage report. This is probably achievable, though it can lead to some fairly hideous contortions to maintain the boundary, and can even result in sensible design decisions being discarded in favour of wacky ones that have no redeeming feature other than supporting your arbitrary separation rules.
  4. Swear indiscriminately and refuse. Then clear your desk, probably.

None of those appealed, so we set out to find another approach. What we wanted was a more flexible variant of option 3, where we could exclude methods or classes without having to exclude the whole assembly. If we could exclude code at a fairly granular level, then it became both more realistic and useful to aim for 100% coverage of our actual business code, using all the normal techniques.

It turns out that code exclusion isn’t so tough - NCover will ignore methods and classes tagged with an attribute named CoverageExclude in the global namespace[3].

This still requires a little discipline - for example making sure that if Joe marks a class as excluded, Jim doesn’t add some business logic to that class a week later without removing the attribute - but nothing that can’t easily be dealt with in regular code reviews.

The Powers That Be are wily, alas, and when we pitched the idea to them they approved in principle but were wary of allowing bits of code to be arbitrarily dropped off the coverage reports. If a class was excluded, who excluded it and why?

This seemed reasonable for accountability. The information would be available in the source check-in notes, but that’s a bit fiddly since you don’t know when the attribute was added; our source control system doesn’t have anything analogous to Subversion’s ‘blame’, so you have to go rummaging through a potentially very long version history. A better solution would be to find a way to add the information directly to the coverage report, so that it’s right there for all to see. So, how?

The first step was to get the appropriate metadata into the code. The reference implementation for the CoverageExclude attribute is as follows:

public class CoverageExcludeAttribute : Attribute { }

We wanted to capture additional information when the attribute was used, however, so we added a couple of read-only properties and did away with the default constructor.

public class CoverageExcludeAttribute : Attribute
{
    private string m_author;
    private string m_reason;

    public CoverageExcludeAttribute(string reason,
        string author)
    {
        this.m_reason = reason;
        this.m_author = author;
    }

    public string Author
    {
        get { return this.m_author; }
    }

    public string Reason
    {
        get { return this.m_reason; }
    }
}

Now, when anyone uses the attribute, the compiler forces them to add some additional data.

[CoverageExclude("No testable code here, buster", "John Q Dev")]
public void MethodToBeExcluded(int x, int y)
{
    // …
}

NCover can be told to pay attention to this attribute with the excludeAttributes parameter, as explained here.

With the easy bit out of the way, the next task was to report on these exclusions. Our build system, after running the test suite, used NCoverExplorer to generate a summary report. You can tell NCoverExplorer to list exclusions in reports, so we figured that would be a good place to start. The appropriate NAnt incantation is:

<ncoverexplorer failonerror="false"
  program="C:\NCoverExplorer\NCoverExplorer.Console.exe"
  projectName="Atmosphere Processor::LV426"
  reportType="4"
  xmlReportName="Report.xml"
  mergeFileName="CoverageMerge.xml"
  showExcluded="True"
  satisfactoryCoverage="80" >

  <fileset>
    <include name="Coverage.xml"/>
  </fileset>
  <exclusions>
    <exclusion type="Assembly" pattern="*Tests" />
    <exclusion type="Assembly" pattern="*Fixtures*" />
  </exclusions>
</ncoverexplorer>

Note the reportType and showExcluded attributes, which specify the summary report we want, with details of excluded code appended to the report. Note also the exclusion nodes, which specify that we want our test assemblies excluded from coverage metrics. The report will include a table like this:

Our goal was to somehow get our custom properties (Author and Reason) into this report. To do so, firstly we needed to modify the above table with two extra columns to hold this custom data. NCoverExplorer ships with a stylesheet called CoverageReport.xsl; the table modification was achieved by tweaking the ‘exclusions summary’ section as follows:

<!-- Exclusions Summary -->
<xsl:template name="exclusionsSummary">
  <tr>
    <td colspan="5">&#160;</td>
  </tr>
  <tr>
    <td class="exclusionTable mainTableHeaderLeft"
      colspan="1">
      Excluded From Coverage Results</td>
    <td class="exclusionTable mainTableGraphHeader"
      colspan="1">
      All Code Within</td>
    <td class="exclusionTable mainTableGraphHeader"
      colspan="2">
      Reason For Exclusion</td>
    <td class="exclusionTable mainTableGraphHeader"
      colspan="1">
      Developer</td>
  </tr>
  <xsl:for-each select="./exclusions/exclusion">
    <tr>
      <td class="mainTableCellBottom exclusionTableCellItem"
          colspan="1">
        <xsl:value-of select="@name" /></td>
      <td class="mainTableCellBottom mainTableCellGraph"
          colspan="1">
        <xsl:value-of select="@category" /></td>
      <td class="mainTableCellBottom mainTableCellGraph"
          colspan="2">
        <xsl:value-of select="@reason" /></td>
      <td class="mainTableCellBottom mainTableCellGraph"
          colspan="1">
        <xsl:value-of select="@author" /></td>
    </tr>
  </xsl:for-each>
</xsl:template>

The next step was to actually inject our custom data into the report. This was a two-stage process:

  1. Use reflection to iterate through the application assemblies, looking for anything tagged with our attribute
  2. Open the report data file generated by NCoverExplorer and shoehorn our new data into it.

We created a simple little post-processor application to perform this work. To complete stage 1, we needed to iterate through a directory of assemblies, loading each one in turn. In each assembly, we iterated through the types contained therein, and looked for our custom attribute on each one. Then, we iterated through the methods on each type, and looked for the custom attribute there too. This is actually very simple - the code skeleton looks like this:

foreach (FileInfo assemblyFile in assemblies)
{
    try
    {
        // Attempt to load the file as an assembly, and grab  
        // all the types defined therein
        Assembly assembly = Assembly.LoadFrom(
            assemblyFile.FullName);
        Type[] types = assembly.GetTypes();

        // Spin through the types, looking for classes and 
        // methods tagged with CoverageExclude
        foreach (Type type in types)
        {
            object[] classAttributes = type.GetCustomAttributes(
                typeof(CoverageExcludeAttribute), false);
            foreach (CoverageExcludeAttribute classAttribute
                    in classAttributes)
            {
                // …
            }

            MethodInfo[] methods = type.GetMethods(
                BindingFlags.Public |
                BindingFlags.NonPublic |
                BindingFlags.Instance |
                BindingFlags.Static);
            foreach (MethodInfo method in methods)
            {
                object[] methodAttributes =
                    method.GetCustomAttributes(
                    typeof(CoverageExcludeAttribute), false);
                foreach (CoverageExcludeAttribute methodAttribute
                    in methodAttributes)
                {
                    // …
                }
            }
        }
    }
    catch (Exception ex)
    {
        // Probably not a .Net assembly, do some appropriate 
        // complaining to the user
    }
}

In the loops, we cached the fully-qualified names of the types and methods tagged with the attribute.
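
To give a flavour of what those loop bodies might do, here is a self-contained sketch that scans a couple of sample types and caches what it finds in a dictionary keyed by name. The sample classes, the ExclusionScanner name, and the "Type.Method" key format are all assumptions made for the example; the names NCoverExplorer actually writes into its report may be formatted differently.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class CoverageExcludeAttribute : Attribute
{
    private readonly string m_reason;
    private readonly string m_author;

    public CoverageExcludeAttribute(string reason, string author)
    {
        m_reason = reason;
        m_author = author;
    }

    public string Reason { get { return m_reason; } }
    public string Author { get { return m_author; } }
}

// Hypothetical sample types to scan.
[CoverageExclude("Generated proxy code", "John Q Dev")]
public class SampleProxy
{
    public void Call() { }
}

public class SampleLogic
{
    [CoverageExclude("Designer plumbing", "Jane Q Dev")]
    private void InitializeComponent() { }

    public int Add(int x, int y) { return x + y; }
}

public static class ExclusionScanner
{
    // Maps "Type" or "Type.Method" names to the attribute found there.
    // Overloaded methods share a key here; a fuller version would need
    // to include parameter types in the key.
    public static Dictionary<string, CoverageExcludeAttribute> Scan(IEnumerable<Type> types)
    {
        Dictionary<string, CoverageExcludeAttribute> found =
            new Dictionary<string, CoverageExcludeAttribute>();
        BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.Instance | BindingFlags.Static | BindingFlags.DeclaredOnly;

        foreach (Type type in types)
        {
            foreach (CoverageExcludeAttribute classAttribute in
                type.GetCustomAttributes(typeof(CoverageExcludeAttribute), false))
            {
                found[type.FullName] = classAttribute;
            }

            foreach (MethodInfo method in type.GetMethods(flags))
            {
                foreach (CoverageExcludeAttribute methodAttribute in
                    method.GetCustomAttributes(typeof(CoverageExcludeAttribute), false))
                {
                    found[type.FullName + "." + method.Name] = methodAttribute;
                }
            }
        }
        return found;
    }

    public static void Main()
    {
        Type[] types = new Type[] { typeof(SampleProxy), typeof(SampleLogic) };
        foreach (KeyValuePair<string, CoverageExcludeAttribute> entry in Scan(types))
        {
            Console.WriteLine(entry.Key + ": " + entry.Value.Reason +
                " (" + entry.Value.Author + ")");
        }
    }
}
```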

Stage 2 was implemented by tweaking the XML data file NCoverExplorer generates for the report. This is straightforward too - suck the report into an XmlDocument, grab the exclusion nodes, and add a couple of attributes to each one. All the types and methods were already listed since we’d set the excludeAttributes parameter in the NAnt configuration (see above).

Therefore, all we needed to do was match up the FQNs we cached in stage 1 with the nodes already in the report:

XmlDocument doc = new XmlDocument();
doc.Load(coverageFile.FullName);

// Go through all the exclusion nodes and try to match 
// them up against the exclusion data we have sucked 
// out of the assemblies
foreach (XmlNode node in doc.SelectNodes("//exclusion"))
{
    switch (node.Attributes["category"].Value)
    {
        case "Class":
            // Find and remove the first exclusion reason for 
            // this class
            FindExclusionAndModifyNode(exclusions.ClassExclusions,
                node);
            break;
        case "Method":
            // Find and remove the first exclusion reason for
            // this method
            FindExclusionAndModifyNode(exclusions.MethodExclusions,
                node);
            break;
        default:
            // Exclusion at either assembly or namespace level
            break;
    }
}

The implementation of FindExclusionAndModifyNode simply loops through the cached FQNs to see if we have data that corresponds to the current node, and if so it creates two new attributes - one containing the name of the developer that added the CoverageExcludeAttribute to the code, and another containing their justification for doing so. Then the modified XmlDocument is written out to disk, overwriting the original.
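
For the curious, a method along those lines might look something like the sketch below. This is not the original code: it uses a dictionary lookup keyed on the node’s name attribute rather than a loop, and the ExclusionInfo class and the sample XML in Main() are invented for the example, so the real NCoverExplorer report format may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical holder for the data cached in stage 1.
public class ExclusionInfo
{
    public readonly string Reason;
    public readonly string Author;

    public ExclusionInfo(string reason, string author)
    {
        Reason = reason;
        Author = author;
    }
}

public static class ReportPatcher
{
    // Adds reason/author attributes to the node if we have cached data
    // for its name. Returns true if the node was annotated.
    public static bool FindExclusionAndModifyNode(
        IDictionary<string, ExclusionInfo> exclusions, XmlNode node)
    {
        XmlElement element = node as XmlElement;
        if (element == null || !element.HasAttribute("name"))
        {
            return false;
        }

        string name = element.GetAttribute("name");
        ExclusionInfo info;
        if (!exclusions.TryGetValue(name, out info))
        {
            return false;
        }

        element.SetAttribute("reason", info.Reason);
        element.SetAttribute("author", info.Author);
        exclusions.Remove(name); // each cached entry is used once
        return true;
    }

    public static void Main()
    {
        // Invented sample data; element and attribute names follow the
        // XPath expressions used in the stylesheet.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<report><exclusions>" +
            "<exclusion name=\"SampleProxy\" category=\"Class\" />" +
            "<exclusion name=\"*Tests\" category=\"Assembly\" />" +
            "</exclusions></report>");

        Dictionary<string, ExclusionInfo> cached = new Dictionary<string, ExclusionInfo>();
        cached["SampleProxy"] = new ExclusionInfo("Generated proxy code", "John Q Dev");

        foreach (XmlNode node in doc.SelectNodes("//exclusion"))
        {
            FindExclusionAndModifyNode(cached, node);
        }
        Console.WriteLine(doc.OuterXml);
    }
}
```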

The end result is a report that looks something like this, with all the excluded code neatly documented to keep suspicious managers happy.

Since the post-processor was written as a simple command-line application, we could create a custom NAnt task for it and integrate the whole process seamlessly with our continuous integration setup.

[1] I’ve seen it happen a few times before, for example an XML generator (which shall remain nameless) that occasionally ‘forgot’ our custom namespace and used a default, which caused parsers of that XML to scream in agony. It’s rare though, unless you regularly dig up tools from CodeProject and use them in your production code, in which case you deserve everything you get ;-)

[2] Written in 2008. So if you’re reading this on December 31st 2016, adjust accordingly and don’t come crying to me.

[3] Yes, I know that NCover 2.x has built-in regex-based exclusions that do all this, but a) not everyone has an NCover 2.x pro licence, and b) we weren’t using NCover 2.x as it hadn’t been released at the time.

Descent Into Incompetence

Saturday, February 2nd, 2008

I am fairly heavily involved with recruitment where I work, being the author of the technical test and phone screen questions we use for evaluating candidates, and conducting face-to-face interviews with many of the hopefuls that get over these early hurdles.

Naturally, in order to gain these responsibilities I have gone through a number of required HR ass-covering exercises in which it was drilled into me that I am legally forbidden from asking questions about sexuality, marital status, family-planning, and anything else which might lead me into rejecting a candidate on grounds our beloved government considers discriminatory.

Never mind that I have never shown the least inclination to discriminate against someone because they might want to possibly think about maybe taking some [mp]aternity leave in the next 30 years, or (gasp) prefer the company of their own gender, or whatever; I have to go through all this training so that the company can throw me to the wolves if a candidate claims to have been discriminated against. “Not our fault, guvnor; we explained the rules”.

Still, fair enough I suppose; we live in litigious times, and not being a bigot I have no particular fears of transgressing.

But what if the rules are changed? And what if they’re changed in horribly unexpected ways? A recent article on the BBC News site contained, quite without fanfare, some shocking intelligence.

Previously standard questions about age, length of experience and religious views are now illegal, [Which?] points out.

Wait, what? Length of experience is now a forbidden topic? So if I’m recruiting a senior developer or team lead, I now have to waste valuable time interviewing fresh-out-of-college tyros who haven’t written a single line of commercial code or spent a single day working in a professional team?

I can kind of see what is trying to be achieved here, but it is an unavoidable fact that experience is a vital attribute for many senior roles, and needs to be taken into consideration when trying to fill those roles. It’s not just me either - a quick trawl through the endless agency emails I seem to get every day (despite telling them I’m not on the market) reveals that most tech jobs are still specifying n years of experience; this seems somewhat pointless now that candidates can’t be asked about it. I wonder if they know?

Even more interesting is the fact that many contract positions are still paid at ‘rates negotiable on experience’. Hah, how does that work when experience is a forbidden subject? If I were graduating from university this year I’d be whoring myself around the City applying for £500-per-day contracting gigs and suing any bank that dared ask me to justify my rate.

Rob Grant’s novel Incompetence just became slightly less hysterical.

Article 13199 of the Pan-European Constitution: “No person shall be prejudiced from employment in any capacity, at any level, by reason of age, race, creed or incompitence (sic).”