Archive for the ‘.Net’ Category

Lexical Closures in C# 3.0

There’s a slightly weird article up on Dobbs Code Talk this week, speculating that aggregate functions are “the next big programming language feature” after closures. The slight weirdness comes from the fact that both features have been around for decades, and not just in dusty academic languages either.

Still, there’s some interesting discussion in the comments about whether .Net’s closures are proper first-class lexically-scoped closures. The answer is yes – but with a fun twist.

The twist has been around for a long time – Brad Abrams blogged about it way back in 2004, for instance – but it’s probably worth going over it again, since the recent arrival of LINQ and lambda syntax in C# 3.0 will presumably lead to more people being bitten by this as the use of closures becomes more mainstream.

A key thing to remember is that C# lambdas are just anonymous delegates in skimpy syntax. Behind the scenes the compiler turns them into classes – if you were looking at disassembled MSIL you wouldn’t be able to tell whether the code was written with lambda syntax or anonymous delegate syntax. Therefore, the issue discussed by Brad has not gone anywhere.

Let’s revisit the problem, with a 2008 sheen applied (i.e. I’ll use lambda syntax rather than anonymous delegate syntax). What does the following code display?

Func<int>[] funcs = new Func<int>[10];
for (int i = 0; i < 10; ++i)
{
    funcs[i] = () => i * i;
}

Array.ForEach(funcs, f => Console.WriteLine(f()));

If you answered something along the lines of “prints the square of every number between 0 and 9” you’d be… wrong. Really, try it out. See? It prints 100, ten times over.

Now, a lexical closure is supposed to capture its environment, meaning that the lambda stored on the first loop would capture i when i==0, the second loop would capture i when i==1, and so on. If this happened, then executing all the lambdas would indeed result in the squares of the numbers 0-9 being printed. So what gives?

The problem stems from the fact that the lambda is binding itself to a variable that is accessible outside the closure, which is being changed in every iteration of the loop. The closure doesn’t capture the value of i, it captures a reference to i itself, which is mutable.

You could actually make a case that this is bad code anyway, since it gives two responsibilities to the loop index – control the loop, and act as data in the closures. If we were being pedantic, we could split the responsibilities by creating a new variable, j, to be the closure data each iteration, and let i concentrate on being an index:

for (int i = 0; i < 10; ++i)
{
    int j = i;
    funcs[i] = () => j * j;
}

Lo and behold, the code now works! Pedantry rules! Take a look with Reflector or ildasm to see what’s going on here. The executive summary is that the compiler captures the environment (i in the first example, j in the second) by creating a member variable within the class it generates for the closure. Previously, since the same instance of i lived for the entire duration of the loop, only one instance of the generated class was created and shared. Now, however, a new instance of the generated class is created in each iteration of the loop (since j is scoped within the loop body and thus we have a new j every time round). Thus, the data is not shared and we get the expected output.
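If you don’t fancy firing up Reflector, here is a hand-written approximation of what the compiler does for the two loops. Treat it as a sketch: the SquareClosure class and its members are names I’ve made up for illustration, whereas the real generated class has a compiler-chosen name and its exact shape varies between compiler versions.

```csharp
using System;

// Hand-written stand-in for the compiler-generated closure class.
// The name and layout are illustrative only.
class SquareClosure
{
    public int value;                       // the captured variable lives here
    public int Square() { return value * value; }
}

static class ClosureDemo
{
    // Mirrors the first loop: one closure object shared by every delegate.
    public static Func<int>[] SharedCapture()
    {
        Func<int>[] funcs = new Func<int>[10];
        SquareClosure shared = new SquareClosure();    // created once, outside the loop
        for (shared.value = 0; shared.value < 10; ++shared.value)
        {
            funcs[shared.value] = shared.Square;
        }
        return funcs;                                   // shared.value is now 10
    }

    // Mirrors the second loop: a fresh closure object per iteration.
    public static Func<int>[] PerIterationCapture()
    {
        Func<int>[] funcs = new Func<int>[10];
        for (int i = 0; i < 10; ++i)
        {
            SquareClosure fresh = new SquareClosure();  // new instance each time round
            fresh.value = i;
            funcs[i] = fresh.Square;
        }
        return funcs;
    }

    static void Main()
    {
        Console.WriteLine(SharedCapture()[3]());        // 100
        Console.WriteLine(PerIterationCapture()[3]());  // 9
    }
}
```

In the shared version every delegate points at the same object, so they all see the final value of 10. In the per-iteration version each delegate has its own copy.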

There are two important points to consider here:

  1. The problem goes away if you write code more declaratively. Do away with the clunky for loop and everything works OK.
    Enumerable.Range(0, 10).Select(x => x * x);
  2. It isn’t always bad that multiple closures can capture a reference – since one closure can ‘see’ updates made to the shared data by another closure, you could use this as a coordination mechanism.
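As a quick illustration of the second point, here is a minimal sketch of two closures sharing one captured variable. SharedCounterDemo and MakeCounter are hypothetical names, not anything from the framework.

```csharp
using System;

static class SharedCounterDemo
{
    // Hands back two delegates that close over the same local variable.
    public static void MakeCounter(out Action increment, out Func<int> read)
    {
        int count = 0;                  // captured by both lambdas below
        increment = () => count++;
        read = () => count;
    }

    static void Main()
    {
        Action increment;
        Func<int> read;
        MakeCounter(out increment, out read);
        increment();
        increment();
        Console.WriteLine(read());      // 2
    }
}
```

Nothing here is thread-safe, so if the closures run on different threads you would still need a lock or Interlocked around the shared variable.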

This is not an issue that’s going to crop up every day – the example above is fairly contrived – but knowing about it will save some painful debugging sessions when you inevitably do run into it. The fix is always to take a local copy of the mutable data inside the loop, which forces the compiler to create a separate instance of the generated closure class for each iteration.

Simple, yes? ;-)


C# 3.0, Parallel LINQ, And The Betfair API – An Introduction

My pal Jan has a habit of waxing lyrical about the wonders of Parallel LINQ (PLINQ) as soon as you make the mistake of mentioning multithreading within earshot. I’ve been playing around with .Net 3.5 recently, and I write a lot of async code day-to-day when struggling to keep desktop webservice clients responsive when making lots of webservice calls, so I thought it high time I took a closer look.

The Problem

A key goal for the kind of async work I do is to batch multiple calls up, so that I get all the responses at once. This is important for keeping the rest of the code clean. To illustrate, imagine you are writing an application against the Betfair API, and you have a screen that displays a market, your current profit and loss on that market, and your unmatched bets on that market. To populate this screen will require four API calls – getMarket(), getMarketPrices(), getMarketProfitAndLoss(), and getCurrentBets().

Now, the worst (though easiest) thing to do is make the four calls sequentially on the UI thread. The problem with this is it’s slow, and the UI freezes during the process (since you’re blocking on the UI thread), which is a lousy user experience.

A slightly better approach is to spin off a thread, and make the four calls there, raising an event on completion. This gets all the work off the UI thread and therefore keeps the application responsive, but it’s still slow as the calls are still sequential.

To speed it up, you can create a thread per call (so four threads in this case). There’s a whole lot of complexity around working out the optimum number of threads to use (depending on how many processors you have, how many simultaneous connections you are allowed to open, etc) but that’s a bit beyond the scope of this post, so for now we’ll go with the one-thread-per-task approach and assume it’s optimal.

So, each thread makes one webservice call, and raises an event to signify that it’s finished. Simple, right? Unfortunately, this can lead to some real headaches in collating the data.

Imagine a user has hundreds of bets on the market, and therefore the getCurrentBets() call takes a bit longer to execute than the other three. The user clicks on a market, and the threads responsible for getting market data and P&L raise their events quickly, so you display the screen with the data you have and plan to display the bets as and when they arrive.

Before the bets are received, however, the user clicks on another market. Again, the market data and P&L come back quickly and you display them. Then, finally, the original getCurrentBets() call completes. But wait! You’ve moved onto another market now, so you don’t care about those bets any more! So you have to write some code to make sure that each piece of data received is still relevant. This can become very onerous very quickly, as you struggle to determine your UI state and work out what data you want and what should be discarded.

Now imagine that your application has timers firing all over the place to update prices and P&L on the market every second or two, so you have events being raised all the time.

I’ve worked with code that ventured down this path, and believe me, you don’t want to go there.

The Solution

The best approach is to batch these calls up, so that each happens on a separate thread, but only one event is raised – when all of the data has been received. That way, you can be sure that when you handle the event, all the data is consistent.

Since this is one of the things that PLINQ does for you, it seems like a good candidate for kicking the tyres, so to speak. First, though, I’ll do a quick run through of how to do this without PLINQ, for comparison’s sake. The task will be to display a list of all the Premiership matches available on Betfair at the time the code runs.

Take Out The Old

Betfair list Premiership matches grouped by fixture date, under the Barclays Premiership node in the event tree. It looks something like this:

Soccer
    English Soccer
        Barclays Premiership
            Fixtures 23 February
                Fulham v West Ham
                Liverpool v Middlesbrough
                ...
            Fixtures 24 February
                Blackburn v Bolton
                Reading v Aston Villa
            Fixtures 25 February
                Man City v Everton

The Barclays Premiership event node has an ID that doesn’t change (2022802), so I can jump straight to that node and save myself the bother of having to navigate the Soccer and English Soccer parent nodes.

I’ll assume you already know how to create Service References for Betfair’s global WSDL, and skip straight on to creating some useful helper methods. I need to be able to call getEvents(), obviously:

private GetEventsResp GetEvents(int parentEventID)
{
    return m_global.getEvents(
            MakeEventRequest(parentEventID)).Result;
}

private getEventsIn MakeEventRequest(int parentEventID)
{
    return new getEventsIn(new GetEventsReq()
        {
            header = new APIRequestHeader()
            {
                sessionToken = m_sessionToken
            },
            eventParentId = parentEventID
        });
}

If you’re not used to C# 3.0, this is taking advantage of object initialisers to create nested objects without having to create a bunch of extra local variables. You can write the exact same method without object initialisers like this:

private getEventsIn MakeEventRequest(int parentEventID)
{
    APIRequestHeader header = new APIRequestHeader();
    header.sessionToken = m_sessionToken;
    GetEventsReq req = new GetEventsReq();
    req.header = header;
    req.eventParentId = parentEventID;
    return new getEventsIn(req);
}

The first thing I need to do is get a list of fixture nodes. I can do this by asking for child events of the Premiership node, and filtering for the events that start with the word ‘Fixtures’. This can be achieved with a simple regex and a bit of normal LINQ:

private List<BFEvent> GetPremiershipFixtureEvents()
{
    return GetEvents(PREMIERSHIP).eventItems.Where(
        (ev, idx) => Regex.IsMatch(ev.eventName, "^Fixtures.*")
        ).ToList();
}

Assume PREMIERSHIP is a const int with the value 2022802. The Where() method works as a filter – you pass it a delegate, and it executes that delegate against each element of the sequence and returns a new sequence containing only the elements for which the delegate returned true. The ToList() call then copies that sequence into a list.

In this case, I’m creating the delegate with a lambda expression, which returns true for elements with an event name that is matched by the regex.

Now I’ve got the fixture events, I need to get the child events of each, which correspond to the actual matches. I want each call to be asynchronous so that they happen in parallel, rather than sequentially. I also want to wait for all calls to complete before continuing, so I use the WaitHandle.WaitAll() method:

private List<BFEvent> GetMatchEvents(
    List<BFEvent> fixtureDateEvents)
{
    List<BFEvent> matchEvents = new List<BFEvent>();
    var callbacks = (
        from ev in fixtureDateEvents
        select StartGetEvents(ev.eventId, matchEvents)
        ).ToList();
    WaitHandle.WaitAll(callbacks.ConvertAll(
                ar => ar.AsyncWaitHandle).ToArray());
    return matchEvents;
}

Here, the LINQ expression and the ConvertAll() method call are doing similar things – converting all elements of a list into another type. In the case of the LINQ expression, I am effectively obtaining a list of IAsyncResult objects by calling StartGetEvents() on each event in my list and storing the return value of each call. In the case of the ConvertAll() call, I am obtaining a list of WaitHandle objects by accessing the AsyncWaitHandle property of each IAsyncResult object in the list.

It is perfectly possible to replace the LINQ expression with a call to ConvertAll(), or the ConvertAll() call with another LINQ expression. Which one you use in cases like this is largely a matter of preference.

The StartGetEvents() method needs to make an asynchronous webservice call and append the results to the provided list. Since multiple threads are accessing the list, the write must be protected with a lock:

private IAsyncResult StartGetEvents(int parentEventID,
    List<BFEvent> matchEvents)
{
    return m_global.BegingetEvents(MakeEventRequest(parentEventID),
        delegate(IAsyncResult ar)
        {
            lock (matchEvents)
            {
                matchEvents.AddRange(
                    m_global.EndgetEvents(ar).Result.eventItems);
            }
        },
        m_global);
}

I am using an anonymous delegate for the callback here. All it does is lock the list and add the events contained in the response. Note that in production code you might want to be a bit more diligent about locking strategies and so on – I’ve written the code like this for conciseness, not necessarily for production-grade correctness.

Now the whole shebang can be invoked very simply:

var fixtures = GetPremiershipFixtureEvents();
GetMatchEvents(fixtures).ForEach(
        e => Console.WriteLine(e.eventName));

Note that the calling code is very clean and simple, and doesn’t care about threads or anything like that – all that async plumbing is nicely contained in the GetMatchEvents() and StartGetEvents() methods.

Bring In The New

So how can PLINQ help with this? Well, it lets me get rid of those GetMatchEvents() and StartGetEvents() methods, which contain all the fiddly async code and are easily the most complex methods in the code above.

First, I’ll create a simple task class which represents the task of getting events for a particular ID:

public class GetEventsTask
{
    private int m_parentEventID;
    private string m_sessionToken;

    public GetEventsTask(string sessionToken,
            int parentEventID)
    {
        m_sessionToken = sessionToken;
        m_parentEventID = parentEventID;
    }

    public List<BFEvent> GetEvents()
    {
        BFGlobalService svc = new BFGlobalServiceClient();
        APIRequestHeader header = new APIRequestHeader()
            { sessionToken = m_sessionToken };
        return new List<BFEvent>(svc.getEvents(
            new getEventsIn(new GetEventsReq()
            {
                eventParentId = m_parentEventID,
                header = header
            })).Result.eventItems);
    }
}

Once I’ve created an instance of this class, a call to GetEvents() will get me all the child events for the specified parent node.

To use PLINQ, all I have to do is create an array of these task objects – one per fixture date – and use the AsParallel() extension method to specify that I want the task processing done in parallel:

    GetEventsTask[] tasks = (
            from ev in fixtureDateEvents
            select new GetEventsTask(m_sessionToken, ev.eventId)
            ).ToArray();
    var taskResults = (
            from t in tasks.AsParallel()
            select t.GetEvents()
            ).ToList();

Neat, eh? Note that PLINQ will also take care of deciding the optimal number of threads, neatly sidestepping the work I alluded to earlier.

One wrinkle is that my PLINQ statement results in a list of lists, so I need to flatten it out before returning.

List<BFEvent> matchEvents = new List<BFEvent>();
taskResults.ForEach(results => matchEvents.AddRange(results));
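If the extra flattening step bothers you, LINQ’s SelectMany() can fold it into the query itself. The following is a minimal sketch rather than the code I actually ran: FakeTask is a made-up stand-in for GetEventsTask so that it runs without a Betfair session, and plain strings stand in for BFEvent objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for GetEventsTask: returns made-up match names instead of
// calling the Betfair API.
class FakeTask
{
    private readonly string m_fixture;
    public FakeTask(string fixture) { m_fixture = fixture; }

    public List<string> GetEvents()
    {
        return new List<string> { m_fixture + ": match A", m_fixture + ": match B" };
    }
}

static class FlattenDemo
{
    public static List<string> GetAllMatches(IEnumerable<FakeTask> tasks)
    {
        // SelectMany calls GetEvents() on each task and concatenates the
        // resulting lists into one sequence, so no second pass is needed.
        return tasks.AsParallel()
                    .SelectMany(t => t.GetEvents())
                    .ToList();
    }

    static void Main()
    {
        var tasks = new[] { new FakeTask("23 Feb"), new FakeTask("24 Feb") };
        foreach (string match in GetAllMatches(tasks))
        {
            Console.WriteLine(match);
        }
    }
}
```

Bear in mind that a parallel query makes no promise about the order of the results unless you ask for it, so don’t rely on the matches coming back grouped by fixture date.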

Obviously this is only scratching the surface, not only of PLINQ but of LINQ itself. Much more powerful expressions can be created with a little tweaking of the objects generated from the Betfair WSDL – but that’s a topic for another article.


Reporting on NCover Exclusions

On a recent project, my team was set the task of achieving 100% unit test pass-rate and code coverage. If you’ve ever been in this position, you’ll know it’s a double-edged sword – whilst it’s great when the Powers That Be embrace quality instead of fixating, limpet-like, on the next deadline, it can be a nightmare when that percentage figure on the weekly summary becomes the new focus for managerial concentration, especially given how difficult it can be to hit 100%.

The problem is that achieving the magical 100% is, in many cases, neither practical nor particularly useful. It can even be a problem, if the warm fuzzy feeling you get when you see “Coverage: 100%” leads to complacency. Even with 100% coverage and pass-rate, you don’t necessarily have quality software.

Our high-level project architecture involved a .Net client talking to a suite of web services written in Java. The .Net client, as an application with a GUI and a web service proxy, contained a great deal of generated code and was my main concern when the targets were set.

Now, it’s my belief that in most cases there’s no benefit to writing tests for generated code (unless you also wrote the generator). Unless you have a very, very good reason not to, you should trust that the tools are doing their job and generating sane code. That’s what they’re there for. If the tools are flaky, you probably shouldn’t use them at all – though I suppose that if you sometimes fell foul of a particular bug you could write a test to detect it[1].

The cause of my concern was that the UI and web reference code accounted for about 30-35% of the SLOC in the application, and so any coverage report that covered the whole app would be way short of the targets we were set. There are a number of ways to deal with this:

  1. Bite the bullet and write tests for everything. That includes InitializeComponent(), drag ‘n’ drop handlers, and the sync and async versions of every web service stub. Best of luck, and see you in 2017[2].
  2. Explain patiently that some code does not need testing (or at least, is on the wrong side of the productivity bell curve and subject to massively diminishing returns in terms of effort/value). Of course, then you’ll be asked to prove that you’re not pulling a fast one and that the delta of your target and actual coverage percentage can be accounted for entirely by generated code. This will be tricky if you count SLOC for the generated code and use decision points for your test coverage, and maintaining this is another administrative task that you probably don’t want to do.
  3. Separate your code such that some assemblies contain only generated code, and the rest contain only business logic. Then exclude the former from your test suite so they don’t show on the coverage report. This is probably achievable, though it can lead to some fairly hideous contortions to maintain the boundary, and can even result in sensible design decisions being discarded in favour of wacky ones that have no redeeming feature other than supporting your arbitrary separation rules.
  4. Swear indiscriminately and refuse. Then clear your desk, probably.

None of those appealed, so we set out to find another approach. What we wanted was a more flexible variant of option 3, where we could exclude methods or classes without having to exclude the whole assembly. If we could exclude code at a fairly granular level, then it became both more realistic and useful to aim for 100% coverage of our actual business code, using all the normal techniques.

It turns out that code exclusion isn’t so tough – NCover will ignore methods and classes tagged with an attribute named CoverageExclude in the global namespace[3].

This still requires a little discipline – for example making sure that if Joe marks a class as excluded, Jim doesn’t add some business logic to that class a week later without removing the attribute – but nothing that can’t easily be dealt with in regular code reviews.

The Powers That Be are wily, alas, and when we pitched the idea to them they approved in principle but were wary of allowing bits of code to be arbitrarily dropped off the coverage reports. If a class was excluded, who excluded it and why?

This seemed reasonable for accountability. The information would be available in the source check-in notes, but that’s a bit fiddly since you don’t know when the attribute was added, and our source control system doesn’t have anything analogous to Subversion’s ‘blame’, so you have to go rummaging through a potentially very long version history. A better solution would be to find a way to add the information directly to the coverage report, so that it’s right there for all to see. So, how?

The first step was to get the appropriate metadata into the code. The reference implementation for the CoverageExclude attribute is as follows:

public class CoverageExcludeAttribute : Attribute { }

We wanted to capture additional information when the attribute was used, however, so we added a couple of read-only properties and did away with the default constructor.

public class CoverageExcludeAttribute : Attribute
{
    private string m_author;
    private string m_reason;

    public CoverageExcludeAttribute(string reason,
        string author)
    {
        this.m_reason = reason;
        this.m_author = author;
    }

    public string Author
    {
        get { return this.m_author; }
    }

    public string Reason
    {
        get { return this.m_reason; }
    }
}

Now, when anyone uses the attribute, the compiler forces them to add some additional data.

[CoverageExclude("No testable code here, buster", "John Q Dev")]
public void MethodToBeExcluded(int x, int y)
{
    // ...
}

NCover can be told to pay attention to this attribute with the excludeAttributes parameter, as explained here.

With the easy bit out of the way, the next task was to report on these exclusions. Our build system, after running the test suite, used NCoverExplorer to generate a summary report. You can tell NCoverExplorer to list exclusions in reports, so we figured that would be a good place to start. The appropriate NAnt incantation is:

<ncoverexplorer failonerror="false"
  program="C:\NCoverExplorer\NCoverExplorer.Console.exe"
  projectName="Atmosphere Processor::LV426"
  reportType="4"
  xmlReportName="Report.xml"
  mergeFileName="CoverageMerge.xml"
  showExcluded="True"
  satisfactoryCoverage="80" >

  <fileset>
    <include name="Coverage.xml"/>
  </fileset>
  <exclusions>
    <exclusion type="Assembly" pattern="*Tests" />
    <exclusion type="Assembly" pattern="*Fixtures*" />
  </exclusions>
</ncoverexplorer>

Note the reportType and showExcluded attributes, which specify the summary report we want, with details of excluded code appended to the report. Note also the exclusion nodes, which specify that we want our test assemblies excluded from coverage metrics. The report will include a table like this:

Our goal was to somehow get our custom properties (Author and Reason) into this report. To do so, firstly we needed to modify the above table with two extra columns to hold this custom data. NCoverExplorer ships with a stylesheet called CoverageReport.xsl; the table modification was achieved by tweaking the ‘exclusions summary’ section as follows:

<!-- Exclusions Summary -->
<xsl:template name="exclusionsSummary">
  <tr>
    <td colspan="5">&#160;</td>
  </tr>
  <tr>
    <td class="exclusionTable mainTableHeaderLeft"
      colspan="1">
      Excluded From Coverage Results</td>
    <td class="exclusionTable mainTableGraphHeader"
      colspan="1">
      All Code Within</td>
    <td class="exclusionTable mainTableGraphHeader"
      colspan="2">
      Reason For Exclusion</td>
    <td class="exclusionTable mainTableGraphHeader"
      colspan="1">
      Developer</td>
  </tr>
  <xsl:for-each select="./exclusions/exclusion">
    <tr>
      <td class="mainTableCellBottom exclusionTableCellItem"
          colspan="1">
        <xsl:value-of select="@name" /></td>
      <td class="mainTableCellBottom mainTableCellGraph"
          colspan="1">
        <xsl:value-of select="@category" /></td>
      <td class="mainTableCellBottom mainTableCellGraph"
          colspan="2">
        <xsl:value-of select="@reason" /></td>
      <td class="mainTableCellBottom mainTableCellGraph"
          colspan="1">
        <xsl:value-of select="@author" /></td>
    </tr>
  </xsl:for-each>
</xsl:template>

The next step was to actually inject our custom data into the report. This was a two-stage process:

  1. Use reflection to iterate through the application assemblies, looking for anything tagged with our attribute
  2. Open the report data file generated by NCoverExplorer and shoehorn our new data into it.

We created a simple little post-processor application to perform this work. To complete stage 1, we needed to iterate through a directory of assemblies, loading each one in turn. In each assembly, we iterated through the types contained therein, and looked for our custom attribute on each one. Then, we iterated through the methods on each type, and looked for the custom attribute there too. This is actually very simple – the code skeleton looks like this:

foreach (FileInfo assemblyFile in assemblies)
{
    try
    {
        // Attempt to load the file as an assembly, and grab  
        // all the types defined therein
        Assembly assembly = Assembly.LoadFrom(
            assemblyFile.FullName);
        Type[] types = assembly.GetTypes();

        // Spin through the types, looking for classes and 
        // methods tagged with CoverageExclude
        foreach (Type type in types)
        {
            object[] classAttributes = type.GetCustomAttributes(
                typeof(CoverageExcludeAttribute), false);
            foreach (CoverageExcludeAttribute classAttribute
                    in classAttributes)
            {
                // ...
            }

            MethodInfo[] methods = type.GetMethods(
                BindingFlags.Public |
                BindingFlags.NonPublic |
                BindingFlags.Instance |
                BindingFlags.Static);
            foreach (MethodInfo method in methods)
            {
                object[] methodAttributes =
                    method.GetCustomAttributes(
                    typeof(CoverageExcludeAttribute), false);
                foreach (CoverageExcludeAttribute methodAttribute
                    in methodAttributes)
                {
                    // ...
                }
            }
        }
    }
    catch (Exception ex)
    {
        // Probably not a .Net assembly, do some appropriate 
        // complaining to the user
    }
}

In the loops, we cached the fully-qualified names of the types and methods tagged with the attribute.

Stage 2 was implemented by tweaking the XML data file NCoverExplorer generates for the report. This is straightforward too – suck the report into an XmlDocument, grab the exclusion nodes, and add a couple of attributes to each one. All the types and methods were already listed since we’d set the excludeAttributes parameter in the NAnt configuration (see above).

Therefore, all we needed to do was match up the FQNs we cached in stage 1 with the nodes already in the report:

XmlDocument doc = new XmlDocument();
doc.Load(coverageFile.FullName);

// Go through all the exclusion nodes and try to match 
// them up against the exclusion data we have sucked 
// out of the assemblies
foreach (XmlNode node in doc.SelectNodes("//exclusion"))
{
    switch (node.Attributes["category"].Value)
    {
        case "Class":
            // Find and remove the first exclusion reason for 
            // this class
            FindExclusionAndModifyNode(exclusions.ClassExclusions,
                node);
            break;
        case "Method":
            // Find and remove the first exclusion reason for
            // this method
            FindExclusionAndModifyNode(exclusions.MethodExclusions,
                node);
            break;
        default:
            // Exclusion at either assembly or namespace level
            break;
    }
}

The implementation of FindExclusionAndModifyNode simply loops through the cached FQNs to see if we have data that corresponds to the current node, and if so it creates two new attributes – one containing the name of the developer that added the CoverageExcludeAttribute to the code, and another containing their justification for doing so. Then the modified XmlDocument is written out to disk, overwriting the original.
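For the curious, a method along those lines might look something like the sketch below. This is a reconstruction under assumptions, not our original code: the ExclusionInfo class, the ReportPatcher wrapper, the bool return value, and the choice of a List as the cache are all invented for illustration. The attribute names (name, reason, author) are the ones the stylesheet above reads.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical record of one [CoverageExclude] usage found via reflection.
class ExclusionInfo
{
    public string Name;     // fully-qualified type or method name
    public string Reason;
    public string Author;
}

static class ReportPatcher
{
    // Finds the first cached exclusion whose name matches the node's "name"
    // attribute, copies its reason and author onto the node, and removes it
    // from the cache so that it cannot be matched twice.
    public static bool FindExclusionAndModifyNode(
        List<ExclusionInfo> exclusions, XmlNode node)
    {
        XmlElement element = node as XmlElement;
        if (element == null || !element.HasAttribute("name")) return false;

        string name = element.GetAttribute("name");
        int index = exclusions.FindIndex(e => e.Name == name);
        if (index < 0) return false;

        element.SetAttribute("reason", exclusions[index].Reason);
        element.SetAttribute("author", exclusions[index].Author);
        exclusions.RemoveAt(index);
        return true;
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<exclusions><exclusion name=\"Acme.Widget\" category=\"Class\" /></exclusions>");

        var cache = new List<ExclusionInfo>
        {
            new ExclusionInfo { Name = "Acme.Widget", Reason = "Generated code", Author = "John Q Dev" }
        };

        FindExclusionAndModifyNode(cache, doc.SelectSingleNode("//exclusion"));
        Console.WriteLine(doc.OuterXml);
    }
}
```

Removing each match from the cache also gives you a handy sanity check: anything left over at the end was tagged in the code but never showed up in the report.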

The end result is a report that looks something like this, with all the excluded code neatly documented to keep suspicious managers happy.

Since the post-processor was written as a simple command-line application, we could create a custom NAnt task for it and integrate the whole process seamlessly with our continuous integration setup.

[1] I’ve seen it happen a few times before, for example an XML generator (which shall remain nameless) that occasionally ‘forgot’ our custom namespace and used a default, which caused parsers of that XML to scream in agony. It’s rare though, unless you regularly dig up tools from CodeProject and use them in your production code, in which case you deserve everything you get ;-)

[2] Written in 2008. So if you’re reading this on December 31st 2016, adjust accordingly and don’t come crying to me.

[3] Yes, I know that NCover 2.x has built-in regex-based exclusions that do all this, but a) not everyone has an NCover 2.x pro licence, and b) we weren’t using NCover 2.x as it hadn’t been released at the time.


Turbocharging .Net Webservice Clients

Since the first version of .Net and its associated toolset, Microsoft have sought to make it easy to write SOAP services and SOAP clients. And, generally, they have succeeded quite well. Whilst the open-source world has tended to prefer the simpler REST approach, MS (and Sun, and Apache) have done an admirable job of taking the large, complex SOAP protocol and making it reasonably straightforward to work with most of the time.

One of the areas in which things get somewhat less straightforward is high performance. Granted, most web services don’t have particularly eye-popping requirements in terms of hits or transactions, but occasionally you find an exception. Betfair, for instance, have an API that has peak rates in excess of 13,000 requests per second, many of which hit the database, with individual users making tens or even hundreds of requests per second. Betfair’s data changes at a breathtaking rate and there is a perceived advantage to getting hold of up-to-the-millisecond information.

To put that in context, the Digg Effect is estimated to peak at around 6K-8K hits per hour as I write in December 2007, which translates to a piffling couple of hits per second (8000 / 60 / 60 = 2.2). Even a more extreme prediction of 50K hits per hour is only around 14 hits per second, so we’re talking about handling three orders of magnitude more requests than Digg generates.

If you are writing a .Net client and want to get the best out of this sort of situation, you are hamstrung unless you learn a few tricks. Read on to learn five of the best. I’ll be using the Betfair API as an example throughout, but the techniques apply to any high-performance web service where the usage profile involves frequent small (<1KB) SOAP requests.

For those who normally turn to the back of the book for answers and don’t really care about the whys and wherefores, here’s the executive summary:

  1. Switch off Expect 100 Continue. This should be done in your App.config file (see below).
  2. Switch off Nagle’s Algorithm. This should be done in your App.config file (see below).
  3. Use multithreading. Unfortunately this is not a simple configuration file setting, it is a fundamental part of your application design.
  4. Remove the maximum connection bottleneck. This should be done in your App.config file (see below). This is vital if your application is multithreaded.
  5. Use gzip compression. .Net 2 has this built-in if you switch it on; .Net 1.1 needs a helping hand.

The XML snippet that configures these settings is shown below, and should be added to your App.config file:

<system.net>
    <connectionManagement>
        <add address="*" maxconnection="20" />
    </connectionManagement>
    <settings>
        <servicePointManager
            expect100Continue="false"
            useNagleAlgorithm="false"/>
    </settings>
</system.net>
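If you would rather not touch App.config, the same three settings can also be applied in code through the static System.Net.ServicePointManager class. What follows is a minimal sketch rather than a definitive recipe: the class name HighPerformanceDefaults and its Apply method are my own invention, and only the three ServicePointManager properties come from the framework.

```csharp
using System.Net;

// Hypothetical helper; only the ServicePointManager properties are real framework API.
public static class HighPerformanceDefaults
{
    public static void Apply(int maxConnectionsPerHost)
    {
        // Don't wait for "100 Continue" before sending the request body.
        ServicePointManager.Expect100Continue = false;

        // Don't buffer small writes while waiting for a TCP ACK.
        ServicePointManager.UseNagleAlgorithm = false;

        // Default number of simultaneous connections allowed per host.
        ServicePointManager.DefaultConnectionLimit = maxConnectionsPerHost;
    }
}
```

Call HighPerformanceDefaults.Apply(20) once at start-up, before the first web service request is made, since these are defaults and are best set before any connections exist.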

For those who share my distrust of unexplained sorcery, here are the gory details.

Expect-100 the Unexpected

RFC 2616 (the specification for HTTP 1.1) includes a request header and response code that together are known as ‘Expect 100 Continue’. When using the Expect header, the client will send the request headers and wait for the server to respond with a response code of 100 before sending the request body, i.e. splitting the request into two parts with a whole round-trip in-between.

Why would you ever want to do this? Well, imagine you are trying to upload a 101MB file to a web server with a 100MB file size limit. If you just submit the whole thing as one request, you’ll sit there for ages waiting for the upload to complete, only to have it fail right at the last minute when you hit the 100MB limit. Using Expect 100 Continue, you submit just the headers initially, and wait for the server’s permission to continue; in our example, this gives the server a chance to look at the Content-Length parameter and identify that your file is too big, and return an error code instead of permission to continue. This way you know the upload will fail, without needlessly transmitting a single byte of the large file.

By default, the .Net Framework uses the Expect 100 Continue approach. Most SOAP requests are not, however, anywhere near 101MB in size, and a server designed to deal with thousands of requests per second is not likely to be returning anything other than 100 Continue if the Expect header is sent. Depending on latency, the round-trip penalty may be unacceptable.

For example, Betfair’s Australian exchange (physically located in Australia) contains all their Australian markets, so if you’re in the UK and want to trade on the Australian Open tennis you’re subject to a round-trip time of about 350ms. If you allow your requests to be split into two, then you have two round-trips, so your request will take 700ms plus processing time.

You don’t want that, so switch it off. Note that the request headers and body are still sent separately, but the latter is no longer dependent on the response to the former (so they are shown as sent together in this diagram).

The setting corresponding to this in the XML snippet above is:

<servicePointManager
            expect100Continue="false"/>

Nagle’s Algorithm

Nagle’s algorithm is a low-level algorithm that tries to minimise the number of TCP packets on the network by filling each packet before sending it. Every packet carries around 40 bytes of headers (20 for TCP plus 20 for IP), so if you send a single byte of data you are putting 41 bytes on the wire to convey 1 byte of information. This 4000% overhead causes chaos on congested networks. The maximum packet size is configurable, but is often 1500 bytes, so Nagle’s algorithm buffers outgoing data in an attempt to send a small number of full packets rather than a huge number of mostly empty ones.

Nagle’s algorithm can be paraphrased in English like this:

If I have less than a full packet’s worth of data, and I have not yet received an acknowledgement from the server for the last data I sent, I will buffer new outbound data. When I get a server acknowledgement or have enough data to fill a whole packet, I will send the data across the network.
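The paraphrase above boils down to a single yes/no decision, which can be sketched in a few lines. This is only an illustration of the rule as I have worded it, not the code from any real TCP stack; the names NagleSketch, ShouldSendNow and the parameters are all made up for the example.

```csharp
// Hypothetical illustration of the rule paraphrased above; not a real TCP implementation.
public static class NagleSketch
{
    // bufferedBytes:   outbound data waiting to be sent
    // fullPacketBytes: how much data fills one packet
    // awaitingAck:     true if previously sent data has not been acknowledged yet
    public static bool ShouldSendNow(
        int bufferedBytes, int fullPacketBytes, bool awaitingAck)
    {
        // Send when there is a full packet's worth of data,
        // or when nothing is outstanding on the wire.
        // Otherwise keep buffering.
        return bufferedBytes >= fullPacketBytes || !awaitingAck;
    }
}
```

With a small request and an acknowledgement still outstanding, ShouldSendNow returns false, which is exactly the buffering that hurts on a long round-trip.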

Now, with our use case (frequent SOAP requests, each <1KB) a request doesn’t fill a packet, so requests are subject to buffering. Furthermore, as explained above, even with Expect 100 Continue switched off the request headers are still sent separately from the request body, so the body of the first request is buffered until the request headers reach their destination and the server sends back a TCP ACK.

Let us again consider a UK client communicating with Betfair’s Australian API. Your application issues two requests: one for current prices and one for your current market position (Req1 and Req2 in the diagram below). Together, these two requests come to less than 1500 bytes. The headers for the first request are transmitted, but Nagle’s algorithm buffers the request body, plus the whole second request, since they don’t fill a TCP packet.

Due to the round-trip latency, the acknowledgement of the first request’s headers is not received for 350ms, so that is how long the requests are buffered for. When the requests do get sent, they too are subject to 350ms latency around the world, so again you end up with around 700ms added to each of your Australian API calls.

Without Nagle, we dispense with the buffering and send out our smaller packets immediately, saving a round-trip.

One word of warning – disabling Nagle can cause problems if you are on a highly-congested network or have overworked routers and switches, since it increases the number of network packets flying around. If you don’t own your network, or are deploying your web service client to a large number of users, you might want to think about this carefully as you won’t be popular if your software saturates the network.

The setting corresponding to this in the XML snippet above is:

<servicePointManager
            useNagleAlgorithm="false"/>
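If the warning above makes you nervous about switching Nagle off across the board, one option is to switch it off only for the one host you care about. The sketch below uses ServicePointManager.FindServicePoint and the ServicePoint.UseNagleAlgorithm property from the framework; the PerHostNagle class and its DisableFor method are hypothetical names of my own, and any URL you pass in is up to you.

```csharp
using System;
using System.Net;

// Hypothetical helper; FindServicePoint and UseNagleAlgorithm are framework API.
public static class PerHostNagle
{
    public static ServicePoint DisableFor(Uri serviceUri)
    {
        // Look up (or create) the connection manager for this host only.
        ServicePoint servicePoint =
            ServicePointManager.FindServicePoint(serviceUri);

        // Other hosts keep whatever the global default is.
        servicePoint.UseNagleAlgorithm = false;
        return servicePoint;
    }
}
```

You would call this once, with the address of the web service, before issuing requests to it.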

Happenin’ Threads

This one should be pretty obvious. A single-threaded application can only do one thing at a time. A multi-threaded application can do multiple things at a time. So if, for example, you want to display some information and need to collate the results from four web service requests to build your view, calling them one after another is going to be slower than calling them all simultaneously (not to mention freezing your UI if you’re writing a WinForms application and making calls on the UI thread).

Explaining how to design and write multi-threaded .Net applications is way out of scope for this blog post, but any time you spend reading and learning about it either online or in books is going to be time well-spent. Go on, do it now. Also read the next section, otherwise your application will not get as much benefit as you expect.
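As a taster only, here is a minimal sketch of the "four requests at once" idea using plain threads. It is not a substitute for proper reading on the subject: the ParallelCalls class and RunAll method are hypothetical names, there is no error handling (an exception thrown by one of the calls will go unhandled on its thread), and in a real application each delegate would wrap one web service call.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs every delegate on its own thread and waits for all of them.
public static class ParallelCalls
{
    public static T[] RunAll<T>(Func<T>[] calls)
    {
        T[] results = new T[calls.Length];
        Thread[] threads = new Thread[calls.Length];

        for (int i = 0; i < calls.Length; ++i)
        {
            // Copy the loop variable so each lambda captures its own index.
            int index = i;
            threads[index] = new Thread(
                () => { results[index] = calls[index](); });
            threads[index].Start();
        }

        // Block until every call has finished.
        foreach (Thread thread in threads)
        {
            thread.Join();
        }

        return results;
    }
}
```

The total time is then roughly that of the slowest call rather than the sum of all of them, provided the framework actually lets the calls run in parallel.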

Walk and Chew Gum

.Net, by default, has a bottleneck on simultaneous web service calls against the same host. This one catches loads of people out – they write clever multi-threaded applications that issue many, many requests, and never realise that beneath the application layer a lot of their requests are called in sequence rather than in parallel.

Tsk, damn Microsoft for unnecessarily crippling their framework, right? Well no, actually, since this is an example of Microsoft following standards and recommendations and is to be applauded, lest they go back to their old ways of “embrace and extend”. The recommendation in question is from RFC 2616 again, and states:

A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.

(RFC 2616, section 8.1.4)

.Net uses HTTP 1.1 persistent connections by default. This is a good thing: establishing a TCP connection takes multiple round-trips, and you don’t want to pay that cost on every request, especially with long round-trip times. So MS have done the right thing and restricted each process to two simultaneous connections per host by default.

What does this mean for your application? It means that if you want to make 20 requests, and do so on 20 threads as recommended above, under the hood .Net only makes two at a time and queues the rest up. Therefore your 20 threads are wasted, as your throughput is no better than if you’d only created two threads, and innocent web servers are protected from overzealous clients.

When we know that we’re talking to an insane city-flattening Godzilla-on-steroids of a server, however, the two-connection limit is unreasonable and we can ramp-up safely. The corresponding setting in the XML snippet above is:

<connectionManagement>
        <add address="*" maxconnection="20" />
    </connectionManagement>

Sorted. You can, of course, change the number 20 to whatever is appropriate for your application.

Warning: Do not, under any circumstances, get into the habit of configuring this for all your web service applications. With all 20 threads issuing one request per second, you are exceeding the 14-request-per-second Digg Effect example I used above; with this technique and enough bandwidth your application will be quite capable of taking a lot of websites down, and those that manage to weather the storm will probably blacklist your IP address and/or close your account. Only use this if you are absolutely sure the server is up to it and such aggressive behaviour is permitted by the owners.
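One way to take that warning to heart is to raise the limit only for the single host you know can cope, and leave every other server on the polite default. The sketch below does this with ServicePointManager.FindServicePoint and the ServicePoint.ConnectionLimit property, both of which are framework API; the PerHostConnectionLimit class and its Raise method are hypothetical names for illustration.

```csharp
using System;
using System.Net;

// Hypothetical helper; FindServicePoint and ConnectionLimit are framework API.
public static class PerHostConnectionLimit
{
    public static ServicePoint Raise(Uri serviceUri, int limit)
    {
        // The ServicePoint manages connections to this one host.
        ServicePoint servicePoint =
            ServicePointManager.FindServicePoint(serviceUri);

        // Allow more simultaneous connections to this host only.
        servicePoint.ConnectionLimit = limit;
        return servicePoint;
    }
}
```

Requests to any other host still go through the default two-connection limit, so an accidental loop against somebody's small web server does far less damage.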

Small is Beautiful

The last step is to enable compression on your responses. Depending on the nature of the service you are using, the value of this tip may vary, but it’s likely to be of some benefit given the number of requests per second we are issuing. Of course, it is dependent on the web service actually supporting compression, but it’s been a standard feature of just about every web server for years, so this shouldn’t be a problem.

Let’s look at a worked example. The getMarketPrices response from the Betfair API is about 30KB of XML. If, as above, we have 20 threads issuing one request per second, and each thread is interested in prices from a different market, that’s about 600KB of data per second, which will quite easily saturate a lot of home broadband connections.

With gzip compression, however, each response comes down to about 5KB (XML compresses very well, generally, since it is just text with a lot of repetition and whitespace), so the 20 threads now demand a more manageable 100KB per second.
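If you want to see for yourself how well repetitive XML compresses, the sketch below gzips a made-up document using the GZipStream class that ships with .Net 2.0 and above. The GzipDemo class is hypothetical, and so are the element and attribute names in the sample XML: it is not Betfair's real response format, so the exact ratio you get will differ from the figures quoted above.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical demo class; GZipStream is framework API (.Net 2.0 and above).
public static class GzipDemo
{
    // Builds a repetitive XML document. The element and attribute names
    // are invented for illustration and are not Betfair's real schema.
    public static byte[] BuildSampleXml(int rows)
    {
        StringBuilder xml = new StringBuilder("<prices>\n");
        for (int i = 0; i < rows; ++i)
        {
            xml.Append("    <price selection=\"").Append(i)
               .Append("\" back=\"2.50\" lay=\"2.52\" />\n");
        }
        xml.Append("</prices>\n");
        return Encoding.UTF8.GetBytes(xml.ToString());
    }

    // Returns the gzip-compressed form of the given bytes.
    public static byte[] Compress(byte[] raw)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the final compressed block.
            using (GZipStream gzip =
                new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }
}
```

Compare GzipDemo.BuildSampleXml(500).Length with the length of the array returned by GzipDemo.Compress for the same input to see the saving.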

Great. So how do we use it? In .Net 2.0 and above it’s very easy – just set the EnableDecompression property of your web service proxy object (ignore IntelliSense, which incorrectly claims the value is true by default; it’s actually false by default, as stated on MSDN). For example, to get compressed responses from Betfair’s global server:

BFGlobalService service = new BFGlobalService();
service.EnableDecompression = true;

If you’re still using .Net 1.1, you have a bit more work to do, since support for gzip was inexplicably left out of the framework. First, you need to subclass the generated BFGlobalService proxy class, and override some key methods so you can a) include the Accept-Encoding header to tell the server that you understand gzip, and b) decompress the gzipped response before the XML deserializer sees it, otherwise it’ll choke.

using System;
using System.Net;

public class BFGlobalWithGzip : BFGlobalService
{
    /// <summary>
    /// Adds compression header to request header
    /// </summary>
    protected override System.Net.WebRequest
         GetWebRequest(Uri uri)
    {
         HttpWebRequest request = 
             (HttpWebRequest)base.GetWebRequest(uri);

         // Turn on compression
         request.Headers.Add("Accept-Encoding",
             "gzip, deflate");
         return request;
    }

    /// <summary>
    /// Decompress response before the Xml serializer gets
    /// its hands on it
    /// </summary>
    protected override WebResponse GetWebResponse(
        WebRequest request)
    {
        return new HttpWebResponseDecompressed(request);
    }

    /// <summary>
    /// Need to override this method if performing
    /// asynchronous calls, otherwise de-compression
    /// will not be performed and will throw an error
    /// </summary>
    protected override WebResponse GetWebResponse(
        WebRequest request, IAsyncResult result)
    {
        return new HttpWebResponseDecompressed(
            request, result);
    }
}

Next, implement the HttpWebResponseDecompressed class. This subclasses .Net’s WebResponse class and knows how to decompress a response if it has ContentEncoding ‘gzip’ or ‘deflate’:

using System;
using System.IO;
using System.Net;

public class HttpWebResponseDecompressed : WebResponse
{
    private HttpWebResponse m_response;
    private MemoryStream m_decompressedStream;

    public HttpWebResponseDecompressed(WebRequest request)
    {
        m_response = (HttpWebResponse)request.GetResponse();
    }

    public HttpWebResponseDecompressed(WebRequest request,
        IAsyncResult result)
    {
        m_response = (HttpWebResponse)
            request.EndGetResponse(result);
    }

    public override long ContentLength
    {
        get
        {
            if (m_decompressedStream == null
                || m_decompressedStream.Length == 0)
                    return m_response.ContentLength;
            else return m_decompressedStream.Length;
        }
        set { m_response.ContentLength = value; }
    }

    public override string ContentType
    {
        get { return m_response.ContentType; }
        set { m_response.ContentType = value; }
    }

    public override Uri ResponseUri
    {
        get { return m_response.ResponseUri; }
    }

    public override WebHeaderCollection Headers
    {
        get { return m_response.Headers; }
    }

    public override System.IO.Stream GetResponseStream()
    {
        Stream compressedStream = null;

        if (m_response.ContentEncoding == "gzip")
            compressedStream = new GZipInputStream(
                m_response.GetResponseStream());
        else if (m_response.ContentEncoding == "deflate")
            compressedStream = new InflaterInputStream(
                m_response.GetResponseStream());

        if (compressedStream != null)
        {
            m_decompressedStream = new MemoryStream();
            byte[] buffer = new byte[4096];
            int bytesRead;

            // Always ask for a full buffer; Read returns 0 at
            // the end of the stream
            while ((bytesRead = compressedStream.Read(
                buffer, 0, buffer.Length)) > 0)
            {
                m_decompressedStream.Write(
                    buffer, 0, bytesRead);
            }

            m_decompressedStream.Seek(0, SeekOrigin.Begin);
            compressedStream.Close();
            return m_decompressedStream;
        }
        else return m_response.GetResponseStream();
    }
}

To decompress the data, we need a decompression library since .Net 1.1 doesn’t provide one. In most cases, SharpZipLib will do the business:

using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

Now, when creating an instance of BFGlobalService you can use the gzip-supporting subclass and everything else happens automatically.

BFGlobalService service = new BFGlobalWithGzip();

Fin

Jebus, that went on for a bit. Now, go forth and write high-performance clients at will – but heed the warnings about only doing this when you know the server is on-the-ball, because these tips really can cause havoc if used irresponsibly.
