rfw.name

Methods of Asynchrony: Paradigms for Asynchronous Programming

Asynchronous architectures seem to be the hip new(-ish) wave of writing software – no more wasting system resources waiting for I/O operations to complete, giving the CPU more time to do actual useful work. We just schedule an operation to execute in some kind of magical background thread pool and when it finishes we get the results and then we can do stuff with them.

That sounds awesome, right? Well, as it turns out, it is… and it isn’t. Here are some techniques for asynchronous programming in various programming languages, with varying degrees of fun-ness to use.

Event Handlers

Event handlers are a way to handle events that may occur asynchronously. The HTML DOM API is an example of this:

window.addEventListener("load", function (e) {
    // e is the event object with details about what occurred.
});

Essentially, we set up actions to happen when a certain thing (“event”) occurs. An example of this being used asynchronously is XMLHttpRequest in the DOM:

var xhr = new XMLHttpRequest();
xhr.addEventListener("load", function (e) {
    // Do stuff here.
});
xhr.open("GET", "foo", true);
xhr.send();

Generally, multiple event handlers (subscribers) can be attached to one event emitter (publisher).
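To make the publisher/subscriber relationship concrete, here is a minimal sketch in Python. The EventEmitter class and its on/off/emit method names are hypothetical and are not taken from the DOM or any particular library; the sketch only shows one event fanning out to several handlers, and a handler being unregistered afterwards.

```python
from collections import defaultdict


class EventEmitter:
    """Hypothetical minimal publisher; not modelled on any specific library."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        self._handlers[event].append(handler)

    def off(self, event, handler):
        self._handlers[event].remove(handler)

    def emit(self, event, payload):
        # Copy the list so handlers may unregister themselves while we iterate.
        for handler in list(self._handlers[event]):
            handler(payload)


received = []


def log_handler(payload):
    received.append(("log", payload))


def audit_handler(payload):
    received.append(("audit", payload))


emitter = EventEmitter()
emitter.on("load", log_handler)
emitter.on("load", audit_handler)
emitter.emit("load", "first")   # fans out to both handlers

emitter.off("load", audit_handler)
emitter.emit("load", "second")  # only log_handler is still subscribed
```

Note that the emitter knows nothing about what its handlers do, which is the decoupling that makes this style a natural fit for publish/subscribe systems.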

Benefits

Fits in very well with publish/subscribe systems. Events are automatically fanned-out to event subscribers from event publishers.

Event handlers can be unregistered at any time, in case a callback needs to be stopped from executing.

Event sources can be defined and subscribed to, e.g. mouse clicks, key presses – the handlers are automatically triggered every time an event occurs. This decouples emitting events from handling events.

Drawbacks

In a request/response system, code for making the request and handling the response can be in very different places. Request sending is decohered (is that a word? I hope it’s a word) from response handling.

If event streams are multiplexed into a single bus, it can be annoying to separate event types from each other.

Suffers from all the pitfalls of explicit continuation-passing style (see below) if used in a request/response system.

Explicit Continuation-Passing Style

Callbacks are perhaps the most explicit way of handling request/response asynchrony in programs. The premise is simple: we have a function that does some work at some point and, when it finishes the work, runs the callback passed to it.

This should be familiar to Node.js programmers:

var fs = require("fs");

// Schedule two reads with separate handlers.
fs.readFile("foo", function (err, data) {
    foobar(data);
});

fs.readFile("bar", function (err, data) {
    foobar(data);
});

fs.readFile schedules an I/O operation in the Node.js runtime and, when the operation completes, calls the callback. Nothing new, hopefully.

Benefits

Scheduling is very explicit. May or may not be a benefit, but it helps with understanding code.

Drawbacks

Callback pyramids. Code tends to grow sideways (“arrow”-shaped) when callbacks are nested together.

Separate exception handling mechanism. Node.js convention espouses that callbacks take an error for the first function parameter, and these must be manually passed back up the chain (if (err) cb(err);).

Awkward loop structures. These need to be manually unraveled into explicit continuations, e.g. here’s a synchronous read of files in a list:

for (var i = 0; i < files.length; ++i) {
    console.log(fs.readFileSync(files[i]));
}

It becomes, in asynchronous style:

var cont = function (i) {
    if (!(i < files.length)) return;
    // Read one file, then continue with the next one from inside the callback.
    fs.readFile(files[i], function (err, data) {
        console.log(data);
        cont(i + 1);
    });
};
cont(0);
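The last two drawbacks can be seen together in a small Python sketch. Everything here is a hypothetical stand-in: FAKE_DISK plays the part of the file system, the pending queue plays the part of the event loop, and read_file_async mimics an error-first callback API. The point is only that the loop becomes a recursive continuation and that errors have to be handed up the chain by hand.

```python
from collections import deque

# Hypothetical stand-ins: an in-memory "disk" and a queue acting as the event loop.
FAKE_DISK = {"a.txt": "alpha", "b.txt": "beta"}
pending = deque()


def read_file_async(name, callback):
    """Schedule callback(err, data) to run later, Node-style (error first)."""
    def complete():
        if name in FAKE_DISK:
            callback(None, FAKE_DISK[name])
        else:
            callback(FileNotFoundError(name), None)
    pending.append(complete)


def read_all(files, done):
    """Read files one after another; call done(err, contents) exactly once."""
    contents = []

    def cont(i):
        if i >= len(files):
            return done(None, contents)

        def on_read(err, data):
            if err is not None:
                return done(err, None)  # manually pass the error up the chain
            contents.append(data)
            cont(i + 1)

        read_file_async(files[i], on_read)

    cont(0)


def run_loop():
    # Run scheduled completions until nothing is left to do.
    while pending:
        pending.popleft()()


results = []
read_all(["a.txt", "b.txt"], lambda err, data: results.append((err, data)))
read_all(["a.txt", "missing.txt"], lambda err, data: results.append((err, data)))
run_loop()
```

The first call ends with both file contents; the second stops at the missing file and reports the error through the same callback, with no help from the language's exception mechanism.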

Generators

Generators are a way to pause execution at arbitrary points in a given block of code. At these points, a generator can “yield” a value out to its caller and, when it is resumed, receive a value sent back in by the caller. Python supports these as a language feature, e.g.:

def count(i):
    while True:
        yield i # Execution pauses here and can be resumed later.
        i += 1
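The count example only passes values out. As a small sketch of values travelling in the other direction, the hypothetical running_average generator below receives each sample through send() and yields the average so far.

```python
def running_average():
    """Yield the average so far; receive the next sample via send()."""
    total = 0.0
    count = 0
    average = None
    while True:
        sample = yield average  # pauses here; send(x) resumes with sample = x
        total += sample
        count += 1
        average = total / count


averager = running_average()
next(averager)              # run to the first yield ("prime" the generator)
first = averager.send(10)   # 10.0
second = averager.send(20)  # 15.0
```

The generator keeps its local state (total and count) between resumptions, which is what makes it usable as a paused computation rather than just a sequence.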

Python’s Twisted framework can use generators to defer tasks to other tasks (via its Deferred mechanism):

from twisted.internet import defer

@defer.inlineCallbacks
def async_print_two_files(filename1, filename2):
    # Schedule two async reads.
    deferred1 = async_read_file(filename1)
    deferred2 = async_read_file(filename2)

    # Wait for both async reads to complete.
    contents1 = yield deferred1
    contents2 = yield deferred2

    print(contents1 + contents2)
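How can a decorator turn yield into "wait for this result"? The sketch below is one way to do it and is not Twisted's actual implementation: SimpleDeferred and inline_callbacks are hypothetical, heavily simplified stand-ins with no error handling. The decorator resumes the generator with each result as soon as the deferred it yielded fires.

```python
import functools


class SimpleDeferred:
    """Hypothetical one-shot result holder; far simpler than Twisted's Deferred."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self.result = None

    def add_callback(self, fn):
        if self._fired:
            fn(self.result)
        else:
            self._callbacks.append(fn)

    def fire(self, result):
        self._fired = True
        self.result = result
        for fn in self._callbacks:
            fn(result)


def inline_callbacks(gen_func):
    """Drive a generator: each yielded SimpleDeferred resumes it when fired."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        finished = SimpleDeferred()

        def step(value):
            try:
                waiting_on = gen.send(value)
            except StopIteration as stop:
                finished.fire(stop.value)  # the generator's return value
                return
            waiting_on.add_callback(step)

        step(None)  # gen.send(None) starts the generator
        return finished
    return wrapper


read1, read2 = SimpleDeferred(), SimpleDeferred()


@inline_callbacks
def concat_two_reads():
    contents1 = yield read1
    contents2 = yield read2
    return contents1 + contents2


outcome = concat_two_reads()   # runs until the first yield, then pauses
read2.fire("world")            # completion order does not matter
read1.fire("hello ")
```

Once both reads have fired, outcome.result holds the concatenated contents. The callbacks have not gone away; they have moved into the step function, so the code that uses the results can stay linear.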

(Somewhat relatedly, generators can also adapt event-based code into linear code, which makes it easier to write state machines for multi-step flows:

class GeneratorHandler(MessageHandler):
    def __init__(self, f):
        self.g = f(self)

    def on_create(self):
        next(self.g)

    def on_receive(self, message):
        self.g.send(message)

# GeneratorHandler calls f(self), so the generator function takes the handler.
def handler(self):
    # This code is executed in on_create.
    foo()

    # Receive a username message first...
    username = yield

    # ... then a password message.
    password = yield

    authenticate(username, password)

    # Now process the user's commands in a loop.
    while True:
        cmd = yield
        process(cmd)

handler = GeneratorHandler(handler)

The more you know!)

Benefits

Asynchronous flow is now linearized – the generator is paused and can be resumed later when an event occurs. No more awkward arrow-shaped callback pyramids!

We can use language-level error handling, like try/catch.

Can be used for purposes other than asynchrony, e.g. infinite and lazy lists.
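As a quick illustration of the last point, here is a minimal sketch of an infinite, lazily evaluated sequence. The squares function is a made-up example; count and islice come from Python's standard itertools module.

```python
from itertools import count, islice


def squares():
    """An infinite sequence; each value is computed only when requested."""
    for n in count(1):
        yield n * n


# Take only the first five values; the rest are never computed.
first_five = list(islice(squares(), 5))
```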

Drawbacks

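Generators need to be handled explicitly in functions – every function between the scheduler and the code that actually pauses has to be a generator itself, and each call to a generator has to be delegated explicitly (with yield from in Python). Miguel de Icaza talks about this issue pertaining to .NET – and also discusses the benefits of coroutines (see below!).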

In Python, generators masquerade as functions – if a yield statement is inserted into a function body, the function’s semantics completely change. For instance:

def foo():
    return
    yield

Calling foo() returns a generator object, rather than running the body and returning immediately.
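Both drawbacks can be checked directly. In the sketch below, inner, outer_forgetful and outer_delegating are hypothetical names: one caller forgets to delegate and silently loses the inner generator's work, while the other uses yield from so that the inner yields pass through.

```python
import inspect


def foo():
    return
    yield  # never reached, but its presence makes foo a generator function


def inner():
    yield "paused in inner"
    return "inner result"


def outer_forgetful():
    inner()  # creates a generator object and discards it; inner's body never runs
    yield "paused in outer"


def outer_delegating():
    result = yield from inner()  # inner's yields pass through to our caller
    yield result


is_generator = inspect.isgenerator(foo())
forgetful = list(outer_forgetful())
delegating = list(outer_delegating())
```

Nothing warns about the call in outer_forgetful; it simply does nothing, which is part of what makes this class of bug easy to miss.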

Coroutines

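Coroutines are an extended form of generators. They support the same operations, with the exception that a yield does not have to be written in the coroutine’s own body: a yield made from any function called inside a coroutine propagates to the first coroutine running on the call stack and suspends it, rather than just returning to the immediate caller.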

Lua has coroutine support built-in. Here’s an example of some coroutine-based code in Lua:

count = coroutine.wrap(function (i)
    while true do
        coroutine.yield(i)
        i = i + 1
    end
end)

Here coroutine.yield is an ordinary function call rather than a keyword, so it can be made from any function the coroutine calls, however deeply nested, and it still suspends the whole coroutine.

Benefits

Coroutines don’t have to be explicitly handled – yields automatically propagate up to the top-most coroutine that’s running.

All of the benefits associated with generators.

Drawbacks

Reader washort points out that unlike generators, coroutine yields are now implicit – it becomes difficult to reason about when a coroutine will pause and, as such, when other code will execute and mutate state.

Promises

Promises (or futures or deferreds) are a way of wrapping values that will exist at a later point in time – hence the name. This sounds really contrived, but it just means that we have a type of some sort that we will get a computed value from at some point in time.

For example, in jQuery:

// GET a URL, returning a promise.
var p = $.get("some_url");

p.then(function (data) {
    console.log(data);
});

You might be thinking that this looks exactly like event handlers and explicit CPS. In essence, that’s what promises are – a chimera of the two. They’re one-shot event handlers that fire an event when their associated operation completes.

Their strength, however, is their ability to be composed together – a callback passed to .then() can return another promise, so that data can be piped through multiple actions (did somebody say monads?).

// GET a URL, returning a promise.
var p = $.get("some_url");

p.then(function (data) {
    return $.get("some_other_url?k=" + encodeURIComponent(data));
}).then(function (data) {
    // This logs the result of the second request.
    console.log(data);
});
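To show what makes this chaining work, here is a minimal sketch of a promise in Python. The Promise class is hypothetical and deliberately incomplete (no rejection, no error handling), and it is not how jQuery implements its deferreds. The key step is in then(): if the callback returns another promise, the chained promise waits for it instead of resolving with the promise object itself.

```python
class Promise:
    """Hypothetical minimal promise; omits rejection and error handling."""

    def __init__(self):
        self._resolved = False
        self._value = None
        self._callbacks = []

    def resolve(self, value):
        self._resolved = True
        self._value = value
        for callback in self._callbacks:
            callback(value)

    def then(self, fn):
        """Return a new promise for fn's result, flattening returned promises."""
        chained = Promise()

        def on_value(value):
            result = fn(value)
            if isinstance(result, Promise):
                # Wait for the inner promise rather than resolving with it.
                result.then(chained.resolve)
            else:
                chained.resolve(result)

        if self._resolved:
            on_value(self._value)
        else:
            self._callbacks.append(on_value)
        return chained


first_request = Promise()
second_request = Promise()
log = []


def start_second(key):
    log.append("first: " + key)
    return second_request  # stands in for starting a dependent request


first_request.then(start_second).then(lambda data: log.append("second: " + data))

first_request.resolve("key")
second_request.resolve("final payload")
```

The second callback only runs once second_request resolves, even though it was attached to the promise returned by the first then().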

Benefits

Composability – multiple actions can be composed together to return a single promise, which is fulfilled when the entire chain of actions completes.

Can poll promises for completion – we can check if a promise has been “fulfilled”.
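Polling can be sketched with Python's standard concurrent.futures.Future. Creating a Future directly and calling set_result on it is normally left to executors and tests; it is done here only to keep the example self-contained.

```python
from concurrent.futures import Future

future = Future()
before = future.done()      # False: no result yet

future.set_result("payload")
after = future.done()       # True: the future is now fulfilled
value = future.result()     # returns immediately, since a result is available
```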

Drawbacks

Error handling still uses a separate mechanism, generally by passing a second function to the .then() function that determines what to do in the case of an error.

Awkward loop structures still need to be manually unraveled into explicit continuations.

Each callback passed to .then() has its own scope, e.g. variables defined in the first .then() callback can’t be referred to in the second (as expected, but still clunky).

async and await

C# programmers may be familiar with async and await – special language-level features that handle promises (known as “tasks” in .NET parlance). What this does is capture a continuation at the point of await. When the result is ready, the continuation is invoked with the result to resume execution – similar to how yield points work in generators.

async Task<string> GetTwoUrls(string url1, string url2) {
    HttpClient client1 = new HttpClient();
    HttpClient client2 = new HttpClient();

    // Schedule two tasks independent of each other.
    Task<string> task1 = client1.GetStringAsync(url1);
    Task<string> task2 = client2.GetStringAsync(url2);

    // Await both of them.
    string contents1 = await task1;
    string contents2 = await task2;

    return contents1 + contents2;
}
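For comparison, roughly the same shape can be written with Python's asyncio. This is a sketch rather than a translation of the C# code: fetch is a hypothetical stand-in that sleeps instead of performing a real HTTP request, so the example runs without a network.

```python
import asyncio


async def fetch(url):
    """Hypothetical stand-in for an HTTP GET; sleeps instead of doing I/O."""
    await asyncio.sleep(0.01)
    return "<" + url + ">"


async def get_two_urls(url1, url2):
    # Schedule two tasks independent of each other.
    task1 = asyncio.create_task(fetch(url1))
    task2 = asyncio.create_task(fetch(url2))

    # Await both of them; each await is a point where this coroutine may pause.
    contents1 = await task1
    contents2 = await task2
    return contents1 + contents2


combined = asyncio.run(get_two_urls("a", "b"))
```

Both tasks are started before either is awaited, so the two simulated requests overlap instead of running back to back.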

Benefits

Code is linearized much like with generators and coroutines – we specify an await point for code to wait at until a result is received.

Drawbacks

Requires tight language integration for a very specific use case.

Conclusions

As it turns out, the answers range from “sort of” to “yeah, it’s alright”. For starters, everything needs to be rewritten to use asynchronous I/O – great for a novel platform like Node.js, but not so great for platforms that were originally written with blocking I/O in mind.

Even then, it pays to have a language with facilities for handling asynchronous I/O such that it doesn’t turn into piles of unsightly pyramidal code (especially when you’ve seen what’s on the other side).
