What Kind of Asynchronous is Right For You?

Posted on 15 June 2023

There are a few ways for (micro-)services to communicate with other (micro-)services that can be rightfully classified as asynchronous [1]. But they have very different characteristics. Let’s have a look at them and their impact on the overall system structure.

Asynchronous Option 1: Non-blocking network calls (e.g. HTTP, gRPC)

On a low technical level, the term asynchronous is used to mean “not blocking the thread of execution”. If you call an asynchronous method, that method will be executed concurrently. Usually, control will return to the caller immediately and the method will be executed on another thread.

In languages like Java, where operating system threads are the standard concurrency model, an asynchronous method that returns something either requires you to provide a callback, or it returns a Future that will complete at some later point in time. The HttpClient in the Java standard library (since Java 11) is an example: sendAsync doesn’t return the HttpResponse, but a CompletableFuture<HttpResponse<T>>.
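To make the two variants (callback and future) a bit more tangible, here is a minimal sketch in Python rather than Java, using the standard library's concurrent.futures. The function fetch_greeting is a hypothetical stand-in for a slow remote call and does no real network I/O.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor


def fetch_greeting(name: str) -> str:
    """Stand-in for a slow remote call (hypothetical; no real network I/O)."""
    time.sleep(0.1)
    return f"hello, {name}"


received = []  # filled in by the callback once the result is available

with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns right away with a Future, not with the result itself.
    future: Future = pool.submit(fetch_greeting, "pizza")

    # Variant 1: register a callback that runs once the result is there.
    future.add_done_callback(lambda f: received.append(f.result()))

    # Variant 2: ask the Future for the result when it is actually needed.
    greeting = future.result(timeout=5)
```

In both variants the caller gets control back immediately; it only differs in how it learns about the result.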

Other languages have other mechanisms to avoid blocking. In Go, for example, you can wrap the call in a goroutine that is executed asynchronously. The actual remote call will look like a synchronous (blocking) call, but only the goroutine that contains the call will be blocked. As goroutines are managed by the internal Go scheduler, this won’t block the worker thread that executes it.

Either way (with futures or co-/goroutines), as the operating system thread will be freed for other work while waiting for the network I/O to complete, these non-blocking calls are very efficient. The maximum number of concurrent calls doesn’t depend on the number of OS threads that can be handled reasonably. It is limited only by other OS constraints such as the maximum number of file handles and the TCP port range, and those should only kick in somewhere beyond 10k concurrent calls.
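As a rough illustration of why this is efficient, the following Python sketch runs 1000 simulated calls concurrently on a single OS thread using asyncio coroutines. The function call_service and the 0.1 second latency are assumptions made for the example; asyncio.sleep stands in for the network wait.

```python
import asyncio
import threading
import time


async def call_service(request_id: int) -> tuple[int, int]:
    """Stand-in for a non-blocking remote call (hypothetical; no real network)."""
    await asyncio.sleep(0.1)  # simulated network latency; frees the thread
    return request_id, threading.get_ident()


async def main(n: int) -> tuple[list[tuple[int, int]], float]:
    started = time.perf_counter()
    # All n calls are in flight at the same time.
    results = await asyncio.gather(*(call_service(i) for i in range(n)))
    return results, time.perf_counter() - started


results, elapsed = asyncio.run(main(1000))
thread_ids = {thread_id for _, thread_id in results}
print(f"{len(results)} calls on {len(thread_ids)} thread(s) in {elapsed:.2f}s")
```

Executed one after the other, the same calls would spend about 100 seconds waiting; here the waiting overlaps, and no additional threads are involved.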

So it’s a lightweight approach, also in the sense that it’s very direct: you don’t need any middleware. HTTP/gRPC client to HTTP/gRPC server, nothing is needed in between. There might be components in between, such as CDN nodes, API gateways, (reverse) proxies, load balancers, and sidecars, but that’s all optional: none of it is needed to make the fundamental communication work.

Asynchronous Option 2: Messaging

HTTP (or gRPC) calls can be asynchronous in the sense of non-blocking. But if you were to draw an architecture diagram of the services communicating, you’d probably draw them as synchronous calls. Viewed from the outside, the caller sends a request and waits for the response to continue its work. It’s just not “busy waiting”, but waiting more efficiently.

Another form of asynchronous communication (and one which would be visible as such in architecture diagrams) is messaging. Here the caller becomes a sender and uses a message bus to send a message to one or multiple recipients. Again there’s no blocking: As soon as the message bus has accepted the message, the sender will go on to do other things.

As opposed to the HTTP or gRPC call, this will even work if the recipient is unreachable at the time the message is sent.

Based on this, you could say this approach removes a runtime coupling between sender and receiver - they aren’t required to be up at the same time. This would then lead to more robust systems, as a failure of one service does not cascade and degrade the functionality of all other services using it.

However, in practice, the actual decoupling is very limited if you use messaging in a request-response way. If the sender instructs the recipient to do something (i.e. the request is a command), it’ll need to know if the operation succeeded (the response will be a message indicating success or error). The runtime coupling is still there then, as the overall process is stuck until the response is received. The same is true if the sender asks for information (i.e. the request is a query, the response is the requested data) that it needs to continue.

So while technically the introduction of the message bus decouples the two systems, if the communication follows a request-response pattern, the sender is still blocked from proceeding until the response is received.

Asynchronous Option 3: Event-driven Communication

To achieve real runtime decoupling, we must look at the semantics of our messages. In addition to the commands and queries mentioned above, there’s the third option of emitting an event.

Table 1: Commands, Queries, Events
| Message Type | Command | Query | Event |
|---|---|---|---|
| Describes.. | An intention to perform an operation or change a state | A request for information about the current state of one or many objects | A fact, something that undisputedly happened in the past |
| Expected Response | A confirmation that the command has been executed, or an error message | The requested information | None |
| Communication Pattern | Request-Response | Request-Response | Fire-And-Forget |

An event describes a fact. As opposed to commands or queries it’s not directed at a receiver. If you ask another service to do something, or request information, your message goes to that specific service. But if you just tell “the world” that something happened, any service that this information is relevant to can pick it up. Thus the sender becomes a publisher, and other services can become subscribers to the information.

Emitting events, as opposed to sending commands, will lead to a fundamentally different design of your services. Think of a simple order process. Part of it may be collecting the payment and blocking the inventory. In a request-response case, one service (or maybe a workflow engine) would take care of the overall process, send messages to the payment service and the inventory service, and wait for the responses.

With events, the service taking the order would simply publish the information that the order has been deemed valid and accepted. The other services would then observe the event and would be responsible for taking appropriate action. It’d be up to them to handle any errors or raise any alerts - there’s no back channel to the publisher.
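The order example could be sketched like this in Python. The EventBus class is a toy in-process stand-in for a real broker, and the names OrderAccepted, payments, and reservations are made up for illustration. The point to observe is that publish returns nothing and that the subscribers act later, on their own.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OrderAccepted:
    """An event: a fact that already happened, addressed to nobody in particular."""
    order_id: str
    amount_cents: int


class EventBus:
    """Toy in-process bus; a real one would be a broker that stores events durably."""

    def __init__(self) -> None:
        self._subscribers: dict[type, list[Callable]] = defaultdict(list)
        self._pending: list[object] = []

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event: object) -> None:
        # Fire-and-forget: the event is only recorded; nothing is returned.
        self._pending.append(event)

    def deliver_pending(self) -> None:
        # Delivery happens later and independently of the publisher.
        while self._pending:
            event = self._pending.pop(0)
            for handler in self._subscribers[type(event)]:
                handler(event)


bus = EventBus()
payments, reservations = [], []
bus.subscribe(OrderAccepted, lambda e: payments.append((e.order_id, e.amount_cents)))
bus.subscribe(OrderAccepted, lambda e: reservations.append(e.order_id))

# The order service publishes and is done; it knows nothing about subscribers.
publish_result = bus.publish(OrderAccepted(order_id="o-1", amount_cents=1250))
handled_at_publish_time = list(payments)  # still empty at this point

bus.deliver_pending()  # payment and inventory react on their own schedule
```

Adding a third subscriber (say, a notification service) would not require any change to the publishing side.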

This is the highest level of runtime decoupling. Just like in the other cases, there’s no blocking, and in addition to that, there’s no waiting for any response.

Real Life Analogies

The different patterns and their impact on the overall flow may not be intuitive. Maybe it helps to think about how this would work with people communicating, instead of services. Assume you want to order a pizza.

The approach that resembles the (HTTP/gRPC) network call most closely is a phone call [2]. You call the pizza place and will get an immediate response.

Maybe they also take orders by e-mail or text message. This would be equivalent to the messaging option. Once you’ve sent your message, you’re not blocked. But the order may fail - maybe they’re missing a topping you ordered, or you sent the message outside of their business hours. You won’t know when (or if) they’re going to deliver the pizza until you get a response message.

For the event-driven option, you’ll have to imagine you’re not a customer, but a waiter in a pizza restaurant. You receive an order, put it on a piece of paper, and stick the paper into the kitchen order system (“tab grabber”). At this point, you forget about it. You know a chef will pick up the order, make that pizza, put it in the service hatch, and ring a bell when it’s ready. Maybe your shift has ended by then - it’ll still be delivered to the table by some other waiter. The overall workflow is now divided into completely decoupled sub-processes (order taking, food preparation, food delivery). The order sub-process is completed once you’ve posted that piece of paper. You don’t wait for a chef to confirm that they’ll actually prepare the food.

A kitchen order system or "tab grabber". Image by ThinkStock

The Different Options Compared

The table below summarizes the main characteristics of the three different asynchronous options.

Table 2: Types of Asynchronous. "No" is better!
| | The thread of execution on the caller/sender side is blocked | The server/recipient has to be available | The client/sender expects a response |
|---|---|---|---|
| Async Network Call | No | Yes | Yes |
| Messaging (request-response) | No | No | Yes |
| Event-driven | No | No | No |

Using events removes the runtime coupling entirely. It does mean you have to build your system event-driven, though. If you’re used to building everything using HTTP or gRPC calls (which many of us are, as they’re relatively similar to local function calls), this may be a bit mind-bending. But not doing so, to quote Confluent’s “command” pattern page, is a missed opportunity:

in moving from a function call within a monolith to a system that posts a specific command to a specific recipient, we’ve decoupled the function call without decoupling the underlying concepts.

Fortunately, many people start their system design with events nowadays, applying discovery and modeling techniques such as event storming, domain storytelling, or event modeling [3]. If you think in events from the start, you’re much more likely to be successful in building an event-driven system.

Even in an event-driven architecture, there might still be places where you need a response to proceed. An HTTP (or gRPC) call is a straightforward option for this. It is what the front end uses to communicate with services, and it is probably what you use in your authentication mechanism, so there’s a very good chance you’re already using it in your application.

So where does this leave option 2, messaging, then? It doesn’t give you the runtime decoupling you get from being event-driven. If you send a command or query, even if it’s via a message bus, there’ll be a response you’re interested in and that you have to wait for. Neither does it give you the simple directness of a network call:

  1. You need additional middleware (message bus) to make it work.
  2. You have to correlate responses with their requests. That usually means you need to store request information in persistent storage under some key (request id or correlation id) and make sure to include that key in both request and response, so you can associate the response with the request.
  3. You need to deal with timeouts, i.e. requests that don’t get a response in the expected time.
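To give an impression of the bookkeeping behind points 2 and 3, here is a deliberately simplified Python sketch. The dictionary pending stands in for persistent storage and the list outbox for the message bus; both, like all function names, are assumptions made for the example. Time is passed in explicitly so the behavior is easy to follow.

```python
import uuid

pending = {}  # correlation_id -> (request payload, deadline); stands in for persistent storage
outbox = []   # stands in for the message bus


def send_request(payload, timeout_s, now):
    """Store the request under a fresh correlation id and hand it to the bus."""
    correlation_id = str(uuid.uuid4())
    pending[correlation_id] = (payload, now + timeout_s)
    outbox.append({"correlation_id": correlation_id, "payload": payload})
    return correlation_id


def on_response(message):
    """Match a response to its request; returns None for unknown or expired ids."""
    entry = pending.pop(message["correlation_id"], None)
    if entry is None:
        return None
    request_payload, _deadline = entry
    return request_payload, message["payload"]


def expire_timed_out(now):
    """Return the ids of requests whose deadline has passed and forget them."""
    expired = [cid for cid, (_, deadline) in pending.items() if deadline <= now]
    for cid in expired:
        del pending[cid]
    return expired


first = send_request({"command": "collect_payment", "order": "o-1"}, timeout_s=30, now=0.0)
second = send_request({"command": "reserve_stock", "order": "o-1"}, timeout_s=30, now=0.0)

matched = on_response({"correlation_id": first, "payload": {"status": "ok"}})
timed_out = expire_timed_out(now=31.0)  # the second request never got an answer
late = on_response({"correlation_id": second, "payload": {"status": "ok"}})  # too late
```

Even this toy version has to decide what happens to a response that arrives after its request has timed out. A real implementation would also have to cope with restarts, duplicates, and multiple nodes sharing the storage.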

All this makes option 1, the HTTP/gRPC call, much more attractive for the command and query (i.e. request-response) cases. There’s no extra middleware. And the response is handled in the same place in your code where the request is sent. A lot of the complexity of messaging arises from the fact that the response may be received by another node, not the one that sent the message. In an HTTP call, you don’t worry about correlating the response to the request. As all the handling will happen locally in the same process, it can be done “under the hood”, making the developer’s life simpler. Timeouts can be handled by a circuit breaker, which is available as a library or as sidecar functionality.
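For comparison, a direct non-blocking call keeps request, response, and timeout handling in one place. The Python sketch below simulates the remote side with asyncio.sleep; inventory_service and reserve_stock are hypothetical names. It uses a plain timeout via asyncio.wait_for, not a full circuit breaker.

```python
import asyncio


async def inventory_service(order_id: str, delay_s: float) -> dict:
    """Stand-in for a remote HTTP/gRPC endpoint (hypothetical; no real network)."""
    await asyncio.sleep(delay_s)
    return {"order_id": order_id, "reserved": True}


async def reserve_stock(order_id: str, delay_s: float, timeout_s: float) -> dict:
    try:
        # Request, response, and timeout are all handled right here.
        return await asyncio.wait_for(inventory_service(order_id, delay_s), timeout_s)
    except asyncio.TimeoutError:
        return {"order_id": order_id, "reserved": False, "error": "timeout"}


fast = asyncio.run(reserve_stock("o-1", delay_s=0.01, timeout_s=1.0))
slow = asyncio.run(reserve_stock("o-2", delay_s=1.0, timeout_s=0.05))
```

There is no correlation id and no stored request state: the response (or the timeout) arrives in the same coroutine that sent the request.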

Messaging creates the illusion of being more robust or reliable, and of somehow increasing the decoupling. But don’t be fooled. It adds a lot of complexity, without giving you the benefits of real events. Look at the semantics of your communication. If the communication is request-response, what value does messaging add over network calls? There’s no shame in using HTTP or gRPC if you need a response.

Just as messaging doesn’t make anything more robust, it doesn’t make anything faster or more scalable either. If a service isn’t able to handle n HTTP/gRPC calls in a given time, it’ll be equally unable to handle n messages in that time. How great it would be if the choice of communication channel had such an effect!

An argument I have heard for request-response messaging is that there are cases with a very long time between request and response. You send a request and want a response, but only eventually, maybe only the next day. I find this to be an odd design. If you don’t need the response to give some feedback to your caller, your system can, and should, be changed to not need the response at all. This seems to be a case of the “[decoupling] the function call without decoupling the underlying concepts” mentioned above. Better to redesign the process to use proper events.

Still, request-response messaging won’t disappear completely. It’s a valid option for integration with third-party or legacy systems. Some commercial off-the-shelf software may not offer alternative modes of communication. But if you control both ends of the communication, it seems a very unattractive proposition.

Which Type of Asynchronous to Choose?

Event-driven systems provide a high level of runtime decoupling. In an event-driven system, services will be able to serve requests without invoking other services.

Choose event-driven communication wherever you can. When designing a new system, start with events [4].

If you need a response, use an HTTP or gRPC call. In those situations where you choose to use commands or queries, i.e. apply a request-response pattern of communication, keep it simple. Just make sure the libraries and methods you use are non-blocking, to keep it efficient.

Avoid the added complexity of doing request-response with messaging: the need to run a message bus, to correlate messages, and to deal with timeouts. Reserve request-response messaging for places where you have no choice: the integration of third-party or legacy systems.

  1. For a deep dive into the possible meanings of asynchronous, you can watch this talk by Sam Newman 

  2. This doesn’t really account for the asynchronous nature of modern network call libraries as described above, but from the outside, a synchronous and an asynchronous call will look the same anyway. For the async part, you could imagine that while the phone is ringing at the pizza place, you do other things, and your phone notifies you the moment the other side picks up. 

  3. Disclaimer: Of these, I’ve only ever used event storming in a professional context. I mention the techniques only to illustrate that there are several popular movements that support event-driven architecture. Before choosing an approach to analyze your domain and model your systems, please do additional research. 

  4. A good article on “events first” by Russ Miles.