Event Collaboration And Event Sourcing

Posted on 09 June 2022

“Events! Events everywhere!” - or so it seems. This isn’t a bad thing, not at all: Event-driven Architecture can help to make microservices architectures scalable and resilient. But there are different patterns at play that we need to distinguish. And not everything is an event. An attempt at clarification.

Events! Events everywhere!

The term event is omnipresent in software architecture today. We hear about Event Sourcing and Event Streaming. In workshops with the subject matter experts, we do Event Storming. While events are truly everywhere, the growth in popularity seems to lead to a loss of precision in terminology. Apache Kafka becomes an “event bus”, all asynchronous messages are declared events, and consuming an event stream is declared Event Sourcing. Blurring the different concepts like this isn’t helpful for architectural discussions.

The terms Event Sourcing, Event Streaming, and Event Collaboration deserve closer examination and differentiation. Before we address them, let’s sketch the context in which we’re operating. This article is about applications that implement business processes and consist of various subsystems or services. Let’s look at an example from a presentation by Allard Buijze in figure 1. An application takes orders for goods and arranges for them to be shipped, provided that payment is made (note that it’s merely an illustration: it isn’t based on any real application, and is in no way intended to serve as a model for one).

Fig. 1: Services that communicate via events

Communication takes place via a bus over which messages are sent asynchronously. A very specific form of messages, in fact. The names already indicate it: Each of these messages describes something that has already taken place. We refer to these, and only these, messages as events.

👉 Definition: Event
An event is a fact. It describes something that has happened in the past.

This simple insight already has a profound impact on the software architecture. For example: There’s no arguing about an event. When I receive an event, there’s no point in validating it, I can’t doubt it: What it describes has already happened. If the receiver of an event doesn’t like it, or can’t process it, it’s always the receiver’s problem, never the sender’s.

(In practice, I’ve seen systems where the publisher of an event attempts to track the successful processing on the receiver side, and can even take further actions itself in case of an error. There were probably good intentions behind this, but it’s based on a fundamental misunderstanding of event semantics. An important motivation for using event-driven communication is to avoid such complexity!)

The above definition of the event term is very general. In the context of communication between (micro-) services, it can lead to misunderstanding, because it would also include UI events (MouseClicked, ButtonPressed), log events, or IoT data. In the context of business systems architecture, we should be a bit more specific and limit ourselves to “interesting” events related to our business domains. Let’s call these “Domain Events”.

👉 Definition: Domain Event
A Domain Event describes an interesting fact that occurred in the past and has an impact within the domain.

The Domain Event doesn’t lose its factual character at all: the more general event definition above still applies.
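The factual character of an event can be mirrored in code by making event objects immutable. The following is a minimal sketch, not taken from the example application: the class name matches the OrderCreated event mentioned in this article, but the fields (`order_id`, `total_cents`, `occurred_at`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a fact cannot be altered after it happened
class OrderCreated:
    """Hypothetical Domain Event; the field names are illustrative only."""
    order_id: str
    total_cents: int
    occurred_at: datetime  # the moment the fact occurred, always in the past


event = OrderCreated("order-1", 4999, datetime(2022, 6, 9, tzinfo=timezone.utc))

# A receiver cannot "correct" an event; it can only react to it.
try:
    event.total_cents = 0
except FrozenInstanceError:
    rejected = True
else:
    rejected = False
```

Naming the class in the past tense and freezing it are conventions, not requirements, but they make it harder to treat an event like a request that could still be changed or declined.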

Event Streaming

Where do Domain Events originate? In a service, the state of a business object changes. The service persists this state change internally. If it deems the state change relevant beyond its internal context, it informs the interested outside world that the change has occurred. For this purpose, an event is published on a message bus. The technology that has become established for this is called “Log-based Message Bus”, the best-known representative being Apache Kafka. During operation, this results in a continuous stream of events to which other services can subscribe. The architecture pattern in which communication is based on publishing and subscribing to these streams is accordingly called “Event Streaming”.

👉 Definition: Event Streaming
Publishing events to a channel that other services can subscribe to.

Just like the term event itself, this is a very broad term, and Event Streaming is also used for the examples of logging and IoT data mentioned above. For the more specific case that the individual services of a business application communicate with each other via event streams, the term Event Collaboration has become established (see for example Ben Stopford’s book “Designing Event-Driven Systems”, O’Reilly 2018).

👉 Definition: Event Collaboration
Multiple components communicate using Event Streaming of Domain Events.

When a service subscribes to an event stream from another, it can use this to generate an internal representation of the sender’s business objects. Let’s take the PaymentService from our simple example. It subscribes to OrderCreated events and can internally store the interesting parts of the orders. This way it can remember for each payment which goods are to be paid for with it. In case of a query, the PaymentService can provide additional information about the payment, without depending on the OrderService at runtime.
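As a rough sketch of this idea, the subscriber below keeps only the parts of each order that it needs and answers queries from that local copy. The event fields and the method names (`on_order_created`, `describe_payment`) are assumptions made for this illustration, and the call to `on_order_created` stands in for whatever a real message-bus client would do on delivery.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderCreated:
    """Hypothetical shape of the event published by the OrderService."""
    order_id: str
    items: tuple
    total_cents: int


class PaymentService:
    def __init__(self):
        # Local representation of the orders, derived purely from events.
        self._orders = {}

    def on_order_created(self, event):
        # Store only the parts of the order that matter for payments.
        self._orders[event.order_id] = {
            "items": event.items,
            "total_cents": event.total_cents,
        }

    def describe_payment(self, order_id):
        # Answered from local data, without calling the OrderService.
        order = self._orders[order_id]
        return f"Payment of {order['total_cents']} cents for {', '.join(order['items'])}"


service = PaymentService()
service.on_order_created(OrderCreated("o-1", ("book", "pen"), 1500))
description = service.describe_payment("o-1")
```

Because the local dictionary is derived from events alone, the query still works while the OrderService is unavailable.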

As long as the events that the data was generated from are still available on the bus, the local representation could be discarded and re-created from the events. For readers who have already encountered Event Sourcing, this will sound familiar. But while this shares some aspects of it (deriving data from a series of events), we should not call this alone Event Sourcing. We need to distinguish two separate perspectives: the inner and outer worlds of a service. Let’s take a step back and look at the basics of Event Sourcing.

Event Sourcing

We’ve looked at the communication between the services so far. Now let’s direct our attention to the inside of a service. Let’s stay with the PaymentService, and assume that the service receives a command from the outside (in figure 2: PaymentCommand). A payment is to be marked as received. The command is subject to a check (this could be: Does the claim to which the success message refers exist?). In the picture, this is done by the CommandHandler.

In the happy path, when the service can process the command, we want to record the new state. There are several ways to do this. In a typical CRUD application, perhaps an UPDATE would be made to the payment table, and the success noted in the appropriate column. This is where Event Sourcing comes into play. Instead of changing the current state directly, the CommandHandler creates an event. The service has received the information that the payment has been made, and has been able to process it and assign it to the correct payment. So now an interesting event has occurred - the money has arrived. For this, a PaymentReceived event is created. This event is stored in the Event Journal.

To query the current state of a payment later, all events that refer to this payment are read and processed sequentially. This is what’s called Event Sourcing: Writing to, and retrieving the state from, a journal of events. All of this takes place within the PaymentService: not at the level of collaboration between services, but at the level of the service’s internal data store.

👉 Definition: Event Sourcing
Instead of storing the current state, a sequence of events leading to that state is stored.
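The flow described above (check the command, append an event, derive state by replaying) can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not a production design: `PaymentRequested`, the `EventJournal` class, and the function names are hypothetical, and only PaymentReceived is taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentRequested:  # hypothetical event that opens a payment
    payment_id: str
    amount_cents: int


@dataclass(frozen=True)
class PaymentReceived:
    payment_id: str


class EventJournal:
    """In-memory stand-in for the service's internal event journal."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def events_for(self, payment_id):
        return [e for e in self._events if e.payment_id == payment_id]


def current_state(journal, payment_id):
    """Rebuild the state of one payment by replaying its events in order."""
    state = None
    for event in journal.events_for(payment_id):
        if isinstance(event, PaymentRequested):
            state = {"amount_cents": event.amount_cents, "status": "open"}
        elif isinstance(event, PaymentReceived):
            state = {**state, "status": "received"}
    return state


def handle_mark_received(journal, payment_id):
    """Command handler: check the command, then record an event."""
    state = current_state(journal, payment_id)
    if state is None:
        raise ValueError(f"unknown payment {payment_id}")
    if state["status"] == "received":
        raise ValueError(f"payment {payment_id} already received")
    # No UPDATE of a current-state row: only a new fact is appended.
    journal.append(PaymentReceived(payment_id))


journal = EventJournal()
journal.append(PaymentRequested("p-1", 4999))
handle_mark_received(journal, "p-1")
```

Note that the command can be rejected (the handler raises), while the resulting event cannot: once PaymentReceived is in the journal, it is a fact.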

Event Sourcing is usually combined with additional views in the form of “Command Query Responsibility Segregation”, or CQRS for short. Here, the data models for reading and writing data are separated. In the combination of CQRS with Event Sourcing, the event journal acts as the write side. With each update on the write side, one or more projections are also made into other formats. This can be, for example, a relational model, which then allows us to easily retrieve data accumulated over all payments - a typical query model. Another projection could also be to generate an event for the outside world for this update, and publish it on the bus (see figure 2). The event for the outside world doesn’t have to be identical to the internal one (for a good discussion on this, see “Domain Events vs. Event Sourcing”).
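A minimal sketch of two such projections follows. One maintains a query model, the other translates the internal event into a different event for the outside world. The names `RevenueView`, `OutboundPublisher`, `OrderPaid`, and the event fields are assumptions for illustration; `published.append` stands in for a real message-bus client.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentReceived:  # internal event, as stored in the journal
    payment_id: str
    order_id: str
    amount_cents: int
    bank_reference: str  # internal detail, not meant for other services


@dataclass(frozen=True)
class OrderPaid:  # hypothetical external Domain Event
    order_id: str
    amount_cents: int


class RevenueView:
    """Query model: amount received per order, updated on every journal write."""

    def __init__(self):
        self.total_by_order = defaultdict(int)

    def project(self, event):
        if isinstance(event, PaymentReceived):
            self.total_by_order[event.order_id] += event.amount_cents


class OutboundPublisher:
    """Projection that turns internal events into events for the outside world."""

    def __init__(self, publish):
        self._publish = publish  # callable standing in for a message-bus client

    def project(self, event):
        if isinstance(event, PaymentReceived):
            # The external event deliberately omits internal fields.
            self._publish(OrderPaid(event.order_id, event.amount_cents))


published = []
revenue_view = RevenueView()
projections = [revenue_view, OutboundPublisher(published.append)]


def append_to_journal(journal, event):
    journal.append(event)          # write side
    for projection in projections:  # read side and outbound stream
        projection.project(event)


journal = []
append_to_journal(journal, PaymentReceived("p-1", "o-1", 4999, "REF-123"))
```

Keeping the external event as its own type means the internal event can change (for example, gain fields) without changing the contract with subscribers.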

Fig. 2: Event Sourcing and CQRS

The projection of the internal events to the event stream going out is the bridge between Event Sourcing and Event Collaboration. An internal state change within a service results in a Domain Event of interest to the outside world. It’s distributed to interested consumers via Event Streaming. However, Event Sourcing and Event Collaboration are independent of each other.

A service that uses Event Sourcing internally doesn’t have to produce events externally. Conversely, services that don’t use Event Sourcing internally (but instead use CRUD, for example) can still act as a subscriber, and even as a publisher, in Event Collaboration. In other words: Event Sourcing is a micro-architecture pattern, it takes place within a service, and for each service within a larger system, it can be decided individually whether it’s used or not. Event Collaboration (based on Event Streaming) is a macro-architecture pattern. It relates to communication between services and there are always multiple services involved.

Table 1: Event Sourcing vs. Event Collaboration

| Pattern | Event Sourcing | Event Collaboration |
| --- | --- | --- |
| Implemented using | Event Journal | Event Streaming |
| Description | Instead of storing the current state, store the events that lead to this state | Publish Domain Events on a channel that others can subscribe to |
| Scope | Micro-architecture, inside of a service | Macro-architecture, between services |
| Typical implementation technology | Any database (RDBMS or NoSQL), or a specialized one, e.g. EventStoreDB. (A log-based message bus could be used if it has the additional capability to efficiently query by key.) | Log-based message bus, e.g. Apache Kafka, AWS Kinesis, Azure EventHub |

(Don’t) Listen to Your Own Events

When looking at the overall picture, with the internal event journal and the event stream to the outside, one could ask if one of the two couldn’t be omitted. Both provide events from which state can be derived. A service could just read the external event stream and do without the internal event journal. In this case, the CommandHandler would publish each event to the external bus, and the service would subscribe to the stream itself and then produce the required internal state from the external events, as shown in figure 3. Indeed this pattern is encountered in practice. It’s sometimes referred to as “Data Inside Out”, and more often as “Listen to Your Own Events.”

Fig. 3: Listen to Your Own Events?

Before this approach is taken, the implications, both technical and conceptual, should be well considered.

Technically, Event Sourcing and Event Streaming tend to use different technologies. In Event Streaming, each event is usually read only once by each subscriber. A log-based message bus is ideal for this. In Event Sourcing, on the other hand, the state of a particular object must be efficiently recoverable from the events at any time. It’s necessary to retrieve all events for a specific key with a simple query. Databases (both SQL and NoSQL) are often a good choice for the event journal. If a log-based message bus is to be used for this purpose, only products that offer additional functions (such as virtual tables) that allow such queries can be considered.
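To make the “all events for a specific key” requirement concrete, here is a small sketch of an event journal in a relational database, using SQLite from the Python standard library. The table layout, column names, and helper functions are assumptions for illustration; concurrency control is reduced to the primary key rejecting duplicate sequence numbers.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE event_journal (
        aggregate_id TEXT    NOT NULL,  -- the key, e.g. a payment id
        sequence_no  INTEGER NOT NULL,  -- order of events per key
        event_type   TEXT    NOT NULL,
        payload      TEXT    NOT NULL,  -- event data as JSON
        PRIMARY KEY (aggregate_id, sequence_no)
    )
    """
)


def append_event(aggregate_id, event_type, payload):
    # Determine the next sequence number for this key, then insert.
    (next_no,) = conn.execute(
        "SELECT COALESCE(MAX(sequence_no), 0) + 1 FROM event_journal "
        "WHERE aggregate_id = ?",
        (aggregate_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO event_journal VALUES (?, ?, ?, ?)",
        (aggregate_id, next_no, event_type, json.dumps(payload)),
    )


def load_events(aggregate_id):
    # One indexed query returns the full history of a single object.
    rows = conn.execute(
        "SELECT event_type, payload FROM event_journal "
        "WHERE aggregate_id = ? ORDER BY sequence_no",
        (aggregate_id,),
    )
    return [(event_type, json.loads(payload)) for event_type, payload in rows]


append_event("p-1", "PaymentRequested", {"amount_cents": 4999})
append_event("p-2", "PaymentRequested", {"amount_cents": 1000})
append_event("p-1", "PaymentReceived", {})
history = load_events("p-1")
```

The composite primary key doubles as the index that makes the per-key lookup cheap. A plain log, by contrast, would have to be scanned from the start to find the events of one payment.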

Conceptually, consider that when the service relies on the external event stream, it gives up sovereignty over its internal data model. Instead of a model that is encapsulated within a service and isolated from the outside world, we now rely on a global data model in the external stream that is used by many. If we were developing decoupled services without events using a relational database, we probably wouldn’t come up with the idea that all services use a common database, where a data model is shared by all. Why would this suddenly be a good idea in the case of Event Sourcing?

Admittedly, by definition, “Listen to Your Own Events” is also a type of Event Sourcing. From an architectural perspective, however, this particular variant of Event Sourcing has far-reaching implications that present significant challenges. If you encounter articles or blog posts describing how the authors failed with Event Sourcing, it’s worth taking a close look: Is it “classic” Event Sourcing that takes place entirely within a service, or is it the “inside out” variety that they struggle with? An example of a misnomer seems to be the article Don’t Let the Internet Dupe You, Event Sourcing is Hard. If I understand the scenario correctly, its author struggles not with Event Sourcing, but with “Listen to Your Own Events”.

Addendum 1: Types of Events

The different uses of events (for internal data management with Event Sourcing on the one hand, and Event Collaboration between services based on Event Streaming on the other hand) have caused Martin Fowler, among others, to classify different types of events. In his article on the topic he talks about three categories: event-sourcing events, notification events, and event-carried state transfer.

Event Sourcing events should take a special position here. As described above, these aren’t relevant for Event Collaboration, but internal persistence within a service. Event Sourcing events are the events that allow the state of an object to be completely recreated from a series of events.

In the collaboration area, Fowler distinguishes between event notifications (which themselves contain no data except a reference to the changed object) and event-carried state transfers.

The standard case in Event Collaboration should be event-carried state transfer. Here, the relevant data related to the event is supplied in the event. In the case of a change in the payment method, to remain in the payment area, this would be, for example, the bank details that were specified as the new payment method.

A notification event, on the other hand, only says that something has been changed. To get the associated data, the current state of the affected object must be queried from the service that published the event. This has at least two disadvantages:

Firstly, the link between the event and the data is lost if the reference sent along refers to a mutable object. It’s conceivable that at the time the event is processed at the receiver, another change has already taken place. Perhaps an order was changed but canceled shortly thereafter. Later, when the notification event has been received and the state is queried, the result would be an unexpected state (or even an error message).

Furthermore, the necessity of querying destroys a major advantage of the event-driven architecture. We want to improve the reliability of our system by reducing runtime dependencies. At runtime, the PaymentService shouldn’t depend on the OrderService to answer any requests.

Notification events should only be used in selected cases. For example, if the payload of the event would be so large that it wouldn’t make technical sense to publish it on a bus. The reference in a notification event should always refer to an immutable object. A useful use case is the generation of documents. Once the generation of a document is complete, a notification is published with the URL to a static file.
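The first disadvantage of notification events, the lost link between event and data, can be shown with a short sketch. Everything here is hypothetical: the `CustomerService` stand-in, the event classes, and the placeholder bank details exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentMethodChanged:  # event-carried state transfer
    customer_id: str
    bank_details: str  # the relevant data travels with the event


@dataclass(frozen=True)
class PaymentMethodChangedNotification:  # notification event
    customer_id: str  # only a reference to a mutable object


class CustomerService:
    """Stand-in for the publishing service, which holds mutable current state."""

    def __init__(self):
        self.bank_details_by_customer = {}

    def get_bank_details(self, customer_id):
        return self.bank_details_by_customer[customer_id]


publisher = CustomerService()

# First change: both kinds of event are published for the same fact.
publisher.bank_details_by_customer["c-1"] = "BANK-DETAILS-A"
carried = PaymentMethodChanged("c-1", "BANK-DETAILS-A")
notified = PaymentMethodChangedNotification("c-1")

# A second change happens before the receiver has processed the first event.
publisher.bank_details_by_customer["c-1"] = "BANK-DETAILS-B"

# The receiver now processes the first event.
seen_via_state_transfer = carried.bank_details
seen_via_notification = publisher.get_bank_details(notified.customer_id)
```

With event-carried state transfer the receiver sees the data as it was when the event occurred. With the notification it sees whatever the publisher holds at query time, and it needs the publisher to be reachable to see anything at all.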

Coming back to event-carried state transfer - here it’s the developer’s or architect’s job to determine what data should be included in a meaningful way. Only actual changed data? Or the complete object that was changed? Or anything in between? Interesting thoughts on this can be found in this post by Mathias Verraes.

Addendum 2: Not Every Message is an Event

Last but not least, an article about events should probably also mention messages that aren’t events. In the architecture of service communication, we distinguish between three message types: events, commands, and queries.

Table 2: Commands, Queries, Events

| Message type | Event | Command | Query |
| --- | --- | --- | --- |
| Describes | An event that has happened in the past | An intention to perform an operation or change a state | A request for information about the current state of one or many objects |
| Expected response | None | A confirmation that the command has been executed, or an error message | The requested information |

This brings us back to the beginning of the article. Don’t forget: A message doesn’t become an event just because it’s transmitted asynchronously or published on a Kafka bus! An event is an event because of its specific semantics, not because of the means of transport.
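The difference in semantics can be sketched independently of any transport. In the illustration below, all three message types are plain objects; what distinguishes them is how they are named and what response the sender can expect. The class and method names are assumptions, and the service keeps its state CRUD-style to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentReceived:  # event: past tense, no response expected
    payment_id: str


@dataclass(frozen=True)
class MarkPaymentReceived:  # command: imperative, may be rejected
    payment_id: str


@dataclass(frozen=True)
class GetPaymentStatus:  # query: asks for current state, changes nothing
    payment_id: str


class PaymentService:
    def __init__(self, known_payment_ids):
        self._status = {pid: "open" for pid in known_payment_ids}
        self.outbox = []  # events to be published; stands in for a message bus

    def handle_command(self, command):
        # A command is checked and answered with a confirmation or an error.
        if command.payment_id not in self._status:
            return "rejected: unknown payment"
        self._status[command.payment_id] = "received"
        self.outbox.append(PaymentReceived(command.payment_id))
        return "ok"

    def handle_query(self, query):
        # A query returns information and has no side effects.
        return self._status.get(query.payment_id)


service = PaymentService(["p-1"])
accepted = service.handle_command(MarkPaymentReceived("p-1"))
rejected = service.handle_command(MarkPaymentReceived("p-9"))
status = service.handle_query(GetPaymentStatus("p-1"))
```

Sending `MarkPaymentReceived` through Kafka would not turn it into an event: the sender still expects it to be checked and possibly rejected.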

Caveat Lector!

If you work your way through articles on the internet on the topics discussed here, you’ll encounter a very liberal use of the terms. This ranges from nonsensical combinations such as “command events”, to the “Listen to Your Own Events” (anti-) pattern being presented as the only true form of Event Sourcing.
Before you engage in any architectural discussion about Event-driven Architecture, make sure the parties involved share the same understanding of the relevant terms. Maybe this article can help.