What is event sourcing?

Event sourcing is an architecture that records the events (actions) that occur in an application and derives the data the application needs from those events. This differs from traditional database-backed applications, where data is created, read, updated, and deleted (CRUD) in place and only the latest version of the data persists.

By keeping a permanent record of the events that occurred in an application, event sourcing preserves both the meaning of each change and the time at which it happened. This makes it possible to answer questions that a store holding only the latest state cannot, such as what the data looked like at an earlier point in time.

Additionally, event sourcing has characteristics that help build applications that can offer fast data access, high availability, and the potential to scale at low cost.

How event sourcing compares to traditional database-backed applications

The diagram below shows how a typical application reads and writes data to a database. The key characteristic that might go unnoticed is that data is written to and read from the same place.

Application writes data to a database and then reads data from the same database storage medium.

Let's contrast this with the data flow in an event sourcing architecture seen in the following diagram. We'll explore the details of each component part in the diagram in the subsequent section. For now, notice how in event sourcing the application writes data to one place but reads data from another place. Separating read and write mediums is a key characteristic that makes the benefits of event sourcing possible.

In an event sourcing application, events are generated by an application and sent to a broker for permanent storage. These events are read by microservices, which derive meaningful data from them and store this data in temporary storage. The original application that created the events now reads data from the microservices.

Another major difference between event sourcing and a database solution is how data is stored. Databases organize data into tables that look like the one seen below, which is quite different from how applications actually use data.

ProductId  ProductName   Quantity
260        Barley        40
261        Hops          34
265        Yeast         12
274        Spring water  75

By contrast, event sourcing stores data in a way that is very close to how data is structured inside of applications – as objects. These objects can be serialized or active in memory. A serialized object might look like this:

[
    {"ProductId":260,"ProductName":"Barley","Quantity":40.0},
    {"ProductId":261,"ProductName":"Hops","Quantity":34.0},
    {"ProductId":265,"ProductName":"Yeast","Quantity":12.0},
    {"ProductId":274,"ProductName":"Spring water","Quantity":75.0}
]

One final difference is that databases use queries written in SQL to aggregate data whenever data is requested by the user of an application. Event sourcing also performs aggregations, but they are done in the language of the application whenever something happens in an application.
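
The difference in when aggregation happens can be sketched in a few lines of Python. This is only an illustration: the withdrawal records and the names total_at_read_time, on_withdrawal and running_totals are invented here, and the first function merely imitates what a SQL SUM query would do on each request.

```python
# Hypothetical withdrawal records: (product name, amount withdrawn).
withdrawals = [("Barley", 6.0), ("Hops", 4.0), ("Barley", 10.0)]

def total_at_read_time(product: str) -> float:
    """Query-style aggregation: scan every record each time data is requested."""
    return sum(amount for name, amount in withdrawals if name == product)

running_totals: dict = {}

def on_withdrawal(product: str, amount: float) -> None:
    """Event-time aggregation: update the stored total once, when the event occurs."""
    running_totals[product] = running_totals.get(product, 0.0) + amount

for name, amount in withdrawals:
    on_withdrawal(name, amount)

# Both approaches agree; only the moment the work is done differs.
print(total_at_read_time("Barley"), running_totals["Barley"])  # 16.0 16.0
```

With the running total, a read becomes a dictionary lookup, however many events have accumulated.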

Now let's explore in more detail how event sourcing works.

Follow the data flow
From the Application to the Broker…

When an event (any action) occurs in your application, information about the event is packaged into a message and sent to a broker, an application running on a server, which then saves the message to storage.

The flow of data is in one direction - from the application to the broker.

For example, in an application used by people, an event could be triggered by the action of pressing the Save button on a form, plus the data in the form. In an Internet of Things (IoT) application, an event could be caused by a timer triggering a sensor reading, plus data from the sensor reading.

Below is an example of an event message that is triggered by an action in an inventory tracking system—the withdrawal of a product:

InventoryWithdrawalEvent_v1
{
    "productId": 260,
    "amount": 46,
    "unitOfMeasure": "lbs",
    "noticedDate": "2018-01-22T14:15:00"
}

Here, the event message is named InventoryWithdrawalEvent_v1. The _v1 suffix is a way of versioning the event message. Should you later want to change the structure of the message by adding or modifying its properties (such as adding the product's name), you would give the new event message an incremented version number, such as InventoryWithdrawalEvent_v2.
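
One consequence of versioned names is that a consumer can tell which structure to expect before reading the payload. The sketch below is a hypothetical consumer, not part of any library: the function describe_withdrawal is invented for illustration, and the productName property of the v2 message is an assumption based on the example change mentioned above.

```python
import json

def describe_withdrawal(event_name: str, payload: str) -> str:
    """Render a withdrawal message, whichever version it was written in."""
    data = json.loads(payload)
    if event_name == "InventoryWithdrawalEvent_v1":
        product = f"product {data['productId']}"  # v1 carries only the id
    elif event_name == "InventoryWithdrawalEvent_v2":
        product = data["productName"]             # assumed v2 addition
    else:
        raise ValueError(f"unsupported event type: {event_name}")
    return f"{data['amount']} {data['unitOfMeasure']} of {product} withdrawn"

v1 = '{"productId": 260, "amount": 46, "unitOfMeasure": "lbs", "noticedDate": "2018-01-22T14:15:00"}'
v2 = ('{"productId": 260, "productName": "Barley", "amount": 46, '
      '"unitOfMeasure": "lbs", "noticedDate": "2018-01-22T14:15:00"}')

print(describe_withdrawal("InventoryWithdrawalEvent_v1", v1))  # 46 lbs of product 260 withdrawn
print(describe_withdrawal("InventoryWithdrawalEvent_v2", v2))  # 46 lbs of Barley withdrawn
```

Because old messages are never rewritten, a consumer like this has to keep understanding v1 for as long as v1 messages remain in the stream.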

Once the event message is received by the broker, it is saved to storage. The type of physical storage medium is not that important to the overall architecture, because all interactions are done through the broker. However, the way data is organized logically by the broker is quite important. First, the messages are stored in the form of lists, and in the order they are received. Also, event messages are only ever appended to lists, since they represent a record of the history of actions in your application. So event messages should not be deleted or modified, only appended.

Conceptually, the permanent storage of events resembles a set of blockchains, each of which is an append-only list.

Lastly, each event message is stored in a list with other related messages. These lists are called streams and they are identified by name in the form of a path that is similar to a URL or a file system path. So a stream name might look like this: /Inventory/North East Region
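
The logical organization described above (named streams, messages kept in arrival order, append only) can be modeled with a toy in-memory broker. This is a sketch under stated assumptions: the class InMemoryBroker and its append and read methods are invented for illustration and do not correspond to any real broker product, which would also persist messages to durable storage.

```python
class InMemoryBroker:
    """Toy broker: named streams, each an append-only list of messages."""

    def __init__(self) -> None:
        self._streams: dict = {}

    def append(self, stream: str, message: dict) -> int:
        """Add a message to the end of a stream and return its position."""
        messages = self._streams.setdefault(stream, [])
        messages.append(dict(message))  # store a copy so callers cannot alter it later
        return len(messages) - 1

    def read(self, stream: str, start: int = 0) -> list:
        """Return copies of the messages from position `start`, in arrival order."""
        return [dict(m) for m in self._streams.get(stream, [])[start:]]

    # Deliberately no update or delete methods: history is only ever appended to.

broker = InMemoryBroker()
stream = "/Inventory/North East Region"
broker.append(stream, {"productId": 260, "amount": 46, "unitOfMeasure": "lbs"})
broker.append(stream, {"productId": 261, "amount": 5, "unitOfMeasure": "lbs"})
print([m["productId"] for m in broker.read(stream)])  # [260, 261]
```

The position returned by append gives a reader a simple way to ask only for messages it has not yet seen, by passing the next position as start.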

From the Broker to Microservices…

Microservices are single purpose applications that host data that the application needs. Typically, the data is held in temporary storage like RAM, and in the form of an object that the application can easily consume. But other forms of storage/structure can also work. The application data is kept current in real time as the microservice reads new event messages received from the broker.

Unread messages flow from the broker to microservices. Data in event messages is transformed into meaningful data the application needs and this data is stored in temporary storage.

Event messages are, however, structured differently from application data. Event messages contain data about actions that occur in the application. For example, in an inventory application, an event message might contain the information that 46 pounds of a given product were consumed. Microservices, by contrast, contain data about the current state of the application itself. So, in our inventory application example, the current state would be the current inventory level of all products.

Updating the current state of the application is achieved by using business rules to extract data from event messages. Business rules are written in the language of your choice and they basically aggregate data by grouping it, or by performing some form of calculation on the data. Thus, in our inventory application example, a rule might take a change in inventory level and add it to a running total for the relevant product, in order to give the current inventory level of the product. Because rules perform aggregations on data, the resulting data is called an aggregate.

An aggregate is a collection of objects that store the latest application state.
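
A business rule of this kind can be sketched as a function that takes the current state and one event and returns the new state. The example below is a minimal illustration, assuming withdrawal events with the productId and amount properties shown earlier; the function name apply_withdrawal and the opening inventory levels are hypothetical.

```python
from functools import reduce

def apply_withdrawal(levels: dict, event: dict) -> dict:
    """Business rule: subtract the withdrawn amount from the product's running level."""
    updated = dict(levels)  # leave the previous state untouched
    product_id = event["productId"]
    updated[product_id] = updated.get(product_id, 0) - event["amount"]
    return updated

opening_levels = {260: 100, 261: 34}  # hypothetical starting inventory
events = [
    {"productId": 260, "amount": 46},
    {"productId": 261, "amount": 4},
    {"productId": 260, "amount": 4},
]

# Folding the rule over the events, in order, yields the aggregate.
aggregate = reduce(apply_withdrawal, events, opening_levels)
print(aggregate)  # {260: 50, 261: 30}
```

Writing the rule as a pure function keeps it easy to test and lets the same code serve both live updates and a full replay of the stream.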

You have probably noticed that microservices store data in temporary storage. This can be in RAM or in an actual database. When we say the data store is temporary, we really mean that it is of no real importance if the data should be lost, say during a reboot, since the data can be rebuilt by re-processing event messages managed by the broker.
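
The idea that lost state can be rebuilt is easy to demonstrate. In this sketch a plain Python list stands in for a stream held by the broker; the class InventoryService and the change property on each event are assumptions made for the example.

```python
class InventoryService:
    """Toy microservice whose state lives only in memory."""

    def __init__(self, stream: list) -> None:
        self._stream = stream    # stands in for a stream held by the broker
        self.levels: dict = {}   # temporary, in-memory state
        self.rebuild()

    def rebuild(self) -> None:
        """Discard current state and re-process every event from the start."""
        self.levels = {}
        for event in self._stream:
            product_id = event["productId"]
            self.levels[product_id] = self.levels.get(product_id, 0) + event["change"]

stream = [
    {"productId": 260, "change": 100},
    {"productId": 260, "change": -46},
]
service = InventoryService(stream)
before = dict(service.levels)

service.levels = {}  # simulate losing in-memory state, e.g. after a reboot
service.rebuild()
print(service.levels == before)  # True
```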

From the Microservices to the Application…

In the final step in event sourcing, your application uses a web API to read all or part of the aggregate data held by the microservice. Inside the microservice, data is typically stored as objects in memory. Some or all of the objects are then serialized and sent to the application when needed.

Microservices can be thought of as extensions of the application that store data transformation rules and aggregated data usually in the form of objects in memory.
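
The read side can be sketched as a function that maps a request path to a serialized slice of the aggregate. The HTTP transport is left out to keep the example short, and the /products and /products/<id> paths, the AGGREGATE dictionary and the handle_get function are all invented for illustration.

```python
import json
from typing import Tuple

# Hypothetical aggregate held in memory by the microservice.
AGGREGATE = {
    260: {"ProductId": 260, "ProductName": "Barley", "Quantity": 40.0},
    261: {"ProductId": 261, "ProductName": "Hops", "Quantity": 34.0},
}

def handle_get(path: str) -> Tuple[int, str]:
    """Return (status code, JSON body) for /products or /products/<id>."""
    parts = [p for p in path.split("/") if p]
    if parts == ["products"]:
        return 200, json.dumps(list(AGGREGATE.values()))  # the whole aggregate
    if len(parts) == 2 and parts[0] == "products" and parts[1].isdigit():
        product = AGGREGATE.get(int(parts[1]))
        if product is not None:
            return 200, json.dumps(product)               # one object only
    return 404, json.dumps({"error": "not found"})

status, body = handle_get("/products/260")
print(status, body)
```

In a real service this function would sit behind whichever web framework or HTTP server the team already uses.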

The key benefits of event sourcing
A true history of what occurred

Events are the system of record for your application. This makes them ideal for change-tracking, auditing, compliance and debugging, because they keep an accurate record of the history of your application.

With event sourcing, your application has the ability to go back in time to get application states as they were at a specific time in history. For example, if you use event sourcing for inventory tracking, you could obtain the exact inventory level of every product 4 months ago, at precisely 2PM.
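
Going back in time amounts to replaying only the events recorded up to the moment of interest. The following sketch assumes each event carries a noticedDate timestamp, as in the earlier message example, plus a hypothetical signed change property; the function levels_as_of is invented for illustration.

```python
from datetime import datetime

def levels_as_of(events: list, moment: datetime) -> dict:
    """Rebuild inventory levels from the events noticed at or before `moment`."""
    levels: dict = {}
    for event in events:
        if datetime.fromisoformat(event["noticedDate"]) <= moment:
            product_id = event["productId"]
            levels[product_id] = levels.get(product_id, 0) + event["change"]
    return levels

events = [
    {"productId": 260, "change": 100, "noticedDate": "2018-01-10T09:00:00"},
    {"productId": 260, "change": -46, "noticedDate": "2018-01-22T14:15:00"},
    {"productId": 260, "change": -10, "noticedDate": "2018-03-01T08:00:00"},
]

print(levels_as_of(events, datetime(2018, 1, 22, 14, 0)))  # {260: 100}
print(levels_as_of(events, datetime(2018, 2, 1)))          # {260: 54}
```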

An easy, inexpensive way to scale data access

In a traditional database application, for every user requesting data, there will be a query against the database. As the number of users grows, the demand on the database grows, ultimately making it disproportionately expensive to scale.

In event sourcing, queries are made against data stored in microservices, not against the centralized data store. And the data inside microservices is pre-aggregated and typically stored in fast data storage like RAM. So, as the number of users requesting data increases, scaling becomes as easy as running more instances of the same microservices.

It might help to think of aggregated data hosted by microservices as a cache of data that is always kept up-to-date. And creating copies of this cache is both cheap and easy.
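
The cache analogy can be illustrated by building several replicas from the same stream and spreading reads across them. This is a simplified sketch: the function build_replica, the change property, and the round-robin policy are assumptions, and real replicas would be separate processes or machines rather than dictionaries in one program.

```python
import itertools

def build_replica(stream: list) -> dict:
    """Process the stream from the start to produce one copy of the aggregate."""
    levels: dict = {}
    for event in stream:
        product_id = event["productId"]
        levels[product_id] = levels.get(product_id, 0) + event["change"]
    return levels

stream = [
    {"productId": 260, "change": 100},
    {"productId": 260, "change": -46},
]

# Scaling out: each new replica is just another pass over the same events.
replicas = [build_replica(stream) for _ in range(3)]
next_replica = itertools.cycle(replicas)

def read_level(product_id: int) -> int:
    """Serve each read from the next replica in turn (round robin)."""
    return next(next_replica)[product_id]

print([read_level(260) for _ in range(3)])  # [54, 54, 54]
```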

Better performance than traditional database applications

When reads and writes take place in the same storage medium, such as a database, locking can be necessary for data consistency. This can slow down data access when the storage system is in heavy use. But event sourcing separates the primary write storage from the primary read storage. Events are written to a centralized data storage, but, as we have seen, business applications don’t read data from there. They read aggregated data from the storage space controlled by microservices.

This means that event sourcing avoids locking existing records when events are written, because events are never updated or deleted, only appended. It also reduces the need for locking when data is read. Furthermore, where a microservice does need a lock while it updates its aggregate, locking data in RAM is faster than locking data in other storage mediums.

Builds high availability

The microservices used in event sourcing are ideally suited for high-availability applications. They are small, self-contained, and manage their own data. You can have many copies of the same microservice processing the same stream of events and each copy of the microservice will have identical data.

If one copy of the microservice intentionally or unintentionally goes down, you can re-route traffic to another copy. This makes the entire system more resilient to failures and easier to upgrade to new versions.
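
Re-routing around a failed copy can be sketched as follows. The Replica class, its healthy flag and the route_read function are hypothetical; in practice a load balancer or service registry would track health and make this decision.

```python
class Replica:
    """One copy of the microservice, holding its own copy of the aggregate."""

    def __init__(self, name: str, levels: dict) -> None:
        self.name = name
        self.levels = dict(levels)
        self.healthy = True

def route_read(replicas: list, product_id: int) -> tuple:
    """Answer from the first healthy replica; any of them holds the same data."""
    for replica in replicas:
        if replica.healthy:
            return replica.name, replica.levels[product_id]
    raise RuntimeError("no healthy replica available")

levels = {260: 54}
replicas = [Replica("a", levels), Replica("b", levels)]

print(route_read(replicas, 260))  # ('a', 54)
replicas[0].healthy = False       # copy "a" goes down, planned or not
print(route_read(replicas, 260))  # ('b', 54)
```

The same mechanism supports upgrades: take one copy out of rotation, replace it with the new version, let it catch up on the stream, and then repeat for the others.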

Summary

Event sourcing: