Cluster Consensus

From a microservice perspective Talon's message processing flow is not dissimilar to many other event or message processing architectures: the microservice exposes message handlers to the platform which in turn passes inbound messages to it for processing. In the act of processing the inbound event, the microservice will make changes to its state and send some outbound events. The key difference between the Talon architecture and traditional architectures is that microservice state is stored in memory and resiliency is provided not by synchronous persistence to disk or a data grid, but instead by streaming state changes to a backup's memory in an asynchronous, pipelined fashion. The combination of in-memory state and asynchronous 'persistence' allows Talon to operate at extreme performance levels without sacrificing on reliability.

Key aspects of any streaming application platform that are of critical concern for stream oriented applications are:

Exactly once processing of inbound and outbound messages.
Atomicity between state updates and the messaging stream: A trading application that thinks it has sent a request to buy 100,000 shares of IBM must be able to rely on that being what is actually sent out. This is particularly important in application or machine failure scenarios. If the application fails after sending out the request to buy 100,000 IBM and the application were to recover and reprocess the event this time buying 200,000 shares it could turn into a costly business problem.

In traditional application architectures the problems of state/messaging atomicity and exactly once processing are often solved using distributed transactions. For example, in J2EE state changes would be committed to a databases and messaging sent over JMS with XA transactions used to coordinate commit on both. Such schemes involve a lot of overhead and kill performance. The application flow in Talon provides the same level of reliability without the synchronous overhead of distributed transaction coordination.

The following diagram depicts the message processing flow in a Talon microservice that ensures state consensus between the microservice cluster members:

Elaborating on the diagram above, Talon ensures the same level of reliability as traditional architectures as follows:

The AEP Engine receives an inbound message
It starts a transaction and dispatches the message to the microservice
Microservice updates state (monitored by the engine)
Microservice sends outbound messages (through the engine)
The engine holds onto the outbound messages until the message handler completes and control reaches the engine. At that point, the engine starts the process of establishing consensus with the other cluster members. The first step step is to replicate the state changes and outbound messages as an atomic unit to the backup instances via memory to memory replication.
Once the state and outbound messages (processing "effects") are stabilized on a backup, the backup notifies the primary via an asynchronous stability acknowledgement
The primary then releases the outbound messages and sent downstream.
The engine receives acknowledgements of all outbound messages sent i.e. the downstream receiver - message broker or end receivers - have stabilized the messages.
The engine then acknowledges the inbound message upstream indicating to the upstream sender that it has stabilized the inbound message.

Pipelining

The AEP engine executes the above in a non-blocking pipelined manner of execution. What this means is that the engine does not block on any of the operations. At any point in time, there are several transactions in the pipeline being executed concurrently - or, more accurately, in a time sliced manner

Key Takeaways

The stabilization of state changes and outbound effects to the backup before sending outbound messages ensures that in the event of process or machine failure that the backup has the same conception of state as the former primary communicated externally (e.g. I have an outstanding request to buy 100,000 shares of IBM).
The acknowledgement of the inbound message after replication to the peer ensures that in the event of failure a duplicate can be detected to prevent duplicate dispatch to the application if the message is re-transmitted.
The entire flow above is pipelined. The application can begin processing the next inbound message before receiving stability from the backup. Nowhere in the flow is there a need to block or perform synchronous operations.
State and outbound messages can be journaled to a transaction log on disk, but it is not fsync'd and is not a primary recovery mechanism. Disk based journaling is used more for operational tasks, but can also serve as a backup recovery mechanism.
And best of all ... the above is done transparently to the application which need only concern itself with writing the business logic in message handlers.

PreviousMessage Processing NextCluster Failover

Last updated 3 months ago