AEP Module

The AEP (Application Event Processing) module contains the canonical end-to-end performance benchmark for X Platform. This benchmark measures the complete Receive-Process-Send flow of a clustered microservice.

Overview

The AEP module benchmarks exercise the entire X Platform stack, including:

  • Messaging: Inbound and outbound message handling

  • Serialization: Message encoding/decoding (Xbuf2)

  • Handler Dispatch: Event routing to business logic

  • State Management: Object store operations

  • Persistence: Transaction log writes

  • Clustering: State replication to backup instances

  • Consensus: Acknowledgment protocol between primary and backup

This represents the most comprehensive benchmark in the suite and is used to publish official performance metrics for X Platform releases.

Test Programs

The AEP module provides two test programs:

ESProcessor (Event Sourcing)

Class: com.neeve.perf.aep.engine.ESProcessor

The Event Sourcing processor is the canonical benchmark used for published X Platform performance results. It uses the Event Sourcing HA policy, in which:

  • Messages are the source of truth

  • State is replayed from message log on recovery

  • Optimal for high-throughput message processing

Used for: Official X Platform performance benchmarks published in documentation.

SRProcessor (State Replication)

Class: com.neeve.perf.aep.engine.SRProcessor

The State Replication processor uses the State Replication HA policy, in which:

  • State objects are the source of truth

  • State changes are replicated to backup

  • State is persisted and recovered directly

Used for: Benchmarking state-heavy applications.
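
To make the contrast between the two HA policies concrete, the sketch below shows generic recovery logic under each model. This is illustrative Java only, not X Platform API: with Event Sourcing the logged messages are replayed to rebuild state, while with State Replication the replicated state value is restored directly.

    import java.util.List;

    // Generic illustration of the two recovery models; not X Platform code.
    public class HaPolicyRecoverySketch {

        static final class Counter {
            long value;
        }

        // Event Sourcing: messages are the source of truth, so state is rebuilt
        // by replaying every logged message through the business logic.
        static Counter recoverByReplay(List<Long> loggedIncrements) {
            Counter c = new Counter();
            for (long inc : loggedIncrements) {
                c.value += inc;   // re-execute the handler logic per message
            }
            return c;
        }

        // State Replication: state objects are the source of truth, so the
        // replicated/persisted value is restored directly, with no replay.
        static Counter recoverFromReplicatedState(long replicatedValue) {
            Counter c = new Counter();
            c.value = replicatedValue;
            return c;
        }

        public static void main(String[] args) {
            System.out.println(recoverByReplay(List.of(1L, 2L, 3L)).value);   // 6
            System.out.println(recoverFromReplicatedState(6L).value);         // 6
        }
    }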

Canonical Benchmark Details

Complete details of the canonical benchmark methodology and the published results are covered in the Test Description and in the Published Results section below.

Test Message

The benchmark uses a Car message (defined in nvx-perf-models) that:

  • Exercises the complete Xbuf2 data model

  • Contains ~200 bytes when serialized

  • Includes primitives, strings, nested objects, and arrays

  • Represents a realistic business message

Message Access Methods

The benchmark tests two message access patterns:

Indirect Access (xbuf2.serial/xbuf2.random)

Message data accessed via POJO getters/setters:
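
With indirect access, handler code works with the message as an ordinary object through generated getters and setters; the object is encoded into the wire format separately. The snippet below is a simplified, self-contained stand-in for that pattern; the real Car class is generated from the Xbuf2 model in nvx-perf-models and has a much richer schema:

    // Simplified stand-in for the generated Car message; not the real class.
    public class IndirectAccessExample {

        static final class Car {
            private String make;
            private int modelYear;

            String getMake()            { return make; }
            void setMake(String v)      { make = v; }
            int getModelYear()          { return modelYear; }
            void setModelYear(int v)    { modelYear = v; }
        }

        public static void main(String[] args) {
            Car car = new Car();
            car.setMake("Acme");        // each setter copies the value into the POJO...
            car.setModelYear(2024);
            // ...which is only encoded into the wire buffer when the message is sent.
            System.out.println(car.getMake() + " " + car.getModelYear());
        }
    }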

Direct Access (Serializer/Deserializer)

Message data accessed via zero-copy serializers:
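
With direct access, fields are written to and read from the serialized buffer in place, so no intermediate message object is allocated. The flyweight below is only a minimal sketch of that idea with assumed field offsets; it is not the Serializer/Deserializer API generated for the Car message:

    import java.nio.ByteBuffer;

    // Minimal zero-copy flyweight sketch; not the generated Serializer/Deserializer.
    public class DirectAccessExample {

        static final class CarFlyweight {
            private static final int MODEL_YEAR_OFFSET = 0;   // assumed layout
            private static final int PRICE_OFFSET = 4;
            private final ByteBuffer buf;

            CarFlyweight(ByteBuffer buf) { this.buf = buf; }

            void writeModelYear(int y)  { buf.putInt(MODEL_YEAR_OFFSET, y); }
            int readModelYear()         { return buf.getInt(MODEL_YEAR_OFFSET); }
            void writePrice(double p)   { buf.putDouble(PRICE_OFFSET, p); }
            double readPrice()          { return buf.getDouble(PRICE_OFFSET); }
        }

        public static void main(String[] args) {
            // The same buffer could be handed straight to the transport for sending.
            CarFlyweight car = new CarFlyweight(ByteBuffer.allocateDirect(64));
            car.writeModelYear(2024);
            car.writePrice(31999.0);
            System.out.println(car.readModelYear() + " @ " + car.readPrice());
        }
    }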

Direct access provides approximately 10% lower latency and roughly 2.4x higher throughput than indirect access.

Running the Benchmark

Prerequisites

  1. An extracted X Platform performance benchmark distribution

  2. Two Linux servers with InfiniBand or 10GbE networking (for clustered tests)

  3. Synchronized clocks between the servers

Quick Single-Instance Test

For quick validation without clustering:
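
A minimal run can be launched directly from the extracted distribution with only the general parameters. The classpath below is an assumption about the distribution layout, and exact flag syntax (for example, whether boolean flags take explicit values) should be confirmed with --help:

    # Single instance, no clustering or persistence.
    # The classpath is illustrative; use the jar layout of your extracted distribution.
    java -cp "lib/*" com.neeve.perf.aep.engine.ESProcessor \
        --encoding xbuf2.serial \
        --count 10000000 \
        --rate 100000 \
        --warmupTime 2 \
        --printIntervalStats true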

Full Clustered Latency Test

For the complete canonical benchmark matching published results, see the Test Description for detailed command-line examples with clustering, persistence, and CPU affinitization.
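
As a rough sketch of how the clustering, persistence, and affinity parameters combine (the classpath, addresses, log location, and affinity masks below are placeholders, not the published configuration), a primary instance might be launched along these lines:

    # Illustrative primary-side invocation only; the canonical settings are in the
    # Test Description. Confirm exact flag syntax with --help.
    java -cp "lib/*" com.neeve.perf.aep.engine.ESProcessor \
        --encoding xbuf2.serial \
        --count 10000000 \
        --rate 100000 \
        --enableClustering true \
        --clusteringLocalIfAddr 192.168.1.10 \
        --clusteringDiscoveryLocalIfAddr 192.168.1.10 \
        --enablePersistence true \
        --persisterLogLocation /data/txlog \
        --persisterFlushOnCommit true \
        --injectorCPUAffinityMask <mask> \
        --muxCPUAffinityMask <mask>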

Command-Line Parameters

General Parameters

Parameter               Default        Description
--encoding              xbuf2.serial   Message encoding: xbuf2.serial, xbuf2.random
--count                 10,000,000     Number of messages to inject
--rate                  100,000        Message injection rate (msgs/sec)
--warmupTime            2              Warmup time in seconds before collecting stats
--printIntervalStats    false          Print interval stats during test

CPU Affinity Parameters

Parameter                           Default   Description
--injectorCPUAffinityMask           null      CPU mask for message injector thread
--muxCPUAffinityMask                null      CPU mask for event multiplexer thread
--busDetachedSendCPUAffinityMask    null      CPU mask for detached sender thread

Persistence Parameters

Parameter                   Default   Description
--enablePersistence         false     Enable transaction log persistence
--persisterLogLocation      .         Directory for transaction log
--persisterFlushOnCommit    false     Flush log on every commit

Clustering Parameters

Parameter                           Default   Description
--enableClustering                  false     Enable clustering (primary/backup)
--clusteringLocalIfAddr             0.0.0.0   Local interface for replication
--clusteringDiscoveryLocalIfAddr    0.0.0.0   Local interface for discovery

For complete parameter reference, see the source distribution or run with --help.

Interpreting Results

Latency Results

The test outputs latency percentiles in microseconds. The reported latency includes:

  • Inbound deserialization

  • Handler dispatch

  • Business logic execution

  • Persistence

  • Replication to backup

  • Consensus acknowledgment

  • Outbound serialization

  • Round-trip wire latency (~23µs on an unoptimized network)

Throughput Results

The test also reports the maximum sustained throughput: the highest rate at which the clustered microservice can process messages while maintaining:

  • Full persistence

  • Replication to backup

  • Consensus acknowledgment

Published Results

Official performance results from this benchmark are published in the X Platform release documentation.

Results include:

  • Multiple CPU configurations (MinCPU, Default, MaxCPU)

  • Optimization modes (Latency, Throughput)

  • Message access methods (Indirect, Direct)

  • Detailed performance analysis and tuning recommendations
