# AEP Module
The AEP (Application Event Processing) module contains the canonical end-to-end performance benchmark for X Platform. This benchmark measures the complete Receive-Process-Send flow of a clustered microservice.
## Overview
The AEP module benchmarks exercise the entire X Platform stack, including:
- **Messaging**: Inbound and outbound message handling
- **Serialization**: Message encoding/decoding (Xbuf2)
- **Handler Dispatch**: Event routing to business logic
- **State Management**: Object store operations
- **Persistence**: Transaction log writes
- **Clustering**: State replication to backup instances
- **Consensus**: Acknowledgment protocol between primary and backup
This represents the most comprehensive benchmark in the suite and is used to publish official performance metrics for X Platform releases.
## Test Programs
The AEP module provides two test programs:
### ESProcessor (Event Sourcing)

**Class:** `com.neeve.perf.aep.engine.ESProcessor`

The Event Sourcing processor is the canonical benchmark used for published X Platform performance results. It uses the Event Sourcing HA policy, in which:

- Messages are the source of truth
- State is replayed from the message log on recovery
- Processing is optimized for high message throughput

**Used for:** Official X Platform performance benchmarks published in documentation.
### SRProcessor (State Replication)

**Class:** `com.neeve.perf.aep.engine.SRProcessor`

The State Replication processor uses the State Replication HA policy, in which:

- State objects are the source of truth
- State changes are replicated to the backup
- State is persisted and recovered directly

**Used for:** Benchmarking state-heavy applications.
## Canonical Benchmark Details
For complete details about the canonical benchmark methodology and published results, see:
- **Canonical Benchmark Overview** - What is benchmarked and key metrics
- **Test Description** - Complete test methodology, hardware, and configuration
- **Test Results** - Published performance results by release
## Test Message
The benchmark uses a `Car` message (defined in nvx-perf-models, sketched below) that:

- Exercises the complete Xbuf2 data model
- Contains ~200 bytes when serialized
- Includes primitives, strings, nested objects, and arrays
- Represents a realistic business message
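The sketch below is purely illustrative of that shape; the field names are hypothetical, since the actual class is generated from the model definition in nvx-perf-models:

```java
// Hypothetical sketch of the Car message's shape; the real class is
// generated from the model definition in nvx-perf-models.
public class Car {
    public static class Engine {       // nested object type (illustrative)
        private int cylinders;
        private double displacementCc;
    }

    private long id;                   // primitive fields
    private double price;
    private String make;               // string fields
    private String model;
    private Engine engine;             // nested object field
    private String[] features;         // array field
    // generated Xbuf2 accessors and serialization support omitted
}
```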
## Message Access Methods
The benchmark tests two message access patterns:
### Indirect Access (`xbuf2.serial`/`xbuf2.random`)

Message data is accessed via POJO-style getters and setters on the message object.
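A minimal sketch of the pattern, assuming conventional accessor names on the generated `Car` class (the real generated API lives in nvx-perf-models):

```java
// Illustrative only: the factory and accessor names below are assumptions,
// not the exact generated Car API.
Car car = Car.create();         // obtain a message instance
car.setMake("Tesla");           // setters encode the value into the message
car.setModel("Model 3");
String make = car.getMake();    // getters decode the field on access
```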
### Direct Access (Serializer/Deserializer)

Message data is accessed via zero-copy serializers, with no intermediate POJO.
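A sketch of the zero-copy pattern, assuming per-message serializer/deserializer types; the names below are illustrative, not the exact X Platform API:

```java
// Illustrative only: type and method names are assumptions, not the exact
// X Platform serializer/deserializer API.
java.nio.ByteBuffer buffer = java.nio.ByteBuffer.allocate(256); // backing buffer (illustrative)

CarSerializer out = new CarSerializer(buffer);    // bound directly to the outbound buffer
out.writeMake("Tesla");                           // fields encoded in place, no POJO allocated
out.writeModel("Model 3");

CarDeserializer in = new CarDeserializer(buffer); // bound directly to the inbound buffer
String make = in.readMake();                      // fields decoded on demand from the wire bytes
```

Skipping the intermediate POJO is what accounts for the gains quoted below.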
Direct access provides ~10% lower latency and 2.4x higher throughput than indirect access.
## Running the Benchmark

### Prerequisites

- An extracted X Platform performance benchmark distribution
- Two Linux servers with InfiniBand or 10GbE networking (for clustered tests)
- Synchronized time between the servers
### Quick Single-Instance Test

For quick validation without clustering, run a single ESProcessor instance.
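For example, assuming the benchmark jar from the extracted distribution is on the classpath (the jar name below is an assumption; prefer any launch scripts shipped with the distribution):

```bash
# Illustrative invocation; the jar name is an assumption.
java -cp nvx-perf-aep.jar com.neeve.perf.aep.engine.ESProcessor \
  --count 10000000 \
  --rate 100000 \
  --warmupTime 2
```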
### Full Clustered Latency Test
For the complete canonical benchmark matching published results, see the Test Description for detailed command-line examples with clustering, persistence, and CPU affinitization.
## Command-Line Parameters

### General Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `--encoding` | `xbuf2.serial` | Message encoding: `xbuf2.serial` or `xbuf2.random` |
| `--count` | 10,000,000 | Number of messages to inject |
| `--rate` | 100,000 | Message injection rate (msgs/sec) |
| `--warmupTime` | 2 | Warmup time in seconds before collecting stats |
| `--printIntervalStats` | false | Print interval stats during the test |
### CPU Affinity Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `--injectorCPUAffinityMask` | null | CPU mask for the message injector thread |
| `--muxCPUAffinityMask` | null | CPU mask for the event multiplexer thread |
| `--busDetachedSendCPUAffinityMask` | null | CPU mask for the detached sender thread |
### Persistence Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `--enablePersistence` | false | Enable transaction log persistence |
| `--persisterLogLocation` | `.` | Directory for the transaction log |
| `--persisterFlushOnCommit` | false | Flush the log on every commit |
### Clustering Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `--enableClustering` | false | Enable clustering (primary/backup) |
| `--clusteringLocalIfAddr` | 0.0.0.0 | Local interface for replication |
| `--clusteringDiscoveryLocalIfAddr` | 0.0.0.0 | Local interface for discovery |
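For illustration, a primary instance combining clustering, persistence, and affinitization flags might be launched along these lines; everything about the invocation below (jar name, boolean-flag syntax, address, CPU masks) is an assumption, and the Test Description holds the authoritative command lines:

```bash
# Illustrative only: jar name, flag-value syntax, the address, and the
# CPU masks are assumptions; see the Test Description for authoritative examples.
java -cp nvx-perf-aep.jar com.neeve.perf.aep.engine.ESProcessor \
  --enableClustering true \
  --clusteringLocalIfAddr 192.168.1.10 \
  --enablePersistence true \
  --persisterFlushOnCommit true \
  --injectorCPUAffinityMask 0x2 \
  --muxCPUAffinityMask 0x4
```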
For a complete parameter reference, see the source distribution or run with `--help`.
## Interpreting Results

### Latency Results

The test outputs latency percentiles in microseconds. The measured latency includes:

- Inbound deserialization
- Handler dispatch
- Business logic execution
- Persistence
- Replication to the backup
- Consensus acknowledgment
- Outbound serialization
- Round-trip wire latency (~23µs on an unoptimized network)
### Throughput Results

The test outputs the maximum sustained throughput: the highest rate at which the clustered microservice can process messages while maintaining:

- Full persistence
- Replication to the backup
- Consensus acknowledgment
## Published Results

Official performance results from this benchmark are published in the X Platform Release Notes.

Results include:

- Multiple CPU configurations (MinCPU, Default, MaxCPU)
- Optimization modes (Latency, Throughput)
- Message access methods (Indirect, Direct)
- Detailed performance analysis and tuning recommendations
## Next Steps

- **View Results**: See Test Results for published performance data
- **Understand Methodology**: Read Test Description for complete test details
- **Component Benchmarks**: Explore other modules for component-level benchmarks:
  - Serialization Module - Message encoding/decoding only
  - Persistence Module - Persistence layer only
  - Link Module - Cluster replication transport
- **Return**: Go back to Modules Overview