Engine Statistics

Configure AEP engine statistics collection including message latencies, transaction metrics, and per-message-type statistics.

Overview

AEP engine statistics provide detailed metrics about message processing, transaction execution, and bus activity. Most engine metrics are low overhead and always collected, but latency statistics and per-message-type stats can impact performance and are disabled by default.

Global Statistics Configuration

An XVM collects stats that are enabled for the applications that it contains. The following global statistics can be configured via environment properties:

Global Environment Properties

<env>
  <nv>
    <!-- Sample size configuration -->
    <stats.latencymanager.samplesize>65536</stats.latencymanager.samplesize>
    <stats.series.samplesize>10240</stats.series.samplesize>

    <!-- Global latency stats enablement -->
    <msg.latency.stats>true</msg.latency.stats>
    <ods.latency.stats>true</ods.latency.stats>
    <event.latency.stats>true</event.latency.stats>
    <msgtype.latency.stats>false</msgtype.latency.stats>

    <!-- Low-level I/O timestamps -->
    <link.network.stampiots>true</link.network.stampiots>
  </nv>
</env>
| Environment Property | Default | Description |
| --- | --- | --- |
| nv.stats.latencymanager.samplesize | nv.stats.series.samplesize | The global default size used for capturing latencies. Latency stats are collected in a ring buffer which is sampled by the stats thread at each collection interval. Tip: This should be sized large enough that datapoints aren't missed, but not so large that it adversely affects processor cache performance. |
| nv.stats.series.samplesize | 10240 | Controls the default sampling size for Series stats. Tip: If the number of datapoints collected in a stats interval exceeds this size, the histogram computation will be lossy. Increasing the value reduces the loss of datapoints but increases the overhead of stats collection, both in memory usage and in pressure on the processor caches. |
| nv.msg.latency.stats | false | Instructs the platform to collect latency statistics for messages passing through various points in the process pipeline. Tip: When not enabled, many latency stats will not be available in heartbeats. |
| nv.msgtype.latency.stats | false | Enables tracking of message latency stats on a type by type basis. When set to true, timings for each message type are individually tracked as separate stats. |
| nv.ods.latency.stats | false | Globally enables collection of application store latencies. |
| nv.event.latency.stats | false | Indicates whether or not event latency statistics are captured. Enabling event latency stats records timestamps for the enqueue and dequeue of events across event multiplexers, such as the AepEngine's input multiplexer queue. Because this records the time that events remain on the input queue, it is useful for determining whether an engine's event multiplexer queue is backing up. Tip: These stats must be enabled in order to capture input queuing times. |
| nv.link.network.stampiots | false | Instructs low-level socket I/O to stamp input/output times on written data. Tip: This should be enabled to capture store latencies based on wire time, or for certain latencies in the direct message bus binding. |
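
As an illustration of the last tip, here is a minimal sketch of an environment section aimed only at store latencies measured on wire time. It combines two properties described above and leaves every other latency stat at its default. Whether this subset is sufficient for a given deployment is an assumption to verify.

```xml
<env>
  <nv>
    <!-- Collect application store latencies (default: false) -->
    <ods.latency.stats>true</ods.latency.stats>

    <!-- Stamp socket I/O times so store latencies can be based on wire time (default: false) -->
    <link.network.stampiots>true</link.network.stampiots>
  </nv>
</env>
```

Enabling only the stats that are needed limits the timestamping overhead to the part of the pipeline under investigation.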

Per-Engine Statistics Configuration

Latencies related to a particular microservice's transaction pipeline can be configured at the application level:

| Configuration Setting | Default | Description |
| --- | --- | --- |
| captureTransactionLatencyStats | false | Enables collection of latency stats as messages flow through the AEP engine's transaction processing machinery. |
| captureEventLatencyStats | false | Globally enables collection of message latency stats as messages flow through the system. These statistics include latencies in the flow outside of transaction processing. For received messages they include transmission, deserialization and dispatch costs. For sent messages they include serialization and transmission costs. These stats must be enabled in order to collect message bus latency stats. Enabling this property can increase latency due to the overhead of tracking timestamps. |
| captureMessageTypeStats | false | Enables tracking of message statistics on a per message type basis. When set to true, statistics for each message type are individually tracked. |
| messageTypeStatsLatenciesToCapture | all | Controls which latency stats are captured on a per message type basis. This property is specified as a comma separated list of values. Valid values: all (all available per message type latency stats are collected); none (no message type latency stats are collected); c2o (create to offer latencies); o2p (offer to poll, i.e. input queueing time); mfilt (time spent in application message filters); mpproc (time spent in the engine prior to message dispatch); mproc (time spent in application message handlers). The values 'all' or 'none' may not be combined with other values. This value only applies when captureMessageTypeStats is true. When not specified, the value defaults to all. |
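
A minimal sketch of how the transaction and event latency settings listed above might appear for one application. The `<apps>`/`<app>` nesting and the application name `order-processor` are assumptions made for illustration; only the setting names and their boolean values come from this page.

```xml
<apps>
  <!-- "order-processor" is a hypothetical application name -->
  <app name="order-processor">
    <!-- Latencies within the engine's transaction processing machinery -->
    <captureTransactionLatencyStats>true</captureTransactionLatencyStats>

    <!-- Latencies outside of transaction processing; adds timestamping overhead -->
    <captureEventLatencyStats>true</captureEventLatencyStats>
  </app>
</apps>
```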

Message Type Specific Stats

To enable message type specific stats and include them in heartbeats:
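
A minimal sketch of such a configuration, under stated assumptions. The settings `captureMessageTypeStats` and `messageTypeStatsLatenciesToCapture` are described above. The `<model>`, `<apps>` and `<xvms>` nesting, the `<heartbeats>` element with its attributes, the `includeMessageTypeStats` element, and the names `order-processor` and `order-processor-xvm` are all assumptions not confirmed on this page and should be checked against the platform's configuration reference.

```xml
<model>
  <apps>
    <!-- Hypothetical application name -->
    <app name="order-processor">
      <captureMessageTypeStats>true</captureMessageTypeStats>
      <!-- Optional: capture only input queueing time (o2p) and handler time (mproc) -->
      <messageTypeStatsLatenciesToCapture>o2p,mproc</messageTypeStatsLatenciesToCapture>
    </app>
  </apps>

  <xvms>
    <!-- Hypothetical XVM name; heartbeat element names are assumed -->
    <xvm name="order-processor-xvm">
      <heartbeats enabled="true" interval="5">
        <includeMessageTypeStats>true</includeMessageTypeStats>
      </heartbeats>
    </xvm>
  </xvms>
</model>
```

Restricting the captured latencies to a short list keeps the per-type overhead lower than the default of `all`.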

Statistics Output Threads (Development/Testing Only)

The following output threads can be enabled to trace individual types of statistics, which is useful for testing and performance tuning. Enabling these output threads is not required for collecting stats. Statistics trace output is not zero garbage, so in a production scenario it usually makes more sense to collect stats via XVM heartbeats, which are emitted in a zero garbage fashion.

When an AepEngine is running inside of an XVM (the most common case), engine statistics are included in XVM heartbeats and should be traced using the XVM tracing facilities. The trace threads described below should not be enabled when running within an XVM, because collection by the trace threads and by the XVM stats collector thread can interfere with one another.

See Tracing Heartbeats

| Configuration Setting | Default | Description |
| --- | --- | --- |
| nv.aep.<engine>.stats.interval | 0 | The interval (in seconds) at which engine stats will be traced for a given engine. Set to a positive integer to have the engine's stats dump thread dump recorded engine statistics at that period. A value of 0 disables creation of the stats thread. When enabled, engine stats are traced to the logger 'nv.aep.engine.stats' at a level of Tracer.Level.INFO; therefore, to see dumped stats, a trace level of 'nv.aep.engine.stats.trace=info' must be enabled. NOTE: Disabling the engine stats thread only stops stats from being periodically traced. It does not stop the engine from collecting stats; stats can still be collected by an external thread (such as the XVM, which reports the stats in XVM heartbeats). NOTE: While collection of engine stats is a zero garbage operation, tracing engine stats is not zero garbage when performed by this stats thread. For latency sensitive apps, it is recommended to run in an XVM, which can collect engine stats and report them in heartbeats in a zero garbage fashion. |
| nv.aep.<engine>.sysstats.interval | 0 | The interval (in seconds) at which engine sys stats will be reported. Set to 0 (the default) to completely disable sys stats tracing for a given engine. In most cases AEP sys stats are not used, and system level stats are instead recorded in the statistics of the XVM in which the AepEngine is running. |
| nv.event.mux.<name>.stats.interval | 0 | The interval (in seconds) at which multiplexer stats will be traced. Multiplexer stats can also be reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero. |
| nv.msg.latency.stats.interval | 0 | The interval (in seconds) at which message latency stats are traced. This setting has no effect if nv.msg.latency.stats is false. It allows granular tracing of just message latency stats on a per bus basis. Message latency stats can also be reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero. |
| nv.aep.busmanager.<engine>.<bus>.stats.interval | 0 | The interval (in seconds) at which bus stats will be traced. Bus stats are reported as part of the overall engine stats from the engine stats thread, so there is no need to set this to a non-zero value if nv.aep.<engine>.stats.interval is greater than zero. When engine stats output is disabled, this can be used to trace only bus stats for a particular message bus. |
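
For an engine running outside of an XVM during development, the settings above might be combined as in the following sketch. It follows the element naming used in the global environment example (the property name without the `nv.` prefix, nested under `<nv>`). The engine name `orders` is hypothetical, and expressing the trace level as an environment element is an assumption based on the `nv.aep.engine.stats.trace=info` property quoted above.

```xml
<env>
  <nv>
    <!-- Dump engine stats for the hypothetical engine "orders" every 5 seconds -->
    <aep.orders.stats.interval>5</aep.orders.stats.interval>

    <!-- Stats are traced at INFO to the 'nv.aep.engine.stats' logger, so enable that level -->
    <aep.engine.stats.trace>info</aep.engine.stats.trace>
  </nv>
</env>
```

With the engine stats interval set, the multiplexer, message latency and bus intervals can stay at 0, because those stats are reported as part of the overall engine stats.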

Next Steps

  1. Determine which latency statistics are needed for your use case

  2. Configure global latency stats if detailed performance metrics are required

  3. Enable per-engine statistics for transaction and event latencies

  4. Test performance impact of statistics collection under load

  5. Balance operational visibility against performance requirements
