Choosing an Encoding Type

Overview

The ADM code generator supports the following encoding formats for generated classes:

  • Json

  • Protobuf

  • Xbuf2

From an API perspective, the generated interfaces are functionally equivalent (with a few exceptions), but each encoding type serializes to its own wire format and has different performance characteristics.

Encoding Types

Json

The Json encoding generates fairly simple classes that serialize to and from JSON. This encoding type is suitable for lightweight applications or for applications that natively work with JSON (e.g. web applications).

Pros

  • Memory utilization

    • Because there isn't much serialization machinery or caching of the backing serialized format, Json generated objects don't use much memory, which can be useful for long-lived state objects.

Cons

  • Performance

    • JSON serialization is slow and produces a lot of garbage, and JSON objects can't be pooled.

  • Size

    • Serializing to text is not very compact, which leads to higher disk usage and network bandwidth consumption.
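To make the size difference concrete, the following is a minimal sketch that compares the UTF-8 size of one integer field written as a JSON member with the size of the same field in the Protobuf wire format (a tag varint followed by a value varint). The class name, the field name `quantity` and the field number 1 are illustrative assumptions; this is not code produced by the ADM generator.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: names and field numbers are assumptions, not ADM generated code.
final class EncodedSizeSketch {

    // Number of bytes needed to write the value as an unsigned base-128 varint.
    static int varintSize(long value) {
        int size = 1;
        while ((value & ~0x7FL) != 0) {
            size++;
            value >>>= 7;
        }
        return size;
    }

    // Size of one integer field in the Protobuf wire format:
    // tag varint (field number << 3 | wire type 0) followed by the value varint.
    static int binaryFieldSize(int fieldNumber, long value) {
        return varintSize((long) fieldNumber << 3) + varintSize(value);
    }

    // Size of the same field as a single-member JSON object encoded in UTF-8.
    static int jsonFieldSize(String name, long value) {
        String json = "{\"" + name + "\":" + value + "}";
        return json.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("json bytes:   " + jsonFieldSize("quantity", 1234567L));
        System.out.println("binary bytes: " + binaryFieldSize(1, 1234567L));
    }
}
```

The gap comes mostly from the field name being repeated as text in every JSON message, whereas the binary format carries only a small numeric tag.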

Protobuf

With Protobuf encoding, objects are created with backing Google protobuf generated objects. Protobuf is suitable for applications with moderate to high performance requirements that exceed what the Json encoding can deliver.

Protobuf is recommended for generating ADM objects used for the microservice store.

Pros

  • Memory Utilization

    • Protobuf generated objects are fairly compact in memory compared to Xbuf objects. Non-repeated field values are stored directly in the generated message object, making Protobuf encoded objects a good candidate for use as state entities.

  • Performance

    • Faster serialization than Json.

  • Interoperability

    • Protobuf is a well-known standard, making it easy to interoperate with applications that do not use ADM generated code.

Cons

  • Performance Predictability

    • Google protobuf generated messages are not zero garbage and, thus, can result in large garbage collection related pauses.

Xbuf2

Xbuf2 generated objects, backed by Talon's high performance implementation of Google protobufs, support zero garbage operation and cut-through serialization (the ability to read and write fields directly to and from a backing buffer). Xbuf2 should be used for applications with the most stringent performance requirements.

Xbuf2 is recommended for use with ADM message models particularly for applications that require very low latency.

Pros

  • Performance (Throughput & Latency)

    • Faster serialization than Json or Protobuf encoding types

    • Optimized for both messages and state

  • Lower Memory Footprint

    • Object recycling and zero garbage support result in a lower memory footprint than the Json and Protobuf encoding types

  • Interoperability

    • Protobuf is a well-known standard, making it easy to interoperate with applications that do not use ADM generated code.

  • Tunability

    • Offers several knobs to manage the tradeoff between performance and memory conservation

  • Flexibility

    • Offers multiple data access patterns

      • Random access (as is offered by the other encoding types)

      • Serial access

        • Direct Deserialization: The ability to serially traverse a Google Protobuf encoded buffer and dispatch the fields to the application via a callback

        • Direct Serialization: The ability for an application to directly serialize application fields into a buffer in the Google Protobuf wire format
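The serial access pattern described above can be sketched in plain Java. The sketch writes two fields straight into a `ByteBuffer` in the Protobuf wire format, then walks the buffer once and dispatches each field to a callback without building an intermediate message object. The `FieldVisitor` interface and the `WireFormatSketch` class are hypothetical names invented for this illustration. They are not the Xbuf2 API, and only wire types 0 (varint) and 2 (length-delimited) are handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical callback for this sketch; not part of the ADM or Xbuf2 API. */
interface FieldVisitor {
    void onVarint(int fieldNumber, long value);

    // The payload is exposed as a region of the backing buffer, not as a copy.
    void onBytes(int fieldNumber, ByteBuffer buffer, int offset, int length);
}

final class WireFormatSketch {

    static void writeVarint(ByteBuffer buf, long value) {
        while ((value & ~0x7FL) != 0) {
            buf.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        buf.put((byte) value);
    }

    static long readVarint(ByteBuffer buf) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = buf.get();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }

    // Direct serialization: tag (field number << 3 | wire type 0), then the value.
    static void writeVarintField(ByteBuffer buf, int fieldNumber, long value) {
        writeVarint(buf, (long) fieldNumber << 3);
        writeVarint(buf, value);
    }

    // Direct serialization: tag with wire type 2, then length, then UTF-8 bytes.
    // Note: getBytes allocates, so this sketch is not zero garbage.
    static void writeStringField(ByteBuffer buf, int fieldNumber, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeVarint(buf, ((long) fieldNumber << 3) | 2);
        writeVarint(buf, utf8.length);
        buf.put(utf8);
    }

    // Direct deserialization: one serial pass, each field dispatched to the visitor.
    static void traverse(ByteBuffer buf, FieldVisitor visitor) {
        while (buf.hasRemaining()) {
            long tag = readVarint(buf);
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 0x7);
            switch (wireType) {
                case 0: {
                    visitor.onVarint(fieldNumber, readVarint(buf));
                    break;
                }
                case 2: {
                    int length = (int) readVarint(buf);
                    visitor.onBytes(fieldNumber, buf, buf.position(), length);
                    buf.position(buf.position() + length);
                    break;
                }
                default:
                    throw new IllegalArgumentException("wire type not handled in this sketch: " + wireType);
            }
        }
    }
}
```

A caller would fill a buffer with `writeVarintField` and `writeStringField`, call `flip()`, and pass it to `traverse` with a visitor that reads only the fields it cares about. Because the visitor receives an offset and length into the backing buffer, it can decide for itself whether to copy the bytes.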

Cons

  • Complexity

    • Xbuf2 stores field data off-heap. This can add complexity in the following areas:

      • Troubleshooting issues

      • Monitoring memory utilization

      • Performing capacity planning particularly related to memory utilization.

    • Xbuf2 generated classes are larger

    • Working with pooling can result in higher development complexity

Known Limitations

  • Does not support the following field types

    • UUID

    • UUID[]

    • Currency

    • Currency[]

API Differences

For the most part code generated for the different encoding types behaves the same, but there are some key differences that stem from both the underlying serialization mechanisms and features supported.

Unrecognized Field Values

  1. For Json encoding, unrecognized enum array values are treated as null. For non-array fields, an unrecognized enum value is treated as null and hasXXX will return true.

  2. For Protobuf and Xbuf2, unrecognized fields (those with unrecognized field tags) are preserved when an inbound message is written to a transaction log (although they are inaccessible). If the message is copied by serializing to bytes and deserializing into a new message instance, the unrecognized fields from the original message are sent on the wire. If the message is modified prior to sending, the unrecognized fields may be lost.

  3. For repeated enum fields in Protobuf, unrecognized enum values are ignored. For Protobuf encoding the underlying protobuf may reorder the unrecognized enum values and put them at the end. Xbuf2 generated code preserves the order of unrecognized enums. When deserializing from Json, unrecognized enum values are treated as null so the effect on a deserialized message or entity is the same as adding an enum array with null values (see below).

Null Value Handling

  1. Classes generated with Json encoding support serializing null values and null values in arrays.

  2. For Protobuf and Xbuf2, setting a null value for a String, Date, Enum or Embedded Entity Field results in the field being cleared (the Google Protobuf wire format doesn't support null values on the wire).

  3. For Protobuf, setting a Date[], String[], or Enum[] containing a null element results in a NullPointerException being thrown. For Xbuf2, the behavior is the same as with Entity[] i.e. the null values are ignored.

  4. For Protobuf and Xbuf2, setting an Entity[] with a null element results in the null value(s) being ignored during serialization. The same holds true when using the XIterator setters or when calling addXXX to add the set of values.

  5. For Protobuf, after setting null values in an array field, subsequently calling the getter MAY or MAY NOT result in the null values being returned. Applications are encouraged to use the getXXXIterator accessors, and should be coded to handle either case for maximum portability, both between encodings and for cases where the null values have been filtered out by serialization. For Xbuf2, a subsequent call to get or iterate over array elements after setting a null element will NOT return the null element.
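One way to write application code that behaves the same under all three encodings is to strip null elements before calling an array setter and to skip nulls when reading. The sketch below shows both helpers. `NullTolerantArrays` is a hypothetical utility class, and `java.util.Iterator` stands in for the iterator returned by the generated getXXXIterator accessors, whose actual type is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Hypothetical helper for illustration; not part of the ADM generated API.
final class NullTolerantArrays {

    // Drops null elements before the array is passed to a generated setter,
    // so the call site is also safe for encodings that reject null elements.
    static String[] withoutNulls(String[] values) {
        if (values == null) {
            return null; // a null array is left for the setter to treat as "clear"
        }
        return Arrays.stream(values)
                .filter(Objects::nonNull)
                .toArray(String[]::new);
    }

    // Reads all elements and skips nulls, so the result is the same whether
    // or not the encoding has already filtered the null elements out.
    static <T> List<T> collectNonNull(Iterator<T> elements) {
        List<T> result = new ArrayList<>();
        while (elements.hasNext()) {
            T element = elements.next();
            if (element != null) {
                result.add(element);
            }
        }
        return result;
    }
}
```

With helpers like these, the application never depends on whether a given encoding returns, drops, or rejects null elements.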

Pooling Considerations

A major difference between Xbuf2 and Protobuf or Json encoded entities is that Xbuf2 messages and entities are pooled by the platform by default. From a coding standpoint this means that when working with Xbuf2 encoded messages or entities:

  1. An application may not hold onto an Xbuf2 encoded message beyond the scope of a message handler.

  2. An application may not hold onto an XString or embedded entity type from a message beyond the scope of a message handler because these objects are pooled along with the message and will be reset once the message is returned to its pool. See Zero Garbage Nested Entities for detailed usage, but the general rule of thumb is to copy any entity that needs to be retained in the microservice store, or to use the more advanced 'take' apis. Note that 'take' is not supported for String fields as string fields are not pooled for Xbuf2 messages.

  3. Setting an XString or embedded entity field on a message transfers ownership to the message. If the application wants to retain the entity in the microservice store, then it should copy it into a new entity or use the more advanced 'lend' apis. See Zero Garbage Nested Entities for more details.

  4. An application may not mutate an array returned from a message and should not hold onto the array beyond the duration of a message handler. See Zero Garbage Array Accessors for more details.
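The copy-before-retain rule of thumb can be illustrated with a small self-contained sketch. `PooledOrder` is a hypothetical stand-in for a pooled message whose fields are reset when it returns to its pool, and `OrderSnapshot` is an application-owned copy. None of these classes are ADM generated code, and the 'take' and 'lend' APIs mentioned above are not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a pooled message; not an ADM generated class. */
final class PooledOrder {
    final StringBuilder symbol = new StringBuilder(); // reused across messages
    long quantity;

    // Simulates what happens when the platform returns the message to its pool.
    void reset() {
        symbol.setLength(0);
        quantity = 0;
    }
}

/** Application-owned copy that is safe to keep after the handler returns. */
final class OrderSnapshot {
    final String symbol;
    final long quantity;

    OrderSnapshot(PooledOrder source) {
        this.symbol = source.symbol.toString(); // copies the characters
        this.quantity = source.quantity;
    }
}

final class PoolingSketch {
    final List<OrderSnapshot> retained = new ArrayList<>();

    // Message handler: copies what it needs and does not keep the message itself.
    void onOrder(PooledOrder message) {
        retained.add(new OrderSnapshot(message));
    }

    public static void main(String[] args) {
        PoolingSketch app = new PoolingSketch();
        PooledOrder message = new PooledOrder();
        message.symbol.append("EURUSD");
        message.quantity = 100;

        app.onOrder(message);
        message.reset(); // the pooled instance is recycled after the handler

        OrderSnapshot kept = app.retained.get(0);
        System.out.println(kept.symbol + " " + kept.quantity);
    }
}
```

Had the handler stored the `PooledOrder` reference instead of a snapshot, the retained object would have been emptied by `reset()` and later overwritten by an unrelated message.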
