X-Trace Specification Document: X-Trace Metadata Format

Authors:Rodrigo Fonseca, George Porter
Draft-Created:18-Jul-2006
Last-Edited:07-Apr-2008
Specification Version:
 2.1.1
Revision:svn 53

Note

This version of this specification document is 2.1.1, and it specifies the X-Trace metadata version 1. The X-Trace metadata version is the version number that is encoded in the metadata itself, and is independent of the version (2.1.1) of this document.

Summary

This document describes the X-Trace metadata format version 1, and how it is encoded in both binary and textual form. It also describes the mandatory fields, and the format of optional fields. Separate documents describe individual options, including how they are represented, propagated, and how they relate to X-Trace reports.

1. Introduction

X-Trace [1] is a framework for tracing the execution of distributed systems. It provides a logging-like API for the programmer, allowing the recording of events during the execution. X-Trace records the causal relations among these events in a deterministic fashion, across threads, software layers, different machines, and potentially different network layers and administrative domains. X-Trace groups events, which we also call operations, to tasks. These are sets of causally related events with a definite start. Events in a task form a directed acyclic graph (DAG). Each task is given a unique task identifier, and each event within a task is given an identifier which is unique within the task.

X-Trace tracks the causal relation among events by propagating a constant-sized metadata along with the execution, both in messages exchanged by the different components and within the components themselves. The metadata contains the task identifier of the current task, and the last event recorded in the current execution sequence. Each event is reported to a separate reporting infrastructure, and each report records the identifier of the current event, as well as the identifier(s) of the event(s) that caused the current event. The reports contain other information in the form of key/value pairs. The format of these reports is described in the accompanying specification [2].

The X-Trace metadata has two mandatory components, and a set of optional ones, described in their own specifications. The first mandatory component is a Task Identifier, or TaskId, which uniquely identifies a task (in the context of a reporting infrastructure and during a finite window of time). The TaskId is the same across all operations comprising a task, and serves the purpose of identifying all such operations as belonging to the task. The second component is called the Operation Identifier, or OpID, and should be unique for each operation or event within the task. The OpID identifies the operation that generated the metadata, and is used by the subsequent operation(s) to record the causal relationship between the two operations by means of reporting.

There are also optional fields, which may or may not be present, and enhance the functionality of the X-Trace framework. At the time of this writing, there are three types of options defined, but more can be added in the future. They are the Destination option, for specifying where to send reports if not to a default location; the ChainId option, to explicitly indicate concurrent chains of events; and the Severity option, that indicates how important it is to record events related to a particular task.

This document specifies in detail the layout and meaning of the X-Trace Metadata how it is represented in binary and textual form, and how it is propagated in the same and between adjacent layers. It does not discuss reporting in detail, nor how to implement the propagation. It is also beyond the scope of this document how different protocols include X-Trace Metadata associated to their respective protocol data units.

2. X-Trace Metadata Binary Representation

The X-Trace Metadata (XTR-Md) is represented in a compact wire format by the following byte layout. This layout is used when the XTR-Md is encoded in binary protocols, as is the case of the encodings of XTR-Md in IP options and in TCP options. A hexadecimal textual representation of this binary layout is also used when XTR-Md is encoded in text-based protocols, as is the case of HTTP and SIP, for example.

Flags: 1 byte
TaskId: 4, 8, 12, or 20 bytes
OpId: 4 or 8 bytes
Options Block (optional):
OptionsLen: 1 byte
Options: OptionsLen bytes

The only mandatory fields are flags, TaskId, and OpId. The other fields are optional.

2.1 Flags

There are 8 bits (1 byte) allocated for flags. The bits are presented here with 7 being the most significant bit and 0 the least significant bit.

7 6 5 4 3 2 1 0
Version OpIdLen Options IdLen
Bit Name Description
0-1 IdLen Encodes the length of the TaskId
2 Options Whether there are option fields present
3 OpIdLen If 0, OpIdLen = 4, if 1, OpIdLen = 8
4-7 Version Encodes the version of the X-Trace Metadata

Bits 0 and 1 encode the length of the TaskId as follows:

IdLen Mask Bytes for TaskId
1 0
0 0 0x00 4
0 1 0x01 8
1 0 0x02 12
1 1 0x03 20

Bits 4 through 7 encode the version of the metadata as an integer, with 4 being the least significant bit. The value 15 (bits 4 through 7 set to 1) is reserved for future expansion of the version number, if necessary. The current version id represented by this spec is 1. It is backwards compatible with version 0. The difference between the two is that version 0 has bit 3 of the flags set to 0, and only allows OpIds of length 4.

Version Mask Version
7 6 5 4
0 0 0 0 0x00 0
0 0 0 1 0x10 1
0 0 1 0 0x20 2
...
1 1 1 1 0xF0 reserved

2.2 TaskId

The TaskId identifies a Task, or a higher level operation that generally corresponds to a user-initiated task. It can be composed of several operations or events that span different software components, nodes, and layers. As noted above, it can be 4, 8, 12, or 20 bytes long, with the length being selected by the IdLen bits in the flags field. The TaskId MUST be the same across all operations comprinsing a task, or else the operations will not be associated with the same task.

The set of TaskIds comprised of all 0's is reserved to mean INVALID TaskIds. An X-Trace Metadata with an INVALID TaskId is invalid, and MUST not be propagated or generate reports.

2.3 OpId

The OpId field identifies each operation or event that is part of a task and needs to be distinguished from the point of view of the X-Trace framework. It is a 4 or 8 bytes in length, depending on the setting of the flags bit 3. The value is interpreted as an opaque string of bytes, not as a number, and needs to be unique unique within the context of a task.

If the OpId length is 4 bytes, the version can be set to 0 or 1. The table below specifies how implementations of versions 0 and 1 of the X-Trace metadata specification treat the different settings of the OpId length field.

Version OpIdLen OpId
Version 0
Impl.
Version 1
Impl.
0 0 4 bytes ok ok
0 1 INVALID METADATA fail fail
1 0 4 bytes fail ok
1 1 8 bytes fail ok

If ChainsIds are being used as options to capture the concurrency structure of a task, then the OpId needs to be unique only within the context of a ChainId.

2.4 Metadata Length

Given these three mandatory fields, the smallest well-formed X-Trace metadata is 9 bytes long, comprising the flags field, a 4-byte TaskId, and a 4-byte OpId. As two examples, in hex representation, a well-formed and valid X-Trace metadata can be 00 01020304 03030303 (with spaces added between the fields for clarity). The smallest well-formed, invalid X-Trace metadata is 00 00000000 00000000. Note that if the OpId length is set to 4, the settings of version to 0 or 1 are both valid.

The maximum size is 1 + 20 + 8 + 1 + 255 = 285 bytes, but so far we have seen very little use of options, and no long options have been defined.

2.4 Optional / Extended fields

The option bit in the Flags field indicates the presence of optional or extension fields in the metadata.

2.4.1 Options Length

To make determining the size of the XTR-Md easier, there is a Length field that contains the length of all options combined, in bytes. This length does not include the length field itself. Thus, for determining the total length of an X-Trace metadata with options, one would add:

1 (flags) + (length of TaskId) + (length OpId) + 1 (OptionsLen byte) + OptionsLen.

If present, the options length field MUST NOT be set to 0. If there are no options, the O bit in the Flags field MUST be set to 0.

2.3.1 Option Format

Following the Options Length field, there may be one or more options. Options have a Type field, represented by one byte.

If the type is 0, it is a no-option type. The option of type 0 has a total length of 1 byte, and no body. Option type 0 MAY be used for padding, when it is more efficient or not possible to include arbitrary-length byte fields in protocols. It MUST only occur at the end of the options list. Implementations MUST NOT try to parse options past the first type 0 option.

No-option option format
Type: 1 byte (0)

If the option type is not 0, i.e., between 1 and 255, then the option is a normal option, and follows the option format below:

Normal option format
Type: 1 byte (1-255)
Length: 1 byte (0-253)*
Payload: Length bytes

The length only includes the size of the payload. Because of the total length of all options being limited to 255 bytes, the maximum length of each option can be at most 253 bytes (because of the type and length fields of the option itself. It there are more than one options, then the maximum length of each will be correspondingly smaller.

3. X-Trace Metadata Text Representation

For text-based protocols (ASCII, Unicode), like HTTP, SIP, or email, for example, X-Trace Metadata is represented as a the hexadecimal string of the successive bytes in the binary representation. The characters A to F SHOULD be represented in CAPITAL LETTERS, but implementations SHOULD be tolerant to non-capital letters. Each byte MUST be 0-padded and thus represented by two characters each.

4. Propagation

For the propagation of the X-Trace metadata, a node implementation will generate a unique OpId and replace the OpId of the previous operation(s) that happened before the current one. The TaskId MUST remain the same. So that the graph remains connected, every time a new OpId replaces a preceeding one, this binding MUST be registered in a report.

How specific options are propagated will depend on the semantics of each option. How each option is propagated is part of the option's specification. However, if an implementation doesn't know how to propagate a specific option, it MUST copy the option to any new metadata it generates based on the current one.

Figure 1 below shows an example of a simple HTTP transaction through a proxy that propagates X-Trace Metadata across layers and operations of the task. In the figure, (a) represents an OpId, and <T,a> represents metadata with TaskId T and OpId a.

   +--------+                                      +--------+
(a)| HTTP   |           <T,a>                   (b)|  HTTP  |    <T,b>
   | Client | ...................................> | Proxy  | ............>
   +--------+                                      +--------+
       |                                            ^     |
       | <T,a>                               <T,d>  |     |     <T,b>
       \/                                           |     \/  (e)
    +--------+         <T,c>             (d)+--------+   +--------+ <T,e>
 (c)|  TCP   |.............................>|  TCP   |   |  TCP   |........>
    +--------+                              +--------+   +--------+
        |                                       ^           |
        |                                       |           |  <T,e>
        \/  (f)                (g)              |  (h)      \/   (i)
     +--------+  <T,f>   +--------+ <T,g>   +--------+    +--------+ <T,i>
     |  IP    | -------->|  IP    |-------->|  IP    |    |  IP    |------->
     +--------+          +--------+         +--------+    +--------+

                               Figure 1.

5. Author's Address

Rodrigo Fonseca
465 Soda Hall
Berkeley, CA 94720-1776
USA

email - rfonseca at cs dot berkeley dot edu

George Porter
465 Soda Hall
Berkeley, CA 94720-1776
USA

email - gporter at cs dot berkeley dot edu

Citations

[1]Rodrigo Fonseca, George Manning Porter, Randy H. Katz, Scott Shenker and Ion Stoica, "X-Trace: A Pervasive Network Tracing Framework". In Proceedings of the 4th USENIX Symposium on Networked Systems Design and Implementation (NSDI'07). Cambridge, MA, USA, April 2007.
[2]X-Trace Specification Document - Report Format. http://www.x-trace.net

Appendix: Change Log

Changes marked with a '*' mean changes that have implementation implications. Otherwise changes just refer to the document (fixes and clarifications). The versioning reflects this: minor numbers will change with at least one '*' change, e.g., from 1.2.x to 1.3.0.

  • 2.1.1 - minor changes and fixes

  • 2.1.0 - * upgraded the Metadata version to 1, backwards compatible with version 0.

    This update introduces variable length OpId field (4 and 8 bytes).

  • 2.0.0 - major revision of the X-Trace metadata format, simplifying the metadata contents and propagation.

  • 1.3.1 - added change log. fixed section numbering

  • 1.3.0 - ! added a length byte to the destination field

  • 1.2.1 - fixed typo in the mask for the task id length of 20 bytes: was 0x0C, should be 0xC0

  • 1.2.0

    • ! added invalid XTR id
    • updated the description example to have 4-byte tree info operation ids
    • Added sentence to cover propagation operations on metadata with no tree info.
    • fixed typo and clarified IdLen flags
    • added reference to NSDI paper