From fd.io

CPEL Performance Event Log Files

This page describes the CPEL file format. Over the last decade, we've constructed a set of tools to collect, display and report fine-grained performance event data. As the toolset has grown, limitations inherent in previous file formats have become an issue.

CPEL files consist of a set of sections, similar to ELF files. Specific tools may or may not understand a particular section. Sections are TLVs: (tag, length, value) tuples. Tools can therefore skip or copy sections without needing to know anything about the data therein.

File Format

Octet offset   Data
0x0            Endian bit (0x80), File Version, 7 bits (0x1...0x7F)
0x1            Unused, 8 bits
0x2-0x3        Number of sections (16 bits) (NSECTIONS)
0x4-0x7        File date (32 bits) (POSIX "epoch" format)
0x8-0xB        Type of first section (32 bits)
0xC-0xF        Length of first section (32 bits), not including type and length
0x10           First section data
               NSECTIONS-1, or up to 64K-2 additional sections
...            Type of next section (32 bits)
...            Length of next section (32 bits)
...            Next section data

File format notes

  1. When set, the endian bit (0x80 of the first octet of the file) means that data throughout the file is in little endian byte order. Tools SHOULD support little-endian data but may not. Tools MUST either support little-endian data or announce that they do not.
  2. The [historical] Gist event-log format is easily distinguished from this format. Gist event-logs are big-endian; the first 4 octets of the file contain the number of events in the file. Therefore, Gist event-logs containing fewer than 2**24 [~16 million] events will always appear to have a CPEL version of zero. Due to TFTP-server limitations, one strongly suspects that all extant Gist event-logs contain fewer than 2**24 events.
  3. Sections may appear in any order.
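
The layout and notes above can be sketched in C as follows. This is a minimal illustration, not code from any existing tool: the cpel_* names and the cpel_file_header struct are hypothetical, and the whole file is assumed to be in memory already. A version of zero is reported as an error, since per note 2 it suggests a Gist event-log.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded form of the 8-octet file header (hypothetical names). */
struct cpel_file_header {
    int little_endian;   /* nonzero if the 0x80 endian bit is set */
    unsigned version;    /* low 7 bits of octet 0 */
    unsigned nsections;  /* 16-bit section count at offset 0x2 */
    uint32_t file_date;  /* POSIX epoch value at offset 0x4 */
};

/* Read a 16-bit field in the byte order the file declares. */
static uint32_t cpel_get16(const uint8_t *p, int le)
{
    return le ? (uint32_t)p[0] | ((uint32_t)p[1] << 8)
              : ((uint32_t)p[0] << 8) | (uint32_t)p[1];
}

/* Read a 32-bit field in the byte order the file declares. */
static uint32_t cpel_get32(const uint8_t *p, int le)
{
    return le ? cpel_get16(p, 1) | (cpel_get16(p + 2, 1) << 16)
              : (cpel_get16(p, 0) << 16) | cpel_get16(p + 2, 0);
}

/* Returns 0 on success, -1 if the buffer is too short or the version is 0. */
int cpel_parse_header(const uint8_t *buf, size_t len,
                      struct cpel_file_header *h)
{
    if (len < 8)
        return -1;
    h->little_endian = (buf[0] & 0x80) != 0;
    h->version = buf[0] & 0x7F;
    if (h->version == 0)
        return -1;                /* probably a historical Gist event-log */
    h->nsections = cpel_get16(buf + 2, h->little_endian);
    h->file_date = cpel_get32(buf + 4, h->little_endian);
    return 0;
}

/* Walk the TLV sections and return a pointer to the data of the first
 * section of the wanted type, or NULL if absent or the file is truncated.
 * Sections of other types are skipped without being interpreted. */
const uint8_t *cpel_find_section(const uint8_t *buf, size_t len,
                                 const struct cpel_file_header *h,
                                 uint32_t wanted_type, uint32_t *data_len)
{
    size_t off = 8;               /* first section follows the file header */
    if (len < 8)
        return NULL;
    for (unsigned i = 0; i < h->nsections; i++) {
        if (len - off < 8)        /* no room for type + length */
            return NULL;
        uint32_t type = cpel_get32(buf + off, h->little_endian);
        uint32_t slen = cpel_get32(buf + off + 4, h->little_endian);
        off += 8;
        if (slen > len - off)     /* section data runs past the buffer */
            return NULL;
        if (type == wanted_type) {
            *data_len = slen;
            return buf + off;
        }
        off += slen;
    }
    return NULL;
}
```

Because sections may appear in any order, the walk makes no assumption about which type comes first; it relies only on the length field to step from one section to the next.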

Section Formats

CPEL files are intended to stand alone, meaning that one should not need to preserve event definition header files, remember CPU clock rates, and so forth to use the data later. In addition to beefing up the event section type, we define additional section types to enable additional functions in the G2 graphical event viewer.

String Table Section - Section Type 1

String table sections concatenate NULL-terminated strings and (possibly) arbitrary binary data. The first string in a string table is the table's name. Event definition and event sections refer to strings in a string section by offset from the base of the section.

String table sections MUST be padded to a 4-octet boundary by the addition of 1-3 NULL bytes, as needed.
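
As a rough sketch of the two rules above (offset-based lookup and 4-octet padding), assuming the section data is already in memory: the function names are hypothetical, and the reading that an already-aligned table needs no extra padding is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round a string-table length up to the next 4-octet boundary.
 * Assumption: a length that is already a multiple of 4 is left alone. */
size_t cpel_padded_length(size_t unpadded)
{
    return (unpadded + 3) & ~(size_t)3;
}

/* Return the NULL-terminated string at `offset` from the section base,
 * or NULL if the offset is out of range or no terminator is found
 * before the end of the section. */
const char *cpel_string_at(const uint8_t *section, size_t section_len,
                           size_t offset)
{
    if (offset >= section_len)
        return NULL;
    if (memchr(section + offset, '\0', section_len - offset) == NULL)
        return NULL;
    return (const char *)(section + offset);
}
```

Offset 0 always yields the table's name, since that is the first string in the section.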

Tools, particularly visualization tools such as the g2 event viewer, MUST support multiple string sections. Tools MAY choose to combine string tables.

Symbol Table Section - Section Type 2

Symbol table sections allow tools to translate arbitrary hexadecimal values into function+offset format. Symbol table sections MUST be numerically sorted by the value field. The section data portion of a symbol table section has the following format:

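 struct symbol_section_header {
     char string_table_name[64];          /* string table that holds the symbol names */
     unsigned long number_of_symbols;     /* number of entries in this section */
 };

 struct symbol_section_entry {
     unsigned long value;                        /* sort key */
     unsigned long name_offset_in_string_table;  /* offset from the base of the string section */
 };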

See the next section for a usage description.
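
Because entries are sorted by value, a lookup can use binary search. The sketch below makes several assumptions that the text does not confirm: the on-disk "unsigned long" fields are taken to be 32 bits wide, "nearest match" is read as the last symbol whose value does not exceed the address, the entries have already been converted to host byte order, and the output form is "name+0xoffset". All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Host-order copy of a symbol_section_entry (32-bit fields assumed). */
struct cpel_symbol {
    uint32_t value;
    uint32_t name_offset_in_string_table;
};

/* Binary search for the last entry with value <= addr; NULL if none. */
const struct cpel_symbol *cpel_nearest_symbol(const struct cpel_symbol *syms,
                                              size_t n, uint32_t addr)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].value <= addr)
            lo = mid + 1;         /* candidate is at mid or to its right */
        else
            hi = mid;
    }
    return lo == 0 ? NULL : &syms[lo - 1];
}

/* Write "name+0xoffset" into buf, or plain hex if no symbol applies.
 * Returns the snprintf result. */
int cpel_format_symbol(char *buf, size_t buflen,
                       const struct cpel_symbol *syms, size_t n,
                       const char *strtab, uint32_t addr)
{
    const struct cpel_symbol *s = cpel_nearest_symbol(syms, n, addr);
    if (s == NULL)
        return snprintf(buf, buflen, "0x%x", (unsigned)addr);
    return snprintf(buf, buflen, "%s+0x%x",
                    strtab + s->name_offset_in_string_table,
                    (unsigned)(addr - s->value));
}
```

The sort requirement is what makes this O(log n) per lookup, which matters when a tool resolves one symbol per event.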

Event Definition Section - Section Type 3

Event definition sections provide display and formatting information for events found in event sections. The section data portion of an event definition section has the following format:

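 struct event_definition_section_header {
     char string_table_name[64];                 /* string table that holds the format strings */
     unsigned long number_of_event_definitions;  /* number of entries in this section */
 };

 struct event_definition_entry {
     unsigned long event_code;
     unsigned long event_format_offset_in_string_table;  /* zero: treated as "E%d" */
     unsigned long datum_format_offset_in_string_table;  /* zero: prints nothing */
 };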

These definitions generalize the stylized header-file definition scheme a bit by completely decoupling event formatting from datum formatting.

String sections enable visualization and reporting tools to neatly represent text events [e.g. syslog strings]. Symbol table sections function likewise when reporting hexadecimal data in routine+offset format.

Format String Interpretation

Visualization tools SHOULD interpret the following printf-like (sub)strings as described:

Format String          Interpretation
%s                     String -- insert the indicated string-table entry
%k                     Symbol -- find nearest match in symbol table, output routine+offset format
%d, %o, %u, %x, etc.   Standard printf formats

Specifying an event_format_offset_in_string_table of zero is equivalent to specifying an event format of E%d. Specifying a datum_format_offset_in_string_table of zero is equivalent to specifying a datum format of [NULL string, prints nothing]. The former behavior is less useful than the latter, which one often uses in practice.

Visualization tools SHOULD interpret missing event definitions as equivalent to specifying the definition [code, StringTableOffset(E%d), 0].
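
The defaulting rules above might be implemented along these lines. This is a sketch under assumptions: fields are taken to be 32 bits wide and already in host byte order, the cpel_* names are hypothetical, and the helper passes the event code to the event format, which suits "E%d" but is only a guess for other formats. A real tool would also want to validate file-supplied format strings before handing them to snprintf.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Host-order copy of an event_definition_entry (32-bit fields assumed). */
struct cpel_event_def {
    uint32_t event_code;
    uint32_t event_format_offset_in_string_table;
    uint32_t datum_format_offset_in_string_table;
};

/* Linear search for a definition; NULL if the code has none. */
const struct cpel_event_def *cpel_find_def(const struct cpel_event_def *defs,
                                           size_t n, uint32_t code)
{
    for (size_t i = 0; i < n; i++)
        if (defs[i].event_code == code)
            return &defs[i];
    return NULL;
}

/* Missing definition or offset zero: fall back to "E%d". */
const char *cpel_event_format(const struct cpel_event_def *def,
                              const char *strtab)
{
    if (def == NULL || def->event_format_offset_in_string_table == 0)
        return "E%d";
    return strtab + def->event_format_offset_in_string_table;
}

/* Missing definition or offset zero: empty format, prints nothing. */
const char *cpel_datum_format(const struct cpel_event_def *def,
                              const char *strtab)
{
    if (def == NULL || def->datum_format_offset_in_string_table == 0)
        return "";
    return strtab + def->datum_format_offset_in_string_table;
}

/* Render the event label. Extra printf arguments are ignored when the
 * format has no conversions, so a plain name works as well. */
int cpel_event_label(char *buf, size_t buflen,
                     const struct cpel_event_def *def,
                     const char *strtab, uint32_t code)
{
    return snprintf(buf, buflen, cpel_event_format(def, strtab), (int)code);
}
```

Offset zero can double as "no format" because the string at offset zero is always the table's name, never a format string.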

Visualization tools MAY implement arbitrary extended formatting functions. Examples might include symbolic display of protocol PDUs, IP addresses, and arbitrary data structures. Data which occupies more than 4 octets must be added to an event section's associated string table, and referred to in individual events by string section offset.

Embedding arbitrary binary data in a string section should not cause confusion.

This document is not the right venue for defining a meta-language to specify arbitrary bit-fiddling, mapping values to strings, etc. Reporting and visualization tools deal with millions of events, so tool performance matters. One expects that a useful set of extended formatting functions will evolve.

Tools MAY interpret multiple printf-like format strings in the following manner:

sprintf(buf, "0x%08x(%d)", event->event_datum, event->event_datum);

Track Definition Section - Section Type 4

Track definition sections provide display and formatting information for track data found in event sections. Visualization tools such as the g2 viewer display events grouped by track. The idea is to provide a means to label per-track displays with something more interesting than the numeric track-id.

For example, if a given CPEL file comprises event data where the track field is the process-id of a specific process, one might choose to display the name of the process in addition to its numeric PID.

struct track_definition_section_header {
    char string_table_name[64];                 /* string table that holds the track labels */
    unsigned long number_of_track_definitions;  /* number of entries in this section */
};

struct track_definition {
    unsigned long track_id;
    unsigned long track_format_offset_in_string_table;
};
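
To illustrate the process-name example, a viewer might build a track label as sketched below. The assumptions here are not confirmed by the text: fields are taken to be 32 bits wide and in host byte order, the string at track_format_offset_in_string_table is treated as a plain label rather than a printf-like format, and the "name (id)" output form and the cpel_* names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Host-order copy of a track_definition (32-bit fields assumed). */
struct cpel_track_def {
    uint32_t track_id;
    uint32_t track_format_offset_in_string_table;
};

/* Write "label (id)" if the track has a definition, else just the
 * numeric track-id. Returns the snprintf result. */
int cpel_track_label(char *buf, size_t buflen,
                     const struct cpel_track_def *defs, size_t ndefs,
                     const char *strtab, uint32_t track_id)
{
    for (size_t i = 0; i < ndefs; i++) {
        if (defs[i].track_id == track_id)
            return snprintf(buf, buflen, "%s (%u)",
                            strtab + defs[i].track_format_offset_in_string_table,
                            (unsigned)track_id);
    }
    return snprintf(buf, buflen, "%u", (unsigned)track_id);
}
```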

Event Section - Section Type 5

Event sections contain fine-grained performance event data, in the following format:

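struct event_section_header {
    char string_table_name[64];                 /* string table referenced by these events */
    unsigned long number_of_events;             /* number of event_entry records */
    unsigned long clock_ticks_per_microsecond;  /* timestamp clock rate */
};

struct event_entry {
    unsigned long time[2];      /* timestamp, in clock ticks */
    unsigned long track;
    unsigned long event_code;
    unsigned long event_datum;
};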

The event_entry format is identical to the format understood by our current-generation reporting and visualization tools. In cooperation with reporting and visualization tools, string table references allow (track, event_code, event_datum) tuples to specify three arbitrary data items, each of arbitrary size.
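
Since the clock rate travels with the data, a tool can convert timestamps without outside knowledge. The following sketch rests on assumptions the text does not state: each "unsigned long" is a 32-bit field, time[0] holds the high 32 bits of a 64-bit tick count and time[1] the low 32 bits, and the fields have already been converted to host byte order. The cpel_* names and size constants are hypothetical and follow from the 32-bit assumption.

```c
#include <stdint.h>

/* Host-order copy of an event_entry (32-bit fields assumed). */
struct cpel_event {
    uint32_t time[2];
    uint32_t track;
    uint32_t event_code;
    uint32_t event_datum;
};

#define CPEL_EVENT_HEADER_SIZE 72u  /* 64-octet name + two 32-bit fields */
#define CPEL_EVENT_ENTRY_SIZE  20u  /* five 32-bit fields */

/* Combine the two timestamp words (assumed high word first). */
uint64_t cpel_event_ticks(const struct cpel_event *e)
{
    return ((uint64_t)e->time[0] << 32) | e->time[1];
}

/* Convert a timestamp to microseconds; returns 0.0 for a zero rate. */
double cpel_event_microseconds(const struct cpel_event *e,
                               uint32_t ticks_per_us)
{
    if (ticks_per_us == 0)
        return 0.0;
    return (double)cpel_event_ticks(e) / (double)ticks_per_us;
}

/* Check that the declared event count fits in the section length. */
int cpel_event_count_fits(uint32_t section_len, uint32_t number_of_events)
{
    if (section_len < CPEL_EVENT_HEADER_SIZE)
        return 0;
    return number_of_events <=
           (section_len - CPEL_EVENT_HEADER_SIZE) / CPEL_EVENT_ENTRY_SIZE;
}
```

Checking the count against the section length before indexing events keeps a truncated or corrupt file from causing out-of-bounds reads.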

Dbarach (talk) 18:10, 12 February 2016 (UTC)