Note: Logging

Nodes will encounter errors. It will be useful to have a common method of recovering information about those errors. This is generally called “logging”, and it consists of methods for informing interested parties that log information is available, methods for retrieving that information, and methods for deleting it from the log.

This note is an early draft of a proposal in this area.

Use Cases

A node encounters a transient error and stores information about it. A central logging program is informed of that, and later retrieves the information.

Several logging programs can be active and retrieve all log information available.

A logging program is only interested in information from a subset of nodes, and retrieves log entries only from those nodes.

Logging programs can come and go.

Discussion

Four parts: Notification, retrieval, identification, buffer management. Two actors: Logger and reader.

Events provide a convenient way of globally notifying readers that new entries are available. We define a “well-known EventID” consisting of the NodeID followed by 0xFFF8 to indicate that a node has logged a new entry.
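As a minimal sketch of that well-known EventID, the following builds and recognizes it. It assumes a 6-byte NodeID and big-endian byte order, neither of which is stated in this note; the function names are hypothetical.

```python
LOG_EVENT_SUFFIX = (0xFFF8).to_bytes(2, "big")  # suffix taken from the text above


def log_event_id(node_id: bytes) -> bytes:
    """Return the 'new log entry' EventID for a node: NodeID followed by 0xFFF8."""
    if len(node_id) != 6:  # assumption: NodeIDs are 6 bytes long
        raise ValueError("expected a 6-byte NodeID")
    return node_id + LOG_EVENT_SUFFIX


def node_from_log_event(event_id: bytes):
    """Return the NodeID if event_id is a log notification, else None."""
    if len(event_id) == 8 and event_id[6:] == LOG_EVENT_SUFFIX:
        return event_id[:6]
    return None
```

A reader interested in only a subset of nodes could compare the returned NodeID against its own list and ignore the rest.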

There is no way to get a “list of entries not read yet”, because the logging node can't keep track of that for an arbitrary number of reader nodes.

Instead, we provide a way to retrieve the index of the oldest and newest entries still available. The reader then keeps track of which ones have been read.
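The division of labour described above can be sketched as follows. The logger only exposes the oldest and newest entry numbers it still holds, and each reader remembers the last number it has read. The class and method names are hypothetical, and the sketch ignores wrap-around of the 16-bit entry number.

```python
from collections import deque


class LogBuffer:
    """Logger side: a bounded buffer that knows only its oldest and newest numbers."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)  # (number, text); oldest entries fall off
        self.next_number = 0

    def record(self, text):
        number = self.next_number
        self.entries.append((number, text))
        self.next_number += 1
        return number

    def bounds(self):
        """Return (oldest, newest) entry numbers still held, or None if empty."""
        if not self.entries:
            return None
        return self.entries[0][0], self.entries[-1][0]

    def get(self, number):
        for n, text in self.entries:
            if n == number:
                return text
        return None


class Reader:
    """Reader side: tracks its own position, so the logger keeps no per-reader state."""

    def __init__(self):
        self.last_read = None

    def catch_up(self, buffer):
        bounds = buffer.bounds()
        if bounds is None:
            return []
        oldest, newest = bounds
        # Start at the oldest available entry, or just after the last one we read.
        start = oldest if self.last_read is None else max(oldest, self.last_read + 1)
        texts = []
        for number in range(start, newest + 1):
            texts.append(buffer.get(number))
            self.last_read = number
        return texts
```

Because all bookkeeping lives in `Reader`, any number of readers can come and go without the logging node noticing.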

A log can contain as many entries as the node has space for, limited only by the 16-bit field that we provide for entry numbers.

“A node does not have to provide logging. If it does have some logging mechanism, it doesn't have to use this protocol. But it may only be labelled as providing this common protocol for log access if it provides it completely.”

This protocol retrieves short, ASCII coded strings that represent each log entry. The detailed format of log entries is not discussed here.

Base Protocol Proposal

Notification

Retrieval

Datagram content:

The aim is to fit the “Log Request” into a single-frame datagram, although the “Log Reply” won't fit in one frame.

Log Reply always carries the number of this entry, and the highest available number. The get oldest/get youngest bits are also set.

Use a Log Request to set up a stream: Log Request for stream, reply carries a Stream ID for write to the node, or initiates the stream for readback.

CAN protocol proposal

Basically the same.

Examples

Initial Occurrence, Single Retrieval

Node 53 records event in log buffer with number e.g. 27

Event “Log in Node 53” sent →

← Datagram: Retrieve Most Recent Log Entry

Datagram: Log Entry, entry number 27, latest entry 27, content →

Retrieve Full Log

← Datagram: Retrieve Oldest Log Entry

Datagram: Log Entry, entry number 24, latest entry 27, content →

← Datagram: Retrieve Log Entry 25

Datagram: Log Entry, entry number 25, latest entry 27, content →

← Datagram: Retrieve Log Entry 26

Datagram: Log Entry, entry number 26, latest entry 27, content →

← Datagram: Retrieve Log Entry 27

Datagram: Log Entry, entry number 27, latest entry 27, content →
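The exchange above can be sketched as a loop on the reader side: ask for the oldest entry, then ask for the next number until the reply's entry number equals its latest entry number. `LOG` and `log_request` are hypothetical in-memory stand-ins for Node 53 and its datagram handling, and the content strings are placeholders. The sketch does not handle an entry being discarded between requests.

```python
# Hypothetical stand-in for Node 53's log buffer: entry number -> content
LOG = {24: "entry 24", 25: "entry 25", 26: "entry 26", 27: "entry 27"}


def log_request(which):
    """Answer one request; `which` is "oldest", "newest", or an entry number.

    Returns (entry_number, latest_entry, content), mirroring the Log Entry reply.
    """
    latest = max(LOG)
    if which == "oldest":
        number = min(LOG)
    elif which == "newest":
        number = latest
    else:
        number = which
    return number, latest, LOG[number]


def retrieve_full_log(request=log_request):
    """Walk the log from the oldest entry to the latest one, one request per entry."""
    number, latest, content = request("oldest")
    entries = [(number, content)]
    while number < latest:
        # `latest` is refreshed on every reply, so entries logged mid-walk are picked up.
        number, latest, content = request(number + 1)
        entries.append((number, content))
    return entries
```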

Single Occurrence, Multiple Retrieval

Node 53 records event in log buffer with number e.g. 27

Event “Log in Node 53” sent →

← Datagram from A: Retrieve Most Recent Log Entry

Datagram to A: Log Entry, entry number 27, max entry 27, content →

← Datagram from B: Retrieve Most Recent Log Entry

Datagram to B: Log Entry, entry number 27, max entry 27, content →

Overlapping Occurrences

Node 53 records event in log buffer with number e.g. 27

Event “Log in Node 53” sent →

Node 53 records event in log buffer with number e.g. 28

← Datagram: Retrieve Most Recent Log Entry

Event “Log in Node 53” sent →

Datagram: Log Entry, entry number 28, max entry 28, content →

← Datagram: Retrieve Most Recent Log Entry

Datagram: Log Entry, entry number 28, max entry 28, content →

Numerology

A datagram can carry 70 (64?) bytes. Two byte sequence number. One byte for flags and to indicate if more entries are available. End up with log entries being limited to e.g. 64 bytes. Is that an issue? It's a short line of text. Propose concatenation, e.g. log “records” instead of entries?
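To make the byte budget concrete, here is a rough sketch of packing and unpacking a reply. The layout is an assumption, not part of this proposal: a two-byte entry number, a two-byte latest entry number, and one flags byte, within a 64-byte payload (the smaller of the two sizes floated above). `FLAG_MORE` and the function names are hypothetical.

```python
import struct

DATAGRAM_BYTES = 64              # assumption: the smaller of the two sizes above
HEADER = struct.Struct(">HHB")   # hypothetical layout: entry number, latest number, flags
MAX_TEXT = DATAGRAM_BYTES - HEADER.size  # bytes left for the ASCII log text

FLAG_MORE = 0x01  # hypothetical bit: newer entries than this one are available


def pack_reply(number, latest, text):
    """Build a reply payload; raises ValueError if the text exceeds the budget."""
    data = text.encode("ascii")
    if len(data) > MAX_TEXT:
        raise ValueError(f"log text is {len(data)} bytes, limit is {MAX_TEXT}")
    flags = FLAG_MORE if number < latest else 0
    return HEADER.pack(number, latest, flags) + data


def unpack_reply(payload):
    """Return (number, latest, flags, text) from a reply payload."""
    number, latest, flags = HEADER.unpack_from(payload)
    return number, latest, flags, payload[HEADER.size:].decode("ascii")
```

Under these assumptions the header takes 5 bytes, which leaves 59 bytes of text per entry; with a 70-byte payload it would leave 65.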

Extensions

Except for the sequence field & length, the format of a log entry is not defined. That would be an extension.


Support for larger/faster retrieval via streams would be a good extension. E.g. after getting back an indication that there are 100 messages queued, another request would stream them back to the requester.

Testing

For doing conformance testing, there must be some mechanism to force creation of one or more log entries. We probably need a standard way to do that, e.g. via a command datagram that creates some recognizable entry, sends the event, etc.


This is SVN $Revision: 785 $