ERNI Technology Post No. 53: Semantic logging: Log what you mean

Tracing, an essential part of software applications, demands a solid logging strategy. This TechLetter describes what aspects of logging are needed to have a beneficial impact on software quality.

Why do these applications exist?

Applications exist to provide value for their users.

We often narrow the term "users" to end users.
Users are all entities interacting with the application.

However, within today’s complex systems there are even more stakeholders involved. For example:

  • Operations - responsible for flawless running of the application, reasonable use of system resources, usually using some kind of monitoring systems
  • Customer Support – responsible for identifying potential bugs in the system
  • Developers – responsible for fixing the bugs

These stakeholders depend on the application’s ability to provide information about its state and trace ongoing activities. The better the quality of those traces the easier it is to locate and fix problems in the system. This is where logs came into play.

Note:
There is a significant difference between Audit and Trace.
Audit must be written, Traces are optional. If the audit operation cannot be performed, the application mustn’t continue. On the other hand, inability to trace should not influence application behaviour.
*This article is strictly related to Tracing.

Existing logging frameworks simplify the logging process. Their main advantage is separation of the log event creation from actual event-writing. Developers don't need to care where and how the log should be written. Their responsibility is to log the event. The framework will take care of formatting the message, adding auxiliary information and writing the event into a final destination (file, event log, and database). All the necessary information is provided in configuration file.

For more information about the topic check ERNI the tech letter #50 Using log4Net in .NET projects

Developers very often consider logging as a debug tool just for themselves; the log messages are seldom standardized, the log structure and requirements are usually neglected and seen as a by-product.

Commonly used logging frameworks have the following flaws:

  • Logs are free form.
    • Developers don't follow any formal rules as to what and when something should be logged.
    • There is an assumption that logs will be read by humans. They are rarely machine-readable.
    • The metadata (like log channels, severity or verbosity) are scattered around the code (there is no central point where they might be managed).
  • Logs are text based.
    • Text concatenation could be a performance burden.
    • There is no strong type support for log methods.
  • Logs are isolated

If there are multiple log files, it is hard to correlate the contained events from different sources or with the other system activities.

Semantic Logging: Log what you mean

To overcome the flaws of free text logging, let’s introduce semantic logging. The approach is similar to constraining a free form document to adhere to given schema.

  1. The logging methods will be strongly typed
  2. Methods along with the parameters should have self-descriptive name
  3. Keep all the meta-information (log level, categories...) inside the method
  4. Introduce a centralized class that will handle your logging activity
  5. Use only these methods for logging

Note:
The class you use to handle the logging might become larger over time. So it is good practice to split that class up into smaller classes and group them semantically, as you would with any other big class. However, beware of creating too many classes. The number of classes should be reasonably low to allow for a single point of maintenance.

The semantic logging approach doesn't depend on any specific technology or framework. You can use any existing logging framework internally. If you already have logging in place, you can start refactoring your code by unifying the log messages in a semantic way while still gaining the advantages of your framework.

If your code looks similar to this:

     if(log.IsDebugEnabled) {
         log.Debug("Entry number: " + i + " is " + entry[i]);
     }

you could refactor it:
Usage:

     Log.EntryNumber(i);

Log class definition (using the same framework):

     public void EntryNumber(int i) {
       if(internalLogger.IsDebugEnabled) {
         internalLogger.Debug("Entry number: " + i + " is " + entry[i]);
       }
     }

Semantic logging will bring you the following advantages:

  • You must be sure that you format and structure your log messages in a consistent way because then there is no chance of making a coding error when you write the log message.
    For example, an event with a particular ID will always have the same verbosity, extra information, and payload structure.
  • It is also easier to have a holistic view of all the application events, as they are defined in a centralized place and refactored to make them consistent with each other.
  • It is easier to query and analyse your log files because the log messages are formatted and structured in a consistent manner.
  • You can more easily parse your log data using some kind of automation. This is especially true if you are using one of the Semantic Logging Application Block (SLAB) sinks that preserve the structure of the payload of events; for example, the Microsoft Azure Table sink preserves the structure of your log messages. However, even if you are using flat files, you can be sure that the format of log messages with a particular ID will be consistent.
  • You can more easily consume the log data from another application. For example, in an application that automates activities in response to events that are recorded in a log file or an Azure.

Event Tracing for Windows

Typically logs are collected in log files and/or databases. The Windows operating system provides another option: Event Tracing for Windows (ETW). It is a strongly typed and high performance tracing mechanism built into the Windows Operating System. Being a strongly typed logging system, it is a natural match for semantic logging.

Note: The terms semantic and strongly typed logging are often used interchangeably.

ETW also provides several useful capabilities. For example:

  • Different event channels. Every channel target is meant for different audience:
    • Admin channel is for system admins
    • Debug channel for developers (the default)
    • Operational channel for other applications
  • Causality tracking: enables link and nest events to be put into hierarchical structures (In .NET it is also capable of tracking Task based causalities over multiple threads).
  • Log events will have a timestamp and a stack trace attached to it.

ETW Architecture

The architecture of ETW consists of:

  1. Session / controller, the logging session inside the ETW infrastructure. It collects, filters, stores and routes the data from providers. It is able to tell the providers to start and stop logging and how verbose the logs should be.
  2. Provider, the part of the logging system connected to the monitoring application. It calls the logging API when interesting things happen within the application.
  3. Consumer, it takes the data from the session and presents it in a particular way (generates statistics, shows alarms ...)

ETW Architecture

  1. The name of the event (internally composed of task and opcode)
  2. The names and types of all the data items that are logged along with the event.
  3. Optional metadata (the verbosity, keywords, channels ...)

Producing events in .NET: EventSource

The ETW infrastructure is exposed in .NET via the System.Diagnostics.Tracing namespace.

EventSource class an event provider. To create your own event provider follow these steps:
 

1. Create a sealed class derived from the EventSource class

2. Annotate the class with the EventSourceAttribute and specify the name of your source.

- The event source name should be unique in the system. - Recommended naming convention is a dash-separated hierarchical name: <company name>-<application name>-<module name>.

For example: "ERNI-ETWSample-SampleClient" and "ERNI-ETWSample-SampleService".

3. Expose the public property with the singleton instance of the class.

4. Author your own keywords or log levels

5. Carefully create Event log methods.

- Method name is important because it will become the event name  

- If two methods with consequent eventIds share the same name plus Start/Stop suffix, they will automatically trigger the correlation tracing.

- The method argument names and order are important, because they define how the payload looks like

- Annotate the method with the Event attribute and specify:

  • EventId (unique for the class)
  • Message string – this pattern will be used for displaying the message (it is not part of the message, rather it is part of the manifest for client)
  • Optional keywords, channel, severity and verbosity

- Call the proper WriteEvent overload. You need to pass in:

  • the eventId from the annotation
  • all method arguments in the same order

6. Create Not-Event methods

The ETW supports limited number of types. It supports all primitive types along with Guid and DateTime types. If you are passing more complex type, you should:

Annotate the method with the NotEvent attribute

Process the complex type and call some of the Event methods.

Should the processing be time consuming, wrap the call with the EventSources IsEnabled() check.

     [EventSource(Name = "ERNI-EtwSample-SampleServer")]
     internal sealed class EtwSampleServerEventProvider : EventSource
     {
     internal static EtwSampleServerEventProvider Log = new EtwSampleServerEventProvider();

     [Event(1, Level = EventLevel.Error, Message = "Service failed.\n{0}\n{1}")]
     internal void ServiceFailed(string exceptionType, string exceptionMessage)
     {
         WriteEvent(1, exceptionType, exceptionMessage);
     }

     [Event(2, Level = EventLevel.Informational, Message = "Service started. BaseUri: {0}")]
     internal void ServiceStarted(string baseUri)
     {
         WriteEvent(2, baseUri);
     }

     [Event(3, Message = "Request started for uri = {0}, client = {1}")]
     internal void RequestStart(string uri, string client)
     {
         WriteEvent(3, uri, client);
     }

     [Event(4, Message = "Request stop")]
     internal void RequestStop()
     {
         WriteEvent(4);
     }

     [Event(5, Message = "Generated random int: {0}")]
     internal void RandomIntGenerated(int randomInt)
     {
         WriteEvent(5, randomInt);
     }
     }

Note: There are two possible ways to incorporate EventSource into your code. Either using the classes in the framework for 4.5 or 4.6 (using System.Diagnostics.Tracing namespace) or via NuGet package (using Microsoft.Diagnostics.Tracing namespace).

The tracing library calls by design cannot fail. The misconfigured method can come unnoticed. Careful review and tests of the event source code are recommended. A NuGet package Microsoft.Diagnostics.Tracing.EventRegister contains custom build tasks which  validate EventSource implementations. Also SLAB library (described later) contains EventSourceAnalyser which can inspect the EventSource implementation for errors.

From an architectural point of view, it is recommended to create a reasonable amount of EventSources in the application. They should correspond with high level modules.

Since the ETW can only efficiently serialize the primitive types, it is also recommended to wrap the raw methods into non-event methods. This can also help with code decoupling (The EventSource class will contain only the methods with primitive type arguments, the more semantic logger classes will be part of an appropriate module and relay the messages to the ETW EventSource). 

Processing events

Semantic logging application block (SLAB)

Semantic logging application block (SLAB) provides a bridge between EventSource style logging and traditional logging approaches. It is a library created by Microsoft patterns and practices.

We can run the SLAB in two modes: In-process and Out-of-Process mode.

In-Proc

In-process provides a way to consume and process log events in the same process. It is targeted for development phase where developers prefer to see immediate response.

The core of the infrastructure consists of Sinks and ObservableEventListeners. ObservableEventListener is a target for the events generated by EventSource. The class is capable to expose the intercepted events via the IObservable interface, which enable advanced manipulation using reactive extensions. Sinks are the end of the chain, where the events are formatted and stored. The common sinks are Console, Flat/Rolling file, database and azure tables.

Developers configure the filters and sinks in code.

The sample shows the how the listeners could be configured and connected:

     public static class LoggingInfrastructure
     {
         private static ObservableEventListener _listener;

         public static void Start()
         {
             if (_listener != null)
             {
                 throw new InvalidOperationException("Logging infrastucture already running");
             }

             _listener = new ObservableEventListener();
             _listener.EnableEvents(EtwSampleServerEventSource.Log, EventLevel.LogAlways);
             _listener.LogToConsole(new SampleFormatter());
         }

         public static void Stop()
         {
             _listener.Dispose();
             _listener = null;
         }
     }

The LoggingInfrastructure.Start()is meant to be called when application starts (beginning of Main method, Application_Start()in Global.asax)

Note: Current version of the SLAB library had some issues when we run it with the current version of .NET framework. Therefore we are using the NuGet version of EventSource library. We needed to recompile the SLAB code according to step in the release 2.0.1406.1

Out-of-process

SLAB provides a way how to process the event in a separate process. In this case the events producer (the application) and event consumer communicate together via ETW infrastructure. The application only needs to reference EventSource library doesn't need to contain any code responsible for formatting and storing traces.

Sinks and filters are configured via configuration file.

Note: Current version of SLAB library does not fully support the event source library. There are limitations in supported features. For more details check the project page on GitHub.

Perfview

So far neither windows nor visual studio provides "candy" application for monitoring ETW logs.

Microsoft released the tool called perfview. The tool, while not very visually pleasing, is very powerful. Microsoft developers are using it to troubleshoot problems with .NET applications.

Perfview can be configured to collect data according to specific scenarios. It also provides powerful filtering and grouping functionality.

The best source of information can be found on Vance Morisson's blog or video tutorial on channel 9.

To see our event sources in action, we can narrow the monitoring to it with the following command:

PerfView /onlyProviders=*ERNI-ETWSample-SampleServer,*ERNI-ETWSample-SampleClient /run MY_APP.exe

On the previous example we enabled and filtered two providers: *ERNI-ETWSample-SampleServer and *ERNI-ETWSample-SampleClient. We are able to see events from both providers interleaved (which is very useful in client-server troubleshooting).

Another useful feature demonstrated is activity tracking. When the methods in ETW provider follow the EventNameStart/EventNameStop pattern, the provider generates an Activity ID for the event. Such a scope can be nested and correct scoping is persisted even for asynchronous code as well (Provided the asynchrony is Task based, even for fork-join activities). Perfview also provides a time measurements for Start/Stop scope (DURATIONMSEC column).

Suppose, we are experiencing a strange behaviour in WCF layer and we would like to know more details on what is going on. We can add new provider Microsoft-Windows-Application Server-Applications and choose to show additional WCF events. We will end up seeing the WCF events from our event source (We are only showing two events out of many available for WCF).

Almost all windows components and especially .NET framework exposes a lot of events which we can use when investigating application behaviour.

Note: Finding the right provider can be a challenge though. Historically, there are multiple conventions how to name a provider, multiple registration schemas and so on.

Conclusion

Tracing is a crucial part of modern software applications. A good logging strategy will have a beneficial impact on the software’s quality. When thinking about your logging strategy, keep all the stakeholders in mind that might be interested in runtime information.

With semantic logging we showed you an intelligent approach to create strongly typed runtime information for any kind of technology. With ETW, Microsoft offers a very mature and well known event logging system that is free to use and blends in perfectly in your Microsoft enterprise system. By combining the semantic logging approach and Microsoft’s ETW you get a solid and easy to implement tracing solution for you .NET application.

 

posted on 24.11.2015
Categories: 
by: Stefan Dobrovolny