Daemian Mack home

Value-specific logfiles with Logback and Clojure

August 7, 2015

Recently one of our customer projects at work discovered a need for value-specific logfiles.

For our purposes, let’s say we have a retail system. Every evening, we want to run a settlement report against this system’s financial behavior for the day. This settlement report will….

Here, we’re interested in altering the behavior in step 2 to produce separate per-execution logfiles rather than one continuous logfile for all reports. That is, we’d like the following…

> ls -al logs

-rw-------   1 log-user  staff   2.0M Jul  4 17:02 settlement-report-84.log
-rw-------   1 log-user  staff   2.0M Jul  5 17:02 settlement-report-85.log

That DB primary key gives us a useful handle to differentiate.

Configuration

Logback possesses reasonably flexible configuration, and clojure.tools.logging allows access to custom Logger and LoggerFactory implementations. One can use the Logger protocol’s write! method to execute arbitrary code – a useful example being clojure.tools.logging’s own tests – with the tradeoff that any formatting choices made in this method will need to be managed apart from the application’s Logback config file.

It turns out Logback can actually get us most of the way there with its out-of-the-box facilities.

We can cooperate a little more closely with the existing Logback config by establishing a task-specific logger in concert with a SiftingAppender, whose purpose is to provide functionality around this notion of value-specific logfiles.

Here are the specific bits of the config/logback.xml

The discriminator key above is the most significant bit. It will use the appender’s file directive to produce logfiles that are distinguished by report-id value.

That value is conveyed via Logback’s “Mapped Diagnostic Context”. We made use of io-clj/logging, a tiny Clojure library, to abstract away interaction with MDC, along with a dynamic var for thread-local state.

Code

Client Code

Considerations Addressed Hereby

We make use here of clojure.tools.logging’s log macro, a bit of plumbing that gives access to multiple arities. We’re using the 4-arity implementation that allows one to specify the logger name. The default Logback behavior dispatches on caller namespace, but we can use an arbitrary string here to point to our value-specific logger in the Logback config.xml.

Providing vars named after the conventional set of logging levels (info, debug, etc.) allows one to migrate client namespaces to the value-specific logger by simply changing the client’s require spec to reference the new logging namespace.

Making these vars available as macros is a common pattern with logging wrappers to preserve the original call-site, so Logback can use the caller’s namespace to dispatch to the right logger; things like line numbers are also preserved.

Providing this as a separate namespace with a dynamic var for storage is a subtlety that may not be readily apparent. Mechanically, it allows us to use the logging facility from any other namespace without causing a cycle in the call-graph. Semantically, that’s necessary because neither Logback’s MDC facility nor the io-clj/logging library are prepared to handle Clojure’s thread pools out of the box.

That is, simple usages passing a data literal to the library’s with-logging-context macro will behave as expected but any logging calls on the other side of a thread boundary will provoke bad behavior. The value of the data literal in code that uses a thread pool will intermittently occupy the wrong thread; often the process driving the logging will have no such bindings, triggering the appender’s defaultValue, and therefore logging to the wrong file. In our project, we happen to use a number of core.async routines to drive the settlement data pipe-line; each of these contexts needs to re-establish the *logging-context* binding in its own logging calls. Providing this stand-alone logging facility means those constituent call-sites can just call the appropriate log-level in app.logging and let the macro manage access to the dynamic binding as appropriate.

Considerations Unaddressed

This is imperfect in at least one way. We are treating the monotonically-increasing primary key as universally unique when it is not; the PK may be “re-used” by other instances of the system – particularly instances with a new DB like system tests that exercise this codepath. If Logback is configured to clobber existing files when attempting to create a logfile with a name that already exists, data loss may result. An easy fix here would be to add a timestamp to the filename.

Thanks to Sam Umbach for valuable input and review!