Log messages are typically unstructured: a combination of constant free text written by developers and variable values. A lot of information is buried in there. A log parser can split a log message into its elements and identify templates for easier analysis, reducing the dimensionality from tens of millions of logs to a few hundred patterns. Log template recognition, or log parsing, is a widely researched topic in industry as well as in academia.
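To make the idea concrete, here is a minimal sketch of template extraction. The masking rules and placeholder names (`<IP>`, `<HEX>`, `<NUM>`) are illustrative assumptions, not a reference to any specific parser: variable-looking tokens are replaced by wildcards so that messages differing only in their parameters collapse into the same template.

```python
import re

# Hypothetical masking rules: each variable-looking token is replaced
# by a placeholder so that messages differing only in parameter values
# collapse into one template.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),  # IPv4 addresses
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),          # hex values
    (re.compile(r"\b\d+\b"), "<NUM>"),                     # plain integers
]

def to_template(message: str) -> str:
    """Replace variable values in a log message with placeholders."""
    for pattern, placeholder in MASKS:
        message = pattern.sub(placeholder, message)
    return message

logs = [
    "Connection from 10.0.0.1 port 22",
    "Connection from 192.168.1.7 port 2222",
    "Worker 12 finished job 981 in 35 ms",
    "Worker 3 finished job 977 in 210 ms",
]

# Four raw messages reduce to two templates:
#   "Connection from <IP> port <NUM>"
#   "Worker <NUM> finished job <NUM> in <NUM> ms"
templates = {to_template(m) for m in logs}
```

Real parsers go well beyond fixed regular expressions, but the dimensionality reduction they achieve is exactly this: many raw messages, few templates.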
Log parsing is always the very first step of any log analytics workflow, and it is crucial for the correct extraction of useful information: it underpins the most common downstream applications.
Traditionally, log templates and key parameters are extracted through handcrafted regular expressions, but this approach is time-consuming and error-prone. Several algorithms can automate this task, such as SLCT, IPLoM, LKE, LogSig, Spell, and Drain. They can be divided into two categories: batch processing and online log parsing. The main difference is that batch methods need the entire dataset available up front and can therefore only work "offline", on historical data, while online parsers process logs sequentially, one by one, which is more practical for real-time services.
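The online approach can be illustrated with a toy sketch. This is not Drain or any of the algorithms above, just an assumed simplification of the same idea: each incoming message is compared against known templates of the same length, and positions where tokens disagree are generalized to a wildcard; the similarity threshold of 0.5 is an arbitrary choice for the example.

```python
class OnlineParser:
    """Toy online log parser: processes messages one at a time,
    grouping them by token count and generalizing mismatching
    tokens to a wildcard. A simplified illustration, not Drain."""

    def __init__(self):
        # Maps token count -> list of templates (as token lists).
        self.clusters = {}

    def add(self, message: str) -> str:
        tokens = message.split()
        candidates = self.clusters.setdefault(len(tokens), [])
        for template in candidates:
            # Similarity: fraction of positions where tokens agree.
            same = sum(a == b for a, b in zip(template, tokens))
            if same / len(tokens) >= 0.5:
                # Merge: positions that differ become wildcards.
                for i, (a, b) in enumerate(zip(template, tokens)):
                    if a != b:
                        template[i] = "<*>"
                return " ".join(template)
        # No similar template found: start a new cluster.
        candidates.append(tokens[:])
        return " ".join(tokens)

parser = OnlineParser()
for line in ["user alice logged in", "user bob logged in", "disk sda1 is full"]:
    parser.add(line)
# The two "user ... logged in" lines merge into "user <*> logged in",
# while "disk sda1 is full" starts its own cluster.
```

Because the parser never needs the full dataset, it can run on a live log stream; batch methods instead make a global pass (or several) over all historical logs before emitting templates.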
When choosing a log parser, there are several things to consider.
For more details about state-of-the-art parsers, I suggest reading this nice scientific article by J. Zhu et al.