How can Machine Learning help IT teams with log management?

Marco Calizzi on December 10, 2021

Machines use log messages to communicate what is going on with them, but they do so at a pace that is impossible for humans to keep up with. Even a medium-sized business working in a field unrelated to IT has to manage private networks, email servers, employee accounts, etc. For example, every time you check your email, multiple logs are generated: request to connect, request accepted, connection redirected, connection successful… This adds up to thousands of logs per second.

The history and the current status of a system can all be found in the logs, but the few critical messages that we care about are usually drowned among millions of “informational” messages.

The first thing to keep in mind is the goal: what do we want to achieve through log analytics? The answer is simple: we want to detect the root cause of hardware or software problems in IT systems in a timely manner, so that they can be solved efficiently. The most widely used approach is rule-based log-management software, often provided directly by the company that developed the system in use. This is a reactive approach: it is effective for known issues, but it still requires a lot of manual work from specialized staff and cannot help in new situations.

In this post we give an overview of how ML algorithms can help in this matter; we will explore them in more detail in the following posts. The main intention is to implement ML models that simplify and automate log analysis and make it proactive. There are several areas where ML is being successfully applied:

Log Parsing: ML algorithms can help simplify the task by identifying templates. Log template recognition is a widely researched topic in industry as well as in academia. It can reduce the volume from tens of millions of logs to a few hundred patterns.
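To make the idea of a template concrete, here is a minimal sketch in Python. It is not an ML parser: it uses a few hand-written masking rules (an assumption made for illustration) to replace variable fields with placeholders, so that messages differing only in their parameters collapse into the same pattern. The rule list and the sample messages are hypothetical.

```python
import re
from collections import Counter

# Hypothetical masking rules. Order matters: IP addresses must be masked
# before plain numbers, otherwise each octet would be masked on its own.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(message: str) -> str:
    """Replace variable fields with placeholders to obtain a template."""
    for pattern, placeholder in MASKS:
        message = pattern.sub(placeholder, message)
    return message


def count_templates(messages):
    """Count how many raw messages fall under each template."""
    return Counter(to_template(m) for m in messages)


# Made-up sample messages, for illustration only.
logs = [
    "Connection from 10.0.0.5 accepted on port 993",
    "Connection from 10.0.0.17 accepted on port 993",
    "User 4821 logged in",
    "User 77 logged in",
    "Connection from 192.168.1.2 accepted on port 143",
]
templates = count_templates(logs)
for template, count in templates.most_common():
    print(count, template)
```

Here five raw messages reduce to two templates. A learned parser aims for the same kind of reduction without someone having to write the masking rules by hand.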

Insight generation: to help find the root cause of a problem, ML can discover logs that are correlated and group them into one insight that can be easily reviewed by an IT team. The correlation can be based on a number of factors, such as keywords in the text, event date, host, or metrics attached to the log. From a Data Science point of view, this is a clustering problem with an undetermined number of clusters.
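As a rough illustration of clustering without fixing the number of clusters in advance, the sketch below uses a simple greedy ("leader") scheme. An event joins an existing group if it is close in time to that group and shares enough keywords and host information with the group's first event; otherwise it starts a new group. The `LogEvent` type, the thresholds, and the sample events are assumptions made for this example, and a production system would likely use a more robust method.

```python
from dataclasses import dataclass


@dataclass
class LogEvent:
    host: str
    timestamp: float  # seconds since an arbitrary start
    message: str


def features(event: LogEvent) -> set:
    """Keywords from the text plus the host, as one set of features."""
    tokens = set(event.message.lower().split())
    tokens.add(f"host={event.host}")
    return tokens


def jaccard(a: set, b: set) -> float:
    """Share of features two events have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_into_insights(events, min_similarity=0.5, max_gap_seconds=60.0):
    """Greedy grouping; the number of groups is decided by the data."""
    insights = []  # each insight is a list of events, first one is the leader
    for event in sorted(events, key=lambda e: e.timestamp):
        for group in insights:
            close_in_time = event.timestamp - group[-1].timestamp <= max_gap_seconds
            similar = jaccard(features(event), features(group[0])) >= min_similarity
            if close_in_time and similar:
                group.append(event)
                break
        else:
            # No existing group matched, so this event starts a new insight.
            insights.append([event])
    return insights


# Made-up events, for illustration only.
events = [
    LogEvent("db01", 0, "disk latency high on volume data"),
    LogEvent("db01", 10, "disk latency high on volume logs"),
    LogEvent("web02", 15, "user login failed for admin"),
    LogEvent("web02", 20, "user login failed for root"),
    LogEvent("db01", 500, "disk latency high on volume data"),
]
insights = group_into_insights(events)
for group in insights:
    print(len(group), "event(s):", group[0].host, "|", group[0].message)
```

With these thresholds the five events form three insights: the two early disk messages, the two failed logins, and the later disk message, which is too far away in time to belong to the first group.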

Assessing importance: most log messages are accompanied by a severity level, like ‘critical’, ‘warning’ or ‘informational’. However, the level does not always reflect the importance of the event. There are ‘informational’ logs that are important, as well as ‘warning’ logs that are not (‘critical’ logs, on the other hand, are important most of the time). ML classification models can assess the importance of a message regardless of its severity level, and different models can be tuned based on users’ input. For this purpose, NLP models or boosted trees are usually trained on labelled datasets.
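The sketch below shows the general shape of such a classifier. To stay dependency-free it uses a tiny naive Bayes model over the message text, which is a much simpler stand-in for the NLP models or boosted trees mentioned above. The labelled examples are invented for illustration. The severity level is deliberately left out of the features, so the prediction depends only on what the message says.

```python
import math
from collections import Counter, defaultdict


def tokenize(text: str):
    return text.lower().split()


class NaiveBayesImportance:
    """Minimal multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for message, label in zip(messages, labels):
            tokens = tokenize(message)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, message: str) -> str:
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods per token.
            score = math.log(doc_count / total_docs)
            for token in tokenize(message):
                score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled data: (severity, text, label assigned by a user).
# Note that severity and label disagree on several rows.
labelled = [
    ("informational", "backup job failed to start", "important"),
    ("informational", "password changed for admin account", "important"),
    ("critical", "database connection lost", "important"),
    ("warning", "disk usage at 40 percent", "routine"),
    ("warning", "certificate expires in 300 days", "routine"),
    ("informational", "heartbeat ok", "routine"),
]
model = NaiveBayesImportance().fit(
    [text for _, text, _ in labelled],
    [label for _, _, label in labelled],
)
print(model.predict("backup job failed again"))
print(model.predict("disk usage at 35 percent"))
```

On this toy data the first message is classified as important and the second as routine, whatever severity they were logged with. Retraining on labels from a different team is one simple way to tune the model to that team's view of importance.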

Anomaly detection: probably the most popular application at the moment, it makes it possible to unveil issues that would otherwise remain unnoticed, ranging from a simple decrease in system performance, to problems that can be prevented before they occur, to a full-on cyber attack. This is where log analytics overlaps with cybersecurity. Deep learning methods are used for this purpose: they learn the normal sequences of logs and raise an alert when a chain of events is altered or interrupted.
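The principle of learning normal sequences and flagging deviations can be sketched without deep learning. The example below is a deliberately simple stand-in, not a neural sequence model. It records which event follows which in known-good sequences and reports any transition it has never seen, including a sequence that stops early. The event names reuse the email example from the introduction; the function names and data are assumptions for illustration.

```python
from collections import defaultdict

START, END = "<start>", "<end>"


def learn_transitions(normal_sequences):
    """Record every (event -> next event) pair seen in normal sequences."""
    allowed = defaultdict(set)
    for sequence in normal_sequences:
        padded = [START, *sequence, END]
        for current, following in zip(padded, padded[1:]):
            allowed[current].add(following)
    return allowed


def find_anomalies(sequence, allowed):
    """Return the transitions in `sequence` that were never seen in training."""
    padded = [START, *sequence, END]
    return [
        (current, following)
        for current, following in zip(padded, padded[1:])
        if following not in allowed.get(current, set())
    ]


# Hypothetical normal sequences of already-parsed log events.
normal = [
    ["connect_request", "request_accepted", "connection_redirected", "connection_successful"],
    ["connect_request", "request_accepted", "connection_successful"],
]
allowed = learn_transitions(normal)

interrupted = find_anomalies(["connect_request", "request_accepted"], allowed)
altered = find_anomalies(["connect_request", "connection_successful"], allowed)
healthy = find_anomalies(normal[0], allowed)
print(interrupted)
print(altered)
print(healthy)
```

The interrupted chain is flagged because it ends right after `request_accepted`, the altered chain because it skips the acceptance step, and the healthy chain produces no alerts. A deep learning model plays the same role but can capture longer context than a single previous event.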


