[ISN] Five mistakes of log analysis
isn at c4i.org
Fri Oct 22 02:07:31 EDT 2004
Opinion by Anton Chuvakin
OCTOBER 21, 2004
As the IT market grows, organizations are deploying more security
solutions to guard against the ever-widening threat landscape. All
those devices are known to generate copious amounts of audit records
and alerts, and many organizations are setting up repeatable log
collection and analysis processes.
However, when planning and implementing log collection and analysis
infrastructure, the organizations often discover that they aren't
realizing the full promise of such a system. This happens due to some
common log-analysis mistakes.
This article covers the typical mistakes organizations make when
analyzing audit logs and other security-related records produced by
security infrastructure components.
No. 1: Not looking at the logs
Let's start with an obvious but critical one. While collecting and
storing logs is important, it's only a means to an end -- knowing
what's going on in your environment and responding to it. Thus, once
technology is in place and logs are collected, there needs to be a
process of ongoing monitoring and review that hooks into actions and
It's worthwhile to note that some organizations take a half-step in
the right direction: They review logs only after a major incident.
This gives them the reactive benefit of log analysis but fails to
realize the proactive one -- knowing when bad stuff is about to happen.
Looking at logs proactively helps organizations better realize the
value of their security infrastructures. For example, many complain
that their network intrusion-detection systems (NIDS) don't give them
their money's worth. A big reason for that is that such systems often
produce false alarms, which leads to decreased reliability of their
output and an inability to act on it. Comprehensive correlation of
NIDS logs with other records, such as firewall logs and server audit
trails, as well as vulnerability and network service information about
the target, allows companies to "make NIDS perform" and gain new
Some organizations also have to look at log files and audit trails due
to regulatory pressure.
No. 2: Storing logs for too short a time
A short retention period lets the security team think it has all the
logs needed for monitoring and investigation (while saving money on
storage hardware), and then leads to the horrible realization after an
incident that all the logs are gone because of the retention policy.
Incidents are often discovered long after the crime or abuse has been
committed.
If cost is critical, the solution is to split the retention into two
parts: short-term online storage and long-term off-line storage. For
example, archiving old logs on tape allows for cost-effective off-line
storage, while still enabling future analysis.
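As a rough illustration of the two-part split, the sketch below assigns
a log file to a storage tier based on its age. The thresholds
(ONLINE_DAYS, ARCHIVE_DAYS) and the function name are assumptions made
for this example, not recommendations; actual retention periods depend
on your budget and regulatory requirements.

```python
from datetime import date

# Hypothetical thresholds, chosen only for illustration.
ONLINE_DAYS = 90          # keep on fast, searchable storage
ARCHIVE_DAYS = 3 * 365    # keep on tape or other off-line media

def retention_tier(log_date, today):
    """Return the storage tier for a log file written on log_date."""
    age_days = (today - log_date).days
    if age_days <= ONLINE_DAYS:
        return "online"
    if age_days <= ARCHIVE_DAYS:
        return "archive"
    return "expired"

print(retention_tier(date(2004, 10, 1), date(2004, 10, 21)))
```

The point of the middle tier is that a log too old for online storage
is moved, not deleted, so it is still there when a late-discovered
incident needs it.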
No. 3: Not normalizing logs
What do we mean by "normalization"? It means we can convert the logs
into a universal format, containing all the details of the original
message but also allowing us to compare and correlate different log
data sources such as Unix and Windows logs. Across different
applications and security solutions, log format confusion reigns: some
prefer Simple Network Management Protocol (SNMP), others favor classic
Unix syslog. Proprietary methods are also common.
Without a standard logging format, companies need different expertise
to analyze each kind of log. Not all skilled Unix administrators who
understand syslog format will be able to make sense out of an obscure
Windows event log record, and vice versa.
The situation is even worse with security systems, because people
commonly have experience with a limited number of systems and thus
will be lost in the log pile spewed out by a different device. As a
result, a common format that can encompass all the possible messages
from security-related devices is essential for analysis, correlation
and, ultimately, for decision-making.
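To make the idea concrete, here is a minimal sketch of normalization
that maps a classic syslog line and a Windows-style event record onto
the same set of fields while keeping the original record. The common
field names (timestamp, host, source, message, raw), the function
names, and the dictionary keys used for the Windows record are
assumptions for this example; a real deployment would need a richer
schema and proper timestamp parsing.

```python
import re

# Classic BSD-style syslog line: "Mon DD HH:MM:SS host program[pid]: message"
SYSLOG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<program>[^\[:]+)(?:\[\d+\])?: (?P<message>.*)$"
)

def normalize_syslog(line):
    """Map one syslog line to the common record, or None if it does not parse."""
    m = SYSLOG_RE.match(line)
    if m is None:
        return None
    return {
        "timestamp": m.group("timestamp"),  # left as text, not parsed
        "host": m.group("host"),
        "source": m.group("program"),
        "message": m.group("message"),
        "raw": line,                         # original message is preserved
    }

def normalize_windows_event(event):
    """Map a Windows-style event (a dict with assumed keys) to the common record."""
    return {
        "timestamp": event["TimeGenerated"],
        "host": event["ComputerName"],
        "source": event["SourceName"],
        "message": event["Message"],
        "raw": event,
    }
```

Once both sources share the same fields, a single query such as "all
records for this host" can run across Unix and Windows data without
the analyst knowing either native format.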
No. 4: Failing to prioritize log records
Assuming that logs are collected, stored for a sufficiently long time
and normalized, what else lurks in the muddy sea of log analysis? The
logs are there, but where do we start? Should we go for a high-level
summary, look at the most recent events or something else? The fourth
error is not prioritizing log records. Some system analysts may get
overwhelmed and give up after trying to chew a king-size chunk of log
data without getting any real sense of priority.
Thus, effective prioritization starts with defining a strategy.
Answering questions such as "What do we care about most?" "Has this
attack succeeded?" and "Has this ever happened before?" helps to
formulate it. Consider these questions to help you get started on a
prioritization strategy that will ease the burden of the gigabytes of
log data collected every day.
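One way to turn those three questions into something a machine can
apply is a simple additive score. The sketch below is only an
illustration: the weights, the event fields (host, signature,
target_vulnerable) and the function names are all assumptions, and the
"target_vulnerable" flag is merely a stand-in for whatever evidence you
have that an attack could have succeeded.

```python
def priority(event, critical_hosts, seen_signatures):
    """Score one normalized event; a higher score means review it sooner."""
    score = 0
    if event["host"] in critical_hosts:            # What do we care about most?
        score += 3
    if event.get("target_vulnerable"):             # Could this attack have succeeded?
        score += 2
    if event["signature"] not in seen_signatures:  # Has this ever happened before?
        score += 1
    return score

def rank(events, critical_hosts, seen_signatures):
    """Return events ordered from highest to lowest priority."""
    return sorted(
        events,
        key=lambda e: priority(e, critical_hosts, seen_signatures),
        reverse=True,
    )
```

Even a crude ranking like this gives an analyst a place to start,
instead of reading the day's records in the order they arrived.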
No. 5: Looking for only the bad stuff
Even the most advanced and security-conscious organizations can
sometimes get tripped up by this pitfall. It's sneaky and insidious
and can severely reduce the value of a log-analysis project. It occurs
when an organization is only looking at what it knows is bad.
Indeed, a vast majority of open-source tools and some commercial ones
are set up to filter and look for bad log lines, attack signatures and
critical events, among other things. For example, Swatch is a classic
free log-analysis tool that's powerful, but only at one thing --
looking for defined bad things in log files.
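The "known bad" approach boils down to matching each log line against
a list of predefined patterns. The sketch below shows the general idea
only; it is not Swatch's configuration syntax, and the patterns and
function name are made up for this example.

```python
import re

# Hypothetical "known bad" patterns, for illustration only.
BAD_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"failed password", r"segfault", r"denied")
]

def known_bad(lines):
    """Return only the lines that match at least one predefined pattern."""
    return [line for line in lines if any(p.search(line) for p in BAD_PATTERNS)]

sample = [
    "sshd: Failed password for root",
    "cron: job started",
    "httpd: GET /index.html 200 from new client",
]
print(known_bad(sample))
```

Anything that no pattern describes, however unusual, passes through
unnoticed.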
However, to fully realize the value of log data, you need to take it
to the next level -- to log mining. In this step, you can discover
things of interest in log files without having any preconceived notion
of what you need to find. Some examples include compromised or
infected systems, novel attacks, insider abuse and intellectual
property theft.
It sounds obvious: How can we be sure we know of all the possible
malicious behavior in advance? One option is to list all the known
good things and then look for the rest. It sounds like a solution, but
such a task is not only onerous, but also thankless. It's usually even
harder to list all the good things than it is to list all the bad
things that might happen on a system or network. So many different
events occur that weeding out attack traces just by listing all the
possibilities is ineffective.
A more intelligent approach is needed. Some of the data mining (also
called "knowledge discovery in databases") and visualization methods
actually work on log data with great success. They allow organizations
to look for real anomalies in log data, beyond "known bad" and "not
known good."
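As a very small taste of looking without a preconceived pattern, the
sketch below groups messages by their overall shape and reports the
shapes that occur rarely. This is a crude frequency heuristic chosen
for illustration, not one of the data mining or visualization methods
referred to above; the function names and the max_count threshold are
assumptions.

```python
import re
from collections import Counter

def template(message):
    """Collapse runs of digits so messages differing only in numbers group together."""
    return re.sub(r"\d+", "N", message)

def rare_templates(messages, max_count=1):
    """Return message shapes seen at most max_count times, in sorted order."""
    counts = Counter(template(m) for m in messages)
    return sorted(t for t, c in counts.items() if c <= max_count)

messages = [
    "session opened for user 1001",
    "session opened for user 1002",
    "session opened for user 1003",
    "kernel module loaded from /tmp/x",
]
print(rare_templates(messages))
```

Nothing here says what "bad" looks like; the unusual message surfaces
simply because it is unlike the rest of the traffic.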
Avoiding these mistakes will take your log-analysis program to the
next level and enhance the value of your company's security and
Anton Chuvakin is a security strategist at netForensics Inc., a
security information management company in Edison, N.J. His areas of
expertise include intrusion detection, Unix security, forensics and
honeypots. Chuvakin is the co-author of Security Warrior (O'Reilly,
2004) and a contributor to Know Your Enemy: Learning About Security
Threats, Second Edition by the Honeynet Project (Addison-Wesley
Professional, 2004) and Information Security Management Handbook
(Auerbach Publishing, 2004). In his spare time, he maintains his
security portal www.info-secure.org.