[ISN] Five mistakes of log analysis

InfoSec News isn at c4i.org
Fri Oct 22 02:07:31 EDT 2004


Opinion by Anton Chuvakin
netForensics Inc.
OCTOBER 21, 2004

As the IT market grows, organizations are deploying more security 
solutions to guard against the ever-widening threat landscape. All 
those devices are known to generate copious amounts of audit records 
and alerts, and many organizations are setting up repeatable log 
collection and analysis processes. 

However, when planning and implementing log collection and analysis 
infrastructure, the organizations often discover that they aren't 
realizing the full promise of such a system. This happens due to some 
common log-analysis mistakes. 

This article covers the typical mistakes organizations make when 
analyzing audit logs and other security-related records produced by 
security infrastructure components. 

No. 1: Not looking at the logs 

Let's start with an obvious but critical one. While collecting and 
storing logs is important, it's only a means to an end -- knowing
what's going on in your environment and responding to it. Thus, once
technology is in place and logs are collected, there needs to be a 
process of ongoing monitoring and review that hooks into actions and 
possible escalation. 

It's worthwhile to note that some organizations take a half-step in 
the right direction: They review logs only after a major incident. 
This gives them the reactive benefit of log analysis but fails to 
realize the proactive one -- knowing when bad stuff is about to happen.

Looking at logs proactively helps organizations better realize the 
value of their security infrastructures. For example, many complain 
that their network intrusion-detection systems (NIDS) don't give them 
their money's worth. A big reason is that such systems often produce
false alarms, which makes their output less reliable and harder to
act on. Comprehensive correlation of NIDS logs with other records,
such as firewall logs and server audit trails, as well as
vulnerability and network service information about the target,
allows companies to "make NIDS perform" and gain new detection
capabilities.
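
As a rough illustration of the correlation described above, the
following Python sketch rates NIDS alerts by checking them against
firewall verdicts and vulnerability-scan data. The record layout, the
field names and the three-level rating are assumptions made for this
example, not the format of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """One NIDS alert (hypothetical, simplified layout)."""
    target_ip: str
    target_port: int
    signature: str


def rate_alert(alert, blocked_flows, vulnerable_services):
    """Return 'low', 'medium' or 'high' for a single alert.

    blocked_flows: set of (ip, port) pairs the firewall log shows as denied.
    vulnerable_services: dict mapping (ip, port) to the set of signatures
    the vulnerability scanner reports that service as exposed to.
    """
    key = (alert.target_ip, alert.target_port)
    if key in blocked_flows:
        return "low"      # the firewall dropped the traffic
    if alert.signature in vulnerable_services.get(key, set()):
        return "high"     # the attack matches a known weakness of the target
    return "medium"       # reached the target, no known matching weakness


alerts = [
    Alert("10.0.0.5", 80, "iis-unicode"),
    Alert("10.0.0.6", 80, "iis-unicode"),
    Alert("10.0.0.7", 22, "ssh-overflow"),
]
blocked = {("10.0.0.7", 22)}
vulnerable = {("10.0.0.5", 80): {"iis-unicode"}}

ratings = [rate_alert(a, blocked, vulnerable) for a in alerts]
print(ratings)
```

Only the first alert is rated "high", because it is the only one that
both reached its target and matches a weakness the scanner found there.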

Some organizations also have to look at log files and audit trails due
to regulatory pressure.

No. 2: Storing logs for too short a time 

Storing logs for too short a time lets the security team believe it
has all the logs needed for monitoring and investigation (while
saving money on storage hardware), only to reach the horrible
realization after an incident that the relevant logs are gone because
of the retention policy. Incidents are often discovered long after
the crime or abuse was committed.

If cost is critical, the solution is to split the retention into two 
parts: short-term online storage and long-term off-line storage. For 
example, archiving old logs on tape allows for cost-effective off-line 
storage, while still enabling future analysis. 
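
The two-part retention split can be sketched in a few lines of Python.
The 30-day and 365-day thresholds and the action names below are
assumptions chosen for illustration; real values depend on budget and
on any regulatory requirements.

```python
from datetime import date

ONLINE_DAYS = 30     # assumed: kept on fast, searchable storage
ARCHIVE_DAYS = 365   # assumed: kept off-line (for example, on tape)


def retention_action(log_date, today,
                     online_days=ONLINE_DAYS, archive_days=ARCHIVE_DAYS):
    """Decide what to do with a log file based on its age in days."""
    age = (today - log_date).days
    if age <= online_days:
        return "keep-online"
    if age <= archive_days:
        return "archive-offline"
    return "delete"


today = date(2004, 10, 21)
log_dates = [date(2004, 10, 1), date(2004, 6, 1), date(2003, 1, 1)]
plan = {d.isoformat(): retention_action(d, today) for d in log_dates}
print(plan)
```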

No. 3: Not normalizing logs 

What do we mean by "normalization"? It means we can convert the logs 
into a universal format, containing all the details of the original 
message but also allowing us to compare and correlate different log 
data sources such as Unix and Windows logs. Across different 
application and security solutions, log format confusion reigns: some 
prefer Simple Network Management Protocol, others favor classic Unix 
syslog. Proprietary methods are also common. 

Lack of a standard logging format leads to companies needing different 
expertise to analyze the logs. Not all skilled Unix administrators who 
understand syslog format will be able to make sense out of an obscure 
Windows event log record, and vice versa. 

The situation is even worse with security systems, because people 
commonly have experience with a limited number of systems and thus 
will be lost in the log pile spewed out by a different device. As a 
result, a common format that can encompass all the possible messages 
from security-related devices is essential for analysis, correlation 
and, ultimately, for decision-making. 
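
A minimal sketch of what normalization can look like in practice
follows. It converts a classic BSD-style syslog line and a Windows
event (assumed here to be already exported as a dictionary) into one
common record layout. The common field names (time, host, source,
message, raw) and the Windows dictionary keys are assumptions for this
example; a real common format would need many more fields.

```python
import re
from datetime import datetime

# Classic syslog layout: "Mon DD HH:MM:SS host app[pid]: message"
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<app>[^\[:]+)(?:\[\d+\])?: (?P<msg>.*)$"
)


def from_syslog(line, year):
    """Normalize one syslog line; the year is supplied by the caller
    because the classic syslog timestamp does not include it."""
    m = SYSLOG_RE.match(line)
    if m is None:
        raise ValueError("unrecognized syslog line")
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return {"time": ts, "host": m["host"].lower(), "source": m["app"],
            "message": m["msg"], "raw": line}


def from_windows_event(event):
    """Normalize one Windows event given as a dict (hypothetical export)."""
    ts = datetime.strptime(event["TimeGenerated"], "%Y-%m-%d %H:%M:%S")
    return {"time": ts, "host": event["ComputerName"].lower(),
            "source": event["SourceName"],
            "message": f"{event['EventID']}: {event['Message']}",
            "raw": repr(event)}


syslog_line = ("Oct 21 14:02:11 web01 sshd[2112]: "
               "Failed password for root from 10.0.0.9 port 4022 ssh2")
windows_event = {
    "TimeGenerated": "2004-10-21 14:03:40",
    "ComputerName": "DC01",
    "SourceName": "Security",
    "EventID": 529,
    "Message": "Logon Failure: Unknown user name or bad password",
}

# Once both sources share one layout they can be merged and sorted by time.
records = sorted(
    [from_windows_event(windows_event), from_syslog(syslog_line, 2004)],
    key=lambda r: r["time"],
)
for r in records:
    print(r["time"], r["host"], r["source"], r["message"])
```

Keeping the original text in the raw field preserves all the details
of the source message, while the shared fields make it possible to
compare a Unix login failure with a Windows one.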

No. 4: Failing to prioritize log records 

Assuming that logs are collected, stored for a sufficiently long time 
and normalized, what else lurks in the muddy sea of log analysis? The 
logs are there, but where do we start? Should we go for a high-level 
summary, look at most recent events or something else? The fourth 
error is not prioritizing log records. Some system analysts may get 
overwhelmed and give up after trying to chew a king-size chunk of log 
data without getting any real sense of priority. 

Effective prioritization starts with defining a strategy.
Answering questions such as "What do we care about most?" "Has this
attack succeeded?" and "Has this ever happened before?" helps to
formulate it. Consider these questions to help you get started on a
prioritization strategy that will ease the burden of the gigabytes of
log data collected every day.
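
One hypothetical way to turn those three questions into a ranking is
sketched below in Python. The weights, the field names and the
asset-value table are arbitrary assumptions for illustration; the
point is only that each question becomes an explicit input to the
score.

```python
def priority(event, asset_value, seen_before):
    """Score one event; a higher score means 'look at this first'."""
    # What do we care about most? Unlisted hosts get the lowest value.
    score = asset_value.get(event["host"], 1)
    # Has this attack succeeded?
    if event.get("succeeded"):
        score *= 3
    # Has this ever happened before? Never-seen events get a boost.
    if event["signature"] not in seen_before:
        score += 2
    return score


events = [
    {"host": "kiosk7", "signature": "port-scan", "succeeded": False},
    {"host": "web01", "signature": "new-worm", "succeeded": False},
    {"host": "db01", "signature": "sql-injection", "succeeded": True},
]
asset_value = {"db01": 5, "web01": 3}
seen_before = {"port-scan", "sql-injection"}

ranked = sorted(events,
                key=lambda e: priority(e, asset_value, seen_before),
                reverse=True)
print([e["host"] for e in ranked])
```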

No. 5: Looking for only the bad stuff 

Even the most advanced and security-conscious organizations can 
sometimes get tripped up by this pitfall. It's sneaky and insidious 
and can severely reduce the value of a log-analysis project. It occurs 
when an organization is only looking at what it knows is bad. 

Indeed, a vast majority of open-source tools and some commercial ones 
are set up to filter and look for bad log lines, attack signatures and 
critical events, among other things. For example, Swatch is a classic 
free log-analysis tool that's powerful, but only at one thing -- 
looking for defined bad things in log files. 
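
The "known bad" approach can be illustrated with a few lines of
Python. This is a generic sketch of pattern matching against log
lines, not Swatch's own configuration syntax, and the patterns and
sample lines are made up for the example.

```python
import re

# Assumed list of "known bad" patterns; a real list would be much longer.
BAD_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"failed password", r"segfault", r"denied")
]


def known_bad(lines):
    """Return only the lines that match at least one known-bad pattern."""
    return [line for line in lines
            if any(p.search(line) for p in BAD_PATTERNS)]


sample = [
    "sshd[2112]: Failed password for root from 10.0.0.9",
    "sshd[2113]: Accepted password for backup from 10.0.0.9",
    "cron[411]: (root) CMD (run-parts /etc/cron.hourly)",
]
hits = known_bad(sample)
print(hits)
```

Only the failed login is reported. The successful login from the same
address right after it passes unnoticed, because nothing on the list
describes it.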

However, to fully realize the value of log data, organizations need to
take it to the next level -- to log mining. In this step, you can discover
things of interest in log files without having any preconceived notion 
of what you need to find. Some examples include compromised or 
infected systems, novel attacks, insider abuse and intellectual 
property theft. 

It sounds obvious: How can we be sure we know of all the possible 
malicious behavior in advance? One option is to list all the known 
good things and then look for the rest. It sounds like a solution, but 
such a task is not only onerous, but also thankless. It's usually even 
harder to list all the good things than it is to list all the bad 
things that might happen on a system or network. So many different 
events occur that weeding out attack traces just by listing all the 
possibilities is ineffective. 

A more intelligent approach is needed. Some of the data mining (also 
called "knowledge discovery in databases") and visualization methods 
actually work on log data with great success. They allow organizations 
to look for real anomalies in log data, beyond "known bad" and "not 
known good." 
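
As a very small taste of this idea, the Python sketch below flags log
messages whose general shape is rare, without any list of good or bad
patterns. It is a simple frequency count, assumed here only for
illustration, and far cruder than the data mining and visualization
methods mentioned above; the masking rules and the sample messages are
likewise assumptions.

```python
import re
from collections import Counter


def template(message):
    """Reduce a message to its shape by masking IP addresses and numbers."""
    masked = re.sub(r"\d+(?:\.\d+){3}", "<ip>", message)
    return re.sub(r"\d+", "<n>", masked)


def rare_messages(messages, max_count=1):
    """Return messages whose template occurs at most max_count times."""
    counts = Counter(template(m) for m in messages)
    return [m for m in messages if counts[template(m)] <= max_count]


messages = (
    [f"Accepted password for backup from 10.0.0.{i} port {4000 + i}"
     for i in range(1, 6)]
    + [f"session opened for user backup by uid={i}" for i in range(5)]
    + ["kernel: device eth0 entered promiscuous mode"]
)
unusual = rare_messages(messages)
print(unusual)
```

The one message that surfaces is the interface entering promiscuous
mode, which no rule asked for but which stands out simply because
nothing else in the sample looks like it.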

Avoiding these mistakes will take your log-analysis program to the 
next level and enhance the value of your company's security and 
logging infrastructures. 

Anton Chuvakin is a security strategist at netForensics Inc., a 
security information management company in Edison, N.J. His areas of 
expertise include intrusion detection, Unix security, forensics and 
honeypots. Chuvakin is the co-author of Security Warrior (O'Reilly, 
2004) and a contributor to Know Your Enemy: Learning About Security 
Threats, Second Edition by the Honeynet Project (Addison-Wesley 
Professional, 2004) and Information Security Management Handbook 
(Auerbach Publishing, 2004). In his spare time, he maintains his 
security portal www.info-secure.org. 
