Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)


Logs Related Tips


This is a collection of one-liners I have found useful. These short command line tools have been tested on Solaris tcsh and bash shells.

One frequent problem with logs is truncation. The easiest form of truncation would be

cat /dev/null > target.log

or just

:>target.log
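A minimal sketch of what in-place truncation does, assuming a POSIX shell: the file keeps its inode, so a process that has the log open keeps writing to the same file. The scratch file below is hypothetical and only stands in for a real log.

```shell
#!/bin/sh
# Hypothetical scratch file standing in for a real log.
log=$(mktemp)
printf 'line one\nline two\n' > "$log"

inode_before=$(ls -i "$log" | awk '{print $1}')

# Truncate in place: the file stays, only its contents are discarded.
: > "$log"

inode_after=$(ls -i "$log" | awk '{print $1}')
size_after=$(wc -c < "$log" | tr -d ' ')

echo "size after truncation: $size_after"
if [ "$inode_before" = "$inode_after" ]; then
    echo "same inode before and after"
fi
rm -f "$log"
```

Deleting the file and creating a new one would instead give a new inode, and a daemon holding the old file open would keep writing to the deleted one.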

Note: Most tips were borrowed from ktmatu - One-liners by Matti Tukiainen. Some are modified. We assume that the web server log files (access_log*) are in Combined Format.

How to view log files without line wrapping?

less has the option -S (or --chop-long-lines), which causes lines longer than the screen width to be chopped rather than folded: the portion of a long line that does not fit in the screen width is not shown. The default is to fold long lines, that is, to display the remainder on the next line.
less -S access_log

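How many lines (hits) are there in the log file?

% wc -l access_log

To count only the requests answered with status 200, test the status field; a bare "grep 200" also matches byte counts, dates and URLs that contain 200:

% awk '$9 == 200' access_log | wc -l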

How many page views?

gzip -dc access_log.gz | egrep -vc '(\.gif |\.jpg |\.png )'
2569
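The egrep filter above is case-sensitive and looks for the extension followed by a space anywhere in the line. A hedged alternative sketch, assuming an awk that provides tolower() (nawk or gawk; the old Solaris /usr/bin/awk may not), tests only the requested URL in field 7. The sample log lines are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample log in Combined Format, for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
a.example - - [01/Jan/2001:00:00:01 +0200] "GET /index.html HTTP/1.0" 200 1000 "-" "Mozilla/4.0"
a.example - - [01/Jan/2001:00:00:02 +0200] "GET /logo.gif HTTP/1.0" 200 500 "-" "Mozilla/4.0"
b.example - - [01/Jan/2001:00:00:03 +0200] "GET /PHOTO.PNG HTTP/1.0" 200 900 "-" "Mozilla/4.0"
b.example - - [01/Jan/2001:00:00:04 +0200] "GET /rates.html?x=1 HTTP/1.0" 200 700 "-" "Mozilla/4.0"
EOF

page_views=$(awk '{
    path = tolower($7)          # requested URL, lower-cased
    sub(/\?.*/, "", path)       # drop any query string
    if (path !~ /\.(gif|jpg|png)$/) n++
} END { print n + 0 }' "$log")

echo "page views: $page_views"
rm -f "$log"
```

For a compressed log the same awk program can read from `gzip -dc access_log.gz` on standard input.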

How many hits today?

% grep -c `date '+%d/%b/%Y'` access_log
2569

How many unique visitors today?

% grep `date '+%d/%b/%Y'` access_log | cut -d" " -f1 | sort -u | wc -l
1196
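One caveat with the date-based one-liners: the month abbreviation printed by date depends on the locale, while the log always uses English names. The sketch below, which assumes a POSIX shell and a made-up sample log, forces the C locale and anchors the pattern on the opening bracket and the colon after the year so that dates inside URLs or referrers are not matched.

```shell
#!/bin/sh
# Force English month names regardless of the user's locale.
today=$(LC_ALL=C date '+%d/%b/%Y')

# Hypothetical sample log: three hits today from two hosts, one old hit.
log=$(mktemp)
{
    printf 'a.example - - [%s:10:00:00 +0000] "GET / HTTP/1.0" 200 100 "-" "-"\n' "$today"
    printf 'a.example - - [%s:10:05:00 +0000] "GET /x.html HTTP/1.0" 200 100 "-" "-"\n' "$today"
    printf 'b.example - - [%s:11:00:00 +0000] "GET / HTTP/1.0" 200 100 "-" "-"\n' "$today"
    printf 'c.example - - [01/Jan/1999:11:00:00 +0000] "GET / HTTP/1.0" 200 100 "-" "-"\n'
} > "$log"

# Match the timestamp field only: "[" + date + ":".
hits=$(grep -c "\[$today:" "$log")
visitors=$(grep "\[$today:" "$log" | cut -d' ' -f1 | sort -u | wc -l | tr -d ' ')

echo "hits today: $hits, unique visitors today: $visitors"
rm -f "$log"
```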

How many hits in a particular day?

Uncompressed log file, e.g. January 1, 2001
% grep -c 01/Jan/2001 access_log
2569
Compressed log file, e.g. January 1, 2001
% gzip -dc access_log.gz | grep -c 01/Jan/2001
2569

What period is covered in the log?

Uncompressed log file
% head -1 access_log; tail -1 access_log
foo.example - - [30/Dec/2000:23:55:25 +0200] "GET /~ktmatu/ ...
bar.example - - [06/Jan/2001:23:53:37 +0200] "GET /~ktmatu/rates.html ...
Compressed log file
% gzip -dc access_log.gz | head -1 ; gzip -dc access_log.gz | tail -1
foo.example - - [30/Dec/2000:23:55:25 +0200] "GET /~ktmatu/ ...
bar.example - - [06/Jan/2001:23:53:37 +0200] "GET /~ktmatu/rates.html ...

Are there missing dates?

Uncompressed log file
% cut -d" " -f4 access_log | cut -d"/" -f1 | uniq
[30
[31
[01
[03
[04
[05
[06
Compressed log file
% gzip -dc access_log.gz | cut -d" " -f4 | cut -d"/" -f1 | uniq
[30
[31
[01
[03
[04
[05
[06
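In the sample output the day jumps from 01 to 03, so there are no entries for the 2nd. A variant sketch that prints the full date with a hit count per day can make such gaps easier to read when the log spans a month boundary. It assumes the log is in chronological order; the sample lines are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample log; note that 01/Jan/2001 is missing.
log=$(mktemp)
cat > "$log" <<'EOF'
a.example - - [30/Dec/2000:23:55:25 +0200] "GET / HTTP/1.0" 200 100 "-" "-"
b.example - - [30/Dec/2000:23:58:00 +0200] "GET / HTTP/1.0" 200 100 "-" "-"
a.example - - [31/Dec/2000:08:00:00 +0200] "GET / HTTP/1.0" 200 100 "-" "-"
c.example - - [02/Jan/2001:09:00:00 +0200] "GET / HTTP/1.0" 200 100 "-" "-"
EOF

# Field 4 is "[dd/Mon/yyyy:hh:mm:ss"; keep the date part and count runs.
days=$(cut -d' ' -f4 "$log" | cut -d: -f1 | tr -d '[' | uniq -c |
       awk '{ print $2 "=" $1 }')

echo "$days"
rm -f "$log"
```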

How many corrupted log entries?

This is just a very quick and dirty way to check the log.

Uncompressed log file
% perl -ane 'print $_ if (scalar (split /\"/)) != 7' access_log | wc -l
       7
Compressed log file
% gzip -dc access_log.gz | perl -ane 'print $_ if (scalar (split /\"/)) != 7' | wc -l
       7
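The check relies on a well-formed Combined Format line containing exactly six double quotes, which yields seven pieces when the line is split on the quote character. The same idea can be sketched in awk, which may be handy where Perl is not installed. The sample log is made up and contains two deliberately broken lines.

```shell
#!/bin/sh
# Hypothetical sample log: two good lines, one truncated line,
# one line with a stray quote inside the request.
log=$(mktemp)
cat > "$log" <<'EOF'
a.example - - [01/Jan/2001:00:00:01 +0200] "GET / HTTP/1.0" 200 100 "-" "Mozilla/4.0"
b.example - - [01/Jan/2001:00:00:02 +0200] "GET / HTTP/1.0" 200 100 "-" "Mozil
c.example - - [01/Jan/2001:00:00:03 +0200] "GET /a"b HTTP/1.0" 404 0 "-" "-"
d.example - - [01/Jan/2001:00:00:04 +0200] "GET /x.html HTTP/1.0" 200 300 "-" "Mozilla/4.0"
EOF

# With the double quote as field separator, a good line has 7 fields.
corrupted=$(awk -F'"' 'NF != 7' "$log" | wc -l | tr -d ' ')

echo "suspicious lines: $corrupted"
rm -f "$log"
```

Like the Perl version, this is only a quick sanity check: a line can have six quotes and still be malformed in other ways.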

What does line 15927, or lines 15920 - 15929, look like?

Uncompressed log file
% grep -n '.*' access_log | grep '^15927\:'
15927:foo.example.com - - [20/Jan/2002:11:23:45 +0200] "GET ...
% grep -n '.*' access_log | grep '^1592.\:'
15920:foo.example.com - - [20/Jan/2002:11:23:40 +0200] "GET ...
15921:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
15922:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
...
Compressed log file
% gzip -dc access_log.gz | grep -n '.*' | grep '^15927\:'
15927:foo.example.com - - [20/Jan/2002:11:23:45 +0200] "GET ...
% gzip -dc access_log.gz | grep -n '.*' | grep '^1592.\:'
15920:foo.example.com - - [20/Jan/2002:11:23:40 +0200] "GET ...
15921:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
15922:foo.example.com - - [20/Jan/2002:11:23:41 +0200] "GET ...
...
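An alternative sketch, using sed to address lines by number directly instead of numbering every line with grep first. The 20-line file below is a hypothetical stand-in for a large log, so the line numbers are smaller than in the example above.

```shell
#!/bin/sh
# Hypothetical 20-line file standing in for a large log.
log=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 20; i++) print "entry " i }' > "$log"

# A single line by number.
one=$(sed -n '15p' "$log")

# An inclusive range of lines.
range_count=$(sed -n '12,14p' "$log" | wc -l | tr -d ' ')

echo "line 15: $one"
echo "lines 12-14: $range_count lines"
rm -f "$log"
```

On a real log the equivalents would be `sed -n '15927p' access_log` and `sed -n '15920,15929p' access_log`; for a compressed log, pipe `gzip -dc access_log.gz` into the same sed command.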

How to figure out the bandwidth consumption (in bytes)?

Today:
% grep `date '+%d/%b/%Y'` access_log | awk '{ s += $10 } END {print s}'
13113756
This month:
% grep `date '+../%b/%Y'` access_log | awk '{ s += $10 } END {print s}'
569477018
Used by Googlebot:
% grep -i googlebot access_log | awk '{ s += $10 } END {print s}'
29832233
Used by some rogue user from IP-address 169.254.22.12:
% grep '^169\.254\.22\.12 ' access_log | awk '{ s += $10 } END {print s}'
46760880
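A sketch of the same sums done entirely in awk, assuming the Combined Format field positions used above (host in field 1, bytes in field 10). Comparing the host field exactly avoids also counting addresses such as 169.254.22.120, a "-" byte count (for example on a 304 response) adds zero, and "s + 0" prints 0 instead of an empty line when nothing matches. The sample log is hypothetical.

```shell
#!/bin/sh
# Hypothetical sample log, for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
169.254.22.12 - - [01/Jan/2001:00:00:01 +0200] "GET /a.html HTTP/1.0" 200 100 "-" "-"
169.254.22.12 - - [01/Jan/2001:00:00:02 +0200] "GET /a.html HTTP/1.0" 304 - "-" "-"
169.254.22.120 - - [01/Jan/2001:00:00:03 +0200] "GET /b.html HTTP/1.0" 200 200 "-" "-"
EOF

# Total bytes for the whole log.
total=$(awk '{ s += $10 } END { print s + 0 }' "$log")

# Bytes for one host only, matched exactly on the first field.
by_host=$(awk -v ip=169.254.22.12 '$1 == ip { s += $10 } END { print s + 0 }' "$log")

echo "total: $total bytes, 169.254.22.12: $by_host bytes"
rm -f "$log"
```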

How to delete partial GET requests from the log?

Partial content requests are usually generated by download managers to speed up the downloading of big files, and by Adobe Acrobat Reader to fetch PDF documents page by page. In this example, 206 requests generated by Acrobat Reader are deleted so that they don't inflate the hit count.

% grep -v '\.pdf .* 206 ' access_log > new_log
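The grep pattern matches " 206 " anywhere after ".pdf ", so it also drops a PDF request whose byte count happens to be 206. A stricter sketch, assuming the Combined Format field positions (URL in field 7, status in field 9), tests those two fields directly. The sample log and the output file name are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample log, for illustration only.
tmpdir=$(mktemp -d)
log="$tmpdir/access_log"
cat > "$log" <<'EOF'
a.example - - [01/Jan/2001:00:00:01 +0200] "GET /doc.pdf HTTP/1.0" 206 4096 "-" "-"
a.example - - [01/Jan/2001:00:00:02 +0200] "GET /doc.pdf HTTP/1.0" 200 206 "-" "-"
b.example - - [01/Jan/2001:00:00:03 +0200] "GET /big.iso HTTP/1.0" 206 5000 "-" "-"
b.example - - [01/Jan/2001:00:00:04 +0200] "GET /index.html HTTP/1.0" 200 300 "-" "-"
EOF

# Drop only lines whose status is 206 and whose URL ends in .pdf.
awk '!($9 == 206 && $7 ~ /\.pdf$/)' "$log" > "$tmpdir/new_log"
kept_awk=$(wc -l < "$tmpdir/new_log" | tr -d ' ')

# For comparison: lines the original grep filter would keep.
kept_grep=$(grep -vc '\.pdf .* 206 ' "$log")

echo "kept by awk: $kept_awk, kept by grep: $kept_grep"
rm -rf "$tmpdir"
```

In this sample the grep filter also removes the second line, a complete (200) PDF download of 206 bytes, while the awk filter keeps it.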

How to compress a selected portion from a log?

Use gzip to compress log entries in May 2009
% grep ' \[../May/2009\:' access_log | gzip -9c > access_log-2009-05.gz
Use bzip2 to compress log entries in May 2009
% grep ' \[../May/2009\:' access_log | bzip2 > access_log-2009-05.bz2
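Before removing the selected entries from the live log, it may be worth checking that the archive is intact and holds as many lines as were selected. A minimal sketch, assuming gzip is installed; the sample log is hypothetical.

```shell
#!/bin/sh
# Hypothetical sample log: two entries in May 2009, one in June.
tmpdir=$(mktemp -d)
log="$tmpdir/access_log"
cat > "$log" <<'EOF'
a.example - - [30/May/2009:10:00:00 +0200] "GET / HTTP/1.0" 200 100 "-" "-"
a.example - - [31/May/2009:10:00:00 +0200] "GET / HTTP/1.0" 200 100 "-" "-"
a.example - - [01/Jun/2009:10:00:00 +0200] "GET / HTTP/1.0" 200 100 "-" "-"
EOF

pattern=' \[../May/2009:'
archive="$tmpdir/access_log-2009-05.gz"

# Number of lines that should end up in the archive.
selected=$(grep -c "$pattern" "$log")

grep "$pattern" "$log" | gzip -9c > "$archive"

# Lines actually stored in the archive.
archived=$(gzip -dc "$archive" | wc -l | tr -d ' ')

# gzip -t checks the integrity of the compressed file.
if gzip -t "$archive" && [ "$selected" -eq "$archived" ]; then
    echo "archive verified: $archived lines"
fi
rm -rf "$tmpdir"
```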

How to see in real time how the log file grows?

Using tail
% tail -f access_log
Using less: press "F" (Shift-F) to follow the file; press Ctrl-C to stop following, then q to quit
% less access_log


Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). The site uses AdSense, so you need to be aware of Google's privacy policy. Copyright of original materials belongs to their respective owners. Quotes are made for educational purposes only, in compliance with the fair use doctrine.

Last modified: August 26, 2009