grep, cut, awk, sort, uniq… linux shell power!

How powerful can the Linux shell be? This is just another nice example of its power!

Well, here's the game: suppose you need to count, on a remote machine, how many times an IP address (or a user, or whatever you want) appears in a log file or in a part of it.

Let’s take a sample log: Apache access log.

The format is something like (see attached example log file):
1.1.1.1 - - [16/Jan/2008:22:18:42 +0100] "GET /dir/pag.htm" 200 11 "http://diegobelotti.com/a.php" "Mozilla 5 (Windows..) Firefox"

Now we want some statistics (the commands below are run on the example log).

The IP addresses seen between 22:10 and 22:20 and their frequency

grep '16\/Jan\/2008:22:1' example.log | cut -d' ' -f1 | sort | uniq -c

Output:

9 1.1.1.1
3 1.1.1.2

the GREP command finds the lines containing '16/Jan/2008:22:1' (that is, from 22:10:00 to 22:19:59), the CUT part splits every line at the space characters and takes the first field (the IP address), then we just SORT the results and count the UNIQUE values.. easy, isn't it?
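
If you want the busiest IP addresses on top, a small variation of the same pipeline sorts the counts numerically in reverse:

cut -d' ' -f1 example.log | grep -c '' > /dev/null; grep '16\/Jan\/2008:22:1' example.log | cut -d' ' -f1 | sort | uniq -c | sort -rn

The final sort -rn puts the most frequent IP first.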

Visited pages and their frequency

grep '16\/Jan' example.log | cut -d' ' -f7 | sort | uniq -c

Output:

12 /dir/pag.htm"

the GREP command finds the lines containing '16/Jan' (hopefully you rotate the log at least once a year!), the CUT part splits every line at the space characters and takes the 7th field, then we just SORT the results and count the UNIQUE values. The trailing double quote shows up because in this sample the request field has no HTTP version, so the 7th field is the path plus the closing quote.
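
If you prefer the request without the trailing quote, you can let AWK split the line on the double quotes instead (just a sketch on the same example.log):

grep '16\/Jan' example.log | awk -F'"' '{print $2}' | sort | uniq -c

Here -F'"' uses the double quote as field separator, so $2 is the whole request (GET /dir/pag.htm).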

Number of currently connected IP addresses

netstat -ntu | awk '{print $4}' | cut -d: -f1 | sort | uniq -c | sort -n

Output:

4 10.111.111.114
1 (w/o
1 Local
1

in this case AWK prints out the 4th column of netstat's output (the Local Address), CUT strips everything after the first colon (the port), and the usual sort | uniq -c | sort -n counts and orders the addresses. The "(w/o" and "Local" entries come from netstat's two header lines, and the empty one is probably an IPv6 address, whose Local Address starts with colons...
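
If what you really want are the remote clients rather than the local addresses, a similar pipeline on the 5th column works; tail -n +3 skips netstat's two header lines (a sketch, assuming the standard netstat column layout):

netstat -ntu | tail -n +3 | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n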

Now combine these commands to filter logs and outputs and make them human readable!
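
For example, this one-liner (just a sketch combining the pieces above) prints the top 10 requesting IP addresses in the whole example.log, most active first:

cut -d' ' -f1 example.log | sort | uniq -c | sort -rn | head -10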
