AWK Tips


Killing processes by name (in this example we kill the process called netscape)

kill `ps auxww | grep netscape | egrep -v grep | awk '{print $2}'`
or with awk alone:
ps auxww | awk '$0~/netscape/&&$0!~/awk/{print $2}' |xargs kill 
This has to be adjusted to fit the ps command of whatever Unix system you are on. Basically it says: "if the process is called netscape and the line is not the 'grep netscape' (or awk) command itself, then print the PID".
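
On systems that provide pgrep/pkill (most modern Linux and BSD variants), the same thing can be done without parsing ps output at all; this is an alternative sketch, not part of the original tip:

pgrep -f netscape | xargs kill
# or simply:
pkill -f netscape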

UNIX Basics: Examples with awk (a short introduction)

Games with ls

  1. Renaming within the name:
    ls -1 *old* | awk '{print "mv "$1" "$1}' | sed s/old/new/2 | sh
    (although in some cases it will fail, as in file_old_and_old)
     
  2. Remove only files:
    ls -l * | grep -v drwx | awk '{print "rm "$9}' | sh
    or with awk alone:
    ls -l | awk '$1!~/^drwx/{print $9}'|xargs rm
    Be careful when trying this out in your home directory. We remove files! It's safer to move them to some location like /tmp/garbage.
     
  3. Remove only directories:
    ls -l | grep '^d' | awk '{print "rm -r "$9}' | sh
    or
    ls -p | grep /$ | awk '{print "rm -r "$1}' | sh
    or with awk alone:
    ls -l|awk '$1~/^d.*x/{print $9}'|xargs rm -r
    Be careful when trying this out in your home directory. We remove things! (A safer find-based alternative is sketched below.)
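
Parsing ls output breaks as soon as file names contain spaces or other unusual characters. As a more robust alternative for recipes 2 and 3 (a sketch, assuming GNU or BSD find), find can select files or directories directly and hand them to rm:

    find . -maxdepth 1 -type f -exec rm {} +                # remove only regular files
    find . -maxdepth 1 -mindepth 1 -type d -exec rm -r {} + # remove only directories

The same caution applies: try it on a throw-away directory first.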

An alternative way to pass shell variables into an AWK program from the command line is to use "variable assignment" parameters:

Pseudo-files

AWK knows another way to assign values to AWK variables, as in the following example:

$ awk '{ print "var is", var }' var=TEST file1 

This statement assigns the value "TEST" to the AWK variable "var" and then reads the file "file1". The assignment works because AWK interprets any command-line argument containing an equal sign ("=") as an assignment rather than a file name.

This example is very portable  and easy to use. So why don't we use this syntax exclusively?

This syntax has two drawbacks. First, the assignment is interpreted by AWK only when it reaches that point in the argument list, i.e. after all files listed before it have been read; only then does the assignment take place. Since the BEGIN action is performed before the first file is read, the variable is not yet available in the BEGIN action.

The second problem is that the order of the variable assignments and of the files is important. In the following example

$ awk '{ print "var is", var }' file1 var=TEST file2

the variable var is not defined while file1 is read; it receives its value only before file2 is read. This may cause bugs that are hard to track down.
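
If the value is needed in the BEGIN action, or independently of file order, the -v option (part of POSIX awk) assigns the variable before any input is read:

$ awk -v var=TEST 'BEGIN { print "var is", var } { print var, $0 }' file1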

Comparing Strings

Sometimes you want to compare a string or pattern with another string, for example a particular field or a variable. You can compare two strings in various ways, including whether one contains the other, whether they are identical, or whether one precedes the other in alphabetical order.

You use the tilde (~) operator to test whether a string (for example a particular field) matches a regular expression. For example,

$2 ~ /^15/

checks whether field 2 begins with 15, regardless of what the rest of the field may contain. It is a test for matching, not identity. If you wish to test whether field 2 contains precisely the string 15 and nothing else, you could use

$2 ~ /^15$/

You can test for non-matching strings with !~. It is the negation of ~: the expression is true if the string does not match the regular expression.

You can use the == operator to check whether two strings are identical, rather than whether one contains the other. For example,

$1==$3

checks to see whether the value of field 1 is equal to the value of field 3.

Do not confuse == with =. The former (==) tests whether two strings are identical. The single equal sign (=) assigns a value to a variable. For example,

$1=15

sets the value of field 1 equal to 15. It would be used as part of an action statement. On the other hand,

$1==15

compares the value of field 1 to the number 15. It could be a pattern statement.

The != operator tests whether the values of two expressions are not equal. For example,

$1 != "pencils"

is a pattern that matches any line where the first field is not "pencils."
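
A minimal illustration of these operators (the file and its contents are made up for the example):

$ cat inventory
15 pencils 15
152 pens 9

$ awk '$1 ~ /^15/  { print "starts with 15:", $1 }
       $1 ~ /^15$/ { print "is exactly 15:", $1 }
       $2 != "pencils" { print "not pencils:", $2 }' inventory
starts with 15: 15
is exactly 15: 15
starts with 15: 152
not pencils: pens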

25 Best AWK Commands - Tricks

UrFix's Blog

If you are stuck on OS X, here's a version for the Mac that uses Perl; it pulls your Gmail inbox via the Atom feed and prints the sender and subject of each message:

curl -u username:password --silent "https://mail.google.com/mail/feed/atom" | tr -d '\n' | awk -F '<entry>' '{for (i=2; i<=NF; i++) {print $i}}' | perl -pe 's/^<title>(.*)<\/title>.*<name>(.*)<\/name>.*$/$2 - $1/'

If you want to see the name of the last person who added a message to the conversation, change the greediness of the operators like this:

curl -u username:password --silent "https://mail.google.com/mail/feed/atom" | tr -d '\n' | awk -F '<entry>' '{for (i=2; i<=NF; i++) {print $i}}' | perl -pe 's/^<title>(.*)<\/title>.*?<name>(.*?)<\/name>.*$/$2 - $1/'

5) Remove duplicate entries in a file without sorting.

awk '!x[$0]++' <file>

Unlike sort -u, this removes duplicate lines without reordering the file: awk keeps the original order, prints only the first occurrence of each line, and the result can be redirected into another file.
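
The idiom relies on an associative array: x[$0] counts how often a line has been seen, and the default print action fires only while that count is still zero. Written out long-hand it is equivalent to:

awk '{ if (seen[$0] == 0) print; seen[$0]++ }' file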

6) find geographical location of an ip address

lynx -dump http://www.ip-adress.com/ip_tracer/?QRY=$1|grep address|egrep 'city|state|country'|awk '{print $3,$4,$5,$6,$7,$8}'|sed 's\ip address flag \\'|sed 's\My\\'

I save this to bin/iptrace and run "iptrace ipaddress" to get the Country, City and State of an ip address using the http://ipadress.com service.

I add the following to my script to get a tinyurl of the map as well:

URL=`lynx -dump http://www.ip-adress.com/ip_tracer/?QRY=$1|grep details|awk '{print $2}'`

lynx -dump http://tinyurl.com/create.php?url=$URL|grep tinyurl|grep "19. http"|awk '{print $2}'

7) Block known dirty hosts from reaching your machine

wget -qO - http://infiltrated.net/blacklisted|awk '!/#|[a-z]/&&/./{print "iptables -A INPUT -s "$1" -j DROP"}'

Blacklisted is a compiled list of all known dirty hosts (botnets, spammers, brute-forcers, etc.), updated on an hourly basis. This command fetches the list and generates the iptables rules for you; if you want the hosts blocked automatically, append |sh to the end of the command line. Blocking everything and allowing in specific hosts is the more practical solution, but many people can't or don't work that way, and that is where this one-liner comes in handy. For those using ipfw, a quick fix would be {print "add deny ip from "$1" to any"}. Be advised that the blacklisted file itself already filters out RFC1918 addresses (10.x.x.x, 172.16-31.x.x, 192.168.x.x); it is nonetheless advisable to check/parse the list before you implement the rules.

8) Display a list of committers sorted by the frequency of commits

svn log -q|grep "|"|awk "{print \$3}"|sort|uniq -c|sort -nr

Use this command to get a list of committers sorted by the frequency of their commits.

9) List the number and type of active network connections

netstat -ant | awk '{print $NF}' | grep -v '[a-z]' | sort | uniq -c

10) View facebook friend list [hidden or not hidden]

lynx -useragent=Opera -dump 'http://www.facebook.com/ajax/typeahead_friends.php?u=4&__a=1' |gawk -F'\"t\":\"' -v RS='\",' 'RT{print $NF}' |grep -v '\"n\":\"' |cut -d, -f2

There's no need to be logged in to Facebook. I could do more JSON filtering but you get the idea…

Replace u=4 (Mark Zuckerberg, Facebook creator) with desired uid.

Hidden or not hidden… Scary, isn't it?

11) List recorded form fields of Firefox

cd ~/.mozilla/firefox/ && sqlite3 `cat profiles.ini | grep Path | awk -F= '{print $2}'`/formhistory.sqlite "select * from moz_formhistory" && cd - > /dev/null

When you fill in a form with Firefox, it suggests things you entered in previous forms with the same field names. This command lists everything Firefox has recorded. Using a "delete from" statement, you can remove annoying Google queries, for example ;-)

12) Brute force discover

sudo zcat /var/log/auth.log.*.gz | awk '/Failed password/&&!/for invalid user/{a[$9]++}/Failed password for invalid user/{a["*" $11]++}END{for (i in a) printf "%6s\t%s\n", a[i], i|"sort -n"}'

Shows the number of failed login attempts per account. If the user does not exist, the account is marked with *.
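
The same program, reformatted with comments to show what each rule does (a readable sketch of the one-liner above):

sudo zcat /var/log/auth.log.*.gz | awk '
    # failed attempt for an existing account: the user name is field 9
    /Failed password/ && !/for invalid user/ { a[$9]++ }
    # failed attempt for a non-existent account: the user name is field 11, marked with *
    /Failed password for invalid user/       { a["*" $11]++ }
    # print "count<TAB>account", sorted numerically
    END { for (i in a) printf "%6s\t%s\n", a[i], i | "sort -n" }'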

13) Show biggest files/directories, biggest first with 'k,m,g' eyecandy

du --max-depth=1 | sort -r -n | awk '{split("k m g",v); s=1; while($1>1024){$1/=1024; s++} print int($1)" "v[s]"\t"$2}'

I use this on Debian testing; it works like the other sorted du variants, but I like small numbers and suffixes :)

14) Analyse an Apache access log for the most common IP addresses

tail -10000 access_log | awk '{print $1}' | sort | uniq -c | sort -n | tail

This uses awk to grab the IP address from each request, then sorts and summarises them to show the ten most frequent.
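
The counting can also be done inside awk itself with an associative array, saving one of the sort passes; a sketch of the same analysis:

tail -10000 access_log | awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' | sort -rn | head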

15) copy working directory and compress it on-the-fly while showing progress

tar -cf - . | pv -s $(du -sb . | awk '{print $1}') | gzip > out.tgz

What happens here is that we tell tar to create ("-c") an archive of all files in the current directory ("." , recursively) and write the data to stdout ("-f -"). Next we specify the size ("-s") to pv of all files in the current directory: "du -sb . | awk '{print $1}'" returns the number of bytes in the current directory, and it gets fed as the "-s" parameter to pv. Next we gzip the whole content and write the result to the out.tgz file. This way pv knows how much data is still left to be processed and shows a progress bar along with an estimate of the time remaining.

Credit: Peteris Krumins http://www.catonmat.net/blog/unix-utilities-pipe-viewer/

16) List of commands you use most often

history | awk '{print $2}' | sort | uniq -c | sort -rn | head

17) Identify long lines in a file

awk 'length>72' file

This command displays the lines that are longer than 72 characters. I use it to identify such lines in my scripts and shorten them the way I like.
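
To see where the long lines are rather than just their content, the same test can print the file name and line number; a small variation on the command above:

awk 'length > 72 { print FILENAME ": " NR ": " $0 }' file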

18) Makes you look busy

alias busy='my_file=$(find /usr/include -type f | sort -R | head -n 1); my_len=$(wc -l $my_file | awk "{print \$1}"); let "r = $RANDOM % $my_len" 2>/dev/null; vim +$r $my_file'

This makes an alias for a command named 'busy'. The 'busy' command opens a random file in /usr/include to a random line with vim. Drop this in your .bash_aliases and make sure that file is initialized in your .bashrc.

19) Show me a histogram of the busiest minutes in a log file:

cat /var/log/secure.log | awk '{print substr($0,0,12)}' | uniq -c | sort -nr | awk '{printf("\n%s ",$0) ; for (i = 0; i<$1 ; i++) {printf("*")};}'

20) Analyze awk fields

awk '{print NR": "$0; for(i=1;i<=NF;++i)print "\t"i": "$i}'

Breaks down and numbers each line and its fields. This is really useful when you are going to parse something with awk but aren't sure exactly where to start.
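
For example, on a hypothetical one-line input:

$ echo "one two three" | awk '{print NR": "$0; for(i=1;i<=NF;++i)print "\t"i": "$i}'
1: one two three
	1: one
	2: two
	3: three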

21) Browse system RAM in a human readable form

sudo cat /proc/kcore | strings | awk 'length > 20' | less

This command lets you see and scroll through all of the strings that are stored in the RAM at any given time. Press space bar to scroll through to see more pages (or use the arrow keys etc).

Sometimes, if you didn't save a file you were working on, or want to get back something you closed, it can be found floating around in here!

The awk command only shows lines that are longer than 20 characters (to avoid seeing lots of junk that probably isn't "human readable").

If you want to dump the whole thing to a file replace the final '| less' with '> memorydump'. This is great for searching through many times (and with the added bonus that it doesn't overwrite any memory…).

Here's a neat example that shows conversations that were had in Pidgin (it will probably still work after Pidgin has been closed)…

sudo cat /proc/kcore | strings | grep '([0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\})'

(Depending on sudo settings it might be best to run "sudo su" first to get to a # prompt.)

22) Monitor open connections for httpd including listen, count and sort it per IP

watch "netstat -plan|grep :80|awk {'print \$5'} | cut -d: -f 1 | sort | uniq -c | sort -nk 1"

It's not my code, but I found it useful for seeing how many open connections per client IP I have on a machine, so I can debug connection problems without opening another HTTP connection to do it.

You can also decide to sort the output differently than the way it appears here.

23) Purge configuration files of removed packages on debian based systems

sudo aptitude purge `dpkg --get-selections | grep deinstall | awk '{print $1}'`

Purge all configuration files of removed packages

24) Quick glance at who's been using your system recently

last  | grep -v "^$" | awk '{ print $1 }' | sort -nr | uniq -c

This command takes the output of the 'last' command, removes empty lines, extracts just the first field (the username), sorts the usernames in reverse order and then gives a summary count of unique matches.

25) Number of open connections per ip.

netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n

Here is a command line to run on your server if you think it is under attack. It prints out a list of open connections to your server, sorted by how many connections each IP holds.


