Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)


Unix uniq command

The uniq command eliminates or counts duplicate lines in a presorted file. It reads its input line by line and compares each line with the previous one. Depending on the options given on the command line, it displays only the unique lines, only one occurrence of each repeated line, or both. The common idioms are

sort | uniq 

and

sort | uniq -c

For example

grep 'hysteresis' * | awk -F: '{print $1}' | sort | uniq | wc -l 

or, in a more complex pipe:

cut -d '"' -f 2 "$1" | cut -d '/' -f 3 | tr '[:upper:]' '[:lower:]' \
   | sort | uniq -c | sort -rn > "${1}_sites"

By default, uniq displays the lines that appear only once together with one copy of each line that is repeated on adjacent lines.
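The counting idiom can be tried on a few inline lines before pointing it at real files. The following is a minimal sketch; the site names and the variable names are made up for illustration.

```shell
# Hypothetical sample data: one site name per line, unsorted.
sites='example.org
example.com
example.org
example.net
example.org
example.com'

# sort groups identical lines together, uniq -c collapses each group and
# prefixes it with its count, and sort -rn orders the result by that count.
top_sites=$(printf '%s\n' "$sites" | sort | uniq -c | sort -rn)
printf '%s\n' "$top_sites"
```

The most frequent name (example.org, three occurrences) comes first. The width of the padding in front of each count differs between uniq implementations, so scripts that parse this output should split on whitespace rather than on fixed columns.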

Syntax

     uniq [ -cdu ] [ +n ] [ -n ] [ input [ output ] ]

Options

-c Precedes each line with a count of the number of times it occurred in the input.
-d Displays only the lines that are repeated, one copy for each set of repeated lines. Lines that occur once are not displayed.
-u Displays only the lines that are not repeated (unique lines).
-n Ignores the first n fields of an input line when comparing for duplicate lines. A field is a string of nonblank characters. A blank is a space or tab. POSIX spells this option -f n.
+n Ignores the first n characters of an input line when comparing for duplicate lines. POSIX spells this option -s n.
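The options above can be compared side by side on a small input. This is only a sketch: the sample names and log lines are invented, and the field-skipping example uses the POSIX spelling -f rather than the historical -n form, which some current implementations reject.

```shell
# Hypothetical, already sorted sample input.
names='bonnie
chuck
chuck
davel
davel
davel
mark'

repeated=$(printf '%s\n' "$names" | uniq -d)   # one copy of each repeated line
singles=$(printf '%s\n' "$names" | uniq -u)    # lines that occur exactly once
printf 'repeated:\n%s\nsingles:\n%s\n' "$repeated" "$singles"

# -f 1 skips the first field (here a timestamp), so only the text after it
# is compared; the first line of each run is the one that is kept.
log='09:01 disk full
09:02 disk full
09:03 link down'
collapsed=$(printf '%s\n' "$log" | uniq -f 1)
printf '%s\n' "$collapsed"
```

Here -d reports chuck and davel, -u reports bonnie and mark, and the two "disk full" entries collapse into the 09:01 line.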

Arguments

The following arguments may be passed to the uniq command.

input The name of the file containing the input data.
output The name of the file to hold the output data. If no output file is specified, the output is displayed on the standard output.
The input and output files must not have the same name. If they do, the contents of the file are destroyed.
If no input file is specified, uniq reads from the standard input and writes to the standard output.
You cannot specify an output file without specifying an input file.
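The two-argument form, and one way to work around the same-name restriction, might look like the sketch below. The file names are hypothetical and everything happens in a temporary directory created with mktemp.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf 'a\na\nb\n' > "$workdir/input.txt"

# Two-argument form: read input.txt, write the result to output.txt.
uniq "$workdir/input.txt" "$workdir/output.txt"
written=$(cat "$workdir/output.txt")

# To replace a file with its de-duplicated contents, write to a temporary
# file first and rename it, rather than naming the same file twice.
uniq "$workdir/input.txt" "$workdir/input.tmp" &&
    mv "$workdir/input.tmp" "$workdir/input.txt"
replaced=$(cat "$workdir/input.txt")

printf 'output.txt:\n%s\ninput.txt after replacement:\n%s\n' "$written" "$replaced"
rm -rf "$workdir"
```

The rename only happens if uniq succeeds, so a failed run leaves the original file untouched.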

RELATED COMMANDS

sort

You can use uniq to remove duplicate lines from a file. The file must be sorted first; removing the duplicate lines then reduces the size of the file.

It is also useful for squeezing runs of blank lines in the sorted or unsorted output of other commands. For example, the dircmp command displays its output using pr; thus the output usually scrolls off your screen before you can read it. But if you pipe the output of the dircmp command through the uniq command, each run of blank lines is reduced to a single blank line and the output is more compact.
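The blank-line squeezing can be seen without dircmp. In this sketch the report text is invented and simply stands in for paginated command output.

```shell
# Hypothetical report text containing runs of empty lines.
report='header


body



footer'

# Adjacent empty lines are identical, so uniq keeps one line from each run.
squeezed=$(printf '%s\n' "$report" | uniq)
printf '%s\n' "$squeezed"
```

Two caveats apply. A line that contains only spaces or tabs is not identical to an empty line, so mixed runs are not fully collapsed. And uniq treats every adjacent repeat the same way, so identical non-blank lines that follow one another are collapsed as well.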

Old News ;-)

TTTT uniq

uniq(1) takes a stream of lines and collapses each run of adjacent duplicate lines into a single copy of the line. So if you had a file called foo that looked like:
davel
davel
davel
jeffy
jones
jeffy
mark
mark
mark
chuck
bonnie
chuck

You could run uniq on it like this:
% uniq foo
davel
jeffy
jones
jeffy
mark
chuck
bonnie
chuck

Notice that there are still two jeffy lines and two chuck lines. This is because the duplicates were not adjacent. To get a true unique list you have to make sure the stream is sorted:
% sort foo | uniq
bonnie
chuck
davel
jeffy
jones
mark

That gives you a truly unique list. However, it's also a useless use of uniq, since sort(1) has an argument, -u, that does this very common operation:
% sort -u foo
bonnie
chuck
davel
jeffy
jones
mark

That does exactly the same thing as "sort | uniq", but only takes one process instead of two.
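The equivalence is easy to check on a short list. The sketch below uses a few of the names from foo as inline data; the variable names are illustrative only.

```shell
# A short unsorted list with one non-adjacent duplicate.
list='davel
jeffy
jones
jeffy
mark'

with_uniq=$(printf '%s\n' "$list" | sort | uniq)
with_sort_u=$(printf '%s\n' "$list" | sort -u)

# Both pipelines should yield the same four names.
[ "$with_uniq" = "$with_sort_u" ] && echo "same output"
```

The shortcut covers plain de-duplication only. Counting with -c, or selecting lines with -d or -u, still needs uniq after the sort.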

uniq has other arguments that let it do more interesting mutilations on its input, such as the -c, -d and -u options described above.


Tuesday Tiny Techie Tip -- 03 December 1996

Recommended Links


Ad Hoc Data Analysis From The Unix Command Line - Wikibooks, collection of open-content textbooks