
tr command


Introduction
Important but rarely used and known options
TR Set Notation
Examples
Emulating Unix TR command in Perl
Difference Between tr and sed Command

Introduction

The Unix tr command copies standard input to standard output, substituting or deleting selected characters. In addition, it can squeeze repeated characters into a single character (with option -s). This makes tr a great preprocessing tool for the cut command, which in many cases is too primitive to be useful without this functionality.

NOTE: Perl also has a tr function, which can be used instead of the command if you need additional preprocessing flexibility. The semantics are basically the same.

The utility performs classic alphabet1-to-alphabet2 translation, sometimes called 1:1 transliteration, and as such is suitable for implementing the Caesar cipher. Unix inherited tr from Multics as a derivative of the PL/1 translate built-in function, which in turn was a generalization of the TR instruction of the System/360 architecture (see IBM System-360 Green Card).
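As a minimal sketch of the Caesar cipher idea, the function below implements ROT13 (a shift of 13 letters) with two rotated ranges. The function name rot13 is only an illustration, and LC_ALL=C is set on the assumption that ranges should follow plain ASCII order.

```shell
# rot13: Caesar cipher with a shift of 13, applied to standard input.
# The name is illustrative; it is not a standard utility.
rot13() {
    LC_ALL=C tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

printf 'Hello, World\n' | rot13            # prints: Uryyb, Jbeyq
printf 'Hello, World\n' | rot13 | rot13    # prints: Hello, World
```

Because the shift is exactly half the alphabet, applying the function twice restores the original text.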

The format of the tr command is somewhat strange -- this is one of the few Unix commands that accepts input only from standard input.

tr [ options ] [ set1  [ set2 ] ]

For example, to convert a file from the regular "\n" line delimiter to the \0 delimiter expected by options such as xargs -0 and du --files0-from, you can pipe the file into tr:

cat file | tr '\n' '\0' | du --files0-from -
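The same conversion works with xargs. The sketch below assumes an xargs that supports -0 (GNU and BSD versions do; it is not required by POSIX); the helper name show_items is hypothetical.

```shell
# show_items: turn newline-separated input into NUL-separated items and
# hand each item to printf as a single argument, spaces included.
show_items() {
    tr '\n' '\0' | xargs -0 -n 1 printf '<%s>\n'
}

printf 'my file.txt\nnotes.txt\n' | show_items
# prints:
# <my file.txt>
# <notes.txt>
```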

Input characters in the string set1 are mapped to corresponding characters in the string set2. Logically set1 and set2 should have equal length. If this is not the case, no error is generated, but two rules are applied to make them equal:

If set2 is shorter than set1, set2 is extended to the length of set1 by repeating its last character as many times as necessary (this is the GNU behavior; POSIX leaves the result unspecified).
If set2 is longer than set1, the excess characters in set2 are ignored.
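Both rules are easy to check from the command line. The outputs in the comments assume GNU tr; other implementations may treat a short set2 differently.

```shell
# set2 shorter than set1: GNU tr pads set2 with its last character,
# so c, d and e all map to y.
printf 'abcde\n' | tr 'abcde' 'xy'     # prints: xyyyy

# set2 longer than set1: the extra z is never used.
printf 'abcde\n' | tr 'ab' 'xyz'       # prints: xycde
```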

Set1 can also be replaced by its complement with option -c (see below).

Both sets can be specified by

enumeration of characters like in
```
tr '{}' '()' < infile > outfile
```
or using character ranges like in
```
 tr 'A-Z' 'a-z' < infile > outfile
```
or using POSIX character classes. Special POSIX character classes also can be used instead of specifying sets of individual characters or ranges:
- alnum: alphanumeric characters
- alpha: alphabetic characters
- cntrl: control (non-printing) characters
- digit: numeric characters
- graph: graphic characters
- lower: lower-case alphabetic characters
- print: printable characters
- punct: punctuation characters
- space: whitespace characters
- upper: upper-case characters
- xdigit: hexadecimal digits (0-9, A-F, a-f)
  For example, a more robust way to change case from upper to lower, or vice versa, is:
```
cat names | tr '[:upper:]' '[:lower:]' > lc_names
```
Classes can be combined to form a more complex set, for example '[:lower:][:upper:]', and can be mixed with ranges and individual characters.
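As a small illustration of combining classes, the command below keeps only letters, digits and whitespace and deletes everything else. It uses the -c and -d options and runs in the C locale so that the classes cover plain ASCII only.

```shell
# Keep letters, digits and whitespace; delete all other characters.
printf 'Hi, there! (42)\n' | LC_ALL=C tr -cd '[:alnum:][:space:]'
# prints: Hi there 42
```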

Important but rarely used and known options

The tr utility accepts three additional options which substantially increase its power:

-c -- converts the set to the complement of the listed characters, i.e., operations apply to characters not in the given set
-d -- delete characters in the first set from the output
-s -- squeeze repeated characters in the output into a single character. This makes tr a great preprocessing tool for the cut command

Most Unix administrators are not aware of these options, which are quite useful and greatly extend the usability of this generally very simple command.

Here is a fuller description of these options:

-c --complement Complement set1 with respect to the universe of characters whose ASCII codes are 01 through 0377 octal. For example:
To replace every nonprinting character, other than valid control characters, with a ? (question mark), enter:
```
tr -c '[:print:][:cntrl:]' '?' < textfile > newfile
```
Here is a more complex and rather elegant example in which the goal is to create a list of words in a file (option -s means "squeeze repeating symbols", see below):
```
tr -cs '[:lower:][:upper:]' '[\n*]' < text > words
```
This translates each sequence of characters other than lowercase and uppercase letters into a single newline character. The * (asterisk) causes the tr command to repeat the newline character enough times to make the second string as long as the first string.

Extract digits from a string:
```
echo "Abc123d56E" | tr -cd '[:digit:]'
```
Output:
12356
-d, --delete Delete the characters in set1 instead of translating them. An important use of this option is sanitizing input: it can strip shell metacharacters from arguments so that a malicious user cannot smuggle commands into a script. Backticks, all kinds of brackets ( ()[]{} ), colons and semicolons, as well as =#$&!@|<>, should be removed from argument values before you start processing them, if they cannot legitimately occur there. If a script is used by a considerable population, there is always at least one black sheep who will try to mangle input arguments to see what happens ;-)
For example:
```
tr --delete '=;:`"<>,./?!@#$%^&(){}[]'
```
- tr can be used to strip the carriage return at the end of each line, leaving only the newline UNIX expects. tr allows you to specify characters as octal values by preceding the value with a backslash, so the command:
```
tr -d '\015' < pc.file > unix.file
```
  OR
```
tr -d '\r' < pc.file > unix.file
```
  will remove the carriage return from the carriage return/newline pair used by Microsoft OSes as a line terminator. Please note that this can also be done by dos2unix utility.
- To delete all NULL characters from a file:
```
tr -d '\0' < textfile > newfile
```
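For sanitizing arguments, an allow-list is often easier to reason about than a list of characters to delete. The sketch below combines -c and -d to keep only a small safe set; the function name sanitize_arg and the choice of allowed characters are assumptions made for illustration, not a complete security measure.

```shell
# sanitize_arg: keep only letters, digits, dot, underscore and hyphen.
# Both the name and the allowed set are illustrative assumptions.
sanitize_arg() {
    printf '%s' "$1" | LC_ALL=C tr -cd '[:alnum:]._-'
}

sanitize_arg 'report-2024.txt'; echo       # prints: report-2024.txt
sanitize_arg 'x;rm -rf /`id`'; echo        # prints: xrm-rfid
```

The hyphen is placed last in the set so that tr treats it as a literal character rather than as part of a range.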
-s, --squeeze-repeats Replace sequences of the same character with one. -s uses set1 if neither translating nor deleting specified, otherwise squeeze uses set2 and occurs after translation or deletion. For example:
- To replace every sequence of characters in the <space> character class with a single : (colon) character, enter:
```
tr -s '[:space:]' '[\:*]' < in_file 
```
- To replace every sequence of one or more new lines with a single new line:
```
tr -s '\n' < textfile > newfile
```
  OR
```
tr -s '\012' < textfile > newfile
```
- Here is a more complex and rather elegant example in which the goal is to create a list of words in a file:
```
tr -cs '[:lower:][:upper:]' '[\n*]' < text > words
```
  This translates each sequence of characters other than lowercase and uppercase letters into a single newline character. The * (asterisk) causes the tr command to repeat the newline character enough times to make the second string as long as the first string.
- Count how often each word occurs in the file, one word per line, most frequent first:
```
cat infile | tr -cs "[:alnum:]" "\n" | sort | uniq -c | sort -rn
```
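The word-counting pipeline can be wrapped in a small function. This is a sketch under a few assumptions: the name top_words is hypothetical, words are taken to be runs of letters, and case is folded so that "The" and "the" count as the same word.

```shell
# top_words N: print the N most frequent words on standard input
# as "word count" pairs (N defaults to 10).
top_words() {
    tr -cs '[:alpha:]' '\n' |            # one word per line
        tr '[:upper:]' '[:lower:]' |     # fold case
        sort | uniq -c | sort -rn |      # count and rank
        awk 'NF == 2 { print $2, $1 }' | # drop the empty "word", reorder columns
        head -n "${1:-10}"
}

printf 'The cat saw the dog. THE END\n' | top_words 1    # prints: the 3
```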
-t, --truncate-set1 Truncate set1 to the length of set2. By default, when set2 is shorter than set1, GNU tr pads set2 with its last character; with -t the extra characters of set1 are left untranslated instead. It is available only in the GNU implementation of tr.
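The difference is easiest to see side by side. This comparison assumes GNU tr, since -t is a GNU extension.

```shell
printf 'abcde\n' | tr 'abcde' 'xy'       # default padding: prints xyyyy
printf 'abcde\n' | tr -t 'abcde' 'xy'    # set1 truncated to "ab": prints xycde
```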

TR Set Notation

Sets set1 and set2 are specified as strings of characters. Most represent themselves. Interpreted sequences are:

\nnn -- character with octal value nnn

\xnn -- character with hexadecimal value nn (not part of POSIX and not accepted by GNU tr; check your implementation)

\\ -- backslash

\a -- alert

\b -- backspace

\f -- form feed

\n -- new line

\r -- return

\t -- horizontal tab

\v -- vertical tab

\E -- escape (implementation-specific; not accepted by GNU tr)

c1-c2 -- all characters from c1 to c2 in ascending order. The character specified by c1 must collate before the character specified by c2.

[c1-c2] -- same as c1-c2 if both sets use this form

[c*] -- set2 extended to the length of set1 with the symbol c. In other words fills out the set2 with the character specified by c. This option can be used only at the end of the set2. Any characters specified after the * (asterisk) are ignored.

[c*N] -- N copies of symbol c. N is considered a decimal integer unless the first digit is a 0; then it is considered an octal integer.

[:alnum:] -- all letters and digits

[:alpha:] -- all letters

[:blank:] -- all horizontal whitespace

[:cntrl:] -- all control characters

[:digit:] -- all digits

[:graph:] -- all printable characters, not including space

[:lower:] -- all lower case letters

[:print:] -- all printable characters, including space

[:punct:] -- all punctuation characters

[:space:] -- all horizontal or vertical whitespace

[:upper:] -- all upper case letters

[:xdigit:] -- all hexadecimal digits

[=c=] -- all characters in the same equivalence class as the character c.
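The repeat forms [c*] and [c*N] from the list above are the least familiar part of the notation, so here is a brief sketch of both. The sample strings are arbitrary.

```shell
# [#*] fills set2 with '#' up to the length of set1 (six characters here).
printf 'abc-def\n' | tr 'a-f' '[#*]'       # prints: ###-###

# [x*3] contributes exactly three x characters, followed by y and z,
# so set2 is "xxxyz" and lines up with the five characters a-e.
printf 'abcde\n' | tr 'a-e' '[x*3]yz'      # prints: xxxyz
```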

Notes:

Translation occurs if -d is not given and both set1 and set2 appear
-t may be used only when translating.
set2 is extended to the length of set1 by repeating its last character as necessary. Excess characters in set2 are ignored.
Only [:lower:] and [:upper:] are guaranteed to expand in ascending order. They can be used in pairs to specify case conversion.
-s (Squeeze all strings of repeated output characters to single characters) uses set1 if neither translating nor deleting specified, otherwise squeeze uses set2 and occurs after translation or deletion.
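The note on -s is easier to follow with two short cases: one combining -d with -s, and one where squeezing happens after translation. The inputs are made up for the demonstration.

```shell
# -d with -s: delete the characters of set1 ("a"), then squeeze
# repeats of the characters of set2 ("b").
printf 'aabbccbb\n' | tr -ds 'a' 'b'       # prints: bccb

# Translation with -s: commas and semicolons become spaces first,
# then runs of the resulting spaces are squeezed.
printf 'a,,b;;c\n' | tr -s ',;' '  '       # prints: a b c
```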

Examples

(Some examples were adapted from AIX man page)

The --squeeze-repeats (or -s) switch reduces each run of a repeated character listed in the first argument to a single occurrence of that character:
```
$ echo "aaabbbccc" | tr -s 'abc'
abc
```
1. To replace every sequence of one or more new lines with a single new line, enter:
```
tr -s '\n' < textfile > newfile
```
  OR
```
tr -s '\012' < textfile > newfile
```
2. To replace every sequence of characters in the <space> character class with a single # (pound sign) character, enter:
```
tr -s '[:space:]' '#'
```
3. Replace each run of blank characters (spaces and tabs) with a single space:
```
tr -s '[:blank:]' ' ' < input.txt > output.txt
```
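Squeezing blanks is what makes tr useful in front of cut, which splits on every single delimiter. A minimal sketch, with a hypothetical helper name and sample data:

```shell
# second_field: normalize runs of spaces and tabs to one space,
# then let cut pick the second space-separated field.
second_field() {
    tr -s '[:blank:]' '[ *]' | cut -d ' ' -f 2
}

printf 'alice    30\t\tparis\n' | second_field    # prints: 30
```

Without the tr step, cut would see an empty field between every pair of adjacent spaces. Note that leading blanks on a line would still produce an empty first field.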
The --complement (or -c) switch inverts the set of matching characters: characters in the first parameter are left alone, while characters that are not in the first parameter are changed to the indicated character. Note that the trailing newline is translated too.
```
$ printf "%s\n" "The cow jumped over the moon" | tr -c 'aeiou' '?'
??e??o???u??e??o?e????e??oo??
```
1. The following example creates a list of all the words in `file1', one per line, in `file2', where a word is taken to be a maximal string of alphabetic characters. The second string is quoted to protect `\' from the shell. 012 is the octal ASCII code for newline.
```
tr -cs A-Za-z '\012' <file1 >file2
```
  Note: you can use more modern variant:
```
tr -cs "[:alpha:]" "\n" < file1 >file2 
```
  Or:
```
tr -cs '[:lower:][:upper:]' '\n' < file1 > file2
```
2. To replace every nonprinting character, other than valid control characters, with a ? (question mark). Note that the last character of the replacement set is repeated to match the length of the first set:
```
tr -c '[:print:][:cntrl:]' '?' < textfile > newfile
```
3. This example scans a file created in a different locale to find characters that are not printable characters in the current locale and replace them with ? sign
```
tr -c "[:print:]" '?' < myfile.txt
```
To translate curly braces into parentheses, enter:
```
tr '{}' '()' < textfile > newfile
```
This translates each { (left brace) to ( (left parenthesis) and each } (right brace) to ) (right parenthesis). All other characters remain unchanged.
To translate lowercase characters to uppercase, enter:
```
tr 'a-z' 'A-Z' < textfile > newfile
```
Remove all non-printable characters from the file (note that this also deletes newlines, which are not in [:print:]):

tr -cd "[:print:]" < myfile.txt
To delete all NULL characters from a file, enter:
```
tr -d '\0' < textfile > newfile
```
tr supports character equivalence. To translate any e-like characters in a variable named FOREIGN_STRING to a plain e, for example, you use
```
$ printf '%s\n' "$FOREIGN_STRING" | tr "[=e=]" "e"
```
The --truncate-set1 (or -t) ignores any characters in the first parameter that don't have a matching character in the second parameter.
One of the most common uses of tr is to translate MS-DOS text files to Unix text files. DOS text files end each line with a carriage return and a line feed, whereas Linux uses only a line feed. The extra carriage returns need to be deleted.
```
$ tr -d '\r' < dos.txt > linux.txt
```
Text files from classic Mac OS (before Mac OS X) use carriage returns instead of line feeds. tr can take care of that as well by replacing the carriage returns.
```
$ tr '\r' '\n' < apple.txt > linux.txt
```
The other escaped characters recognized by tr are as follows:
- \o -- ASCII octal value o (one to three octal digits)
- \\ -- Backslash
- \a -- Audible beep
- \b -- Backspace
- \f -- Form feed
- \n -- New line
- \r -- Return
- \t -- Horizontal tab
- \v -- Vertical tab
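Before converting line endings, it can help to check whether a file contains carriage returns at all. The sketch below uses the octal escape for carriage return with -c and -d to keep only those bytes and count them; the helper name count_cr is hypothetical.

```shell
# count_cr: report how many carriage returns (octal 015) are on stdin.
count_cr() {
    LC_ALL=C tr -cd '\015' | wc -c | tr -d ' '
}

printf 'one\r\ntwo\r\nthree\n' | count_cr    # prints: 2
```

The final tr only strips the padding that some wc implementations put in front of the number.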

Emulating Unix TR command in Perl

Like many other simple Unix utilities, tr can be emulated using Perl. See Perl tr function for details. The Perl tr function has the same semantics and supports the same options as the tr command.

For example, here is how to squeeze sequences of spaces and tabs into a single space:

perl -pe 'tr/ \t/ /s'

Here the invocation of the function is "tr/ \t/ /s". As you can see, Perl requires that the sets be delimited with some character (slash "/" in this example), and options are specified at the end. Other than that, this is a direct analog of

tr -s ' \t' ' '

Additional examples are available from Perl One Liners

Difference Between tr and sed Command

Difference Between tr and sed Command Linux.com

10 Jul 13
In Linux everything is a file, and sometimes we need to edit files to make changes. Editors such as vim, vi and nano allow us to open a file, find a particular word and replace it. If we want to modify a large file without opening it, there are also command line utilities such as echo, sed and tr. Sometimes the same modification can be done with either sed or tr, as in the example below. Where either can be used, tr is preferable because it is faster. Of course, in many practical cases the speed difference is too small to notice.

Suppose we have the string "This+is+test+for+tr+and+sed" and we want to replace each '+' with a space ' '. This type of replacement can be done with both tr and sed, as below:
[user@test ~]$ echo This+is+test+for+tr+and+sed |tr '+' ' '

This is test for tr and sed
[user@test ~]$ echo This+is+test+for+tr+and+sed |sed 's/+/ /g'

This is test for tr and sed
Both sed and tr can be used for basic text transformations, but there are differences in how they are used.

Difference between tr and sed

The tr command translates, squeezes, or deletes characters from standard input, writing to standard output. sed, on the other hand, is a stream editor used to perform text transformations on an input stream.

tr performs character-based transformation, while sed performs string-based transformation.

For example
[user@test ~]$ echo I am a good boy | tr 'good' 'test'

I am a tsst bsy
tr has done a character-based transformation: the sets 'good' and 'test' define the mappings g=t, o=e, o=s, d=t. Because o appears twice in the first set, the first rule (o=e) is overridden by the later one (o=s), which gives the output above.
[user@test ~]$ echo I am a good boy | sed 's/good/best/g'

I am a best boy
sed, by contrast, performs a string-based transformation: every occurrence of the string 'good' is replaced with 'best'.

In other cases the tr command is more useful and easier. Suppose that we have entered braces '{}' by mistake instead of parentheses '()' in the file test.txt; we can translate the braces into parentheses with tr.
[user@test ~]$ tr '{}' '()' < test.txt > newtest.txt
It replaces each '{' with '(' and each '}' with ')' in test.txt and saves the output in newtest.txt.
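For completeness, sed does have a character-for-character mode of its own: the y command, which is defined by POSIX and behaves like a basic tr without ranges or classes. A short comparison, using a made-up input string:

```shell
printf 'f{x}\n' | tr '{}' '()'        # prints: f(x)
printf 'f{x}\n' | sed 'y/{}/()/'      # prints: f(x)
```

Unlike tr, the y command requires both sets to have the same length.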


NEWS CONTENTS

20200305 : Converting between uppercase and lowercase on the Linux command line ( Mar 05, 2020 , www.networkworld.com )
20181030 : 10 tr Command Examples in Linux ( Oct 30, 2018 , www.tecmint.com )
20171022 : Unix text editing - sed, tr, cut, od ( Oct 22, 2017 , seismo.berkeley.edu )
20171022 : How to replace ® ™ with Spaces in UNIX ( Unix Linux Forums, Shell Programming and Scripting )
20171022 : Non Printable & Special Characters: Problems and how to overcome them ( lexjansen.com )
20171022 : Delete non printable characters from file ( Unix Linux Forums, UNIX for Dummies Questions & Answers )
20141005 : tr (Unix) ( Wikipedia )
20140619 : sed - unix tr find and replace ( Stack Overflow )
20130809 : Remove non-printable ASCII characters from a file with this simple Unix command by Alvin Alexander ( Aug 9, 2013 )
20130809 : tr page from Section 1 of the unix 8th manual
20110729 : tr Command ( Jul 29, 2011 )
20090316 : Concatenate only digits from string - bash tr ( Mar 3, 2009 , UNIX BASH scripting )
20080912 : Understanding Linux - UNIX tr command ( Sep 12, 2008 )
20080222 : Text processing with UNIX by Chris Herborth ( Aug 01, 2006 , developerWorks )
20080222 : The GAWK Manual - Sample Program ( Feb 22, 2008 )
20080221 : Understanding Linux - UNIX tr command ( Feb 21, 2008 )
20080221 : Commands Reference, Volume 5 - tr Command ( Feb 21, 2008 )
20080221 : The tr command ( Feb 21, 2008 )
20080221 : TR 95-10 A Taxonomy of Unix System and Network Vulnerabilities by Bishop
20070714 : ONLamp.com -- Sanitizing Mail on Panther Server ( Jul 14, 2007 )
20070714 : Learn how to remove extended ASCII characters from Unix files ( Jul 14, 2007 )
20070210 : Commonly Used Unix-like Commands ( Feb 10, 2007 )
20051121 : Sobell on the Bourne Again Shell and the Linux Command Line - The Utility Known as tr ( LinuxPlanet )
20051121 : A Little Devil Called tr ( Linux Journal )
20051121 : Dogs of the Linux Shell ( Linux Journal )
20051121 : Linux and UNIX tr command help

Old News ;-)

[Mar 05, 2020] Converting between uppercase and lowercase on the Linux command line

Mar 05, 2020 | www.networkworld.com

https://www.networkworld.com/article/3529409/converting-between-uppercase-and-lowercase-on-the-linux-command-line.html
There are many ways to change text on the Linux command line from lowercase to uppercase and vice versa. In fact, you have an impressive set of commands to choose from. This post examines some of the best commands for the job and how you can get them to do just what you want.
Using tr
The tr (translate) command is one of the easiest to use on the command line or within a script. If you have a string that you want to be sure is in uppercase, you just pass it through a tr command like this:
$ echo Hello There | tr '[:lower:]' '[:upper:]'
HELLO THERE
Below is an example of using this kind of command in a script when you want to be sure that all of the text that is added to a file is in uppercase for consistency:
#!/bin/bash

echo -n "Enter department name: "
read dept
echo "$dept" | tr '[:lower:]' '[:upper:]' >> depts
Switching the order to [:upper:] [:lower:] would have the opposite effect, putting all the department names in lowercase:

... ... ...

[Oct 30, 2018] 10 tr Command Examples in Linux

Oct 30, 2018 | www.tecmint.com

8. Here is an example of breaking a single line of words (sentence) into multiple lines, where each word appears in a separate line.
$ echo "My UID is $UID"

My UID is 1000

$ echo "My UID is $UID" | tr " "  "\n"

My 
UID 
is 
1000
9. Related to the previous example, you can also translate multiple lines of words into a single sentence as shown.
$ cat uid.txt

My 
UID 
is 
1000

$ tr "\n" " " < uid.txt

My UID is 1000
10. It is also possible to translate just a single character, for instance a space into a " : " character, as follows.
$ echo "Tecmint.com =>Linux-HowTos,Guides,Tutorials" | tr " " ":"

Tecmint.com:=>Linux-HowTos,Guides,Tutorials
There are several sequence characters you can use with tr , for more information, see the tr man page.

... ... ...

[Oct 22, 2017] Unix text editing - sed, tr, cut, od

Oct 22, 2017 | seismo.berkeley.edu

A tr script to remove all non-printing characters from a file is below. Non-printing characters may be invisible, but they cause problems when printing the file or sending it via electronic mail. You run it from the Unix command prompt, everything on one line:
> tr -d '\001'-'\011''\013''\014''\016'-'\037''\200'-'\377' < filein > fileout
This tr script deletes all characters with octal values from 001 to 011, characters 013 and 014, characters from 016 to 037, and characters from 200 to 377. The other characters, which are printable, are copied over from filein to fileout. Please remember that you cannot fold a line containing the tr command; everything must be on one line, however long it is. In practice, this script solves some mysterious Unix printing problems.
Type in a text file named "f127.TR" with the line starting with tr above. Print the file on screen with the cat f127.TR command, replace "filein" and "fileout" with your file names (they must not be the same file), then copy and paste the line and run (execute) it. Please remember that this does not solve the Unix end-of-file problem, that is, the character '\000', also known as a 'null', in the file. Nor does it handle the binary file problem, that is, a file starting with two zeroes, '\060' and '\060'.

Sometimes invisible characters cause havoc. This tr command line converts tab characters into hashes (#) and form feed characters into stars (*).
> tr '\011\014' '#*'  < filein > fileout
The numeric value of tabulate is 9, hex 09, octal 011 and in C-notation it is \t or \011. Formfeed is 12, hex 0C, octal 014 and in C-notation it is \f or \014. Please note, tr replaces character from the first (leftmost) group with corresponding character in the second group. Characters in octal format, like \014 are counted as one character each.
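The effect is easy to verify on a short test string. In the sketch below, printf generates a tab and a form feed, and the octal escapes in set1 make them visible.

```shell
# Tab (octal 011) becomes '#', form feed (octal 014) becomes '*'.
printf 'a\tb\fc\n' | tr '\011\014' '#*'    # prints: a#b*c
```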

How to replace ® ™ with Spaces in UNIX Unix Linux Forums Shell Programming and Scripting

sed -e 's/\"®\"/ /g' -e 's/\"™\"/ /g' < file

Non Printable & Special Characters: Problems and how to overcome them

lexjansen.com

Non printable & special characters in clinical trial data create potential problems in producing quality deliverables. There could be major issues such as incorrect statistics / counts in the deliverables, or minor ones such as incorrect line breaks, page breaks or the appearance of strange symbols in the reports. Identifying and deleting these characters can pose challenges. When faced with this issue in the Pharmaceutical & Biotech industry, it is imperative to clean them up. We need to understand the underlying cause and use various techniques to identify and handle them.

Delete non printable characters from file Unix Linux Forums UNIX for Dummies Questions & Answers

Really it is better to use `strings` utility for this particular purpose...

strings filename

This cute one-liner also removes line breaks, which is probably not what you want unless you have a pretty printer ready. You should probably use tr -dc '[:print:]\n' < oldfile > newfile instead.

tr -dc '[:print:]' < oldfile > newfile
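The difference between the two variants shows up on any multi-line input. A minimal sketch of the newline-preserving form, wrapped in a hypothetical helper and run in the C locale so that only ASCII printables are kept:

```shell
# strip_nonprint: delete everything except printable characters and newlines.
strip_nonprint() {
    LC_ALL=C tr -dc '[:print:]\n'
}

printf 'ok\001\002 line\nnext\n' | strip_nonprint
# prints:
# ok line
# next
```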

[Oct 05, 2014] tr (Unix)

Wikipedia

Most versions of tr, including GNU tr and classic Unix tr, operate on single-byte characters and are not Unicode compliant. An exception is the Heirloom Toolchest implementation, which provides basic Unicode support.

Ruby and Perl also have an internal tr operator, which operates analogously. Tcl's string map command is more general in that it maps strings to strings while tr maps characters to characters.

[Jun 19, 2014] sed - unix tr find and replace

Stack Overflow

This is the command I'm using on a standard web page I wget from online.
tr '<' '\n<' < index.html
however it giving me newlines, but not adding the back carrot in again. e.g.
 echo "<hello><world>" | tr '<' '\n<'
returns
 (blank line which is fine)
 hello>
 world>
instead of
 (blank line or not)
 <hello>
 <world>
Thanks

That's because tr only does character-for-character substitution (or deletion).
Try sed instead.
echo '<hello><world>' | sed -e 's/</\n&/g'
Or awk.
echo '<hello><world>' | awk '{gsub(/</,"\n<",$0)}1'
Or perl.
echo '<hello><world>' | perl -pe 's/</\n</g'
Or ruby.
echo '<hello><world>' | ruby -pe '$_.gsub!(/</,"\n<")'
Or python.
echo '<hello><world>' \
| python3 -c 'import sys; sys.stdout.write(sys.stdin.read().replace("<", "\n<"))'
===
I tried that but I get n<hello>n<world>. I don't know what the sed newline character is – Kamran224 Dec 1 '11 at 23:26
====

@Kamran224 This works for me but try: echo -e '<hello><world>' | sed -e 's/</\n&/g' – user649198 Dec 1 '11 at 23:29
====

@Kamran224 \n is a GNU sed extension. What system are you on? – ephemient Dec 1 '11 at 23:36
====

@ephemient SunOS (afs system on my campus) – Kamran224 Dec 1 '11 at 23:43
====

@Jaypal A string of 8 spaces does not equal a tab; you need a literal tab character. The 8-space thing is about tab stops, not tabs. – Michael J. Barber Dec 4 '11 at 7:27
====

Does this work for you?
awk -F"><" -v OFS=">\n<" '{print $1,$2}'

[jaypal:~/Temp] echo "<hello><world>" | awk -F"><" -v OFS=">\n<" '{$1=$1}1';
<hello>
<world>
You can put a regex / / (lines you want this to happen for) in front of the awk {} action.

====

'{$1=$1}1' is shorter and will work if there is more than >< on a line. – ephemient Dec 2 '11 at 0:10
====

Thanks @ephemient I agree, Have updated my answer. – jaypal Dec 2 '11 at 0:16
====

This would replace fewer of the < characters than in the question. – Michael J. Barber Dec 4 '11 at 7:29
====

If you have GNU grep, this may work for you:
grep -Po '<.*?>[^<]*' index.html
which should pass through all of the HTML, but each tag should start at the beginning of the line with possible non-tag text following on the same line.

If you want nothing but tags:
grep -Po '<.*?>' index.html
You should know, however, that it's not a good idea to parse HTML with regexes.

[Aug 9, 2013] Remove non-printable ASCII characters from a file with this simple Unix command By Alvin Alexander

Aug 9, 2013

For a variety of reasons you can end up with text files on your Unix filesystem that have binary characters in them. In fact, I showed you how to do this to yourself in my blog post about the Unix script command. (There's nothing wrong with this approach; it's just a by-product of using the script command.)

To get the binary characters out of your files, there are several approaches you can take. Probably the easiest solution involves using the Unix tr command. Here's all you have to do to remove non-printable binary characters (garbage) from a Unix text file:
tr -cd '\11\12\15\40-\176' < file-with-binary-chars > clean-file
This command uses the -c and -d arguments to the tr command to remove all the characters from the input stream other than the ASCII octal values that are shown between the single quotes. This command specifically allows the following characters to pass through this Unix filter:
octal 11: tab
octal 12: linefeed
octal 15: carriage return
octal 40 through octal 176: all the "good" keyboard characters 
All the other binary characters -- the "garbage" characters in your file -- are stripped out during this translation process.
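Before rewriting a file, it may be useful to know how many bytes would be stripped. One way, sketched here with a hypothetical helper name, is to invert the operation: delete the "good" characters and count what is left.

```shell
# count_garbage: count the bytes on stdin that fall outside tab, linefeed,
# carriage return and the printable ASCII range (octal 40 to 176).
count_garbage() {
    LC_ALL=C tr -d '\11\12\15\40-\176' | wc -c | tr -d ' '
}

printf 'ab\001\200c\n' | count_garbage    # prints: 2
```

A result of 0 means the cleaning command would leave the file unchanged.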

tr page from Section 1 of the unix 8th manual

The following example creates a list of all the words in `file1' one per line in `file2', where a word is taken to be a maximal string of alphabetics. The second string is quoted to protect `\' from the Shell. 012 is the ASCII code for newline.
tr -cs A-Za-z '\012' <file1 >file2

[Jul 29, 2011] tr Command

Examples
To translate braces into parentheses, type:
tr '{}' '()' < textfile > newfile
To translate braces into brackets type:
tr '{}' '\[]' < textfile > newfile
This translates each { (left brace) to [ (left bracket) and each } (right brace) to ] (right bracket). The left bracket must be entered with a \ (backslash) escape character.
To translate lowercase characters to uppercase, type:
tr 'a-z' 'A-Z' < textfile > newfile
To create a list of words in a file, type:
tr -cs '[:lower:][:upper:]' '[\n*]' < textfile > newfile
This translates each sequence of characters other than lowercase and uppercase letters into a single newline character. The * (asterisk) causes the tr command to repeat the newline character enough times to make the second string as long as the first string.
To delete all NULL characters from a file, type:
tr -d '\0' < textfile > newfile
To replace every sequence of one or more new lines with a single new line, type:
tr -s '\n' < textfile > newfile
OR
tr -s '\012' < textfile > newfile
To replace every nonprinting character, other than valid control characters, with a ? (question mark), type:
tr -c '[:print:][:cntrl:]' '[?*]' < textfile > newfile
This scans a file created in a different locale to find characters that are not printable characters in the current locale.
To replace every sequence of characters in the <space> character class with a single # character, type:
tr -s '[:space:]' '[#*]'

[Mar 16, 2009] Concatenate only digits from string - bash tr

Mar 3, 2009 | UNIX BASH scripting

Just to introduce a good use of Linux tr command; if you need to concatenate the digits from a string, here is a way:
$ echo "Abc123d56E" | tr -cd '[:digit:]'
Output: 12356
From tr man pages:

tr [OPTION]... SET1 [SET2]
-c, -C, --complement: first complement SET1
-d, --delete : delete characters in SET1, do not translate

Similarly:
$ echo "Abc123d56E" | tr -d '[:digit:]'
Output: AbcdE
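In a script the extracted digits usually end up in a variable. A small sketch, with a hypothetical function name; note that tr sets are not regular-expression bracket expressions, so '[:digit:]' alone is sufficient, whereas '[[:digit:]]' would also put the literal characters [ and ] into the set.

```shell
# digits_of: print only the digits contained in the first argument.
digits_of() {
    printf '%s' "$1" | tr -cd '[:digit:]'
}

id=$(digits_of 'Abc123d56E')
echo "$id"    # prints: 12356
```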

[Sep 12, 2008] Understanding Linux - UNIX tr command

A clever example of how to use tr to convert text into one word per line. Too simplistic; should be at least [:alnum:]

Create a list of the words in /path/to/file, one per line, enter:
$ tr -cs "[:alpha:]" "\n" < /path/to/file
Where,

-c : Complement the set of characters in string1

-s : Replace each input sequence of a repeated character that is listed in SET1 with a single occurrence of that character

[Feb 22, 2008] Text processing with UNIX by Chris Herborth

Aug 01, 2006 | developerWorks

Translating text

Now that you know at least five different ways of generating some text, let's look at doing some simple translations on it.

The tr command lets you translate characters in one set to the corresponding characters in a second set. Let's take a look at a few examples (Listing 4) to see how it works.
Listing 4. Using tr to translate characters
echo "a test" | tr t p
echo "a test" | tr aest 1234
echo "a test" | tr -d t
echo "a test" | tr '[:lower:]' '[:upper:]'
Looking at the output of these commands (see Listing 5) gives you a clue about how tr works (here's a hint: it's a direct replacement of characters in the first set with the corresponding characters from the second set).
Listing 5. What has tr done?
chrish@dhcp3 [199]$ echo "a test" | tr t p
a pesp

chrish@dhcp3 [200]$ echo "a test" | tr aest 1234
1 4234

chrish@dhcp3 [201]$ echo "a test" | tr -d t
a es

chrish@dhcp3 [202]$ echo "a test" | tr '[:lower:]' '[:upper:]'
A TEST
The first and second examples are simple enough, replacing one character for another. The third example, with the -d option (delete), removes the specified characters completely from the output. This is often used to remove carriage returns from DOS text files to turn them into UNIX text files. Finally, the last example uses character classes (those names inside of [: :]) to convert all lower-case letters into upper-case letters. Portable Operating System Interface-standard (POSIX-standard) character classes include:

alnum: alphanumeric characters

alpha: alphabetic characters

cntrl: control (non-printing) characters

digit: numeric characters

graph: graphic characters

lower: lower-case alphabetic characters

print: printable characters

punct: punctuation characters

space: whitespace characters

upper: upper-case characters

xdigit: hexadecimal characters

Listing 6. Converting DOS text files into UNIX text files
tr -d '\r' < input_dos_file.txt > output_unix_file.txt
Although the tr command respects C locale environment variables (try man locale for more information about these), don't expect it to do anything sensible with UTF-8 documents, such as being able to replace lower-case accented characters with appropriate upper-case characters. The tr command works best with ASCII and the other standard C locales.

[Feb 22, 2008] The GAWK Manual - Sample Program

The following example is a complete awk program, which prints the number of occurrences of each word in its input. It illustrates the associative nature of awk arrays by using strings as subscripts. It also demonstrates the `for x in array' construction. Finally, it shows how awk can be used in conjunction with other utility programs to do a useful task of some complexity with a minimum of effort. Some explanations follow the program listing.
awk '
# Print list of word frequencies
{
    for (i = 1; i <= NF; i++)
        freq[$i]++
}

END {
    for (word in freq)
        printf "%s\t%d\n", word, freq[word]
}'
The first thing to notice about this program is that it has two rules. The first rule, because it has an empty pattern, is executed on every line of the input. It uses awk's field-accessing mechanism (see section Examining Fields) to pick out the individual words from the line, and the built-in variable NF (see section Built-in Variables) to know how many fields are available.

For each input word, an element of the array freq is incremented to reflect that the word has been seen an additional time.

The second rule, because it has the pattern END, is not executed until the input has been exhausted. It prints out the contents of the freq table that has been built up inside the first action.

Note that this program has several problems that would prevent it from being useful by itself on real text files:

Words are detected using the awk convention that fields are separated by whitespace and that other characters in the input (except newlines) don't have any special meaning to awk. This means that punctuation characters count as part of words.

The awk language considers upper and lower case characters to be distinct. Therefore, `foo' and `Foo' are not treated by this program as the same word. This is undesirable since in normal text, words are capitalized if they begin sentences, and a frequency analyzer should not be sensitive to that.

The output does not come out in any useful order. You're more likely to be interested in which words occur most frequently, or having an alphabetized table of how frequently each word occurs.

The way to solve these problems is to use other system utilities to process the input and output of the awk script. Suppose the script shown above is saved in the file `frequency.awk'. Then the shell command:
tr A-Z a-z < file1 | tr -cd 'a-z \012' \
  | awk -f frequency.awk \
  | sort -k2,2nr
produces a table of the words appearing in `file1' in order of decreasing frequency.

The first tr command in this pipeline translates all the upper case characters in `file1' to lower case. The second tr command deletes all the characters in the input except lower case characters, spaces and newlines; the space has to be kept, because otherwise the words on each line would run together into a single field. The argument to the second tr is quoted to protect the space and the backslash in it from being interpreted by the shell. The awk program reads this suitably massaged data and produces a word frequency table, which is not ordered.

The awk script's output is now sorted by the sort command and printed on the terminal. The options given to sort in this example specify to sort by the second field of each input line (-k2,2; older versions of sort wrote this as +1, skipping one field), that the sort keys should be treated as numeric quantities (otherwise `15' would come before `5'), and that the sorting should be done in descending (reverse) order.

See the general operating system documentation for more information on how to use the tr and sort commands.
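
The pipeline described above can be tried without any files. This is only a sketch under stated assumptions: the function name word_freq is hypothetical, the awk program is inlined instead of being read from frequency.awk, and a second sort key is added so that words with equal counts come out in a fixed order.

```shell
#!/bin/sh
# word_freq: hypothetical wrapper around the tr | tr | awk | sort pipeline.
word_freq() {
    tr 'A-Z' 'a-z' |
        tr -cd 'a-z \n' |
        awk '{ for (i = 1; i <= NF; i++) freq[$i]++ }
             END { for (word in freq) printf "%s\t%d\n", word, freq[word] }' |
        sort -k2,2nr -k1,1
}

# Prints: the 3, dog 2, then cat, ran, saw with 1 each (tab separated).
printf 'The cat saw the dog.\nThe dog ran!\n' | word_freq
```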

[Feb 21, 2008] Understanding Linux - UNIX tr command

Shell scripting example
In the following example the script asks for confirmation before deleting the file. If the user responds in lower case, the tr command changes nothing, but if the user responds in upper case, the characters are changed to lower case. This ensures that even if the user responds with YES, YeS, YEs etc., the script removes the file:
#!/bin/bash
echo -n "Enter file name : "
read myfile
echo -n "Are you sure ( yes or no ) ? "
read confirmation
# Fold the answer to lower case so YES, Yes and yes all match
confirmation="$(echo "${confirmation}" | tr 'A-Z' 'a-z')"
if [ "$confirmation" == "yes" ]; then
    [ -f "$myfile" ] && /bin/rm "$myfile" || echo "Error - file $myfile not found"
else
    : # do nothing
fi
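Remove all non-printable characters from myfile.txt (newline is not in [:print:], so line breaks are removed as well; add \n to the set to keep them)
$ tr -cd "[:print:]" < myfile.txt
Replace every run of two or more successive spaces with a single space in a copy of the text in a file called input.txt and save the output to a new file called output.txt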
tr -s ' ' ' ' < input.txt > output.txt
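The -d option deletes every instance of each character listed in set1. Note that set1 is a set of characters, not a string to be matched as a whole. For example, the following removes every n, a, m, e, s, r and v from a copy of the text in a file called /etc/resolv.conf and writes the output to a file called ns.ipaddress.txt. This does strip the word nameserver, but it also strips the same letters from every other word in the file (for instance on search or domain lines), so use sed when a whole word has to be removed: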
tr -d 'nameserver' < /etc/resolv.conf > ns.ipaddress.txt
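
A small comparison makes the difference between a character set and a word visible. This is a sketch only: the sample text and its addresses are invented, and the sed expression is one possible way to drop the leading keyword, not the original author's method.

```shell
#!/bin/sh
# Sample resolv.conf-style text (made-up values, for illustration only).
sample='search example.com
nameserver 192.0.2.1
nameserver 192.0.2.2'

# tr -d treats its argument as a set: every a, e, m, n, r, s, v is removed,
# so the search line is mangled as well.
by_tr=$(printf '%s\n' "$sample" | tr -d 'nameserver')

# sed removes only the word itself (and the blanks after it).
by_sed=$(printf '%s\n' "$sample" | sed 's/^nameserver[[:space:]]*//')

printf 'tr -d:\n%s\n\nsed:\n%s\n' "$by_tr" "$by_sed"
```

With tr the first line comes out as "ch xpl.co", while sed leaves "search example.com" untouched.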

[Feb 21, 2008] Commands Reference, Volume 5 - tr Command

From AIX man pages

Examples
To translate braces into parentheses, enter:
tr '{}' '()' < textfile > newfile
This translates each { (left brace) to ( (left parenthesis) and each } (right brace) to ) (right parenthesis). All other characters remain unchanged.
To translate braces into brackets, enter:
tr '{}' '\[]' < textfile > newfile
This translates each { (left brace) to [ (left bracket) and each } (right brace) to ] (right bracket). The left bracket must be entered with a \ (backslash) escape character.
To translate lowercase characters to uppercase, enter:
tr 'a-z' 'A-Z' < textfile > newfile
To create a list of words in a file, enter:
tr -cs '[:lower:][:upper:]' '[\n*]' < textfile > newfile
This translates each sequence of characters other than lowercase and uppercase letters into a single newline character. The * (asterisk) causes the tr command to repeat the new line character enough times to make the second string as long as the first string.
To delete all NULL characters from a file, enter:
tr -d '\0' < textfile > newfile
To replace every sequence of one or more new lines with a single new line, enter:
tr -s '\n' < textfile > newfile
OR
tr -s '\012' < textfile > newfile
To replace every nonprinting character, other than valid control characters, with a ? (question mark), enter:
tr -c '[:print:][:cntrl:]' '[?*]' < textfile > newfile
This scans a file created in a different locale to find characters that are not printable characters in the current locale.
To replace every sequence of characters in the <space> character class with a single # (pound sign) character, enter:
tr -s '[:space:]' '[#*]'

[Feb 21, 2008] The tr command

Cat-ting our file (columns.txt) and then piping the output of the cat command to the input of the translate command causes all lowercase names to be translated to uppercase names.
cat columns.txt | tr '[a-z]' '[A-Z]'
Remember we have not modified the file columns.txt so how do we save the output? Simple, by redirecting the output of the translate command with '>' to a file called UpCaseColumns.txt with:
cat columns.txt | tr '[a-z]' '[A-Z]' > UpCaseColumns.txt            
Since the tr command does not take a filename as sed did, we could have changed the above example to:
tr '[a-z]' '[A-Z]' < columns.txt > UpCaseColumns.txt
As you can see, the input to the translate command still arrives on stdin, but the shell now connects stdin directly to columns.txt instead of to a pipe from cat. So either way we do it, we can achieve what we've set out to do, using tr as part of a pipeline or redirecting its stdin from a file ('<').

TR 95-10 A Taxonomy of Unix System and Network Vulnerabilities by Bishop

[PDF]

[Jul 14, 2007] ONLamp.com -- `Sanitizing` Mail on Panther Server

[Jul 14, 2007] Learn how to remove extended ASCII characters from Unix files

In the shell program we use to remove all non-printable ASCII characters from a text file, we tell the tr command to delete every character in the translation process except for the specific characters we specify. In essence, we filter out the undesirable characters. The tr command we use in our program is shown below:

tr -cd '\11\12\40-\176' < $INPUT_FILE > $OUTPUT_FILE

In this command, the variable INPUT_FILE must contain the name of the Solaris file you'll be reading from, and OUTPUT_FILE must contain the name of the output file you'll be writing to. When the -c and -d options of the tr command are used in combination like this, the only characters tr writes to the standard output stream are the characters we've specified on the command line.

Although it may not look very attractive, we're using octal characters in our tr command to make our programming job easier and more efficient. Our command tells tr to retain only the octal characters 11, 12, and 40 through 176 when writing to standard output. Octal character 11 corresponds to the [TAB] character, and octal 12 corresponds to the [LINEFEED] character. The octal characters 40 through 176 correspond to the standard visible keyboard characters, beginning with the [Space] character (octal 40) through the ~ character (octal 176). These are the only characters retained by tr -- the rest are filtered out, leaving us with a clean ASCII file.
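
The filter can be checked on a few hand-made bytes. In this sketch the function name clean_ascii and the test input are assumptions chosen for illustration; the tr arguments are the ones quoted above.

```shell
#!/bin/sh
# clean_ascii: hypothetical name for the filter described above.
# Keeps TAB (\11), LF (\12) and the printable ASCII range \40-\176.
clean_ascii() {
    tr -cd '\11\12\40-\176'
}

# The input contains a BEL (\007), a high byte (\351) and a CR (\r);
# all three are dropped, the TAB and the final newline survive.
printf 'ab\007c\tcaf\351\r\n' | clean_ascii
```

Because carriage return (octal 15) is not in the retained set, the same command also strips DOS line endings.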

[Feb 10, 2007] Commonly Used Unix-like Commands

Example 1: Change lowercase to uppercase in a file:

D:\temp>more score.txt
john 81 91
mark 82 93
tina 88 92

D:\temp>tr '[a-z]' '[A-Z]' < score.txt > score1.txt

D:\temp>more score1.txt
JOHN 81 91
MARK 82 93
TINA 88 92

[Nov 21, 2005] Sobell on the Bourne Again Shell and the Linux Command Line - The Utility Known as tr

LinuxPlanet

LP: Would you talk a little more about the tr utility?

Ah, tr. Well, first thing that comes to mind is that it is the answer to the trivia question, "Name a Linux utility that accepts input only from standard input and never from a file named as an argument on the command line." It is an odd beast that is useful only sometimes--but when it is useful it is very useful. Here is an excerpt that talks about tr:
"The tr utility reads standard input and, for each input character, maps it to an alternate character, deletes the character, or leaves the character alone. This utility reads from standard input and writes to standard output.

"The tr utility is typically used with two arguments, string1 and string2. The position of each character in the two strings is important: Each time tr finds a character from string1 in its input, it replaces that character with the corresponding character from string2.

"With one argument, string1, and the --delete option, tr deletes the characters specified in string1. The option --squeeze-repeats replaces multiple sequential occurrences of characters in string1 with single occurrences (for example, abbc becomes abc).

"You can use a hyphen to represent a range of characters in string1 or string2. The two command lines in the following example produce the same result:
$ echo abcdef | tr  'abcdef' 'xyzabc'
xyzabc
$ echo abcdef | tr  'a-f' 'x-za-c'
xyzabc
"The next example demonstrates a popular method for disguising text, often called ROT13 (rotate 13) because it shifts every letter 13 places: it replaces the first letter of the alphabet with the fourteenth, the second with the fifteenth, and so forth.
$ echo The punchline of the joke is ... |
> tr 'A-M N-Z a-m n-z' 'N-Z A-M n-z a-m'
Gur chapuyvar bs gur wbxr vf ...
"To make the text intelligible again, reverse the order of the arguments to tr:
$ echo Gur chapuyvar bs gur wbxr vf ... |
> tr 'N-Z A-M n-z a-m' 'A-M N-Z a-m n-z'
The punchline of the joke is ...
"The --delete option causes tr to delete selected characters:
$ echo If you can read this, you can spot the missing vowels! |
> tr --delete 'aeiou'
If y cn rd ths, y cn spt th mssng vwls!
"In the following example, tr replaces characters and reduces pairs of identical characters to single characters:
$ echo tennessee | tr --squeeze-repeats 'tnse' 'srne'
serene
"The next example replaces each sequence of nonalphabetic characters (the complement of all the alphabetic characters as specified by the character class alpha) in the file draft1 with a single NEWLINE character. The output is a list of words, one per line.
$ tr --complement --squeeze-repeats '[:alpha:]' '\n' < draft1
"The final example uses character classes to upshift the string hi there:
$ echo hi there | tr '[:lower:]' '[:upper:]'
HI THERE

A Little Devil Called tr

Linux Journal

Luckily, we can also use ranges of characters to specify the characters more efficiently:
tr a-z A-Z
Ever had those horrible upper case DOS file names? Here's a Bourne script to take care of them:
for f in *; do mv "$f" `echo "$f" | tr A-Z a-z`; done
Many UNIX editors allow some text to be processed by the shell. For example, to replace all upper case characters of the next paragraph with lower case while in vi, type:
!}tr A-Z a-z
As another example, the command:
!jtr a-z A-Z
capitalizes the current and next line (the character after the ! is a movement character).
If you read the International Obfuscated C Code Contest (ftp://ftp.uu.net./pub/ioccc/), you frequently see that part of the hints are coded by a method called rot13. rot13 is a Caesar cypher, i.e., a cypher in which all letters are shifted some number of places. With a shift of one, for example, a becomes b, b becomes c, ..., y becomes z, and z becomes a. In rot13 each letter is shifted 13 places. It is a weak cypher, and because the alphabet has 26 letters, applying rot13 a second time deciphers it. You can also use tr to read the text in this way:
tr a-zA-Z n-za-mN-ZA-M 
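
A short round trip shows that the same command both encodes and decodes. The wrapper name rot13 is only an illustrative choice, and the sample sentence is borrowed from the Sobell excerpt earlier on this page.

```shell
#!/bin/sh
# rot13: hypothetical helper; applying it twice restores the input.
rot13() {
    tr 'a-zA-Z' 'n-za-mN-ZA-M'
}

secret=$(printf 'The punchline of the joke is ...\n' | rot13)
printf '%s\n' "$secret"            # encoded text
printf '%s\n' "$secret" | rot13    # decoded again
```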
Another interesting way to use tr is to change files from Macintosh format to UNIX format. For returns, the Macintosh uses \r while UNIX uses \n. GNU tr allows you to use the C special characters, so type (the quotes keep the shell from removing the backslashes):
tr '\r' '\n'

If you don't have GNU's version of tr, you can always use the corresponding octal numbers as shown here:
tr '\015' '\012'
You might wonder what would happen if the second string is shorter than the first string. POSIX leaves the result unspecified. System V says that only that portion of the first string is used that has a matching character in the second string. BSD and GNU pad the second string with its final character in order to match the length of the first string. The reason this last method is handy becomes clearer when we take complements into account. Assume you wish to make a list of all identifiers and keywords in your C listing. When you use -c, tr complements the first string. In C, all identifiers and keywords consist of a-zA-Z0-9_, so those are the characters we want to keep. Thus, we can do the following:
tr -c a-zA-Z0-9_ '\n'
If we pipe the tr output through sort -u, we get our desired list. If we follow POSIX, the second string would have to describe 193 newline characters (written as '[\n*193]' or '[\n*]'), because the complement of the 63 characters in a-zA-Z0-9_ contains 256 - 63 = 193 byte values. If we use System V, only the zero byte is translated to a newline, since the complement of a-zA-Z0-9_ starts with the zero byte.
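
The identifier list can be sketched in a way that does not depend on the padding rules discussed above. The function name identifiers and the one-line C fragment are made up for the example, -s is added so that runs of separators do not produce empty lines, and LC_ALL=C is set only to make the sort order predictable.

```shell
#!/bin/sh
# identifiers: hypothetical helper that lists the distinct identifiers,
# keywords and numbers found in C source read from stdin.
identifiers() {
    # '[\n*]' repeats the newline as often as needed to match the length
    # of the complemented first string, so no implicit padding is used.
    # -s squeezes runs of newlines (an empty first line is still possible
    # when the input starts with a separator).
    tr -cs 'a-zA-Z0-9_' '[\n*]' | LC_ALL=C sort -u
}

printf 'int main(void) { int x_1 = 0; return x_1; }\n' | identifiers
```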

The second important use of tr is to remove characters. For this option, you use the flag -d with one string as an argument. To fix up those nasty MS-DOS text files with a ^M at the end of the line and a trailing ^Z, specify tr in this way:
tr -d '\015\032'
Many people have written a program in C to do this same operation. Well, a C program isn't necessary--you only need to know the right program, tr, with the right flags. The -d flag isn't used often, but is nice to have when needed. You can combine it with the -c flag to delete everything except characters from the string you supplied as an argument.
Repeated characters can be squeezed into a single one using the -s option with one string as an argument. It can also be used to squeeze white space. To remove empty lines, type:
tr -s '\n'
The -s option can be used with two strings as arguments. In that case, tr first translates the text as if -s were not given and then squeezes the characters in the second string. For instance, we can squeeze all standard white space to a single space by specifying:
tr -s '\n' '[ *]'
The -d flag can also be combined with -s and two strings: the characters in the first string will be removed and the characters in the second string will be squeezed. tr may not be a great program; however, it gets the job done. It is particularly useful in scripts using pipes and command substitutions (i.e., inside the back quotes). If you use tr often, you'll learn to appreciate its capabilities. Small is beautiful.
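
As a sketch of deleting and squeezing in one pass, assuming a tr that accepts -d and -s together (GNU tr does): the helper name dos_compact and the sample input are invented for the example.

```shell
#!/bin/sh
# dos_compact: hypothetical helper. -d deletes the characters of the
# first string (CR and Ctrl-Z); -s then squeezes repeats of the
# characters in the second string (newline), which drops empty lines.
dos_compact() {
    tr -ds '\015\032' '\012'
}

# DOS-style input with two empty lines and a trailing Ctrl-Z.
printf 'one\r\n\r\n\r\ntwo\r\n\032' | dos_compact
```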

Dogs of the Linux Shell

Linux Journal

tr is a simple pattern translator. Its practical application overlaps a bit with other, more complex tools, such as sed and awk [with larger binary footprints]. tr is quite useful for simple textual replacements, deletions and additions. Its behavior is dictated by "from" and "to" character sets provided as the first and second argument. The general usage syntax of tr is as follows:
# (12)  tr usage
tr [options] "set1" ["set2"] < input > output
Note that tr does not accept file arguments; it reads from standard input and writes to standard output. When two character sets are provided, tr operates on the characters contained in "set1" and performs some amount of substitution based on "set2". Listing 1 demonstrates some of the more common tasks performed with tr.
# (13) Transform lower case alphas to their
#      equivalent upper case.
$ echo "Hello World." | tr "[a-z]" "[A-Z]"
HELLO WORLD.

# (14) Same lower to upper transformation -
#      uses character class names :lower:
#      and :upper:.  (tr recognizes 12
#      character class names).
$ tr "[:lower:]" "[:upper:]" < README > UPPER_README

# (15) Make $PATH a bit more readable/searchable -
#  substitute ':' with a line feed
$ echo $PATH | tr ":" "\n"
/usr/bin
/bin
/usr/local/bin
.....
$ echo $PATH | tr ":" "\n" | grep -i "local"
/usr/local/bin
/usr/home/curly/Local_bin

# (16) Remove all white space from a file.
$ tr -d "[:space:]" < README > NO_WHITE_SPACE

# (17) Substitute all single or sequence of ;
#      with a single :
$ echo ";;;;This;;is;a;;;;simple;;;example." \
| tr -s ";" ":"
:This:is:a:simple:example.

Linux and UNIX tr command help

This example takes the echo output '12345678 9247' and pipes it through tr, replacing each digit with the corresponding letter. It returns computer hope.

echo "12345678 9247" | tr 123456789 computerh

This example takes the file myfile1, strips all non-printable characters, and writes the result to myfile2.

tr -cd '\11\12\40-\176' < myfile1 > myfile2

Reference

tr(1) translate-delete char - Linux man page

tr - translate or delete characters
Synopsis
tr [OPTION]... SET1 [SET2]
Description

Translate, squeeze, and/or delete characters from standard input, writing to standard output.

-c, -C, --complement	use the complement of SET1
-d, --delete	delete characters in SET1, do not translate
-s, --squeeze-repeats	replace each input sequence of a repeated character that is listed in SET1 with a single occurrence of that character
-t, --truncate-set1	first truncate SET1 to length of SET2
--help	display this help and exit
--version	output version information and exit

SETs are specified as strings of characters. Most represent themselves. Interpreted sequences are:

\NNN	character with octal value NNN (1 to 3 octal digits)
\\	backslash
\a	audible BEL
\b	backspace
\f	form feed
\n	new line
\r	return
\t	horizontal tab
\v	vertical tab
CHAR1-CHAR2	all characters from CHAR1 to CHAR2 in ascending order
[CHAR*]	in SET2, copies of CHAR until length of SET1
[CHAR*REPEAT]	REPEAT copies of CHAR, REPEAT octal if starting with 0
[:alnum:]	all letters and digits
[:alpha:]	all letters
[:blank:]	all horizontal whitespace
[:cntrl:]	all control characters
[:digit:]	all digits
[:graph:]	all printable characters, not including space
[:lower:]	all lower case letters
[:print:]	all printable characters, including space
[:punct:]	all punctuation characters
[:space:]	all horizontal or vertical whitespace
[:upper:]	all upper case letters
[:xdigit:]	all hexadecimal digits
[=CHAR=]	all characters which are equivalent to CHAR

Translation occurs if -d is not given and both SET1 and SET2 appear. -t may be used only when translating. SET2 is extended to length of SET1 by repeating its last character as necessary. Excess characters of SET2 are ignored. Only [:lower:] and [:upper:] are guaranteed to expand in ascending order; used in SET2 while translating, they may only be used in pairs to specify case conversion. -s uses SET1 if not translating nor deleting; else squeezing uses SET2 and occurs after translation or deletion.

Solaris 9 man pages section 1 User Commands

Any combination of the options -c, -d, or -s may be used:

-c Complement the set of characters in string1 with respect to the universe of characters whose ASCII codes are 01 through 0377 octal.
-d Delete all input characters in string1.
-s Squeeze all strings of repeated output characters that are in string2 to single characters.

Example 1 Creating a list of all the words in a filename

The following example creates a list of all the words in filename1, one per line, in filename2, where a word is taken to be a maximal string of alphabetics. The second string is quoted to protect `\' from the shell. 012 is the ASCII code for NEWLINE.

example% tr -cs A-Za-z '\012' < filename1 > filename2

tr Command -- AIX man page

Flags

`-A`	Performs all operations on a byte-by-byte basis using the ASCII collation order for ranges and character classes, instead of the collation order for the current locale.
`-c`	Specifies that the value of String1 be replaced by the complement of the string specified by String1. The complement of String1 is all of the characters in the character set of the current locale, except the characters specified by String1. If the `-A` and `-c` flags are both specified, characters are complemented with respect to the set of all 8-bit character codes. If the `-c` and `-s` flags are both specified, the `-s` flag applies to characters in the complement of String1.
`-d`	Deletes each character from standard input that is contained in the string specified by String1.
`-s`	Removes all but the first in a sequence of repeated characters. Character sequences specified by String1 are removed from standard input before translation, and character sequences specified by String2 are removed from standard output.
String1	Specifies a string of characters.
String2	Specifies a string of characters.

The tr command

We can also use translate in another way: to distinguish between spaces and tabs. Spaces and tabs can be a pain when using scripts to compile system reports. What we need is a way of translating these characters. Now, there are many ways to skin a cat in Linux and shell scripting. I'm going to show you one way, although I'm sure you could now write a sed expression to do the same thing.

Assume that I have a file with a number of columns in it, but I am not sure about the number of spaces or tabs between the different columns. I would need some way of changing these spaces into a single space. Why? Because having a space (one or more) or a tab (one or more) between the columns will produce significantly different output if we extract information from the file with a shell script. How do we convert many spaces or tabs into a single space? Well, translate is our right-hand man (or woman) for this particular task. In order not to waste our time modifying our columns.txt let's work on the free command, which shows you free memory on your system. Type:
free 
If you look at the output you will see that there are lots of spaces between each one of these fields. How do we reduce multiple spaces between fields to a single space? We can use tr to squeeze characters (you can squeeze any characters but in this case we want to squeeze a space):
free |tr -s ' '
The -s switch tells the translate command to squeeze. (Read the info page on tr to find out all the other switches of tr).

We could squeeze zeroes with:
free | tr -s '0'
Which would obviously make zero sense!

Going back to our previous command of squeezing spaces, you'll see immediately that our memory usage table (which is what the free command produces) becomes much more usable because we've removed superfluous spaces.

Perhaps, we want some fields from the output. We could redirect the output of this into a file with:
free | tr -s ' ' > file.txt
Traditional systems would have you use a Text editor to cut and paste the fields you are interested in, into a new file. Do we want to do that? Absolutely not! We're lazy, we want to find a better way of doing this.

What I'm interested in, is the line that contains 'Mem'. As part of your project, you should be building a set of scripts to monitor your system. Memory sounds like a good one that you may want to save. Instead of just redirecting the tr command to a file, let's first pass it through sed where we extract only the lines beginning with the word "Mem":
free | tr -s ' ' | sed '/^Mem/!d'
This returns only the line that we're interested in. We could run this over and over again, to ensure that the values change.

Let's take this one step further. We're only interested in the second, third and fourth fields of the line (representing total memory, used memory and free memory respectively). How do we retrieve only these fields?
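
One possible answer, given here only as a sketch, is to hand the squeezed line to cut, which can now split on single spaces. The helper name mem_fields and the canned sample output are assumptions for illustration; the numbers are invented, and the column layout of free can differ between versions.

```shell
#!/bin/sh
# mem_fields: hypothetical helper. Reads free-style output on stdin and
# prints fields 2-4 (total, used, free) of the line starting with Mem.
mem_fields() {
    tr -s ' ' | sed '/^Mem/!d' | cut -d ' ' -f 2-4
}

# Canned sample so the sketch does not depend on the local free output.
sample='              total        used        free      shared  buff/cache   available
Mem:        8000000     3000000     2000000      100000     3000000     4500000
Swap:       2000000           0     2000000'

printf '%s\n' "$sample" | mem_fields
```

On a live system the same helper would be used as free | mem_fields.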

Etc

Society

Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers : Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism : The Iron Law of Oligarchy : Libertarian Philosophy

Quotes

War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda : SE quotes : Language Design and Programming Quotes : Random IT-related quotes : Somerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose Bierce : Bernard Shaw : Mark Twain Quotes

Bulletin:

Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 : Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law

History:

Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds : Larry Wall : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOS : Programming Languages History : PL/1 : Simula 67 : C : History of GCC development : Scripting Languages : Perl history : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history

Classic books:

The Peter Principle : Parkinson Law : 1984 : The Mythical Man-Month : How to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Hater’s Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite

Most popular humor pages:

Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor

The Last but not Least: "Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand." ~ Archibald Putt, Ph.D.

Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Copyright in original materials belongs to their respective owners. Quotes are made for educational purposes only, in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links, as it develops like a living tree...


Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense, so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable JavaScript for this site. This site is perfectly usable without JavaScript.

Last modified: February 10, 2021