
Unix find tutorial


Contents

  1. Introduction
  2. Find search expressions
  3. Finding files using name or path
  4. Finding files by age
  5. Using -exec option and xargs with find
  6. Finding SUID/SGUID files
  7. Finding World Writable, Abandoned and other Abnormal Files
  8. Finding Files based on size: largest, empty and within certain range
  9. Additional ways of controlling tree traversal
  10. Using find for backups
  11. Examples of Usage of Unix Find Command
  12. Typical Errors in using find
  13. Summary
  14. Webliography

Part 5: Using -exec option and xargs with find

Find can perform several actions on the files or directories that it finds. The most commonly used actions are -print, -ls, -exec, and -delete, all of which are discussed below.

Never use the -exec option in a hurry or under pressure. Always test the correctness of the selected files with the -ls option before running a "destructive" command on them.

While the -print option is pretty benign, the -exec option is a powerful and dangerous tool. Administrator folklore contains many horror stories of wiping out important filesystems by misunderstanding this option of the find command. Rule No. 1 for the -exec option is very simple: unless you enjoy the situation commonly called SNAFU, always test a find command by substituting -ls for -exec first, to see whether the files selected are the files you really wish to process.
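
A minimal sketch of this workflow (the path and pattern here are hypothetical):

# Dry run: list what would be affected and inspect the output
find /var/tmp -name "*.tmp" -mtime +30 -ls

# Only after verifying the list, run the destructive version
find /var/tmp -name "*.tmp" -mtime +30 -exec rm {} \;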

TIPS:

  1. Unless you really need to process the whole subtree, use -maxdepth 1 to prevent extra files from getting into the results.
  2. Always use the option -0 with xargs if you supply it a list of files generated by find, and correspondingly always use the option -print0 of find to generate such a list. This prevents mistreating files with spaces in their names (which typically come from the Windows environment). Both tips are sketched below.
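
A quick illustration of both tips (the paths and patterns are made up for the example):

# Tip 1: restrict the search to the top level of /etc, with no recursion
find /etc -maxdepth 1 -name "*.conf" -ls

# Tip 2: null-delimited pipeline, safe for names containing spaces
find . -name "*.bak" -print0 | xargs -0 ls -l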

Find is able to execute one or more commands for each file it has found with the -exec option. Unfortunately, you cannot simply enter the command; you need to remember two syntactic tricks:

  1. The command that you want to execute needs to contain a special macro argument {}, which will be replaced by the matched filename on each invocation of the -exec predicate. If {} appears more than once, each occurrence is replaced by the same file path (at least in GNU find).

  2. You need to specify \; (or ';' ) at the end of the command. (If the \ is left out, the shell will interpret the ; as the end of the find command.)

    For example, the following two commands are equivalent:

    find . -name "*rc.conf" -exec chmod o+r {} \;
    find . -name "*rc.conf" -exec chmod o+r {} ';' 

    NOTE: when the {} macro parameter is the last item in the command, there must be a space between the {} and the \;. For example:

    find . -type d -exec ls -ld {} \; 

If you attempt to make changes that involve system directories, it is better to do it in two stages. First create a file with the list of files to change using find, and verify that the list is accurate. Then use xargs with the option -p (see below) to process this file.
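
A sketch of the two stages, assuming the filenames contain no whitespace (otherwise use -print0 and xargs -0); the file name under /tmp is hypothetical:

# Stage 1: build the list of files and inspect it carefully
find /etc -name "*.conf" > /tmp/files.txt
cat /tmp/files.txt

# Stage 2: process the verified list, confirming each command with -p
xargs -p chmod o-r < /tmp/files.txt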

For deleting files, GNU find has the option -delete, which is safer than "-exec /bin/rm {} \;". For example: find / -name core -delete

There is a classic problem with using rm when filenames contain spaces, for example files that migrated to a Unix filesystem from Windows, where, unfortunately, spaces in filenames are a common practice. Suppose you need to delete all documents whose names end in "doc copy" and write:

find /mnt/zip -name "*doc copy" -print | xargs rm

Because plain xargs splits its input on any whitespace, each word of such a filename is passed to rm as a separate argument, so the wrong files can be deleted.

There are several ways to prevent this nasty error: use find's own -delete action, use -exec rm {} \; (which passes each matched name as a single argument), or use -print0 together with xargs -0, as sketched below:
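
# 1. Let find itself do the deletion (GNU find)
find /mnt/zip -name "*doc copy" -delete

# 2. -exec passes each matched name to rm as a single argument
find /mnt/zip -name "*doc copy" -exec rm {} \;

# 3. Null-delimited pipeline, safe for spaces in names
find /mnt/zip -name "*doc copy" -print0 | xargs -0 rm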

Again, if you deal with important files it is better to experiment first to see that everything is right. Five minutes of testing can save five or more hours of desperate attempts to recover accidentally deleted files.

Here are examples of "good practices" for using find; we will use chmod as the base of the examples. Many people do not think of commands like chmod or chown as particularly dangerous, but applied to the root filesystem they can be pretty devastating. Please note that we first get to the target directory using cd and only then run the find command with the "." (dot) argument. This avoids such unpleasant situations as typing "/ etc" (with a space) instead of "/etc".

Or worse "/etc" instead of etc (the intention was to get to local etc directory but string "/etc" is hardwired in sysadmin brains and this slip costs many sysadmins tremendous pain):

Test command:

find . -type f -ls

Final command:

find `pwd` -type f -exec chmod 500 {} ';'
The command above searches the current directory and all subdirectories and changes the permissions of each file as specified. Here an additional danger is connected with being in the wrong directory when you run it.

 

Test command:

find . -name "*rc.conf"  -ls

Final command:

find `pwd` -name "*rc.conf"  -exec chmod o+r {} \;

This command will search the current directory and all subdirectories. All files named *rc.conf will be processed by the chmod o+r command. The argument {} is a macro that expands to each found file. The \; argument indicates that the -exec argument has ended. You can use ';' instead:

find `pwd` -name "*rc.conf" -exec chmod o+r {} ';' 

The end result of this command is that all *rc.conf files have the read bit set in the "other" permissions.

The find command is commonly used to remove core files that are more than a few days old. These core files are copies of the actual memory image of a running program at the moment the program died unexpectedly. They can be huge, so occasionally trimming them is wise:

Test command:

find . -name core -ctime +4 -ls

Final command:

find `pwd` -name core -ctime +4 -exec /bin/rm -f {} \;

For grep, the /dev/null argument can be used to show the name of the file before the text that is found; without it, only the matching text is printed. An equivalent mechanism in GNU grep is the "-H" ("--with-filename") option:

find /tmp -exec grep "search string" {} /dev/null \; -print

An alternative to the -exec option is piping output into the xargs command, which we will discuss in the next section.

Feeding find  output to pipes with xargs

One of the biggest limitations of the -exec option (or, to be more correct, the predicate with a side effect) is that it can run the specified command on only one file at a time.
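
As an aside, POSIX and GNU find also offer a batching form of -exec, terminated with "+" instead of "\;", which packs many filenames into a single invocation much as xargs does:

find . -type f -exec grep -l 'bin/ksh' {} +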

Always check the correctness of the list of files selected by the find command.

The xargs command solves this problem by enabling users to run a single command on many files at one time. In general, it is much faster to run one command on many files, because this cuts down on the number of invocations of the command or utility.

Note: the option -print0 prints the list of filenames with the null character (\0) instead of a newline as the delimiter between the pathnames found. This is the safer choice if files can contain blanks or other special characters and you use find together with xargs (the corresponding -0 argument is needed in xargs).

For example, one often needs to find files containing a specific pattern in multiple directories. One way is to use the -exec option of find:

 find . -type f -exec grep -iH '/bin/ksh' {} \; 

But there is a more elegant and more Unix-like way of accomplishing the same task, using xargs and pipes. You can use xargs to read the output of find and build a pipeline that invokes grep. This way grep is called only four or five times, even though it might check through 200 or 300 files. By default, xargs appends the list of filenames to the end of the specified command, so using it with grep and most other Unix commands is pretty natural:

find . -type f -print | xargs grep -il 'bin/ksh' 

This gives the same output a lot faster (the -l option of grep prints only the names of files with matching lines, separated by newlines, and does not repeat a name when the pattern is found more than once). You can also filter the output with an additional grep stage of the pipeline before xargs, as shown below.
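
For instance, to drop some paths from the list before they reach the final grep (the excluded pattern here is just an example):

find . -type f -print | grep -v '/tmp/' | xargs grep -il 'bin/ksh'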

Also, when xargs is used with grep, the latter will usually be getting multiple filenames, and in that case grep automatically includes the filename of any file that contains a match. Still, the -H option of grep (or adding /dev/null to the list of files) is recommended, because the last "chunk" of filenames passed by xargs can contain a single file.

When used in combination, find, grep, and xargs are a potent team that helps find files lost or misplaced anywhere in the Unix filesystem. I strongly encourage you to experiment further, as this is a sysadmin skill that is really necessary in the current environment, with God knows how many files even in such directories as /etc.

You can use the time command to measure the difference in speed between the -exec option and xargs in the following way:

time find /usr/src -name "*.html" -exec grep -H "foo" {} ';' | wc -l
time find /usr/src -name "*.html" | xargs grep -l "foo" | wc -l

On any substantial set of files xargs works considerably faster. The difference becomes even greater when more complex commands are run and the list of files is longer.

Two other useful options for xargs are -p, which makes xargs interactive, and -n, which makes xargs run the specified command with at most N arguments at a time. As we mentioned before, the option -0 prevents mistreating files with spaces in their names (which typically come from the Windows environment) and must be paired with the option -print0 of the find command. Both options are sketched below.
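
For example (the pattern is hypothetical):

# Prompt before each invocation of rm
find . -name "*.tmp" -print0 | xargs -0 -p rm

# Pass at most 20 filenames to each invocation of rm
find . -name "*.tmp" -print0 | xargs -0 -n 20 rm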

I would like to stress again and again that this is a vital option if you can have filenames with spaces in your filesystem. As the chance of that is pretty high in a modern Unix environment, I recommend using it as the default option. That means always. If you add the option -print0 to the find command and the option -0 to the xargs command, you avoid the danger of processing one filename with blanks as multiple files, with potentially catastrophic consequences if you use some destructive action in -exec or xargs:

find /mnt/zip -name "*prefs copy" -print0 | xargs -0 rm

Using the option -p you can confirm each action manually. This matters because xargs runs the specified command on the filenames from its standard input, so interactive commands such as cp -i, mv -i, and rm -i don't work right.

So when you run the command for the first time you can use this option as a safety valve; after several operations with confirmation you can cancel it and rerun without -p. In the preceding example, the -p option makes the command safe because you can answer yes or no to each file:

find /mnt/zip -name "*prefs copy" -print0 | xargs -0 -p rm

Many users frequently ask why xargs should be used when shell command substitution achieves the same results. Take a look at this example:

grep foo `find /usr/src/linux -name "*.html"`

The drawback with commands such as this is that if the set of files returned by find is longer than the system's command-line length limit, the command will fail. The xargs approach gets around this problem because xargs runs the command as many times as required, instead of just once.
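
You can see the limit in question, and the xargs workaround, like this (a sketch):

# The system's limit on the total length of exec() arguments
getconf ARG_MAX

# xargs stays under the limit by batching the filenames as needed
find /usr/src/linux -name "*.html" -print0 | xargs -0 grep -l foo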

People do pretty complex stuff this way. For example (Ubuntu Forums, March 23rd, 2010):

FakeOutdoorsman

I'm trying to convert Nikon NEF images to jpg. Usually I use find and xargs for batch processes like this for example:

Code:

find . -name "*.flac" -exec basename \{\} .flac \; | xargs -i ffmpeg \
	-i \{\}.flac -acodec libvorbis -aq 3 \{\}.ogg

However, my latest attempt is giving me no output because I can't seem to get xargs to work with pipes. An individual set of commands works:

Code:

dcraw -w -c MEY_7046.NEF | convert - -resize 25% MEY_7046.jpg
exiftool -overwrite_original -TagsFromFile MEY_7046.NEF MEY_7046.jpg
dcraw -z MEY_7046.jpg

A nice set of commands, but not yet practical for converting a DVD with multiple directories. My truncated find-isized version does nothing:

Code:

find . -name "*.NEF" -exec basename \{\} .NEF \; | xargs -i dcraw -w -c \{\}.NEF | convert - -resize 25% \{\}.jpg

Any ideas of where I'm going wrong?

diesch:

That pipes the output of all the dcraw runs together into one convert call.

Try

Code:

find . -name "*.NEF" -exec basename \{\} .NEF \; | xargs -i sh -c 'dcraw -w -c $0.NEF | convert - resize 25% $0.jpg'

In this example you could also use the -0 argument to xargs, provided the filename list is generated null-delimited.

But the ability of xargs to pass multiple arguments can be a source of problems too. For example:

find . -type f -name "*.java" | xargs tar cvf myfile.tar

Here the attempt is made to create a backup of all Java files in the current tree. But if the list of files is too long to pass to a single invocation of tar, xargs will split it across multiple invocations, and each subsequent tar command will overwrite the archive created by the previous one. As a result the archive will contain only a fraction of the files, and without testing you might discover this sad side effect too late.

To solve this problem you can either use a file with the list of files to include in the archive (tar can read such a list with the option -T), or use the option "r", which tells tar to append to the archive (the option "c" means "create"):

find . -type f -name "*.java" | xargs tar rvf myfile.tar
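
The file-list alternative might look like this (the name of the list file is made up for the example):

# Build and inspect the list first, then hand it to tar in a single invocation
find . -type f -name "*.java" > /tmp/javafiles.list
tar cvf myfile.tar -T /tmp/javafiles.list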

Gotchas

The -exec option of the find command is a very sharp tool, and misusing it has produced many horror stories (see also Typical Errors in using find). Such errors are often made under time pressure, or when the person is very tired and situational awareness is low.

Please remember that five minutes of testing can usually save five or more hours of desperate attempts to recover from the results of an incorrectly run find command.


Typically "find blunders" are committed when a complex find  command that changes the files in a certain subtree using rm, chown, or chmod command is constructed and run without any testing. So in many cases this is a direct result of recklessness of sysadmin. Sometimes it is result of time pressure of being extremely tired.

Often you just can't foresee the results of a particular find command without testing. For example, sometimes the directories involved contain symbolic links to directories in other parts of the filesystem, and find "starts running wild" on a subtree you never intended it to touch. Sometimes the pattern that you use has an unintended side effect. Sometimes it is just a silly typo. The life of a sysadmin is a complex one, so a little testing does wonders in preventing nasty surprises born of overconfidence in your own abilities :-).
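
Note that GNU find does not follow symbolic links unless told to; the "running wild" scenario usually involves the -L option (or the older -follow), as in this sketch:

# -L makes find descend through symlinks, possibly far outside /some/dir
find -L /some/dir -name core -ls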

Some pretty telling examples are collected on the Typical Errors in using find page.




