
# Classic Unix Utilities

There are many people who use UNIX or Linux but who IMHO do not understand UNIX. UNIX is not just an operating system, it is a way of doing things, and the shell plays a key role by providing the glue that makes it work. The UNIX methodology relies heavily on reuse of a set of tools rather than on building monolithic applications. Even perl programmers often miss the point, writing the heart and soul of the application as perl script without making use of the UNIX toolkit. -- David Korn (bold italic is mine -- BNN)


IMHO there are three Unix tools that can spell the difference between a really good programmer or sysadmin and a merely above-average one (even if the latter has solid knowledge of shell and Perl; knowledge of shell and Perl is necessary but not sufficient):

• OFM (Midnight Commander, Deco, XNC) - a unique class of file managers that greatly accelerate working with the classic command-line Unix tools. Paradoxically, they came to Unix from DOS. See The Orthodox File Manager (OFM) Paradigm. Chapter 4.

• Expect - a unique Unix tool (now available for Windows too). BTW, one of the earlier names for Expect was "sex", as it related to "intercourse" of programs ;-). I strongly recommend learning how to use it. See TCL, TK & Expect for more information

• TCL -- Tool Command Language. This is a unique language that permits automating tasks that neither shell nor Perl can handle. It is used in Expect (see above). Unfortunately, Unix politics (the forking efforts of Richard Stallman -- see Guile, a Scheme-based GNU macro language :-( -- and, especially, Sun's fascination with Java) prevented TCL from becoming a standard Unix macro language. As Wikipedia noted: "Despite the enthusiasm of its users and developers, many novice programmers find Scheme intimidating - and the average skill level of scripting language programmers is substantially lower than for system and application programmers. Hence Guile, despite its many benefits, struggles for mainstream acceptance in the Linux/Unix world." For the dark side of RMS see The Tcl War and the second part of my RMS biography

These tools can also be used as a fine test in interviews for advanced Unix-related positions if you have several similar candidates. Other things being equal, knowledge of them demonstrates a level of Unix culture superior to the average "command line junkies" level ;-)

An overview of books about GNU/open source tools can be found in the Unix tools bibliography. There are not that many good books on the subject; still, even average books can provide you with insight into the usage of a tool that you might never get through daily practice.

Please note that Unix is a pretty complex system and some aspects of it are non-obvious even for those who have more than ten years of experience.

Dr. Nikolai Bezroukov


## Old News ;-)

#### [Oct 09, 2019] The gzip Recovery Toolkit

###### Oct 09, 2019 | www.aaronrenn.com

So you thought you had your files backed up - until it came time to restore. Then you found out that you had bad sectors and you've lost almost everything because gzip craps out 10% of the way through your archive. The gzip Recovery Toolkit has a program - gzrecover - that attempts to skip over bad data in a gzip archive. This saved me from exactly the above situation. Hopefully it will help you as well.

I'm very eager for feedback on this program. If you download and try it, I'd appreciate an email letting me know what your results were. My email is arenn@urbanophile.com . Thanks.

ATTENTION

99% of "corrupted" gzip archives are caused by transferring the file via FTP in ASCII mode instead of binary mode. Please re-transfer the file in the correct mode first before attempting to recover from a file you believe is corrupted.

Disclaimer and Warning

This program is provided AS IS with absolutely NO WARRANTY. It is not guaranteed to recover anything from your file, nor is what it does recover guaranteed to be good data. The bigger your file, the more likely that something will be extracted from it. Also keep in mind that this program gets faked out and is likely to "recover" some bad data. Everything should be manually verified.

Note that version 0.8 contains major bug fixes and improvements. See the ChangeLog for details. Upgrading is recommended. The old version is provided in the event you run into troubles with the new release.

You need the following packages:

First, build and install zlib if necessary. Next, unpack the gzrt sources. Then cd to the gzrt directory and build the gzrecover program by typing make . Install manually by copying to the directory of your choice.
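Spelled out, the build might look roughly like this; the tarball name and install directory are assumptions, so adjust them to the release you actually downloaded:

# zlib is usually available as a distribution package; build it from source only if needed
tar xzf gzrt-0.8.tar.gz      # unpack the gzrecover sources
cd gzrt-0.8
make                         # builds the gzrecover binary
cp gzrecover /usr/local/bin  # "install" manually wherever you like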

Usage

Run gzrecover on a corrupted .gz file. If you leave the filename blank, gzrecover will read from the standard input. Anything that can be read from the file will be written to a file with the same name, but with a .recovered appended (any .gz is stripped). You can override this with the -o option. The default filename when reading from the standard input is "stdin.recovered". To write recovered data to the standard output, use the -p option. (Note that -p and -o cannot be used together).

To get a verbose readout of exactly where gzrecover is finding bad bytes, use the -v option to enable verbose mode. This will probably overflow your screen with text, so it is best to redirect the stderr stream to a file. Once gzrecover has finished, you will need to manually verify any recovered data, as it is quite likely that the output file is corrupt and contains some garbage. Note that gzrecover will take longer to run than regular gunzip; the more corrupt your data, the longer it takes. If your archive is a tarball, read on.

For tarballs, the tar program will choke because GNU tar cannot handle errors in the file format. Fortunately, GNU cpio (tested at version 2.6 or higher) handles corrupted files out of the box.

Here's an example:

$ ls *.gz
my-corrupted-backup.tar.gz
$ gzrecover my-corrupted-backup.tar.gz
$ ls *.recovered
my-corrupted-backup.tar.recovered
$ cpio -F my-corrupted-backup.tar.recovered -i -v


Note that newer versions of cpio can spew voluminous error messages to your terminal. You may want to redirect the stderr stream to /dev/null. Also, cpio might take quite a long while to run.

The gzip Recovery Toolkit v0.8
Copyright (c) 2002-2013 Aaron M. Renn ( arenn@urbanophile.com )

#### [Oct 09, 2019] gzip - How can I recover files from a corrupted .tar.gz archive - Stack Overflow

###### Oct 09, 2019 | stackoverflow.com


George ,Jun 24, 2016 at 2:49

Are you sure that it is a gzip file? I would first run 'file SMS.tar.gz' to validate that.

Then I would read The gzip Recovery Toolkit page.

JohnEye ,Oct 4, 2016 at 11:27

Recovery is possible but it depends on what caused the corruption.

If the file is just truncated, getting some partial result out is not too hard; just run

gunzip < SMS.tar.gz > SMS.tar.partial


which will give some output despite the error at the end.

If the compressed file has large missing blocks, it's basically hopeless after the bad block.

If the compressed file is systematically corrupted in small ways (e.g. transferring the binary file in ASCII mode, which smashes carriage returns and newlines throughout the file), it is possible to recover, but it requires quite a bit of custom programming; it's really only worth it if you have absolutely no other recourse (no backups) and the data is worth a lot of effort. (I have done it successfully.) I mentioned this scenario in a previous question .

The answers for .zip files differ somewhat, since zip archives have multiple separately-compressed members, so there's more hope (though most commercial tools are rather bogus, they eliminate warnings by patching CRCs, not by recovering good data). But your question was about a .tar.gz file, which is an archive with one big member.


Here is one possible scenario that we encountered. We had a tar.gz file that would not decompress; trying to unzip it gave the error:
gzip -d A.tar.gz
gzip: A.tar.gz: invalid compressed data--format violated


I figured out that the file may have been originally uploaded over a non-binary FTP connection (we don't know for sure).

The solution was relatively simple, using the Unix dos2unix utility:

dos2unix A.tar.gz
dos2unix: converting file A.tar.gz to UNIX format ...
tar -xvf A.tar
file1.txt
file2.txt
....etc.


It worked! This is one slim possibility, and maybe worth a try - it may help somebody out there.

#### [Sep 16, 2019] Artistic Style - Index

###### Sep 16, 2019 | astyle.sourceforge.net

Artistic Style 3.1 A Free, Fast, and Small Automatic Formatter
for C, C++, C++/CLI, Objective‑C, C#, and Java Source Code

 Project Page: http://astyle.sourceforge.net/ SourceForge: http://sourceforge.net/projects/astyle/

Artistic Style is a source code indenter, formatter, and beautifier for the C, C++, C++/CLI, Objective‑C, C# and Java programming languages.

When indenting source code, we as programmers have a tendency to use both spaces and tab characters to create the wanted indentation. Moreover, some editors by default insert spaces instead of tabs when pressing the tab key. Other editors (Emacs for example) have the ability to "pretty up" lines by automatically setting up the white space before the code on the line, possibly inserting spaces in code that up to now used only tabs for indentation.

The NUMBER of spaces for each tab character in the source code can change between editors (unless the user sets up the number to his liking...). One of the standard problems programmers face when moving from one editor to another is that code containing both spaces and tabs, which was perfectly indented, suddenly becomes a mess to look at. Even if you as a programmer take care to ONLY use spaces or tabs, looking at other people's source code can still be problematic.

To address this problem, Artistic Style was created – a filter written in C++ that automatically re-indents and re-formats C / C++ / Objective‑C / C++/CLI / C# / Java source files. It can be used from a command line, or it can be incorporated as a library in another program.

#### [Sep 16, 2019] Usage -- PrettyPrinter 0.18.0 documentation

###### Sep 16, 2019 | prettyprinter.readthedocs.io

Usage

Install the package with pip :

pip install prettyprinter

Instead of

from pprint import pprint

do

from prettyprinter import cpprint

for colored output. For colorless output, remove the c prefix from the function name:

from prettyprinter import pprint

#### [Sep 16, 2019] JavaScript code prettifier

###### Sep 16, 2019 | github.com


An embeddable script that makes source-code snippets in HTML prettier.

• Works on HTML pages.
• Works even if code contains embedded links, line numbers, etc.
• Simple API: include some JS & CSS and add an onload handler.
• Customizable styles via CSS. See the themes gallery .
• Supports all C-like, Bash-like, and XML-like languages. No need to specify the language.
• Extensible language handlers for other languages. You can specify the language.
• Widely used with good cross-browser support. Powers https://code.google.com/ and http://stackoverflow.com/

#### [Sep 16, 2019] Pretty-print for shell script

###### Sep 16, 2019 | stackoverflow.com

Benoit ,Oct 21, 2010 at 13:19

I'm looking for something similiar to indent but for (bash) scripts. Console only, no colorizing, etc.

Do you know of one ?

Jamie ,Sep 11, 2012 at 3:00

Vim can indent bash scripts. But not reformat them before indenting.
Backup your bash script, open it with vim, type gg=GZZ and indent will be corrected. (Note for the impatient: this overwrites the file, so be sure to do that backup!)

Though, some bugs with << (expecting EOF as first character on a line) e.g.

EDIT: ZZ not ZQ

Daniel Martí ,Apr 8, 2018 at 13:52

A bit late to the party, but it looks like shfmt could do the trick for you.

Brian Chrisman ,Sep 9 at 7:47

In bash I do this:
reindent() {
  source <(echo "Zibri () {";cat "$1"; echo "}")
  declare -f Zibri|head --lines=-1|tail --lines=+3 | sed -e "s/^\s\s\s\s//"
}

this eliminates comments and reindents the script "bash way". If you have HEREDOCS in your script, they get ruined by the sed in the previous function. So use:

reindent() {
  source <(echo "Zibri () {";cat "$1"; echo "}")
  declare -f Zibri|head --lines=-1|tail --lines=+3
}


But all your script will have a 4 spaces indentation.

Or you can do:

reindent ()
{
rstr=$(mktemp -u "XXXXXXXXXX"); source <(echo "Zibri () {";cat "$1"|sed -e "s/^\s\s\s\s/$rstr/"; echo "}"); echo '#!/bin/bash'; declare -f Zibri | head --lines=-1 | tail --lines=+3 | sed -e "s/^\s\s\s\s//;s/$rstr/    /"
}


which takes care also of heredocs.


Found this http://www.linux-kheops.com/doc/perl/perl-aubert/fmt.script .

Very nice, only one thing i took out is the [...]->test substitution.

#### [Sep 16, 2019] A command-line HTML pretty-printer Making messy HTML readable - Stack Overflow

##### "... Have a look at the HTML Tidy Project: http://www.html-tidy.org/ ..."
###### Sep 16, 2019 | stackoverflow.com

nisetama ,Aug 12 at 10:33

I'm looking for recommendations for HTML pretty printers which fulfill the following requirements:
• Takes HTML as input, and then output a nicely formatted/correctly indented but "graphically equivalent" version of the given input HTML.
• Must support command-line operation.
• Must be open-source and run under Linux.


Have a look at the HTML Tidy Project: http://www.html-tidy.org/

The granddaddy of HTML tools, with support for modern standards.

There used to be a fork called tidy-html5 which since became the official thing. Here is its GitHub repository .

Tidy is a console application for Mac OS X, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards.

For your needs, here is the command line to call Tidy:
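A typical command line that cleans up and indents a messy page looks something like this (file names are placeholders; see tidy's man page for the full option list):

tidy -indent -quiet -output pretty.html messy.html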

#### [Sep 12, 2019] 9 Best File Comparison and Difference (Diff) Tools for Linux

###### Sep 12, 2019 | www.tecmint.com

3. Kompare

Kompare is a diff GUI wrapper that allows users to view differences between files and also merge them.

Some of its features include:

1. Supports multiple diff formats
2. Supports comparison of directories
3. Customizable interface
4. Creating and applying patches to source files

Kompare Tool – Compare Two Files in Linux

Visit Homepage : https://www.kde.org/applications/development/kompare/

4. DiffMerge

DiffMerge is a cross-platform GUI application for comparing and merging files. It has two functionality engines, the Diff engine which shows the difference between two files, which supports intra-line highlighting and editing and a Merge engine which outputs the changed lines between three files.

It has got the following features:

1. Supports directory comparison
2. File browser integration
3. Highly configurable

DiffMerge – Compare Files in Linux

Visit Homepage : https://sourcegear.com/diffmerge/

5. Meld – Diff Tool

Meld is a lightweight GUI diff and merge tool. It enables users to compare files, directories plus version controlled programs. Built specifically for developers, it comes with the following features:

1. Two-way and three-way comparison of files and directories
2. Update of file comparison as a user types more words
3. Makes merges easier using auto-merge mode and actions on changed blocks
4. Easy comparisons using visualizations
5. Supports Git, Mercurial, Subversion, Bazaar plus many more

Meld – A Diff Tool to Compare File in Linux

Visit Homepage : http://meldmerge.org/

6. Diffuse – GUI Diff Tool

Diffuse is another popular, free, small and simple GUI diff and merge tool that you can use on Linux. Written in Python, it offers two major functionalities, that is: file comparison and version control, allowing file editing, merging of files and also output of the difference between files.

You can view a comparison summary, select lines of text in files using a mouse pointer, match lines in adjacent files and edit different files. Other features include:

1. Syntax highlighting
2. Keyboard shortcuts for easy navigation
3. Supports unlimited undo
4. Unicode support
5. Supports Git, CVS, Darcs, Mercurial, RCS, Subversion, SVK and Monotone

DiffUse – A Tool to Compare Text Files in Linux

Visit Homepage : http://diffuse.sourceforge.net/

7. XXdiff – Diff and Merge Tool

XXdiff is a free, powerful file and directory comparator and merge tool that runs on Unix like operating systems such as Linux, Solaris, HP/UX, IRIX, DEC Tru64. One limitation of XXdiff is its lack of support for unicode files and inline editing of diff files.

It has the following list of features:

1. Shallow and recursive comparison of two or three files, or two directories
2. Horizontal difference highlighting
3. Interactive merging of files and saving of resulting output
4. Supports merge reviews/policing
5. Supports external diff tools such as GNU diff, SIG diff, Cleareddiff and many more
6. Extensible using scripts
7. Fully customizable using resource file plus many other minor features

xxdiff Tool

Visit Homepage : http://furius.ca/xxdiff/

8. KDiff3 – Diff and Merge Tool

KDiff3 is yet another cool, cross-platform diff and merge tool from KDevelop . It works on all Unix-like platforms including Linux, Mac OS X and Windows.

It can compare or merge two to three files or directories and has the following notable features:

1. Indicates differences line by line and character by character
2. Supports auto-merge
3. In-built editor to deal with merge-conflicts
4. Supports Unicode, UTF-8 and many other codecs
5. Allows printing of differences
6. Windows explorer integration support
7. Also supports auto-detection via byte-order-mark "BOM"
8. Supports manual alignment of lines
9. Intuitive GUI and many more

KDiff3 Tool for Linux

Visit Homepage : http://kdiff3.sourceforge.net/

9. TkDiff

TkDiff is also a cross-platform, easy-to-use GUI wrapper for the Unix diff tool. It provides a side-by-side view of the differences between two input files. It can run on Linux, Windows and Mac OS X.

Additionally, it has some other exciting features including diff bookmarks, a graphical map of differences for easy and quick navigation plus many more.

Visit Homepage : https://sourceforge.net/projects/tkdiff/

Having read this review of some of the best file and directory comparator and merge tools, you probably want to try out some of them. These may not be the only diff tools you can find on Linux, but they are known to offer some of the best features; you may also want to let us know of any other diff tools out there that you have tested and think deserve to be mentioned among the best.

#### [Sep 04, 2019] Basic Trap for File Cleanup

###### Sep 04, 2019 | www.putorius.net

Basic Trap for File Cleanup

Using a trap to clean up is simple enough. Here is an example of using trap to clean up a temporary file on exit of the script.

#!/bin/bash
trap "rm -f /tmp/output.txt" EXIT
yum -y update > /tmp/output.txt
if grep -qi "kernel" /tmp/output.txt; then
mail -s "KERNEL UPDATED" user@example.com < /tmp/output.txt
fi


NOTE: It is important that the trap statement be placed at the beginning of the script to function properly. Any commands above the trap can exit and not be caught in the trap.

Now if the script exits for any reason, it will still run the rm command to delete the file. Here is an example of me sending SIGINT (CTRL+C) while the script was running.

# ./test.sh
^Cremoved '/tmp/output.txt'


NOTE: I added verbose ( -v ) output to the rm command so it prints "removed". The ^C signifies where I hit CTRL+C to send SIGINT.

This is a much cleaner and safer way to ensure the cleanup occurs when the script exits. Using EXIT ( 0 ) instead of a single defined signal (e.g. SIGINT – 2) ensures the cleanup happens on any exit, even successful completion of the script.
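For scripts that create more than one temporary file, the same idea is usually wrapped in a small cleanup function; here is a minimal sketch (the yum line stands in for the real work):

#!/bin/bash
TMPFILE=$(mktemp /tmp/output.XXXXXX)    # unique temporary file

cleanup() {
    rm -f "$TMPFILE"                    # remove everything the script created
}
trap cleanup EXIT                       # runs on any exit: success, failure or CTRL+C

yum -y update > "$TMPFILE"              # placeholder for the real work
grep -qi "kernel" "$TMPFILE" && echo "kernel was updated"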

#### [Sep 04, 2019] Exec - Process Replacement Redirection in Bash by Steven Vona

###### Sep 02, 2019 | www.putorius.net

The Linux exec command is a bash builtin and a very interesting utility. It is not something most people who are new to Linux know. Most seasoned admins understand it but only use it occasionally. If you are a developer, programmer or DevOps engineer, it is probably something you use more often. Let's take a deep dive into the builtin exec command, what it does and how to use it.

Basics of the Sub-Shell

In order to understand the exec command, you need a fundamental understanding of how sub-shells work.

... ... ...

What the Exec Command Does

In its most basic function the exec command changes the default behavior of creating a sub-shell to run a command. If you run exec followed by a command, that command will REPLACE the original process; it will NOT create a sub-shell.

An additional feature of the exec command is redirection and manipulation of file descriptors . Explaining redirection and file descriptors is outside the scope of this tutorial. If these are new to you please read " Linux IO, Standard Streams and Redirection " to get acquainted with these terms and functions.

In the following sections we will expand on both of these functions and try to demonstrate how to use them.

How to Use the Exec Command with Examples

Let's look at some examples of how to use the exec command and its options.

Basic Exec Command Usage – Replacement of Process

If you call exec and supply a command without any options, it simply replaces the shell with command .

Let's run an experiment. First, I ran the ps command to find the process id of my second terminal window. In this case it was 17524. I then ran "exec tail" in that second terminal and checked the ps command again. If you look at the screenshot below, you will see the tail process replaced the bash process (same process ID).

Since the tail command replaced the bash shell process, the shell will close when the tail command terminates.
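You can reproduce that experiment yourself in a throwaway terminal; a minimal sketch:

echo $$         # note the PID of the current shell
exec sleep 300  # sleep replaces the shell and keeps the same PID (check with ps from another terminal)
# when sleep exits or is killed, the terminal closes, because there is no shell left to return to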

Exec Command Options

If the -l option is supplied, exec adds a dash at the beginning of the first (zeroth) argument given. So if we ran the following command:

exec -l tail -f /etc/redhat-release


It would produce the following output in the process list. Notice the highlighted dash in the CMD column.

The -c option causes the supplied command to run with an empty environment. Environmental variables like PATH are cleared before the command is run. Let's try an experiment. We know that the printenv command prints all the settings for a user's environment. So here we will open a new bash process, run the printenv command to show we have some variables set. We will then run printenv again, but this time with the exec -c option.

In the example above you can see that an empty environment is used when using exec with the -c option. This is why there was no output from the printenv command when run with exec.

The last option, -a [name], will pass name as the first argument to command . The command will still run as expected, but the name of the process will change. In this next example we opened a second terminal and ran the following command:

exec -a PUTORIUS tail -f /etc/redhat-release


Here is the process list showing the results of the above command:

As you can see, exec passed PUTORIUS as the first argument to command , therefore it shows in the process list with that name.

Using the Exec Command for Redirection & File Descriptor Manipulation

The exec command is often used for redirection. When a file descriptor is redirected with exec it affects the current shell. It will exist for the life of the shell or until it is explicitly stopped.

If no command is specified, redirections may be used to affect the current shell environment.

– Bash Manual

Here are some examples of how to use exec for redirection and manipulating file descriptors. As we stated above, a deep dive into redirection and file descriptors is outside the scope of this tutorial. Please read " Linux IO, Standard Streams and Redirection " for a good primer and see the resources section for more information.

Redirect all standard output (STDOUT) to a file:
exec >file


In the example animation below, we use exec to redirect all standard output to a file. We then enter some commands that should generate some output. We then use exec to redirect STDOUT to the /dev/tty to restore standard output to the terminal. This effectively stops the redirection. Using the cat command we can see that the file contains all the redirected output.
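A related trick, instead of assuming /dev/tty, is to save the original stdout in a spare file descriptor before redirecting and restore it afterwards; a small sketch, assuming fd 3 is unused:

exec 3>&1             # save the current stdout as fd 3
exec > /tmp/all.log   # from here on, stdout goes to the log file
echo "this line lands in /tmp/all.log"
exec 1>&3 3>&-        # restore stdout from fd 3, then close fd 3
echo "back on the terminal"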

Open a file as file descriptor 6 for writing:
exec 6> file2write

Open file as file descriptor 8 for reading:
exec 8< file2read

Copy file descriptor 5 to file descriptor 7:
exec 7<&5

Close file descriptor 8:
exec 8<&-

Conclusion

In this article we covered the basics of the exec command. We discussed how to use it for process replacement, redirection and file descriptor manipulation.

In the past I have seen exec used in some interesting ways. It is often used as a wrapper script for starting other binaries. Using process replacement you can call a binary and when it takes over there is no trace of the original wrapper script in the process table or memory. I have also seen many System Administrators use exec when transferring work from one script to another. If you call a script inside of another script the original process stays open as a parent. You can use exec to replace that original script.
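A typical wrapper of that kind looks something like the sketch below; the paths and variable names are invented for the illustration:

#!/bin/bash
# set up the environment, then disappear from the process table
export APP_HOME=/opt/myapp
export LD_LIBRARY_PATH="$APP_HOME/lib:$LD_LIBRARY_PATH"
exec "$APP_HOME/bin/myapp" "$@"   # the real binary takes over this PID; the wrapper leaves no trace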

I am sure there are people out there using exec in some interesting ways. I would love to hear your experiences with exec. Please feel free to leave a comment below with anything on your mind.

Resources

#### [Sep 03, 2019] bash - How to convert strings like 19-FEB-12 to epoch date in UNIX - Stack Overflow

###### Feb 11, 2013 | stackoverflow.com


hellish ,Feb 11, 2013 at 3:45

In UNIX, how do I convert date strings like the following to epoch milliseconds:
19-FEB-12
16-FEB-12
05-AUG-09


I need this to compare these dates with the current time on the server.


To convert a date to seconds since the epoch:
date --date="19-FEB-12" +%s


Current epoch:

date +%s


So, since your dates are in the past:

NOW=`date +%s`
THEN=`date --date="19-FEB-12" +%s`

let DIFF=$NOW-$THEN
echo "The difference is: $DIFF"

Using BSD's date command, you would need:

$ date -j -f "%d-%B-%y" 19-FEB-12 +%s


Differences from GNU date :

1. -j prevents date from trying to set the clock
2. The input format must be explicitly set with -f
3. The input date is a regular argument, not an option (viz. -d )
4. When no time is specified with the date, use the current time instead of midnight.
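Once both timestamps are in epoch seconds, turning the difference into something more readable is plain integer arithmetic; for example, with GNU date:

NOW=$(date +%s)
THEN=$(date --date="19-FEB-12" +%s)
DIFF_DAYS=$(( (NOW - THEN) / 86400 ))   # 86400 seconds per day
echo "19-FEB-12 was $DIFF_DAYS days ago"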

#### [Sep 03, 2019] Linux - UNIX Convert Epoch Seconds To the Current Time - nixCraft

###### Sep 03, 2019 | www.cyberciti.biz

Print Current UNIX Time

Type the following command to display the seconds since the epoch:

date +%s

Sample outputs:
1268727836

Convert Epoch To Current Time

Type the command:

date -d @Epoch
date -d @1268727836
date -d "1970-01-01 1268727836 sec GMT"

Sample outputs:

Tue Mar 16 13:53:56 IST 2010


Please note that the @ feature only works with newer versions of date (GNU coreutils v5.3.0+). To convert a number of seconds back to a more readable form, use a command like this:

date -d @1268727836 +"%d-%m-%Y %T %z"

Sample outputs:

16-03-2010 13:53:56 +0530


#### [Sep 03, 2019] command line - How do I convert an epoch timestamp to a human readable format on the cli - Unix Linux Stack Exchange

###### Sep 03, 2019 | unix.stackexchange.com

Gilles ,Oct 11, 2010 at 18:14

date -d @1190000000

Replace 1190000000 with your epoch.

Stefan Lasiewski ,Oct 11, 2010 at 18:04

$ echo 1190000000 | perl -pe 's/(\d+)/localtime($1)/e'
Sun Sep 16 20:33:20 2007


This can come in handy for those applications which use epoch time in the logfiles:

$ tail -f /var/log/nagios/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
[Thu May 13 10:15:46 2010] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;HOSTA;check_raid;0;check_raid.pl: OK (Unit 0 on Controller 0 is OK)


Stéphane Chazelas ,Jul 31, 2015 at 20:24

With bash-4.2 or above:
printf '%(%F %T)T\n' 1234567890


(where %F %T is the strftime() -type format)

That syntax is inspired from ksh93 .

In ksh93 however, the argument is taken as a date expression where various and hardly documented formats are supported.

For a Unix epoch time, the syntax in ksh93 is:

printf '%(%F %T)T\n' '#1234567890'


ksh93 however seems to use its own algorithm for the timezone and can get it wrong. For instance, in Britain, it was summer time all year in 1970, but:

$ TZ=Europe/London bash -c 'printf "%(%c)T\n" 0'
Thu 01 Jan 1970 01:00:00 BST
$ TZ=Europe/London ksh93 -c 'printf "%(%c)T\n" "#0"'
Thu Jan  1 00:00:00 1970


DarkHeart ,Jul 28, 2014 at 3:56

Custom format with GNU date :
date -d @1234567890 +'%Y-%m-%d %H:%M:%S'


Or with GNU awk :

awk 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 1234567890); }'



The two I frequently use are:
$ perl -leprint\ scalar\ localtime\ 1234567890
Sat Feb 14 00:31:30 2009

#### [Sep 03, 2019] Time conversion using Bash Vanstechelman.eu

###### Sep 03, 2019 | www.vanstechelman.eu

Time conversion using Bash

This article shows how you can obtain the UNIX epoch time (number of seconds since 1970-01-01 00:00:00 UTC) using the Linux bash "date" command. It also shows how you can convert a UNIX epoch time to a human readable time.

Obtain UNIX epoch time using bash

Obtaining the UNIX epoch time using bash is easy. Use the built-in date command and instruct it to output the number of seconds since 1970-01-01 00:00:00 UTC. You can do this by passing a format string as parameter to the date command. The format string for UNIX epoch time is '%s'.

lode@srv-debian6:~$ date "+%s"
1234567890

To convert a specific date and time into UNIX epoch time, use the -d parameter. The next example shows how to convert the timestamp "February 20th, 2013 at 08:41:15" into UNIX epoch time.

lode@srv-debian6:~$ date "+%s" -d "02/20/2013 08:41:15"
1361346075

Converting UNIX epoch time to human readable time

Even though I didn't find it in the date manual, it is possible to use the date command to reformat a UNIX epoch time into a human readable time. The syntax is the following:

lode@srv-debian6:~$ date -d @1234567890
Sat Feb 14 00:31:30 CET 2009

The same thing can also be achieved using a bit of perl programming:

lode@srv-debian6:~$ perl -e 'print scalar(localtime(1234567890)), "\n"'
Sat Feb 14 00:31:30 2009

Please note that the printed time is formatted in the timezone in which your Linux system is configured. My system is configured in UTC+2, so you may get another output for the same command.

#### [Sep 03, 2019] Run PerlTidy to beautify the code

##### Notable quotes:
##### "... Once I installed Code::TidyAll and placed those files in the root directory of the project, I could run tidyall -a . ..."

###### Sep 03, 2019 | perlmaven.com

The Code-TidyAll distribution provides a command line script called tidyall that will use Perl::Tidy to change the layout of the code. This tandem needs 2 configuration files. The .perltidyrc file contains the instructions to Perl::Tidy that describe the layout of a Perl-file. We used the following file copied from the source code of the Perl Maven project.

-pbp
-nst
-et=4
--maximum-line-length=120

# Break a line after opening/before closing token.
-vt=0
-vtc=0

The tidyall command uses a separate file called .tidyallrc that describes which files need to be beautified.

[PerlTidy]
select = {lib,t}/**/*.{pl,pm,t}
select = Makefile.PL
select = {mod2html,podtree2html,pods2html,perl2html}
argv = --profile=$ROOT/.perltidyrc

[SortLines]
select = .gitignore
Once I installed Code::TidyAll and placed those files in the root directory of the project, I could run tidyall -a .

That created a directory called .tidyall.d/ where it stores cached versions of the files, and changed all the files that were matches by the select statements in the .tidyallrc file.

Then, I added .tidyall.d/ to the .gitignore file to avoid adding that subdirectory to the repository and ran tidyall -a again to make sure the .gitignore file is sorted.

#### [Sep 02, 2019] bash - Pretty-print for shell script

###### Oct 21, 2010 | stackoverflow.com


Benoit ,Oct 21, 2010 at 13:19

I'm looking for something similiar to indent but for (bash) scripts. Console only, no colorizing, etc.

Do you know of one ?

Jamie ,Sep 11, 2012 at 3:00

Vim can indent bash scripts. But not reformat them before indenting.
Backup your bash script, open it with vim, type gg=GZZ and indent will be corrected. (Note for the impatient: this overwrites the file, so be sure to do that backup!)

Though, some bugs with << (expecting EOF as first character on a line) e.g.

EDIT: ZZ not ZQ

Daniel Martí ,Apr 8, 2018 at 13:52

A bit late to the party, but it looks like shfmt could do the trick for you.

Brian Chrisman ,Aug 11 at 4:08

In bash I do this:
reindent() {
  source <(echo "Zibri () {";cat "$1"; echo "}")
  declare -f Zibri|head --lines=-1|tail --lines=+3 | sed -e "s/^\s\s\s\s//"
}

this eliminates comments and reindents the script "bash way". If you have HEREDOCS in your script, they get ruined by the sed in the previous function. So use:

reindent() {
  source <(echo "Zibri () {";cat "$1"; echo "}")
  declare -f Zibri|head --lines=-1|tail --lines=+3
}


But all your script will have a 4 spaces indentation.

Or you can do:

reindent ()
{
rstr=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 16 | head -n 1); source <(echo "Zibri () {";cat "$1"|sed -e "s/^\s\s\s\s/$rstr/"; echo "}"); echo '#!/bin/bash'; declare -f Zibri | head --lines=-1 | tail --lines=+3 | sed -e "s/^\s\s\s\s//;s/$rstr/    /"
}


which takes care also of heredocs.

Pius Raeder ,Jan 10, 2017 at 8:35

Found this http://www.linux-kheops.com/doc/perl/perl-aubert/fmt.script .

Very nice, only one thing i took out is the [...]->test substitution.

#### [Sep 02, 2019] mvdan-sh A shell parser, formatter, and interpreter (POSIX-Bash-mksh)

##### Written in Go language
###### Sep 02, 2019 | github.com

sh

A shell parser, formatter and interpreter. Supports POSIX Shell , Bash and mksh . Requires Go 1.11 or later.

Quick start

To parse shell scripts, inspect them, and print them out, see the syntax examples .

For high-level operations like performing shell expansions on strings, see the shell examples .

shfmt

cd $(mktemp -d); go mod init tmp; go get mvdan.cc/sh/cmd/shfmt

The latest v3 pre-release can be downloaded in a similar manner, using the /v3 module:

cd $(mktemp -d); go mod init tmp; go get mvdan.cc/sh/v3/cmd/shfmt


Finally, any older release can be built with their respective older Go versions by manually cloning, checking out a tag, and running go build ./cmd/shfmt .

shfmt  formats shell programs. It can use tabs or any number of spaces to indent. See canonical.sh for a quick look at its default style.

You can feed it standard input, any number of files or any number of directories to recurse into. When recursing, it will operate on .sh  and .bash  files and ignore files starting with a period. It will also operate on files with no extension and a shell shebang.

shfmt -l -w script.sh


Typically, CI builds should use the command below, to error if any shell scripts in a project don't adhere to the format:

shfmt -d .


Use -i N  to indent with a number of spaces instead of tabs. There are other formatting options - see shfmt -h . For example, to get the formatting appropriate for Google's Style guide, use  shfmt -i 2 -ci .

Packages are available on Arch , CRUX , Docker , FreeBSD , Homebrew , NixOS , Scoop , Snapcraft , and Void .

Replacing bash -n 

bash -n  can be useful to check for syntax errors in shell scripts. However, shfmt >/dev/null  can do a better job as it checks for invalid UTF-8 and does all parsing statically, including checking POSIX Shell validity:

$ echo '${foo:1 2}' | bash -n
$ echo '${foo:1 2}' | shfmt
1:9: not a valid arithmetic operator: 2
$ echo 'foo=(1 2)' | bash --posix -n
$ echo 'foo=(1 2)' | shfmt -p
1:5: arrays are a bash feature


gosh

cd $(mktemp -d); go mod init tmp; go get mvdan.cc/sh/v3/cmd/gosh

Experimental shell that uses interp . Work in progress, so don't expect stability just yet.

Fuzzing

This project makes use of go-fuzz to find crashes and hangs in both the parser and the printer. To get started, run:

git checkout fuzz
./fuzz

Caveats

• When indexing Bash associative arrays, always use quotes. The static parser will otherwise have to assume that the index is an arithmetic expression.

$ echo '${array[spaced string]}' | shfmt
1:16: not a valid arithmetic operator: string
$ echo '${array[dash-string]}' | shfmt
${array[dash - string]}

• $((  and ((  ambiguity is not supported. Backtracking would complicate the parser and make streaming support via io.Reader  impossible. The POSIX spec recommends to space the operands if $( (  is meant.
$ echo '$((foo); (bar))' | shfmt
1:1: reached ) without matching $(( with ))

• Some builtins like export and let are parsed as keywords. This is to allow statically parsing them and building their syntax tree, as opposed to just keeping the arguments as a slice of arguments.

JavaScript

A subset of the Go packages are available as an npm package called mvdan-sh . See the _js directory for more information.

Docker

To build a Docker image, checkout a specific version of the repository and run:

docker build -t my:tag -f cmd/shfmt/Dockerfile .

Related projects

• Alternative docker images - by jamesmstone , PeterDaveHello
• format-shell - Atom plugin for shfmt
• micro - Editor with a built-in plugin for shfmt
• modd - A developer tool that responds to filesystem changes, using sh
• shell-format - VS Code plugin for shfmt
• vim-shfmt - Vim plugin for shfmt

#### [Aug 26, 2019] linux - Avoiding accidental 'rm' disasters - Super User

###### Aug 26, 2019 | superuser.com

Mr_Spock ,May 26, 2013 at 11:30

Today, using sudo -s , I wanted to rm -R ./lib/ , but I actually ran rm -R /lib/ . I had to reinstall my OS (Mint 15) and re-download and re-configure all my packages. Not fun.

How can I avoid similar mistakes in the future?

Vittorio Romeo ,May 26, 2013 at 11:55

First of all, stop executing everything as root . You never really need to do this. Only run individual commands with sudo if you need to. If a normal command doesn't work without sudo, just call sudo !! to execute it again.

If you're paranoid about rm , mv and other operations while running as root, you can add the following aliases to your shell's configuration file:

[ $UID = 0 ] && \
alias rm='rm -i' && \
alias mv='mv -i' && \
alias cp='cp -i'


These will all prompt you for confirmation ( -i ) before removing a file or overwriting an existing file, respectively, but only if you're root (the user with ID 0).

Don't get too used to that though. If you ever find yourself working on a system that doesn't prompt you for everything, you might end up deleting stuff without noticing it. The best way to avoid mistakes is to never run as root and think about what exactly you're doing when you use sudo .
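Another common safeguard, instead of (or in addition to) the -i aliases, is a wrapper that moves files into a trash directory rather than deleting them outright; a minimal sketch, not a production-ready tool:

del() {
    local trash=~/.trash
    mkdir -p "$trash"
    mv -- "$@" "$trash"/ && echo "moved to $trash: $*"
}

# usage: del old.log *.tmp    (review ~/.trash and empty it manually later)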

#### [Aug 26, 2019] bash - How to prevent rm from reporting that a file was not found

###### Aug 26, 2019 | stackoverflow.com


pizza ,Apr 20, 2012 at 21:29

I am using rm within a BASH script to delete many files. Sometimes the files are not present, so it reports many errors. I do not need this message. I have searched the man page for a command to make rm quiet, but the only option I found is -f , which from the description, "ignore nonexistent files, never prompt", seems to be the right choice, but the name does not seem to fit, so I am concerned it might have unintended consequences.
• Is the -f option the correct way to silence rm ? Why isn't it called -q ?
• Does this option do anything else?

Keith Thompson ,Dec 19, 2018 at 13:05

The main use of -f is to force the removal of files that would not be removed using rm by itself (as a special case, it "removes" non-existent files, thus suppressing the error message).

You can also just redirect the error message using

$ rm file.txt 2> /dev/null

(or your operating system's equivalent). You can check the value of $? immediately after calling rm to see if a file was actually removed or not.
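If the script needs to know whether anything was actually deleted, test the command (or $? right after it) instead of ignoring the result; for example:

if rm file.txt 2> /dev/null; then
    echo "file.txt removed"
else
    echo "file.txt was not removed (missing or not permitted)" >&2
fi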

vimdude ,May 28, 2014 at 18:10

Yes, -f is the most suitable option for this.

tripleee ,Jan 11 at 4:50

-f is the correct flag, but for the test operator, not rm
[ -f "$THEFILE" ] && rm "$THEFILE"


this ensures that the file exists and is a regular file (not a directory, device node etc...)

mahemoff ,Jan 11 at 4:41

\rm -f file will never report not found.

Idelic ,Apr 20, 2012 at 16:51

As far as rm -f doing "anything else", it does force ( -f is shorthand for --force ) silent removal in situations where rm would otherwise ask you for confirmation. For example, when trying to remove a file not writable by you from a directory that is writable by you.

Keith Thompson ,May 28, 2014 at 18:09

I had same issue for cshell. The only solution I had was to create a dummy file that matched pattern before "rm" in my script.

#### [Aug 26, 2019] shell - rm -rf return codes

###### Aug 26, 2019 | superuser.com


SheetJS ,Aug 15, 2013 at 2:50

Can anyone let me know the possible return codes for the command rm -rf other than zero, i.e., the possible return codes for failure cases? I want to know a more detailed reason for the failure of the command, beyond just knowing that it failed (returned something other than 0).

Adrian Frühwirth ,Aug 14, 2013 at 7:00

To see the return code, you can use echo $? in bash. To see the actual meaning, some platforms (like Debian Linux) have the perror binary available, which can be used as follows:

$ rm -rf something/; perror $?
rm: cannot remove `something/': Permission denied
OS error code 1: Operation not permitted

rm -rf automatically suppresses most errors. The most likely error you will see is 1 (Operation not permitted), which will happen if you don't have permissions to remove the file. -f intentionally suppresses most errors.

Adrian Frühwirth ,Aug 14, 2013 at 7:21

grabbed coreutils from git.... looking at exit we see...

openfly@linux-host:~/coreutils/src$ cat rm.c | grep -i exit
if (status != EXIT_SUCCESS)
exit (status);
/* Since this program exits immediately after calling 'rm', rm need not
atexit (close_stdin);
usage (EXIT_FAILURE);
exit (EXIT_SUCCESS);
usage (EXIT_FAILURE);
error (EXIT_FAILURE, errno, _("failed to get attributes of %s"),
exit (EXIT_SUCCESS);
exit (status == RM_ERROR ? EXIT_FAILURE : EXIT_SUCCESS);


Now looking at the status variable....

openfly@linux-host:~/coreutils/src$ cat rm.c | grep -i status
usage (int status)
if (status != EXIT_SUCCESS)
exit (status);
enum RM_status status = rm (file, &x);
assert (VALID_STATUS (status));
exit (status == RM_ERROR ? EXIT_FAILURE : EXIT_SUCCESS);

looks like there isn't much going on there with the exit status. I see EXIT_FAILURE and EXIT_SUCCESS and not anything else. So basically 0 and 1 / -1.

To see specific exit() syscalls and how they occur in a process flow try this:

openfly@linux-host:~/$ strace rm -rf $whatever

fairly simple.

ref: http://www.unix.com/man-page/Linux/EXIT_FAILURE/exit/

#### [Aug 20, 2019] How to exclude file when using scp command recursively

###### Aug 12, 2019 | www.cyberciti.biz

I need to copy all the *.c files from a local laptop named hostA to hostB including all directories. I am using the following scp command but do not know how to exclude specific files (such as *.out):

$ scp -r ~/projects/ user@hostB:/home/delta/projects/

How do I tell the scp command to exclude a particular file or directory at the Linux/Unix command line?

One can use the scp command to securely copy files between hosts on a network. It uses ssh for data transfer and authentication purposes. Typical scp command syntax is as follows:

scp file1 user@host:/path/to/dest/
scp -r /path/to/source/ user@host:/path/to/dest/
scp [options] /dir/to/source/ user@host:/dir/to/dest/

Scp exclude files

I don't think you can filter or exclude files when using the scp command. However, there is a great workaround to exclude files and copy them securely using ssh. This page explains how to filter or exclude files when using scp to copy a directory recursively.

How to use rsync command to exclude files

The syntax is:

rsync -av -e ssh --exclude='*.out' /path/to/source/ user@hostB:/path/to/dest/

Where,

1. -a : Recurse into directories i.e. copy all files and subdirectories. Also, turn on archive mode and all other options (-rlptgoD)
2. -v : Verbose output
3. -e ssh : Use ssh for remote shell so everything gets encrypted
4. --exclude='*.out' : exclude files matching PATTERN e.g. *.out or *.c and so on.
Example of rsync command

In this example, copy all files recursively from the ~/virt/ directory but exclude all *.new files:
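Based on the rsync syntax shown above, the command would look something like this (the destination host and path are placeholders):

rsync -av -e ssh --exclude='*.new' ~/virt/ user@hostB:/path/to/dest/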

#### [Jul 29, 2019] Locate Command in Linux

###### Jul 25, 2019 | linuxize.com

... ... ...

The locate command also accepts patterns containing globbing characters such as the wildcard character * . When the pattern contains no globbing characters the command searches for *PATTERN* , that's why in the previous example all files containing the search pattern in their names were displayed.

The wildcard is a symbol used to represent zero, one or more characters. For example, to search for all .md files on the system you would use:

locate *.md

To limit the search results use the -n option followed by the number of results you want to be displayed. For example, the following command will search for all .py files and display only 10 results:

locate -n 10 *.py

By default, locate performs case-sensitive searches. The -i ( --ignore-case ) option tells locate to ignore case and run a case-insensitive search.

locate -i readme.md
/home/linuxize/p1/readme.md
/home/linuxize/p2/README.md
/home/linuxize/p3/ReadMe.md

To display the count of all matching entries, use the -c ( --count ) option. The following command would return the number of all files containing .bashrc in their names:

locate -c .bashrc
6

By default, locate doesn't check whether the found files still exist on the file system. If you deleted a file after the latest database update, it will be included in the search results if it matches the search pattern. To display only the names of the files that exist at the time locate is run, use the -e ( --existing ) option. For example, the following would return only the existing .json files:

locate -e *.json

If you need to run a more complex search you can use the -r ( --regexp ) option, which allows you to search using a basic regexp instead of patterns. This option can be specified multiple times. For example, to search for all .mp4 and .avi files on your system and ignore case you would run:

locate --regex -i "(\.mp4|\.avi)"

#### [Jul 26, 2019] Sort Command in Linux [10 Useful Examples] by Christopher Murray

##### Notable quotes:
##### "... The sort command option "k" specifies a field, not a column. ..."
##### "... In gnu sort, the default field separator is 'blank to non-blank transition' which is a good default to separate columns. ..."
##### "... What is probably missing in that article is a short warning about the effect of the current locale. It is a common mistake to assume that the default behavior is to sort ASCII texts according to the ASCII codes. ..."

###### Jul 12, 2019 | linuxhandbook.com

5. Sort by months [option -M]

Sort also has built-in functionality to arrange by month. It recognizes several formats based on locale-specific information. I tried to demonstrate some unique tests to show that it will arrange by date-day, but not year. Month abbreviations display before full names. Here is the sample text file in this example:

March
Feb
February
April
August
July
June
November
October
December
May
September
1
4
3
6
01/05/19
01/10/19
02/06/18

Let's sort it by months using the -M option:

sort filename.txt -M

Here's the output you'll see:

01/05/19
01/10/19
02/06/18
1
3
4
6
Jan
Feb
February
March
April
May
June
July
August
September
October
November
December

... ... ...

7. Sort Specific Column [option -k]

If you have a table in your file, you can use the -k option to specify which column to sort. I added some arbitrary numbers as a third column and will display the output sorted by each column.
I've included several examples to show the variety of output possible. Options are added following the column number.

1. MX Linux 100
2. Manjaro 400
3. Mint 300
4. elementary 500
5. Ubuntu 200

sort filename.txt -k 2

This will sort the text on the second column in alphabetical order:

4. elementary 500
2. Manjaro 400
3. Mint 300
1. MX Linux 100
5. Ubuntu 200

sort filename.txt -k 3n

This will sort the text by the numerals on the third column.

1. MX Linux 100
5. Ubuntu 200
3. Mint 300
2. Manjaro 400
4. elementary 500

sort filename.txt -k 3nr

Same as the above command, just that the sort order has been reversed.

4. elementary 500
2. Manjaro 400
3. Mint 300
5. Ubuntu 200
1. MX Linux 100

8. Sort and remove duplicates [option -u]

If you have a file with potential duplicates, the -u option will make your life much easier. Remember that sort will not make changes to your original data file. I chose to create a new file with just the items that are duplicates. Below you'll see the input and then the contents of each file after the command is run.

1. MX Linux
2. Manjaro
3. Mint
4. elementary
5. Ubuntu
1. MX Linux
2. Manjaro
3. Mint
4. elementary
5. Ubuntu
1. MX Linux
2. Manjaro
3. Mint
4. elementary
5. Ubuntu

sort filename.txt -u > filename_duplicates.txt

Here's the output file, sorted and without duplicates:

1. MX Linux
2. Manjaro
3. Mint
4. elementary
5. Ubuntu

9. Ignore case while sorting [option -f]

Many modern distros running sort will implement ignore case by default. If yours does not, adding the -f option will produce the expected results.

sort filename.txt -f

Here's the output where cases are ignored by the sort command:

alpha
alPHa
Alpha
ALpha
beta
Beta
BEta
BETA

10. Sort by human numeric values [option -h]

This option allows the comparison of alphanumeric values like 1k (i.e. 1000).

sort filename.txt -h

Here's the sorted output:

10.0
100
1000.0
1k

I hope this tutorial helped you get the basic usage of the sort command in Linux. If you have some cool sort trick, why not share it with us in the comment section?

Christopher works as a Software Developer in Orlando, FL. He loves open source, Taco Bell, and a Chi-weenie named Max. Visit his website for more information or connect with him on social media.

John

The sort command option "k" specifies a field, not a column. In your example all five lines have the same character in column 2 – a "."

Stephane Chauveau

In gnu sort, the default field separator is 'blank to non-blank transition' which is a good default to separate columns. In his example, the "." is part of the first column so it should work fine. If --debug is used then the range of characters used as keys is dumped.

What is probably missing in that article is a short warning about the effect of the current locale. It is a common mistake to assume that the default behavior is to sort ASCII texts according to the ASCII codes. For example, the command printf ".\nx\n0\nX\n@\në\n" | sort produces ". 0 @ X x ë" with LC_ALL=C but ". @ 0 ë x X" with LC_ALL=en_US.UTF-8.

#### [Jul 26, 2019] Cheat.sh Shows Cheat Sheets On The Command Line Or In Your Code Editor

##### The choice of shell as a programming language is strange, but the idea is good...
##### Notable quotes:
##### "... The tool is developed by Igor Chubin, also known for its console-oriented weather forecast service wttr.in , which can be used to retrieve the weather from the console using only cURL or Wget. ..."
###### Jul 26, 2019 | www.linuxuprising.com

While it does have its own cheat sheet repository too, the project is actually concentrated around the creation of a unified mechanism to access well developed and maintained cheat sheet repositories.

The tool is developed by Igor Chubin, also known for its console-oriented weather forecast service wttr.in , which can be used to retrieve the weather from the console using only cURL or Wget.

It's worth noting that cheat.sh is not new. In fact it had its initial commit around May, 2017, and is a very popular repository on GitHub. But I personally only came across it recently, and I found it very useful, so I figured there must be some Linux Uprising readers who are not aware of this cool gem.

cheat.sh features & more

cheat.sh major features:

• Supports 58 programming languages , several DBMSes, and more than 1000 most important UNIX/Linux commands
• Very fast, returns answers within 100ms
• Simple curl / browser interface
• An optional command line client (cht.sh) is available, which allows you to quickly search cheat sheets and easily copy snippets without leaving the terminal
• Can be used from code editors, allowing inserting code snippets without having to open a web browser, search for the code, copy it, then return to your code editor and paste it. It supports Vim, Emacs, Visual Studio Code, Sublime Text and IntelliJ Idea
• Comes with a special stealth mode in which any text you select (adding it into the selection buffer of X Window System or into the clipboard) is used as a search query by cht.sh, so you can get answers without touching any other keys

The command line client features a special shell mode with a persistent queries context and readline support. It also has a query history, it integrates with the clipboard, supports tab completion for shells like Bash, Fish and Zsh, and it includes the stealth mode I mentioned in the cheat.sh features.

The web, curl and cht.sh (command line) interfaces all make use of https://cheat.sh/ but if you prefer, you can self-host it .

It should be noted that each editor plugin supports a different feature set (configurable server, multiple answers, toggle comments, and so on). You can view a feature comparison of each cheat.sh editor plugin on the Editors integration section of the project's GitHub page.

Want to contribute a cheat sheet? See the cheat.sh guide on editing or adding a new cheat sheet.

Interested in bookmarking commands instead? You may want to give Marker, a command bookmark manager for the console , a try.

cheat.sh curl / command line client usage examples

Examples of using cheat.sh using the curl interface (this requires having curl installed as you'd expect) from the command line:

Show the tar command cheat sheet:

curl cheat.sh/tar

Example with output:

$ curl cheat.sh/tar
# To extract an uncompressed archive:
tar -xvf /path/to/foo.tar

# To create an uncompressed archive:
tar -cvf /path/to/foo.tar /path/to/foo/

# To extract a .gz archive:
tar -xzvf /path/to/foo.tgz

# To create a .gz archive:
tar -czvf /path/to/foo.tgz /path/to/foo/

# To list the content of an .gz archive:
tar -ztvf /path/to/foo.tgz

# To extract a .bz2 archive:
tar -xjvf /path/to/foo.tgz

# To create a .bz2 archive:
tar -cjvf /path/to/foo.tgz /path/to/foo/

# To extract a .tar in specified Directory:
tar -xvf /path/to/foo.tar -C /path/to/destination/

# To list the content of an .bz2 archive:
tar -jtvf /path/to/foo.tgz

# To create a .gz archive and exclude all jpg,gif,... from the tgz
tar czvf /path/to/foo.tgz --exclude=\*.{jpg,gif,png,wmv,flv,tar.gz,zip} /path/to/foo/

# To use parallel (multi-threaded) implementation of compression algorithms:
tar -z ... -> tar -Ipigz ...
tar -j ... -> tar -Ipbzip2 ...
tar -J ... -> tar -Ipixz ...


cht.sh also works instead of cheat.sh:
curl cht.sh/tar


Want to search for a keyword in all cheat sheets? Use:
curl cheat.sh/~keyword


List the Python programming language cheat sheet for random list :
curl cht.sh/python/random+list


Example with output:
$ curl cht.sh/python/random+list
#  python - How to randomly select an item from a list?
#
#  Use random.choice
#  (https://docs.python.org/2/library/random.html#random.choice):

import random

foo = ['a', 'b', 'c', 'd', 'e']
print(random.choice(foo))

#  For cryptographically secure random choices (e.g. for generating a
#  passphrase from a wordlist), use random.SystemRandom
#  (https://docs.python.org/2/library/random.html#random.SystemRandom)
#  class:

import random

foo = ['battery', 'correct', 'horse', 'staple']
secure_random = random.SystemRandom()
print(secure_random.choice(foo))

#  [Pēteris Caune] [so/q/306400] [cc by-sa 3.0]

Replace python with some other programming language supported by cheat.sh, and random+list with the cheat sheet you want to show.

Want to eliminate the comments from your answer? Add ?Q at the end of the query (below is an example using the same /python/random+list):

$ curl cht.sh/python/random+list?Q
import random

foo = ['a', 'b', 'c', 'd', 'e']
print(random.choice(foo))

import random

foo = ['battery', 'correct', 'horse', 'staple']
secure_random = random.SystemRandom()
print(secure_random.choice(foo))


For more flexibility and tab completion you can use cht.sh, the command line cheat.sh client; you'll find instructions for how to install it further down this article. Examples of using the cht.sh command line client:

Show the tar command cheat sheet:

cht.sh tar


List the Python programming language cheat sheet for random list :
cht.sh python random list


There is no need to use quotes with multiple keywords.

You can start the cht.sh client in a special shell mode using:

cht.sh --shell


And then you can start typing your queries. Example:
$ cht.sh --shell
cht.sh> bash loop

If all your queries are about the same programming language, you can start the client in the special shell mode, directly in that context. As an example, start it with the Bash context using:

cht.sh --shell bash

Example with output:

$ cht.sh --shell bash
cht.sh/bash> loop
...........
cht.sh/bash> switch case


Want to copy the previously listed answer to the clipboard? Type c , then press Enter to copy the whole answer, or type C and press Enter to copy it without comments.

Type help in the cht.sh interactive shell mode to see all available commands. Also look under the Usage section from the cheat.sh GitHub project page for more options and advanced usage.
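If you mostly use the plain curl interface, a tiny shell wrapper saves some typing. This is only a minimal sketch of my own (the function name cheat is my choice, not part of the project), assuming curl is installed:

cheat() {
    # query syntax is the same as in the curl examples above, e.g. cheat tar
    curl -s "https://cheat.sh/$1"
}

Put it in ~/.bashrc and call it as cheat tar or cheat python/random+list.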

How to install cht.sh command line client
You can use cheat.sh in a web browser, from the command line with the help of curl and without having to install anything else, as explained above, as a code editor plugin, or using its command line client which has some extra features, which I already mentioned. The steps below are for installing this cht.sh command line client.

If you'd rather install a code editor plugin for cheat.sh, see the Editors integration page.

1. Install dependencies.

To install the cht.sh command line client, the curl command line tool will be used, so this needs to be installed on your system. Another dependency is rlwrap , which is required by the cht.sh special shell mode. Install these dependencies as follows.

• Debian, Ubuntu, Linux Mint, Pop!_OS, and any other Linux distribution based on Debian or Ubuntu:
sudo apt install curl rlwrap


• Fedora:
sudo dnf install curl rlwrap


• Arch Linux, Manjaro:
sudo pacman -S curl rlwrap


• openSUSE:
sudo zypper install curl rlwrap


The packages seem to be named the same on most (if not all) Linux distributions, so if your Linux distribution is not on this list, just install the curl and rlwrap packages using your distro's package manager.

You can install this either for your user only (so only you can run it), or for all users:

• Install it for your user only. The command below assumes you have a ~/.bin folder added to your PATH (and the folder exists). If you have some other local folder in your PATH where you want to install cht.sh, change the install path in the commands:
curl https://cht.sh/:cht.sh > ~/.bin/cht.sh

chmod +x ~/.bin/cht.sh


• Install it for all users (globally, in /usr/local/bin ):
curl https://cht.sh/:cht.sh | sudo tee /usr/local/bin/cht.sh

sudo chmod +x /usr/local/bin/cht.sh


If the first command appears to have frozen displaying only the cURL output, press the Enter key and you'll be prompted to enter your password in order to save the file to /usr/local/bin .
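If the pipe-to-sudo-tee behavior is confusing, an equivalent two-step install avoids the password prompt getting mixed into the curl output. A minimal sketch, assuming /usr/local/bin is still the target:

curl -o /tmp/cht.sh https://cht.sh/:cht.sh
sudo install -m 0755 /tmp/cht.sh /usr/local/bin/cht.sh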

You may also download and install the cheat.sh command completion for Bash or Zsh:

• Bash:
mkdir ~/.bash.d

curl https://cheat.sh/:bash_completion > ~/.bash.d/cht.sh

echo ". ~/.bash.d/cht.sh" >> ~/.bashrc


• Zsh:
mkdir ~/.zsh.d

curl https://cheat.sh/:zsh > ~/.zsh.d/_cht

echo 'fpath=(~/.zsh.d/ $fpath)' >> ~/.zshrc

Open a new shell / terminal and it will load the cheat.sh completion.

#### [Jun 23, 2019] Utilizing multi core for tar+gzip-bzip compression-decompression

##### Highly recommended!
##### Notable quotes:
##### "... There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files. ..."
##### "... You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use. ..."

###### Jun 23, 2019 | stackoverflow.com

user1118764 , Sep 7, 2012 at 6:58

I normally compress using tar zcvf and decompress using tar zxvf (using gzip due to habit). I've recently gotten a quad core CPU with hyperthreading, so I have 8 logical cores, and I notice that many of the cores are unused during compression/decompression. Is there any way I can utilize the unused cores to make it faster?

Warren Severin , Nov 13, 2017 at 4:37

The solution proposed by Xiong Chiamiov above works beautifully. I had just backed up my laptop with .tar.bz2 and it took 132 minutes using only one cpu thread. Then I compiled and installed tar from source: gnu.org/software/tar I included the options mentioned in the configure step: ./configure --with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip I ran the backup again and it took only 32 minutes. That's better than 4X improvement! I watched the system monitor and it kept all 4 cpus (8 threads) flatlined at 100% the whole time. THAT is the best solution. – Warren Severin Nov 13 '17 at 4:37

Mark Adler , Sep 7, 2012 at 14:48

You can use pigz instead of gzip, which does gzip compression on multiple cores. Instead of using the -z option, you would pipe it through pigz:

tar cf - paths-to-archive | pigz > archive.tar.gz

By default, pigz uses the number of available cores, or eight if it could not query that. You can ask for more with -p n, e.g. -p 32. pigz has the same options as gzip, so you can request better compression with -9. E.g.

tar cf - paths-to-archive | pigz -9 -p 32 > archive.tar.gz

user788171 , Feb 20, 2013 at 12:43

How do you use pigz to decompress in the same fashion? Or does it only work for compression?

Mark Adler , Feb 20, 2013 at 16:18

pigz does use multiple cores for decompression, but only with limited improvement over a single core. The deflate format does not lend itself to parallel decompression. The decompression portion must be done serially. The other cores for pigz decompression are used for reading, writing, and calculating the CRC. When compressing on the other hand, pigz gets close to a factor of n improvement with n cores.

Garrett , Mar 1, 2014 at 7:26

The hyphen here is stdout (see this page ).

Mark Adler , Jul 2, 2014 at 21:29

Yes. 100% compatible in both directions.

Mark Adler , Apr 23, 2015 at 5:23

There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files.

Jen , Jun 14, 2013 at 14:34

You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use. For example use:

tar -c --use-compress-program=pigz -f tar.file dir_to_zip

Valerio Schiavoni , Aug 5, 2014 at 22:38

Unfortunately by doing so the concurrent feature of pigz is lost. You can see for yourself by executing that command and monitoring the load on each of the cores.
– Valerio Schiavoni Aug 5 '14 at 22:38

bovender , Sep 18, 2015 at 10:14

@ValerioSchiavoni: Not here, I get full load on all 4 cores (Ubuntu 15.04 'Vivid'). – bovender Sep 18 '15 at 10:14

Valerio Schiavoni , Sep 28, 2015 at 23:41

On compress or on decompress ? – Valerio Schiavoni Sep 28 '15 at 23:41

Offenso , Jan 11, 2017 at 17:26

I prefer tar - dir_to_zip | pv | pigz > tar.file pv helps me estimate, you can skip it. But still it easier to write and remember. – Offenso Jan 11 '17 at 17:26

Maxim Suslov , Dec 18, 2014 at 7:31

Common approach

There is an option for the tar program:

-I, --use-compress-program PROG
    filter through PROG (must accept -d)

You can use a multithread version of an archiver or compressor utility. The most popular multithread archivers are pigz (instead of gzip) and pbzip2 (instead of bzip2). For instance:

$ tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 paths_to_archive
$ tar --use-compress-program=pigz -cf OUTPUT_FILE.tar.gz paths_to_archive

The archiver must accept -d. If your replacement utility doesn't have this parameter and/or you need to specify additional parameters, then use pipes (add parameters if necessary):

$ tar cf - paths_to_archive | pbzip2 > OUTPUT_FILE.tar.bz2
$ tar cf - paths_to_archive | pigz > OUTPUT_FILE.tar.gz

Input and output of the singlethread and multithread versions are compatible. You can compress using the multithread version and decompress using the singlethread version and vice versa.

p7zip

For p7zip compression you need a small shell script like the following:

#!/bin/sh
case $1 in
-d) 7za -txz -si -so e;;
*) 7za -txz -si -so a .;;
esac 2>/dev/null


Save it as 7zhelper.sh. Here is an example of usage:

$ tar -I 7zhelper.sh -cf OUTPUT_FILE.tar.7z paths_to_archive
$ tar -I 7zhelper.sh -xf OUTPUT_FILE.tar.7z

xz

Regarding multithreaded XZ support. If you are running version 5.2.0 or above of XZ Utils, you can utilize multiple cores for compression by setting -T or --threads to an appropriate value via the environmental variable XZ_DEFAULTS (e.g. XZ_DEFAULTS="-T 0" ).
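As a quick illustration of that approach (a sketch assuming xz 5.2.0 or newer, so that threading is actually available):

XZ_DEFAULTS="-T 0" tar -cJf OUTPUT_FILE.tar.xz paths_to_archive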

This is a fragment of man for 5.1.0alpha version:

Multithreaded compression and decompression are not implemented yet, so this option has no effect for now.

However this will not work for decompression of files that haven't also been compressed with threading enabled. From man for version 5.2.2:

Threaded decompression hasn't been implemented yet. It will only work on files that contain multiple blocks with size information in block headers. All files compressed in multi-threaded mode meet this condition, but files compressed in single-threaded mode don't even if --block-size=size is used.

Recompiling with replacement

If you build tar from sources, then you can recompile with parameters

--with-gzip=pigz
--with-bzip2=lbzip2
--with-lzip=plzip


After recompiling tar with these options you can check the output of tar's help:

$ tar --help | grep "lbzip2\|plzip\|pigz"
  -j, --bzip2                      filter the archive through lbzip2
      --lzip                       filter the archive through plzip
  -z, --gzip, --gunzip, --ungzip   filter the archive through pigz

mpibzip2 , Apr 28, 2015 at 20:57

I just found pbzip2 and mpibzip2 . mpibzip2 looks very promising for clusters or if you have a laptop and a multicore desktop computer for instance. – user1985657 Apr 28 '15 at 20:57

oᴉɹǝɥɔ , Jun 10, 2015 at 17:39

Processing STDIN may in fact be slower. – oᴉɹǝɥɔ Jun 10 '15 at 17:39

selurvedu , May 26, 2016 at 22:13

Plus 1 for the xz option. It's the simplest, yet effective approach. – selurvedu May 26 '16 at 22:13

panticz.de , Sep 1, 2014 at 15:02

You can use the shortcut -I for tar's --use-compress-program switch, and invoke pbzip2 for bzip2 compression on multiple cores:

tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 DIRECTORY_TO_COMPRESS/

einpoklum , Feb 11, 2017 at 15:59

A nice TL;DR for @MaximSuslov's answer . – einpoklum Feb 11 '17 at 15:59

If you want to have more flexibility with filenames and compression options, you can use:

find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec \
tar -P --transform='s@/my/path/@@g' -cf - {} + | \
pigz -9 -p 4 > myarchive.tar.gz

Step 1: find

find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec

This command will look for the files you want to archive, in this case /my/path/*.sql and /my/path/*.log . Add as many -o -name "pattern" as you want. -exec will execute the next command using the results of find : tar

Step 2: tar

tar -P --transform='s@/my/path/@@g' -cf - {} +

--transform is a simple string replacement parameter. It will strip the path of the files from the archive so the tarball's root becomes the current directory when extracting. Note that you can't use the -C option to change directory as you'll lose the benefits of find : all files of the directory would be included. -P tells tar to use absolute paths, so it doesn't trigger the warning "Removing leading /' from member names". The leading '/' will be removed by --transform anyway. -cf - tells tar to use the tarball name we'll specify later. {} + uses every file that find found previously.

Step 3: pigz

pigz -9 -p 4

Use as many parameters as you want. In this case -9 is the compression level and -p 4 is the number of cores dedicated to compression. If you run this on a heavily loaded webserver, you probably don't want to use all available cores.

Step 4: archive name

> myarchive.tar.gz

Finally.

#### [Jun 23, 2019] Test with rsync between two partitions

###### Jun 23, 2019 | www.fsarchiver.org

An important test is done using rsync. It requires two partitions: the original one, and a spare partition where to restore the archive. It allows you to know whether or not there are differences between the original and the restored filesystem. rsync is able to compare both the file contents and file attributes (timestamps, permissions, owner, extended attributes, acl, ...), so that's a very good test. The following command can be used to know whether or not files are the same (data and attributes) on two file-systems:

rsync -axHAXnP /mnt/part1/ /mnt/part2/

#### [Jun 22, 2019] Using SSH and Tmux for screen sharing by Seth Kenlon

###### Jun 22, 2019 | www.redhat.com

Tmux is a screen multiplexer, meaning that it provides your terminal with virtual terminals, allowing you to switch from one virtual session to another. Modern terminal emulators feature a tabbed UI, making the use of Tmux seem redundant, but Tmux has a few peculiar features that still prove difficult to match without it.
First of all, you can launch Tmux on a remote machine, start a process running, detach from Tmux, and then log out. In a normal terminal, logging out would end the processes you started. Since those processes were started in Tmux, they persist even after you leave.

Secondly, Tmux can "mirror" its session on multiple screens. If two users log into the same Tmux session, then they both see the same output on their screens in real time.

Tmux is a lightweight, simple, and effective solution in cases where you're training someone remotely, debugging a command that isn't working for them, reviewing text, monitoring services or processes, or just avoiding the ten minutes it sometimes takes to read commands aloud over a phone clearly enough that your user is able to accurately type them.

To try this option out, you must have two computers. Assume one computer is owned by Alice, and the other by Bob. Alice remotely logs into Bob's PC and launches a Tmux session:

alice$ ssh bob.local
alice$tmux On his PC, Bob starts Tmux, attaching to the same session: bob$ tmux attach

When Alice types, Bob sees what she is typing, and when Bob types, Alice sees what he's typing.

It's a simple but effective trick that enables interactive live sessions between computer users, but it is entirely text-based.
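The same trick works with an explicitly named session, which is a bit easier to manage when several sessions exist. A minimal sketch (the session name shared is arbitrary):

alice$ tmux new -s shared
bob$ tmux attach -t shared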

Collaboration

With these two applications, you have access to some powerful methods of supporting users. You can use these tools to manage systems remotely, as training tools, or as support tools, and in every case, it sure beats wandering around the office looking for somebody's desk. Get familiar with SSH and Tmux, and start using them today.

#### [Jun 10, 2019] Screen Command Examples To Manage Multiple Terminal Sessions

###### Jun 10, 2019 | www.ostechnix.com

Screen Command Examples To Manage Multiple Terminal Sessions

by sk · Published June 6, 2019 · Updated June 7, 2019

GNU Screen is a terminal multiplexer (window manager). As the name says, Screen multiplexes the physical terminal between multiple interactive shells, so we can perform different tasks in each terminal session. All screen sessions run their programs completely independently. So, a program or process running inside a screen session will keep running even if the session is accidentally closed or disconnected. For instance, when upgrading an Ubuntu server via SSH, the Screen command will keep the upgrade process running in case your SSH session is terminated for any reason.

GNU Screen allows us to easily create multiple screen sessions, switch between different sessions, copy text between sessions, attach to or detach from a session at any time, and so on. It is one of the important command line tools every Linux admin should learn and use wherever necessary. In this brief guide, we will see the basic usage of the Screen command with examples in Linux.

Installing GNU Screen

GNU Screen is available in the default repositories of most Linux operating systems.

To install GNU Screen on Arch Linux, run:

$ sudo pacman -S screen

On Debian, Ubuntu, Linux Mint:

$ sudo apt-get install screen


On Fedora:

$ sudo dnf install screen

On RHEL, CentOS:

$ sudo yum install screen


On SUSE/openSUSE:

$ sudo zypper install screen

Let us go ahead and see some screen command examples.

Screen Command Examples To Manage Multiple Terminal Sessions

The default prefix shortcut to all commands in Screen is Ctrl+a . You need to use this shortcut a lot when using Screen. So, just remember this keyboard shortcut.

Create new Screen session

Let us create a new Screen session and attach to it. To do so, type the following command in the terminal:

screen

Now, run any program or process inside this session. The running process or program will keep running even if you're disconnected from this session.

Detach from Screen sessions

To detach from inside a screen session, press Ctrl+a and d . You don't have to press both key combinations at the same time. First press Ctrl+a and then press d . After detaching from a session, you will see output something like below.

[detached from 29149.pts-0.sk]

Here, 29149 is the screen ID and pts-0.sk is the name of the screen session. You can attach, detach and kill Screen sessions using either the screen ID or the name of the respective session.

Create a named session

You can also create a screen session with any custom name of your choice other than the default username, like below.

screen -S ostechnix

The above command will create a new screen session with the name "xxxxx.ostechnix" and attach to it immediately. To detach from the current session, press Ctrl+a followed by d .

Naming screen sessions can be helpful when you want to find which processes are running on which sessions. For example, when you set up a LAMP stack inside a session, you can simply name it like below.

screen -S lampstack

Create detached sessions

Sometimes, you might want to create a session, but don't want to attach to it automatically. In such cases, run the following command to create a detached session named "senthil" :

screen -S senthil -d -m

Or, shortly:

screen -dmS senthil

The above command will create a session called "senthil", but won't attach to it.

List Screen sessions

To list all running sessions (attached or detached), run:

screen -ls

Sample output:

There are screens on:
        29700.senthil   (Detached)
        29415.ostechnix (Detached)
        29149.pts-0.sk  (Detached)
3 Sockets in /run/screens/S-sk.

As you can see, I have three running sessions and all are detached.

Attach to Screen sessions

If you want to attach to a session at any time, for example 29415.ostechnix , simply run:

screen -r 29415.ostechnix

Or,

screen -r ostechnix

Or, just use the screen ID:

screen -r 29415

To verify if we are attached to the aforementioned session, simply list the open sessions and check.

screen -ls

Sample output:

There are screens on:
        29700.senthil   (Detached)
        29415.ostechnix (Attached)
        29149.pts-0.sk  (Detached)
3 Sockets in /run/screens/S-sk.

As you see in the above output, we are currently attached to the 29415.ostechnix session. To exit from the current session, press Ctrl+a, d.

Create nested sessions

When we run the "screen" command, it will create a single session for us. We can, however, create nested sessions (a session inside a session).

First, create a new session or attach to an opened session. I am going to create a new session named "nested".

screen -S nested

Now, press Ctrl+a and c inside the session to create another session. Just repeat this to create any number of nested Screen sessions. Each session will be assigned a number. The number will start from 0 .

You can move to the next session by pressing Ctrl+a n and move to the previous one by pressing Ctrl+a p .
Here is the list of important keyboard shortcuts to manage nested sessions.

• Ctrl+a " – List all sessions
• Ctrl+a 0 – Switch to session number 0
• Ctrl+a n – Switch to the next session
• Ctrl+a p – Switch to the previous session
• Ctrl+a S – Split current region horizontally into two regions
• Ctrl+a | – Split current region vertically into two regions
• Ctrl+a Q – Close all sessions except the current one
• Ctrl+a X – Close the current session
• Ctrl+a \ – Kill all sessions and terminate Screen
• Ctrl+a ? – Show keybindings. To quit this, press ENTER.

Lock sessions

Screen has an option to lock a screen session. To do so, press Ctrl+a and x . Enter your Linux password to lock the screen.

Screen used by sk <sk> on ubuntuserver.
Password:

Logging sessions

You might want to log everything when you're in a Screen session. To do so, just press Ctrl+a and H . Alternatively, you can enable logging when starting a new session using the -L parameter.

screen -L

From now on, all activities you've done inside the session will be recorded and stored in a file named screenlog.x in your $HOME directory. Here, x is a number.

You can view the contents of the log file using the cat command or any text viewer application.
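The options above combine nicely. A hedged example that starts a detached, named session for a long-running job and checks on it later (the backup script name is hypothetical):

screen -dmS backup ./nightly_backup.sh
screen -ls          # confirm the session is there
screen -r backup    # re-attach to watch progress; Ctrl+a d detaches again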

Log screen sessions

Kill Screen sessions

If a session is not required anymore, just kill it. To kill a detached session named "senthil":

screen -r senthil -X quit

Or,

screen -X -S senthil quit

Or,

screen -X -S 29415 quit

If there are no open sessions, you will see the following output:

$ screen -ls
No Sockets found in /run/screens/S-sk.

For more details, refer to the man pages:

$ man screen

There is also a similar command line utility named "Tmux" which does the same job as GNU Screen. To know more about it, refer to the following guide.

Resource:

#### [Mar 13, 2019] Getting started with the cat command by Alan Formy-Duval

###### Mar 13, 2019 | opensource.com

Cat can also number a file's lines during output. There are two options to do this, as shown in the help documentation:

-b, --number-nonblank    number nonempty output lines, overrides -n
-n, --number             number all output lines

If I use the -b command with the hello.world file, the output will be numbered like this:

$ cat -b hello.world
     1  Hello World !

In the example above, there is an empty line. We can determine why this empty line appears by using the -n argument:

$ cat -n hello.world
1 Hello World !
2
$

Now we see that there is an extra empty line. These two arguments are operating on the final output rather than the file contents, so if we were to use the -n option with both files, numbering will count lines as follows:

$ cat -n hello.world goodbye.world
1 Hello World !
2
3 Good Bye World !
4
$

One other option that can be useful is -s for squeeze-blank . This argument tells cat to reduce repeated empty line output down to one line. This is helpful when reviewing files that have a lot of empty lines, because it effectively fits more text on the screen. Suppose I have a file with three lines that are spaced apart by several empty lines, such as in this example, greetings.world :

$ cat greetings.world
Greetings World !

We Come in Peace !
$

Using the -s option saves screen space:

$ cat -s greetings.world

Cat is often used to copy contents of one file to another file. You may be asking, "Why not just use cp ?" Here is how I could create a new file, called both.files , that contains the contents of the hello and goodbye files:

$ cat hello.world goodbye.world > both.files
$ cat both.files
Hello World !
Good Bye World !
$

zcat

There is another variation on the cat command known as zcat . This command is capable of displaying files that have been compressed with Gzip without needing to uncompress the files with the gunzip command. As an aside, this also preserves disk space, which is the entire reason files are compressed!

The zcat command is a bit more exciting because it can be a huge time saver for system administrators who spend a lot of time reviewing system log files. Where can we find compressed log files? Take a look at /var/log on most Linux systems. On my system, /var/log contains several files, such as syslog.2.gz and syslog.3.gz . These files are the result of the log management system, which rotates and compresses log files to save disk space and prevent logs from growing to unmanageable file sizes. Without zcat , I would have to uncompress these files with the gunzip command before viewing them. Thankfully, I can use zcat :

$ cd /var/log
$ ls *.gz
syslog.2.gz  syslog.3.gz
$
$ zcat syslog.2.gz | more
Jan 30 00:02:26 workstation systemd[1850]: Starting GNOME Terminal Server...
Jan 30 00:02:26 workstation dbus-daemon[1920]: [session uid=2112 pid=1920] Successfully activated service 'org.gnome.Terminal'
Jan 30 00:02:26 workstation systemd[1850]: Started GNOME Terminal Server.
Jan 30 00:02:26 workstation org.gnome.Terminal.desktop[2059]: # watch_fast: "/org/gnome/terminal/legacy/" (establishing: 0, active: 0)
Jan 30 00:02:26 workstation org.gnome.Terminal.desktop[2059]: # unwatch_fast: "/org/gnome/terminal/legacy/" (active: 0, establishing: 1)
Jan 30 00:02:26 workstation org.gnome.Terminal.desktop[2059]: # watch_established: "/org/gnome/terminal/legacy/" (establishing: 0)
--More--

We can also pass both files to zcat if we want to review both of them uninterrupted. Due to how log rotation works, you need to pass the filenames in reverse order to preserve the chronological order of the log contents:

$ ls -l *.gz
-rw-r----- 1 syslog adm 196383 Jan 31 00:00 syslog.2.gz
-rw-r----- 1 syslog adm 1137176 Jan 30 00:00 syslog.3.gz
$ zcat syslog.3.gz syslog.2.gz | more

The cat command seems simple but is very useful. I use it regularly. You also don't need to feed or pet it like a real cat. As always, I suggest you review the man pages ( man cat ) for the cat and zcat commands to learn more about how it can be used. You can also use the --help argument for a quick synopsis of command line arguments.

Victorhck on 13 Feb 2019 Permalink

and there's also a "tac" command, that is just a "cat" upside down! Following your example:

tac both.files
Good Bye World!
Hello World!

Happy hacking! :)

Johan Godfried on 26 Feb 2019 Permalink

Interesting article but please don't misuse cat to pipe to more...... I am trying to teach people to use less pipes and here you go abusing cat to pipe to other commands. IMHO, 99.9% of the time this is not necessary! Instead of "cat file | command", most of the time you can use "command file" (yes, I am an old dinosaur from a time where memory was very expensive and forking multiple commands could fill it all up)

Uri Ran on 03 Mar 2019 Permalink

Run cat then press keys to see the codes your shortcut sends. (Press Ctrl+C to kill the cat when you're done.) For example, on my Mac, the key combination option-leftarrow is ^[^[[D and command-downarrow is ^[[B. I learned it from https://stackoverflow.com/users/787216/lolesque in his answer to https://stackoverflow.com/questions/12382499/looking-for-altleftarrowkey...

Geordie on 04 Mar 2019 Permalink

cat is also useful to make (or append to) text files without an editor:

$ cat >> foo << "EOF"
> Hello World
> Another Line
> EOF
$

#### [Mar 01, 2019] Emergency reboot/shutdown using SysRq by Ilija Matoski

###### peakoilbarrel.com

As you know, Linux implements a mechanism to gracefully shut down and reboot: the daemons are stopped, usually one by one, and the file cache is synced to disk. But what sometimes happens is that the system will not reboot or shut down no matter how many times you issue the shutdown or reboot command.

If the server is close to you, you can always just do a physical reset, but what if it's far away from you, where you can't reach it? Sometimes that's not feasible, and what if the OpenSSH server crashes and you cannot log in to the system again?

If you ever find yourself in a situation like that, there is another option to force the system to reboot or shutdown.

The magic SysRq key is a key combination understood by the Linux kernel, which allows the user to perform various low-level commands regardless of the system's state. It is often used to recover from freezes, or to reboot a computer without corrupting the filesystem.

Key (QWERTY)   Description
b              Immediately reboot the system, without unmounting or syncing filesystems
s              Sync all mounted filesystems
o              Shut off the system
i              Send the SIGKILL signal to all processes except init

So if you are in a situation where you cannot reboot or shutdown the server, you can force an immediate reboot by issuing

echo 1 > /proc/sys/kernel/sysrq
echo b > /proc/sysrq-trigger

If you want, you can also force a sync before rebooting by issuing these commands

echo 1 > /proc/sys/kernel/sysrq
echo s > /proc/sysrq-trigger
echo b > /proc/sysrq-trigger

These are called magic commands , and they're pretty much synonymous with holding down Alt-SysRq and another key on older keyboards. Dropping 1 into /proc/sys/kernel/sysrq tells the kernel that you want to enable SysRq access (it's usually disabled). The second command is equivalent to pressing Alt-SysRq-b on a QWERTY keyboard.
If you want to keep SysRq enabled all the time, you can do that with an entry in your server's sysctl.conf:

echo "kernel.sysrq = 1" >> /etc/sysctl.conf

#### [Feb 11, 2019] Resuming rsync on an interrupted transfer

###### May 15, 2013 | stackoverflow.com

Glitches , May 15, 2013 at 18:06

I am trying to back up my file server to a remote file server using rsync. Rsync is not successfully resuming when a transfer is interrupted. I used the partial option but rsync doesn't find the file it already started because it renames it to a temporary file and, when resumed, it creates a new file and starts from the beginning. Here is my command:

rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"

When this command is run, a backup file named OldDisk.dmg from my local machine gets created on the remote machine as something like .OldDisk.dmg.SjDndj23 .

Now when the internet connection gets interrupted and I have to resume the transfer, I have to find where rsync left off by finding the temp file like .OldDisk.dmg.SjDndj23 and rename it to OldDisk.dmg so that it sees there already exists a file that it can resume.

How do I fix this so I don't have to manually intervene each time?

Richard Michael , Nov 6, 2013 at 4:26

TL;DR : Use --timeout=X (X in seconds) to change the default rsync server timeout, not --inplace .

The issue is the rsync server processes (of which there are two, see rsync --server ... in ps output on the receiver) continue running, to wait for the rsync client to send data.

If the rsync server processes do not receive data for a sufficient time, they will indeed time out, self-terminate and clean up by moving the temporary file to its "proper" name (e.g., no temporary suffix). You'll then be able to resume.

If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place; but rather, delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not SIGKILL (e.g., -9 ), but SIGTERM (e.g., pkill -TERM -x rsync - only an example, you should take care to match only the rsync processes concerned with your client).

Fortunately there is an easier way: use the --timeout=X (X in seconds) option; it is passed to the rsync server processes as well.

For example, if you specify rsync ... --timeout=15 ... , both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming.

I'm not sure of the default timeout value of the various rsync processes will try to send/receive data before they die (it might vary with operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. Meaning, you could try to "ride out" the bad internet connection for 15-20 seconds.

If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process).
Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process).

Interestingly, if you then clean up all rsync server processes (for example, stop your client which will stop the new rsync servers, then SIGTERM the older rsync servers), it appears to merge (assemble) all the partial files into the new proper named file. So, imagine a long running partial copy which dies (and you think you've "lost" all the copied data), and a short running re-launched rsync (oops!).. you can stop the second client, SIGTERM the first servers, it will merge the data, and you can resume.

Finally, a few short remarks:

• Don't use --inplace to work around this. You will undoubtedly have other problems as a result, man rsync for the details.
• It's trivial, but -t in your rsync options is redundant, it is implied by -a .
• An already compressed disk image sent over rsync without compression might result in shorter transfer time (by avoiding double compression). However, I'm unsure of the compression techniques in both cases. I'd test it.
• As far as I understand --checksum / -c , it won't help you in this case. It affects how rsync decides if it should transfer a file. Though, after a first rsync completes, you could run a second rsync with -c to insist on checksums, to prevent the strange case that file size and modtime are the same on both sides, but bad data was written.

JamesTheAwesomeDude , Dec 29, 2013 at 16:50

Just curious: wouldn't SIGINT (aka ^C ) be 'politer' than SIGTERM ? – JamesTheAwesomeDude Dec 29 '13 at 16:50

Richard Michael , Dec 29, 2013 at 22:34

I didn't test how the server-side rsync handles SIGINT, so I'm not sure it will keep the partial file - you could check. Note that this doesn't have much to do with Ctrl-c ; it happens that your terminal sends SIGINT to the foreground process when you press Ctrl-c , but the server-side rsync has no controlling terminal. You must log in to the server and use kill . The client-side rsync will not send a message to the server (for example, after the client receives SIGINT via your terminal Ctrl-c ) - might be interesting though. As for anthropomorphizing, not sure what's "politer". :-) – Richard Michael Dec 29 '13 at 22:34

d-b , Feb 3, 2015 at 8:48

I just tried this timeout argument rsync -av --delete --progress --stats --human-readable --checksum --timeout=60 --partial-dir /tmp/rsync/ rsync://$remote:/ /src/ but then it timed out during the "receiving file list" phase (which in this case takes around 30 minutes). Setting the timeout to half an hour so kind of defers the purpose. Any workaround for this? – d-b Feb 3 '15 at 8:48

Cees Timmerman , Sep 15, 2015 at 17:10

@user23122 --checksum reads all data when preparing the file list, which is great for many small files that change often, but should be done on-demand for large files. – Cees Timmerman Sep 15 '15 at 17:10
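Building on the comments above, rsync's --partial-dir option keeps partial files in a predictable location instead of hidden temporary names, which makes resuming easier to reason about. A hedged variant of the original command (directory name and timeout value are illustrative; excludes omitted for brevity):

rsync -avzP --partial-dir=.rsync-partial --timeout=60 -e "ssh -p 2222" /volume1/ myaccount@backup-server-1:/home/myaccount/backup/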

#### [Feb 11, 2019] prsync command man page - pssh

##### Originally from Brent N. Chun ~ Intel Research Berkeley
###### Feb 11, 2019 | www.mankier.com

prsync -- parallel file sync program

Synopsis

prsync [ - v A r a z ] [ -h hosts_file ] [ -H [ user @] host [: port ]] [ -l user ] [ -p par ] [ -o outdir ] [ -e errdir ] [ -t timeout ] [ -O options ] [ -x args ] [ -X arg ] [ -S args ] local ... remote

Description

prsync is a program for copying files in parallel to a number of hosts using the popular rsync program. It provides features such as passing a password to ssh, saving output to files, and timing out.

Options
-h host_file
--hosts host_file
Read hosts from the given host_file . Lines in the host file are of the form [ user @] host [: port ] and can include blank lines and comments (lines beginning with "#"). If multiple host files are given (the -h option is used more than once), then prsync behaves as though these files were concatenated together. If a host is specified multiple times, then prsync will connect the given number of times.
-H
[ user @] host [: port ]
--host
[ user @] host [: port ]
-H
"[ user @] host [: port ] [ [ user @] host [: port ] ... ]"
--host
"[ user @] host [: port ] [ [ user @] host [: port ] ... ]"

Add the given host strings to the list of hosts. This option may be given multiple times, and may be used in conjunction with the -h option.

-l user
--user user
Use the given username as the default for any host entries that don't specifically specify a user.
-p parallelism
--par parallelism
Use the given number as the maximum number of concurrent connections.
-t timeout
--timeout timeout
Make connections time out after the given number of seconds. With a value of 0, prsync will not timeout any connections.
-o outdir
--outdir outdir
Save standard output to files in the given directory. Filenames are of the form [ user @] host [: port ][. num ] where the user and port are only included for hosts that explicitly specify them. The number is a counter that is incremented each time for hosts that are specified more than once.
-e errdir
--errdir errdir
Save standard error to files in the given directory. Filenames are of the same form as with the -o option.
-x args
--extra-args args
Passes extra rsync command-line arguments (see the rsync(1) man page for more information about rsync arguments). This option may be specified multiple times. The arguments are processed to split on whitespace, protect text within quotes, and escape with backslashes. To pass arguments without such processing, use the -X option instead.
-X arg
--extra-arg arg
Passes a single rsync command-line argument (see the rsync(1) man page for more information about rsync arguments). Unlike the -x option, no processing is performed on the argument, including word splitting. To pass multiple command-line arguments, use the option once for each argument.
-O options
--options options
SSH options in the format used in the SSH configuration file (see the ssh_config(5) man page for more information). This option may be specified multiple times.
-A
Prompt for a password and pass it to ssh. The password may be used for either to unlock a key or for password authentication. The password is transferred in a fairly secure manner (e.g., it will not show up in argument lists). However, be aware that a root user on your system could potentially intercept the password.
-v
--verbose
Include error messages from rsync with the -i and \ options.
-r
--recursive
Recursively copy directories.
-a
--archive
Use rsync archive mode (rsync's -a option).
-z
--compress
Use rsync compression.
-S args
--ssh-args args
Passes extra SSH command-line arguments (see the ssh(1) man page for more information about SSH arguments). The given value is appended to the ssh command (rsync's -e option) without any processing.
Tips

The ssh_config file can include an arbitrary number of Host sections. Each host entry specifies ssh options which apply only to the given host. Host definitions can even behave like aliases if the HostName option is included. This ssh feature, in combination with pssh host files, provides a tremendous amount of flexibility.
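A hedged example invocation combining several of the options above (the host file name, user, and paths are made up for illustration):

prsync -h hosts.txt -l deploy -p 16 -t 300 -r -a -z -o /tmp/prsync/out -e /tmp/prsync/err ./site/ /var/www/site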

Exit Status

The exit status codes from prsync are as follows:

0
Success
1
Miscellaneous error
2
Syntax or usage error
3
At least one process was killed by a signal or timed out.
4
All processes completed, but at least one rsync process reported an error (exit status other than 0).
Authors

Written by Brent N. Chun <bnc@theether.org> and Andrew McNabb <amcnabb@mcnabbs.org>.

https://github.com/lilydjwg/pssh

rsync(1) , ssh(1) , ssh_config(5) , pssh(1) , prsync (1), pslurp(1) , pnuke(1) ,

Referenced By

#### [Jan 29, 2019] Backing things up with rsync

##### "... rsync with ssh as the transport mechanism works very well with my nightly LAN backups. I've found this page to be very helpful: http://www.mikerubel.org/computers/rsync_snapshots/ ..."
###### Jul 20, 2017 | www.linuxjournal.com

Anonymous on Fri, 11/08/2002 - 03:00.

The Subject, not the content, really brings back memories.

Imagine this: you're tasked with complete control over the network in a multi-million dollar company. You've had some experience in the real world of network maintenance, but mostly you've learned from breaking things at home.

Time comes to implement (yes, this was a startup company) a backup routine. You carefully consider the best way to do it and decide that copying data to a holding disk before the tape run would be perfect in the situation: faster restores if the holding disk is still alive.

So off you go configuring all your servers for ssh pass through, and create the rsync scripts. Then before the trial run you think it would be a good idea to create a local backup of all the websites.

You log on to the web server, create a temp directory and start testing your newly advanced rsync skills. After a couple of goes, you think you're ready for the real thing, but you decide to run the test one more time.

Everything seems fine so you delete the temp directory. You pause for a second and your mouth drops open wider than it has ever opened before, and a feeling of terror overcomes you. You want to hide in a hole and hope you didn't see what you saw.

I RECURSIVELY DELETED ALL THE LIVE CORPORATE WEBSITES ON FRIDAY AFTERNOON AT 4PM!

Anonymous on Sun, 11/10/2002 - 03:00.

This is why it's ALWAYS A GOOD IDEA to use Midnight Commander or something similar to delete directories!!

...Root for (5) years and never trashed a filesystem yet (knockwoody)...

Anonymous on Fri, 11/08/2002 - 03:00.

rsync with ssh as the transport mechanism works very well with my nightly LAN backups. I've found this page to be very helpful: http://www.mikerubel.org/computers/rsync_snapshots/

#### [Dec 05, 2018] How can I scroll up to see the past output in PuTTY?

###### Dec 05, 2018 | superuser.com

user1721949 ,Dec 12, 2012 at 8:32

I have a script which, when I run it from PuTTY, it scrolls the screen. Now, I want to go back to see the errors, but when I scroll up, I can see the past commands, but not the output of the command.

How can I see the past output?

Rico ,Dec 13, 2012 at 8:24

Shift+Pgup/PgDn should work for scrolling without using the scrollbar.

> ,Jul 12, 2017 at 21:45

If shift pageup/pagedown fails, try this command: "reset", which seems to correct the display. – user530079 Jul 12 '17 at 21:45

RedGrittyBrick ,Dec 12, 2012 at 9:31

If you don't pipe the output of your commands into something like less , you will be able to use Putty's scroll-bars to view earlier output.

Putty has settings for how many lines of past output it retains in its buffer.

before scrolling

after scrolling back (upwards)

If you use something like less the output doesn't get into Putty's scroll buffer

after using less

David Dai ,Dec 14, 2012 at 3:31

why is putty different with the native linux console at this point? – David Dai Dec 14 '12 at 3:31

konradstrack ,Dec 12, 2012 at 9:52

I would recommend using screen if you want to have good control over the scroll buffer on a remote shell.

You can change the scroll buffer size to suit your needs by setting:

defscrollback 4000


in ~/.screenrc , which will specify the number of lines you want to be buffered (4000 in this case).

Then you should run your script in a screen session, e.g. by executing screen ./myscript.sh or first executing screen and then ./myscript.sh inside the session.

It's also possible to enable logging of the console output to a file. You can find more info on the screen's man page .
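For example, a minimal sketch of the logging approach: start a logged session, run the script inside it, and read the log afterwards (screen writes it as screenlog.0 in the directory where the session was started):

screen -L
./myscript.sh
# later, after detaching or exiting:
less screenlog.0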


From your description, it sounds like the "problem" is that you are using screen, tmux, or another window manager dependent on them (byobu). Normally you should be able to scroll back in putty with no issue. Exceptions include if you are in an application like less or nano that creates its own "window" on the terminal.

With screen and tmux you can generally scroll back with SHIFT + PGUP (same as you could from the physical terminal of the remote machine). They also both have a "copy" mode that frees the cursor from the prompt and lets you use arrow keys to move it around (for selecting text to copy with just the keyboard). It also lets you scroll up and down with the PGUP and PGDN keys. Copy mode under byobu using screen or tmux backends is accessed by pressing F7 (careful, F6 disconnects the session). To do so directly under screen you press CTRL + a then ESC or [ . You can use ESC to exit copy mode. Under tmux you press CTRL + b then [ to enter copy mode and ] to exit.

The simplest solution, of course, is not to use either. I've found both to be quite a bit more trouble than they are worth. If you would like to use multiple different terminals on a remote machine simply connect with multiple instances of putty and manage your windows using, er... Windows. Now forgive me but I must flee before I am burned at the stake for my heresy.

EDIT: almost forgot, some keys may not be received correctly by the remote terminal if putty has not been configured correctly. In your putty config check Terminal -> Keyboard . You probably want the function keys and keypad set to be either Linux or Xterm R6 . If you are seeing strange characters on the terminal when attempting the above this is most likely the problem.

#### [Nov 21, 2018] Linux Shutdown Command 5 Practical Examples Linux Handbook

###### Nov 21, 2018 | linuxhandbook.com

Restart the system with shutdown command

There is a separate reboot command but you don't need to learn a new command just for rebooting the system. You can use the Linux shutdown command for rebooting as well.

To reboot a system using the shutdown command, use the -r option.

sudo shutdown -r


The behavior is the same as the regular shutdown command. It's just that instead of a shutdown, the system will be restarted.

So, if you used shutdown -r without any time argument, it will schedule a reboot after one minute.

You can schedule reboots the same way you did with shutdown.

sudo shutdown -r +30


You can also reboot the system immediately with shutdown command:

sudo shutdown -r now


If you are in a multi-user environment and there are several users logged on the system, you can send them a custom broadcast message with the shutdown command.

By default, all the logged users will receive a notification about scheduled shutdown and its time. You can customize the broadcast message in the shutdown command itself:

sudo shutdown 16:00 "systems will be shutdown for hardware upgrade, please save your work"


Fun Stuff: You can use the shutdown command with the -k option to initiate a 'fake shutdown'. It won't shut down the system, but the broadcast message will be sent to all logged on users.

5. Cancel a scheduled shutdown

If you scheduled a shutdown, you don't have to live with it. You can always cancel a shutdown with option -c.

sudo shutdown -c


And if you had broadcast a message about the scheduled shutdown, as a good sysadmin, you might also want to notify other users about cancelling the scheduled shutdown.

sudo shutdown -c "planned shutdown has been cancelled"


Halt vs Power off

Halt (option -H): terminates all processes and shuts down the CPU.
Power off (option -P): Pretty much like halt but it also turns off the unit itself (lights and everything on the system).
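For reference, a quick sketch of both variants using the shutdown command itself (assuming the systemd implementation of shutdown):

sudo shutdown -H now    # halt: stop the OS but leave the power on
sudo shutdown -P now    # power off: halt and then cut power to the machine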

Historically, the earlier computers used to halt the system and then print a message like "it's ok to power off now" and then the computers were turned off through physical switches.

These days, halt should automatically power off the system thanks to ACPI .

These were the most common and the most useful examples of the Linux shutdown command. I hope you have learned how to shut down a Linux system via command line. You might also like reading about the less command usage or browse through the list of Linux commands we have covered so far.

If you have any questions or suggestions, feel free to let me know in the comment section.

#### [Nov 15, 2018] Is Glark a Better Grep Linux.com The source for Linux information

##### "... stringfilenames ..."
###### Nov 15, 2018 | www.linux.com

Is Glark a Better Grep? GNU grep is one of my go-to tools on any Linux box. But grep isn't the only tool in town. If you want to try something a bit different, check out glark, a grep alternative that might be better in some situations.

What is glark? Basically, it's a utility that's similar to grep, but it has a few features that grep does not. This includes complex expressions, Perl-compatible regular expressions, and excluding binary files. It also makes showing contextual lines a bit easier. Let's take a look.

I installed glark (yes, annoyingly it's yet another *nix utility that has no initial cap) on Linux Mint 11. Just grab it with apt-get install glark and you should be good to go.

Simple searches work the same way as with grep : glark string filenames . So it's pretty much a drop-in replacement for those.

But you're interested in what makes glark special. So let's start with a complex expression, where you're looking for this or that term:

glark -r -o thing1 thing2 *

This will search the current directory and subdirectories for "thing1" or "thing2." When the results are returned, glark will colorize the results and each search term will be highlighted in a different color. So if you search for, say "Mozilla" and "Firefox," you'll see the terms in different colors.

You can also use this to see if something matches within a few lines of another term. Here's an example:

glark --and=3 -o Mozilla Firefox -o ID LXDE *

This was a search I was using in my directory of Linux.com stories that I've edited. I used three terms I knew were in one story, and one term I knew wouldn't be. You can also just use the --and option to spot two terms within X number of lines of each other, like so:

glark --and=3 term1 term2

That way, both terms must be present.

You'll note the --and option is a bit simpler than grep's context line options. However, glark tries to stay compatible with grep, so it also supports the -A , -B and -C options from grep.

Miss the grep output format? You can tell glark to use grep format with the --grep option.

Most, if not all, GNU grep options should work with glark .

Before and After

If you need to search through the beginning or end of a file, glark has the --before and --after options (short versions, -b and -a ). You can use these as percentages or as absolute number of lines. For instance:

glark -a 20 expression *

That will find instances of expression after line 20 in a file.
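For instance, a couple of hedged examples of the same options (patterns and file names are arbitrary): restrict a search to the first 20 lines of each file, or to the second half of each file:

glark -b 20 copyright *.c
glark -a 50% TODO *.c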

The glark Configuration File

Note that you can have a ~/.glarkrc that will set common options for each use of glark (unless overridden at the command line). The man page for glark does include some examples, like so:

after-context:     1
before-context:    6
context:           5
file-color:        blue on yellow
highlight:         off
ignore-case:       false
quiet:             yes
text-color:        bold reverse
line-number-color: bold
verbose:           false
grep:              true


Just put that in your ~/.glarkrc and customize it to your heart's content. Note that I've set mine to grep: false and added the binary-files: without-match option. You'll definitely want the quiet option to suppress all the notes about directories, etc. See the man page for more options. It's probably a good idea to spend about 10 minutes on setting up a configuration file.

Final Thoughts

One thing that I have noticed is that glark doesn't seem as fast as grep . When I do a recursive search through a bunch of directories containing (mostly) HTML files, I seem to get results a lot faster with grep . This is not terribly important for most of the stuff I do with either utility. However, if you're doing something where performance is a major factor, then you may want to see if grep fits the bill better.

Is glark "better" than grep? It depends entirely on what you're doing. It has a few features that give it an edge over grep, and I think it's very much worth trying out if you've never given it a shot.

#### [Nov 13, 2018] Resuming rsync partial (-P/--partial) on an interrupted transfer

##### "... should ..."
###### May 15, 2013 | stackoverflow.com

Glitches , May 15, 2013 at 18:06

I am trying to back up my file server to a remote file server using rsync. Rsync is not successfully resuming when a transfer is interrupted. I used the partial option but rsync doesn't find the file it already started because it renames it to a temporary file and, when resumed, it creates a new file and starts from the beginning.

Here is my command:

rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp" 

When this command is run, a backup file named OldDisk.dmg from my local machine gets created on the remote machine as something like .OldDisk.dmg.SjDndj23 .

Now when the internet connection gets interrupted and I have to resume the transfer, I have to find where rsync left off by finding the temp file like .OldDisk.dmg.SjDndj23 and rename it to OldDisk.dmg so that it sees there already exists a file that it can resume.

How do I fix this so I don't have to manually intervene each time?

Richard Michael , Nov 6, 2013 at 4:26

TL;DR : Use  --timeout=X  (X in seconds) to change the default rsync server timeout, not --inplace .

The issue is the rsync server processes (of which there are two, see rsync --server ...  in ps  output on the receiver) continue running, to wait for the rsync client to send data.

If the rsync server processes do not receive data for a sufficient time, they will indeed time out, self-terminate and clean up by moving the temporary file to its "proper" name (e.g., no temporary suffix). You'll then be able to resume.

If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place, but rather delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not send SIGKILL  (e.g., -9 ), but SIGTERM  (e.g.,  pkill -TERM -x rsync  -- only an example; take care to match only the rsync processes concerned with your client).

Fortunately there is an easier way: use the --timeout=X  (X in seconds) option; it is passed to the rsync server processes as well.

For example, if you specify  rsync ... --timeout=15 ... , both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming.
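
Applied to the original command from the question, that would look something like the following sketch (15 is just the value used above; pick whatever suits your link):

rsync -avztP --timeout=15 -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"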

I'm not sure how long, by default, the various rsync processes will try to send/receive data before they die (it might vary with operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. Meaning, you could try to "ride out" the bad internet connection for 15-20 seconds.

If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process). Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process).

Interestingly, if you then clean up all rsync server processes (for example, stop your client, which will stop the new rsync servers, then SIGTERM  the older rsync servers), it appears to merge (assemble) all the partial files into the new, properly named file. So, imagine a long-running partial copy which dies (and you think you've "lost" all the copied data), and a short-running re-launched rsync (oops!)... you can stop the second client,  SIGTERM  the first servers, it will merge the data, and you can resume.

Finally, a few short remarks:

• Don't use --inplace  to workaround this. You will undoubtedly have other problems as a result, man rsync  for the details.
• It's trivial, but -t  in your rsync options is redundant; it is implied by -a .
• An already compressed disk image sent over rsync without compression might result in shorter transfer time (by avoiding double compression). However, I'm unsure of the compression techniques in both cases. I'd test it.
• As far as I understand  --checksum  / -c , it won't help you in this case. It affects how rsync decides if it should transfer a file. Though, after a first rsync completes, you could run a second rsync with -c  to insist on checksums, to prevent the strange case that file size and modtime are the same on both sides, but bad data was written.
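
For example, a follow-up verification pass over an already-completed transfer might look like this sketch (paths and port taken from the question; -c forces checksum comparison):

rsync -avzc -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"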

JamesTheAwesomeDude , Dec 29, 2013 at 16:50

Just curious: wouldn't  SIGINT  (aka ^C ) be 'politer' than  SIGTERM ? – JamesTheAwesomeDude Dec 29 '13 at 16:50

Richard Michael , Dec 29, 2013 at 22:34

I didn't test how the server-side rsync handles SIGINT, so I'm not sure it will keep the partial file - you could check. Note that this doesn't have much to do with Ctrl-c ; it happens that your terminal sends SIGINT  to the foreground process when you press Ctrl-c , but the server-side rsync has no controlling terminal. You must log in to the server and use kill . The client-side rsync will not send a message to the server (for example, after the client receives  SIGINT  via your terminal Ctrl-c ) - might be interesting though. As for anthropomorphizing, not sure what's "politer". :-) – Richard Michael Dec 29 '13 at 22:34

d-b , Feb 3, 2015 at 8:48

I just tried this timeout argument: rsync -av --delete --progress --stats --human-readable --checksum --timeout=60 --partial-dir /tmp/rsync/ rsync://$remote:/ /src/ but then it timed out during the "receiving file list" phase (which in this case takes around 30 minutes). Setting the timeout to half an hour kind of defeats the purpose. Any workaround for this? – d-b Feb 3 '15 at 8:48

Cees Timmerman , Sep 15, 2015 at 17:10

@user23122 --checksum reads all data when preparing the file list, which is great for many small files that change often, but should be done on-demand for large files. – Cees Timmerman Sep 15 '15 at 17:10

#### [Nov 08, 2018] Can rsync resume after being interrupted?

###### Sep 15, 2012 | unix.stackexchange.com

Tim , Sep 15, 2012 at 23:36

I used rsync to copy a large number of files, but my OS (Ubuntu) restarted unexpectedly. After reboot, I ran rsync again, but from the output on the terminal, I found that rsync still copied those already copied before. But I heard that rsync is able to find differences between source and destination, and therefore to just copy the differences. So I wonder, in my case, can rsync resume what was left last time?

Gilles , Sep 16, 2012 at 1:56

Yes, rsync won't copy again files that it's already copied. There are a few edge cases where its detection can fail. Did it copy all the already-copied files? What options did you use? What were the source and target filesystems? If you run rsync again after it's copied everything, does it copy again? – Gilles Sep 16 '12 at 1:56

Tim , Sep 16, 2012 at 2:30

@Gilles: Thanks! (1) I think I saw rsync copied the same files again from its output on the terminal. (2) Options are the same as in my other post, i.e. sudo rsync -azvv /home/path/folder1/ /home/path/folder2 . (3) Source and target are both NTFS, but source is an external HDD, and target is an internal HDD. (4) It is now running and hasn't finished yet. – Tim Sep 16 '12 at 2:30

jwbensley , Sep 16, 2012 at 16:15

There is also the --partial flag to resume partially transferred files (useful for large files) – jwbensley Sep 16 '12 at 16:15

Tim , Sep 19, 2012 at 5:20

@Gilles: What are some "edge cases where its detection can fail"? – Tim Sep 19 '12 at 5:20

Gilles , Sep 19, 2012 at 9:25

@Tim Off the top of my head, there's at least clock skew, and differences in time resolution (a common issue with FAT filesystems, which store times in 2-second increments; the --modify-window option helps with that). – Gilles Sep 19 '12 at 9:25

DanielSmedegaardBuus , Nov 1, 2014 at 12:32

First of all, regarding the "resume" part of your question, --partial just tells the receiving end to keep partially transferred files if the sending end disappears, as though they were completely transferred.

While transferring files, they are temporarily saved as hidden files in their target folders (e.g. .TheFileYouAreSending.lRWzDC ), or in a specifically chosen folder if you set the --partial-dir switch. When a transfer fails and --partial is not set, this hidden file will remain in the target folder under this cryptic name, but if --partial is set, the file will be renamed to the actual target file name (in this case, TheFileYouAreSending ), even though the file isn't complete. The point is that you can later complete the transfer by running rsync again with either --append or --append-verify .

So, --partial doesn't itself resume a failed or cancelled transfer. To resume it, you'll have to use one of the aforementioned flags on the next run.
So, if you need to make sure that the target won't ever contain files that appear to be fine but are actually incomplete, you shouldn't use --partial . Conversely, if you want to make sure you never leave behind stray failed files that are hidden in the target directory, and you know you'll be able to complete the transfer later, --partial is there to help you.

With regards to the --append switch mentioned above, this is the actual "resume" switch, and you can use it whether or not you're also using --partial . Actually, when you're using --append , no temporary files are ever created. Files are written directly to their targets. In this respect, --append gives the same result as --partial on a failed transfer, but without creating those hidden temporary files.

So, to sum up, if you're moving large files and you want the option to resume a cancelled or failed rsync operation from the exact point that rsync stopped, you need to use the --append or --append-verify switch on the next attempt.

As @Alex points out below, since version 3.0.0 rsync now has a new option, --append-verify , which behaves like --append did before that switch existed. You probably always want the behaviour of --append-verify , so check your version with rsync --version . If you're on a Mac and not using rsync from homebrew , you'll (at least up to and including El Capitan) have an older version and need to use --append rather than --append-verify . Why they didn't keep the behaviour on --append and instead named the newcomer --append-no-verify is a bit puzzling. Either way, --append on rsync before version 3 is the same as --append-verify on the newer versions.

--append-verify isn't dangerous: It will always read and compare the data on both ends and not just assume they're equal. It does this using checksums, so it's easy on the network, but it does require reading the shared amount of data on both ends of the wire before it can actually resume the transfer by appending to the target.

Second of all, you said that you "heard that rsync is able to find differences between source and destination, and therefore to just copy the differences." That's correct, and it's called delta transfer, but it's a different thing. To enable this, you add the -c , or --checksum switch. Once this switch is used, rsync will examine files that exist on both ends of the wire. It does this in chunks, compares the checksums on both ends, and if they differ, it transfers just the differing parts of the file. But, as @Jonathan points out below, the comparison is only done when files are of the same size on both ends -- different sizes will cause rsync to upload the entire file, overwriting the target with the same name.

This requires a bit of computation on both ends initially, but can be extremely efficient at reducing network load if, for example, you're frequently backing up very large fixed-size files that often contain minor changes. Examples that come to mind are virtual hard drive image files used in virtual machines or iSCSI targets.

It is notable that if you use --checksum to transfer a batch of files that are completely new to the target system, rsync will still calculate their checksums on the source system before transferring them. Why, I do not know :)

So, in short: If you're often using rsync to just "move stuff from A to B" and want the option to cancel that operation and later resume it, don't use --checksum , but do use --append-verify .
If you're using rsync to back up stuff often, using --append-verify probably won't do much for you, unless you're in the habit of sending large files that continuously grow in size but are rarely modified once written. As a bonus tip, if you're backing up to storage that supports snapshotting such as btrfs or zfs , adding the --inplace switch will help you reduce snapshot sizes since changed files aren't recreated but rather the changed blocks are written directly over the old ones. This switch is also useful if you want to avoid rsync creating copies of files on the target when only minor changes have occurred.

When using --append-verify , rsync will behave just like it always does on all files that are the same size. If they differ in modification or other timestamps, it will overwrite the target with the source without scrutinizing those files further. --checksum will compare the contents (checksums) of every file pair of identical name and size.

UPDATED 2015-09-01 Changed to reflect points made by @Alex (thanks!)

UPDATED 2017-07-14 Changed to reflect points made by @Jonathan (thanks!)

Alex , Aug 28, 2015 at 3:49

According to the documentation --append does not check the data, but --append-verify does. Also, as @gaoithe points out in a comment below, the documentation claims --partial does resume from previous files. – Alex Aug 28 '15 at 3:49

DanielSmedegaardBuus , Sep 1, 2015 at 13:29

Thank you @Alex for the updates. Indeed, since 3.0.0, --append no longer compares the source to the target file before appending. Quite important, really! --partial does not itself resume a failed file transfer, but rather leaves it there for a subsequent --append(-verify) to append to it. My answer was clearly misrepresenting this fact; I'll update it to include these points! Thanks a lot :) – DanielSmedegaardBuus Sep 1 '15 at 13:29

Cees Timmerman , Sep 15, 2015 at 17:21

This says --partial is enough. – Cees Timmerman Sep 15 '15 at 17:21

DanielSmedegaardBuus , May 10, 2016 at 19:31

@CMCDragonkai Actually, check out Alexander's answer below about --partial-dir -- looks like it's the perfect bullet for this. I may have missed something entirely ;) – DanielSmedegaardBuus May 10 '16 at 19:31

Jonathan Y. , Jun 14, 2017 at 5:48

What's your level of confidence in the described behavior of --checksum ? According to the man it has more to do with deciding which files to flag for transfer than with delta-transfer (which, presumably, is rsync 's default behavior). – Jonathan Y. Jun 14 '17 at 5:48

#### [Nov 08, 2018] collectl

###### Nov 08, 2018 | collectl.sourceforge.net

Collectl

Latest Version: 4.2.0, June 12, 2017

To use it, download the tarball, unpack it and run ./INSTALL

Collectl now supports OpenStack Clouds. Colmux is now part of the collectl package. Looking for colplot? It's now here! Remember, to get lustre support contact Peter Piela to get his custom plugin.

There are a number of times in which you find yourself needing performance data. These can include benchmarking, monitoring a system's general health or trying to determine what your system was doing at some time in the past. Sometimes you just want to know what the system is doing right now. Depending on what you're doing, you often end up using different tools, each designed for that specific situation.
Unlike most monitoring tools that either focus on a small set of statistics, format their output in only one way, or run either interactively or as a daemon but not both, collectl tries to do it all. You can choose to monitor any of a broad set of subsystems, which currently include buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp.

The following is an example taken while writing a large file and running the collectl command with no arguments. By default it shows cpu, network and disk stats in brief format. The key point of this format is that all output appears on a single line, making it much easier to spot spikes or other anomalies in the output:

collectl
#<--------CPU--------><-----------Disks-----------><-----------Network---------->
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes netKBi pkt-in  netKBo pkt-out
  37  37   382    188      0      0  27144    254     45     68       3      21
  25  25   366    180     20      4  31280    296      0      1       0       0
  25  25   368    183      0      0  31720    275      2     20       0       1

In this example, taken while writing to an NFS mounted filesystem, collectl displays interrupts, memory usage and nfs activity with timestamps. Keep in mind that you can mix and match any data, and in the case of brief format you simply need to have a window wide enough to accommodate your output.

collectl -sjmf -oT
#         <-------Int--------><-----------Memory-----------><------NFS Totals------>
#Time     Cpu0 Cpu1 Cpu2 Cpu3 Free Buff Cach Inac Slab  Map  Reads Writes Meta Comm
08:36:52  1001   66    0    0   2G 201M 609M 363M 219M 106M      0      0    5    0
08:36:53   999 1657    0    0   2G 201M   1G 918M 252M 106M      0  12622    0    2
08:36:54  1001 7488    0    0   1G 201M   1G   1G 286M 106M      0  20147    0    2

You can also display the same information in verbose format, in which case you get a single line for each type of data at the expense of more screen real estate, as can be seen in this example of network data during NFS writes. Note how you can actually see the network traffic stall while waiting for the server to physically write the data.

collectl -sn --verbose -oT
# NETWORK SUMMARY (/sec)
#          KBIn  PktIn SizeIn  MultI   CmpI  ErrIn  KBOut PktOut  SizeO   CmpO ErrOut
08:46:35   3255  41000     81      0      0      0 112015  78837   1454      0      0
08:46:36      0      9     70      0      0      0     29     25   1174      0      0
08:46:37      0      2     70      0      0      0      0      2    134      0      0

In this last example we see what detail format looks like, where we see multiple lines of output for a particular type of data, which in this case is interrupts. We've also elected to show the time in msecs as well.

collectl -sJ -oTm
#              Int   Cpu0   Cpu1   Cpu2   Cpu3   Type            Device(s)
08:52:32.002   225      0      4      0      0   IO-APIC-level   ioc0
08:52:32.002   000   1000      0      0      0   IO-APIC-edge    timer
08:52:32.002   014      0      0     18      0   IO-APIC-edge    ide0
08:52:32.002   090      0      0      0  15461   IO-APIC-level   eth1

Collectl output can also be saved in a rolling set of logs for later playback or displayed interactively in a variety of formats. If all that isn't enough, there are plugins that allow you to report data in alternate formats or even send them over a socket to remote tools such as ganglia or graphite. You can even create files in space-separated format for plotting with external packages like gnuplot. One such plot was created with colplot, part of the collectl utilities project, which provides a web-based interface to gnuplot.

Are you a big user of the top command? Have you ever wanted to look across a cluster to see what the top processes are? Better yet, how about using iostat across a cluster? Or maybe vmstat, or even looking at top network interfaces across a cluster?
Look no more, because if collectl reports it for one node, colmux can do it across a cluster, AND you can sort by any column of your choice by simply using the right/left arrow keys.

Collectl and Colmux run on all Linux distros and are available in the Red Hat and Debian repositories, so getting them may be as simple as running yum or apt-get. Note that since colmux has just been merged into the collectl V4.0.0 package, it may not yet be available in the repository of your choice, and you should install collectl-utils V4.8.2 or earlier to get it for the time being.

Collectl requires perl, which is usually installed by default on all major Linux distros, and optionally uses Time::HiRes, which is also usually installed and allows collectl to use fractional intervals and display timestamps in msec. The Compress::Zlib module is usually installed as well, and if present the recorded data will be compressed and therefore use on average 90% less storage when recording to a file.

If you're still not sure if collectl is right for you, take a couple of minutes to look at the Collectl Tutorial to get a better feel for what collectl can do. Also be sure to check back and see what's new on the website, sign up for a Mailing List or watch the Forums.

"I absolutely love it and have been using it extensively for months." Kevin Closson: Performance Architect, EMC

"Collectl is indispensable to any system admin." Matt Heaton: President, Bluehost.com

#### [Nov 08, 2018] pexec utility is similar to parallel

###### Nov 08, 2018 | www.gnu.org

Welcome to the web page of the pexec program!

The main purpose of the program pexec is to execute the given command or shell script (e.g. parsed by /bin/sh ) in parallel on the local host or on remote hosts, while some of the execution parameters, namely the redirected standard input, output or error and environmental variables, can be varied. This program is therefore capable of replacing the classic shell loop iterators (e.g. for ~ in ~ done , in bash ) by executing the body of the loop in parallel. Thus, the program pexec implements shell-level data parallelism in a simple form. The capabilities of the program are extended with additional features, such as allowing the definition of mutual exclusions, atomic command executions and higher-level resource and job control. See the complete manual for more details. See a brief Hungarian description of the program here .

The actual version of the program package is 1.0rc8 . You may browse the package directory here (for FTP access, see this directory ). See the GNU summary page of this project here . The latest version of the program source package is pexec-1.0rc8.tar.gz . Here is another mirror of the package directory.

Please consider making donations to the author (via PayPal ) in order to help further development of the program or support the GNU project via the FSF .
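
To make the "classic shell loop iterator" point concrete, here is the kind of serial bash loop such a tool replaces, together with the equivalent one-liner in GNU parallel (shown only because the heading above compares pexec to parallel; pexec's own invocation syntax differs and is documented in its manual). The gzip command and the *.log glob are placeholder examples:

# serial: each file is compressed one after another, on a single core
for f in *.log; do
    gzip "$f"
done

# GNU parallel equivalent: one gzip job per available core
parallel gzip ::: *.log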
#### [Nov 08, 2018] 15 Linux Split and Join Command Examples to Manage Large Files

###### Nov 08, 2018 | www.thegeekstuff.com

by Himanshu Arora on October 16, 2012

Linux split and join commands are very helpful when you are manipulating large files. This article explains how to use the Linux split and join commands with descriptive examples.

Join and split command syntax:

join [OPTION] FILE1 FILE2
split [OPTION] [INPUT [PREFIX]]

Linux Split Command Examples

1. Basic Split Example

Here is a basic example of the split command.

$ split split.zip

$ ls
split.zip  xab  xad  xaf  xah  xaj  xal  xan  xap  xar  xat  xav  xax  xaz  xbb  xbd  xbf  xbh  xbj  xbl  xbn
xaa        xac  xae  xag  xai  xak  xam  xao  xaq  xas  xau  xaw  xay  xba  xbc  xbe  xbg  xbi  xbk  xbm  xbo

So we see that the file split.zip was split into smaller files named x**, where ** is the two-character suffix that is added by default. Also, by default each x** file contains 1000 lines.

$ wc -l *
40947 split.zip
1000 xaa
1000 xab
1000 xac
1000 xae
1000 xaf
1000 xag
1000 xah
1000 xai
...
...
...


So the output above confirms that by default each x** file contains 1000 lines.
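
A point the article does not show explicitly: because split names the pieces so that they sort in the original order, the chunks can be reassembled with cat. A quick sanity check might look like this (file names follow the example above; the .joined name is just a placeholder):

$ cat x* > split.zip.joined
$ cmp split.zip split.zip.joined    # no output means the two files are identical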

2. Change the Suffix Length using -a option

As discussed in example 1 above, the default suffix length is 2. But this can be changed by using -a option.


As you can see in the following example, it uses a suffix of length 5 on the split files.

$ split -a5 split.zip

$ ls
split.zip  xaaaac  xaaaaf  xaaaai  xaaaal  xaaaao  xaaaar  xaaaau  xaaaax  xaaaba  xaaabd  xaaabg  xaaabj  xaaabm
xaaaaa     xaaaad  xaaaag  xaaaaj  xaaaam  xaaaap  xaaaas  xaaaav  xaaaay  xaaabb  xaaabe  xaaabh  xaaabk  xaaabn
xaaaab     xaaaae  xaaaah  xaaaak  xaaaan  xaaaaq  xaaaat  xaaaaw  xaaaaz  xaaabc  xaaabf  xaaabi  xaaabl  xaaabo


Note: Earlier we also discussed about other file manipulation utilities – tac, rev, paste .

3. Customize Split File Size using -b option

Size of each output split file can be controlled using -b option.

In this example, the split files were created with a size of 200000 bytes.

$ split -b200000 split.zip

$ ls -lart
total 21084
drwxrwxr-x 3 himanshu himanshu     4096 Sep 26 21:20 ..
-rw-rw-r-- 1 himanshu himanshu 10767315 Sep 26 21:21 split.zip
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xad
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xac
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xab
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xaa
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xah
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xag
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xaf
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xae
-rw-rw-r-- 1 himanshu himanshu   200000 Sep 26 21:35 xar
...
...
...

4. Create Split Files with Numeric Suffix using -d option

As seen in the examples above, the output has the format x**, where ** are letters. You can change this to numbers using the -d option.

Here is an example that uses numeric suffixes on the split files.

$ split -d split.zip

$ ls
split.zip  x01  x03  x05  x07  x09  x11  x13  x15  x17  x19  x21  x23  x25  x27  x29  x31  x33  x35  x37  x39
x00        x02  x04  x06  x08  x10  x12  x14  x16  x18  x20  x22  x24  x26  x28  x30  x32  x34  x36  x38  x40

5. Customize the Number of Split Chunks using -n option

To get control over the number of chunks, use the -n option.

This example will create 50 chunks of split files.

$ split -n50 split.zip

$ ls
split.zip  xac  xaf  xai  xal  xao  xar  xau  xax  xba  xbd  xbg  xbj  xbm  xbp  xbs  xbv
xaa        xad  xag  xaj  xam  xap  xas  xav  xay  xbb  xbe  xbh  xbk  xbn  xbq  xbt  xbw
xab        xae  xah  xak  xan  xaq  xat  xaw  xaz  xbc  xbf  xbi  xbl  xbo  xbr  xbu  xbx

6. Avoid Zero Sized Chunks using -e option

While splitting a relatively small file into a large number of chunks, it's good to avoid zero-sized chunks, as they do not add any value. This can be done using the -e option.

Here is an example:

$ split -n50 testfile

$ ls -lart x*
-rw-rw-r-- 1 himanshu himanshu 0 Sep 26 21:55 xag
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xaf
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xae
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xad
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xac
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xab
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xaa
-rw-rw-r-- 1 himanshu himanshu 0 Sep 26 21:55 xbx
-rw-rw-r-- 1 himanshu himanshu 0 Sep 26 21:55 xbw
-rw-rw-r-- 1 himanshu himanshu 0 Sep 26 21:55 xbv
...
...
...


So we see that lots of zero-sized chunks were produced in the above output. Now, let's use the -e option and see the results:

$ split -n50 -e testfile

$ ls
split.zip  testfile  xaa  xab  xac  xad  xae  xaf

$ ls -lart x*
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xaf
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xae
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xad
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xac
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xab
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xaa

So we see that no zero-sized chunks were produced in the above output.

7. Customize Number of Lines using -l option

The number of lines per output split file can be customized using the -l option. As seen in the example below, split files are created with 20000 lines each.

$ split -l20000 split.zip

$ ls
split.zip  testfile  xaa  xab  xac

$ wc -l x*
20000 xaa
20000 xab
947 xac
40947 total

Get Detailed Information using --verbose option

To get a diagnostic message each time a new split file is opened, use the --verbose option as shown below.

$ split -l20000 --verbose split.zip
creating file 'xaa'
creating file 'xab'
creating file 'xac'

#### [Nov 08, 2018] Utilizing multi core for tar+gzip-bzip compression-decompression

###### Nov 08, 2018 | stackoverflow.com

user1118764 , Sep 7, 2012 at 6:58

I normally compress using tar zcvf and decompress using tar zxvf (using gzip due to habit). I've recently gotten a quad core CPU with hyperthreading, so I have 8 logical cores, and I notice that many of the cores are unused during compression/decompression. Is there any way I can utilize the unused cores to make it faster?

Warren Severin , Nov 13, 2017 at 4:37

The solution proposed by Xiong Chiamiov above works beautifully. I had just backed up my laptop with .tar.bz2 and it took 132 minutes using only one cpu thread. Then I compiled and installed tar from source: gnu.org/software/tar I included the options mentioned in the configure step: ./configure --with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip I ran the backup again and it took only 32 minutes. That's better than 4X improvement! I watched the system monitor and it kept all 4 cpus (8 threads) flatlined at 100% the whole time. THAT is the best solution. – Warren Severin Nov 13 '17 at 4:37

Mark Adler , Sep 7, 2012 at 14:48

You can use pigz instead of gzip, which does gzip compression on multiple cores. Instead of using the -z option, you would pipe it through pigz:

tar cf - paths-to-archive | pigz > archive.tar.gz

By default, pigz uses the number of available cores, or eight if it could not query that. You can ask for more with -p n, e.g. -p 32. pigz has the same options as gzip, so you can request better compression with -9. E.g.

tar cf - paths-to-archive | pigz -9 -p 32 > archive.tar.gz

user788171 , Feb 20, 2013 at 12:43

How do you use pigz to decompress in the same fashion? Or does it only work for compression? – user788171 Feb 20 '13 at 12:43

Mark Adler , Feb 20, 2013 at 16:18

pigz does use multiple cores for decompression, but only with limited improvement over a single core. The deflate format does not lend itself to parallel decompression. The decompression portion must be done serially. The other cores for pigz decompression are used for reading, writing, and calculating the CRC. When compressing on the other hand, pigz gets close to a factor of n improvement with n cores. – Mark Adler Feb 20 '13 at 16:18

Garrett , Mar 1, 2014 at 7:26

The hyphen here is stdout (see this page ). – Garrett Mar 1 '14 at 7:26

Mark Adler , Jul 2, 2014 at 21:29

Yes. 100% compatible in both directions. – Mark Adler Jul 2 '14 at 21:29

Mark Adler , Apr 23, 2015 at 5:23

There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files. – Mark Adler Apr 23 '15 at 5:23

Jen , Jun 14, 2013 at 14:34

You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use. For example use:

tar -c --use-compress-program=pigz -f tar.file dir_to_zip

ranman , Nov 13, 2013 at 10:01

This is an awesome little nugget of knowledge and deserves more upvotes. I had no idea this option even existed and I've read the man page a few times over the years. – ranman Nov 13 '13 at 10:01

Valerio Schiavoni , Aug 5, 2014 at 22:38

Unfortunately by doing so the concurrent feature of pigz is lost. You can see for yourself by executing that command and monitoring the load on each of the cores. – Valerio Schiavoni Aug 5 '14 at 22:38

bovender , Sep 18, 2015 at 10:14

@ValerioSchiavoni: Not here, I get full load on all 4 cores (Ubuntu 15.04 'Vivid'). – bovender Sep 18 '15 at 10:14

Valerio Schiavoni , Sep 28, 2015 at 23:41

On compress or on decompress? – Valerio Schiavoni Sep 28 '15 at 23:41

Offenso , Jan 11, 2017 at 17:26

I prefer tar - dir_to_zip | pv | pigz > tar.file . pv helps me estimate, you can skip it. But still it is easier to write and remember. – Offenso Jan 11 '17 at 17:26

Maxim Suslov , Dec 18, 2014 at 7:31

Common approach

There is an option for the tar program:

-I, --use-compress-program PROG
       filter through PROG (must accept -d)

You can use a multithreaded version of an archiver or compressor utility. The most popular multithreaded archivers are pigz (instead of gzip) and pbzip2 (instead of bzip2). For instance:

$ tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 paths_to_archive
$ tar --use-compress-program=pigz -cf OUTPUT_FILE.tar.gz paths_to_archive

The archiver must accept -d. If your replacement utility doesn't have this parameter and/or you need to specify additional parameters, then use pipes (add parameters if necessary):

$ tar cf - paths_to_archive | pbzip2 > OUTPUT_FILE.tar.bz2
$ tar cf - paths_to_archive | pigz > OUTPUT_FILE.tar.gz

Input and output of the singlethreaded and multithreaded versions are compatible. You can compress using the multithreaded version and decompress using the singlethreaded version and vice versa.

p7zip

For p7zip compression you need a small shell script like the following:

#!/bin/sh
case $1 in
-d) 7za -txz -si -so e;;
*) 7za -txz -si -so a .;;
esac 2>/dev/null


Save it as 7zhelper.sh. Here is an example of usage:

$ tar -I 7zhelper.sh -cf OUTPUT_FILE.tar.7z paths_to_archive
$ tar -I 7zhelper.sh -xf OUTPUT_FILE.tar.7z

xz

Regarding multithreaded XZ support: if you are running version 5.2.0 or above of XZ Utils, you can utilize multiple cores for compression by setting -T or --threads to an appropriate value via the environment variable XZ_DEFAULTS (e.g. XZ_DEFAULTS="-T 0" ).
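
For instance, assuming XZ Utils 5.2.0+ and a reasonably recent GNU tar, either of the following sketches should produce a multi-threaded .tar.xz (the archive and directory names are placeholders):

XZ_DEFAULTS="-T 0" tar -Jcf archive.tar.xz directory/
tar -I "xz -T0" -cf archive.tar.xz directory/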

This is a fragment of the man page for the 5.1.0alpha version:

Multithreaded compression and decompression are not implemented yet, so this option has no effect for now.

However this will not work for decompression of files that haven't also been compressed with threading enabled. From man for version 5.2.2:

Threaded decompression hasn't been implemented yet. It will only work on files that contain multiple blocks with size information in block headers. All files compressed in multi-threaded mode meet this condition, but files compressed in single-threaded mode don't even if --block-size=size is used.

Recompiling with replacement

If you build tar from sources, then you can recompile with parameters

--with-gzip=pigz
--with-bzip2=lbzip2
--with-lzip=plzip


After recompiling tar with these options you can check the output of tar's help:

$ tar --help | grep "lbzip2\|plzip\|pigz"
  -j, --bzip2                      filter the archive through lbzip2
      --lzip                       filter the archive through plzip
  -z, --gzip, --gunzip, --ungzip   filter the archive through pigz

user1985657 , Apr 28, 2015 at 20:41

This is indeed the best answer. I'll definitely rebuild my tar! – user1985657 Apr 28 '15 at 20:41

user1985657 , Apr 28, 2015 at 20:57

I just found pbzip2 and mpibzip2 . mpibzip2 looks very promising for clusters or if you have a laptop and a multicore desktop computer for instance. – user1985657 Apr 28 '15 at 20:57

oᴉɹǝɥɔ , Jun 10, 2015 at 17:39

This is a great and elaborate answer. It may be good to mention that multithreaded compression (e.g. with pigz ) is only enabled when it reads from the file. Processing STDIN may in fact be slower. – oᴉɹǝɥɔ Jun 10 '15 at 17:39

selurvedu , May 26, 2016 at 22:13

Plus 1 for the xz option. It is the simplest, yet effective approach. – selurvedu May 26 '16 at 22:13

panticz.de , Sep 1, 2014 at 15:02

You can use the shortcut -I for tar's --use-compress-program switch, and invoke pbzip2 for bzip2 compression on multiple cores:

tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 DIRECTORY_TO_COMPRESS/

einpoklum , Feb 11, 2017 at 15:59

A nice TL;DR for @MaximSuslov's answer . – einpoklum Feb 11 '17 at 15:59

If you want to have more flexibility with filenames and compression options, you can use:

find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec \
tar -P --transform='s@/my/path/@@g' -cf - {} + | \
pigz -9 -p 4 > myarchive.tar.gz

Step 1: find

find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec

This command will look for the files you want to archive, in this case /my/path/*.sql and /my/path/*.log . Add as many -o -name "pattern" as you want. -exec will execute the next command using the results of find : tar

Step 2: tar

tar -P --transform='s@/my/path/@@g' -cf - {} +

--transform is a simple string replacement parameter. It will strip the path of the files from the archive so the tarball's root becomes the current directory when extracting. Note that you can't use the -C option to change directory, as you'll lose the benefits of find : all files of the directory would be included. -P tells tar to use absolute paths, so it doesn't trigger the warning "Removing leading `/' from member names". The leading '/' will be removed by --transform anyway. -cf - tells tar to write to stdout; we'll specify the archive name later. {} + uses every file that find found previously.

Step 3: pigz

pigz -9 -p 4

Use as many parameters as you want. In this case -9 is the compression level and -p 4 is the number of cores dedicated to compression. If you run this on a heavily loaded webserver, you probably don't want to use all available cores.

Step 4: archive name

> myarchive.tar.gz

Finally.

#### [Nov 03, 2018] David Both

###### Jun 22, 2017 | opensource.com

... The long listing of the /lib64 directory above shows that the first character in the filemode is the letter "l," which means that each is a soft or symbolic link.

Hard links

In An introduction to Linux's EXT4 filesystem , I discussed the fact that each file has one inode that contains information about that file, including the location of the data belonging to that file. Figure 2 in that article shows a single directory entry that points to the inode. Every file must have at least one directory entry that points to the inode that describes the file. The directory entry is a hard link, thus every file has at least one hard link.

In Figure 1 below, multiple directory entries point to a single inode. These are all hard links.
I have abbreviated the locations of three of the directory entries using the tilde ( ~ ) convention for the home directory, so that ~ is equivalent to /home/user in this example. Note that the fourth directory entry is in a completely different directory, /home/shared , which might be a location for sharing files between users of the computer.

Figure 1

Hard links are limited to files contained within a single filesystem. "Filesystem" is used here in the sense of a partition or logical volume (LV) that is mounted on a specified mount point, in this case /home . This is because inode numbers are unique only within each filesystem, and a different filesystem, for example, /var or /opt , will have inodes with the same number as the inode for our file.

Because all the hard links point to the single inode that contains the metadata about the file, all of these attributes are part of the file, such as ownerships, permissions, and the total number of hard links to the inode, and cannot be different for each hard link. It is one file with one set of attributes. The only attribute that can be different is the file name, which is not contained in the inode. Hard links to a single file/inode located in the same directory must have different names, due to the fact that there can be no duplicate file names within a single directory.

The number of hard links for a file is displayed with the ls -l command. If you want to display the actual inode numbers, the command ls -li does that.

Symbolic (soft) links

The difference between a hard link and a soft link, also known as a symbolic link (or symlink), is that, while hard links point directly to the inode belonging to the file, soft links point to a directory entry, i.e., one of the hard links. Because soft links point to a hard link for the file and not the inode, they are not dependent upon the inode number and can work across filesystems, spanning partitions and LVs.

The downside to this is: If the hard link to which the symlink points is deleted or renamed, the symlink is broken. The symlink is still there, but it points to a hard link that no longer exists. Fortunately, the ls command highlights broken links with flashing white text on a red background in a long listing.

Lab project: experimenting with links

I think the easiest way to understand the use of and differences between hard and soft links is with a lab project that you can do. This project should be done in an empty directory as a non-root user . I created the ~/temp directory for this project, and you should, too. It creates a safe place to do the project and provides a new, empty directory to work in so that only files associated with this project will be located there.

Initial setup

First, create the temporary directory in which you will perform the tasks needed for this project. Ensure that the present working directory (PWD) is your home directory, then enter the following command.

mkdir temp

Change into ~/temp to make it the PWD with this command.

cd temp

To get started, we need to create a file we can link to. The following command does that and provides some content as well.

du -h > main.file.txt

Use the ls -l long list to verify that the file was created correctly. It should look similar to my results. Note that the file size is only 7 bytes, but yours may vary by a byte or two.

[dboth@david temp]$ ls -l
total 4
-rw-rw-r-- 1 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice the number "1" following the file mode in the listing. That number represents the number of hard links that exist for the file. For now, it should be 1 because we have not created any additional links to our test file.

Hard links create a new directory entry pointing to the same inode, so when hard links are added to a file, you will see the number of links increase. Ensure that the PWD is still ~/temp . Create a hard link to the file main.file.txt , then do another long list of the directory.

[dboth@david temp]$ ln main.file.txt link1.file.txt
[dboth@david temp]$ ls -l
total 8
-rw-rw-r-- 2 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 2 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice that both files have two links and are exactly the same size. The date stamp is also the same. This is really one file with one inode and two links, i.e., directory entries to it. Create a second hard link to this file and list the directory contents. You can create the link to either of the existing ones: link1.file.txt or main.file.txt .

[dboth@david temp]$ ln link1.file.txt link2.file.txt ; ls -l
total 16
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 link2.file.txt
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice that each new hard link in this directory must have a different name because two files -- really directory entries -- cannot have the same name within the same directory. Try to create another link with a target name the same as one of the existing ones.

[dboth@david temp]$ ln main.file.txt link2.file.txt

Clearly that does not work, because link2.file.txt already exists. So far, we have created only hard links in the same directory. So, create a link in your home directory, the parent of the temp directory in which we have been working so far.

[dboth@david temp]$ ln main.file.txt ../main.file.txt ; ls -l ../main*
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt

The ls command in the above listing shows that the main.file.txt file does exist in the home directory with the same name as the file in the temp directory. Of course, these are not different files; they are the same file with multiple links -- directory entries -- to the same inode. To help illustrate the next point, add a file that is not a link.

[dboth@david temp]$ touch unlinked.file ; ls -l
total 12
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link2.file.txt
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt
-rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

Look at the inode number of the hard links and that of the new file using the -i option to the ls command.

[dboth@david temp]$ ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

Notice the number 657024 to the left of the file mode in the example above. That is the inode number, and all three file links point to the same inode. You can use the -i option to view the inode number for the link we created in the home directory as well, and that will also show the same value. The inode number of the file that has only one link is different from the others. Note that the inode numbers will be different on your system. Let's change the size of one of the hard-linked files.

[dboth@david temp]$ df -h > link2.file.txt ; ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth    0 Jun 14 08:18 unlinked.file

The file size of all the hard-linked files is now larger than before. That is because there is really only one file that is linked to by multiple directory entries.

I know this next experiment will work on my computer because my /tmp directory is on a separate LV. If you have a separate LV or a filesystem on a different partition (if you're not using LVs), determine whether or not you have access to that LV or partition. If you don't, you can try to insert a USB memory stick and mount it. If one of those options works for you, you can do this experiment.

Try to create a link to one of the files in your ~/temp directory in /tmp (or wherever your different filesystem directory is located).

[dboth@david temp]$ ln link2.file.txt /tmp/link3.file.txt
ln: failed to create hard link '/tmp/link3.file.txt' => 'link2.file.txt': Invalid cross-device link

Why does this error occur? The reason is each separate mountable filesystem has its own set of inode numbers. Simply referring to a file by an inode number across the entire Linux directory structure can result in confusion because the same inode number can exist in each mounted filesystem.

There may be a time when you will want to locate all the hard links that belong to a single inode. You can find the inode number using the ls -li command. Then you can use the find command to locate all links with that inode number.

[dboth@david temp]$ find . -inum 657024
./main.file.txt
./link1.file.txt
./link2.file.txt

Note that the find command did not find all four of the hard links to this inode because we started at the current directory of ~/temp . The find command only finds files in the PWD and its subdirectories. To find all the links, we can use the following command, which specifies your home directory as the starting place for the search.

[dboth@david temp]$ find ~ -samefile main.file.txt
/home/dboth/temp/main.file.txt
/home/dboth/temp/link1.file.txt
/home/dboth/temp/link2.file.txt
/home/dboth/main.file.txt

You may see error messages if you do not have permissions as a non-root user. This command also uses the -samefile option instead of specifying the inode number. This works the same as using the inode number and can be easier if you know the name of one of the hard links.

Experimenting with soft links

As you have just seen, creating hard links is not possible across filesystem boundaries; that is, from a filesystem on one LV or partition to a filesystem on another. Soft links are a means to answer that problem with hard links. Although they can accomplish the same end, they are very different, and knowing these differences is important.

Let's start by creating a symlink in our ~/temp directory to start our exploration.

[dboth@david temp]$ ln -s link2.file.txt link3.file.txt ; ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link2.file.txt
658270 lrwxrwxrwx 1 dboth dboth   14 Jun 14 15:21 link3.file.txt -> link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth    0 Jun 14 08:18 unlinked.file

The hard links, those that have the inode number 657024 , are unchanged, and the number of hard links shown for each has not changed. The newly created symlink has a different inode, number 658270 . The soft link named link3.file.txt points to link2.file.txt . Use the cat command to display the contents of link3.file.txt . The file mode information for the symlink starts with the letter " l " which indicates that this file is actually a symbolic link.

The size of the symlink link3.file.txt is only 14 bytes in the example above. That is the length of the text link2.file.txt , the target to which the symlink points, which is the actual content of the directory entry. The directory entry link3.file.txt does not point to an inode; it points to another directory entry, which makes it useful for creating links that span file system boundaries. So, let's create that link we tried before from the /tmp directory.

[dboth@david temp]$ ln -s /home/dboth/temp/link2.file.txt /tmp/link3.file.txt ; ls -l /tmp/link*
lrwxrwxrwx 1 dboth dboth 31 Jun 14 21:53 /tmp/link3.file.txt -> /home/dboth/temp/link2.file.txt

Deleting links

There are some other things that you should consider when you need to delete links or the files to which they point.

First, let's delete the link main.file.txt . Remember that every directory entry that points to an inode is simply a hard link.

[dboth@david temp]$ rm main.file.txt ; ls -li
total 8
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 link2.file.txt
658270 lrwxrwxrwx 1 dboth dboth   14 Jun 14 15:21 link3.file.txt -> link2.file.txt
657863 -rw-rw-r-- 1 dboth dboth    0 Jun 14 08:18 unlinked.file

The link main.file.txt was the first link created when the file was created. Deleting it now still leaves the original file and its data on the hard drive along with all the remaining hard links. To delete the file and its data, you would have to delete all the remaining hard links.

[dboth@david temp]$ rm link2.file.txt ; ls -li
total 8
657024 -rw-rw-r-- 2 dboth dboth 1157 Jun 14 14:14 link1.file.txt
658270 lrwxrwxrwx 1 dboth dboth   14 Jun 14 15:21 link3.file.txt -> link2.file.txt
657863 -rw-rw-r-- 1 dboth dboth    0 Jun 14 08:18 unlinked.file

Notice what happens to the soft link. Deleting the hard link to which the soft link points leaves a broken link. On my system, the broken link is highlighted in colors and the target hard link is flashing. If the broken link needs to be fixed, you can create another hard link in the same directory with the same name as the old one, so long as not all the hard links have been deleted. You could also recreate the link itself, with the link maintaining the same name but pointing to one of the remaining hard links. Of course, if the soft link is no longer needed, it can be deleted with the rm command.

The unlink command can also be used to delete files and links. It is very simple and has no options, as the rm command does. It does, however, more accurately reflect the underlying process of deletion, in that it removes the link -- the directory entry -- to the file being deleted.

Final thoughts

I worked with both types of links for a long time before I began to understand their capabilities and idiosyncrasies. It took writing a lab project for a Linux class I taught to fully appreciate how links work. This article is a simplification of what I taught in that class, and I hope it speeds your learning curve.

David Both - David Both is a Linux and Open Source advocate who resides in Raleigh, North Carolina. He has been in the IT industry for over forty years and taught OS/2 for IBM where he worked for over 20 years. While at IBM, he wrote the first training course for the original IBM PC in 1981. He has taught RHCE classes for Red Hat and has worked at MCI Worldcom, Cisco, and the State of North Carolina. He has been working with Linux and Open Source Software for almost 20 years.

dgrb on 23 Jun 2017

There is a hard link "gotcha" which IMHO is worth mentioning. If you use an editor which makes automatic backups - emacs certainly is one such - then you may end up with a new version of the edited file, while the backup is the linked copy, because the editor simply renames the file to the backup name (with emacs, test.c would be renamed test.c~) and the new version, when saved under the old name, is no longer linked. Symbolic links avoid this problem, so I tend to use them for source code where required.

#### [Nov 03, 2018] David Both

###### Nov 03, 2018 | opensource.com

In previous articles, including An introduction to Linux's EXT4 filesystem ; Managing devices in Linux ; An introduction to Linux filesystems ; and A Linux user's guide to Logical Volume Management , I have briefly mentioned an interesting feature of Linux filesystems that can make some tasks easier by providing access to files from multiple locations in the filesystem directory tree.

There are two types of Linux filesystem links: hard and soft. The difference between the two types of links is significant, but both types are used to solve similar problems.
They both provide multiple directory entries (or references) to a single file, but they do it quite differently. Links are powerful and add flexibility to Linux filesystems because everything is a file .

I have found, for instance, that some programs required a particular version of a library. When a library upgrade replaced the old version, the program would crash with an error specifying the name of the old, now-missing library. Usually, the only change in the library name was the version number. Acting on a hunch, I simply added a link to the new library but named the link after the old library name. I tried the program again and it worked perfectly. And, okay, the program was a game, and everyone knows the lengths that gamers will go to in order to keep their games running.

In fact, almost all applications are linked to libraries using a generic name with only a major version number in the link name, while the link points to the actual library file that also has a minor version number. In other instances, required files have been moved from one directory to another to comply with the Linux file specification, and there are links in the old directories for backwards compatibility with those programs that have not yet caught up with the new locations. If you do a long listing of the /lib64 directory, you can find many examples of both.

lrwxrwxrwx.  1 root root      36 Dec  8  2016 cracklib_dict.hwm -> ../../usr/share/cracklib/pw_dict.hwm
lrwxrwxrwx.  1 root root      36 Dec  8  2016 cracklib_dict.pwd -> ../../usr/share/cracklib/pw_dict.pwd
lrwxrwxrwx.  1 root root      36 Dec  8  2016 cracklib_dict.pwi -> ../../usr/share/cracklib/pw_dict.pwi
lrwxrwxrwx.  1 root root      27 Jun  9  2016 libaccountsservice.so.0 -> libaccountsservice.so.0.0.0
-rwxr-xr-x.  1 root root  288456 Jun  9  2016 libaccountsservice.so.0.0.0
lrwxrwxrwx   1 root root      15 May 17 11:47 libacl.so.1 -> libacl.so.1.1.0
-rwxr-xr-x   1 root root   36472 May 17 11:47 libacl.so.1.1.0
lrwxrwxrwx.  1 root root      15 Feb  4  2016 libaio.so.1 -> libaio.so.1.0.1
-rwxr-xr-x.  1 root root    6224 Feb  4  2016 libaio.so.1.0.0
-rwxr-xr-x.  1 root root    6224 Feb  4  2016 libaio.so.1.0.1
lrwxrwxrwx.  1 root root      30 Jan 16 16:39 libakonadi-calendar.so.4 -> libakonadi-calendar.so.4.14.26
-rwxr-xr-x.  1 root root  816160 Jan 16 16:39 libakonadi-calendar.so.4.14.26
lrwxrwxrwx.  1 root root      29 Jan 16 16:39 libakonadi-contact.so.4 -> libakonadi-contact.so.4.14.26

A few of the links in the /lib64 directory

The long listing of the /lib64 directory above shows that the first character in the filemode is the letter "l," which means that each is a soft or symbolic link.

Hard links

In An introduction to Linux's EXT4 filesystem , I discussed the fact that each file has one inode that contains information about that file, including the location of the data belonging to that file. Figure 2 in that article shows a single directory entry that points to the inode. Every file must have at least one directory entry that points to the inode that describes the file. The directory entry is a hard link, thus every file has at least one hard link.

In Figure 1 below, multiple directory entries point to a single inode. These are all hard links. I have abbreviated the locations of three of the directory entries using the tilde ( ~ ) convention for the home directory, so that ~ is equivalent to /home/user in this example. Note that the fourth directory entry is in a completely different directory, /home/shared , which might be a location for sharing files between users of the computer.
Figure 1

Hard links are limited to files contained within a single filesystem. "Filesystem" is used here in the sense of a partition or logical volume (LV) that is mounted on a specified mount point, in this case /home. This is because inode numbers are unique only within each filesystem, and a different filesystem, for example, /var or /opt, will have inodes with the same number as the inode for our file.

Because all the hard links point to the single inode that contains the metadata about the file, all of these attributes are part of the file, such as ownerships, permissions, and the total number of hard links to the inode, and cannot be different for each hard link. It is one file with one set of attributes. The only attribute that can be different is the file name, which is not contained in the inode. Hard links to a single file/inode located in the same directory must have different names, due to the fact that there can be no duplicate file names within a single directory.

The number of hard links for a file is displayed with the ls -l command. If you want to display the actual inode numbers, the command ls -li does that.

Symbolic (soft) links

The difference between a hard link and a soft link, also known as a symbolic link (or symlink), is that, while hard links point directly to the inode belonging to the file, soft links point to a directory entry, i.e., one of the hard links. Because soft links point to a hard link for the file and not the inode, they are not dependent upon the inode number and can work across filesystems, spanning partitions and LVs.

The downside to this is: If the hard link to which the symlink points is deleted or renamed, the symlink is broken. The symlink is still there, but it points to a hard link that no longer exists. Fortunately, the ls command highlights broken links with flashing white text on a red background in a long listing.

Lab project: experimenting with links

I think the easiest way to understand the use of and differences between hard and soft links is with a lab project that you can do. This project should be done in an empty directory as a non-root user. I created the ~/temp directory for this project, and you should, too. It creates a safe place to do the project and provides a new, empty directory to work in so that only files associated with this project will be located there.

Initial setup

First, create the temporary directory in which you will perform the tasks needed for this project. Ensure that the present working directory (PWD) is your home directory, then enter the following command.

mkdir temp

Change into ~/temp to make it the PWD with this command.

cd temp

To get started, we need to create a file we can link to. The following command does that and provides some content as well.

du -h > main.file.txt

Use the ls -l long list to verify that the file was created correctly. It should look similar to my results. Note that the file size is only 7 bytes, but yours may vary by a byte or two.

[dboth@david temp]$ ls -l
total 4
-rw-rw-r-- 1 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice the number "1" following the file mode in the listing. That number represents the number of hard links that exist for the file. For now, it should be 1 because we have not created any additional links to our test file.

Hard links create a new directory entry pointing to the same inode, so when hard links are added to a file, you will see the number of links increase. Ensure that the PWD is still ~/temp . Create a hard link to the file main.file.txt , then do another long list of the directory.

[dboth@david temp]$ ln main.file.txt link1.file.txt
[dboth@david temp]$ ls -l
total 8
-rw-rw-r-- 2 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 2 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice that both files have two links and are exactly the same size. The date stamp is also the same. This is really one file with one inode and two links, i.e., directory entries to it. Create a second hard link to this file and list the directory contents. You can create the link to either of the existing ones: link1.file.txt or main.file.txt .

[dboth@david temp]$ ln link1.file.txt link2.file.txt ; ls -l
total 16
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 link2.file.txt
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice that each new hard link in this directory must have a different name because two files -- really directory entries -- cannot have the same name within the same directory. Try to create another link with a target name the same as one of the existing ones.

[dboth@david temp]$ ln main.file.txt link2.file.txt

Clearly that does not work, because link2.file.txt already exists. So far, we have created only hard links in the same directory. So, create a link in your home directory, the parent of the temp directory in which we have been working so far.

[dboth@david temp]$ ln main.file.txt ../main.file.txt ; ls -l ../main*
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt

The ls command in the above listing shows that the main.file.txt file does exist in the home directory with the same name as the file in the temp directory. Of course, these are not different files; they are the same file with multiple links -- directory entries -- to the same inode.

To help illustrate the next point, add a file that is not a link.

[dboth@david temp]$ touch unlinked.file ; ls -l
total 12
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link2.file.txt
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt
-rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

Look at the inode number of the hard links and that of the new file using the -i option to the ls command.

[dboth@david temp]$ ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

Notice the number 657024 to the left of the file mode in the example above. That is the inode number, and all three file links point to the same inode. You can use the -i option to view the inode number for the link we created in the home directory as well, and that will also show the same value. The inode number of the file that has only one link is different from the others. Note that the inode numbers will be different on your system.

Let's change the size of one of the hard-linked files.

[dboth@david temp]$ df -h > link2.file.txt ; ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth    0 Jun 14 08:18 unlinked.file

The file size of all the hard-linked files is now larger than before. That is because there is really only one file that is linked to by multiple directory entries.

I know this next experiment will work on my computer because my /tmp directory is on a separate LV. If you have a separate LV or a filesystem on a different partition (if you're not using LVs), determine whether or not you have access to that LV or partition. If you don't, you can try to insert a USB memory stick and mount it. If one of those options works for you, you can do this experiment.
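If you are not sure whether /tmp (or any other directory) really lives on a separate filesystem on your machine, a quick check with df or findmnt will tell you. This is just a small illustration; both tools ship with essentially every Linux distribution:

$ df -h /tmp          # shows which filesystem and mount point /tmp belongs to
$ findmnt /tmp        # prints a mount entry only if /tmp is itself a mount point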

Try to create a link to one of the files in your ~/temp directory in /tmp (or wherever your different filesystem directory is located).

[dboth@david temp]$ ln link2.file.txt /tmp/link3.file.txt
ln: failed to create hard link '/tmp/link3.file.txt' => 'link2.file.txt': Invalid cross-device link

Why does this error occur? The reason is each separate mountable filesystem has its own set of inode numbers. Simply referring to a file by an inode number across the entire Linux directory structure can result in confusion because the same inode number can exist in each mounted filesystem.

There may be a time when you will want to locate all the hard links that belong to a single inode. You can find the inode number using the ls -li command. Then you can use the find command to locate all links with that inode number.

[dboth@david temp]$ find . -inum 657024
./main.file.txt

Note that the find command did not find all four of the hard links to this inode because we started at the current directory of ~/temp . The find command only finds files in the PWD and its subdirectories. To find all the links, we can use the following command, which specifies your home directory as the starting place for the search.

[dboth@david temp]$ find ~ -samefile main.file.txt
/home/dboth/temp/main.file.txt
/home/dboth/temp/link1.file.txt
/home/dboth/temp/link2.file.txt
/home/dboth/main.file.txt

You may see error messages if you do not have permissions as a non-root user. This command also uses the -samefile option instead of specifying the inode number. This works the same as using the inode number and can be easier if you know the name of one of the hard links.

Experimenting with soft links

As you have just seen, creating hard links is not possible across filesystem boundaries; that is, from a filesystem on one LV or partition to a filesystem on another. Soft links are a means to answer that problem with hard links. Although they can accomplish the same end, they are very different, and knowing these differences is important.

Let's start by creating a symlink in our ~/temp directory to start our exploration.

[dboth@david temp]$ ln -s link2.file.txt link3.file.txt ; ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link2.file.txt
658270 lrwxrwxrwx 1 dboth dboth   14 Jun 14 15:21 link3.file.txt -> link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth    0 Jun 14 08:18 unlinked.file

The hard links, those that have the inode number 657024 , are unchanged, and the number of hard links shown for each has not changed. The newly created symlink has a different inode, number 658270 . The soft link named link3.file.txt points to link2.file.txt . Use the cat command to display the contents of link3.file.txt . The file mode information for the symlink starts with the letter " l " which indicates that this file is actually a symbolic link.

The size of the symlink link3.file.txt is only 14 bytes in the example above. That is the length of its target text, link2.file.txt, which is the actual content stored in the symlink. The directory entry link3.file.txt does not point to the target's inode; it stores the name of another directory entry, which makes it useful for creating links that span filesystem boundaries.
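A quick way to confirm this yourself is to ask for the symlink's target and its size. This is a small sketch using readlink and GNU stat on the files from this lab:

$ readlink link3.file.txt            # prints the stored target path
link2.file.txt
$ stat -c %s link3.file.txt          # size of the symlink itself: the length of its target string
14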

So, let's create that link we tried before from the /tmp directory.

[dboth@david temp]$ ln -s /home/dboth/temp/link2.file.txt /tmp/link3.file.txt ; ls -l /tmp/link*
lrwxrwxrwx 1 dboth dboth 31 Jun 14 21:53 /tmp/link3.file.txt -> /home/dboth/temp/link2.file.txt

Deleting links

There are some other things that you should consider when you need to delete links or the files to which they point.

First, let's delete the link main.file.txt. Remember that every directory entry that points to an inode is simply a hard link.

[dboth@david temp]$ rm main.file.txt ; ls -li
total 8
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 link2.file.txt
658270 lrwxrwxrwx 1 dboth dboth   14 Jun 14 15:21 link3.file.txt -> link2.file.txt
657863 -rw-rw-r-- 1 dboth dboth    0 Jun 14 08:18 unlinked.file

The link main.file.txt was the first link created when the file was created. Deleting it now still leaves the original file and its data on the hard drive along with all the remaining hard links. To delete the file and its data, you would have to delete all the remaining hard links.
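Before removing anything, it can be handy to check how many hard links still point at the data. A hedged illustration using GNU stat and the lab files, at this point in the exercise:

$ stat -c '%h hard links, inode %i: %n' link1.file.txt
3 hard links, inode 657024: link1.file.txt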

[dboth@david temp]$ rm link2.file.txt ; ls -li
total 8
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 link1.file.txt
658270 lrwxrwxrwx 1 dboth dboth   14 Jun 14 15:21 link3.file.txt -> link2.file.txt
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth    0 Jun 14 08:18 unlinked.file

Notice what happens to the soft link. Deleting the hard link to which the soft link points leaves a broken link. On my system, the broken link is highlighted in colors and the target hard link is flashing. If the broken link needs to be fixed, you can create another hard link in the same directory with the same name as the old one, so long as not all the hard links have been deleted. You could also recreate the link itself, with the link maintaining the same name but pointing to one of the remaining hard links. Of course, if the soft link is no longer needed, it can be deleted with the rm command.

The unlink command can also be used to delete files and links. It is very simple and has no options, as the rm command does. It does, however, more accurately reflect the underlying process of deletion, in that it removes the link -- the directory entry -- to the file being deleted.

Final thoughts

I worked with both types of links for a long time before I began to understand their capabilities and idiosyncrasies. It took writing a lab project for a Linux class I taught to fully appreciate how links work. This article is a simplification of what I taught in that class, and I hope it speeds your learning curve.

David Both is a Linux and Open Source advocate who resides in Raleigh, North Carolina. He has been in the IT industry for over forty years and taught OS/2 for IBM, where he worked for over 20 years. While at IBM, he wrote the first training course for the original IBM PC in 1981. He has taught RHCE classes for Red Hat and has worked at MCI Worldcom, Cisco, and the State of North Carolina. He has been working with Linux and Open Source Software for almost 20 years.

dgrb on 23 Jun 2017

There is a hard link "gotcha" which IMHO is worth mentioning. If you use an editor which makes automatic backups - emacs certainly is one such - then you may end up with a new version of the edited file, while the backup is the linked copy, because the editor simply renames the file to the backup name (with emacs, test.c would be renamed test.c~) and the new version, when saved under the old name, is no longer linked. Symbolic links avoid this problem, so I tend to use them for source code where required.

#### [Nov 02, 2018] How to Recover from an Accidental SSH Disconnection on Linux RoseHosting

###### Nov 02, 2018 | www.rosehosting.com

... I can get a list of all previous screens using the command:

screen -ls

And this gives me the output as shown here:

As you can see, there is a screen session here with the name:

pts-0.test-centos-server

To reconnect to it, just type:

screen -r

And this will take you back to where you were before the SSH connection was terminated! It's an amazing tool that you need to use for all important operations as insurance against accidental terminations.

Manually Detaching Screens

When you break an SSH session, what actually happens is that the screen is automatically detached from it and exists independently. While this is great, you can also detach screens manually and have multiple screens existing at the same time.

For example, to detach a screen just type:

screen -d

And the current screen will be detached and preserved. However, all the processes inside it are still running, and all the states are preserved:

You can re-attach to a screen at any time using the "screen -r" command. To connect to a specific screen instead of the most recent, use:

screen -r [screenname]

Changing the Screen Names to Make Them More Relevant

By default, the screen names don't mean much. And when you have a bunch of them present, you won't know which screens contain which processes.
Fortunately, renaming a screen is easy when inside one. Just type:

ctrl+a :

We saw in the previous article that "ctrl+a" is the trigger condition for screen commands. The colon (:) will take you to the bottom of the screen where you can type commands. To rename, use:

sessionname [newscreenname]

And now when you detach the screen, it will show up with the new name. Now you can have as many screens as you want without getting confused about which one is which!

#### [Oct 30, 2018] 10 tr Command Examples in Linux

###### Oct 30, 2018 | www.tecmint.com

8. Here is an example of breaking a single line of words (a sentence) into multiple lines, where each word appears on a separate line.

$ echo "My UID is $UID"
My UID is 1000

$ echo "My UID is $UID" | tr " " "\n"
My
UID
is
1000

9. Related to the previous example, you can also translate multiple lines of words into a single sentence as shown.

$ cat uid.txt

My
UID
is
1000

$ tr "\n" " " < uid.txt
My UID is 1000

10. It is also possible to translate just a single character, for instance a space into a ":" character, as follows.

$ echo "Tecmint.com =>Linux-HowTos,Guides,Tutorials" | tr " " ":"

Tecmint.com:=>Linux-HowTos,Guides,Tutorials


There are several escape sequences and character classes you can use with tr; for more information, see the tr man page.
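For instance, character classes and the -s (squeeze) option are often combined with the translations shown above. A couple of extra, generic illustrations:

$ echo "HELLO   World" | tr -s ' '                  # squeeze runs of spaces into one
HELLO World
$ echo "HELLO World" | tr '[:upper:]' '[:lower:]'   # translate using character classes
hello world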

... ... ...

#### [Oct 29, 2018] Getting all the matches with 'grep -f' option

##### Perverted example, but interesting question.
###### Oct 29, 2018 | stackoverflow.com

Arturo ,Mar 24, 2017 at 8:59

I would like to find all the matches of the text I have in one file ('file1.txt') that are found in another file ('file2.txt'), using the grep option -f, which tells grep to read the expressions to be found from a file.

'file1.txt'

a

a

'file2.txt'

a

When I run the command:

grep -f file1.txt file2.txt -w

I get the output 'a' only once; instead I would like to get it twice, because it occurs twice in my 'file1.txt' file. Is there a way to get grep (or any other unix/linux tool) to output a match for each line it reads? Thanks in advance. Arturo

RomanPerekhrest ,Mar 24, 2017 at 9:02

the matches of the text - some exact text? should it compare line to line? – RomanPerekhrest Mar 24 '17 at 9:02

Arturo ,Mar 24, 2017 at 9:04

Yes it contains exact match. I added the -w options, following your input. Yes, it is a comparison line by line. – Arturo Mar 24 '17 at 9:04

Remko ,Mar 24, 2017 at 9:19

Grep works as designed, giving only one output line. You could use another approach:

while IFS= read -r pattern; do
    grep -e "$pattern" file2.txt
done < file1.txt
grep -e $pattern file2.txt done < file1.txt  This would use every line in file1.txt as a pattern for the grep, thus resulting in the output you're looking for. Arturo ,Mar 24, 2017 at 9:30 That did the trick!. Thank you. And it is even much faster than my previous grep command. – Arturo Mar 24 '17 at 9:30 ar7 ,Mar 24, 2017 at 9:12 When you use grep -f pattern.txt file.txt  It means match the pattern found in pattern.txt in the file file.txt . It is giving you only one output because that is all is there in the second file. Try interchanging the files, grep -f file2.txt file1.txt -w  Does this answer your question? Arturo ,Mar 24, 2017 at 9:17 I understand that, but still I would like to find a way to print a match each time a pattern (even a repeated one) from 'pattern.txt' is found in 'file.txt'. Even a tool or a script rather then 'grep -f' would suffice. – Arturo Mar 24 '17 at 9:17 #### [Oct 22, 2018] linux - If I rm -rf a symlink will the data the link points to get erased, to ##### Notable quotes: ##### "... Put it in another words, those symlink-files will be deleted. The files they "point"/"link" to will not be touch. ..." ###### Oct 22, 2018 | unix.stackexchange.com user4951 ,Jan 25, 2013 at 2:40 This is the contents of the /home3 directory on my system: ./ backup/ hearsttr@ lost+found/ randomvi@ sexsmovi@ ../ freemark@ investgr@ nudenude@ romanced@ wallpape@ I want to clean this up but I am worried because of the symlinks, which point to another drive. If I say rm -rf /home3 will it delete the other drive? John Sui rm -rf /home3 will delete all files and directory within home3 and home3 itself, which include symlink files, but will not "follow"(de-reference) those symlink. Put it in another words, those symlink-files will be deleted. The files they "point"/"link" to will not be touch. #### [Oct 22, 2018] Does rm -rf follow symbolic links? ###### Jan 25, 2012 | superuser.com I have a directory like this: $ ls -l
total 899166
drwxr-xr-x 12 me scicomp       324 Jan 24 13:47 data
-rw-r--r--  1 me scicomp     84188 Jan 24 13:47 lod-thin-1.000000-0.010000-0.030000.rda
drwxr-xr-x  2 me scicomp       808 Jan 24 13:47 log
lrwxrwxrwx  1 me scicomp        17 Jan 25 09:41 msg -> /home/me/msg


And I want to remove it using rm -r .

However I'm scared rm -r will follow the symlink and delete everything in that directory (which is very bad).

I can't find anything about this in the man pages. What would be the exact behavior of running rm -rf from a directory above this one?

LordDoskias ,Jan 25, 2012 at 16:43

How hard is it to create a dummy dir with a symlink pointing to a dummy file and execute the scenario? Then you will know for sure how it works! – LordDoskias Jan 25 '12 at 16:43

hakre ,Feb 4, 2015 at 13:09

X-Ref: If I rm -rf a symlink will the data the link points to get erased, too? ; Deleting a folder that contains symlinkshakre Feb 4 '15 at 13:09

Susam Pal ,Jan 25, 2012 at 16:47

Example 1: Deleting a directory containing a soft link to another directory.
susam@nifty:~/so$ mkdir foo bar
susam@nifty:~/so$ touch bar/a.txt
susam@nifty:~/so$ ln -s /home/susam/so/bar/ foo/baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── foo
    └── baz -> /home/susam/so/bar/

3 directories, 1 file
susam@nifty:~/so$ rm -r foo
susam@nifty:~/so$ tree
.
└── bar
    └── a.txt

1 directory, 1 file
susam@nifty:~/so$

So, we see that the target of the soft-link survives.

Example 2: Deleting a soft link to a directory

susam@nifty:~/so$ ln -s /home/susam/so/bar baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── baz -> /home/susam/so/bar

2 directories, 1 file
susam@nifty:~/so$ rm -r baz
susam@nifty:~/so$ tree
.
└── bar
    └── a.txt

1 directory, 1 file
susam@nifty:~/so$


Only, the soft link is deleted. The target of the soft-link survives.

Example 3: Attempting to delete the target of a soft-link

susam@nifty:~/so$ ln -s /home/susam/so/bar baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── baz -> /home/susam/so/bar

2 directories, 1 file
susam@nifty:~/so$ rm -r baz/
rm: cannot remove 'baz/': Not a directory
susam@nifty:~/so$ tree
.
├── bar
└── baz -> /home/susam/so/bar

2 directories, 0 files


The file in the target of the symbolic link does not survive.

The above experiments were done on a Debian GNU/Linux 9.0 (stretch) system.

Wyrmwood ,Oct 30, 2014 at 20:36

rm -rf baz/* will remove the contents – Wyrmwood Oct 30 '14 at 20:36

Buttle Butkus ,Jan 12, 2016 at 0:35

Yes, if you do rm -rf [symlink], then the contents of the original directory will be obliterated! Be very careful. – Buttle Butkus Jan 12 '16 at 0:35

frnknstn ,Sep 11, 2017 at 10:22

Your example 3 is incorrect! On each system I have tried, the file a.txt will be removed in that scenario. – frnknstn Sep 11 '17 at 10:22

Susam Pal ,Sep 11, 2017 at 15:20

@frnknstn You are right. I see the same behaviour you mention on my latest Debian system. I don't remember on which version of Debian I performed the earlier experiments. In my earlier experiments on an older version of Debian, either a.txt must have survived in the third example or I must have made an error in my experiment. I have updated the answer with the current behaviour I observe on Debian 9 and this behaviour is consistent with what you mention. – Susam Pal Sep 11 '17 at 15:20

Ken Simon ,Jan 25, 2012 at 16:43

Your /home/me/msg directory will be safe if you rm -rf the directory from which you ran ls. Only the symlink itself will be removed, not the directory it points to.

The only thing I would be cautious of, would be if you called something like "rm -rf msg/" (with the trailing slash.) Do not do that because it will remove the directory that msg points to, rather than the msg symlink itself.

> ,Jan 25, 2012 at 16:54

"The only thing I would be cautious of, would be if you called something like "rm -rf msg/" (with the trailing slash.) Do not do that because it will remove the directory that msg points to, rather than the msg symlink itself." - I don't find this to be true. See the third example in my response below. – Susam Pal Jan 25 '12 at 16:54

Andrew Crabb ,Nov 26, 2013 at 21:52

I get the same result as @Susam ('rm -r symlink/' does not delete the target of symlink), which I am pleased about as it would be a very easy mistake to make. – Andrew Crabb Nov 26 '13 at 21:52


rm should remove files and directories. If the file is a symbolic link, the link is removed, not the target; rm will not dereference a symbolic link. For example, consider what the behavior should be when deleting a 'broken link': rm exits with 0, not with a non-zero status to indicate failure, because it successfully removes the link itself.
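That behavior is easy to verify with a throwaway broken symlink. A small sketch; the file name is made up:

$ ln -s /nonexistent broken.link   # create a deliberately dangling symlink
$ rm broken.link ; echo $?         # rm removes the link itself and reports success
0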

#### [Oct 18, 2018] 'less' command clearing screen upon exit - how to switch it off?

##### "... To prevent less from clearing the screen upon exit, use -X . ..."
###### Oct 18, 2018 | superuser.com

Wojciech Kaczmarek ,Feb 9, 2010 at 11:21

How to force the less program to not clear the screen upon exit?

I'd like it to behave like git log command:

• it leaves the recently seen page on screen upon exiting
• it does not exit the less even if the content fits on one screen (try git log -1 )

Any ideas? I haven't found any suitable less options nor env variables in a manual, I suspect it's set via some env variable though.

sleske ,Feb 9, 2010 at 11:59

To prevent less from clearing the screen upon exit, use -X .

From the manpage:

-X or --no-init

Disables sending the termcap initialization and deinitialization strings to the terminal. This is sometimes desirable if the deinitialization string does something unnecessary, like clearing the screen.

As to less exiting if the content fits on one screen, that's option -F :

-F or --quit-if-one-screen

Causes less to automatically exit if the entire file can be displayed on the first screen.

-F is not the default though, so it's likely preset somewhere for you. Check the env var LESS .

markpasc ,Oct 11, 2010 at 3:44

This is especially annoying if you know about -F but not -X , as then moving to a system that resets the screen on init will make short files simply not appear, for no apparent reason. This bit me with ack when I tried to take my ACK_PAGER='less -RF' setting to the Mac. Thanks a bunch! – markpasc Oct 11 '10 at 3:44

sleske ,Oct 11, 2010 at 8:45

@markpasc: Thanks for pointing that out. I would not have realized that this combination would cause this effect, but now it's obvious. – sleske Oct 11 '10 at 8:45

Michael Goldshteyn ,May 30, 2013 at 19:28

This is especially useful for the man pager, so that man pages do not disappear as soon as you quit less with the 'q' key. That is, you scroll to the position in a man page that you are interested in only for it to disappear when you quit the less pager in order to use the info. So, I added: export MANPAGER='less -s -X -F' to my .bashrc to keep man page info up on the screen when I quit less, so that I can actually use it instead of having to memorize it. – Michael Goldshteyn May 30 '13 at 19:28

Michael Burr ,Mar 18, 2014 at 22:00

It kinda sucks that you have to decide when you start less how it must behave when you're going to exit. – Michael Burr Mar 18 '14 at 22:00

Derek Douville ,Jul 11, 2014 at 19:11

If you want any of the command-line options to always be default, you can add to your .profile or .bashrc the LESS environment variable. For example:
export LESS="-XF"


will always apply -X -F whenever less is run from that login session.

Sometimes commands are aliased (even by default in certain distributions). To check for this, type

alias


without arguments to see if it got aliased with options that you don't want. To run the actual command in your $PATH instead of an alias, just preface it with a backslash:

\less

To see if a LESS environment variable is set in your environment and affecting behavior:

echo $LESS


dotancohen ,Sep 2, 2014 at 10:12

In fact, I add export LESS="-XFR" so that the colors show through less as well. – dotancohen Sep 2 '14 at 10:12

Giles Thomas ,Jun 10, 2015 at 12:23

Thanks for that! -XF on its own was breaking the output of git diff , and -XFR gets the best of both worlds -- no screen-clearing, but coloured git diff output. – Giles Thomas Jun 10 '15 at 12:23
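If you mainly care about git, the same flags can also be set through git's own pager setting instead of the global LESS variable. One possible way to do it; adjust the flags to taste:

$ git config --global core.pager 'less -XFR'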

#### [Oct 18, 2018] Isn't less just more

##### Highly recommended!
###### Oct 18, 2018 | unix.stackexchange.com

Bauna ,Aug 18, 2010 at 3:07

less is a lot more than more , for instance you have a lot more functionality:
g: go top of the file
G: go bottom of the file
/: search forward
?: search backward
N: show line number
: goto line
F: similar to tail -f, stop with ctrl+c
S: split lines


And I don't remember more ;-)

törzsmókus ,Feb 19 at 13:19

h : everything you don't remember ;) – törzsmókus Feb 19 at 13:19

KeithB ,Aug 18, 2010 at 0:36

There are a couple of things that I do all the time in less that don't work in more (at least the versions on the systems I use). One is using G to go to the end of the file, and g to go to the beginning. This is useful for log files, when you are looking for recent entries at the end of the file. The other is search, where less highlights the match, while more just brings you to the section of the file where the match occurs, but doesn't indicate where it is.
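As a side note to the answer above, less can also be started at the end of a file, or told to follow it as it grows, straight from the command line. The log file path is only an example:

$ less +G /var/log/syslog     # open the file positioned at its end
$ less +F /var/log/syslog     # follow appended data, like tail -f; Ctrl+C stops following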

geoffc ,Sep 8, 2010 at 14:11

Less has a lot more functionality.

You can use v to jump into the current $EDITOR. You can convert to tail -f mode with f, as well as all the other tips others offered.

Ubuntu still has distinct less/more bins. At least mine does, or the more command is sending different arguments to less. In any case, to see the difference, find a file that has more rows than you can see at one time in your terminal. Type cat, then the file name. It will just dump the whole file. Type more, then the file name. If on Ubuntu, or at least my version (9.10), you'll see the first screen, then --More--(27%), which means there's more to the file, and you've seen 27% so far. Press space to see the next page. less allows moving line by line, back and forth, plus searching and a whole bunch of other stuff. Basically, use less. You'll probably never need more for anything. I've used less on huge files and it seems OK. I don't think it does crazy things like load the whole thing into memory (cough Notepad). Showing line numbers could take a while, though, with huge files.

#### [Oct 18, 2018] What are the differences between most, more and less

##### Highly recommended!
###### Jun 29, 2013 | unix.stackexchange.com

Smith John ,Jun 29, 2013 at 13:16

more

more is an old utility. When the text passed to it is too large to fit on one screen, it pages it. You can scroll down but not up. Some systems hardlink more to less, providing users with a strange hybrid of the two programs that looks like more and quits at the end of the file like more, but has some less features such as backwards scrolling. This is a result of less's more compatibility mode. You can enable this compatibility mode temporarily with LESS_IS_MORE=1 less ... . more passes raw escape sequences by default. Escape sequences tell your terminal which colors to display.

less

less was written by a man who was fed up with more's inability to scroll backwards through a file. He turned less into an open source project, and over time various individuals added new features to it. less is massive now. That's why some small embedded systems have more but not less. For comparison, less's source is over 27000 lines long; more implementations are generally only a little over 2000 lines long. In order to get less to pass raw escape sequences, you have to pass it the -r flag. You can also tell it to only pass ANSI escape characters by passing it the -R flag.

most

most is supposed to be more than less. It can display multiple files at a time. By default, it truncates long lines instead of wrapping them and provides a left/right scrolling mechanism. most's website has no information about most's features. Its manpage indicates that it is missing at least a few less features, such as log-file writing (you can use tee for this though) and external command running. By default, most uses strange non-vi-like keybindings. man most | grep '\<vi.?\>' doesn't return anything, so it may be impossible to put most into a vi-like mode. most has the ability to decompress gunzip-compressed files before reading. Its status bar has more information than less's. most passes raw escape sequences by default.

tifo ,Oct 14, 2014 at 8:44

Short answer: Just use less and forget about more.

Longer version: more is an old utility. You can't browse stepwise with more; you can use space to browse page-wise, or enter to go line by line, and that is about it. less is more, plus additional features.
You can browse page-wise or line-wise, both up and down, and search.

Jonathan.Brink ,Aug 9, 2015 at 20:38

If "more" is lacking for you and you know a few vi commands, use "less" – Jonathan.Brink Aug 9 '15 at 20:38

Wilko Fokken ,Jan 30, 2016 at 20:31

There is one single application whereby I prefer more to less: to check my LATEST modified log files (in /var/log/), I use ls -AltF | more. While less deletes the screen after exiting with q, more leaves those files and directories listed by ls on the screen, sparing me memorizing their names for examination. (Should anybody know a parameter or configuration enabling less to keep its text after exiting, that would render this post obsolete.)

Jan Warchoł ,Mar 9, 2016 at 10:18

The parameter you want is -X (long form: --no-init). From less' manpage: Disables sending the termcap initialization and deinitialization strings to the terminal. This is sometimes desirable if the deinitialization string does something unnecessary, like clearing the screen.

#### [Oct 16, 2018] Taking Command of the Terminal with GNU Screen

##### It is available from the EPEL repository; to launch it, type byobu-screen
##### Notable quotes:
##### "... Note that byobu doesn't actually do anything to screen itself. It's an elaborate (and pretty groovy) screen configuration customization. You could do something similar on your own by hacking your ~/.screenrc, but the byobu maintainers have already done it for you. ..."
###### Oct 16, 2018 | www.linux.com

Can I have a Copy of That?

Want a quick and dirty way to take notes of what's on your screen? Yep, there's a command for that. Run Ctrl-a h and screen will save a text file called "hardcopy.n" in your current directory that has all of the existing text. Want to get a quick snapshot of the top output on a system? Just run Ctrl-a h and there you go.

You can also save a log of what's going on in a window by using Ctrl-a H. This will create a file called screenlog.0 in the current directory. Note that it may have limited usefulness if you're doing something like editing a file in Vim, and the output can look pretty odd if you're doing much more than entering a few simple commands. To close a screenlog, use Ctrl-a H again.

Note if you want a quick glance at the system info, including hostname, system load, and system time, you can get that with Ctrl-a t.

Simplifying Screen with Byobu

If the screen commands seem a bit too arcane to memorize, don't worry. You can tap the power of GNU Screen in a slightly more user-friendly package called byobu. Basically, byobu is a souped-up screen profile originally developed for Ubuntu. Not using Ubuntu? No problem, you can find RPMs or a tarball with the profiles to install on other Linux distros or Unix systems that don't feature a native package.

Note that byobu doesn't actually do anything to screen itself. It's an elaborate (and pretty groovy) screen configuration customization. You could do something similar on your own by hacking your ~/.screenrc, but the byobu maintainers have already done it for you.

Since most of byobu is self-explanatory, I won't go into great detail about using it. You can launch byobu by running byobu. You'll see a shell prompt plus a few lines at the bottom of the screen with additional information about your system, such as the system CPUs, uptime, and system time. To get a quick help menu, hit F9 and then use the Help entry. Most of the commands you would use most frequently are assigned F keys as well.
Creating a new window is F2, cycling between windows is F3 and F4, and detaching from a session is F6. To re-title a window use F8, and if you want to lock the screen use F12.

The only downside to byobu is that it's not going to be on all systems, and in a pinch it may help to know your way around plain-vanilla screen rather than byobu. For an easy reference, here's a list of the most common screen commands that you'll want to know. This isn't exhaustive, but it should be enough for most users to get started using screen happily for most use cases.

• Start Screen: screen
• Detach Screen: Ctrl-a d
• Re-attach Screen: screen -x or screen -x PID
• Split Horizontally: Ctrl-a S
• Split Vertically: Ctrl-a |
• Move Between Windows: Ctrl-a Tab
• Name Session: Ctrl-a A
• Log Session: Ctrl-a H
• Note Session: Ctrl-a h

Finally, if you want help on GNU Screen, use the man page (man screen) and its built-in help with Ctrl-a :help. Screen has quite a few advanced options that are beyond an introductory tutorial, so be sure to check out the man page when you have the basics down.

#### [Oct 16, 2018] How To Use Linux Screen

###### Oct 16, 2018 | linuxize.com

Working with Linux Screen Windows

When you start a new screen session, by default it creates a single window with a shell in it. You can have multiple windows inside a Screen session. To create a new window with a shell, type Ctrl+a c; the first available number from the range 0...9 will be assigned to it.

Below are some of the most common commands for managing Linux Screen windows:

• Ctrl+a c  Create a new window (with shell)
• Ctrl+a "  List all windows
• Ctrl+a 0  Switch to window 0 (by number)
• Ctrl+a A  Rename the current window
• Ctrl+a S  Split current region horizontally into two regions
• Ctrl+a |  Split current region vertically into two regions
• Ctrl+a tab  Switch the input focus to the next region
• Ctrl+a Ctrl+a  Toggle between current and previous region
• Ctrl+a Q  Close all regions but the current one
• Ctrl+a X  Close the current region

Detach from Linux Screen Session

You can detach from the screen session at any time by typing Ctrl+a d. The program running in the screen session will continue to run after you detach from the session.

To resume your screen session use the following command:

screen -r

In case you have multiple screen sessions running on your machine, you will need to append the screen session ID after the r switch. To find the session ID, list the currently running screen sessions with:

screen -ls

There are screens on:
    10835.pts-0.linuxize-desktop   (Detached)
    10366.pts-0.linuxize-desktop   (Detached)
2 Sockets in /run/screens/S-linuxize.

If you want to restore screen 10835.pts-0, then type the following command:

screen -r 10835

When screen is started, it reads its configuration parameters from /etc/screenrc and ~/.screenrc if the file is present. We can modify the default Screen settings according to our own preferences using the .screenrc file. Here is a sample ~/.screenrc configuration with a customized status line and a few additional options:

# Turn off the welcome message
startup_message off
# Disable visual bell
vbell off
# Set scrollback buffer to 10000
defscrollback 10000
# Customize the status line
hardstatus alwayslastline
hardstatus string '%{= kG}[ %{G}%H %{g}][%= %{= kw}%?%-Lw%?%{r}(%{W}%n*%f%t%?(%u)%?%{r})%{w}%?%+
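As a small practical addition to the articles above, screen sessions can also be given meaningful names when they are started, which makes reattaching less error-prone. The session name below is only an example:

$ screen -S backup        # start a session named "backup"
$ screen -ls              # the name shows up in the session list
$ screen -r backup        # reattach by name instead of by PID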
#### [Oct 14, 2018] Linux and Unix cut command tutorial with examples by George Ornbo

###### Jul 19, 2016 | shapeshed.com

... ... ...

How to cut by complement pattern

To cut by complement, use the --complement option. Note this option is not available on the BSD version of cut. The --complement option selects the inverse of the selection passed to cut. In the following example the -c option is used to select the first character. Because the --complement option is also passed to cut, the second and third characters are cut.

echo 'foo' | cut --complement -c 1
oo

How to modify the output delimiter

To modify the output delimiter use the --output-delimiter option. Note that this option is not available on the BSD version of cut. In the following example a semi-colon is converted to a space and the first, third and fourth fields are selected.

echo 'how;now;brown;cow' | cut -d ';' -f 1,3,4 --output-delimiter=' '
how brown cow

#### [Oct 12, 2018] How To Create And Maintain Your Own Man Pages by sk

###### Oct 09, 2018 | www.ostechnix.com

We have already discussed a few good alternatives to Man pages. Those alternatives are mainly used for learning concise Linux command examples without having to go through the comprehensive man pages. If you're looking for a quick and dirty way to easily and quickly learn a Linux command, those alternatives are worth trying. Now, you might be thinking: how can I create my own man-like help pages for a Linux command? This is where "Um" comes in handy. Um is a command line utility, used to easily create and maintain your own Man pages that contain only what you've learned about a command so far.

By creating your own alternative to man pages, you can avoid lots of unnecessary, comprehensive details in a man page and include only what is necessary to keep in mind. If you ever wanted to create your own set of man-like pages, Um will definitely help. In this brief tutorial, we will see how to install the "Um" command line utility and how to create our own man pages.

Installing Um

Um is available for Linux and Mac OS. At present, it can only be installed using the Linuxbrew package manager on Linux systems. Refer to the following link if you haven't installed Linuxbrew yet. Once Linuxbrew is installed, run the following command to install the Um utility.

$ brew install sinclairtarget/wst/um


If you see output something like below, congratulations! Um has been installed and is ready to use.

[...]
==> Installing sinclairtarget/wst/um
-=#=# # #
######################################################################## 100.0%
==> Caveats
Bash completion has been installed to:
/home/linuxbrew/.linuxbrew/etc/bash_completion.d
==> Summary
/home/linuxbrew/.linuxbrew/Cellar/um/4.0.0: 714 files, 1.3MB, built in 35 seconds
==> Caveats
==> openssl
A CA file has been bootstrapped using certificates from the SystemRoots
the System keychain), place .pem files in
/home/linuxbrew/.linuxbrew/etc/openssl/certs

and run
/home/linuxbrew/.linuxbrew/opt/openssl/bin/c_rehash
==> ruby
Emacs Lisp files have been installed to:
/home/linuxbrew/.linuxbrew/share/emacs/site-lisp/ruby
==> um
Bash completion has been installed to:
/home/linuxbrew/.linuxbrew/etc/bash_completion.d


Before you start making your own man pages, you need to enable bash completion for Um.

To do so, open your ~/.bash_profile file:

$ nano ~/.bash_profile

And, add the following lines to it:

if [ -f $(brew --prefix)/etc/bash_completion.d/um-completion.sh ]; then
    . $(brew --prefix)/etc/bash_completion.d/um-completion.sh
fi

Save and close the file. Run the following command to update the changes.

$ source ~/.bash_profile


All done. Let us go ahead and create our first man page.

###### Create And Maintain Your Own Man Pages

Let us say, you want to create your own man page for "dpkg" command. To do so, run:

$ um edit dpkg

The above command will open a markdown template in your default editor:

Create a new man page

My default editor is Vi, so the above command opens it in the Vi editor. Now, start adding everything you want to remember about the "dpkg" command in this template. Here is a sample:

Add contents in dpkg man page

As you see in the above output, I have added Synopsis, Description and two options for the dpkg command. You can add as many sections as you want in the man pages. Make sure you have given proper and easily understandable titles for each section. Once done, save and quit the file (if you use the Vi editor, press the ESC key and type :wq). Finally, view your newly created man page using the command:

$ um dpkg

View dpkg man page

As you can see, the dpkg man page looks exactly like the official man pages. If you want to edit and/or add more details to a man page, run the same command again and add the details.

$ um edit dpkg

To view the list of newly created man pages using Um, run:

$ um list

All man pages will be saved under a directory named .um in your home directory

Just in case, if you don't want a particular page, simply delete it as shown below.

$ um rm dpkg

To view the help section and all available general options, run:

$ um --help
usage: um <page name>
um <sub-command> [ARGS...]

The first form is equivalent to um read <page name>.

Subcommands:
um (l)ist                 List the available pages for the current topic.
um (r)ead <page name>     Read the given page under the current topic.
um (e)dit <page name>     Create or edit the given page under the current topic.
um rm <page name>         Remove the given page.
um (t)opic [topic]        Get or set the current topic.
um topics                 List all topics.
um (c)onfig [config key]  Display configuration environment.
um (h)elp [sub-command]   Display this help message, or the help message for a sub-command.

Configure Um

To view the current configuration, run:

$ um config
Options prefixed by '*' are set in /home/sk/.um/umconfig.
editor = vi
pager = less
pages_directory = /home/sk/.um/pages
default_topic = shell
pages_ext = .md

In this file, you can edit and change the values for the pager, editor, default_topic, pages_directory, and pages_ext options as you wish. Say, for example, you want to save the newly created Um pages in your Dropbox folder: simply change the value of the pages_directory directive and point it to the Dropbox folder in the ~/.um/umconfig file.

pages_directory = /Users/myusername/Dropbox/um

And, that's all for now. Hope this was useful.

#### [Sep 27, 2018] bash - Conflict between pushd . and cd - - Unix Linux Stack Exchange

###### Sep 27, 2018 | unix.stackexchange.com

Bernhard ,Feb 21, 2012 at 12:07

I am a happy user of the cd - command to go to the previous directory. At the same time I like pushd . and popd . However, when I want to remember the current working directory by means of pushd . , I lose the possibility to go to the previous directory by cd - (as pushd . also performs cd . ). How can I use pushd and still be able to use cd - ? By the way: GNU bash, version 4.1.7(1).

Patrick ,Feb 21, 2012 at 12:39

Why not use pwd to figure out where you are? – Patrick Feb 21 '12 at 12:39

Bernhard ,Feb 21, 2012 at 12:46

I don't understand your question. The point is that pushd breaks the behavior of cd - that I want (or expect). I know perfectly well in which directory I am, but I want to increase the speed with which I change directories :) – Bernhard Feb 21 '12 at 12:46

jofel ,Feb 21, 2012 at 14:39

Do you know zsh ? It has really nice features like AUTO_PUSHD. – jofel Feb 21 '12 at 14:39

Theodore R. Smith ,Feb 21, 2012 at 16:26

+1 Thank you for teaching me about cd -! For most of a decade, I've been doing $ cd $OLDPWD instead. – Theodore R. Smith Feb 21 '12 at 16:26

Patrick ,Feb 22, 2012 at 1:58

@bernhard Oh, I misunderstood what you were asking. You were wanting to know how to store the current working directory. I was interpreting it as you wanted to remember (as in you forgot) your current working directory. – Patrick Feb 22 '12 at 1:58

Wojtek Rzepala ,Feb 21, 2012 at 12:32

You can use something like this:

push() {
    if [ "$1" = . ]; then
        old=$OLDPWD
        current=$PWD
        builtin pushd .
        cd "$old"
        cd "$current"
    else
        builtin pushd "$1"
    fi
}

If you name it pushd, then it will have precedence over the built-in, as functions are evaluated before built-ins. You need the variables old and current because overwriting OLDPWD will make it lose its special meaning.

Bernhard ,Feb 21, 2012 at 12:41

This works perfectly for me. Is there no such feature in the built-in pushd? As I would always prefer a standard solution. Thanks for this function however, maybe I will leave out the argument and its checking at some point. – Bernhard Feb 21 '12 at 12:41

bsd ,Feb 21, 2012 at 12:53

There is no such feature in the builtin. Your own function is the best solution because pushd and popd both call cd, modifying $OLDPWD, hence the source of your problem. I would name the function saved and use it in the context you like too, that of saving cwd. – bsd Feb 21 '12 at 12:53
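A quick illustration of how the function above behaves once it is defined in your shell (the directories are hypothetical; the point is that cd - keeps working after push .):

$ cd /tmp
$ cd /var/log      # $OLDPWD is now /tmp
$ push .           # remembers /var/log on the directory stack
$ cd -             # still returns to /tmp, as expected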

Wildcard ,Mar 29, 2016 at 23:08

You might also want to unset old and current after you're done with them. – Wildcard Mar 29 '16 at 23:08

Kevin ,Feb 21, 2012 at 16:11

A slightly more concise version of Wojtek's answer :
pushd () {
    if [ "$1" = . ]; then
        cd -
        builtin pushd -
    else
        builtin pushd "$1"
    fi
}


By naming the function pushd, you can use pushd as normal; you don't need to remember to use the function name.


Kevin's answer is excellent. I've written up some details about what's going on, in case people are looking for a better understanding of why their script is necessary to solve the problem.

The reason that pushd . breaks the behavior of cd - will be apparent if we dig into the workings of cd and the directory stack. Let's push a few directories onto the stack:

$ mkdir dir1 dir2 dir3
$ pushd dir1
~/dir1 ~
$ pushd ../dir2
~/dir2 ~/dir1 ~
$ pushd ../dir3
~/dir3 ~/dir2 ~/dir1 ~
$ dirs -v
0       ~/dir3
1       ~/dir2
2       ~/dir1
3       ~

Now we can try cd - to jump back a directory:

$ cd -
$ dirs -v
0       ~/dir2
1       ~/dir2
2       ~/dir1
3       ~

We can see that cd - jumped us back to the previous directory, replacing stack ~0 with the directory we jumped into. We can jump back with cd - again:

$ cd -
$ dirs -v
0       ~/dir3
1       ~/dir2
2       ~/dir1
3       ~

Notice that we jumped back to our previous directory, even though the previous directory wasn't actually listed in the directory stack. This is because cd uses the environment variable $OLDPWD to keep track of the previous directory:

$ echo $OLDPWD


If we do pushd . we will push an extra copy of the current directory onto the stack:

$ pushd .
~/dir3 ~/dir3 ~/dir2 ~/dir1 ~
$ dirs -v
0       ~/dir3
1       ~/dir3
2       ~/dir2
3       ~/dir1
4       ~


In addition to making an extra copy of the current directory in the stack, pushd . has updated $OLDPWD :

$ echo $OLDPWD
/home/username/dir3

So cd - has lost its useful history, and will now just move you to the current directory - accomplishing nothing.

#### [Sep 26, 2018] bash - removing or clearing stack of popd-pushd paths

###### Sep 26, 2018 | unix.stackexchange.com

chrisjlee ,Feb 9, 2012 at 6:24

After pushd-ing too many times, I want to clear the whole stack of paths. How would I popd all the items in the stack? I'd like to popd without needing to know how many are in the stack. The bash manual doesn't seem to cover this. Why do I need to know this? I'm fastidious and like to clean out the stack.

jw013 ,Feb 9, 2012 at 6:39

BTW, the complete bash manual is over at gnu.org. If you use the all on one page version, it may be easier to find stuff there. – jw013 Feb 9 '12 at 6:39

jw013 ,Feb 9, 2012 at 6:37

dirs -c is what you are looking for.

Eliran Malka ,Mar 23, 2017 at 15:20

this does empty the stack, but does not restore the working directory from the stack bottom – Eliran Malka Mar 23 '17 at 15:20

Eliran Malka ,Mar 23, 2017 at 15:37

In order to both empty the stack and restore the working directory from the stack bottom, either:

• retrieve that directory from dirs, change to that directory, and then clear the stack:

cd "$(dirs -l -0)" && dirs -c


The -l option here will list full paths, to make sure we don't fail if we try to cd into ~ , and the -0 retrieves the first entry from the stack bottom.

@jw013 suggested making this command more robust, by avoiding path expansions:

pushd -0 && dirs -c

• or, popd until you encounter an error (which is the status of a popd call when the directory stack is empty):
while (( $? == 0 )); do popd; done

Chuck Wilbur ,Nov 14, 2017 at 18:21

The first method is exactly what I wanted. The second wouldn't work in my case since I had called pushd a few times, then removed one of the directories in the middle, then popd was failing when I tried to unroll. I needed to jump over all the buggered up stuff in the middle to get back to where I started. – Chuck Wilbur Nov 14 '17 at 18:21

Eliran Malka ,Nov 14, 2017 at 22:51

right @ChuckWilbur - if you scrambled the dir stack, popd won't save you :) – Eliran Malka Nov 14 '17 at 22:51

jw013 ,Dec 7, 2017 at 20:50

It's better to pushd -0 instead of cd "$(dirs ...)" . – jw013 Dec 7 '17 at 20:50

Eliran Malka ,Dec 11, 2017 at 13:56

@jw013 how so? that would mess with the dir stack even more (which we're trying to clear here..) – Eliran Malka Dec 11 '17 at 13:56

jw013 ,Dec 12, 2017 at 15:31

cd "$(...)" works in 90%, probably even 99% of use cases, but with pushd -0 you can confidently say 100%. There are so many potential gotchas and edge cases associated with expanding file/directory paths in the shell that the most robust thing to do is just avoid it altogether, which pushd -0 does very concisely. There is no chance of getting caught by a bug with a weird edge case if you never take the risk. If you want further reading on the possible headaches involved with Unix file / path names, a good starting point is mywiki.wooledge.org/ParsingLs – jw013 Dec 12 '17 at 15:31

#### [Sep 25, 2018] Sorting Text

##### Notable quotes:
##### "... POSIX does not require that sort be stable, and most implementations are not ..."
##### "... Fortunately, the GNU implementation in the coreutils package remedies that deficiency via the --stable option ..."
###### Sep 25, 2018 | www.amazon.com

Like awk, cut, and join, sort views its input as a stream of records made up of fields of variable width, with records delimited by newline characters and fields delimited by whitespace or a user-specifiable single character.

sort Usage

sort [ options ] [ file(s) ]

Purpose

Sort input lines into an order determined by the key field and datatype options, and the locale.

Major options

-b  Ignore leading whitespace.
-c  Check that input is correctly sorted. There is no output, but the exit code is nonzero if the input is not sorted.
-d  Dictionary order: only alphanumerics and whitespace are significant.
-g  General numeric value: compare fields as floating-point numbers. This works like -n, except that numbers may have decimal points and exponents (e.g., 6.022e+23). GNU version only.
-f  Fold letters implicitly to a common lettercase so that sorting is case-insensitive.
-i  Ignore nonprintable characters.
-k  Define the sort key field.
-m  Merge already-sorted input files into a sorted output stream.
-n  Compare fields as integer numbers.
-o outfile  Write output to the specified file instead of to standard output. If the file is one of the input files, sort copies it to a temporary file before sorting and writing the output.
-r  Reverse the sort order to descending, rather than the default ascending.
-t char  Use the single character char as the default field separator, instead of the default of whitespace.
-u  Unique records only: discard all but the first record in a group with equal keys. Only the key fields matter: other parts of the discarded records may differ.

Behavior

sort reads the specified files, or standard input if no files are given, and writes the sorted data on standard output.

Sorting by Lines

In the simplest case, when no command-line options are supplied, complete records are sorted according to the order defined by the current locale. In the traditional C locale, that means ASCII order, but you can set an alternate locale as we described in Section 2.8.

A tiny bilingual dictionary in the ISO 8859-1 encoding translates four French words differing only in accents:

$ cat french-english                           Show the tiny dictionary

côte    coast

cote    dimension

coté    dimensioned

côté    side

To understand the sorting, use the octal dump tool, od , to display the French words in ASCII and octal:
$ cut -f1 french-english | od -a -b            Display French words in octal bytes
0000000   c   t   t   e  nl   c   o   t   e  nl   c   o   t   i  nl   c
        143 364 164 145 012 143 157 164 145 012 143 157 164 351 012 143
0000020   t   t   i  nl
        364 164 351 012
0000024

Evidently, with the ASCII option -a , od strips the high-order bit of characters, so the accented letters have been mangled, but we can see their octal values: é is 351 (octal) and ô is 364 (octal). On GNU/Linux systems, you can confirm the character values like this:

$ man iso_8859_1                               Check the ISO 8859-1 manual page

...

Oct   Dec   Hex   Char   Description

--------------------------------------------------------------------

...

351   233   E9     é     LATIN SMALL LETTER E WITH ACUTE

...

364   244   F4     ô     LATIN SMALL LETTER O WITH CIRCUMFLEX

...

First, sort the file in strict byte order:
$ LC_ALL=C sort french-english                 Sort in traditional ASCII order
cote    dimension
coté    dimensioned
côte    coast
côté    side

Notice that e (145 octal) sorted before é (351 octal), and o (157 octal) sorted before ô (364 octal), as expected from their numerical values. Now sort the text in Canadian-French order:

$ LC_ALL=fr_CA.iso88591 sort french-english          Sort in Canadian-French locale

côte    coast

cote    dimension

coté    dimensioned

côté    side

The output order clearly differs from the traditional ordering by raw byte values. Sorting conventions are strongly dependent on language, country, and culture, and the rules are sometimes astonishingly complex. Even English, which mostly pretends that accents are irrelevant, can have complex sorting rules: examine your local telephone directory to see how lettercase, digits, spaces, punctuation, and name variants like McKay and Mackay are handled.

Sorting by Fields

For more control over sorting, the -k option allows you to specify the field to sort on, and the -t option lets you choose the field delimiter. If -t is not specified, then fields are separated by whitespace and leading and trailing whitespace in the record is ignored. With the -t option, the specified character delimits fields, and whitespace is significant. Thus, a three-character record consisting of space-X-space has one field without -t , but three with -t ' ' (the first and third fields are empty). The -k option is followed by a field number, or number pair, optionally separated by whitespace after -k . Each number may be suffixed by a dotted character position, and/or one of the modifier letters shown in Table.

Letter   Description

b        Ignore leading whitespace.
d        Dictionary order.
f        Fold letters implicitly to a common lettercase.
g        Compare as general floating-point numbers. GNU version only.
i        Ignore nonprintable characters.
n        Compare as (integer) numbers.
r        Reverse the sort order.

Fields and characters within fields are numbered starting from one.

If only one field number is specified, the sort key begins at the start of that field, and continues to the end of the record ( not the end of the field).

If a comma-separated pair of field numbers is given, the sort key starts at the beginning of the first field, and finishes at the end of the second field.

With a dotted character position, comparison begins (first of a number pair) or ends (second of a number pair) at that character position: -k2.4,5.6 compares starting with the fourth character of the second field and ending with the sixth character of the fifth field.

If the start of a sort key falls beyond the end of the record, then the sort key is empty, and empty sort keys sort before all nonempty ones.

When multiple -k options are given, sorting is by the first key field, and then, when records match in that key, by the second key field, and so on.

While the -k option is available on all of the systems that we tested, sort also recognizes an older field specification, now considered obsolete, where fields and character positions are numbered from zero. The key start for character m in field n is defined by +n.m , and the key end by -n.m . For example, sort +2.1 -3.2 is equivalent to sort -k3.2,4.3 . If the character position is omitted, it defaults to zero. Thus, +4.0nr and +4nr mean the same thing: a numeric key, beginning at the start of the fifth field, to be sorted in reverse (descending) order.
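A tiny illustration of the dotted character positions described above, on two hypothetical input lines:

$ printf 'abcd 1\nabca 2\n' | sort -k1.4,1.4
abca 2
abcd 1

The key starts and ends at the fourth character of the first field, so the lines are ordered by that single character (a before d).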

Let's try out these options on a sample password file, sorting it by the username, which is found in the first colon-separated field:
$ sort -t: -k1,1 /etc/passwd                Sort by username
bin:x:1:1:bin:/bin:/sbin/nologin
chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
groucho:x:12503:2000:Groucho Marx:/home/groucho:/bin/sh
gummo:x:12504:3000:Gummo Marx:/home/gummo:/usr/local/bin/ksh93
harpo:x:12502:1000:Harpo Marx:/home/harpo:/bin/ksh
root:x:0:0:root:/root:/bin/bash
zeppo:x:12505:1000:Zeppo Marx:/home/zeppo:/bin/zsh

For more control, add a modifier letter in the field selector to define the type of data in the field and the sorting order. Here's how to sort the password file by descending UID:

$ sort -t: -k3nr /etc/passwd               Sort by descending UID

zeppo:x:12505:1000:Zeppo Marx:/home/zeppo:/bin/zsh

gummo:x:12504:3000:Gummo Marx:/home/gummo:/usr/local/bin/ksh93

groucho:x:12503:2000:Groucho Marx:/home/groucho:/bin/sh

harpo:x:12502:1000:Harpo Marx:/home/harpo:/bin/ksh

chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash

root:x:0:0:root:/root:/bin/bash


A more precise field specification would have been -k3nr,3 (that is, from the start of field three, numerically, in reverse order, to the end of field three), or -k3,3nr , or even -k3,3 -n -r , but sort stops collecting a number at the first nondigit, so -k3nr works correctly.

In our password file example, three users have a common GID in field 4, so we could sort first by GID, and then by UID, with:

$ sort -t: -k4n -k3n /etc/passwd            Sort by GID and UID
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash
harpo:x:12502:1000:Harpo Marx:/home/harpo:/bin/ksh
zeppo:x:12505:1000:Zeppo Marx:/home/zeppo:/bin/zsh
groucho:x:12503:2000:Groucho Marx:/home/groucho:/bin/sh
gummo:x:12504:3000:Gummo Marx:/home/gummo:/usr/local/bin/ksh93

The useful -u option asks sort to output only unique records, where unique means that their sort-key fields match, even if there are differences elsewhere. Reusing the password file one last time, we find:

$ sort -t: -k4n -u /etc/passwd             Sort by unique GID

root:x:0:0:root:/root:/bin/bash

chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash

groucho:x:12503:2000:Groucho Marx:/home/groucho:/bin/sh

gummo:x:12504:3000:Gummo Marx:/home/gummo:/usr/local/bin/ksh93


Notice that the output is shorter: three users are in group 1000, but only one of them was output...

Sorting Text Blocks

Sometimes you need to sort data composed of multiline records. A good example is an address list, which is conveniently stored with one or more blank lines between addresses. For data like this, there is no constant sort-key position that could be used in a -k option, so you have to help out by supplying some extra markup. Here's a simple example:

$ cat my-friends                            Show address file
# SORTKEY: Schloß, Hans Jürgen
Hans Jürgen Schloß
Unter den Linden 78
D-10117 Berlin
Germany

# SORTKEY: Jones, Adrian
Adrian Jones
371 Montgomery Park Road
Henley-on-Thames RG9 4AJ
UK

# SORTKEY: Brown, Kim
Kim Brown
1841 S Main Street
Westchester, NY 10502
USA

The sorting trick is to use the ability of awk to handle more-general record separators to recognize paragraph breaks, temporarily replace the line breaks inside each address with an otherwise unused character, such as an unprintable control character, and replace the paragraph break with a newline. sort then sees lines that look like this:

# SORTKEY: Schloß, Hans Jürgen^ZHans Jürgen Schloß^ZUnter den Linden 78^Z...
# SORTKEY: Jones, Adrian^ZAdrian Jones^Z371 Montgomery Park Road^Z...
# SORTKEY: Brown, Kim^ZKim Brown^Z1841 S Main Street^Z...

Here, ^Z is a Ctrl-Z character. A filter step downstream from sort restores the line breaks and paragraph breaks, and the sort key lines are easily removed, if desired, with grep . The entire pipeline looks like this:

cat my-friends |                                          Pipe in address file
  awk -v RS="" '{ gsub("\n", "^Z"); print }' |            Convert addresses to single lines
    sort -f |                                             Sort address bundles, ignoring case
      awk -v ORS="\n\n" '{ gsub("^Z", "\n"); print }' |   Restore line structure
        grep -v '# SORTKEY'                               Remove markup lines

The gsub( ) function performs "global substitutions." It is similar to the s/x/y/g construct in sed . The RS variable is the input Record Separator. Normally, input records are separated by newlines, making each line a separate record. Using RS="" is a special case, whereby records are separated by blank lines; i.e., each block or "paragraph" of text forms a separate record. This is exactly the form of our input data. Finally, ORS is the Output Record Separator; each output record printed with print is terminated with its value. Its default is also normally a single newline; setting it here to "\n\n" preserves the input format with blank lines separating records. (More detail on these constructs may be found in Chapter 9.)

The beauty of this approach is that we can easily include additional keys in each address that can be used for both sorting and selection: for example, an extra markup line of the form:

# COUNTRY: UK

in each address, and an additional pipeline stage of grep '# COUNTRY: UK ' just before the sort , would let us extract only the UK addresses for further processing.

You could, of course, go overboard and use XML markup to identify the parts of the address in excruciating detail:

<address>
  <personalname>Hans Jürgen</personalname>
  <familyname>Schloß</familyname>
  <streetname>Unter den Linden</streetname>
  <streetnumber>78</streetnumber>
  <postalcode>D-10117</postalcode>
  <city>Berlin</city>
  <country>Germany</country>
</address>

With fancier data-processing filters, you could then please your post office by presorting your mail by country and postal code, but our minimal markup and simple pipeline are often good enough to get the job done.

4.1.4. Sort Efficiency

The obvious way to sort data requires comparing all pairs of items to see which comes first, and leads to algorithms known as bubble sort and insertion sort . These quick-and-dirty algorithms are fine for small amounts of data, but they certainly are not quick for large amounts, because their work to sort n records grows like n^2.
This is quite different from almost all of the filters that we discuss in this book: they read a record, process it, and output it, so their execution time is directly proportional to the number of records, n.

Fortunately, the sorting problem has had lots of attention in the computing community, and good sorting algorithms are known whose average complexity goes like n^(3/2) ( shellsort ), n log n ( heapsort , mergesort , and quicksort ), and, for restricted kinds of data, n ( distribution sort ). The Unix sort command implementation has received extensive study and optimization: you can be confident that it will do the job efficiently, and almost certainly better than you can do yourself without learning a lot more about sorting algorithms.

4.1.5. Sort Stability

An important question about sorting algorithms is whether or not they are stable : that is, is the input order of equal records preserved in the output? A stable sort may be desirable when records are sorted by multiple keys, or more than once in a pipeline. POSIX does not require that sort be stable, and most implementations are not, as this example shows:

$ sort -t_ -k1,1 -k2,2 << EOF              Sort four lines by first two fields
> one_two
> one_two_three
> one_two_four
> one_two_five
> EOF

one_two
one_two_five
one_two_four
one_two_three


The sort fields are identical in each record, but the output differs from the input, so sort is not stable. Fortunately, the GNU implementation in the coreutils package [1] remedies that deficiency via the --stable option: its output for this example correctly matches the input.

[1] Available at ftp://ftp.gnu.org/gnu/coreutils/ .
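A quick check with GNU sort (coreutils), rerunning the same four lines with the --stable option:

$ sort --stable -t_ -k1,1 -k2,2 << EOF
> one_two
> one_two_three
> one_two_four
> one_two_five
> EOF
one_two
one_two_three
one_two_four
one_two_five

Because --stable disables the last-resort whole-line comparison, records with equal keys keep their input order.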


#### [Sep 18, 2018] Getting started with Tmux

###### Sep 15, 2018 | linuxize.com

... ... ...

When Tmux is started it reads its configuration parameters from ~/.tmux.conf  if the file is present.

Here is a sample ~/.tmux.conf configuration with a customized status line and a few additional options:

~/.tmux.conf
# Improve colors
set -g default-terminal 'screen-256color'

# Set scrollback buffer to 10000
set -g history-limit 10000

# Customize the status line
set -g status-fg  green
set -g status-bg  black

Basic Tmux Usage

Below are the most basic steps for getting started with Tmux:

1. At the command prompt, type tmux new -s my_session .
2. Run the desired program.
3. Use the key sequence Ctrl-b  + d  to detach from the session.
4. Reattach to the Tmux session by typing tmux attach-session -t my_session . (The whole workflow is consolidated in the short example after this list.)
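The same four steps as a copy-paste sketch (my_session and the program run inside it are placeholders):

tmux new -s my_session                 # 1. create a named session
top                                    # 2. run the desired program inside it
                                       # 3. press Ctrl-b then d to detach
tmux attach-session -t my_session      # 4. reattach to the session later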
Conclusion

In this tutorial, you learned how to use Tmux. Now you can start creating multiple Tmux windows in a single session, split windows by creating new panes, navigate between windows, detach and resume sessions and personalize your Tmux instance using the .tmux.conf  file.

There's lots more to learn about Tmux at Tmux User's Manual page.

#### [Jul 05, 2018] Can rsync resume after being interrupted

##### "... as if it were successfully transferred ..."
###### Jul 05, 2018 | unix.stackexchange.com

Tim ,Sep 15, 2012 at 23:36

I used rsync to copy a large number of files, but my OS (Ubuntu) restarted unexpectedly.

After reboot, I ran rsync again, but from the output on the terminal, I found that rsync still copied those already copied before. But I heard that rsync is able to find differences between source and destination, and therefore to just copy the differences. So I wonder in my case if rsync can resume what was left last time?

Gilles ,Sep 16, 2012 at 1:56

Yes, rsync won't copy again files that it's already copied. There are a few edge cases where its detection can fail. Did it copy all the already-copied files? What options did you use? What were the source and target filesystems? If you run rsync again after it's copied everything, does it copy again? – Gilles Sep 16 '12 at 1:56

Tim ,Sep 16, 2012 at 2:30

@Gilles: Thanks! (1) I think I saw rsync copied the same files again from its output on the terminal. (2) Options are same as in my other post, i.e. sudo rsync -azvv /home/path/folder1/ /home/path/folder2 . (3) Source and target are both NTFS, buy source is an external HDD, and target is an internal HDD. (3) It is now running and hasn't finished yet. – Tim Sep 16 '12 at 2:30

jwbensley ,Sep 16, 2012 at 16:15

There is also the --partial flag to resume partially transferred files (useful for large files) – jwbensley Sep 16 '12 at 16:15

Tim ,Sep 19, 2012 at 5:20

@Gilles: What are some "edge cases where its detection can fail"? – Tim Sep 19 '12 at 5:20

Gilles ,Sep 19, 2012 at 9:25

@Tim Off the top of my head, there's at least clock skew, and differences in time resolution (a common issue with FAT filesystems which store times in 2-second increments, the --modify-window option helps with that). – Gilles Sep 19 '12 at 9:25

DanielSmedegaardBuus ,Nov 1, 2014 at 12:32

First of all, regarding the "resume" part of your question, --partial just tells the receiving end to keep partially transferred files if the sending end disappears as though they were completely transferred.

While transferring files, they are temporarily saved as hidden files in their target folders (e.g. .TheFileYouAreSending.lRWzDC ), or a specifically chosen folder if you set the --partial-dir switch. When a transfer fails and --partial is not set, this hidden file will remain in the target folder under this cryptic name, but if --partial is set, the file will be renamed to the actual target file name (in this case, TheFileYouAreSending ), even though the file isn't complete. The point is that you can later complete the transfer by running rsync again with either --append or --append-verify .

So, --partial doesn't itself resume a failed or cancelled transfer. To resume it, you'll have to use one of the aforementioned flags on the next run. So, if you need to make sure that the target won't ever contain files that appear to be fine but are actually incomplete, you shouldn't use --partial . Conversely, if you want to make sure you never leave behind stray failed files that are hidden in the target directory, and you know you'll be able to complete the transfer later, --partial is there to help you.

With regards to the --append switch mentioned above, this is the actual "resume" switch, and you can use it whether or not you're also using --partial . Actually, when you're using --append , no temporary files are ever created. Files are written directly to their targets. In this respect, --append gives the same result as --partial on a failed transfer, but without creating those hidden temporary files.

So, to sum up, if you're moving large files and you want the option to resume a cancelled or failed rsync operation from the exact point that rsync stopped, you need to use the --append or --append-verify switch on the next attempt.

As @Alex points out below, since version 3.0.0 rsync now has a new option, --append-verify , which behaves like --append did before that switch existed. You probably always want the behaviour of --append-verify , so check your version with rsync --version . If you're on a Mac and not using rsync from homebrew , you'll (at least up to and including El Capitan) have an older version and need to use --append rather than --append-verify . Why they didn't keep the behaviour on --append and instead named the newcomer --append-no-verify is a bit puzzling. Either way, --append on rsync before version 3 is the same as --append-verify on the newer versions.

--append-verify isn't dangerous: It will always read and compare the data on both ends and not just assume they're equal. It does this using checksums, so it's easy on the network, but it does require reading the shared amount of data on both ends of the wire before it can actually resume the transfer by appending to the target.
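Putting the pieces together, a minimal resume sketch (user@host and the paths are placeholders; assumes rsync 3.0.0 or newer on both ends so that --append-verify is available):

# first attempt; keep any partially transferred file if the run is interrupted
rsync -av --partial /home/path/folder1/ user@host:/home/path/folder2/

# after an interruption, resume by appending to the partial target file,
# re-verifying the already-transferred part with checksums
rsync -av --append-verify /home/path/folder1/ user@host:/home/path/folder2/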

Second of all, you said that you "heard that rsync is able to find differences between source and destination, and therefore to just copy the differences."

That's correct, and it's called delta transfer, but it's a different thing. To enable this, you add the -c , or --checksum switch. Once this switch is used, rsync will examine files that exist on both ends of the wire. It does this in chunks, compares the checksums on both ends, and if they differ, it transfers just the differing parts of the file. But, as @Jonathan points out below, the comparison is only done when files are of the same size on both ends -- different sizes will cause rsync to upload the entire file, overwriting the target with the same name.

This requires a bit of computation on both ends initially, but can be extremely efficient at reducing network load if, for example, you're frequently backing up very large, fixed-size files that often contain minor changes. Examples that come to mind are virtual hard drive image files used in virtual machines or iSCSI targets.
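A sketch of the disk-image case described above (paths and host are placeholders; see also the caveat about --checksum raised in the comments further down):

# re-sync a large, mostly-unchanged VM image; as described above, matching files
# are compared chunk by chunk and only the differing parts cross the wire
rsync -av --checksum /var/lib/libvirt/images/vm1.qcow2 user@backup:/images/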

It is notable that if you use --checksum to transfer a batch of files that are completely new to the target system, rsync will still calculate their checksums on the source system before transferring them. Why I do not know :)

So, in short:

If you're often using rsync to just "move stuff from A to B" and want the option to cancel that operation and later resume it, don't use --checksum , but do use --append-verify .

If you're using rsync to back up stuff often, using --append-verify probably won't do much for you, unless you're in the habit of sending large files that continuously grow in size but are rarely modified once written. As a bonus tip, if you're backing up to storage that supports snapshotting such as btrfs or zfs , adding the --inplace switch will help you reduce snapshot sizes since changed files aren't recreated but rather the changed blocks are written directly over the old ones. This switch is also useful if you want to avoid rsync creating copies of files on the target when only minor changes have occurred.

When using --append-verify , rsync will behave just like it always does on all files that are the same size. If they differ in modification or other timestamps, it will overwrite the target with the source without scrutinizing those files further. --checksum will compare the contents (checksums) of every file pair of identical name and size.

UPDATED 2015-09-01 Changed to reflect points made by @Alex (thanks!)

UPDATED 2017-07-14 Changed to reflect points made by @Jonathan (thanks!)

Alex ,Aug 28, 2015 at 3:49

According to the documentation --append does not check the data, but --append-verify does. Also, as @gaoithe points out in a comment below, the documentation claims --partial does resume from previous files. – Alex Aug 28 '15 at 3:49

DanielSmedegaardBuus ,Sep 1, 2015 at 13:29

Thank you @Alex for the updates. Indeed, since 3.0.0, --append no longer compares the source to the target file before appending. Quite important, really! --partial does not itself resume a failed file transfer, but rather leaves it there for a subsequent --append(-verify) to append to it. My answer was clearly misrepresenting this fact; I'll update it to include these points! Thanks a lot :) – DanielSmedegaardBuus Sep 1 '15 at 13:29

Cees Timmerman ,Sep 15, 2015 at 17:21

This says --partial is enough. – Cees Timmerman Sep 15 '15 at 17:21

DanielSmedegaardBuus ,May 10, 2016 at 19:31

@CMCDragonkai Actually, check out Alexander's answer below about --partial-dir -- looks like it's the perfect bullet for this. I may have missed something entirely ;) – DanielSmedegaardBuus May 10 '16 at 19:31

Jonathan Y. ,Jun 14, 2017 at 5:48

What's your level of confidence in the described behavior of --checksum ? According to the man it has more to do with deciding which files to flag for transfer than with delta-transfer (which, presumably, is rsync 's default behavior). – Jonathan Y. Jun 14 '17 at 5:48

Alexander O'Mara ,Jan 3, 2016 at 6:34

TL;DR:

Just specify a partial directory as the rsync man pages recommends:

--partial-dir=.rsync-partial


Longer explanation:

There is actually a built-in feature for doing this using the --partial-dir option, which has several advantages over the --partial and --append-verify / --append alternative.

Excerpt from the rsync man pages:
--partial-dir=DIR
A  better way to keep partial files than the --partial option is
to specify a DIR that will be used  to  hold  the  partial  data
(instead  of  writing  it  out to the destination file).  On the
next transfer, rsync will use a file found in this dir  as  data
to  speed  up  the resumption of the transfer and then delete it
after it has served its purpose.

Note that if --whole-file is specified (or  implied),  any  par-
tial-dir  file  that  is  found for a file that is being updated
will simply be removed (since rsync  is  sending  files  without
using rsync's delta-transfer algorithm).

Rsync will create the DIR if it is missing (just the last dir --
not the whole path).  This makes it easy to use a relative  path
(such  as  "--partial-dir=.rsync-partial")  to have rsync create
the partial-directory in the destination file's  directory  when
needed,  and  then  remove  it  again  when  the partial file is
deleted.

If the partial-dir value is not an absolute path, rsync will add
an  exclude rule at the end of all your existing excludes.  This
will prevent the sending of any partial-dir files that may exist
on the sending side, and will also prevent the untimely deletion
of partial-dir items on the receiving  side.   An  example:  the
above  --partial-dir  option would add the equivalent of "-f '-p
.rsync-partial/'" at the end of any other filter rules.


By default, rsync uses a random temporary file name which gets deleted when a transfer fails. As mentioned, using --partial you can make rsync keep the incomplete file as if it were successfully transferred , so that it is possible to later append to it using the --append-verify / --append options. However there are several reasons this is sub-optimal.

1. Your backup files may not be complete, and without checking the remote file which must still be unaltered, there's no way to know.
2. If you are attempting to use --backup and --backup-dir , you've just added a new version of this file that never even existed before to your version history.

However if we use --partial-dir , rsync will preserve the temporary partial file, and resume downloading using that partial file next time you run it, and we do not suffer from the above issues.
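A minimal sketch of that invocation, reusing the paths from the question (the user@host part is a placeholder):

rsync -av --partial-dir=.rsync-partial /home/path/folder1/ user@host:/home/path/folder2/

# rerun the exact same command after an interruption; rsync finds the file in
# .rsync-partial on the receiving side and uses it to speed up the resumption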

trs ,Apr 7, 2017 at 0:00

This is really the answer. Hey everyone, LOOK HERE!! – trs Apr 7 '17 at 0:00

JKOlaf ,Jun 28, 2017 at 0:11

I agree this is a much more concise answer to the question. the TL;DR: is perfect and for those that need more can read the longer bit. Strong work. – JKOlaf Jun 28 '17 at 0:11

N2O ,Jul 29, 2014 at 18:24

You may want to add the -P option to your command.

From the man page:

--partial By default, rsync will delete any partially transferred file if the transfer
is interrupted. In some circumstances it is more desirable to keep partially
transferred files. Using the --partial option tells rsync to keep the partial
file which should make a subsequent transfer of the rest of the file much faster.

-P     The -P option is equivalent to --partial --progress.   Its  pur-
pose  is to make it much easier to specify these two options for
a long transfer that may be interrupted.


sudo rsync -azvv /home/path/folder1/ /home/path/folder2


Do:

sudo rsync -azvvP /home/path/folder1/ /home/path/folder2


Of course, if you don't want the progress updates, you can just use --partial , i.e.:

sudo rsync --partial -azvv /home/path/folder1/ /home/path/folder2


gaoithe ,Aug 19, 2015 at 11:29

@Flimm not quite correct. If there is an interruption (network or receiving side) then when using --partial the partial file is kept AND it is used when rsync is resumed. From the manpage: "Using the --partial option tells rsync to keep the partial file which should make a subsequent transfer of the rest of the file much faster." – gaoithe Aug 19 '15 at 11:29

DanielSmedegaardBuus ,Sep 1, 2015 at 14:11

@Flimm and @gaoithe, my answer wasn't quite accurate, and definitely not up-to-date. I've updated it to reflect version 3 + of rsync . It's important to stress, though, that --partial does not itself resume a failed transfer. See my answer for details :) – DanielSmedegaardBuus Sep 1 '15 at 14:11

guettli ,Nov 18, 2015 at 12:28

@DanielSmedegaardBuus I tried it and the -P is enough in my case. Versions: client has 3.1.0 and server has 3.1.1. I interrupted the transfer of a single large file with ctrl-c. I guess I am missing something. – guettli Nov 18 '15 at 12:28

Yadunandana ,Sep 16, 2012 at 16:07

I think you are forcibly calling the rsync and hence all data is getting downloaded when you recall it again. use --progress option to copy only those files which are not copied and --delete option to delete any files if already copied and now it does not exist in source folder...
rsync -avz --progress --delete -e  /home/path/folder1/ /home/path/folder2


If you are using ssh to login to other system and copy the files,

rsync -avz --progress --delete -e "ssh -o UserKnownHostsFile=/dev/null -o \
StrictHostKeyChecking=no" /home/path/folder1/ /home/path/folder2


let me know if there is any mistake in my understanding of this concept...

Fabien ,Jun 14, 2013 at 12:12

Can you please edit your answer and explain what your special ssh call does, and why you advice to do it? – Fabien Jun 14 '13 at 12:12

DanielSmedegaardBuus ,Dec 7, 2014 at 0:12

@Fabien He tells rsync to set two ssh options (rsync uses ssh to connect). The second one tells ssh to not prompt for confirmation if the host he's connecting to isn't already known (by existing in the "known hosts" file). The first one tells ssh to not use the default known hosts file (which would be ~/.ssh/known_hosts). He uses /dev/null instead, which is of course always empty, and as ssh would then not find the host in there, it would normally prompt for confirmation, hence option two. Upon connecting, ssh writes the now known host to /dev/null, effectively forgetting it instantly :) – DanielSmedegaardBuus Dec 7 '14 at 0:12

DanielSmedegaardBuus ,Dec 7, 2014 at 0:23

...but you were probably wondering what effect, if any, it has on the rsync operation itself. The answer is none. It only serves to not have the host you're connecting to added to your SSH known hosts file. Perhaps he's a sysadmin often connecting to a great number of new servers, temporary systems or whatnot. I don't know :) – DanielSmedegaardBuus Dec 7 '14 at 0:23

moi ,May 10, 2016 at 13:49

"use --progress option to copy only those files which are not copied" What? – moi May 10 '16 at 13:49

Paul d'Aoust ,Nov 17, 2016 at 22:39

There are a couple errors here; one is very serious: --delete will delete files in the destination that don't exist in the source. The less serious one is that --progress doesn't modify how things are copied; it just gives you a progress report on each file as it copies. (I fixed the serious error; replaced it with --remove-source-files .) – Paul d'Aoust Nov 17 '16 at 22:39

#### [Jun 24, 2018] Three Ways to Script Processes in Parallel by Rudis Muiznieks

###### Sep 02, 2015 | www.codeword.xyz
Wednesday, September 02, 2015 | 9 Comments

I was recently troubleshooting some issues we were having with Shippable , trying to get a bunch of our unit tests to run in parallel so that our builds would complete faster. I didn't care what order the different processes completed in, but I didn't want the shell script to exit until all the spawned unit test processes had exited. I ultimately wasn't able to satisfactorily solve the issue we were having, but I did learn more than I ever wanted to know about how to run processes in parallel in shell scripts. So here I shall impart unto you the knowledge I have gained. I hope someone else finds it useful!

Wait

The simplest way to achieve what I wanted was to use the wait command. You simply fork all of your processes with & , and then follow them with a wait command. Behold:

#!/bin/sh

/usr/bin/my-process-1 --args1 &
/usr/bin/my-process-2 --args2 &
/usr/bin/my-process-3 --args3 &

wait
echo all processes complete


It's really as easy as that. When you run the script, all three processes will be forked in parallel, and the script will wait until all three have completed before exiting. Anything after the wait command will execute only after the three forked processes have exited.

Pros

Damn, son! It doesn't get any simpler than that!

Cons

I don't think there's really any way to determine the exit codes of the processes you forked. That was a deal-breaker for my use case, since I needed to know if any of the tests failed and return an error code from the parent shell script if they did.

Another downside is that output from the processes will be all mish-mashed together, which makes it difficult to follow. In our situation, it was basically impossible to determine which unit tests had failed because they were all spewing their output at the same time.
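For what it's worth, a plain-shell workaround for the exit-code limitation is to remember each PID and wait on it individually, since wait PID reports that process's exit status; a sketch using the same placeholder process names as above:

#!/bin/sh

/usr/bin/my-process-1 --args1 & pid1=$!
/usr/bin/my-process-2 --args2 & pid2=$!
/usr/bin/my-process-3 --args3 & pid3=$!

status=0
wait "$pid1" || status=1     # each wait returns the exit status of that PID
wait "$pid2" || status=1
wait "$pid3" || status=1

echo all processes complete
exit "$status"

The output-interleaving problem remains, though; that is what the next two approaches handle better.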

GNU Parallel

There is a super nifty program called GNU Parallel that does exactly what I wanted. It works kind of like xargs in that you can give it a collection of arguments to pass to a single command which will all be run, only this will run them in parallel instead of in serial like xargs does (OR DOES IT??</foreshadowing>). It is super powerful, and all the different ways you can use it are beyond the scope of this article, but here's a rough equivalent to the example script above:

#!/bin/sh

parallel /usr/bin/my-process-{} --args{} ::: 1 2 3
echo all processes complete


The official "10 seconds installation" method for the latest version of GNU Parallel (from the README) is as follows:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash


Pros

If any of the processes returns a non-zero exit code, parallel will return a non-zero exit code. This means you can use $? in your shell script to detect if any of the processes failed. Nice! GNU Parallel also (by default) collates the output of each process together, so you'll see the complete output of each process as it completes instead of a mash-up of all the output combined together as it's produced. Also nice! I am such a damn fanboy I might even buy an official GNU Parallel mug and t-shirt. Actually I'll probably save the money and get the new Star Wars Battlefront game when it comes out instead. But I did seriously consider the parallel schwag for a microsecond or so.

Cons

Literally none.

Xargs

So it turns out that our old friend xargs has supported parallel processing all along! Who knew? It's like the nerdy chick in the movies who gets a makeover near the end and it turns out she's even hotter than the stereotypical hot cheerleader chicks who were picking on her the whole time. Just pass it a -Pn argument and it will run your commands using up to n threads. Check out this mega-sexy equivalent to the above scripts:

#!/bin/sh

printf "1\n2\n3" | xargs -n1 -P3 -I{} /usr/bin/my-process-{} --args{}
echo all processes complete

Pros

xargs returns a non-zero exit code if any of the processes fails, so you can again use $? in your shell script to detect errors. The difference is it will return 123 , unlike GNU Parallel which passes through the non-zero exit code of the process that failed (I'm not sure how parallel picks if more than one process fails, but I'd assume it's either the first or last process to fail). Another pro is that xargs is most likely already installed on your preferred distribution of Linux.

Cons

I have read reports that the non-GNU version of xargs does not support parallel processing, so you may or may not be out of luck with this option if you're on AIX or a BSD or something.

xargs also has the same problem as the wait solution where the output from your processes will be all mixed together.

Another con is that xargs is a little less flexible than parallel in how you specify the processes to run. You have to pipe your values into it, and if you use the -I argument for string-replacement then your values have to be separated by newlines (which is more annoying when running it ad-hoc). It's still pretty nice, but nowhere near as flexible or powerful as parallel .

Also there's no place to buy an xargs mug and t-shirt. Lame!

And The Winner Is

After determining that the Shippable problem we were having was completely unrelated to the parallel scripting method I was using, I ended up sticking with parallel for my unit tests. Even though it meant one more dependency on our build machine, the ease

#### [Jun 23, 2018] Queuing tasks for batch execution with Task Spooler by Ben Martin

###### Aug 12, 2008 | www.linux.com

The Task Spooler project allows you to queue up tasks from the shell for batch execution. Task Spooler is simple to use and requires no configuration. You can view and edit queued commands, and you can view the output of queued commands at any time.

Task Spooler has some similarities with other delayed and batch execution projects, such as " at ." While both Task Spooler and at handle multiple queues and allow the execution of commands at a later point, the at project handles output from commands by emailing the results to the user who queued the command, while Task Spooler allows you to get at the results from the command line instead. Another major difference is that Task Spooler is not aimed at executing commands at a specific time, but rather at simply adding to and executing commands from queues.

The main repositories for Fedora, openSUSE, and Ubuntu do not contain packages for Task Spooler. There are packages for some versions of Debian, Ubuntu, and openSUSE 10.x available along with the source code on the project's homepage. In this article I'll use a 64-bit Fedora 9 machine and install version 0.6 of Task Spooler from source. Task Spooler does not use autotools to build, so to install it, simply run make; sudo make install . This will install the main Task Spooler command ts  and its manual page into /usr/local.

A simple interaction with Task Spooler is shown below. First I add a new job to the queue and check the status. As the command is a very simple one, it is likely to have been executed immediately. Executing ts by itself with no arguments shows the executing queue, including tasks that have completed. I then use ts -c  to get at the stdout of the executed command. The -c  option uses cat  to display the output file for a task. Using ts -i  shows you information about the job. To clear finished jobs from the queue, use the ts -C  command, not shown in the example.

$ts echo "hello world" 6$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/1]
6 finished /tmp/ts-out.QoKfo9 0 0.00/0.00/0.00 echo hello world

$ ts -c 6
hello world
$ ts -i 6
Command: echo hello world
Enqueue time: Tue Jul 22 14:42:22 2008
Start time: Tue Jul 22 14:42:22 2008
End time: Tue Jul 22 14:42:22 2008
Time run: 0.003336s

The -t  option operates like tail -f , showing you the last few lines of output and continuing to show you any new output from the task. If you would like to be notified when a task has completed, you can use the -m  option to have the results mailed to you, or you can queue another command to be executed that just performs the notification. For example, I might add a tar command and want to know when it has completed. The below commands will create a tarball and use libnotify commands to create an inobtrusive popup window on my desktop when the tarball creation is complete. The popup will be dismissed automatically after a timeout.

$ ts tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
11
$ ts notify-send "tarball creation" "the long running tar creation process is complete."
12
$ ts
ID   State      Output               E-Level  Times(r/u/s)     Command [run=0/1]
11   finished   /tmp/ts-out.O6epsS   0        4.64/4.31/0.29   tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
12   finished   /tmp/ts-out.4KbPSE   0        0.05/0.00/0.02   notify-send tarball creation the long... is complete.

Notice in the output above, toward the far right of the header information, the run=0/1  line. This tells you that Task Spooler is executing nothing, and can possibly execute one task. Task spooler allows you to execute multiple tasks at once from your task queue to take advantage of multicore CPUs. The -S  option allows you to set how many tasks can be executed in parallel from the queue, as shown below.

$ ts -S 2
$ ts
ID   State      Output               E-Level  Times(r/u/s)     Command [run=0/2]
6    finished   /tmp/ts-out.QoKfo9   0        0.00/0.00/0.00   echo hello world

If you have two tasks that you want to execute with Task Spooler but one depends on the other having already been executed (and perhaps that the previous job has succeeded too) you can handle this by having one task wait for the other to complete before executing. This becomes more important on a quad core machine when you might have told Task Spooler that it can execute three tasks in parallel. The commands shown below create an explicit dependency, making sure that the second command is executed only if the first has completed successfully, even when the queue allows multiple tasks to be executed. The first command is queued normally using ts . I use a subshell to execute the commands by having ts  explicitly start a new bash shell. The second command uses the -d  option, which tells ts  to execute the command only after the successful completion of the last command that was appended to the queue. When I first inspect the queue I can see that the first command (28) is executing. The second command is queued but has not been added to the list of executing tasks because Task Spooler is aware that it cannot execute until task 28 is complete. The second time I view the queue, both tasks have completed.

$ ts bash -c "sleep 10; echo hi"
28
$ts -d echo there 29$ ts
ID State Output E-Level Times(r/u/s) Command [run=1/2]
28 running /tmp/ts-out.hKqDva bash -c sleep 10; echo hi
29 queued (file) && echo there
$ ts
ID   State      Output               E-Level  Times(r/u/s)      Command [run=0/2]
28   finished   /tmp/ts-out.hKqDva   0        10.01/0.00/0.01   bash -c sleep 10; echo hi
29   finished   /tmp/ts-out.VDtVp7   0        0.00/0.00/0.00    && echo there

$ cat /tmp/ts-out.hKqDva
hi
$ cat /tmp/ts-out.VDtVp7
there

You can also explicitly set dependencies on other tasks as shown below. Because the ts  command prints the ID of a new task to the console, the first command puts that ID into a shell variable for use in the second command. The second command passes the task ID of the first task to ts, telling it to wait for the task with that ID to complete before returning. Because this is joined with the command we wish to execute with the &&  operation, the second command will execute only if the first one has finished and succeeded. The first time we view the queue you can see that both tasks are running. The first task will be in the sleep  command that we used explicitly to slow down its execution. The second command will be executing ts , which will be waiting for the first task to complete. One downside of tracking dependencies this way is that the second command is added to the running queue even though it cannot do anything until the first task is complete.

$ FIRST_TASKID=`ts bash -c "sleep 10; echo hi"`
$ts sh -c "ts -w$FIRST_TASKID && echo there"
25
$ ts
ID   State     Output               E-Level  Times(r/u/s)  Command [run=2/2]
24   running   /tmp/ts-out.La9Gmz            bash -c sleep 10; echo hi
25   running   /tmp/ts-out.Zr2n5u            sh -c ts -w 24 && echo there

$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/2]
24 finished /tmp/ts-out.La9Gmz 0 10.01/0.00/0.00 bash -c sleep 10; echo hi
25 finished /tmp/ts-out.Zr2n5u 0 9.47/0.00/0.01 sh -c ts -w 24 && echo there
$ ts -c 24
hi
$ ts -c 25
there

Wrap-up

Task Spooler allows you to convert a shell command to a queued command by simply prepending ts  to the command line. One major advantage of using ts over something like the at  command is that you can effectively run tail -f  on the output of a running task and also get at the output of completed tasks from the command line. The utility's ability to execute multiple tasks in parallel is very handy if you are running on a multicore CPU. Because you can explicitly wait for a task, you can set up very complex interactions where you might have several tasks running at once and have jobs that depend on multiple other tasks to complete successfully before they can execute.

Because you can make explicitly dependent tasks take up slots in the actively running task queue, you can effectively delay the execution of the queue until a time of your choosing. For example, if you queue up a task that waits for a specific time before returning successfully and have a small group of other tasks that are dependent on this first task to complete, then no tasks in the queue will execute until the first task completes.
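A sketch of that trick (GNU date is assumed, the 23:00 target time must still be in the future today, and the backup command and path are placeholders):

$ ts -S 1
$ ts sh -c 'sleep $(( $(date -d "23:00" +%s) - $(date +%s) ))'
$ ts -d tar czf /tmp/nightly.tar.gz /home/ben

With a single slot and the -d dependency, the tar job stays queued until the sleep job has finished successfully at 23:00.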

Category:

• Tools & Utilities

#### [Jun 23, 2018] at, batch, atq, and atrm examples

###### Jun 23, 2018 | www.computerhope.com
at -m 01:35 < my-at-jobs.txt


Run the commands listed in the ' my-at-jobs.txt ' file at 1:35 AM. All output from the job will be mailed to the user running the task. When this command has been successfully entered you should receive a prompt similar to the example below:

commands will be executed using /bin/sh
job 1 at Wed Dec 24 00:22:00 2014

at -l


This command will list each of the scheduled jobs in a format like the following:

1          Wed Dec 24 00:22:00 2003


...this is the same as running the command atq .

at -r 1


Deletes job 1 . This command is the same as running the command atrm 1 .

atrm 23


Deletes job 23. This command is the same as running the command at -r 23 .
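The batch command mentioned in this section's title uses the same job format but defers execution until the system load average drops, rather than running at a fixed time; for example, reusing the job file from above:

batch < my-at-jobs.txt

As with at, any output from the job is mailed to the user who submitted it.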

#### [Jun 23, 2018] Bash script processing limited number of commands in parallel

###### Jun 23, 2018 | stackoverflow.com

AL-Kateb ,Oct 23, 2013 at 13:33

I have a bash script that looks like this:
#!/bin/bash
# ..
# ..


But processing each line until the command is finished then moving to the next one is very time consuming, I want to process for instance 20 lines at once then when they're finished another 20 lines are processed.

I thought of wget LINK1 >/dev/null 2>&1 & to send the command to the background and carry on, but there are 4000 lines here this means I will have performance issues, not to mention being limited in how many processes I should start at the same time so this is not a good idea.

One solution that I'm thinking of right now is checking whether one of the commands is still running or not, for instance after 20 lines I can add this loop:

while [  $(ps -ef | grep KEYWORD | grep -v grep | wc -l) -gt 0 ]; do
    sleep 1
done

Of course in this case I will need to append & to the end of the line! But I'm feeling this is not the right way to do it.

So how do I actually group each 20 lines together and wait for them to finish before going to the next 20 lines? This script is dynamically generated so I can do whatever math I want on it while it's being generated, but it DOES NOT have to use wget, it was just an example so any solution that is wget specific is not gonna do me any good.

kojiro ,Oct 23, 2013 at 13:46

wait is the right answer here, but your while [ $(ps would be much better written while pkill -0 $KEYWORD – using proctools that is, for legitimate reasons to check if a process with a specific name is still running. – kojiro Oct 23 '13 at 13:46

VasyaNovikov ,Jan 11 at 19:01

I think this question should be re-opened. The "possible duplicate" QA is all about running a finite number of programs in parallel. Like 2-3 commands. This question, however, is focused on running commands in e.g. a loop. (see "but there are 4000 lines"). – VasyaNovikov Jan 11 at 19:01

robinCTS ,Jan 11 at 23:08

@VasyaNovikov Have you read all the answers to both this question and the duplicate? Every single answer to this question here can also be found in the answers to the duplicate question. That is precisely the definition of a duplicate question. It makes absolutely no difference whether or not you are running the commands in a loop. – robinCTS Jan 11 at 23:08

VasyaNovikov ,Jan 12 at 4:09

@robinCTS there are intersections, but questions themselves are different. Also, 6 of the most popular answers on the linked QA deal with 2 processes only. – VasyaNovikov Jan 12 at 4:09

Dan Nissenbaum ,Apr 20 at 15:35

I recommend reopening this question because its answer is clearer, cleaner, better, and much more highly upvoted than the answer at the linked question, though it is three years more recent. – Dan Nissenbaum Apr 20 at 15:35

devnull ,Oct 23, 2013 at 13:35

Use the wait built-in:

process1 &
process2 &
process3 &
process4 &
wait
process5 &
process6 &
process7 &
process8 &
wait

For the above example, 4 processes process1 .. process4 would be started in the background, and the shell would wait until those are completed before starting the next set.

From the manual :

wait [jobspec or pid ...]

Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.

kojiro ,Oct 23, 2013 at 13:48

So basically i=0; waitevery=4; for link in "${links[@]}"; do wget "$link" & (( i++%waitevery==0 )) && wait; done >/dev/null 2>&1 – kojiro Oct 23 '13 at 13:48

rsaw ,Jul 18, 2014 at 17:26

Unless you're sure that each process will finish at the exact same time, this is a bad idea. You need to start up new jobs to keep the current total jobs at a certain cap .... parallel is the answer. – rsaw Jul 18 '14 at 17:26

DomainsFeatured ,Sep 13, 2016 at 22:55

Is there a way to do this in a loop? – DomainsFeatured Sep 13 '16 at 22:55

Bobby ,Apr 27, 2017 at 7:55

I've tried this but it seems that variable assignments done in one block are not available in the next block. Is this because they are separate processes?
Is there a way to communicate the variables back to the main process? – Bobby Apr 27 '17 at 7:55

choroba ,Oct 23, 2013 at 13:38

See parallel . Its syntax is similar to xargs , but it runs the commands in parallel.

chepner ,Oct 23, 2013 at 14:35

This is better than using wait , since it takes care of starting new jobs as old ones complete, instead of waiting for an entire batch to finish before starting the next. – chepner Oct 23 '13 at 14:35

Mr. Llama ,Aug 13, 2015 at 19:30

For example, if you have the list of links in a file, you can do cat list_of_links.txt | parallel -j 4 wget {} which will keep four wget s running at a time. – Mr. Llama Aug 13 '15 at 19:30

0x004D44 ,Nov 2, 2015 at 21:42

There is a new kid in town called pexec which is a replacement for parallel . – 0x004D44 Nov 2 '15 at 21:42

mat ,Mar 1, 2016 at 21:04

Not to be picky, but xargs can also parallelize commands. – mat Mar 1 '16 at 21:04

Vader B ,Jun 27, 2016 at 6:41

In fact, xargs can run commands in parallel for you. There is a special -P max_procs command-line option for that. See man xargs .

> ,

You can run 20 processes and use the command:

wait

Your script will wait and continue when all your background jobs are finished.

#### [Jun 23, 2018] parallelism - correct xargs parallel usage

###### Jun 23, 2018 | unix.stackexchange.com

Yan Zhu ,Apr 19, 2015 at 6:59

I am using xargs to call a python script to process about 30 million small files. I hope to use xargs to parallelize the process. The command I am using is:

find ./data -name "*.json" -print0 | xargs -0 -I{} -P 40 python Convert.py {} > log.txt

Basically, Convert.py will read in a small json file (4kb), do some processing and write to another 4kb file. I am running on a server with 40 CPU cores. And no other CPU-intense process is running on this server.

By monitoring htop (btw, is there any other good way to monitor the CPU performance?), I find that -P 40 is not as fast as expected. Sometimes all cores will freeze and decrease almost to zero for 3-4 seconds, then will recover to 60-70%. Then I try to decrease the number of parallel processes to -P 20-30 , but it's still not very fast. The ideal behavior should be linear speed-up. Any suggestions for the parallel usage of xargs ?

Ole Tange ,Apr 19, 2015 at 8:45

You are most likely hit by I/O: The system cannot read the files fast enough. Try starting more than 40: This way it will be fine if some of the processes have to wait for I/O. – Ole Tange Apr 19 '15 at 8:45

Fox ,Apr 19, 2015 at 10:30

What kind of processing does the script do? Any database/network/io involved? How long does it run? – Fox Apr 19 '15 at 10:30

PSkocik ,Apr 19, 2015 at 11:41

I second @OleTange. That is the expected behavior if you run as many processes as you have cores and your tasks are IO bound. First the cores will wait on IO for their task (sleep), then they will process, and then repeat. If you add more processes, then the additional processes that currently aren't running on a physical core will have kicked off parallel IO operations, which will, when finished, eliminate or at least reduce the sleep periods on your cores. – PSkocik Apr 19 '15 at 11:41

Bichoy ,Apr 20, 2015 at 3:32

1- Do you have hyperthreading enabled? 2- In what you have up there, log.txt is actually overwritten with each call to convert.py ... not sure if this is the intended behavior or not.
– Bichoy Apr 20 '15 at 3:32

Ole Tange ,May 11, 2015 at 18:38

xargs -P and > is opening up for race conditions because of the half-line problem gnu.org/software/parallel/ Using GNU Parallel instead will not have that problem. – Ole Tange May 11 '15 at 18:38

James Scriven ,Apr 24, 2015 at 18:00

I'd be willing to bet that your problem is python . You didn't say what kind of processing is being done on each file, but assuming you are just doing in-memory processing of the data, the running time will be dominated by starting up 30 million python virtual machines (interpreters).

If you can restructure your python program to take a list of files, instead of just one, you will get a huge improvement in performance. You can then still use xargs to further improve performance. For example, 40 processes, each processing 1000 files:

find ./data -name "*.json" -print0 | xargs -0 -L1000 -P 40 python Convert.py

This isn't to say that python is a bad/slow language; it's just not optimized for startup time. You'll see this with any virtual machine-based or interpreted language. Java, for example, would be even worse. If your program was written in C, there would still be a cost of starting a separate operating system process to handle each file, but it would be much less.

From there you can fiddle with -P to see if you can squeeze out a bit more speed, perhaps by increasing the number of processes to take advantage of idle processors while data is being read/written.

Stephen ,Apr 24, 2015 at 13:03

So firstly, consider the constraints: What is the constraint on each job? If it's I/O you can probably get away with multiple jobs per CPU core up till you hit the limit of I/O, but if it's CPU intensive, it's going to be worse than pointless running more jobs concurrently than you have CPU cores.

My understanding of these things is that GNU Parallel would give you better control over the queue of jobs etc. See GNU parallel vs & (I mean background) vs xargs -P for a more detailed explanation of how the two differ.

> ,

As others said, check whether you're I/O-bound. Also, xargs' man page suggests using -n with -P ; you don't mention the number of Convert.py processes you see running in parallel.

As a suggestion, if you're I/O-bound, you might try using an SSD block device, or try doing the processing in a tmpfs (of course, in this case you should check for enough memory, avoiding swap due to tmpfs pressure (I think), and the overhead of copying the data to it in the first place).

#### [Jun 23, 2018] Linux/Bash, how to schedule commands in a FIFO queue?

###### Jun 23, 2018 | superuser.com

Andrei ,Apr 10, 2013 at 14:26

I want the ability to schedule commands to be run in a FIFO queue. I DON'T want them to be run at a specified time in the future as would be the case with the "at" command. I want them to start running now, but not simultaneously. The next scheduled command in the queue should be run only after the first command finishes executing. Alternatively, it would be nice if I could specify a maximum number of commands from the queue that could be run simultaneously; for example if the maximum number of simultaneous commands is 2, then only at most 2 commands scheduled in the queue would be taken from the queue in a FIFO manner to be executed, the next command in the remaining queue being started only when one of the currently 2 running commands finishes.
I've heard task-spooler could do something like this, but this package doesn't appear to be well supported/tested and is not in the Ubuntu standard repositories (Ubuntu being what I'm using). If that's the best alternative then let me know and I'll use task-spooler, otherwise, I'm interested to find out what's the best, easiest, most tested, bug-free, canonical way to do such a thing with bash.

UPDATE: Simple solutions like ; or && from bash do not work. I need to schedule these commands from an external program, when an event occurs. I just don't want to have hundreds of instances of my command running simultaneously, hence the need for a queue. There's an external program that will trigger events where I can run my own commands. I want to handle ALL triggered events, I don't want to miss any event, but I also don't want my system to crash, so that's why I want a queue to handle my commands triggered from the external program.

Andrei ,Apr 11, 2013 at 11:40

Task Spooler:

http://vicerveza.homeunix.net/~viric/soft/ts/
https://launchpad.net/ubuntu/+source/task-spooler/0.7.3-1

Does the trick very well. Hopefully it will be included in Ubuntu's package repos.

Hennes ,Apr 10, 2013 at 15:00

Use ; For example:

ls ; touch test ; ls

That will list the directory. Only after ls has run it will run touch test which will create a file named test. And only after that has finished it will run the next command. (In this case another ls which will show the old contents and the newly created file).

Similar commands are || and && .

; will always run the next command.

&& will only run the next command if the first returned success.
Example: rm -rf *.mp3 && echo "Success! All MP3s deleted!"

|| will only run the next command if the first command returned a failure (non-zero) return value.
Example: rm -rf *.mp3 || echo "Error! Some files could not be deleted! Check permissions!"

If you want to run a command in the background, append an ampersand ( & ). Example:

make bzimage &
mp3blaster sound.mp3
make mytestsoftware ; ls ; firefox ; make clean

Will run two commands in the background (in this case a kernel build which will take some time and a program to play some music). And in the foreground it runs another compile job and, once that is finished, ls, firefox and a make clean (all sequentially).

For more details, see man bash

[Edit after comment]

In pseudo code, something like this?

Program run_queue:

   While(true)
   {
      Wait_for_a_signal();

      While( queue not empty )
      {
         run next command from the queue.
         remove this command from the queue.
         // If commands were added to the queue during execution then
         // the queue is not empty, keep processing them all.
      }
      // Queue is now empty, returning to wait_for_a_signal
   }

//
// Wait forever on commands and add them to a queue
// Signal run_queue when something gets added.
//
program add_to_queue()
{
   While(true)
   {
      Wait_for_event();
      Append command to queue
      signal run_queue
   }
}

terdon ,Apr 10, 2013 at 15:03

The easiest way would be to simply run the commands sequentially:

cmd1; cmd2; cmd3; cmdN

If you want the next command to run only if the previous command exited successfully, use && :

cmd1 && cmd2 && cmd3 && cmdN

That is the only bash native way I know of doing what you want. If you need job control (setting a number of parallel jobs etc), you could try installing a queue manager such as TORQUE but that seems like overkill if all you want to do is launch jobs sequentially.

psusi ,Apr 10, 2013 at 15:24

You are looking for at 's twin brother: batch .
It uses the same daemon but instead of scheduling a specific time, the jobs are queued and will be run whenever the system load average is low. mpy ,Apr 10, 2013 at 14:59 Apart from dedicated queuing systems (like the Sun Grid Engine ) which you can also use locally on one machine and which offer dozens of possibilities, you can use something like  command1 && command2 && command3  which is the other extreme -- a very simple approach. The latter neither does provide multiple simultaneous processes nor gradually filling of the "queue". Bogdan Dumitru ,May 3, 2016 at 10:12 I went on the same route searching, trying out task-spooler and so on. The best of the best is this: GNU Parallel --semaphore --fg It also has -j for parallel jobs. #### [Jun 23, 2018] Task Spooler ##### Notable quotes: ##### "... As in freshmeat.net : ..." ##### "... doesn't work anymore ..." ###### Jun 23, 2018 | vicerveza.homeunix.net As in freshmeat.net : task spooler is a Unix batch system where the tasks spooled run one after the other. The amount of jobs to run at once can be set at any time. Each user in each system has his own job queue. The tasks are run in the correct context (that of enqueue) from any shell/process, and its output/results can be easily watched. It is very useful when you know that your commands depend on a lot of RAM, a lot of disk use, give a lot of output, or for whatever reason it's better not to run them all at the same time, while you want to keep your resources busy for maximum benfit. Its interface allows using it easily in scripts. For your first contact, you can read an article at linux.com , which I like as overview, guide and examples (original url) . On more advanced usage, don't neglect the TRICKS file in the package. Features I wrote Task Spooler because I didn't have any comfortable way of running batch jobs in my linux computer. I wanted to: • Queue jobs from different terminals. • Use it locally in my machine (not as in network queues). • Have a good way of seeing the output of the processes (tail, errorlevels, ...). • Easy use: almost no configuration. • Easy to use in scripts. At the end, after some time using and developing ts , it can do something more: • It works in most systems I use and some others, like GNU/Linux, Darwin, Cygwin, and FreeBSD. • No configuration at all for a simple queue. • Good integration with renice, kill, etc. (through ts -p and process groups). • Have any amount of queues identified by name, writting a simple wrapper script for each (I use ts2, tsio, tsprint, etc). • Control how many jobs may run at once in any queue (taking profit of multicores). • It never removes the result files, so they can be reached even after we've lost the ts task list. • Transparent if used as a subprogram with -nf . • Optional separation of stdout and stderr. You can look at an old (but representative) screenshot of ts-0.2.1 if you want. Mailing list I created a GoogleGroup for the program. You look for the archive and the join methods in the taskspooler google group page . Alessandro Öhler once maintained a mailing list for discussing newer functionalities and interchanging use experiences. I think this doesn't work anymore , but you can look at the old archive or even try to subscribe . How it works The queue is maintained by a server process. This server process is started if it isn't there already. The communication goes through a unix socket usually in /tmp/ . 
When the user requests a job (using a ts client), the client waits for the server message to know when it can start. When the server allows starting , this client usually forks, and runs the command with the proper environment, because the client runs run the job and not the server, like in 'at' or 'cron'. So, the ulimits, environment, pwd,. apply. When the job finishes, the client notifies the server. At this time, the server may notify any waiting client, and stores the output and the errorlevel of the finished job. Moreover the client can take advantage of many information from the server: when a job finishes, where does the job output go to, etc. Download Download the latest version (GPLv2+ licensed): ts-1.0.tar.gz - v1.0 (2016-10-19) - Changelog Look at the version repository if you are interested in its development. Андрей Пантюхин (Andrew Pantyukhin) maintains the BSD port . Alessandro Öhler provided a Gentoo ebuild for 0.4 , which with simple changes I updated to the ebuild for 0.6.4 . Moreover, the Gentoo Project Sunrise already has also an ebuild ( maybe old ) for ts . Alexander V. Inyukhin maintains unofficial debian packages for several platforms. Find the official packages in the debian package system . Pascal Bleser packed the program for SuSE and openSuSE in RPMs for various platforms . Gnomeye maintains the AUR package . Eric Keller wrote a nodejs web server showing the status of the task spooler queue ( github project ). Manual Look at its manpage (v0.6.1). Here you also have a copy of the help for the same version: usage: ./ts [action] [-ngfmd] [-L <lab>] [cmd...] Env vars: TS_SOCKET the path to the unix socket used by the ts command. TS_MAILTO where to mail the result (on -m). Local user by default. TS_MAXFINISHED maximum finished jobs in the queue. TS_ONFINISH binary called on job end (passes jobid, error, outfile, command). TS_ENV command called on enqueue. Its output determines the job information. TS_SAVELIST filename which will store the list, if the server dies. TS_SLOTS amount of jobs which can run at once, read on server start. Actions: -K kill the task spooler server -C clear the list of finished jobs -l show the job list (default action) -S [num] set the number of max simultanious jobs of the server. -t [id] tail -f the output of the job. Last run if not specified. -c [id] cat the output of the job. Last run if not specified. -p [id] show the pid of the job. Last run if not specified. -o [id] show the output file. Of last job run, if not specified. -i [id] show job information. Of last job run, if not specified. -s [id] show the job state. Of the last added, if not specified. -r [id] remove a job. The last added, if not specified. -w [id] wait for a job. The last added, if not specified. -u [id] put that job first. The last added, if not specified. -U <id-id> swap two jobs in the queue. -h show this help -V show the program version Options adding jobs: -n don't store the output of the command. -g gzip the stored output (if not -n). -f don't fork into background. -m send the output by e-mail (uses sendmail). -d the job will be run only if the job before ends well -L <lab> name this task with a label, to be distinguished on listing.  Thanks • To Raúl Salinas, for his inspiring ideas • To Alessandro Öhler, the first non-acquaintance user, who proposed and created the mailing list. • Андрею Пантюхину, who created the BSD port . • To the useful, although sometimes uncomfortable, UNIX interface. • To Alexander V. Inyukhin, for the debian packages. 
• To Pascal Bleser, for the SuSE packages.
• To Sergio Ballestrero, who sent code and motivated the development of a multislot version of ts.
• To GNU, an ugly but working and helpful ol' UNIX implementation.

#### [Jun 23, 2018] bash - Shell Scripting Using xargs to execute parallel instances of a shell function

###### Jun 23, 2018 | stackoverflow.com

Gnats ,Jul 23, 2010 at 19:33

I'm trying to use xargs in a shell script to run parallel instances of a function I've defined in the same script. The function times the fetching of a page, and so it's important that the pages are actually fetched concurrently in parallel processes, and not in background processes (if my understanding of this is wrong and there's negligible difference between the two, just let me know). The function is:

function time_a_url () {
    oneurltime=$($time_command -p wget -p $1 -O /dev/null 2>&1 1>/dev/null | grep real | cut -d" " -f2)
    echo "Fetching $1 took $oneurltime seconds."
}


How does one do this with an xargs pipe in a form that can take number of times to run time_a_url in parallel as an argument? And yes, I know about GNU parallel, I just don't have the privilege to install software where I'm writing this.

Dennis Williamson ,Jul 23, 2010 at 23:03

Here's a demo of how you might be able to get your function to work:
$f() { echo "[$@]"; }
$export -f f$ echo -e "b 1\nc 2\nd 3 4" | xargs -P 0 -n 1 -I{} bash -c f\ \{\}
[b 1]
[d 3 4]
[c 2]


The keys to making this work are to export the function so the bash that xargs spawns will see it and to escape the space between the function name and the escaped braces. You should be able to adapt this to work in your situation. You'll need to adjust the arguments for -P and -n (or remove them) to suit your needs.

You can probably get rid of the grep and cut . If you're using the Bash builtin time , you can specify an output format using the TIMEFORMAT variable. If you're using GNU /usr/bin/time , you can use the --format argument. Either of these will allow you to drop the -p also.

You can replace this part of your wget command: 2>&1 1>/dev/null with -q . In any case, you have those reversed. The correct order would be >/dev/null 2>&1 .
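Putting the pieces of this answer together, a minimal sketch might look like the following (assuming bash and GNU wget; urls.txt, the exact function body, and the job count of 8 are illustrative, not from the original post):

time_a_url () {
    # TIMEFORMAT controls the output of the bash builtin 'time'; %R prints the real time in seconds
    local TIMEFORMAT=%R
    local oneurltime
    oneurltime=$( { time wget -q -p "$1" -O /dev/null ; } 2>&1 )
    echo "Fetching $1 took $oneurltime seconds."
}
export -f time_a_url

# run up to 8 fetches concurrently, one URL per invocation
xargs -P 8 -I{} bash -c 'time_a_url "$@"' _ {} < urls.txt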

Lee Netherton ,Aug 30, 2011 at 16:32

I used xargs -P0 -n1 -I{} bash -c "f {}" which still works, and seems a little tidier. – Lee Netherton Aug 30 '11 at 16:32

tmpvar ,Jul 24, 2010 at 15:21

On Mac OS X:

xargs: max. processes must be >0 (for: xargs -P [>0])

f() { echo "[$@]"; }
export -f f

echo -e "b 1\nc 2\nd 3 4" | sed 's/ /\\ /g' | xargs -P 10 -n 1 -I{} bash -c f\ \{\}

echo -e "b 1\nc 2\nd 3 4" | xargs -P 10 -I '{}' bash -c 'f "$@"' arg0 '{}'


,

If you install GNU Parallel on another system, you will see the functionality is in a single file (called parallel).

You should be able to simply copy that file to your own ~/bin.
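For example, a rough sketch of such a manual install (the host name and source path are placeholders; GNU Parallel is a single Perl script, so a working perl on the target machine is the only real runtime dependency):

mkdir -p ~/bin
scp otherhost:/usr/bin/parallel ~/bin/    # copy the script from a host where it is installed
chmod +x ~/bin/parallel
export PATH="$HOME/bin:$PATH"
parallel --version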

#### [Jun 13, 2018] parsync - a parallel rsync wrapper for large data transfers by Harry Mangalam

###### Jan 22, 2017 | nac.uci.edu

If you already know you want it, get it here: parsync+utils.tar.gz (contains parsync plus the kdirstat-cache-writer , stats , and scut utilities below) Extract it into a dir on your $PATH and after verifying the other dependencies below, give it a shot. While parsync is developed for and test on Linux, the latest version of parsync has been modified to (mostly) work on the Mac (tested on OSX 10.9.5). A number of the Linux-specific dependencies have been removed and there are a number of Mac-specific work arounds. Thanks to Phil Reese < preese@stanford.edu > for the code mods needed to get it started. It's the same package and instructions for both platforms. 2. Dependencies parsync requires the following utilities to work: • stats - self-writ Perl utility for providing descriptive stats on STDIN • scut - self-writ Perl utility like cut that allows regex split tokens • kdirstat-cache-writer (included in the tarball mentioned above), requires a non-default Perl utility: URI::Escape qw(uri_escape) sudo yum install perl-URI # CentOS-like sudo apt-get install liburi-perl # Debian-like parsync needs to be installed only on the SOURCE end of the transfer and uses whatever rsync is available on the TARGET. It uses a number of Linux- specific utilities so if you're transferring between Linux and a FreeBSD host, install parsync on the Linux side. In fact, as currently written, it will only PUSH data to remote targets ; it will not pull data as rsync itself can do. This will probably in the near future. 3. Overview rsync is a fabulous data mover. Possibly more bytes have been moved (or have been prevented from being moved) by rsync than by any other application. So what's not to love? For transferring large, deep file trees, rsync will pause while it generates lists of files to process. Since Version 3, it does this pretty fast, but on sluggish filesystems, it can take hours or even days before it will start to actually exchange rsync data. Second, due to various bottlenecks, rsync will tend to use less than the available bandwidth on high speed networks. Starting multiple instances of rsync can improve this significantly. However, on such transfers, it is also easy to overload the available bandwidth, so it would be nice to both limit the bandwidth used if necessary and also to limit the load on the system. parsync tries to satisfy all these conditions and more by: • using the kdir-cache-writer utility from the beautiful kdirstat directory browser which can produce lists of files very rapidly • allowing re-use of the cache files so generated. • doing crude loadbalancing of the number of active rsyncs, suspending and un-suspending the processes as necessary. • using rsync's own bandwidth limiter (--bwlimit) to throttle the total bandwidth. • using rsync's own vast option selection is available as a pass-thru (tho limited to those compatible with the --files-from option).  Only use for LARGE data transfers The main use case for parsync is really only very large data transfers thru fairly fast network connections (>1Gb/s). Below this speed, a single rsync can saturate the connection, so there's little reason to use parsync and in fact the overhead of testing the existence of and starting more rsyncs tends to worsen its performance on small transfers to slightly less than rsync alone. Beyond this introduction, parsync's internal help is about all you'll need to figure out how to use it; below is what you'll see when you type parsync -h . 
There are still edge cases where parsync will fail or behave oddly, especially with small data transfers, so I'd be happy to hear of such misbehavior or suggestions to improve it. Download the complete tarball of parsync, plus the required utilities here: parsync+utils.tar.gz Unpack it, move the contents to a dir on your$PATH , chmod it executable, and try it out.
parsync --help
or just
parsync
Below is what you should see:

4. parsync help

parsync version 1.67 (Mac compatibility beta) Jan 22, 2017
by Harry Mangalam <hjmangalam@gmail.com> || <harry.mangalam@uci.edu>

parsync is a Perl script that wraps Andrew Tridgell's miraculous 'rsync' to
provide some load balancing and parallel operation across network connections
to increase the amount of bandwidth it can use.

parsync is primarily tested on Linux, but (mostly) works on MacOSX
as well.

parsync needs to be installed only on the SOURCE end of the
transfer and only works in local SOURCE -> remote TARGET mode
(it won't allow remote local SOURCE <- remote TARGET, emitting an
error and exiting if attempted).

It uses whatever rsync is available on the TARGET.  It uses a number
of Linux-specific utilities so if you're transferring between Linux
and a FreeBSD host, install parsync on the Linux side.

The only native rsync option that parsync uses is '-a' (archive) &
'-s' (respect bizarro characters in filenames).
If you need more, then it's up to you to provide them via
'--rsyncopts'. parsync checks to see if the current system load is
too heavy and tries to throttle the rsyncs during the run by
monitoring and suspending / continuing them as needed.

It uses the very efficient (also Perl-based) kdirstat-cache-writer
from kdirstat to generate lists of files which are summed and then
crudely divided into NP jobs by size.

It appropriates rsync's bandwidth throttle mechanism, using '--maxbw'
as a passthru to rsync's 'bwlimit' option, but divides it by NP so
as to keep the total bw the same as the stated limit.  It monitors and
shows network bandwidth, but can't change the bw allocation mid-job.
It can only suspend rsyncs until the load decreases below the cutoff.
If you suspend parsync (^Z), all rsync children will suspend as well,
regardless of current state.

Unless changed by '--interface', it tried to figure out how to set the
interface to monitor.  The transfer will use whatever interface routing
provides, normally set by the name of the target.  It can also be used for
non-host-based transfers (between mounted filesystems) but the network
bandwidth continues to be (usually pointlessly) shown.

[[NB: Between mounted filesystems, parsync sometimes works very poorly for
reasons still mysterious.  In such cases (monitor with 'ifstat'), use 'cp'
or 'tnc' (https://goo.gl/5FiSxR) for the initial data movement and a single
rsync to finalize.  I believe the multiple rsync chatter is interfering with
the transfer.]]

It only works on dirs and files that originate from the current dir (or
specified via "--rootdir").  You cannot include dirs and files from
discontinuous or higher-level dirs.

** the ~/.parsync files **
The ~/.parsync dir contains the cache (*.gz), the chunk files (kds*), and the
time-stamped log files. The cache files can be re-used with '--reusecache'
(which will re-use ALL the cache and chunk files.  The log files are
datestamped and are NOT overwritten.

** Odd characters in names **
parsync will sometimes refuse to transfer some oddly named files, altho
recent versions of rsync allow the '-s' flag (now a parsync default)
which tries to respect names with spaces and properly escaped shell
characters.  Filenames with embedded newlines, DOS EOLs, and other
odd chars will be recorded in the log files in the ~/.parsync dir.

** Because of the crude way that files are chunked, NP may be
adjusted slightly to match the file chunks. ie '--NP 8' -> '--NP 7'.
If so, a warning will be issued and the rest of the transfer will be

OPTIONS
=======
[i] = integer number
[f] = floating point number
[s] = "quoted string"
( ) = the default if any

--NP [i] (sqrt(#CPUs)) ...............  number of rsync processes to start
optimal NP depends on many vars.  Try the default and incr as needed
--startdir [s] (pwd)  .. the directory it works relative to. If you omit
it, the default is the CURRENT dir. You DO have
to specify target dirs.  See the examples below.
--maxbw [i] (unlimited) ..........  in KB/s max bandwidth to use (--bwlimit
passthru to rsync).  maxbw is the total BW to be used, NOT per rsync.
sleeps an rsync proc for 10s
--checkperiod [i] (5) .......... sets the period in seconds between updates
--rsyncopts [s]  ...  options passed to rsync as a quoted string (CAREFUL!)
this opt triggers a pause before executing to verify the command.
--interface [s]  .............  network interface to /monitor/, not nec use.
default: /sbin/route -n | grep "^0.0.0.0" | rev | cut -d' ' -f1 | rev
above works on most simple hosts, but complex routes will confuse it.
--reusecache  ..........  don't re-read the dirs; re-use the existing caches
--email [s]  .....................  email address to send completion message
(requires working mail system on host)
--barefiles   .....  set to allow rsync of individual files, as oppo to dirs
--nowait  ................  for scripting, sleep for a few s instead of wait
--version  .................................  dumps version string and exits
--help  .........................................................  this help

Examples
========
-- Good example 1 --
% parsync  --maxload=5.5 --NP=4 --startdir='/home/hjm' dir1 dir2 dir3
hjm@remotehost:~/backups

where
= "--startdir='/home/hjm'" sets the working dir of this operation to
'/home/hjm' and dir1 dir2 dir3 are subdirs from '/home/hjm'
= the target "hjm@remotehost:~/backups" is the same target rsync would use
= "--NP=4" forks 4 instances of rsync
= -"-maxload=5.5" will start suspending rsync instances when the 5m system
load gets to 5.5 and then unsuspending them when it goes below it.

It uses 4 instances to rsync dir1 dir2 dir3 to hjm@remotehost:~/backups

-- Good example 2 --
% parsync --rsyncopts="--ignore-existing" --reusecache  --NP=3
--barefiles  *.txt   /mount/backups/txt

where
=  "--rsyncopts='--ignore-existing'" is an option passed thru to rsync
telling it not to disturb any existing files in the target directory.
= "--reusecache" indicates that the filecache shouldn't be re-generated,
uses the previous filecache in ~/.parsync
= "--NP=3" for 3 copies of rsync (with no "--maxload", the default is 4)
= "--barefiles" indicates that it's OK to transfer barefiles instead of
recursing thru dirs.
= "/mount/backups/txt" is the target - a local disk mount instead of a network host.

It uses 3 instances to rsync *.txt from the current dir to "/mount/backups/txt".

-- Error Example 1 --
% pwd
/home/hjm  # executing parsync from here

% parsync --NP4 --compress /usr/local  /media/backupdisk

why this is an error:
= '--NP4' is not an option (parsync will say "Unknown option: np4")
It should be '--NP=4'
= if you were trying to rsync '/usr/local' to '/media/backupdisk',
it will fail since there is no /home/hjm/usr/local dir to use as
a source. This will be shown in the log files in
~/.parsync/rsync-logfile-<datestamp>_#
as a spew of "No such file or directory (2)" errors
= the '--compress' is a native rsync option, not a native parsync option.
You have to pass it to rsync with "--rsyncopts='--compress'"

The correct version of the above command is:

% parsync --NP=4  --rsyncopts='--compress' --startdir=/usr  local
/media/backupdisk

-- Error Example 2 --
% parsync --start-dir /home/hjm  mooslocal  hjm@moo.boo.yoo.com:/usr/local

why this is an error:
= this command is trying to PULL data from a remote SOURCE to a
local TARGET.  parsync doesn't support that kind of operation yet.

The correct version of the above command is:

# ssh to hjm@moo, install parsync, then:
% parsync  --startdir=/usr  local  hjm@remote:/home/hjm/mooslocal

#### [Jun 02, 2018] How to run Linux commands simultaneously with GNU Parallel

###### Jun 02, 2018 | www.techrepublic.com

Scratching the surface

We've only just scratched the surface of GNU Parallel. I highly recommend you give the official GNU Parallel tutorial a read, and watch this video tutorial series on YouTube, so you can understand the complexities of the tool (of which there are many).

But this will get you started on a path to helping your data center Linux servers use commands with more efficiency.
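The command examples themselves are omitted from this excerpt, so here is a minimal, generic illustration of the kind of syntax the article builds on (file names are placeholders; --bar needs a reasonably recent GNU Parallel):

# compress every *.log file in parallel, by default one job per CPU core
ls *.log | parallel gzip --best
# the same with an explicit job count and a progress bar
ls *.log | parallel -j 4 --bar gzip --best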

#### [Jun 02, 2018] Parallelise rsync using GNU Parallel

###### Jun 02, 2018 | unix.stackexchange.com

Mandar Shinde ,Mar 13, 2015 at 6:51

I have been using a rsync script to synchronize data at one host with the data at another host. The data has numerous small-sized files that contribute to almost 1.2TB.

In order to sync those files, I have been using rsync command as follows:

rsync -avzm --stats --human-readable --include-from proj.lst /data/projects REMOTEHOST:/data/


The contents of proj.lst are as follows:

+ proj1
+ proj1/*
+ proj1/*/*
+ proj1/*/*/*.tar
+ proj1/*/*/*.pdf
+ proj2
+ proj2/*
+ proj2/*/*
+ proj2/*/*/*.tar
+ proj2/*/*/*.pdf
...
...
...
- *


As a test, I picked up two of those projects (8.5GB of data) and I executed the command above. Being a sequential process, it took 14 minutes 58 seconds to complete. So, for 1.2TB of data it would take several hours.

If I could run multiple rsync processes in parallel (using & , xargs or parallel ), it would save a lot of time.

I tried with below command with parallel (after cd ing to source directory) and it took 12 minutes 37 seconds to execute:

parallel --will-cite -j 5 rsync -avzm --stats --human-readable {} REMOTEHOST:/data/ ::: .


This should have taken 5 times less time, but it didn't. I think, I'm going wrong somewhere.

How can I run multiple rsync processes in order to reduce the execution time?

Ole Tange ,Mar 13, 2015 at 7:25

Are you limited by network bandwidth? Disk iops? Disk bandwidth? – Ole Tange Mar 13 '15 at 7:25

Mandar Shinde ,Mar 13, 2015 at 7:32

If possible, we would want to use 50% of total bandwidth. But, parallelising multiple rsync s is our first priority. – Mandar Shinde Mar 13 '15 at 7:32

Ole Tange ,Mar 13, 2015 at 7:41

Can you let us know your: Network bandwidth, disk iops, disk bandwidth, and the bandwidth actually used? – Ole Tange Mar 13 '15 at 7:41

Mandar Shinde ,Mar 13, 2015 at 7:47

In fact, I do not know about above parameters. For the time being, we can neglect the optimization part. Multiple rsync s in parallel is the primary focus now. – Mandar Shinde Mar 13 '15 at 7:47

Mandar Shinde ,Apr 11, 2015 at 13:53

Following steps did the job for me:
1. Run rsync with --dry-run first in order to get the list of files that would be affected.

rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log

2. I fed the output of cat transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:

cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log

Here, --relative option ( link ) ensured that the directory structure for the affected files, at the source and destination, remains the same (inside /data/ directory), so the command must be run in the source folder (in example, /data/projects ).
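A sketch of the same two steps chained into one script (the host name and paths are the example values from this answer; note Mike D's comment below that the dry-run output also contains a few non-file summary lines which you may want to filter out first):

#!/bin/bash
# 1. dry run to collect the candidate file list
rsync -avzm --stats --safe-links --ignore-existing --dry-run \
      --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log
# 2. hand the list to 5 parallel rsyncs, run from the source folder;
#    --relative preserves the directory layout on the target
cd /data/projects || exit 1
cat /tmp/transfer.log | parallel --will-cite -j 5 \
      rsync -avzm --relative --stats --safe-links --ignore-existing \
      --human-readable {} REMOTE-HOST:/data/ > /tmp/result.log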

Sandip Bhattacharya ,Nov 17, 2016 at 21:22

That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them:

rm backups.*
split -l 3000 backup.list backups.
ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/

– Sandip Bhattacharya Nov 17 '16 at 21:22

Mike D ,Sep 19, 2017 at 16:42

How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/ . – Mike D Sep 19 '17 at 16:42

Cheetah ,Oct 12, 2017 at 5:31

On newer versions of rsync (3.1.0+), you can use --info=name in place of -v , and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them. – Cheetah Oct 12 '17 at 5:31
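A hedged sketch of that variant (same placeholder host and paths as above; requires rsync 3.1.0 or newer on both ends):

# --info=name instead of -v makes the dry run emit bare file names
rsync -azm --dry-run --info=name --safe-links --ignore-existing \
      /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log
# --protect-args guards the inner rsync against spaces and shell metacharacters in filenames
cat /tmp/transfer.log | parallel --will-cite -j 5 \
      rsync -azm --relative --protect-args --safe-links --ignore-existing \
      {} REMOTE-HOST:/data/ > /tmp/result.log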

Mikhail ,Apr 10, 2017 at 3:28

I would strongly discourage anybody from using the accepted answer; a better solution is to crawl the top level directory and launch a proportional number of rsync operations.

I have a large zfs volume and my source was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1.

The source drive was mounted like:

mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0


Using a single rsync process:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod


StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.62K      0   130M


In synthetic benchmarks (crystal disk), sequential write performance approaches 900 MB/s, which means the link is saturated. 130 MB/s is not very good, and it is the difference between waiting a weekend and waiting two weeks.

So, I built the file list and tried to run the sync again (I have a 64 core machine):

cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log


and it had the same performance!

StoragePod  29.9T   144T      0  1.63K      0   130M
StoragePod  29.9T   144T      0  1.62K      0   130M
StoragePod  29.9T   144T      0  1.56K      0   129M


As an alternative I simply ran rsync on the root folders:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell


This actually boosted performance:

StoragePod  30.1T   144T     13  3.66K   112K   343M
StoragePod  30.1T   144T     24  5.11K   184K   469M
StoragePod  30.1T   144T     25  4.30K   196K   373M


In conclusion, as @Sandip Bhattacharya brought up, write a small script to get the directories and parallel that. Alternatively, pass a file list to rsync. But don't create new instances for each file.
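A minimal sketch of that per-directory approach, using the example paths from this answer (the job count of 4 is arbitrary):

cd /mnt/Datahoarder_Mount/Mikhail || exit 1
# one rsync per top-level directory, at most 4 running at a time
ls -d */ | parallel -j 4 rsync -h -r -P -t {} /StoragePod/{}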

Julien Palard ,May 25, 2016 at 14:15

I personally use this simple one:
ls -1 | parallel rsync -a {} /destination/directory/


This is only useful when you have more than a few non-near-empty directories; otherwise you'll end up with almost every rsync terminating early and the last one doing all the job alone.

Ole Tange ,Mar 13, 2015 at 7:25

A tested way to do the parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync

rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.

The following will start one rsync per big file in src-dir to dest-dir on the server fooserver:

cd src-dir; find . -type f -size +100000 | \
parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
rsync -s -Havessh {} fooserver:/dest-dir/{}


The directories created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:

rsync -Havessh src-dir/ fooserver:/dest-dir/


If you are unable to push data, but need to pull them and the files are called digits.png (e.g. 000000.png) you might be able to do:

seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/


Mandar Shinde ,Mar 13, 2015 at 7:34

Any other alternative in order to avoid find ? – Mandar Shinde Mar 13 '15 at 7:34

Ole Tange ,Mar 17, 2015 at 9:20

Limit the -maxdepth of find. – Ole Tange Mar 17 '15 at 9:20

Mandar Shinde ,Apr 10, 2015 at 3:47

If I use --dry-run option in rsync , I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process? – Mandar Shinde Apr 10 '15 at 3:47

Ole Tange ,Apr 10, 2015 at 5:51

cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; rsync -s -Havessh {} fooserver:/dest-dir/{} – Ole Tange Apr 10 '15 at 5:51

Mandar Shinde ,Apr 10, 2015 at 9:49

Can you please explain the mkdir -p /dest-dir/{//}\; part? Especially the {//} thing is a bit confusing. – Mandar Shinde Apr 10 '15 at 9:49
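The question goes unanswered in this thread, but for reference: {//} is GNU Parallel's replacement string for the dirname of the input line ({/} is the basename), so mkdir -p /dest-dir/{//} pre-creates the directory each file will land in before rsync copies it. A quick way to see the replacement strings in action:

echo dir1/dir2/file.png | parallel echo {} {/} {//}
# prints: dir1/dir2/file.png file.png dir1/dir2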

,

For multi destination syncs, I am using
parallel rsync -avi /path/to/source ::: host1: host2: host3:


Hint: All ssh connections are established with public keys in ~/.ssh/authorized_keys
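A one-time setup sketch for that hint (host names are placeholders):

ssh-keygen -t ed25519                                  # accept the defaults
for h in host1 host2 host3; do ssh-copy-id "$h"; done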

#### [Jun 02, 2018] Parallelizing rsync

###### Jun 02, 2018 | www.gnu.org

rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.

The following will start one rsync per big file in src-dir to dest-dir on the server fooserver :

  cd src-dir; find . -type f -size +100000 | \
parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
rsync -s -Havessh {} fooserver:/dest-dir/{}


The dirs created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:

  rsync -Havessh src-dir/ fooserver:/dest-dir/


If you are unable to push data, but need to pull them and the files are called digits.png (e.g. 000000.png) you might be able to do:

  seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/


#### [May 28, 2018] TIP 7-zip s XZ compression on a multiprocessor system is often faster and compresses better than gzip linuxadmin

###### May 28, 2018 | www.reddit.com

TyIzaeL
[–] kristopolous 4 years ago (4 children)

I did this a while back also. Here's a graph: http://i.imgur.com/gPOQBfG.png

X axis is compression level (min to max) Y is the size of the file that was compressed

I forget what the file was.

TyIzaeL 4 years ago (3 children)
That is a great start (probably better than what I am doing). Do you have time comparisons as well?
kristopolous 4 years ago (1 child)
TyIzaeL 4 years ago (0 children)
Very nice. I might work on something similar to this soon next time I'm bored.
kristopolous 4 years ago (0 children)
nope.
TyIzaeL 4 years ago (0 children)
That's a great point to consider among all of this. Compression is always a tradeoff between how much CPU and memory you want to throw at something and how much space you would like to save. In my case, hammering the server for 3 minutes in order to take a backup is necessary because the uncompressed data would bottleneck at the LAN speed.
randomfrequency 4 years ago (0 children)
You might want to play with 'pigz' - it's gzip, multi-threaded. You can 'pv' to restrict the rate of the output, and it accepts signals to control the rate limiting.
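For example (the paths and the 10 MB/s cap are placeholders):

# multi-threaded gzip via pigz, with pv limiting the output rate to 10 MB/s
tar -cf - /var/lib/mysql | pigz | pv -L 10m > /backup/mysql.tar.gz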
rrohbeck 4 years ago (1 child)
Also pbzip2 -1 to -9 and pigz -1 to -9.

With -9 you can surely make backup CPU bound. I've given up on compression though: rsync is much faster than straight backup and I use btrfs compression/deduplication/snapshotting on the backup server.

TyIzaeL 4 years ago (0 children)
pigz -9 is already on the chart as pigz --best. I'm working on adding the others though.
TyIzaeL 4 years ago (0 children)
I'm running gzip, bzip2, and pbzip2 now (not at the same time, of course) and will add results soon. But in my case the compression keeps my db dumps from being IO bound by the 100mbit LAN connection. For example, lzop in the results above puts out 6041.632 megabits in 53.82 seconds for a total compressed data rate of 112 megabits per second, which would make the transfer IO bound. Whereas the pigz example puts out 3339.872 megabits in 81.892 seconds, for an output data rate of 40.8 megabits per second. This is just on my dual-core box with a static file, on the 8-core server I see the transfer takes a total of about three minutes. It's probably being limited more by the rate at which the MySQL server can dump text from the database, but if there was no compression it'd be limited by the LAN speed. If we were dumping 2.7GB over the LAN directly, we would need 122mbit/s of real throughput to complete it in three minutes.
Shammyhealz 4 years ago (2 children)
I thought the best compression was supposed to be LZMA? Which is what the .7z archives are. I have no idea of the relative speed of LZMA and gzip
TyIzaeL 4 years ago (1 child)
xz archives use the LZMA2 format (which is also used in 7z archives). LZMA2 speed seems to range from a little slower than gzip to much slower than bzip2, but results in better compression all around.
primitive_screwhead 4 years ago (0 children)
However LZMA2 decompression speed is generally much faster than bzip2, in my experience, though not as fast as gzip. This is why we use it, as we decompress our data much more often than we compress it, and the space saving/decompression speed tradeoff is much more favorable for us than either gzip of bzip2.
crustang 4 years ago (2 children)
I mentioned how 7zip was superior to all other zip programs in /r/osx a few days ago and my comment was buried in favor of the osx circlejerk .. it feels good seeing this data.

I love 7zip

RTFMorGTFO 4 years ago (1 child)
Why... Tar supports xz, lzma, lzop, lzip, and any other kernel based compression algorithms. It's also much more likely to be preinstalled on your given distro.
crustang 4 years ago (0 children)
I've used 7zip at my old job for a backup of our business software's database. We needed speed, high level of compression, and encryption. Portability wasn't high on the list since only a handful of machines needed access to the data. All machines were multi-processor and 7zip gave us the best of everything given the requirements. I haven't really looked at anything deeply - including tar, which my old boss didn't care for.
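A hypothetical example of that kind of job with the p7zip command-line tool (paths are placeholders): LZMA2, maximum compression, multi-threading, a password prompt, and encrypted archive headers:

7za a -t7z -m0=lzma2 -mx=9 -mmt=on -mhe=on -p backup.7z /path/to/db_dump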

#### [May 28, 2018] RPM RedHat EL 6 p7zip 9.20.1 x86_64 rpm

###### May 28, 2018 | rpm.pbone.net
p7zip rpm build for : RedHat EL 6 . For other distributions click p7zip .
Name: p7zip
Version: 9.20.1
Vendor: Dag Apt Repository, http://dag_wieers_com/apt/
Release: 1.el6.rf
Date: 2011-04-20 15:23:34
Group: Applications/Archiving
Source RPM: p7zip-9.20.1-1.el6.rf.src.rpm
Size: 14.84 MB
Packager: Dag Wieers < dag_wieers_com>
Summary: Very high compression ratio file archiver
Description: p7zip is a port of 7za.exe for Unix. 7-Zip is a file archiver with a very high compression ratio. The original version can be found at http://www.7-zip.org/.
RPM found in directory: /mirror/apt.sw.be/redhat/el6/en/x86_64/rpmforge/RPMS


#### [May 28, 2018] TIL pigz exists A parallel implementation of gzip for modern multi-processor, multi-core machines linux

###### May 28, 2018 | www.reddit.com

submitted 3 years ago by msiekkinen

[–] tangre 3 years ago (74 children)

Why wouldn't gzip be updated with this functionality instead? Is there a point in keeping it separate?
ilikerackmounts 3 years ago (59 children)
There are certain file sizes where pigz makes no difference; in general you need at least 2 cores to feel the benefits, and there are quite a few reasons for that. That being said, pigz and its bzip2 counterpart pbzip2 can be symlinked in place when emerged on Gentoo with the "symlink" USE flag.
adam@eggsbenedict ~ $eix pigz [I] app-arch/pigz Available versions: 2.2.5 2.3 2.3.1 (~)2.3.1-r1 {static symlink |test} Installed versions: 2.3.1-r1(02:06:01 01/25/14)(symlink -static -|test) Homepage: http://www.zlib.net/pigz/ Description: A parallel implementation of gzip  msiekkinen 3 years ago (38 children) in general you need at least 2 cores to feel the benefits Is it even possible to buy any single core cpus outside of some kind of specialized embedded system these days? exdirrk 3 years ago (5 children) Virtualization. tw4 3 years ago (2 children) Yes, but nevertheless it's possible to allocate only one. too_many_secrets 3 years ago (0 children) Giving a VM more than one CPU is quite a rare circumstance. Depends on your circumstances. It's rare that we have any VMs with a single CPU, but we have thousands of servers and a lot of things going on. FakingItEveryDay 3 years ago (0 children) You can, but often shouldn't. I can only speak for vmware here, other hypervisors may work differently. Generally you want to size your VMware vm's so that they are around 80% cpu utilization. When any VM with multiple cores needs compute power the hypervisor will make it wait to until it can free that number of CPUs, even if the task in the VM only needs one core. This makes the multi-core VM slower by having to wait longer to do it's work, as well as makes other VMs on the hypervisor slower as they must all wait for it to finish before they can get a core allocated. #### [May 28, 2018] Solaris: Parallel Compression/Decompression ##### Notable quotes: ##### "... the following prstat, vmstat outputs show that gzip is compressing the ..." ##### "... tar file using a single thread – hence low CPU utilization. ..." ##### "... wall clock time is 25s compared to gzip's 3m 27s ..." ##### "... the following prstat, vmstat outputs show that pigz is compressing the ..." ##### "... tar file using many threads – hence busy system with high CPU utilization. ..." ##### "... shows that the pigz compressed file is ..." ##### "... compatible with gzip/gunzip ..." ##### "... compare gzip's 52s decompression time with pigz's 18s ..." ###### May 28, 2018 | hadafq8.wordpress.com Posted on January 26, 2015 by Sandeep Shenoy This topic is not Solaris specific, but certainly helps Solaris users who are frustrated with the single threaded implementation of all officially supported compression tools such as compress, gzip, zip. pigz (pig-zee) is a parallel implementation of gzip that suits well for the latest multi-processor, multi-core machines. By default, pigz breaks up the input into multiple chunks of size 128 KB, and compress each chunk in parallel with the help of light-weight threads. The number of compress threads is set by default to the number of online processors. The chunk size and the number of threads are configurable. Compressed files can be restored to their original form using -d option of pigz or gzip tools. As per the man page, decompression is not parallelized out of the box, but may show some improvement compared to the existing old tools. The following example demonstrates the advantage of using pigz over gzip in compressing and decompressing a large file. eg., Original file, and the target hardware.$ ls -lh PT8.53.04.tar -rw-r–r– 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar
$psrinfo -pv The physical processor has 8 cores and 64 virtual processors (0-63) The core has 8 virtual processors (0-7) The core has 8 virtual processors (56-63) SPARC-T5 (chipid 0, clock 3600 MHz) gzip compression.$ time gzip –fast PT8.53.04.tar
real 3m40.125s user 3m27.105s sys 0m13.008s
$ls -lh PT8.53* -rw-r–r– 1 psft dba 3.1G Feb 28 14:03 PT8.53.04.tar.gz /* the following prstat, vmstat outputs show that gzip is compressing the tar file using a single thread – hence low CPU utilization. */$ prstat -p 42510 PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP 42510 psft 2616K 2200K cpu16 10 0 0:01:00 1.5% gzip/ 1

$prstat -m -p 42510 PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP 42510 psft 95 4.6 0.0 0.0 0.0 0.0 0.0 0.0 0 35 7K 0 gzip/1$ vmstat 2 r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id 0 0 0 776242104 917016008 0 7 0 0 0 0 0 0 0 52 52 3286 2606 2178 2 0 98
1 0 0 776242104 916987888 0 14 0 0 0 0 0 0 0 0 0 3851 3359 2978 2 1 97
0 0 0 776242104 916962440 0 0 0 0 0 0 0 0 0 0 0 3184 1687 2023 1 0 98
0 0 0 775971768 916930720 0 0 0 0 0 0 0 0 0 39 37 3392 1819 2210 2 0 98
0 0 0 775971768 916898016 0 0 0 0 0 0 0 0 0 0 0 3452 1861 2106 2 0 98

pigz compression. $time ./pigz PT8.53.04.tar real 0m25.111s <== wall clock time is 25s compared to gzip's 3m 27s user 17m18.398s sys 0m37.718s /* the following prstat, vmstat outputs show that pigz is compressing the tar file using many threads – hence busy system with high CPU utilization. */$ prstat -p 49734 PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP 49734 psft 59M 58M sleep 11 0 0:12:58 38% pigz/ 66

$vmstat 2 kthr memory page disk faults cpu r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id 0 0 0 778097840 919076008 6 113 0 0 0 0 0 0 0 40 36 39330 45797 74148 61 4 35 0 0 0 777956280 918841720 0 1 0 0 0 0 0 0 0 0 0 38752 43292 71411 64 4 32 0 0 0 777490336 918334176 0 3 0 0 0 0 0 0 0 17 15 46553 53350 86840 60 4 35 1 0 0 777274072 918141936 0 1 0 0 0 0 0 0 0 39 34 16122 20202 28319 88 4 9 1 0 0 777138800 917917376 0 0 0 0 0 0 0 0 0 3 3 46597 51005 86673 56 5 39$ ls -lh PT8.53.04.tar.gz -rw-r–r– 1 psft dba 3.0G Feb 28 14:03 PT8.53.04.tar.gz
$gunzip PT8.53.04.tar.gz <== shows that the pigz compressed file is compatible with gzip/gunzip$ ls -lh PT8.53* -rw-r–r– 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar
Decompression. $time ./pigz -d PT8.53.04.tar.gz real 0m18.068s user 0m22.437s sys 0m12.857s$ time gzip -d PT8.53.04.tar.gz real 0m52.806s <== compare gzip's 52s decompression time with pigz's 18s
user 0m42.068s sys 0m10.736s
$ls -lh PT8.53.04.tar -rw-r–r– 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar Of course, there are other tools such as Parallel BZIP2 (PBZIP2), which is a parallel implementation of the bzip2 tool are worth a try too. The idea here is to highlight the fact that there are better tools out there to get the job done in a quick manner compared to the existing/old tools that are bundled with the operating system distribution. #### [Apr 22, 2018] Happy Sysadmin Appreciation Day 2016 Opensource.com ###### Apr 22, 2018 | opensource.com Necessity is frequently the mother of invention. I knew very little about BASH scripting but that was about to change rapidly. Working with the existing script and using online help forums, search engines, and some printed documentation, I setup Linux network attached storage computer running on Fedora Core. I learned how to create an SSH keypair and configure that along with rsync to move the backup file from the email server to the storage server. That worked well for a few days until I noticed that the storage servers disk space was rapidly disappearing. What was I going to do? That's when I learned more about Bash scripting. I modified my rsync command to delete backed up files older than ten days. In both cases I learned that a little knowledge can be a dangerous thing but in each case my experience and confidence as Linux user and system administrator grew and due to that I functioned as a resource for other. On the plus side, we soon realized that the disk to disk backup system was superior to tape when it came to restoring email files. In the long run it was a win but there was a lot of uncertainty and anxiety along the way. #### [Apr 04, 2018] The gzip Recovery Toolkit ###### Apr 04, 2018 | www.urbanophile.com So you thought you had your files backed up - until it came time to restore. Then you found out that you had bad sectors and you've lost almost everything because gzip craps out 10% of the way through your archive. The gzip Recovery Toolkit has a program - gzrecover - that attempts to skip over bad data in a gzip archive. This saved me from exactly the above situation. Hopefully it will help you as well. I'm very eager for feedback on this program . If you download and try it, I'd appreciate and email letting me know what your results were. My email is arenn@urbanophile.com . Thanks. ATTENTION 99% of "corrupted" gzip archives are caused by transferring the file via FTP in ASCII mode instead of binary mode. Please re-transfer the file in the correct mode first before attempting to recover from a file you believe is corrupted. Disclaimer and Warning This program is provided AS IS with absolutely NO WARRANTY. It is not guaranteed to recover anything from your file, nor is what it does recover guaranteed to be good data. The bigger your file, the more likely that something will be extracted from it. Also keep in mind that this program gets faked out and is likely to "recover" some bad data. Everything should be manually verified. Downloading and Installing Note that version 0.8 contains major bug fixes and improvements. See the ChangeLog for details. Upgrading is recommended. The old version is provided in the event you run into troubles with the new release. You need the following packages: First, build and install zlib if necessary. Next, unpack the gzrt sources. Then cd to the gzrt directory and build the gzrecover program by typing make . Install manually by copying to the directory of your choice. Usage Run gzrecover on a corrupted .gz file. 
If you leave the filename blank, gzrecover will read from the standard input. Anything that can be read from the file will be written to a file with the same name, but with a .recovered appended (any .gz is stripped). You can override this with the -o option. The default filename when reading from the standard input is "stdin.recovered". To write recovered data to the standard output, use the -p option. (Note that -p and -o cannot be used together). To get a verbose readout of exactly where gzrecover is finding bad bytes, use the -v option to enable verbose mode. This will probably overflow your screen with text so best to redirect the stderr stream to a file. Once gzrecover has finished, you will need to manually verify any data recovered as it is quite likely that our output file is corrupt and has some garbage data in it. Note that gzrecover will take longer than regular gunzip. The more corrupt your data the longer it takes. If your archive is a tarball, read on. For tarballs, the tar program will choke because GNU tar cannot handle errors in the file format. Fortunately, GNU cpio (tested at version 2.6 or higher) handles corrupted files out of the box. Here's an example: $ ls *.gz
my-corrupted-backup.tar.gz
$ gzrecover my-corrupted-backup.tar.gz
$ ls *.recovered
my-corrupted-backup.tar.recovered
$cpio -F my-corrupted-backup.tar.recovered -i -v  Note that newer versions of cpio can spew voluminous error messages to your terminal. You may want to redirect the stderr stream to /dev/null. Also, cpio might take quite a long while to run. Copyright The gzip Recovery Toolkit v0.8 Copyright (c) 2002-2013 Aaron M. Renn ( arenn@urbanophile.com ) #### [Jan 14, 2018] How to remount filesystem in read write mode under Linux ###### Jan 14, 2018 | kerneltalks.com Most of the time on newly created file systems of NFS filesystems we see error like below :  1 2 3 4 root @ kerneltalks # touch file1 touch : cannot touch ' file1 ' : Read - only file system This is because file system is mounted as read only. In such scenario you have to mount it in read-write mode. Before that we will see how to check if file system is mounted in read only mode and then we will get to how to re mount it as a read write filesystem. How to check if file system is read only To confirm file system is mounted in read only mode use below command –  1 2 3 4 # cat /proc/mounts | grep datastore / dev / xvdf / datastore ext3 ro , seclabel , relatime , data = ordered 0 0 Grep your mount point in cat /proc/mounts and observer third column which shows all options which are used in mounted file system. Here ro denotes file system is mounted read-only. You can also get these details using mount -v command  1 2 3 4 root @ kerneltalks # mount -v |grep datastore / dev / xvdf on / datastore type ext3 ( ro , relatime , seclabel , data = ordered ) In this output. file system options are listed in braces at last column. Re-mount file system in read-write mode To remount file system in read-write mode use below command –  1 2 3 4 5 6 root @ kerneltalks # mount -o remount,rw /datastore root @ kerneltalks # mount -v |grep datastore / dev / xvdf on / datastore type ext3 ( rw , relatime , seclabel , data = ordered ) Observe after re-mounting option ro changed to rw . Now, file system is mounted as read write and now you can write files in it. Note : It is recommended to fsck file system before re mounting it. You can check file system by running fsck on its volume.  1 2 3 4 5 6 7 8 9 10 root @ kerneltalks # df -h /datastore Filesystem Size Used Avail Use % Mounted on / dev / xvda2 10G 881M 9.2G 9 % / root @ kerneltalks # fsck /dev/xvdf fsck from util - linux 2.23.2 e2fsck 1.42.9 ( 28 - Dec - 2013 ) / dev / xvdf : clean , 12 / 655360 files , 79696 / 2621440 blocks Sometimes there are some corrections needs to be made on file system which needs reboot to make sure there are no processes are accessing file system. #### [Jan 14, 2018] Linux yes Command Tutorial for Beginners (with Examples) ###### Jan 14, 2018 | www.howtoforge.com You can see that user has to type 'y' for each query. It's in situation like these where yes can help. For the above scenario specifically, you can use yes in the following way: yes | rm -ri test Q3. Is there any use of yes when it's used alone? Yes, there's at-least one use: to tell how well a computer system handles high amount of loads. Reason being, the tool utilizes 100% processor for systems that have a single processor. In case you want to apply this test on a system with multiple processors, you need to run a yes process for each processor. #### [Dec 09, 2017] How to rsync only a specific list of files - Stack Overflow ##### Notable quotes: ##### "... The filenames that are read from the FILE are all relative to the source dir ..." 
###### Dec 09, 2017 | stackoverflow.com ash, May 11, 2015 at 20:05 There is a flag --files-from that does exactly what you want. From man rsync : --files-from=FILE Using this option allows you to specify the exact list of files to transfer (as read from the specified FILE or - for standard input). It also tweaks the default behavior of rsync to make transferring just the specified files and directories easier: • The --relative (-R) option is implied, which preserves the path information that is specified for each item in the file (use --no-relative or --no-R if you want to turn that off). • The --dirs (-d) option is implied, which will create directories specified in the list on the destination rather than noisily skipping them (use --no-dirs or --no-d if you want to turn that off). • The --archive (-a) option's behavior does not imply --recursive (-r), so specify it explicitly, if you want it. • These side-effects change the default state of rsync, so the position of the --files-from option on the command-line has no bearing on how other options are parsed (e.g. -a works the same before or after --files-from, as does --no-R and all other options). The filenames that are read from the FILE are all relative to the source dir -- any leading slashes are removed and no ".." references are allowed to go higher than the source dir. For example, take this command: rsync -a --files-from=/tmp/foo /usr remote:/backup If /tmp/foo contains the string "bin" (or even "/bin"), the /usr/bin directory will be created as /backup/bin on the remote host. If it contains "bin/" (note the trailing slash), the immediate contents of the directory would also be sent (without needing to be explicitly mentioned in the file -- this began in version 2.6.4). In both cases, if the -r option was enabled, that dir's entire hierarchy would also be transferred (keep in mind that -r needs to be specified explicitly with --files-from, since it is not implied by -a). Also note that the effect of the (enabled by default) --relative option is to duplicate only the path info that is read from the file -- it does not force the duplication of the source-spec path (/usr in this case). In addition, the --files-from file can be read from the remote host instead of the local host if you specify a "host:" in front of the file (the host must match one end of the transfer). As a short-cut, you can specify just a prefix of ":" to mean "use the remote end of the transfer". For example: rsync -a --files-from=:/path/file-list src:/ /tmp/copy This would copy all the files specified in the /path/file-list file that was located on the remote "src" host. If the --iconv and --protect-args options are specified and the --files-from filenames are being sent from one host to another, the filenames will be translated from the sending host's charset to the receiving host's charset. NOTE: sorting the list of files in the --files-from input helps rsync to be more efficient, as it will avoid re-visiting the path elements that are shared between adjacent entries. If the input is not sorted, some path elements (implied directories) may end up being scanned multiple times, and rsync will eventually unduplicate them after they get turned into file-list elements. Nicolas Mattia, Feb 11, 2016 at 11:06 Note that you still have to specify the directory where the files listed are located, for instance: rsync -av --files-from=file-list . target/ for copying files from the current dir. 
– Nicolas Mattia Feb 11 '16 at 11:06 ash, Feb 12, 2016 at 2:25 Yes, and to reiterate: The filenames that are read from the FILE are all relative to the source dir . – ash Feb 12 '16 at 2:25 Michael ,Nov 2, 2016 at 0:09 if the files-from file has anything starting with .. rsync appears to ignore the .. giving me an error like rsync: link_stat "/home/michael/test/subdir/test.txt" failed: No such file or directory (in this case running from the "test" dir and trying to specify "../subdir/test.txt" which does exist. – Michael Nov 2 '16 at 0:09 xxx, --files-from= parameter needs trailing slash if you want to keep the absolute path intact. So your command would become something like below: rsync -av --files-from=/path/to/file / /tmp/ This could be done like there are a large number of files and you want to copy all files to x path. So you would find the files and throw output to a file like below: find /var/* -name *.log > file #### [Nov 13, 2017] 20 Sed (Stream Editor) Command Examples for Linux Users ###### Nov 13, 2017 | www.linuxtechi.com 20 Sed (Stream Editor) Command Examples for Linux Users by Pradeep Kumar · Published November 9, 2017 · Updated November 9, 2017 Sed command or Stream Editor is very powerful utility offered by Linux/Unix systems. It is mainly used for text substitution , find & replace but it can also perform other text manipulations like insertion deletion search etc. With SED, we can edit complete files without actually having to open it. Sed also supports the use of regular expressions, which makes sed an even more powerful test manipulation tool In this article, we will learn to use SED command with the help some examples. Basic syntax for using sed command is, sed OPTIONS [SCRIPT] [INPUTFILE ] Now let's see some examples. Example :1) Displaying partial text of a file With sed, we can view only some part of a file rather than seeing whole file. To see some lines of the file, use the following command, [linuxtechi@localhost ~]$ sed -n 22,29p testfile.txt


Here, option 'n' suppresses printing of the whole file & option 'p' prints only lines 22 to 29.

Example :2) Display all except some lines

To display all content of a file except for some portion, use the following command,

[linuxtechi@localhost ~]$ sed 22,29d testfile.txt

Option 'd' will remove the mentioned lines from the output.

Example :3) Display every 3rd line starting with Nth line

To display every 3rd line starting with line number 2 (or any other line), use the following command (the 'first~step' address form is a GNU sed extension):

[linuxtechi@localhost ~]$ sed -n '2~3p' file.txt

Example :4 ) Deleting a line using sed command

To delete a line with sed from a file, use the following command,

[linuxtechi@localhost ~]$ sed 'Nd' testfile.txt

where 'N' is the line number & option 'd' deletes the mentioned line. To delete the last line of the file, use (quoted so that the shell does not try to expand $d):

[linuxtechi@localhost ~]$ sed '$d' testfile.txt

Example :5) Deleting a range of lines

To delete a range of lines from the file, run

[linuxtechi@localhost ~]$ sed '29,34d' testfile.txt


This will delete lines 29 to 34 from the file testfile.txt.

Example :6) Deleting lines other than the mentioned

To delete lines other than the mentioned lines from a file, we will use '!'

[linuxtechi@localhost ~]$ sed '29,34!d' testfile.txt

Here '!' acts as a negation, so it reverses the condition, i.e. the mentioned lines will not be deleted; all lines other than 29 to 34 will be deleted from the file testfile.txt.

Example :7) Adding Blank lines/spaces

To add a blank line after every line of the file, we will use option 'G':

[linuxtechi@localhost ~]$ sed G testfile.txt

Example :8) Search and Replacing a string using sed

To search & replace a string from the file, we will use the following example,

[linuxtechi@localhost ~]$ sed 's/danger/safety/' testfile.txt

Here option 's' searches for the word 'danger' & replaces it with 'safety' on every line, but only for the first occurrence on each line.

Example :9) Search and replace a string from whole file using sed

To replace every occurrence of the word throughout the file, we will use option 'g' with 's':

[linuxtechi@localhost ~]$ sed 's/danger/safety/g' testfile.txt

Example :10) Replace the nth occurrence of string pattern

We can also substitute a string only at its nth occurrence on a line. For example, to replace 'danger' with 'safety' only at the second occurrence:

[linuxtechi@localhost ~]$ sed 's/danger/safety/2' testfile.txt

To replace 'danger' from the 2nd occurrence onwards on every line of the file, use

[linuxtechi@localhost ~]$ sed 's/danger/safety/2g' testfile.txt

Example :11) Replace a string on a particular line

To replace a string only on a particular line, use

[linuxtechi@localhost ~]$ sed '4 s/danger/safety/' testfile.txt

This will only substitute the string on the 4th line of the file. We can also mention a range of lines instead of a single line:

[linuxtechi@localhost ~]$ sed '4,9 s/danger/safety/' testfile.txt

Example :12) Add a line after/before the matched search

To add a new line with some content after every pattern match, use option 'a' ,

[linuxtechi@localhost ~]$ sed '/danger/a "This is new line with text after match"' testfile.txt

To add a new line with some content before every pattern match, use option 'i':

[linuxtechi@localhost ~]$ sed '/danger/i "This is new line with text before match"' testfile.txt

Example :13) Change a whole line with matched pattern

To change a whole line to a new line when a search pattern matches we need to use option 'c' with sed,

[linuxtechi@localhost ~]$ sed '/danger/c "This will be the new line"' testfile.txt

So when the pattern 'danger' matches, the whole line will be changed to the mentioned line.

Advanced options with sed

Up until now we were only using simple expressions with sed; now we will discuss some advanced uses of sed with regular expressions.

Example :14) Running multiple sed commands

If we need to perform multiple sed expressions, we can use option 'e' to chain the sed commands:

[linuxtechi@localhost ~]$ sed -e 's/danger/safety/g' -e 's/hate/love/' testfile.txt

Example :15) Making a backup copy before editing a file

To create a backup copy of a file before we edit it, use option '-i.bak',

[linuxtechi@localhost ~]$ sed -i.bak -e 's/danger/safety/g' testfile.txt

This will create a backup copy of the file with extension .bak. You can also use another extension if you like.

Example :16) Delete text starting with & ending with a pattern

To delete, on each line, the text starting with a particular string & ending with another string, use

[linuxtechi@localhost ~]$ sed -e 's/danger.*stops//g' testfile.txt


This will delete everything from 'danger' through 'stops' on matching lines, with any amount of text in between; '.*' matches that part. Note that the line itself remains (possibly empty), because the 's' command only removes the matched text.
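If the intention is to remove the entire matching line rather than just the matched text, the 'd' command with a pattern address does that directly (a minimal sketch against the same hypothetical testfile.txt):

[linuxtechi@localhost ~]$ sed '/danger.*stops/d' testfile.txt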

Example :17) Appending lines

To add some content before every line with sed & regex, use

[linuxtechi@localhost ~]$ sed -e 's/.*/testing sed &/' testfile.txt

So now every line will have 'testing sed' before it.

Example :18) Removing all commented lines & empty lines

To remove all commented lines, i.e. lines with #, & all the empty lines, use

[linuxtechi@localhost ~]$ sed -e 's/#.*//;/^$/d' testfile.txt

To only remove the comments (leaving blank lines in place), use

[linuxtechi@localhost ~]$ sed -e 's/#.*//' testfile.txt

Example :19) Get list of all usernames from /etc/passwd

To get the list of all usernames from /etc/passwd file, use

[linuxtechi@localhost ~]$ sed 's/\([^:]*\).*/\1/' /etc/passwd

A complete list of all usernames will be printed as output.

Example :20) Prevent overwriting of system links with sed command

The 'sed -i' command has been known to remove system links & create only regular files in place of the link file. To avoid such a situation & prevent 'sed -i' from destroying the links, use the '--follow-symlinks' option with the command being executed. Let's assume I want to disable SELinux on CentOS or RHEL servers:

[linuxtechi@localhost ~]# sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux

These were some examples to show sed; we can use them as a reference and employ them as & when needed. If you have any queries related to this or any other article, do share with us.

#### [Nov 09, 2017] TERM strings by Tom Ryder

###### Jan 26, 2013 | sanctum.geek.nz

A certain piece of very misleading advice is often given online to users having problems with the way certain command-line applications are displaying in their terminals. This is to suggest that the user change the value of their TERM environment variable from within the shell, doing something like this:

$ TERM=xterm-256color

This misinformation sometimes extends to suggesting that users put the forced TERM change into their shell startup scripts. The reason this is such a bad idea is that it forces your shell to assume what your terminal is, and thereby disregards the initial terminal identity string sent by the emulator. This leads to a lot of confusion when one day you need to connect with a very different terminal emulator.

Accounting for differences

All terminal emulators are not created equal. Certainly, not all of them are xterm(1) , although many other terminal emulators do a decent but not comprehensive job of copying it. The value of the TERM environment variable is used by the system running the shell to determine what the terminal connecting to it can and cannot do, what control codes to send to the program to use those features, and how the shell should understand the input of certain key codes, such as the Home and End keys. These things in particular are common causes of frustration for new users who turn out to be using a forced TERM string.

Instead, focus on these two guidelines for setting TERM :

1. Avoid setting TERM from within the shell, especially in your startup scripts like .bashrc or .bash_profile . If that ever seems like the answer, then you are probably asking the wrong question! The terminal identification string should always be sent by the terminal emulator you are using; if you do need to change it, then change it in the settings for the emulator.
2. Always use an appropriate TERM string that accurately describes what your choice of terminal emulator can and cannot display. Don't make an rxvt(1) terminal identify itself as xterm ; don't make a linux console identify itself as vt100 ; and don't make an xterm(1) compiled without 256 color support refer to itself as xterm-256color .

In particular, note that sometimes for compatibility reasons, the default terminal identification used by an emulator is given as something generic like xterm , when in fact a more accurate or comprehensive terminal identity file is more than likely available for your particular choice of terminal emulator with a little searching.

An example that surprises a lot of people is the availability of the putty terminal identity file, when the application defaults to presenting itself as an imperfect xterm(1) emulator.

Before you change your terminal string in its settings, check whether the default it uses is already the correct one, with one of these:

$ echo $TERM
$ tset -q

Most builds of rxvt(1), for example, should already use the correct TERM string by default, such as rxvt-unicode-256color for builds with 256 colors and Unicode support.

Where to configure which TERM string your terminal uses will vary depending on the application. For xterm(1), your .Xresources file should contain a definition like the below:

XTerm*termName: xterm-256color

For rxvt(1), the syntax is similar:

URxvt*termName: rxvt-unicode-256color

Other GTK and Qt emulators sometimes include the setting somewhere in their preferences. Look for mentions of xterm, a common fallback default.

For Windows PuTTY, it's configurable under the "Connection > Data" section. More detail about configuring PuTTY for connecting to modern systems can be found in my article on configuring PuTTY.

Testing your TERM string

On GNU/Linux systems, an easy way to test the terminal capabilities (particularly effects like colors and reverse video) is using the msgcat(1) utility:

$ msgcat --color=test


This will output a large number of tests of various features to the terminal, so that you can check their appearance is what you expect.

Finding appropriate terminfo(5) definitions

On GNU/Linux systems, the capabilities and behavior of various terminal types is described using terminfo(5) files, usually installed as part of the ncurses package. These files are often installed in /lib/terminfo or /usr/share/terminfo , in subdirectories by first letter.

In order to use a particular TERM string, an appropriate file must exist in one of these directories. On Debian-derived systems, a large collection of terminal types can be installed to the system with the ncurses-term package.
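One quick way to check whether a particular definition is already installed is to ask infocmp(1), which ships with ncurses, to dump it (a small sketch; the terminal name is only an example):

$ infocmp rxvt-unicode-256color >/dev/null && echo "definition present"

If the definition is missing, infocmp prints an error and exits non-zero.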

For example, the following variants of the rxvt terminal emulator are all available:

$ cd /usr/share/terminfo/r
$ ls rxvt*
rxvt-16color  rxvt-256color  rxvt-88color  rxvt-color  rxvt-cygwin
rxvt-cygwin-native  rxvt+pcfkeys  rxvt-unicode-256color  rxvt-xpm

Private and custom terminfo(5) files

If you connect to a system that doesn't have a terminfo(5) definition to match the TERM definition for your particular terminal, you might get a message similar to this on login:

setterm: rxvt-unicode-256color: unknown terminal type
tput: unknown terminal "rxvt-unicode-256color"
$

If you're not able to install the appropriate terminal definition system-wide, one technique is to use a private .terminfo directory in your home directory containing the definitions you need:

$ cd ~/.terminfo
$ find .
./x
./x/xterm-256color
./x/xterm
./r
./r/rxvt-256color
./r/rxvt-unicode-256color
./r/rxvt
./s
./s/screen
./s/screen-256color
./p
./p/putty-256color
./p/putty

You can copy this to your home directory on the servers you manage with a tool like scp :

$ scp -r .terminfo server:
TERM and multiplexers

Terminal multiplexers like screen(1) and tmux(1) are special cases, and they cause perhaps the most confusion to people when inaccurate TERM strings are used. The tmux FAQ even opens by saying that most of the display problems reported by people are due to incorrect TERM settings, and a good portion of the codebase in both multiplexers is dedicated to negotiating the differences between terminal capacities.

This is because they are "terminals within terminals", and provide their own functionality only within the bounds of what the outer terminal can do. In addition to this, they have their own type for terminals within them; both of them use screen and its variants, such as screen-256color .

It's therefore very important to check that both the outer and inner definitions for TERM are correct. In .screenrc it usually suffices to use a line like the following:

term screen

Or in .tmux.conf :

set-option -g default-terminal screen

If the outer terminals you use consistently have 256 color capabilities, you may choose to use the screen-256color variant instead.

If you follow all of these guidelines, your terminal experience will be much smoother, as your terminal and your system will understand each other that much better. You may find that this fixes a lot of struggles with interactive tools like vim(1), for one thing, because if the application is able to divine things like the available color space directly from terminal information files, it saves you from having to include nasty hacks on the t_Co variable in your .vimrc.

#### [Nov 09, 2017] PuTTY configuration by Tom Ryder

###### Dec 22, 2012 | sanctum.geek.nz

Posted on PuTTY is a terminal emulator with a free software license, including an SSH client. While it has cross-platform ports, it's used most frequently on Windows systems, because they otherwise lack a built-in terminal emulator that interoperates well with Unix-style TTY systems.

While it's very popular and useful, PuTTY's defaults are quite old, and are chosen for compatibility reasons rather than to take advantage of all the features of a more complete terminal emulator. For new users, this is likely an advantage as it can avoid confusion, but more advanced users who need to use a Windows client to connect to a modern GNU/Linux system may find the defaults frustrating, particularly when connecting to a more capable and custom-configured server.

Here are a few of the problems with the default configuration:

• It identifies itself as an xterm(1) , when terminfo(5) definitions are available named putty and putty-256color , which more precisely define what the terminal can and cannot do, and their various custom escape sequences.
• It only allows 16 colors, where most modern terminals are capable of using 256; this is partly tied into the terminal type definition.
• It doesn't use UTF-8 by default, which should be used whenever possible for reasons of interoperability and compatibility, and is well-supported by modern locale definitions on GNU/Linux.
• It uses Courier New, a workable but rather harsh monospace font, which should be swapped out for something more modern if available.
• It uses audible terminal bells, which tend to be annoying.
• Its default palette based on xterm(1) is rather garish and harsh; softer colors are more pleasant to read.

All of these things are fixable.

Terminal type

Usually the most important thing in getting a terminal working smoothly is to make sure it identifies itself correctly to the machine to which it's connecting, using an appropriate $TERM string. By default, PuTTY identifies itself as an xterm(1) terminal emulator, which most systems will support. However, there's a terminfo(5) definition for putty and putty-256color available as part of ncurses, and if you have it available on your system then you should use it, as it slightly more precisely describes the features available to PuTTY as a terminal emulator.

You can check that you have the appropriate terminfo(5) definition installed by looking in /usr/share/terminfo/p :

$ ls -1 /usr/share/terminfo/p/putty*
/usr/share/terminfo/p/putty
/usr/share/terminfo/p/putty-256color
/usr/share/terminfo/p/putty-sco
/usr/share/terminfo/p/putty-vt100


On Debian and Ubuntu systems, these files can be installed with:

# apt-get install ncurses-term


If you can't install the files via your system's package manager, you can also keep a private repository of terminfo(5) files in your home directory, in a directory called  .terminfo :

$ ls -1 $HOME/.terminfo/p
putty
putty-256color


Once you have this definition installed, you can instruct PuTTY to identify with that $TERM string in the Connection > Data section. Here, I've used putty-256color ; if you don't need or want a 256 color terminal you could just use putty .

Once connected, make sure that your $TERM string matches what you specified, and hasn't been mangled by any of your shell or terminal configurations:

$ echo $TERM
putty-256color

Color space

Certain command line applications like Vim and Tmux can take advantage of a full 256 colors in the terminal. If you'd like to use this, set PuTTY's $TERM string to putty-256color as outlined above, and select Allow terminal to use xterm 256-colour mode in Window > Colours.

You can test this is working by using a 256 color application, or by trying out the terminal colours directly in your shell using tput :

$ for ((color = 0; color <= 255; color++)); do
> tput setaf "$color"
> printf "test"
> done

If you see the word test in many different colors, then things are probably working. Type reset to fix your terminal after this:

$ reset

Using UTF-8

If you're connecting to a modern GNU/Linux system, it's likely that you're using a UTF-8 locale. You can check which one by typing locale . In my case, I'm using the en_NZ locale with UTF-8 character encoding:

$ locale
LANG=en_NZ.UTF-8
LANGUAGE=en_NZ:en
LC_CTYPE="en_NZ.UTF-8"
LC_NUMERIC="en_NZ.UTF-8"
LC_TIME="en_NZ.UTF-8"
LC_COLLATE="en_NZ.UTF-8"
LC_MONETARY="en_NZ.UTF-8"
LC_MESSAGES="en_NZ.UTF-8"
LC_PAPER="en_NZ.UTF-8"
LC_NAME="en_NZ.UTF-8"
LC_ADDRESS="en_NZ.UTF-8"
LC_TELEPHONE="en_NZ.UTF-8"
LC_MEASUREMENT="en_NZ.UTF-8"
LC_IDENTIFICATION="en_NZ.UTF-8"
LC_ALL=

If the output of locale does show you're using a UTF-8 character encoding, then you should configure PuTTY to interpret terminal output using that character set; it can't detect it automatically (which isn't PuTTY's fault; it's a known hard problem). You do this in the Window > Translation section.

While you're in this section, it's best to choose the Use Unicode line drawing code points option as well. Line-drawing characters are most likely to work properly with this setting for UTF-8 locales and modern fonts.

If Unicode and its various encodings are new to you, I highly recommend Joel Spolsky's classic article about what programmers should know about both.

Fonts

Courier New is a workable monospace font, but modern Windows systems include Consolas , a much nicer terminal font. You can change this in the Window > Appearance section. There's no reason you can't use another favourite Bitmap or TrueType font instead once it's installed on your system; DejaVu Sans Mono , Inconsolata , and Terminus are popular alternatives. I personally favor Ubuntu Mono .

Bells

Terminal bells by default in PuTTY emit the system alert sound. Most people find this annoying; some sort of visual bell tends to be much better if you want to use the bell at all. Configure this in Terminal > Bell. Given that the purpose of the alert is to draw attention to the window, I find that using a flashing taskbar icon works well; I use this to draw my attention to my prompt being displayed after a long task completes, or if someone mentions my name or directly messages me in irssi(1) . Another option is using the Visual bell (flash window) option, but I personally find this even worse than the audible bell.

Default palette

The default colours for PuTTY are rather like those used in xterm(1) , and hence rather harsh, particularly if you're used to the slightly more subdued colorscheme of terminal emulators like gnome-terminal(1) , or have customized your palette to something like Solarized . If you have decimal RGB values for the colours you'd prefer to use, you can enter those in the Window > Colours section, making sure that Use system colours and Attempt to use logical palettes are unchecked.

There are a few other default annoyances in PuTTY, but the above are the ones that seem to annoy advanced users most frequently. Dag Wieers has a similar post with a few more defaults to fix.

#### [Nov 09, 2017] Searching files

##### Notable quotes:

##### "... With all this said, there's a very popular alternative to grep called ack , which excludes this sort of stuff for you by default. It also allows you to use Perl-compatible regular expressions (PCRE), which are a favourite for many programmers. It has a lot of utilities that are generally useful for working with source code, so while there's nothing wrong with good old grep since you know it will always be there, if you can install ack I highly recommend it. There's a Debian package called ack-grep , and being a Perl script it's otherwise very simple to install. ..."
##### "... Unix purists might be displeased with my even mentioning a relatively new Perl script alternative to classic grep , but I don't believe that the Unix philosophy or using Unix as an IDE is dependent on sticking to the same classic tools when alternatives with the same spirit that solve new problems are available. ..."

###### sanctum.geek.nz

More often than attributes of a set of files, however, you want to find files based on their contents, and it's no surprise that grep, in particular grep -R, is useful here. This searches the current directory tree recursively for anything matching 'someVar':

$ grep -FR someVar .


Don't forget the case insensitivity flag either, since by default grep works with fixed case:

$grep -iR somevar .  Also, you can print a list of files that match without printing the matches themselves with grep -l: $ grep -lR someVar .


If you write scripts or batch jobs using the output of the above, use a while loop with read to handle spaces and other special characters in filenames:

grep -lR someVar | while IFS= read -r file; do
    head "$file"
done


Some versions of grep include --exclude and --exclude-dir options, which may be tidier.
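For example, on versions of GNU grep that have these options, the version-control metadata can be skipped in the search itself (a small sketch reusing the hypothetical search term from above):

$ grep -R --exclude-dir=.svn --exclude-dir=.git someVar .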

With all this said, there's a very popular alternative to grep called ack, which excludes this sort of stuff for you by default. It also allows you to use Perl-compatible regular expressions (PCRE), which are a favourite for many programmers. It has a lot of utilities that are generally useful for working with source code, so while there's nothing wrong with good old grep since you know it will always be there, if you can install ack I highly recommend it. There's a Debian package called ack-grep, and being a Perl script it's otherwise very simple to install.

Unix purists might be displeased with my even mentioning a relatively new Perl script alternative to classic grep, but I don't believe that the Unix philosophy or using Unix as an IDE is dependent on sticking to the same classic tools when alternatives with the same spirit that solve new problems are available.




#### [Nov 01, 2017] Cron best practices by Tom Ryder

###### May 08, 2016 | sanctum.geek.nz

The time-based job scheduler cron(8) has been around since Version 7 Unix, and its crontab(5) syntax is familiar even for people who don't do much Unix system administration. It's standardised , reasonably flexible, simple to configure, and works reliably, and so it's trusted by both system packages and users to manage many important tasks.

However, like many older Unix tools, cron(8) 's simplicity has a drawback: it relies upon the user to know some detail of how it works, and to correctly implement any other safety checking behaviour around it. Specifically, all it does is try and run the job at an appropriate time, and email the output. For simple and unimportant per-user jobs, that may be just fine, but for more crucial system tasks it's worthwhile to wrap a little extra infrastructure around it and the tasks it calls.

There are a few ways to make the way you use cron(8) more robust if you're in a situation where keeping track of the running job is desirable.

Apply the principle of least privilege

The sixth column of a system crontab(5) file is the username of the user as which the task should run:

0 * * * *  root  cron-task

To the extent that is practical, you should run the task as a user with only the privileges it needs to run, and nothing else. This can sometimes make it worthwhile to create a dedicated system user purely for running scheduled tasks relevant to your application.

0 * * * *  myappcron  cron-task

This is not just for security reasons, although those are good ones; it helps protect you against nasties like scripting errors attempting to remove entire system directories .

Similarly, for tasks with database systems such as MySQL, don't use the administrative root user if you can avoid it; instead, use or even create a dedicated user with a unique random password stored in a locked-down ~/.my.cnf file, with only the needed permissions. For a MySQL backup task, for example, only a few permissions should be required, including SELECT , SHOW VIEW , and LOCK TABLES .

In some cases, of course, you really will need to be root . In particularly sensitive contexts you might even consider using sudo(8) with appropriate NOPASSWD options, to allow the dedicated user to run only the appropriate tasks as root , and nothing else.
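What such a rule might look like (a rough sketch only; the user name and command path are hypothetical, and it belongs in a file under /etc/sudoers.d edited with visudo):

myappcron ALL=(root) NOPASSWD: /usr/local/bin/cron-task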

Before placing a task in a crontab(5) file, you should test it on the command line, as the user configured to run the task and with the appropriate environment set. If you're going to run the task as root , use something like su or sudo -i to get a root shell with the user's expected environment first:

$ sudo -i -u cronuser
$ cron-task


Once the task works on the command line, place it in the crontab(5) file with the timing settings modified to run the task a few minutes later, and then watch /var/log/syslog with tail -f to check that the task actually runs without errors, and that the task itself completes properly:

May  7 13:30:01 yourhost CRON[20249]: (you) CMD (cron-task)

This may seem pedantic at first, but it becomes routine very quickly, and it saves a lot of hassles down the line as it's very easy to make an assumption about something in your environment that doesn't actually hold in the one that cron(8) will use. It's also a necessary acid test to make sure that your crontab(5) file is well-formed, as some implementations of cron(8) will refuse to load the entire file if one of the lines is malformed.

If necessary, you can set arbitrary environment variables for the tasks at the top of the file:

MYVAR=myvalue

0 * * * *  you  cron-task
Don't throw away errors or useful output

You've probably seen tutorials on the web where in order to keep the crontab(5) job from sending standard output and/or standard error emails every five minutes, shell redirection operators are included at the end of the job specification to discard both the standard output and standard error. This kluge is particularly common for running web development tasks by automating a request to a URL with curl(1) or wget(1) :

*/5 * * * *  root  curl https://example.com/cron.php >/dev/null 2>&1

Ignoring the output completely is generally not a good idea, because unless you have other tasks or monitoring ensuring the job does its work, you won't notice problems (or know what they are), when the job emits output or errors that you actually care about.

In the case of curl(1) , there are just way too many things that could go wrong, that you might notice far too late:

• The script could get broken and return 500 errors.
• The URL of the cron.php task could change, and someone could forget to add a HTTP 301 redirect.
• Even if a HTTP 301 redirect is added, if you don't use -L or --location for curl(1) , it won't follow it.
• The client could get blacklisted, firewalled, or otherwise impeded by automatic or manual processes that falsely flag the request as spam.
• If using HTTPS, connectivity could break due to cipher or protocol mismatch.

The author has seen all of the above happen, in some cases very frequently.

As a general policy, it's worth taking the time to read the manual page of the task you're calling, and to look for ways to correctly control its output so that it emits only the output you actually want. In the case of curl(1) , for example, I've found the following formula works well:

curl -fLsS -o /dev/null http://example.com/

• -f : If the HTTP response code is an error, emit an error message rather than the 404 page.
• -L : If there's an HTTP 301 redirect given, try to follow it.
• -sS : Don't show progress meter ( -S stops -s from also blocking error messages).
• -o /dev/null : Send the standard output (the actual page returned) to /dev/null .

This way, the curl(1) request should stay silent if everything is well, per the old Unix philosophy Rule of Silence .

You may not agree with some of the choices above; you might think it important to e.g. log the complete output of the returned page, or to fail rather than silently accept a 301 redirect, or you might prefer to use wget(1) . The point is that you take the time to understand in more depth what the called program will actually emit under what circumstances, and make it match your requirements as closely as possible, rather than blindly discarding all the output and (worse) the errors. Work with Murphy's law ; assume that anything that can go wrong eventually will.

Send the output somewhere useful

Another common mistake is failing to set a useful MAILTO at the top of the  crontab(5) file, as the specified destination for any output and errors from the tasks. cron(8) uses the system mail implementation to send its messages, and typically, default configurations for mail agents will simply send the message to an mbox file in  /var/mail/$USER , that they may not ever read. This defeats much of the point of mailing output and errors. This is easily dealt with, though; ensure that you can send a message to an address you actually do check from the server, perhaps using mail(1) : $ printf '%s\n' 'Test message' | mail -s 'Test subject' you@example.com


Once you've verified that your mail agent is correctly configured and that the mail arrives in your inbox, set the address in a MAILTO variable at the top of your file:

MAILTO=you@example.com

0 * * * *    you  cron-task-1
*/5 * * * *  you  cron-task-2


If you don't want to use email for routine output, another method that works is sending the output to syslog with a tool like logger(1) :

0 * * * *   you  cron-task | logger -it cron-task


Alternatively, you can configure aliases on your system to forward system mail destined for you on to an address you check. For Postfix, you'd use an aliases(5) file.
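A minimal sketch of such an alias, assuming Postfix and a local account called you (remember to run newaliases(1) after editing /etc/aliases):

you: you@example.com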

I sometimes use this setup in cases where the task is expected to emit a few lines of output which might be useful for later review, but send stderr output via MAILTO as normal. If you'd rather not use syslog , perhaps because the output is high in volume and/or frequency, you can always set up a log file /var/log/cron-task.log but don't forget to add a logrotate(8) rule for it!
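A rough sketch of what that logrotate(8) rule might look like (the path matches the hypothetical log file above; adjust retention to taste):

/var/log/cron-task.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}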

Put the tasks in their own shell script file

Ideally, the commands in your crontab(5) definitions should only be a few words, in one or two commands. If the command is running off the screen, it's likely too long to be in the crontab(5) file, and you should instead put it into its own script. This is a particularly good idea if you want to reliably use features of bash or some other shell besides POSIX/Bourne /bin/sh for your commands, or even a scripting language like Awk or Perl; by default, cron(8) uses the system's /bin/sh implementation for parsing the commands.

Because crontab(5) files don't allow multi-line commands, and have other gotchas like the need to escape percent signs % with backslashes, keeping as much configuration out of the actual crontab(5) file as you can is generally a good idea.

If you're running cron(8) tasks as a non-system user, and can't add scripts into a system bindir like /usr/local/bin , a tidy method is to start your own, and include a reference to it as part of your PATH . I favour ~/.local/bin , and have seen references to ~/bin as well. Save the script in ~/.local/bin/cron-task , make it executable with chmod +x , and include the directory in the PATH environment definition at the top of the file:

PATH=/home/you/.local/bin:/usr/local/bin:/usr/bin:/bin
MAILTO=you@example.com

0 * * * *  you  cron-task


Having your own directory with custom scripts for your own purposes has a host of other benefits, but that's another article.
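As a rough idea of what such a wrapper script might look like (everything here is a hypothetical sketch, not a prescribed layout):

#!/bin/sh
# ~/.local/bin/cron-task -- wrapper for a scheduled job (hypothetical)
set -eu
printf 'cron-task started at %s\n' "$(date)"
# ... the actual work goes here ...
printf 'cron-task finished at %s\n' "$(date)"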

Avoid /etc/crontab

If your implementation of cron(8) supports it, rather than having an /etc/crontab file a mile long, you can put tasks into separate files in /etc/cron.d :

$ ls /etc/cron.d
system-a  system-b  raid-maint

This approach allows you to group the configuration files meaningfully, so that you and other administrators can find the appropriate tasks more easily; it also allows you to make some files editable by some users and not others, and reduces the chance of edit conflicts. Using sudoedit(8) helps here too. Another advantage is that it works better with version control; if I start collecting more than a few of these task files, or updating them more often than every few months, I start a Git repository to track them:

$ cd /etc/cron.d
$ sudo git init
$ sudo git add --all
$ sudo git commit -m "First commit"

If you're editing a crontab(5) file for tasks related only to the individual user, use the crontab(1) tool; you can edit your own crontab(5) by typing crontab -e , which will open your $EDITOR to edit a temporary file that will be installed on exit. This will save the files into a dedicated directory, which on my system is /var/spool/cron/crontabs .

On the systems maintained by the author, it's quite normal for /etc/crontab never to change from its packaged template.

Include a timeout

cron(8) will normally allow a task to run indefinitely, so if this is not desirable, you should consider either using options of the program you're calling to implement a timeout, or including one in the script. If there's no option for the command itself, the timeout(1) command wrapper in coreutils is one possible way of implementing this:

0 * * * *  you  timeout 10s cron-task


Greg's wiki has some further suggestions on ways to implement timeouts .

Include file locking to prevent overruns

cron(8) will start a new process regardless of whether its previous runs have completed, so if you wish to avoid overlapping runs of a long-running task, on GNU/Linux you could use the flock(1) wrapper for the flock(2) system call to set an exclusive lockfile, in order to prevent the task from running more than one instance in parallel.

0 * * * *  you  flock -nx /var/lock/cron-task cron-task


Greg's wiki has some more in-depth discussion of the file locking problem for scripts in a general sense, including important information about the caveats of "rolling your own" when flock(1) is not available.

If it's important that your tasks run in a certain order, consider whether it's necessary to have them in separate tasks at all; it may be easier to guarantee they're run sequentially by collecting them in a single shell script.
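A bare-bones sketch of that approach (the task names are hypothetical):

#!/bin/sh
# run-nightly -- run related tasks strictly in order (hypothetical)
set -e    # stop at the first failure so later steps never run out of sequence
task-one
task-two
task-three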

Do something useful with exit statuses

If your cron(8) task or commands within its script exit non-zero, it can be useful to run commands that handle the failure appropriately, including cleanup of appropriate resources, and sending information to monitoring tools about the current status of the job. If you're using Nagios Core or one of its derivatives, you could consider using send_nsca to send passive checks reporting the status of jobs to your monitoring server. I've written a simple script called nscaw to do this for me:

0 * * * *  you  nscaw CRON_TASK -- cron-task

Consider alternatives to cron(8)

If your machine isn't always on and your task doesn't need to run at a specific time, but rather needs to run once daily or weekly, you can install anacron and drop scripts into the cron.hourly , cron.daily , cron.monthly , and cron.weekly directories in /etc , as appropriate. Note that on Debian and Ubuntu GNU/Linux systems, the default /etc/crontab contains hooks that run these, but they run only if anacron(8) is not installed.

If you're using cron(8) to poll a directory for changes and run a script if there are such changes, on GNU/Linux you could consider using a daemon based on inotifywait(1) instead.

Finally, if you require more advanced control over when and how your task runs than cron(8) can provide, you could perhaps consider writing a daemon to run on the server consistently and fork processes for its task. This would allow running a task more often than once a minute, as an example. Don't get too bogged down into thinking that cron(8) is your only option for any kind of asynchronous task management!

#### [Nov 01, 2017] Listing files

###### www.tecmint.com

Using ls is probably one of the first commands an administrator will learn for getting a simple list of the contents of the directory. Most administrators will also know about the -a and -l switches, to show all files including dot files and to show more detailed data about files in columns, respectively.

There are other switches to GNU ls which are less frequently used, some of which turn out to be very useful for programming:

• -t - List files in order of last modification date, newest first. This is useful for very large directories when you want to get a quick list of the most recent files changed, maybe piped through head or sed 10q. Probably most useful combined with -l (see the short example after this list). If you want the oldest files, you can add -r to reverse the list.
• -X - Group files by extension; handy for polyglot code, to group header files and source files separately, or to separate source files from directories or build files.
• -v - Naturally sort version numbers in filenames.
• -S - Sort by filesize.
• -R - List files recursively. This one is good combined with -l and piped through a pager like less.
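A quick sketch of that combination:

$ ls -lt | head -n 11

This shows the ten most recently modified files; the extra line accounts for the "total" header that ls -l prints first.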

Since the listing is text like anything else, you could, for example, pipe the output of this command into a vim process, so you could add explanations of what each file is for and save it as an inventory file or add it to a README:

$ ls -XR | vim -

This kind of stuff can even be automated by make with a little work, which I'll cover in another article later in the series.

#### [Nov 01, 2017] Default grep options by Tom Ryder

###### May 18, 2012 | sanctum.geek.nz

When you're searching a set of version-controlled files for a string with grep , particularly if it's a recursive search, it can get very annoying to be presented with swathes of results from the internals of the hidden version control directories like .svn or .git , or include metadata you're unlikely to have wanted in files like .gitmodules .

GNU grep uses an environment variable named GREP_OPTIONS to define a set of options that are always applied to every call to grep . This comes in handy when exported in your .bashrc file to set a "standard" grep environment for your interactive shell. Here's an example of a definition of GREP_OPTIONS that excludes a lot of patterns which you'd very rarely if ever want to search with grep :

GREP_OPTIONS=
for pattern in .cvs .git .hg .svn; do
    GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
done
export GREP_OPTIONS

Note that --exclude-dir is a relatively recent addition to the options for GNU grep , but it should only be missing on very legacy GNU/Linux machines by now. If you want to keep your .bashrc file compatible, you could apply a little extra hackery to make sure the option is available before you set it up to be used:

GREP_OPTIONS=
if grep --help | grep -- --exclude-dir &>/dev/null; then
    for pattern in .cvs .git .hg .svn; do
        GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
    done
fi
export GREP_OPTIONS

Similarly, you can ignore single files with --exclude . There's also --exclude-from=FILE if your list of excluded patterns starts getting too long.

Other useful options available in GNU grep that you might wish to add to this environment variable include:

• --color -- On appropriate terminal types, highlight the pattern matches in output, among other color changes that make results more readable.
• -s -- Suppresses error messages about files not existing or being unreadable; helps if you find this behaviour more annoying than useful.
• -E, -F, or -P -- Pick a favourite "mode" for grep ; devotees of PCRE may find adding -P for grep 's experimental PCRE support makes grep behave in a much more pleasing way, even though it's described in the manual as being experimental and incomplete.

If you don't want to use GREP_OPTIONS , you could instead simply set up an alias :

alias grep='grep --exclude-dir=.git'

You may actually prefer this method as it's essentially functionally equivalent, but if you do it this way, when you want to call grep without your standard set of options, you only have to prepend a backslash to its call:

$ \grep pattern file


Commenter Andy Pearce also points out that using this method can avoid some build problems where GREP_OPTIONS would interfere.

Of course, you could solve a lot of these problems simply by using ack, but that's another post.

#### [Oct 31, 2017] Bash job control by Tom Ryder

###### Jan 31, 2012 | sanctum.geek.nz

Oftentimes you may wish to start a process on the Bash shell without having to wait for it to actually complete, but still be notified when it does. Similarly, it may be helpful to temporarily stop a task while it's running without actually quitting it, so that you can do other things with the terminal. For these kinds of tasks, Bash's built-in job control is very useful.

Backgrounding processes

If you have a process that you expect to take a long time, such as a long cp or scp operation, you can start it in the background of your current shell by adding an ampersand to it as a suffix:

$ cp -r /mnt/bigdir /home &
[1] 2305

This will start the copy operation as a child process of your bash instance, but will return you to the prompt to enter any other commands you might want to run while that's going. The output from this command shown above gives both the job number of 1, and the process ID of the new task, 2305. You can view the list of jobs for the current shell with the builtin jobs :

$ jobs
[1]+  Running  cp -r /mnt/bigdir /home &


If the job finishes or otherwise terminates while it's backgrounded, you should see a message in the terminal the next time you update it with a newline:

[1]+  Done  cp -r /mnt/bigdir /home &

Foregrounding processes

If you want to return a job in the background to the foreground, you can type fg :

$ fg
cp -r /mnt/bigdir /home &

If you have more than one job backgrounded, you should specify the particular job to bring to the foreground with a parameter to fg :

$ fg %1


In this case, for shorthand, you can optionally omit fg and it will work just the same:

$ %1

Suspending processes

To temporarily suspend a process, you can press Ctrl+Z:

$ cp -r /mnt/bigdir /home
^Z
[1]+  Stopped  cp -r /mnt/bigdir /home


You can then continue it in the foreground or background with fg %1 or bg %1 respectively, as above.

This is particularly useful while in a text editor; instead of quitting the editor to get back to a shell, or dropping into a subshell from it, you can suspend it temporarily and return to it with fg once you're ready.
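A sketch of that editor workflow (the file name and the intervening command are made up):

$ vim notes.txt
^Z
[1]+  Stopped  vim notes.txt
$ make
$ fg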

Dealing with output

While a job is running in the background, it may still print its standard output and standard error streams to your terminal. You can head this off by redirecting both streams to /dev/null for verbose commands:

$ cp -rv /mnt/bigdir /home &>/dev/null

However, if the output of the task is actually of interest to you, this may be a case where you should fire up another terminal emulator, perhaps in GNU Screen or tmux , rather than using simple job control.

Suspending SSH sessions

As a special case, you can suspend an SSH session using an SSH escape sequence . Type a newline followed by a ~ character, and finally press Ctrl+Z to background your SSH session and return to the terminal from which you invoked it.

tom@conan:~$ ssh crom
tom@crom:~$ ~^Z [suspend ssh]
[1]+  Stopped  ssh crom
tom@conan:~$


You can then resume it as you would any job by typing fg :

tom@conan:~$ fg %1
ssh crom
tom@crom:~$


#### [Oct 31, 2017] Elegant Awk usage by Tom Ryder

##### It's better to use Perl for this purpose...
###### Feb 06, 2012 | sanctum.geek.nz

For many system administrators, Awk is used only as a way to print specific columns of data from programs that generate columnar output, such as netstat or ps .

For example, to get a list of all the IP addresses and ports with open TCP connections on a machine, one might run the following:

# netstat -ant | awk '{print $5}'

This works pretty well, but among the data you actually wanted it also includes the fifth word of the opening explanatory note, and the heading of the fifth column:

and
Address
0.0.0.0:*
205.188.17.70:443
172.20.0.236:5222
72.14.203.125:5222

There are varying ways to deal with this.

Matching patterns

One common way is to pipe the output further through a call to grep , perhaps to only include results with at least one number:

# netstat -ant | awk '{print $5}' | grep '[0-9]'

In this case, it's instructive to use the awk call a bit more intelligently by setting a regular expression which the applicable line must match in order for that field to be printed, with the standard / characters as delimiters. This eliminates the need for the call to grep :

# netstat -ant | awk '/[0-9]/ {print $5}'

We can further refine this by ensuring that the regular expression only matches data in the fifth column of the output, using the ~ operator:

# netstat -ant | awk '$5 ~ /[0-9]/ {print $5}'

Skipping lines

Another approach you could take to strip the headers out might be to use sed to skip the first two lines of the output:

# netstat -ant | awk '{print $5}' | sed 1,2d


However, this can also be incorporated into the awk call, using the NR variable and making it part of a conditional checking the line number is greater than two:

# netstat -ant | awk 'NR>2 {print $5}'

Combining and excluding patterns

Another common idiom on systems that don't have the special pgrep command is to filter ps output for a string, but exclude the grep process itself from the output with grep -v grep :

# ps -ef | grep apache | grep -v grep | awk '{print $2}'


If you're using Awk to get columnar data from the output, in this case the second column containing the process ID, both calls to grep can instead be incorporated into the awk call:

# ps -ef | awk '/apache/ && !/awk/ {print $2}'

Again, this can be further refined if necessary to ensure you're only matching the expressions against the command name by specifying the field number for each comparison:

# ps -ef | awk '$8 ~ /apache/ && $8 !~ /awk/ {print $2}'


If you're used to using Awk purely as a column filter, the above might help to increase its utility for you and allow you to write shorter and more efficient command lines. The Awk Primer on Wikibooks is a really good reference for using Awk to its fullest for the sorts of tasks for which it's especially well-suited.

#### [Oct 31, 2017] Counting with grep and uniq by Tom Ryder

###### Feb 18, 2012 | sanctum.geek.nz

A common idiom in Unix is to count the lines of output in a file or pipe with wc -l :

$ wc -l example.txt
43
$ ps -e | wc -l
97


Sometimes you want to count the number of lines of output from a grep call, however. You might do it this way:

$ ps -ef | grep apache | wc -l
6

But grep has built-in counting of its own, with the -c option:

$ ps -ef | grep -c apache
6

The above is more a matter of good style than efficiency, but another tool with a built-in counting option that could save you time is the oft-used uniq . The below example shows a use of uniq to filter a sorted list into unique rows:

$ ps -ef | awk '{print $1}' | sort | uniq
105
daemon
lp
mysql
nagios
postfix
root
snmp
tom
UID
www-data


If it would be useful to know in this case how many processes were being run by each of these users, you can include the -c option for uniq :

$ ps -ef | awk '{print $1}' | sort | uniq -c
1 105
1 daemon
1 lp
1 mysql
1 nagios
2 postfix
78 root
1 snmp
7 tom
1 UID
5 www-data


You could even sort this output itself to show the users running the most processes first with sort -rn :

$ ps -ef | awk '{print $1}' | sort | uniq -c | sort -rn
78 root
8 tom
5 www-data
2 postfix
1 UID
1 snmp
1 nagios
1 mysql
1 lp
1 daemon
1 105


Incidentally, if you're not counting results and really do just want a list of unique users, you can leave out the uniq and just add the -u flag to sort :

$ ps -ef | awk '{print $1}' | sort -u
105
daemon
lp
mysql
nagios
postfix
root
snmp
tom
UID
www-data


The above means I actually find myself using uniq with no options quite seldom.

#### [Oct 31, 2017] 256 colour terminals by Tom Ryder

##### "... Similarly, to use 256 colours in GNU Screen, add the following to your .screenrc : ..."
###### February 23, 2012 | sanctum.geek.nz

Using 256 colours in terminals is well-supported in GNU/Linux distributions these days, and also in Windows terminal emulators like PuTTY. Using 256 colours is great for Vim colorschemes in particular, but also very useful for Tmux colouring or any other terminal application where a slightly wider colour space might be valuable. Be warned that once you get this going reliably, there's no going back if you spend a lot of time in the terminal.
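For GNU Screen, the setting the quote above refers to is a single line in .screenrc (a minimal sketch):

term screen-256color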



#### [Aug 28, 2017] rsync over SSH preserve ownership only for www-data owned files

###### Aug 28, 2017 | stackoverflow.com

jeffery_the_wind , asked Mar 6 '12 at 15:36

I am using rsync to replicate a web folder structure from a local server to a remote server. Both servers are ubuntu linux. I use the following command, and it works well:
rsync -az /var/www/ user@10.1.1.1:/var/www/


The usernames for the local system and the remote system are different. From what I have read it may not be possible to preserve all file and folder owners and groups. That is OK, but I would like to preserve owners and groups just for the www-data user, which does exist on both servers.

Is this possible? If so, how would I go about doing that?

Thanks!

** EDIT **

There is some mention of rsync being able to preserve ownership and groups on remote file syncs here: http://lists.samba.org/archive/rsync/2005-August/013203.html

** EDIT 2 **

I ended up getting the desired effect thanks to many of the helpful comments and answers here. Assuming the IP of the source machine is 10.1.1.2 and the IP of the destination machine is 10.1.1.1, I can use this line from the destination machine:

sudo rsync -az user@10.1.1.2:/var/www/ /var/www/


This preserves the ownership and groups of the files that have a common user name, like www-data. Note that using rsync  without sudo  does not preserve these permissions.

ghoti , answered Mar 6 '12 at 19:01

You can also sudo the rsync on the target host by using the --rsync-path  option:
# rsync -av --rsync-path="sudo rsync" /path/to/files user@targethost:/path


This lets you authenticate as user  on targethost, but still get privileged write permission through sudo  . You'll have to modify your sudoers file on the target host to avoid sudo's request for your password. man sudoers  or run sudo visudo  for instructions and samples.

You mention that you'd like to retain the ownership of files owned by www-data, but not other files. If this is really true, then you may be out of luck unless you implement chown  or a second run of rsync  to update permissions. There is no way to tell rsync to preserve ownership for just one user .

That said, you should read about rsync's --files-from  option.

rsync -av /path/to/files user@targethost:/path
find /path/to/files -user www-data -print | \
rsync -av --files-from=- --rsync-path="sudo rsync" /path/to/files user@targethost:/path


I haven't tested this, so I'm not sure exactly how piping find's output into --files-from=-  will work. You'll undoubtedly need to experiment.
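One way the relative-path wrinkle is sometimes handled (again untested, a sketch only; it assumes GNU find, whose -printf '%P\n' prints each path relative to the starting directory):

find /path/to/files -user www-data -printf '%P\n' | \
    rsync -av --files-from=- --rsync-path="sudo rsync" /path/to/files/ user@targethost:/path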

xato , answered Mar 6 '12 at 15:39

As far as I know, you cannot chown  files to somebody other than yourself if you are not root. So you would have to rsync  using the www-data  account, as all files will be created with the specified user as owner. So you need to chown  the files afterwards.

user2485267 , answered Jun 14 '13 at 8:22

I had a similar problem and cheated the rsync command,

rsync -avz --delete root@x.x.x.x:/home//domains/site/public_html/ /home/domains2/public_html && chown -R wwwusr:wwwgrp /home/domains2/public_html/

the && runs the chown against the folder when the rsync completes successfully (1x '&' would run the chown regardless of the rsync completion status)

Graham , answered Mar 6 '12 at 15:51

The root users for the local system and the remote system are different.

What does this mean? The root user is uid 0. How are they different?

Any user with read permission to the directories you want to copy can determine what usernames own what files. Only root can change the ownership of files being written .

You're currently running the command on the source machine, which restricts your writes to the permissions associated with user@10.1.1.1. Instead, you can try to run the command as root on the target machine. Your read access on the source machine isn't an issue.

So on the target machine (10.1.1.1), assuming the source is 10.1.1.2:

# rsync -az user@10.1.1.2:/var/www/ /var/www/


Make sure your groups match on both machines.

Also, set up access to user@10.1.1.2 using a DSA or RSA key, so that you can avoid having passwords floating around. For example, as root on your target machine, run:

# ssh-keygen -d


Then take the contents of the file /root/.ssh/id_dsa.pub  and add it to ~user/.ssh/authorized_keys  on the source machine. You can ssh user@10.1.1.2  as root from the target machine to see if it works. If you get a password prompt, check your error log to see why the key isn't working.

ghoti , answered Mar 6 '12 at 18:54

Well, you could skip the challenges of rsync altogether, and just do this through a tar tunnel.
sudo tar zcf - /path/to/files | \
ssh user@remotehost "cd /some/path; sudo tar zxf -"


You'll need to set up your SSH keys as Graham described.

Note that this handles full directory copies, not incremental updates like rsync.

The idea here is that:

• you tar up your directory,
• instead of creating a tar file, you send the tar output to stdout,
• that stdout is piped through an SSH command to a receiving tar on the other host,
• but that receiving tar is run by sudo, so it has privileged write access to set usernames.

#### [Aug 28, 2017] rsync and file permissions

###### Aug 28, 2017 | superuser.com
I'm trying to use rsync to copy a set of files from one system to another. I'm running the command as a normal user (not root). On the remote system, the files are owned by apache and when copied they are obviously owned by the local account (fred).

My problem is that every time I run the rsync command, all files are re-synched even though they haven't changed. I think the issue is that rsync sees the file owners are different and my local user doesn't have the ability to change ownership to apache, but I'm not including the -a  or -o  options so I thought this would not be checked. If I run the command as root, the files come over owned by apache and do not come a second time if I run the command again. However I can't run this as root for other reasons. Here is the command:

/usr/bin/rsync --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose root@server.example.com:/src/dir/ /local/dir

Fred Snertz, asked May 2 '11 at 23:43
Why can't you run rsync as root? On the remote system, does fred have read access to the apache-owned files? – chrishiestand May 3 '11 at 0:32
Ah, I left out the fact that there are ssh keys set up so that local fred can become remote root, so yes fred/root can read them. I know this is a bit convoluted but its real. – Fred Snertz May 3 '11 at 14:50
Always be careful when root can ssh into the machine. But if you have password and challenge response authentication disabled it's not as bad. – chrishiestand May 3 '11 at 17:32
-c, --checksum
This changes the way rsync checks if the files have been changed and are in need of a  transfer.   Without  this  option,
rsync  uses  a "quick check" that (by default) checks if each file's size and time of last modification match between the
sender and receiver.  This option changes this to compare a 128-bit checksum for each file  that  has  a  matching  size.
Generating  the  checksums  means  that both sides will expend a lot of disk I/O reading all the data in the files in the
transfer (and this is prior to any reading that will be done to transfer changed files), so this  can  slow  things  down
significantly.

The  sending  side  generates  its checksums while it is doing the file-system scan that builds the list of the available
files.  The receiver generates its checksums when it is scanning for changed files, and will checksum any file  that  has
the  same  size  as the corresponding sender's file:  files with either a changed size or a changed checksum are selected
for transfer.

Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by  checking
a  whole-file  checksum  that is generated as the file is transferred, but that automatic after-the-transfer verification
has nothing to do with this option's before-the-transfer "Does this file need to be updated?" check.

For protocol 30 and beyond (first supported in 3.0.0), the checksum used is MD5.  For older protocols, the checksum  used
is MD4.


So run:

/usr/bin/rsync -c --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose root@server.example.com:/src/dir/ /local/dir


Note there may be a time+disk churn tradeoff by using this option. Personally, I'd probably just sync the file's mtimes too:

/usr/bin/rsync -t --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose root@server.example.com:/src/dir/ /local/dir

chrishiestand, answered May 3 '11 at 17:48
Awesome. Thank you. Looks like the second option is going to work for me and I found the first very interesting. – Fred Snertz May 3 '11 at 18:40
psst, hit the green checkbox to give my answer credit ;-) Thx. – chrishiestand May 12 '11 at 1:56

#### [Aug 28, 2017] Why does rsync fail to copy files from /sys in Linux?

##### "... pseudo filesystems ..."
###### Aug 28, 2017 | unix.stackexchange.com

Eugene Yarmash , asked Apr 24 '13 at 16:35

I have a bash script which uses rsync  to backup files in Archlinux. I noticed that rsync  failed to copy a file from /sys  , while cp  worked just fine:
# rsync /sys/class/net/enp3s1/address /tmp
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]

# cp  /sys/class/net/enp3s1/address /tmp   ## this works


I wonder why rsync fails, and whether it is possible to copy the file with it?

mattdm , answered Apr 24 '13 at 18:20

Rsync has code which specifically checks if a file is truncated during read and gives this error: ENODATA. I don't know why the files in /sys have this behavior, but since they're not real files, I guess it's not too surprising. There doesn't seem to be a way to tell rsync to skip this particular check.

I think you're probably better off not rsyncing /sys  and using specific scripts to cherry-pick out the particular information you want (like the network card address).
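As a minimal sketch of that cherry-picking approach (the output file and the choice of attributes are my assumptions, not from the original answer):

#!/bin/sh
# save selected values from /sys instead of rsyncing the pseudo filesystem
OUT=/var/backups/sysinfo.txt
: > "$OUT"
for f in /sys/class/net/*/address; do
    printf '%s %s\n' "$f" "$(cat "$f")" >> "$OUT"
done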

Runium , answered Apr 25 '13 at 0:23

First off, /sys is a pseudo file system. If you look at /proc/filesystems you will find a list of registered file systems where quite a few have nodev in front. This indicates that they are pseudo filesystems. This means they exist on a running kernel as a RAM-based filesystem. Further, they do not require a block device.

$ cat /proc/filesystems
nodev   sysfs
nodev   rootfs
nodev   bdev
...

At boot the kernel mounts this system and updates entries when suited, e.g. when new hardware is found during boot or by udev. In /etc/mtab you typically find the mount as:

sysfs /sys sysfs rw,noexec,nosuid,nodev 0 0

For a nice paper on the subject read Patrick Mochel's "The sysfs Filesystem".

stat of /sys files

If you go into a directory under /sys and do an ls -l you will notice that all files have one size, typically 4096 bytes. This is reported by sysfs.

:/sys/devices/pci0000:00/0000:00:19.0/net/eth2$ ls -l
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_assign_type
-r--r--r-- 1 root root 4096 Apr 24 20:09 address
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_len
...


Further, you can do a stat on a file and notice another distinct feature: it occupies 0 blocks. Also, the inode of the root (stat /sys) is 1, whereas the root of a disk filesystem typically has inode 2, etc.

rsync vs. cp

The easiest explanation for rsync failure of synchronizing pseudo files is perhaps by example.

Say we have a file named address  that is 18 bytes. An ls  or stat  of the file reports 4096 bytes.

rsync

1. Opens file descriptor, fd.
2. Uses fstat(fd) to get information such as size.
3. Sets out to read size bytes, i.e. 4096. That would be line 253 of the code linked by @mattdm. read_size == 4096
   • A short string is read, i.e. 18 bytes. nread == 18
   • read_size = read_size - nread (4096 - 18 = 4078)
   • 0 bytes read (as the first read consumed all bytes in the file). nread == 0, line 255
   • Unable to read 4096 bytes. Zero out buffer.
   • Set error ENODATA.
   • Return.
4. Retry (above loop).
5. Fail.
6. FINE.

During this process it actually reads the entire file. But with no size available it cannot validate the result – thus failure is the only option.

cp

1. Opens file descriptor, fd.
2. Uses fstat(fd) to get information such as st_size (also uses lstat and stat).
3. Checks if the file is likely to be sparse, that is, whether the file has holes etc.

copy.c:1010
/* Use a heuristic to determine whether SRC_NAME contains any sparse
 * blocks.  If the file has fewer blocks than would normally be
 * needed for a file of its size, then at least one of the blocks in
 * the file is a hole.  */
sparse_src = is_probably_sparse (&src_open_sb);

As stat reports the file to have zero blocks, it is categorized as sparse.

4. Tries to read the file by extent-copy (a more efficient way to copy normal sparse files), and fails.
5. Copies by sparse-copy.
   • Starts out with a max read size of MAXINT, typically 18446744073709551615 bytes (2^64 - 1, on a 64-bit system).
   • Ask; read 4096 bytes. (Buffer size allocated in memory from stat information.)
   • A short string is read, i.e. 18 bytes.
   • Check if a hole is needed; nope.
   • Write buffer to target.
   • Subtract 18 from max read size.
   • 0 bytes read, as all got consumed in the first read.
   • Return success.
6. All OK. Update flags for file.
7. FINE.


Might be related, but extended attribute calls will fail on sysfs:

[root@hypervisor eth0]#

Looking at my strace it looks like rsync tries to pull in extended attributes by default:

22964 <... getxattr resumed> , 0x7fff42845110, 132) = -1 ENODATA (No data available)

I tried finding a flag to give rsync to see if skipping extended attributes resolves the issue but wasn't able to find anything ( --xattrs  turns them on at the destination).

#### [Aug 28, 2017] Rsync doesn't copy everything

###### Aug 28, 2017 | ubuntuforums.org


Scormen May 31st, 2009, 10:09 AM Hi all,

I'm having some trouble with rsync. I'm trying to sync my local /etc directory to a remote server, but this won't work.

The problem is that it seems it doesn't copy all the files.
The local /etc dir contains 15MB of data; after an rsync, the remote backup contains only 4.6MB of data.

Rsync is running as root. I'm using this command:

rsync --rsync-path="sudo rsync" -e "ssh -i /root/.ssh/backup" -avz --delete --delete-excluded -h --stats /etc kris@192.168.1.3:/home/kris/backup/laptopkris

I hope someone can help.
Thanks!

Kris

Scormen May 31st, 2009, 11:05 AM I found that if I do a local sync, everything goes fine.
But if I do a remote sync, it copies only 4.6MB.

Any idea?

LoneWolfJack May 31st, 2009, 05:14 PM never used rsync on a remote machine, but "sudo rsync" looks wrong. you probably can't call sudo like that so the ssh connection needs to have the proper privileges for executing rsync.

just an educated guess, though.

In /etc/sudoers I have added the following line, so "sudo rsync" will work.

kris ALL=NOPASSWD: /usr/bin/rsync

I also tried without --rsync-path="sudo rsync", but without success.

I have also tried on the server to pull the files from the laptop, but that doesn't work either.

LoneWolfJack May 31st, 2009, 05:30 PM in the rsync help file it says that --rsync-path is for the path to rsync on the remote machine, so my guess is that you can't use sudo there as it will be interpreted as a path.

so you will have to do --rsync-path="/path/to/rsync" and make sure the ssh login has root privileges if you need them to access the files you want to sync.

--rsync-path="sudo rsync" probably fails because
a) sudo is interpreted as a path
b) the space isn't escaped
c) sudo probably won't allow itself to be called remotely

again, this is not more than an educated guess.

Scormen May 31st, 2009, 05:45 PM I understand what you mean, so I tried also:

rsync -Cavuhzb --rsync-path="/usr/bin/rsync" -e "ssh -i /root/.ssh/backup" /etc kris@192.168.1.3:/home/kris/backup/laptopkris

Then I get this error:

sending incremental file list
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/pap": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/provider": Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.crt" -> "/etc/ssl/certs/ssl-cert-snakeoil.pem" failed: Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.key" -> "/etc/ssl/private/ssl-cert-snakeoil.key" failed: Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ppp/peers/provider": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ssl/private/ssl-cert-snakeoil.key": Permission denied (13)

sent 86.85K bytes received 306 bytes 174.31K bytes/sec
total size is 8.71M speedup is 99.97
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1058) [sender=3.0.5]

And the same command with "root" instead of "kris".
Then, I get no errors, but I still don't have all the files synced.

Scormen June 1st, 2009, 09:00 AM Sorry for this bump.
I'm still having the same problem.

Any idea?

Thanks.

binary10 June 1st, 2009, 10:36 AM I understand what you mean, so I tried also:

rsync -Cavuhzb --rsync-path="/usr/bin/rsync" -e "ssh -i /root/.ssh/backup" /etc kris@192.168.1.3:/home/kris/backup/laptopkris

Then I get this error:

And the same command with "root" instead of "kris".
Then, I get no errors, but I still don't have all the files synced.

Maybe there's a nicer way but you could place /usr/bin/rsync into a private protected area and set the owner to root place the sticky bit on it and change your rsync-path argument such like:

# on the remote side, aka kris@192.168.1.3
mkdir priv-area
# protect it from normal users running a priv version of rsync
chmod 700 priv-area
cd priv-area
cp -p /usr/local/bin/rsync ./rsync-priv
sudo chown 0:0 ./rsync-priv
sudo chmod +s ./rsync-priv
ls -ltra # rsync-priv should now be 'bold-red' in bash

Looking at your flags, you've specified a cvs ignore factor, ignore files that are updated on the target, and you're specifying a backup of removed files.

rsync -Cavuhzb --rsync-path="/home/kris/priv-area/rsync-priv" -e "ssh -i /root/.ssh/backup" /etc kris@192.168.1.3:/home/kris/backup/laptopkris

From those qualifiers you're not going to be getting everything sync'd. It's doing what you're telling it to do.

If you really wanted to perform a like for like backup.. (not keeping stuff that's been changed/deleted from the source. I'd go for something like the following.

rsync --archive --delete --hard-links --one-file-system --acls --xattrs --dry-run -i --rsync-path="/home/kris/priv-area/rsync-priv" --rsh="ssh -i /root/.ssh/backup" /etc/ kris@192.168.1.3:/home/kris/backup/laptopkris/etc/

Remove the --dry-run and -i when you're happy with the output, and it should do what you want. A word of warning, I get a bit nervous when not seeing trailing (/) on directories as it could lead to all sorts of funnies if you end up using rsync on softlinks.
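To illustrate the trailing-slash point with a generic example (placeholder paths, not the poster's data): without the slash rsync re-creates the source directory inside the target, with the slash it copies only the directory's contents.

rsync -a /etc  /backup/    # results in /backup/etc/...
rsync -a /etc/ /backup/    # copies the contents of /etc directly into /backup/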

Scormen June 1st, 2009, 12:19 PM Thanks for your help, binary10.

I've tried what you have said, but still, I only receive 4.6MB on the remote server.
Thanks for the warning, I'll note that!

Did someone already tried to rsync their own /etc to a remote system? Just to know if this strange thing only happens to me...

Thanks.

binary10 June 1st, 2009, 01:22 PM Thanks for your help, binary10.

I've tried what you have said, but still, I only receive 4.6MB on the remote server.
Thanks for the warning, I'll note that!

Did someone already tried to rsync their own /etc to a remote system? Just to know if this strange thing only happens to me...

Thanks.

Ok so I've gone back and looked at your original post, how are you calculating 15MB of data under etc - via a du -hsx /etc/ ??

I do daily drive to drive backup copies via rsync and drive to network copies.. and have used them recently for restoring.

Sure my du -hsx /etc/ reports 17MB of data of which 10MB gets transferred via an rsync. My backup drives still operate.

rsync 3.0.6 has some fixes to do with ACLs and special devices rsyncing between solaris. but I think 3.0.5 is still ok with ubuntu to ubuntu systems.

Here is my test doing exactly what you you're probably trying to do. I even check the remote end..

binary10@jsecx25:~/bin-priv$ ./rsync --archive --delete --hard-links --one-file-system --stats --acls --xattrs --human-readable --rsync-path="~/bin/rsync-priv-os-specific" --rsh="ssh" /etc/ rsyncbck@10.0.0.21:/home/kris/backup/laptopkris/etc/

Number of files: 3121
Number of files transferred: 1812
Total file size: 10.04M bytes
Total transferred file size: 10.00M bytes
Literal data: 10.00M bytes
Matched data: 0 bytes
File list size: 109.26K
File list generation time: 0.002 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 10.20M
Total bytes received: 38.70K

sent 10.20M bytes  received 38.70K bytes  4.09M bytes/sec
total size is 10.04M  speedup is 0.98

binary10@jsecx25:~/bin-priv$ sudo du -hsx /etc/
17M /etc/
binary10@jsecx25:~/bin-priv$

And then on the remote system I do the du -hsx:

binary10@lenovo-n200:/home/kris/backup/laptopkris/etc$ cd ..
binary10@lenovo-n200:/home/kris/backup/laptopkris$ sudo du -hsx etc
17M etc
binary10@lenovo-n200:/home/kris/backup/laptopkris$

Scormen June 1st, 2009, 01:35 PM ow are you calculating 15MB of data under etc - via a du -hsx /etc/ ??
Indeed, on my laptop I see:

root@laptopkris:/home/kris# du -sh /etc/
15M /etc/

If I do the same thing after a fresh sync to the server, I see:

root@server:/home/kris# du -sh /home/kris/backup/laptopkris/etc/
4.6M /home/kris/backup/laptopkris/etc/

On both sides, I have installed Ubuntu 9.04, with version 3.0.5 of rsync.
So strange...

binary10 June 1st, 2009, 01:45 PM it does seem a bit odd.

I'd start doing a few diffs from the outputs of:

find etc/ -printf "%f %s %p %Y\n" | sort

And see what type of files are missing.

- edit - Added the %Y file type.

Scormen June 1st, 2009, 01:58 PM Hmm, it's going stranger.
Now I see that I have all my files on the server, but they don't have their full size (bytes).

I have uploaded the files, so you can look into them.

Laptop: http://www.linuxontdekt.be/files/laptop.files
Server: http://www.linuxontdekt.be/files/server.files

binary10 June 1st, 2009, 02:16 PM If you look at the files that are different aka the ssl's they are links to local files else where aka linked to /usr and not within /etc/

aka they are different on your laptop and the server

Scormen June 1st, 2009, 02:25 PM I understand that soft links are just copied, and not the "full file".

But, you have run the same command to test, a few posts ago.
How is it possible that you can see the full 15MB?

binary10 June 1st, 2009, 02:34 PM I was starting to think that this was a bug with du.

The de-referencing is a bit topsy.

If you rsync copy the remote backup back to a new location back onto the laptop and do the du command. I wonder if you'll end up with 15MB again.

Scormen June 1st, 2009, 03:20 PM Good tip.

On the server side, the backup of the /etc was still 4.6MB.
I have rsynced it back to the laptop, to a new directory.

If I go on the laptop to that new directory and do a du, it says 15MB.

binary10 June 1st, 2009, 03:34 PM Good tip.

On the server side, the backup of the /etc was still 4.6MB.
I have rsynced it back to the laptop, to a new directory.

If I go on the laptop to that new directory and do a du, it says 15MB.

I think you've now confirmed that RSYNC DOES copy everything.. it's just that du confused what you had expected by counting the sizes of the link targets.

It might also think about what you're copying, maybe you need more than just /etc of course it depends on what you are trying to do with the backup :)

enjoy.

Scormen June 1st, 2009, 03:37 PM Yeah, it seems to work well.
So, the "problem" where just the soft links, that couldn't be counted on the server side?
binary10 June 1st, 2009, 04:23 PM Yeah, it seems to work well.
So, the "problem" where just the soft links, that couldn't be counted on the server side?

The links were copied as links as per the design of the --archive in rsync.

The contents of the files the links point to were different between your two systems, these being files that reside outside of /etc/, in /usr, and so du reports them differently.

Scormen June 1st, 2009, 05:36 PM Okay, I got it.
Many thanks for the support, binary10!
Scormen June 1st, 2009, 05:59 PM Just to know, is it possible to copy the data from these links as real, hard data?
Thanks.
binary10 June 2nd, 2009, 09:54 AM Just to know, is it possible to copy the data from these links as real, hard data?
Thanks.

Yep absolutely

You should then look at other possibilities, such as rsync's options for dereferencing symlinks (see the example below).

But then you'll have to start questioning why you are backing them up like that, especially stuff under /etc/. If you ever wanted to restore it you'd be restoring full files and not symlinks; the restore result could be a nightmare as well as create future issues (upgrades etc), let alone your backup will be significantly larger -- it could be 150MB instead of 4MB.
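For reference, the kind of option hinted at above is rsync's symlink dereferencing; a minimal illustration (reusing the paths from this thread, other flags omitted) would be:

rsync -a --copy-links /etc/ kris@192.168.1.3:/home/kris/backup/laptopkris/etc/
# -L / --copy-links transforms each symlink into a copy of the file it points to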

Scormen June 2nd, 2009, 10:04 AM Okay, now I'm sure what it's doing :)
Is it also possible to show on a system the "real disk usage" of e.g. that /etc directory? So, without the links, so that we get an output of 4.6MB.

Thank you very much for your help!

binary10 June 2nd, 2009, 10:22 AM What does the following respond with.

sudo du --apparent-size -hsx /etc

If you want the real answer, then only the result of a dry-run rsync will be accurate enough for you.

sudo rsync --dry-run --stats -h --archive /etc/ /tmp/etc/

#### [Aug 27, 2017] Diff A Directory Recursively, Ignoring All Binary Files

##### It is now possible to use -r to recursively compare directories
###### Aug 27, 2017 | stackoverflow.com
diff -r dir1/ dir2/ | sed '/Binary\ files\ /d' >outputfile

This recursively compares dir1 to dir2, sed removes the lines for binary files (begins with " Binary files "), then it's redirected to the outputfile.

#### [Aug 14, 2017] Cut command on RHEL 6.8 compatibility issues Unix Linux Forums Shell Programming and Scripting

##### "... Much better: change your scripts. Run the following fix_cut script on your scripts: ..."
###### Aug 14, 2017 | www.unix.com

Vikram Jain, 06-29-2016

Cut command on RHEL 6.8 compatibility issues

We have a lot of scripts using cut as :
cut -c 0-8 -- this works for cut (GNU coreutils) 5.97, but does not work for cut (GNU coreutils) 8.4.
Gives error -

Code:

cut: fields and positions are numbered from 1

The position needs to start with 1 for later version of cut and this is causing an issue.

Is there a way where I can have multiple cut versions installed and use the older version of cut for the user which runs the script?

or any other work around without having to change the scripts?

Thanks.



Don Cragun, Administrator

What are you trying to do when you invoke

Code:

cut -c 0-8


with your old version of cut?

With that old version of cut , is there any difference in the output produced by the two pipelines:

Code:

echo 0123456789abcdef | cut -c 0-8


and:

Code:

echo 0123456789abcdef | cut -c 1-8


or do they produce the same output?


06-30-2016

Vikram Jain

I am trying to get a value from the 1st line of the file and check if that value is a valid date or not.
------------------------------------------------------------------
Below is the output for the cut command from new version

Code:

$ echo 0123456789abcdef | cut -c 0-8
cut: fields and positions are numbered from 1
Try 'cut --help' for more information.
$ echo 0123456789abcdef | cut -c 1-8
01234567
-------------------------------------------------------------------
With the old version, both have the same results:

Code:

$ echo 0123456789abcdef | cut -c 0-8
01234567
$ echo 0123456789abcdef | cut -c 1-8
01234567




06-30-2016

Scrutinizer, Moderator

The use of 0 is not according to specification. Alternatively, you can just omit it, which should work across versions

Code:

$ echo 0123456789abcdef | cut -c -8
01234567

If you cannot adjust the scripts, you could perhaps create a wrapper script for cut, so that the 0 gets stripped.

06-30-2016

Vikram Jain

Yes, I don't want to adjust my scripts. A wrapper for cut looks like something that would work.
Could you please tell me how I would use it, as in, how would I make sure that the wrapper is called and not the cut command which causes the issue?

Don Cragun, Administrator

The only way to make sure that your wrapper is always called instead of the OS-supplied utility is to move the OS-supplied utility to a different location and install your wrapper in the location where your OS installed cut originally. Of course, once you have installed this wrapper, your code might or might not work properly (depending on the quality of your wrapper), and no one else on your system will be able to look at the diagnostics produced by scripts that have bugs in the way they specify field and character ranges so they can identify and fix their code.

My personal opinion is that you should spend time fixing your scripts that call cut -c 0... , cut -f 0... , and lots of other possible misuses of 0 that are now correctly diagnosed as errors by the new version of cut, instead of debugging code to be sure that it changes all of the appropriate 0 characters in its argument list to 1 characters and doesn't change any 0 characters that are correctly specified and do not reference a character 0 or field 0.

vgersh99 (06-30-2016), Vikram Jain (06-30-2016)

06-30-2016

MadeInGermany, Moderator

An update of "cut" will overwrite your wrapper. Much better: change your scripts. Run the following fix_cut script on your scripts:

Code:

#!/bin/sh
# fix_cut
PATH=/bin:/usr/bin
PRE="\b(cut\s+(-\S*\s+)*-[cf]\s*0*)0-"
for arg
do
  perl -ne 'exit 1 if m/'"$PRE"'/' "$arg" || {
    perl -i -pe 's/'"$PRE"'/${1}1-/g' "$arg"
  }
done
Example: fix all .sh scripts

Code:

fix_cut *.sh

The Following User Says Thank You to MadeInGermany For This Useful Post:

Vikram Jain (07-08-2016)

#### [Jul 17, 2017] Setup Centralized Rsyslog Server On CentOS 7

###### Jul 17, 2017 | www.linuxtoday.com

Install and configure Rsyslog server and client configuration on CentOS 7 server.


#### [Jul 16, 2017] How to use a man page Faster than a Google search

###### Jul 16, 2017 | opensource.com
It's easy to get into the habit of googling anything you want to know about a command or operation in Linux, but I'd argue there's something even better: a living and breathing, complete reference, the man pages , which is short for manual pages.

The history of man pages predates Linux, all the way back to the early days of Unix. According to Wikipedia , Dennis Ritchie and Ken Thompson wrote the first man pages in 1971, well before the days of personal computers, around the time when many calculators in use were the size of toaster ovens. Man pages also have a reputation of being terse and, in a way, have a language of their own. Just like Unix and Linux, the man pages have not been static, and they continue to be developed and maintained just like the kernel.

Man pages are divided into sections referenced by numbers:

1. General user commands
2. System calls
3. Library functions
4. Special files and drivers
5. File formats
6. Games and screensavers
7. Miscellanea
8. System administration commands and daemons

Even so, users generally don't need to know the section where a particular command lies to find what they need.
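When a name does exist in several sections, though, the section number lets you pick the right page; printf, for example, is both a user command and a C library function:

man 1 printf     # the printf shell command (section 1)
man 3 printf     # the printf() C library function (section 3)
man -k printf    # search all man pages whose name or description mentions printf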

The files are formatted in a way that may look odd to many users today. Originally, they were written in an old form of markup called troff because they were designed to be printed through a PostScript printer, so they included formatting for headers and other layout aspects. In Linux, groff is used instead.

In my Fedora, the man pages are located in /usr/share/man with subdirectories (like man1 for Section 1 commands) as well as additional subdirectories for translations of the man pages.

If you look up the man page for the command man , you'll see the file man.1.gz , which is the man pages compressed with the gzip utility. To access a man page, type a command such as:


man man



for example, to show the man page for man . This uncompresses the man page, interprets the formatting commands, and displays the results with less , so navigation is the same as when you use less .

All man pages should have the following subsections: Name , Synopsis , Description , Examples , and See Also . Many have additional sections, like Options , Exit Status , Environment , Bugs , Files , Author , Reporting Bugs , History , and Copyright .

Breaking down a man page

To explain how to interpret a typical man page, let's use the man page for ls as an example. Under Name , we see


ls - list directory contents



which tells us what ls means in the simplest terms.

Under Synopsis , we begin to see the terseness:


ls [OPTION]... [FILE]...



Any element that occurs inside brackets is optional. The above command means you can legitimately type ls and nothing else. The ellipsis after each element indicates that you can include as many options as you want (as long as they're compatible with each other) and as many files as you want. You can specify a directory name, and you can also use * as a wildcard. For example:


ls Documents/*.txt



Under Description , we see a more verbose description of what the command does, followed by a list of the available options for the command. The first option for ls is

-a, --all
do not ignore entries starting with .

If we want to use this option, we can either type the short form syntax, -a , or the long form --all . Not all options have two forms (e.g., --author ), and even when they do, they aren't always so obviously related (e.g., -F, --classify ). When you want to use multiple options, you can either type the short forms with spaces in between or type them with a single hyphen and no spaces (as long as they do not require further sub-options). Therefore,


ls -a -d -l



and


ls -adl



are equivalent.

The command tar is somewhat unique, presumably due to its long history, in that it doesn't require a hyphen at all for the short form. Therefore,


tar -cvf filearchive.tar thisdirectory/



and


tar cvf filearchive.tar thisdirectory/



are both legitimate.

On the ls man page, after Description are Author , Reporting Bugs , Copyright , and See Also .

The See Also section will often suggest related man pages, so it is generally worth a glance. After all, there is much more to man pages than just commands.

Certain commands that are specific to Bash and not system commands, like alias , cd , and a number of others, are listed together in a single BASH_BUILTINS man page. While the documentation for these is even more terse and compact, overall it contains similar information.
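One way to reach that documentation from the command line (the exact page name varies by distribution, so treat these as examples rather than a guarantee):

man bash-builtins    # on some distributions; on others try 'man builtins'
help cd              # Bash's own built-in help for an individual builtin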

I find that man pages offer a lot of good, usable information, especially when I need a command I haven't used recently, and I need to brush up on the options and requirements. This is one place where the man pages' much-maligned terseness is actually very beneficial.

About the author: Greg Pittman is a retired neurologist in Louisville, Kentucky, with a long-standing interest in computers and programming, beginning with Fortran IV in the 1960s. When Linux and open source software came along, it kindled a commitment to learning more, and eventually contributing. He is a member of the Scribus Team.

#### [Feb 25, 2017] 5 basic cURL command examples

###### Feb 25, 2017 | www.rosehosting.com
cURL is a very useful command-line tool for transferring data from or to a server. cURL supports various protocols like FILE, HTTP, HTTPS, IMAP, IMAPS, LDAP, DICT, LDAPS, TELNET, FTP, FTPS, GOPHER, RTMP, RTSP, SCP, SFTP, POP3, POP3S, SMB, SMBS, SMTP, SMTPS, and TFTP.

1. Check URL

One of the most common and simplest uses of cURL is typing the command itself, followed by the URL you want to check

curl https://domain.com


This command will display the content of the URL on your terminal

2. Save the output of the URL to a file

The output of the cURL command can be easily saved to a file by adding the -o option to the command, as shown below

curl -o website https://domain.com

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
100 41793    0 41793    0     0   275k      0 --:--:-- --:--:-- --:--:--  2.9M


In this example, the output will be saved to a file named 'website' in the current working directory.

3. Download files

You can download files with cURL by adding the -O option to the command. It is used for saving files on the local server with the same names as on the remote server

curl -O https://domain.com/file.zip


In this example, the 'file.zip' zip archive will be downloaded to the current working directory.

You can also download the file with a different name by adding the -o option to cURL.

curl -o archive.zip https://domain.com/file.zip


This way the 'file.zip' archive will be downloaded and saved as 'archive.zip'.

cURL can be also used to download multiple files simultaneously, as shown in the example below

curl -O https://domain.com/file.zip -O https://domain.com/file2.zip


cURL can be also used to download files securely via SSH using the following command

curl -u user sftp://server.domain.com/path/to/file


Note that you have to use the full path of the file you want to download

4. Get HTTP header information from a website

You can easily get HTTP header information from any website you want by adding the -I option (capital 'i') to cURL.

curl -I http://domain.com

HTTP/1.1 200 OK
Date: Sun, 16 Oct 2016 23:37:15 GMT
Server: Apache/2.4.23 (Unix)
X-Powered-By: PHP/5.6.24
Connection: close
Content-Type: text/html; charset=UTF-8

5. Access an FTP server

To access your FTP server with cURL use the following command

curl ftp://ftp.domain.com --user username:password


cURL will connect to the FTP server and list all files and directories in the user's home directory. You can also download a specific file from the FTP server:

curl ftp://ftp.domain.com/file.zip --user username:password


and upload a file to the FTP server:

curl -T file.zip ftp://ftp.domain.com/ --user username:password


You can check cURL manual page to see all available cURL options and functionalities

man curl



#### [Feb 20, 2017] Using rsync to back up your Linux system

###### Feb 20, 2017 | opensource.com
Another interesting option, and my personal favorite because it increases the power and flexibility of rsync immensely, is the --link-dest option. The --link-dest option allows a series of daily backups that take up very little additional space for each day and also take very little time to create.

Specify the previous day's target directory with this option and a new directory for today. rsync then creates today's new directory and a hard link for each file in yesterday's directory is created in today's directory. So we now have a bunch of hard links to yesterday's files in today's directory. No new files have been created or duplicated. Just a bunch of hard links have been created. Wikipedia has a very good description of hard links . After creating the target directory for today with this set of hard links to yesterday's target directory, rsync performs its sync as usual, but when a change is detected in a file, the target hard link is replaced by a copy of the file from yesterday and the changes to the file are then copied from the source to the target.

So now our command looks like the following.

rsync -aH --delete --link-dest=yesterdaystargetdir sourcedir todaystargetdir 
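
In practice the target directories are usually named by date; a hypothetical daily invocation (the paths here are placeholders) might look like this:

TODAY=$(date +%F)                    # e.g. 2017-02-20
YESTERDAY=$(date -d yesterday +%F)   # GNU date syntax
rsync -aH --delete --link-dest=/backups/$YESTERDAY /home/ /backups/$TODAY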

There are also times when it is desirable to exclude certain directories or files from being synchronized. For this, there is the --exclude option. Use this option and the pattern for the files or directories you want to exclude. You might want to exclude browser cache files so your new command will look like this.

rsync -aH --delete --exclude Cache --link-dest=yesterdaystargetdir sourcedir todaystargetdir 

Note that each file pattern you want to exclude must have a separate exclude option.

rsync can sync files with remote hosts as either the source or the target. For the next example, let's assume that the source directory is on a remote computer with the hostname remote1 and the target directory is on the local host. Even though SSH is the default communications protocol used when transferring data to or from a remote host, I always add the ssh option. The command now looks like this.

rsync -aH -e ssh --delete --exclude Cache --link-dest=yesterdaystargetdir remote1:sourcedir todaystargetdir 

This is the final form of my rsync backup command.

rsync has a very large number of options that you can use to customize the synchronization process. For the most part, the relatively simple commands that I have described here are perfect for making backups for my personal needs. Be sure to read the extensive man page for rsync to learn about more of its capabilities as well as the options discussed here.

#### [Feb 14, 2017] switching from gnu screen to tmux (updated) Linux~ized

##### The ability to watch another user's screen is a very valuable option...
###### Feb 14, 2017 | www.linuxized.com

ed says: June 16, 2010 at 15:15

screen is really cool, and does some things that I've yet to find counterparts to in tmux, such as the -x option:
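For context (my illustration, not part of the original comment): screen -x attaches to a session that is already attached elsewhere, so two terminals can watch and drive the same session.

screen -S work      # first terminal starts a named session
screen -x work      # second terminal (same account) attaches to the same live session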

#### [Feb 12, 2017] HowTo Use rsync For Transferring Files Under Linux or UNIX

###### Feb 12, 2017 | www.cyberciti.biz
So what is unique about the rsync command?

It can perform differential uploads and downloads (synchronization) of files across the network, transferring only data that has changed. The rsync remote-update protocol allows rsync to transfer just the differences between two sets of files across the network connection.

How do I install rsync?

Use any one of the following commands to install rsync. If you are using Debian or Ubuntu Linux, type the following command:
# apt-get install rsync
OR
$ sudo apt-get install rsync

If you are using Red Hat Enterprise Linux (RHEL) / CentOS 4.x or an older version, type the following command:

# up2date rsync

RHEL / CentOS 5.x or newer (or Fedora Linux) users type the following command:

# yum install rsync

Always use rsync over ssh

Since rsync does not provide any security while transferring data, it is recommended that you use rsync over an ssh session. This allows a secure remote connection. Now let us see some examples of the rsync command.

Common rsync command options

• --delete : delete files that don't exist on sender (system)
• -v : Verbose (try -vv for more detailed information)
• -e "ssh options" : specify ssh as the remote shell
• -a : archive mode
• -r : recurse into directories
• -z : compress file data

Task: Copy file from a local computer to a remote server

Copy file from /www/backup.tar.gz to a remote server called openbsd.nixcraft.in

$ rsync -v -e ssh /www/backup.tar.gz jerry@openbsd.nixcraft.in:~
Output:

Password:
sent 19099 bytes  received 36 bytes  1093.43 bytes/sec
total size is 19014  speedup is 0.99


Please note that the symbol ~ indicates the user's home directory (/home/jerry).

Task : Copy file from a remote server to a local computer

Copy file /home/jerry/webroot.txt from a remote server openbsd.nixcraft.in to a local computer's /tmp directory:
$ rsync -v -e ssh jerry@openbsd.nixcraft.in:~/webroot.txt /tmp

Task: Synchronize a local directory with a remote directory

$ rsync -r -a -v -e "ssh -l jerry" --delete /local/webroot openbsd.nixcraft.in:/webroot

Task: Synchronize a remote directory with a local directory

$ rsync -r -a -v -e "ssh -l jerry" --delete openbsd.nixcraft.in:/webroot/ /local/webroot

Task: Synchronize a local directory with a remote rsync server or vice-versa

$ rsync -r -a -v --delete rsync://rsync.nixcraft.in/cvs /home/cvs

OR

$ rsync -r -a -v --delete /home/cvs rsync://rsync.nixcraft.in/cvs

Task: Mirror a directory between my "old" and "new" web server/ftp

You can mirror a directory between my "old" (my.old.server.com) and "new" web server with the command (assuming that ssh keys are set for passwordless authentication):

$ rsync -zavrR --delete --links --rsh="ssh -l vivek" my.old.server.com:/home/lighttpd /home/lighttpd

Other options – rdiff and rdiff-backup

The rdiff command uses the rsync algorithm. A utility called rdiff-backup has been created which is capable of maintaining a backup mirror of a file or directory over the network, on another server. rdiff-backup stores incremental rdiff deltas with the backup, with which it is possible to recreate any backup point. Next time I will write about these utilities.

rsync for Windows Server/XP/7/8

Please note: if you are using MS-Windows, try any one of the following programs:

=> Official rsync documentation

#### [Feb 12, 2017] How to Sync Two Apache Web Servers-Websites Using Rsync

###### Feb 12, 2017 | www.tecmint.com
The purpose of creating a mirror of your web server with rsync is that if your main web server fails, your backup server can take over to reduce downtime of your website. This way of creating a web server backup is very good and effective for small and medium-size web businesses.

Advantages of Syncing Web Servers

The main advantages of creating a web server backup with rsync are as follows:

1. Rsync syncs only those bytes and blocks of data that have changed.
2. Rsync has the ability to check and delete those files and directories at backup server that have been deleted from the main web server.
3. It takes care of permissions, ownerships and special attributes while copying data remotely.
4. It also supports SSH protocol to transfer data in an encrypted manner so that you will be assured that all data is safe.
5. Rsync uses compression and decompression method while transferring data which consumes less bandwidth.
How To Sync Two Apache Web Servers

Let's proceed with setting up rsync to create a mirror of your web server. Here, I'll be using two servers.

Main Server
Hostname: webserver.example.com

Backup Server
Hostname: backup.example.com
Step 1: Install Rsync Tool

Here in this case web server data of webserver.example.com will be mirrored on backup.example.com . And to do so first, we need to install Rsync on both the server with the help of following command.

[root@tecmint]# yum install rsync        [On Red Hat based systems]
[root@tecmint]# apt-get install rsync    [On Debian based systems]

Step 2: Create a User to run Rsync

We can setup rsync with root user, but for security reasons, you can create an unprivileged user on main webserver i.e webserver.example.com to run rsync.

[root@tecmint]# useradd tecmint
[root@tecmint]# passwd tecmint


Here I have created a user " tecmint " and assigned a password to user.

Step 3: Test Rsync Setup

It's time to test your rsync setup on your backup server (i.e. backup.example.com ) and to do so, please type following command.

[root@backup www]# rsync -avzhe ssh tecmint@webserver.example.com:/var/www/ /var/www

Sample Output
tecmint@webserver.example.com's password:
receiving incremental file list
sent 128 bytes  received 32.67K bytes  5.96K bytes/sec
total size is 12.78M  speedup is 389.70


You can see that your rsync is now working absolutely fine and syncing data. I have used " /var/www " to transfer; you can change the folder location according to your needs.

Step 4: Set Up Passwordless SSH Login

Now we are done with the rsync setup and it's time to set up a cron job for rsync. As we are going to use rsync with the SSH protocol, ssh will ask for authentication, and if we don't provide a password, cron will not work. In order for cron to work smoothly, we need to set up passwordless ssh logins for rsync.

Here in this example, I am doing it as root to preserve file ownerships as well, you can do it for alternative users too.

First, we'll generate a public and private key with following commands on backups server (i.e. backup.example.com ).

[root@backup]# ssh-keygen -t rsa -b 2048


When you enter this command, please don't provide passphrase and click enter for Empty passphrase so that rsync cron will not need any password for syncing data.

Sample Output
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
9a:33:a9:5d:f4:e1:41:26:57:d0:9a:68:5b:37:9c:23 root@backup.exmple.com
The key's randomart image is:
+--[ RSA 2048]----+
|          .o.    |
|           ..    |
|        ..++ .   |
|        o=E *    |
|       .Sooo o   |
|       =.o o     |
|      * . o      |
|     o +         |
|    . .          |
+-----------------+


Now, our Public and Private key has been generated and we will have to share it with main server so that main web server will recognize this backup machine and will allow it to login without asking any password while syncing data.

[root@backup html]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@webserver.example.com


Now try logging into the machine, with " ssh 'root@webserver.example.com '", and check in .ssh/authorized_keys .

[root@backup html]# ssh root@webserver.example.com


Now, we are done with sharing keys. To know more in-depth about SSH password less login , you can read our article on it.

Step 5: Schedule Cron To Automate Sync

Let's setup a cron for this. To setup a cron, please open crontab file with the following command.

[root@backup ~]# crontab -e


It will open up the crontab file to edit with your default editor. Here in this example, I am writing a cron entry to run every 5 minutes to sync the data.

*/5        *        *        *        *   rsync -avzhe ssh root@webserver.example.com:/var/www/ /var/www/


The above cron and rsync command simply syncing " /var/www/ " from the main web server to a backup server in every 5 minutes . You can change the time and folder location configuration according to your needs. To be more creative and customize with Rsync and Cron command, you can check out our more detailed articles at:

#### [Feb 12, 2017] How to Use rsync to Synchronize Files Between Servers Linux Server Training 101

###### Feb 12, 2017 | www.youtube.com
soundtraining.net

Great demonstration and very easy to follow, Don! Just a note to anyone who might come across this and start using it in production systems: you certainly would not want to be rsyncing with root accounts. In addition, you would use key-based auth with SSH as an additional layer of security. Just my 2 cents ;-)

curtis shaw, 11 months ago: Best rsync tutorial on the web. Thanks.

#### [Feb 12, 2017] An Easy Way To Monitor A Website From Command Line In Linux

###### OSTechNix

We all know that ping command will tell you instantly whether the website is live or down. Usually, we all check whether a website is up or down like below.

ping ostechnix.com -c 3

Sample output:

PING ostechnix.com (64.90.37.180) 56(84) bytes of data.
64 bytes from ostechnix.com (64.90.37.180): icmp_seq=1 ttl=51 time=376 ms
64 bytes from ostechnix.com (64.90.37.180): icmp_seq=2 ttl=51 time=374 ms

--- ostechnix.com ping statistics ---
3 packets transmitted, 2 received, 33% packet loss, time 2000ms
rtt min/avg/max/mdev = 374.828/375.471/376.114/0.643 ms

But would you run this command every time to check whether your website is live or down? You may create a script to check your website status at periodic intervals. But wait, it's not necessary! Here is a simple command that will watch or monitor it at a regular interval.

watch -n 1 curl -I http://DOMAIN_NAME/

For those who don't know, the watch command is used to run any command at a particular interval.

Example:

Let us check if ostechnix.com site is live or down. To do so, run:

watch -n 1 curl -I https://www.ostechnix.com/

Sample output:

Every 1.0s: curl -I https://www.ostechnix.com/ sk: Thu Dec 22 17:37:24 2016

% Total % Received % Xferd Average Speed Time Time Time Current
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
HTTP/1.1 200 OK
Date: Thu, 22 Dec 2016 12:07:09 GMT
Server: ApacheD
Content-Type: text/html; charset=UTF-8


The above command will monitor our site ostechnix.com at a one-second interval. You can change the monitoring interval as you wish. Unlike the ping command, it will keep watching your site status until you stop it. To stop this command, press CTRL+C.

If you got HTTP/1.1 200 OK in the output, great! It means your website is working and live.
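If you only need the status code for scripting, an alternative (not from the original article) is to let curl print just the code:

watch -n 60 "curl -s -o /dev/null -w '%{http_code}\n' https://www.ostechnix.com/"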

#### [Aug 01, 2014] Getting Back To Coding

###### Slashdot

New submitter rrconan writes I always feel like I'm getting old because of the constant need to learn a new tools to do the same job. At the end of projects, I get the impression that nothing changes — there are no real benefits to the new tools, and the only result is a lot of time wasted learning them instead of doing the work. We discussed this last week with Andrew Binstock's "Just Let Me Code" article, and now he's written a follow-up about reducing tool complexity and focusing on writing code. He says, "Tool vendors have several misperceptions that stand in the way. The first is a long-standing issue, which is 'featuritis': the tendency to create the perception of greater value in upgrades by adding rarely needed features. ... The second misperception is that many tool vendors view the user experience they offer as already pretty darn good. Compared with tools we had 10 years ago or more, UIs have indeed improved significantly. But they have not improved as fast as complexity has increased. And in that gap lies the problem.' Now I understand that what I thought of as "getting old" was really "getting smart."

#### 10 most rated Linux commands for the past weeks at commandlinefu

1- Save man-page as pdf

 man -t awk | ps2pdf - awk.pdf

2- Duplicate installed packages from one machine to the other (RPM-based systems)

ssh root@remote.host "rpm -qa" | xargs yum -y install

3- Stamp a text line on top of the pdf pages to quickly add some remark, comment, stamp text, … on top of (each of) the pages of the input pdf file

echo "This text gets stamped on the top of the pdf pages." | enscript -B -f Courier-Bold16 -o- | ps2pdf - | pdftk input.pdf stamp - output output.pdf

4- Display the number of connections to a MySQL Database

Count the number of active connections to a MySQL database.
The MySQL command “show processlist” gives a list of all the active clients.
However, by using the processlist table, in the information_schema database, we can sort and count the results within MySQL.

mysql -u root -p -BNe "select host,count(host) from processlist group by host;" information_schema

5- Create a local compressed tarball from remote host directory

ssh user@host "tar -zcf - /path/to/dir" > dir.tar.gz

This improves on #9892 by compressing the directory on the remote machine so that the amount of data transferred over the network is much smaller. The command uses ssh(1) to get to a remote host, uses tar(1) to archive and compress a remote directory, prints the result to STDOUT, which is written to a local file. In other words, we are archiving and compressing a remote directory to our local box.

6- tail a log over ssh

This is also handy for taking a look at resource usage of a remote box.

ssh -t remotebox "tail -f /var/log/remote.log"

7- Print diagram of user/groups

Parses /etc/group to “dot” format and pases it to “display” (imagemagick) to show a usefull diagram of users and groups (don’t show empty groups).

awk 'BEGIN{FS=":"; print "digraph{"}{split($4, a, ","); for (i in a) printf "\"%s\" [shape=box]\n\"%s\" -> \"%s\"\n",$1, a[i], $1}END{print "}"}' /etc/group|display

8- Draw kernel module dependency graph

Parses lsmod's output and passes it to the 'dot' drawing utility, then finally passes it to an image viewer.

lsmod | perl -e 'print "digraph \"lsmod\" {";<>;while(<>){@_=split/\s+/; print "\"$_[0]\" -> \"$_\"\n" for split/,/,$_[3]}print "}"' | dot -Tpng | display -

9- Create strong, but easy to remember password

Why remember? Generate!
Up to 48 chars, works on any unix-like system

read -s pass; echo $pass | md5sum | base64 | cut -c -16

10- Find all files larger than 500M and less than 1GB

find / -type f -size +500M -size -1G

11- Limit the cpu usage of a process

This will limit the average amount of CPU it consumes.

sudo cpulimit -p pid -l 50

#### [Jul 26, 2011] ivarch.com Pipe Viewer Online Man Page

pv allows a user to see the progress of data through a pipeline, by giving information such as time elapsed, percentage completed (with progress bar), current throughput rate, total data transferred, and ETA.

To use it, insert it in a pipeline between two processes, with the appropriate options. Its standard input will be passed through to its standard output and progress will be shown on standard error.

pv will copy each supplied FILE in turn to standard output (- means standard input), or if no FILEs are specified just standard input is copied. This is the same behaviour as cat(1).

A simple example to watch how quickly a file is transferred using nc(1):

pv file | nc -w 1 somewhere.com 3000

A similar example, transferring a file from another process and passing the expected size to pv:

cat file | pv -s 12345 | nc -w 1 somewhere.com 3000

A more complicated example using numeric output to feed into the dialog(1) program for a full-screen progress display:

(tar cf - . \
 | pv -n -s $(du -sb . | awk '{print $1}') \
 | gzip -9 > out.tgz) 2>&1 \
 | dialog --gauge 'Progress' 7 70

Frequent use of this third form is not recommended as it may cause the programmer to overheat.

#### [Jan 24, 2008] freshmeat.net Project details for cgipaf

##### The package also contains a Solaris binary of a chpasswd clone, which is extremely useful for mass changes of passwords in corporate environments which include Solaris and other Unixes that do not have the chpasswd utility (HP-UX is another example in this category).

Version 1.3.2 now includes a Solaris binary of chpasswd which works on Solaris 9 and 10. cgipaf is a combination of three CGI programs:

• passwd.cgi, which allows users to update their password,
• viewmailcfg.cgi, which allows users to view their current mail configuration,
• mailcfg.cgi, which updates the mail configuration.

All programs use PAM for user authentication. It is possible to run a script to update SAMBA passwords or NIS configuration when a password is changed. mailcfg.cgi creates a .procmailrc in the user's home directory. A user with too many invalid logins can be locked. The minimum and maximum UID can be set in the configuration file, so you can specify a range of UIDs that are allowed to use cgipaf.

#### [Aug 7, 2007] Expect plays a crucial role in network management

by Cameron Laird

###### 31 Jul 2007 | www.ibm.com/developerworks

If you manage systems and networks, you need Expect. More precisely, why would you want to be without Expect? It saves hours common tasks otherwise demand. Even if you already depend on Expect, though, you might not be aware of the capabilities described below.

Expect automates command-line interactions

You don't have to understand all of Expect to begin profiting from the tool; let's start with a concrete example of how Expect can simplify your work on AIX® or other operating systems: Suppose you have logins on several UNIX® or UNIX-like hosts and you need to change the passwords of these accounts, but the accounts are not synchronized by Network Information Service (NIS), Lightweight Directory Access Protocol (LDAP), or some other mechanism that recognizes you're the same person logging in on each machine.
Logging in to a specific host and running the appropriate passwd command doesn't take long—probably only a minute, in most cases. And you must log in "by hand," right, because there's no way to script your password? Wrong. In fact, the standard Expect distribution (full distribution) includes a command-line tool (and a manual page describing its use!) that precisely takes over this chore. passmass (see Resources) is a short script written in Expect that makes it as easy to change passwords on twenty machines as on one. Rather than retyping the same password over and over, you can launch passmass once and let your desktop computer take care of updating each individual host. You save yourself enough time to get a bit of fresh air, and multiple opportunities for the frustration of mistyping something you've already entered.

The limits of Expect

This passmass application is an excellent model—it illustrates many of Expect's general properties:

• It's a great return on investment: The utility is already written, freely downloadable, easy to install and use, and saves time and effort.
• Its contribution is "superficial," in some sense. If everything were "by the book"—if you had NIS or some other domain authentication or single sign-on system in place—or even if login could be scripted, there'd be no need for passmass. The world isn't polished that way, though, and Expect is very handy for grabbing on to all sorts of sharp edges that remain. Maybe Expect will help you create enough free time to rationalize your configuration so that you no longer need Expect. In the meantime, take advantage of it.
• As distributed, passmass only logs in by way of telnet, rlogin, or slogin. I hope all current developerWorks readers have abandoned these protocols for ssh, which passmass does not fully support.
• On the other hand, almost everything having to do with Expect is clearly written and freely available. It only takes three simple lines (at most) to enhance passmass to respect ssh and other options.

You probably know enough already to begin to write or modify your own Expect tools. As it turns out, the passmass distribution actually includes code to log in by means of ssh, but omits the command-line parsing to reach that code. Here's one way you might modify the distribution source to put ssh on the same footing as telnet and the other protocols:

Listing 1. Modified passmass fragment that accepts the -ssh argument

...
} "-rlogin" {
    set login "rlogin"
    continue
} "-slogin" {
    set login "slogin"
    continue
} "-ssh" {
    set login "ssh"
    continue
} "-telnet" {
    set login "telnet"
    continue
...

In my own code, I actually factor out more of this "boilerplate." For now, though, this cascade of tests, in the vicinity of line #100 of passmass, gives a good idea of Expect's readability. There's no deep programming here—no need for object-orientation, monadic application, co-routines, or other subtleties. You just ask the computer to take over typing you usually do for yourself. As it happens, this small step represents many minutes or hours of human effort saved.

#### [April 23, 2006] Port25 Running Windows Command Line Applications from a Linux Box

##### What is interesting, the comments do not mention that an ssh server is available under SFU 3.5.
Research and Analysis, Wednesday, April 19, 2006 5:42 PM by admin

Running Command Line Applications on Windows XP/2000 from a Linux Box

Question:

-----Original Message-----
From: swagner@********
Sent: Thursday, April 13, 2006 2:35 PM
To: Port25 Feedback
Subject: (Port25) : You guys should look into _____
Importance: High

Can you recommend anything for running command line applications on a Windows XP/2000 box from within a program that runs on Linux? For example, I want a script to run on a Linux server that will connect to a Windows server on our network and run certain commands.

Answer:

One way to do this would be to install an SSH daemon on the Windows machine and run commands via the ssh client on the Linux machine. Simply search the web for information on setting up the Cygwin SSH daemon as a service in Windows (there are docs about this everywhere). You can then run commands with ssh, somewhat like:

ssh administrator@<hostname> 'touch /cygdrive/c/blar'

That will create a file in C:\ called "blar". You can also access Windows commands if you alter the path in the Cygwin environment or use the full path to the command:

ssh administrator@<hostname> '/cygdrive/c/windows/system32/net.exe view'

#### re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 3:44 AM by nektar

I am disappointed that Microsoft does not offer an SSH implementation with Services for Unix or with SUA.

#### re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 4:36 PM by szlwzl

I would also very much like to see this as a built-in feature - cygwin is great and I use it all the time, but why not build something like this into the OS?

#### re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 6:05 PM by breiter

I'm stunned that you didn't recommend OpenSSH running on Interix from SFU 3.5 or SUA 5.2. I would much rather rely upon Interix than Cygwin. Interopsystems maintains both a free straight OpenSSH package and a commercial enhanced version with an MMC-based GUI configurator.

#### re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 1:12 AM by vox

Of course, if there were an RDP client that could access Windows full screen using a browser (the same way the Virtual Labs work), you could run GUI programs as well.

#### Replies to all

Monday, April 24, 2006 1:30 AM by einhverfr

Hi all.

Nektar wrote: "I am disappointed that Microsoft does not offer an SSH implementation with Services for Unix or with SUA."

When I was at Microsoft, the legal department raised objections. Not sure if they were trademark related or what. But a good substitute would be a kerberized telnet client and server that would be capable of session encryption as per the Kerberos specification. People usually don't know that this is possible using Kerberos and telnet, but it is. And given the architecture of AD, this would lead to close integration.

Vox wrote: "Of course if there was an RDP client that could access Windows full screen using a browser (the same way as Virtual Labs work) you could run GUI programs as well"

Ever use rdesktop? It doesn't use a browser, but it is close enough that you can easily run GUI apps.

Best Wishes,
Chris Travers
Metatron Technology Consulting

#### re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 12:42 PM by docsmooth

rdesktop -0 -f <servername> will work the same as mstsc /console with the fullscreen switch set.
As Chris said, it's not a browser, but it's a 100% replacement for MSTSC and supports every single option, security and otherwise, that is in MSTSC. Also, KDE users have "krdc", which wraps around rdesktop and VNC, so you can connect to either and save off your settings, just like saving .RDP files in Windows.

Rob

#### re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 4:42 PM by docsmooth

I completely forgot this portion of my previous comment: Is there anyone who has experience running Windows Resource Kit tools or Windows 2003 Support Tools from Wine or similar, directly off of the Linux box? It would be fantastic to be able to run those and the MMC tools, perhaps with WinBind as the authentication path. As things sit right now, I have to run a VMWare WinXP instance, or dual-boot, to get access to those tools that I run less frequently than certain FOSS tools, but still need.

#### re: Running Windows Command Line Applications from a Linux Box

Thursday, April 27, 2006 4:39 PM by smither

Simply install vncserver from, for example, realvnc.com, then use vncviewer on the Linux box. You have your complete Windows desktop within a window in your X server. Open the terminal from the start menu.

#### re: Running Windows Command Line Applications from a Linux Box

Friday, April 28, 2006 2:49 PM by remdotc

You can either purchase a copy of CrossOver Office and/or Cedega, which allow you to run Windows-native binaries on Linux (DirectX), or you can get these to work under Wine, though you need to install IE 6.1, set your OS in wine.conf to 2000, and copy most of the files contained in sysroot/system32 to your winex install; performance is horrible. The better solution is to install an ssh server on the Windows box and then remote in via the command line. If you cannot afford a commercial one, you can always use cygwin.

#### [Jan 25, 2005] Tool of the Month: ManEdit by Joe "Zonker" Brockmeier

ManEdit is provided by WolfPack Entertainment. I know, that doesn't sound like a company that would be releasing a manual page editor, but they did — and under the GNU General Public License, no less.

It's not terribly difficult to create manual pages using an editor like Emacs or Vim (see my December 2003 column if you'd like to start from scratch), but learning how to write in *roff format is yet another skill that developers and admins have to tackle. ManEdit actually uses an XML format that it converts to groff format when saving, so it's not necessary to delve into groff if you don't want to. (I would recommend having at least a passing familiarity with groff if you're going to be doing much development, but it's not absolutely necessary.)

ManEdit is an easy-to-use manual page editor and viewer that takes all the hassle out of creating manual pages (well, the formatting hassle, anyway — you still have to actually write the manual itself). The ManEdit homepage has source and packages for Debian, Mandrake, Slackware, and SUSE Linux. The source should compile on FreeBSD and Solaris as well, so long as you have GTK 1.2.0. I used the SUSE packages without any problem on a SUSE 9.2 system.

#### Sys Admin Magazine: Useful Solaris Commands

truss -c (Solaris >= 8): This astounding option to truss provides a profile summary of the command being trussed:

$ truss -c grep asdf work.doc
syscall              seconds   calls  errors
_exit                    .00       1
open                     .00       8      4
close                    .00       5
brk                      .00      15
stat                     .00       1
fstat                    .00       4
execve                   .00       1
mmap                     .00      10
munmap                   .01       3
memcntl                  .00       2
llseek                   .00       1
open64                   .00       1
----     ---    ---
sys totals:              .02      76      4
usr time:                .00
elapsed:                 .05

It can also show profile data on a running process. In this case, the data shows what the process did between when truss was started and when truss execution was terminated with a control-c. It’s ideal for determining why a process is hung without having to wade through the pages of truss output.
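For instance, to profile a process that is already running, attach to it with -p and interrupt truss when you have seen enough (the pid below is made up for illustration):

truss -c -p 2345     # attach to pid 2345; press Ctrl-C to print the summary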

truss -d and truss -D (Solaris >= 8): These truss options show the time associated with each system call shown by truss and are excellent for finding performance problems in custom or commercial code. For example:

$ truss -d who
Base time stamp:  1035385727.3460  [ Wed Oct 23 11:08:47 EDT 2002 ]
 0.0000 execve("/usr/bin/who", 0xFFBEFD5C, 0xFFBEFD64)  argc = 1
 0.0032 stat("/usr/bin/who", 0xFFBEFA98)                = 0
 0.0037 open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
 0.0042 open("/usr/local/lib/libc.so.1", O_RDONLY)      Err#2 ENOENT
 0.0047 open("/usr/lib/libc.so.1", O_RDONLY)            = 3
 0.0051 fstat(3, 0xFFBEF42C)                            = 0
 . . .

truss -D is even more useful, showing the time delta between system calls:

Dilbert> truss -D who
 0.0000 execve("/usr/bin/who", 0xFFBEFD5C, 0xFFBEFD64)  argc = 1
 0.0028 stat("/usr/bin/who", 0xFFBEFA98)                = 0
 0.0005 open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
 0.0006 open("/usr/local/lib/libc.so.1", O_RDONLY)      Err#2 ENOENT
 0.0005 open("/usr/lib/libc.so.1", O_RDONLY)            = 3
 0.0004 fstat(3, 0xFFBEF42C)                            = 0

In this example, the stat system call took a lot longer than the others.

truss -T: This is a great debugging help. It will stop a process at the execution of a specified system call. ("-U" does the same, but with user-level function calls.) A core could then be taken for further analysis, or any of the /proc tools could be used to determine many aspects of the status of the process.

truss -l (improved in Solaris 9): Shows the thread number of each call in a multi-threaded process. Solaris 9 truss -l finally makes it possible to watch the execution of a multi-threaded application.

Truss is truly a powerful tool. It can be used on core files to analyze what caused the problem, for example. It can also show details on user-level library calls (either system libraries or programmer libraries) via the "-u" option.

pkg-get: This is a nice tool (http://www.bolthole.com/solaris) for automatically getting freeware packages. It is configured via /etc/pkg-get.conf. Once it's up and running, execute pkg-get -a to get a list of available packages, and pkg-get -i to get and install a given package.

plimit (Solaris >= 8): This command displays and sets the per-process limits on a running process. This is handy if a long-running process is running up against a limit (for example, number of open files). Rather than using limit and restarting the command, plimit can modify the running process.

coreadm (Solaris >= 8): In the "old" days (before coreadm), core dumps were placed in the process's working directory, and core files would overwrite each other. All this and more has been addressed by coreadm, a tool to manage core file creation. With it, you can specify whether to save cores, where cores should be stored, how many versions should be retained, and more. Settings can be retained between reboots by coreadm modifying /etc/coreadm.conf.

pgrep (Solaris >= 8): pgrep searches through /proc for processes matching the given criteria, and returns their process-ids. A great option is "-n", which returns the newest process that matches.

preap (Solaris >= 9): Reaps zombie processes. Any process stuck in the "z" state (as shown by ps) can be removed from the system with this command.

pargs (Solaris >= 9): Shows the arguments and environment variables of a process.

nohup -p (Solaris >= 9): The nohup command can be used to start a process so that if the shell that started the process closes (i.e., the process gets a "SIGHUP" signal), the process will keep running. This is useful for backgrounding a task that should continue running no matter what happens around it. But what happens if you start a process and later want to HUP-proof it? With Solaris 9, nohup -p takes a process-id and causes SIGHUP to be ignored.
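A few hedged examples of how these process tools are typically invoked (the process name and pids below are made up for illustration):

pgrep -n sshd              # pid of the newest process whose name matches sshd
pargs $(pgrep -n sshd)     # that process's argument vector (pargs -e adds its environment)
nohup -p 4321              # Solaris 9+: tell already-running pid 4321 to ignore SIGHUP
preap 9876                 # reap a zombie whose pid is 9876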
prstat (Solaris >= 8): prstat is top and a lot more. Both commands provide a screen's worth of process and other information and update it frequently, for a nice window on system performance. prstat has much better accuracy than top. It also has some nice options. "-a" shows process and user information concurrently (sorted by CPU hog, by default). "-c" causes it to act like vmstat (new reports printed below old ones). "-C" shows processes in a processor set. "-j" shows processes in a "project". "-L" shows per-thread information as well as per-process. "-m" and "-v" show quite a bit of per-process performance detail (including pages, traps, lock wait, and CPU wait). The output data can also be sorted by resident-set (real memory) size, virtual memory size, execute time, and so on. prstat is very useful on systems without top, and should probably be used instead of top because of its accuracy (and some sites care that it is a supported program). A few sample invocations follow the df -h listing below.

trapstat (Solaris >= 9): trapstat joins lockstat and kstat as the most inscrutable commands on Solaris. Each shows gory details about the innards of the running operating system. Each is indispensable in solving strange happenings on a Solaris system. Best of all, their output is good to send along with bug reports, but further study can reveal useful information for general use as well.

vmstat -p (Solaris >= 8): Until this option became available, it was almost impossible (see the "se toolkit") to determine what kind of memory demand was causing a system to page. vmstat -p is key because it not only shows whether your system is under memory stress (via the "sr" column), it also shows whether that stress is from application code, application data, or I/O. "-p" can really help pinpoint the cause of any mysterious memory issues on Solaris.

pmap -x (Solaris >= 8, bugs fixed in Solaris >= 9): If the process with memory problems is known, and more details on its memory use are needed, check out pmap -x. The target process-id has its memory map fully explained, as in:

# pmap -x 1779
1779:   -ksh
 Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
00010000     192     192       -       - r-x--  ksh
00040000       8       8       8       - rwx--  ksh
00042000      32      32       8       - rwx--    [ heap ]
FF180000     680     664       -       - r-x--  libc.so.1
FF23A000      24      24       -       - rwx--  libc.so.1
FF240000       8       8       -       - rwx--  libc.so.1
FF280000     568     472       -       - r-x--  libnsl.so.1
FF31E000      32      32       -       - rwx--  libnsl.so.1
FF326000      32      24       -       - rwx--  libnsl.so.1
FF340000      16      16       -       - r-x--  libc_psr.so.1
FF350000      16      16       -       - r-x--  libmp.so.2
FF364000       8       8       -       - rwx--  libmp.so.2
FF380000      40      40       -       - r-x--  libsocket.so.1
FF39A000       8       8       -       - rwx--  libsocket.so.1
FF3A0000       8       8       -       - r-x--  libdl.so.1
FF3B0000       8       8       8       - rwx--    [ anon ]
FF3C0000     152     152       -       - r-x--  ld.so.1
FF3F6000       8       8       8       - rwx--  ld.so.1
FFBFE000       8       8       8       - rw---    [ stack ]
--------  ------  ------  ------  ------
total Kb    1848    1728      40       -

Here we see each chunk of memory, what it is being used for, how much space it is taking (virtual and real), and mode information.

df -h (Solaris >= 9): This command is popular on Linux, and just made its way into Solaris. df -h displays summary information about file systems in human-readable form:

$ df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c0t0d0s0      4.8G   1.7G   3.0G    37%    /
/proc                    0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
fd                       0K     0K     0K     0%    /dev/fd
swap                   848M    40K   848M     1%    /var/run
swap                   849M   1.0M   848M     1%    /tmp
/dev/dsk/c0t0d0s7       13G    78K    13G     1%    /export/home
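Returning to prstat and vmstat -p from above, typical invocations look something like the following (intervals are in seconds; treat these as illustrative rather than canonical):

prstat -a 5        # processes plus a per-user summary, refreshed every 5 seconds
prstat -mL 5       # microstate accounting, one line per thread
vmstat -p 5        # paging activity split into executable, anonymous, and filesystem pages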

Conclusion

Each administrator has a set of tools used daily, and another set of tools to help in a pinch. This column included a wide variety of commands and options that are lesser known, but can be very useful. Do you have favorite tools that have saved you in a bind? If so, please send them to me so I can expand my tool set as well. Alternately, send along any tools that you hate or that you feel are dangerous, which could also turn into a useful column!

#### [Jan 13, 2004] The art of writing Linux utilities, by Peter Seebach

What makes a good utility?

There is a wonderful discussion of this question in The UNIX Programming Environment, by Kernighan & Pike. A good utility is one that does its job as well as possible. It has to play well with others; it has to be amenable to being combined with other utilities. A program that doesn't combine with others isn't a utility; it's an application.

Utilities are supposed to let you build one-off applications cheaply and easily from the materials at hand. A lot of people think of them as being like tools in a toolbox. The goal is not to have a single widget that does everything, but to have a handful of tools, each of which does one thing as well as possible.

Some utilities are reasonably useful on their own, whereas others imply cooperation in pipelines of utilities. Examples of the former include sort and grep. On the other hand, xargs is rarely used except with other utilities, most often  find.

What language to write in?

Most of the UNIX system utilities are written in C. The examples here are in Perl and sh. Use the right tool for the right job. If you use a utility heavily enough, the cost of writing it in a compiled language might be justified by the performance gain. On the other hand, for the fairly common case where a program's workload is light, a scripting language may offer faster development. If you aren't sure, use the language you know best. At least when you're prototyping a utility, or figuring out how useful it is, favor programmer efficiency over performance tuning.

Most of the UNIX system utilities are in C, simply because they're heavily used enough to justify the development cost. Perl and sh (or ksh) can be good languages for a quick prototype. Utilities that tie other programs together may be easier to write in a shell than in a more conventional programming language. On the other hand, any time you want to interact with raw bytes, C is probably looming on your horizon.

Designing a utility

A good rule of thumb is to start thinking about the design of a utility the second time you have to solve a problem. Don't mourn the one-off hack you write the first time; think of it as a prototype. The second time, compare what you need to do with what you needed to do the first time. Around the third time, you should start thinking about taking the time to write a general utility. Even a merely repetitive task might merit the development of a utility; for instance, many generalized file-renaming programs have been written based on the frustration of trying to rename files in a generalized way.

Here are some design goals of utilities; each gets its own section, below.

• Do one thing well.
• Be a filter.
• Generalize.
• Be robust.
• Be new.

Do one thing well

Do one thing well; don't do multiple things badly. The best example of doing one thing well is probably sort. No utilities other than sort have a sort feature. The idea is simple; if you only solve a problem once, you can take the time to do it well.

Imagine how frustrating it would be if most programs sorted data, but some supported only lexicographic sorts, while others supported only numeric sorts, and a few even supported selection of keys rather than sorting by whole lines. It would be annoying at best.

When you find a problem to solve, try to break the problem up into parts, and don't duplicate the parts for which utilities already exist. The more you can focus on a tool that lets you work with existing tools, the better the chances that your utility will stay useful.

You may need to write more than one program. The best way to solve a specialized task is often to write one or two utilities and a bit of glue to tie them together, rather than writing a single program to solve the whole thing. It's fine to use a 20-line shell script to tie your new utility together with existing tools. If you try to solve the whole problem at once, the first change that comes along might require you to rethink everything.

I have occasionally needed to produce two-column or three-column output from a database. It is generally more efficient to write a program to build the output in a single column and then glue it to a program that puts things in columns. The shell script that combines these two utilities is itself a throwaway; the separate utilities have outlived it.
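As a sketch of that kind of glue, the columnizing can be left to a standard tool such as pr, so the script stays trivial (report1col here is a stand-in name for whatever program emits the single-column output):

# report1col is hypothetical; pr -3 -t folds its output into three columns without headers
report1col | pr -3 -t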

Some utilities serve very specialized needs. If the output of ls in a crowded directory scrolls off the screen very quickly, it might be because there's a file with a very long name, forcing  ls to use only a single column for output. Paging through it using more takes time. Why not just sort lines by length, and pipe the result through tail, as follows?

Listing 1. One of the smallest utilities anywhere, sl

#!/usr/bin/perl -w
print sort { length $a <=> length $b } <>;

The script in Listing 1 does exactly one thing. It takes no options, because it needs no options; it only cares about the length of lines. Thanks to Perl's convenient  <> idiom, this automatically works either on standard input or on files named on the command line.
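Assuming the script in Listing 1 has been saved as sl and made executable somewhere on your PATH, a typical use might be:

ls | sl | tail -3      # the three longest file names in the current directory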

Be a filter

Almost all utilities are best conceived of as filters, although a few very useful utilities don't fit this model. (For instance, a program that counts might be very useful, even though it doesn't work well as a filter. Programs that take only command-line arguments as input, and produce potentially complicated output, can be very useful.) Most utilities, though, should work as filters. By convention, filters work on lines of text. Most filters should have some support for running on multiple input files.

Remember that a utility needs to work on the command line and in scripts. Sometimes, the ideal behavior varies a little. For instance, most versions of ls automatically sort input into columns when writing to a terminal. The default behavior of grep is to print the file name in which a match was found only if multiple files were specified. Such differences should have to do with how users will want the utility to work, not with other agendas. For instance, old versions of GNU bc displayed an intrusive copyright notice when started. Please don't do that. Make your utility stick to doing its job.

Utilities like to live in pipelines. A pipeline lets a utility focus on doing its job, and nothing else. To live in a pipeline, a utility needs to read data from standard input and write data to standard output. If you want to deal with records, it's best if you can make each line be a "record." Existing programs such as  sort and  join are already thinking that way. They'll thank you for it.
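A minimal skeleton of such a filter might look like the following; it reads the files named on the command line, or standard input if none are given, and treats each line as one record (here it merely numbers them):

#!/bin/sh
# number each input record; works on named files or on stdin, so it fits in a pipeline
awk '{ printf "%6d  %s\n", NR, $0 }' "$@"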

One utility I occasionally use is a program that calls other programs iteratively over a tree of files. This makes very good use of the standard UNIX utility filter model, but it only works with utilities that read input and write output; you can't use it with utilities that operate in place, or take input and output file names.

Most programs that can run from standard input can also reasonably be run on a single file, or possibly on a group of files. Note that this arguably violates the rule against duplicating effort; obviously, this could be managed by feeding  cat into the next program in the series. However, in practice, it seems to be justified.

Some programs may legitimately read records in one format but produce something entirely different. An example would be a utility to put material into columnar form. Such a utility might equate lines to records on input, but produce multiple records per line on output.

Not every utility fits entirely into this model. For instance, xargs takes not records but names of files as input, and all of the actual processing is done by some other program.

Generalize

Try to think of tasks similar to the one you're actually performing; if you can find a general description of these tasks, it may be best to try to write a utility that fits that description. For instance, if you find yourself sorting text lexicographically one day and numerically another day, it might make sense to consider attempting a general sort utility.

Generalizing functionality sometimes leads to the discovery that what seemed like a single utility is really two utilities used in concert. That's fine. Two well-defined utilities can be easier to write than one ugly or complicated one.

Doing one thing well doesn't mean doing exactly one thing. It means handling a consistent but useful problem space. Lots of people use grep. However, a great deal of its utility comes from the ability to perform related tasks. The various options to grep do the work of a handful of small utilities that would have ended up sharing, or duplicating, a lot of code.

This rule, and the rule to do one thing, are both corollaries of an underlying principle: avoid duplication of code whenever possible. If you write a half-dozen programs, each of which sorts lines, you can end up having to fix similar bugs half a dozen times instead of having one better-maintained sort program to work on.

This is the part of writing a utility that adds the most work to the process of getting it completed. You may not have time to generalize something fully at first, but it pays off when you get to keep using the utility.

Sometimes, it's very useful to add related functionality to a program, even when it's not quite the same task. For instance, a program to pretty-print raw binary data might be more useful if, when run on a terminal device, it threw the terminal into raw mode. This makes it a lot easier to test questions involving keymaps, new keyboards, and the like. Not sure why you're getting tildes when you hit the delete key? This is an easy way to find out what's really getting sent. It's not exactly the same task, but it's similar enough to be a likely addition.

The errno utility in Listing 2 below is a good example of generalizing, as it supports both numeric and symbolic names.

Be robust

It's important that a utility be durable. A utility that crashes easily or can't handle real data is not a useful utility. Utilities should handle arbitrarily long lines, huge files, and so on. It is perhaps tolerable for a utility to fail on a data set larger than it can hold in memory, but some utilities manage to avoid even that limitation; for instance, sort, by using temporary files, can generally sort data sets much larger than it can hold in memory.

Try to make sure you've figured out what data your utility can possibly run on. Don't just ignore the possibility of data you can't handle. Check for it and diagnose it. The more specific your error messages, the more helpful you are being to your users. Try to give the user enough information to know what happened and how to fix it. When processing data files, try to identify exactly what the malformed data was. When trying to parse a number, don't just give up; tell the user what you got, and if possible, what line of the input stream the data was on.
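For instance, a filter that expects one integer per line can report exactly what it saw and where, rather than silently skipping the bad line. A rough sketch (the /dev/stderr redirection is supported by gawk and most modern awks, not necessarily by every historical awk):

#!/bin/sh
# complain about anything that is not a plain integer, naming the offending line number
awk '!/^[0-9]+$/ { printf "bad number \"%s\" on input line %d\n", $0, NR > "/dev/stderr"; bad = 1 }
     END { exit bad }' "$@"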

As a good example, consider the difference between two implementations of dc. If you run dc /home, one of them says "Cannot use directory as input!" The other just returns silently; no error message, no unusual exit code. Which of these would you rather have in your path when you make a typo on a cd command? Similarly, the former will give verbose error messages if you feed it the stream of data from a directory, perhaps by doing  dc < /home. On the other hand, it might be nice for it to give up early on when getting invalid data.

Security holes are often rooted in a program that isn't robust in the face of unexpected data. Keep in mind that a good utility might find its way into a shell script run as root. A buffer overflow in a program such as find is likely to be a risk to a great number of systems.

The better a program deals with unexpected data, the more likely it is to adapt well to varied circumstances. Often, trying to make a program more robust leads to a better understanding of its role, and better generalizations of it.

Be new

One of the worst kinds of utility to write is the one you already have. I wrote a wonderful utility called count. It allowed me to perform just about any counting task. It's a great utility, but there's a standard BSD utility called jot that does the same thing. Likewise, my very clever program for turning data into columns duplicates an existing utility, rs, likewise found on BSD systems, except that rs is much more flexible and better designed. See Resources below for more information on jot and rs.

If you're about to start writing a utility, take a bit of time to browse around a few systems to see if there might be one already. Don't be afraid to steal Linux utilities for use on BSD, or BSD utilities for use on Linux; one of the joys of utility code is that almost all utilities are quite portable.

Don't forget to look at the possibility of combining existing applications to make a utility. It is possible, in theory, that you'll find stringing existing programs together is not fast enough, but it's very rare that writing a new utility is faster than waiting for a slightly slow pipeline.

An example utility

In a sense this program is a counterexample, in that it is never useful as a filter. It works very well as a command-line utility, however.

This program does one thing only. It prints out errno lines from /usr/include/sys/errno.h in a slightly pretty-printed format. For instance:

$ errno 22
EINVAL [22]: Invalid argument

Listing 2. Errno finder

#!/bin/sh

usage() {
    echo >&2 "usage: errno [numbers or error names]"
    exit 1
}

for i
do
    case "$i" in
    [0-9]*)
        awk '/^#define/ && $3 == '"$i"' {
            for (i = 5; i < NF; ++i) {
                foo = foo " " $i;
            }
            printf("%-22s%s\n", $2 " [" $3 "]:", foo);
            foo = ""
        }' < /usr/include/sys/errno.h
        ;;
    E*)
        awk '/^#define/ && $2 == "'"$i"'" {
            for (i = 5; i < NF; ++i) {
                foo = foo " " $i;
            }
            printf("%-22s%s\n", $2 " [" $3 "]:", foo);
            foo = ""
        }' < /usr/include/sys/errno.h
        ;;
    *)
        echo >&2 "errno: can't figure out whether '$i' is a name or a number."
        usage
        ;;
    esac
done

Does it generalize? Yes, nicely. It supports both numeric and symbolic names. On the other hand, it doesn't know about other files, such as /usr/include/sys/signal.h, that are likely in the same format. It could easily be extended to do that, but for a convenience utility like this, it's easier to just make a copy called "signal" that reads signal.h, and uses "SIG*" as the pattern to match a name.

This is just a tad more convenient than using grep on system header files, but it's less error-prone. It doesn't produce garbled results from ill-considered arguments. On the other hand, it produces no diagnostic if a given name or number is not found in the header. It also doesn't bother to correct some invalid inputs. Still, as a command-line utility never intended to be used in an automated context, it's okay.
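The signal.h variant mentioned above could be sketched along the same lines; the header path and comment layout vary between systems, so treat this only as an illustration that assumes lines of the form "#define SIGHUP 1 /* hangup */":

#!/bin/sh
# look up one signal by name, e.g. "signal SIGHUP"
awk -v name="$1" '/^#define[ \t]+SIG/ && $2 == name {
        printf("%-22s", $2 " [" $3 "]:")
        for (i = 5; i < NF; ++i) printf(" %s", $i)
        printf("\n")
}' < /usr/include/sys/signal.h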

Another example might be a program to unsort input (see Resources for a link to this utility). This is simple enough; read in input files, store them in some way, then generate a random order in which to print out the lines. This is a utility of nearly infinite applications. It's also a lot easier to write than a sorting program; for instance, you don't need to specify which keys you're not sorting on, or whether you want things in a random order alphabetically, lexicographically, or numerically. The tricky part comes in reading in potentially very long lines. In fact, the provided version cheats; it assumes there will be no null bytes in the lines it reads. It's a lot harder to get that right, and I was lazy when I wrote it.
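A rough sketch of the unsort idea, ignoring the long-line and null-byte issues mentioned above: tag each line with a random key, sort on the key, then strip it.

#!/bin/sh
# unsort: print input lines in random order; reads named files or stdin
awk 'BEGIN { srand() } { printf "%.10f\t%s\n", rand(), $0 }' "$@" |
    sort -n | cut -f2-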

Summary

If you find yourself performing a task repeatedly, consider writing a program to do it. If the program turns out to be reasonable to generalize a bit, generalize it, and you will have written a utility.

Don't design the utility the first time you need it. Wait until you have some experience. Feel free to write a prototype or two; a good utility is sufficiently better than a bad utility to justify a bit of time and effort on researching it. Don't feel bad if what you thought would be a great utility ends up gathering dust after you wrote it. If you find yourself frustrated by your new program's shortcomings, you just had another prototyping phase. If it turns out to be useless, well, that happens sometimes.

The thing you're looking for is a program that finds general application outside your initial usage patterns. I wrote unsort because I wanted an easy way to get a random series of colors out of an old X11 "rgb.txt" file. Since then, I've used it for an incredible number of tasks, not the least of which was producing test data for debugging and benchmarking sort routines.

One good utility can pay back the time you spent on all the near misses. The next thing to do is make it available for others, so they can experiment. Make your failed attempts available, too; other people may have a use for a utility you didn't need. More importantly, your failed utility may be someone else's prototype, and lead to a wonderful utility program for everyone.

#### [Jul 3, 2003] dunne.dyn.dhs.org/Using the m4 Macro Processor - updated link

"What is it about m4 that makes it so useful, and yet so overlooked? m4 -- a macro processor -- unfortunately has a dry name that disguises a great utility. A macro processor is basically a program that scans text and looks for defined symbols, which it replaces with other text or other symbols."