
# Classic Unix Utilities

 There are many people who use UNIX or Linux but who IMHO do not understand UNIX. UNIX is not just an operating system, it is a way of doing things, and the shell plays a key role by providing the glue that makes it work. The UNIX methodology relies heavily on reuse of a set of tools rather than on building monolithic applications. Even perl programmers often miss the point, writing the heart and soul of the application as perl script without making use of the UNIX toolkit. -- David Korn (bold italic is mine -- BNN)

IMHO there are three Unix tools that can spell the difference between a really good programmer or sysadmin and a just above average one (even if the latter has solid knowledge of shell and Perl; knowledge of shell and Perl is necessary but not sufficient):

• OFM (Midnight Commander, Deco, XNC) - a unique class of file managers that greatly accelerate working with the classic command line Unix tools. Paradoxically, it came to Unix from DOS. See The Orthodox File Manager (OFM) Paradigm, Chapter 4.

• Expect - a unique Unix tool (that is now available for Windows too). BTW one of the earlier names for Expect was "sex" as it related to "intercourse" of programs ;-). I strongly recommend learning how to use it. See TCL, TK & Expect for more information.

• TCL -- Tool command language. This is a unique language that permits automating tasks that neither shell nor Perl can do. It is used in Expect (see above). Unfortunately politics of Unix (forking efforts of Richard Stallman (see Guile, a Scheme-based GNU macro language :-( ) and, especially, Sun's fascination with Java) prevented TCL from becoming a standard Unix macro language. As Wikipedia noted " Despite the enthusiasm of its users and developers, many novice programmers find Scheme intimidating - and the average skill level of scripting language programmers is substantially lower than for system and application programmers. Hence Guile, despite its many benefits, struggles for mainstream acceptance in the Linux/Unix world. ". For the dark side of RMS see The Tcl War and the second part of my RMS biography.

These tools can also serve as a fine test in interviews for advanced Unix-related positions if you have several similar candidates. Other things being equal, knowledge of them definitely demonstrates a level of Unix culture superior to the average "command line junkie" level ;-)

An overview of books about GNU/open source tools can be found in the Unix tools bibliography. There are not that many good books on the subject; still, even average books can provide you with insight into the usage of a tool that you might never get via daily practice.

Please note that Unix is a pretty complex system and some aspects of it are non-obvious even for those who have more than ten years of experience.

Dr. Nikolai Bezroukov


## Old News ;-)

#### [Dec 12, 2019] Use timedatectl to Control System Time and Date in Linux

###### Dec 12, 2019 | www.maketecheasier.com

Mastering the Command Line: Use timedatectl to Control System Time and Date in Linux By Himanshu Arora – Posted on Nov 11, 2014 Nov 9, 2014 in Linux

The timedatectl command in Linux allows you to query and change the system clock and its settings. It comes as part of systemd, a replacement for the sysvinit daemon used in the GNU/Linux and Unix systems.

In this article, we will discuss this command and the features it provides using relevant examples.

Timedatectl examples

Note – All examples described in this article are tested on GNU bash, version 4.3.11(1).

Display system date/time information

Simply run the command without any command line options or flags, and it gives you information on the system's current date and time, as well as time-related settings. For example, here is the output when I executed the command on my system:

$ timedatectl
Local time: Sat 2014-11-08 05:46:40 IST
Universal time: Sat 2014-11-08 00:16:40 UTC
Timezone: Asia/Kolkata (IST, +0530)
NTP enabled: yes
NTP synchronized: yes
RTC in local TZ: no
DST active: n/a

So you can see that the output contains information on local time, UTC, and time zone, as well as settings related to NTP, RTC and DST for the localhost.

Update the system date or time using the set-time option

To set the system clock to a specified date or time, use the set-time option followed by a string containing the new date/time information. For example, to change the system time to 6:40 am, I used the following command:

$ sudo timedatectl set-time "2014-11-08 06:40:00"


and here is the output:

$ timedatectl
Local time: Sat 2014-11-08 06:40:02 IST
Universal time: Sat 2014-11-08 01:10:02 UTC
Timezone: Asia/Kolkata (IST, +0530)
NTP enabled: yes
NTP synchronized: no
RTC in local TZ: no
DST active: n/a

Observe that the Local time field now shows the updated time. Similarly, you can update the system date, too.

Update the system time zone using the set-timezone option

To set the system time zone to the specified value, you can use the set-timezone option followed by the time zone value. To help you with the task, the timedatectl command also provides another useful option: list-timezones prints a scrollable list of available time zones to choose from.

To change the system's current time zone from Asia/Kolkata to Asia/Kathmandu, here is the command I used:

$ timedatectl set-timezone Asia/Kathmandu


and to verify the change, here is the output of the timedatectl command:

$ timedatectl
Local time: Sat 2014-11-08 07:11:23 NPT
Universal time: Sat 2014-11-08 01:26:23 UTC
Timezone: Asia/Kathmandu (NPT, +0545)
NTP enabled: yes
NTP synchronized: no
RTC in local TZ: no
DST active: n/a

You can see that the time zone was changed to the new value.

Configure RTC

You can also use the timedatectl command to configure RTC (real-time clock). For those who are unaware, RTC is a battery-powered computer clock that keeps track of the time even when the system is turned off. The timedatectl command offers a set-local-rtc option which can be used to maintain the RTC in either local time or universal time.

This option requires a boolean argument. If 0 is supplied, the system is configured to maintain the RTC in universal time:

$ timedatectl set-local-rtc 0


but in case 1 is supplied, it will maintain the RTC in local time instead.

$ timedatectl set-local-rtc 1

A word of caution: Maintaining the RTC in the local time zone is not fully supported and will create various problems with time zone changes and daylight saving adjustments. If at all possible, use RTC in UTC.

Another point worth noting is that if set-local-rtc is invoked and the --adjust-system-clock option is passed, the system clock is synchronized from the RTC again, taking the new setting into account. Otherwise the RTC is synchronized from the system clock.

Configure NTP-based network time synchronization

NTP, or Network Time Protocol, is a networking protocol for clock synchronization between computer systems over packet-switched, variable-latency data networks. It is intended to synchronize all participating computers to within a few milliseconds of UTC.

The timedatectl command provides a set-ntp option that controls whether NTP based network time synchronization is enabled. This option expects a boolean argument. To enable NTP-based time synchronization, run the following command:

$ timedatectl set-ntp true


To disable, run:

$ timedatectl set-ntp false

Conclusion

As evident from the examples described above, the timedatectl command is a handy tool for system administrators, who can use it to adjust various system clocks and RTC configurations as well as poll remote servers for time information. To learn more about the command, head over to its man page.

#### [Dec 12, 2019] Set Time-Date-Timezone using Command Line in Linux

###### Dec 12, 2019 | linoxide.com

Set Time/Date/Timezone in Ubuntu Linux. February 5, 2019, updated September 27, 2019. By Pungki Arianto.

Time is an important aspect in Linux systems, especially in critical services such as cron jobs. Having the correct time on the server ensures that the server operates in a healthy environment that consists of distributed systems and maintains accuracy in the workplace.

In this tutorial, we will focus on how to set time/date/time zone and to synchronize the server clock with your Ubuntu Linux machine.

Check Current Time

You can verify the current time and date using the date and the timedatectl commands. These Linux commands can be executed straight from the terminal as a regular user or as a superuser. The usefulness of the two commands is seen when you want to correct a wrong time from the command line.

Using the date command

Log in as a root user and use the command as follows

$ date


You can also use the same command to check a date 2 days ago

$ date --date="2 days ago"

Using timedatectl command

To check the status of the time on your system as well as the present time settings, use the timedatectl command as shown

# timedatectl

or

# timedatectl status

Changing Time

We use timedatectl to change the system time using the format HH:MM:SS. HH stands for the hour in 24-hour format, MM stands for minutes and SS for seconds. To set the time to 09:08:07, use the command as follows (using timedatectl)

# timedatectl set-time 09:08:07

Using date command

Changing time means all the system processes are running on the same clock, putting the desktop and server at the same time. From the command line, use the date command as follows

# date +%T -s "10:13:13"

Where,

• 10: Hour (hh)
• 13: Minute (mm)
• 13: Second (ss)

To set the time using 12-hour notation with AM or PM, use %p in the following format.

# date +%T%p -s "6:10:30AM"

# date +%T%p -s "12:10:30PM"

Change Date

Generally, you want your system date and time to be set automatically. If for some reason you have to change it manually using the date command, you can use this command:

# date --set="20140125 09:17:00"

It will set the current date and time of your system to 'January 25, 2014' and '09:17:00 AM'. Please note that you must have root privileges to do this.

You can also use timedatectl to set the date. The accepted format is YYYY-MM-DD, where YYYY represents the year, MM the month in two digits and DD the day in two digits. To change the date to 15 January 2019, use the following command

# timedatectl set-time 2019-01-15

Create custom date format

To create a custom date format, use a plus sign (+)

$ date +"Day : %d Month : %m Year : %Y"
Day : 05 Month : 12 Year : 2013

$ date +%D
12/05/13

The %D format follows the Month/Day/Year format (it is the same as %m/%d/%y). You can also put the day name if you want. Here are some examples:

$ date +"%a %b %d %y"
Fri Dec 06 13

$ date +"%A %B %d %Y"
Friday December 06 2013

$ date +"%A %B %d %Y %T"
Friday December 06 2013 00:30:37
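
The format specifiers are easy to mix up, so it can help to try them against a fixed timestamp rather than the current time. The sketch below assumes GNU date (for the -d option) and pins the locale and time zone so the results do not depend on the machine; fmt_fixed is a hypothetical helper name used only for illustration.

```shell
# fmt_fixed FORMAT: format one fixed timestamp (Friday, 6 December 2013,
# 00:30:37) with the given date(1) format string.
# Assumes GNU date; LC_ALL=C and TZ=UTC make the output reproducible.
fmt_fixed() {
    LC_ALL=C TZ=UTC date -d '2013-12-06 00:30:37' +"$1"
}

fmt_fixed '%D'                # month/day/two-digit year
fmt_fixed '%F'                # ISO 8601 date, year first
fmt_fixed '%a %b %d %y'       # abbreviated day and month names
fmt_fixed '%A %B %d %Y %T'    # full names plus 24-hour time
```

Because the timestamp is fixed, the same call can be repeated after changing a single specifier to see exactly what that specifier contributes.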

$ date +"%A %B-%d-%Y %r %Z"
Friday December-06-2013 12:30:37 AM WIB

List/Change time zone

Changing the time zone is crucial when you want to ensure that everything synchronizes with the Network Time Protocol. The first thing to do is to list all the available time zones using the list-timezones option; you can pipe the list through grep to narrow it down

# timedatectl list-timezones

The above command presents a scrollable list. The recommended time zone for servers is UTC, as it doesn't have daylight saving time. If you know the specific time zone, set it by name using the following command

# timedatectl set-timezone America/Los_Angeles

To display the time zone, execute

# timedatectl | grep "Time"

Set the Local-rtc

The real-time clock (RTC), which is also referred to as the hardware clock, is independent of the operating system and continues to run even when the server is shut down. To keep the RTC in universal time, use the following command

# timedatectl set-local-rtc 0

To keep the RTC in local time instead, use the following command

# timedatectl set-local-rtc 1

Check/Change CMOS Time

The battery-backed CMOS clock keeps running while the machine is off, as long as the CMOS battery is working correctly. Use the hwclock command to check the CMOS date as follows

# hwclock

To synchronize the CMOS date with the system date, use the following format

# hwclock --systohc

Having the correct time in your Linux environment is critical because many operations depend on it. Such operations include logging events and cron jobs. We hope you found this article useful.

#### [Dec 12, 2019] command line - Reattaching to an existing screen session - Ask Ubuntu

###### Jan 01, 2013 | askubuntu.com

JohnMerlino , 2013-06-01 01:39:54

I have a program running under screen. In fact, when I detach from the session and check netstat, I can see the program is still running (which is what I want):

udp 0 0 127.0.0.1:1720 0.0.0.0:* 3759/ruby

Now I want to reattach to the session running that process. So I start up a new terminal, and type screen -r

$ screen -r
There are several suitable screens on:
5169.pts-2.teamviggy    (05/31/2013 09:30:28 PM)    (Detached)
4872.pts-2.teamviggy    (05/31/2013 09:25:30 PM)    (Detached)
4572.pts-2.teamviggy    (05/31/2013 09:07:17 PM)    (Detached)
4073.pts-2.teamviggy    (05/31/2013 08:50:54 PM)    (Detached)
3600.pts-2.teamviggy    (05/31/2013 08:40:14 PM)    (Detached)
Type "screen [-d] -r [pid.]tty.host" to resume one of them.


But how do I know which one is the session running that process I created?

Now one of the documents I came across said:

"When you're using a window, type C-a A to give it a name. This name will be used in the window listing, and will help you remember what you're doing in each window when you start using a lot of windows."

The thing is when I am in a new screen session, I try to press control+a A and nothing happens.

Paul ,

There are two levels of "listings" involved here. First, you have the "window listing" within an individual session, which is what ctrl-A A is for, and second there is a "session listing" which is what you have pasted in your question and what can also be viewed with screen -ls .

You can customize the session names with the -S parameter, otherwise it uses your hostname (teamviggy), for example:

$ screen

(ctrl-A d to detach)

$ screen -S myprogramrunningunderscreen

(ctrl-A d to detach)

$ screen -ls
There are screens on:
4964.myprogramrunningunderscreen (05/31/2013 09:42:29 PM) (Detached)
4874.pts-1.creeper (05/31/2013 09:39:12 PM) (Detached)
2 Sockets in /var/run/screen/S-paul.

As a bonus, you can use an unambiguous abbreviation of the name you pass to -S later to reconnect:

screen -r myprog

(I am reconnected to the myprogramrunningunderscreen session)

njcwotx ,

I had a case where screen -r failed to reattach. Adding the -d flag so it looked like this

screen -d -r

worked for me. It detached the previous screen and allowed me to reattach. See the Man Page for more information.

Dr K ,

An easy way is to simply reconnect to an arbitrary screen with

screen -r

Then once you are running screen, you can get a list of all active screens by hitting Ctrl-A " (i.e. control-A followed by a double quote). Then you can just select the active screens one at a time and see what they are running. Naming the screens will, of course, make it easier to identify the right one.

Just my two cents

Lefty G Balogh ,

I tend to use the following combo where I need to work on several machines in several clusters:

screen -S clusterX

This creates the new screen session where I can build up the environment.

screen -dRR clusterX

This is what I use subsequently to reattach to that screen session. The nifty bits are that if the session is attached elsewhere, it detaches that other display. Moreover, if there is no session for some quirky reason, like someone rebooted my server without me knowing, it creates one. Finally, if multiple sessions exist, it uses the first one.

Much kudos to https://support.dvsus.com/hc/en-us/articles/212925186-Linux-GNU-Screen-instructions for this tip a while back.

EDIT: Also here are a few useful explanations from man screen on cryptic parameters

-d -r Reattach a session and if necessary detach it first.
-d -R Reattach a session and if necessary detach or even create it first.
-d -RR Reattach a session and if necessary detach or create it. Use the first session if more than one session is available.
-D -r Reattach a session. If necessary detach and logout remotely first.

there is more with -D so be sure to check man screen

tilnam , 2018-03-14 17:12:06

The output of screen -list is formatted like pid.tty.host . The pids can be used to get the first child process with pstree :

screen -list|cut -f1 -d'.'|cut -f2|xargs -n 1 pstree -p|grep "^screen"

You will get a list like this

screen(5169)---zsh(5170)---less(15268)
screen(4872)---zsh(4873)-+-cat(11364)
...

> ,

screen -d -r 4964

or

screen -d -r 4874

$ screen -ls
There are screens on:
4964.myprogramrunningunderscreen    (05/31/2013 09:42:29 PM)    (Detached)
4874.pts-1.creeper  (05/31/2013 09:39:12 PM)    (Detached)
2 Sockets in /var/run/screen/S-paul.
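
When many sessions are detached, the listing can be reduced to just the fields needed to pick one. What follows is a minimal sketch, assuming the `screen -ls` layout shown in this thread (pid.name, a date, then the state in parentheses); list_screen_sessions is a hypothetical helper, and other screen versions may format the listing differently.

```shell
# list_screen_sessions: read `screen -ls` output on stdin and print
# "PID NAME STATE" for every session line; header and footer are skipped.
list_screen_sessions() {
    awk '/\((Attached|Detached)\)/ {
        dot = index($1, ".")                 # first dot separates pid from name
        state = $NF
        gsub(/[()]/, "", state)              # strip the surrounding parentheses
        print substr($1, 1, dot - 1), substr($1, dot + 1), state
    }'
}

# Sample input copied from the listing in this thread; on a live system use:
#   screen -ls | list_screen_sessions
list_screen_sessions <<'EOF'
There are screens on:
        4964.myprogramrunningunderscreen    (05/31/2013 09:42:29 PM)    (Detached)
        4874.pts-1.creeper  (05/31/2013 09:39:12 PM)    (Detached)
2 Sockets in /var/run/screen/S-paul.
EOF
```

The first column of the result can be passed to `screen -d -r`, which is the same reattach command the answers above use.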


#### [Nov 09, 2019] chkservice Is A systemd Unit Manager With A Terminal User Interface

##### Looks like in version 0.3 the author increased the complexity by adding features which probably are not needed at all
###### Nov 07, 2019 | www.linuxuprising.com

chkservice, a terminal user interface (TUI) for managing systemd units, has been updated recently with window resize and search support.

chkservice is a simplistic systemd unit manager that uses ncurses for its terminal interface. Using it you can enable or disable, and start or stop a systemd unit. It also shows each unit's status (enabled, disabled, static or masked).

You can navigate the chkservice user interface using keyboard shortcuts:

• Up or k to move cursor up
• Down or j to move cursor down
• PgUp or b to move page up
• PgDown or f to move page down
To enable or disable a unit press Space , and to start or stop a unit press s . You can access the help screen which shows all available keys by pressing ? .

The command line tool had its first release in August 2017, with no new releases until a few days ago when version 0.2 was released, quickly followed by 0.3.

With the latest 0.3 release, chkservice adds a search feature that allows easily searching through all systemd units.

To search, type / followed by your search query, and press Enter . To search for the next item matching your search query you'll have to type / again, followed by Enter or Ctrl + m (without entering any search text).

Another addition to the latest chkservice is window resize support. In the 0.1 version, the tool would close when the user tried to resize the terminal window. That's no longer the case: chkservice now allows the terminal window it runs in to be resized.

And finally, the last addition to the latest chkservice 0.3 is G-g navigation support . Press G ( Shift + g ) to navigate to the bottom, and g to navigate to the top.

The initial (0.1) chkservice version can be found in the official repositories of a few Linux distributions, including Debian and Ubuntu (and Debian or Ubuntu based Linux distributions -- e.g. Linux Mint, Pop!_OS, Elementary OS and so on).

There are some third-party repositories available as well, including a Fedora Copr, Ubuntu / Linux Mint PPA, and Arch Linux AUR, but at the time I'm writing this, only the AUR package was updated to the latest chkservice version 0.3.

You may also install chkservice from source. Use the instructions provided in the tool's readme to either create a DEB package or install it directly.

#### [Nov 08, 2019] How to use cron in Linux by David Both

###### Nov 06, 2017 | opensource.com
No time for commands? Scheduling tasks with cron means programs can run but you don't have to stay up late.

I use two service utilities that allow me to run commands, programs, and tasks at predetermined times. The cron and at services enable sysadmins to schedule tasks to run at a specific time in the future. The at service specifies a one-time task that runs at a certain time. The cron service can schedule tasks on a repetitive basis, such as daily, weekly, or monthly.

In this article, I'll introduce the cron service and how to use it.

Common (and uncommon) cron uses

I use the cron service to schedule obvious things, such as regular backups that occur daily at 2 a.m. I also use it for less obvious things.

• The system times (i.e., the operating system time) on my many computers are set using the Network Time Protocol (NTP). While NTP sets the system time, it does not set the hardware time, which can drift. I use cron to set the hardware time based on the system time.
• I also have a Bash program I run early every morning that creates a new "message of the day" (MOTD) on each computer. It contains information, such as disk usage, that should be current in order to be useful.
• Many system processes and services, like Logwatch , logrotate , and Rootkit Hunter , use the cron service to schedule tasks and run programs every day.

The crond daemon is the background service that enables cron functionality.

The cron service checks for files in the /var/spool/cron and /etc/cron.d directories and the /etc/anacrontab file. The contents of these files define cron jobs that are to be run at various intervals. The individual user cron files are located in /var/spool/cron , and system services and applications generally add cron job files in the /etc/cron.d directory. The /etc/anacrontab is a special case that will be covered later in this article.

Using crontab

The cron utility runs based on commands specified in a cron table ( crontab ). Each user, including root, can have a cron file. These files don't exist by default, but can be created in the /var/spool/cron directory using the crontab -e command that's also used to edit a cron file (see the script below). I strongly recommend that you not edit these files directly with a standard editor (such as Vi, Vim, Emacs, Nano, or any of the many other editors that are available). Using the crontab command not only allows you to edit the table, it also installs the new version so that the crond daemon picks up the change when you save and exit the editor. The crontab command opens the editor named in the VISUAL or EDITOR environment variable and typically falls back to Vi, because Vi is always present (on even the most basic of installations).

New cron files are empty, so commands must be added from scratch. I added the job definition example below to my own cron files, just as a quick reference, so I know what the various parts of a command mean. Feel free to copy it for your own use.

# crontab -e
SHELL=/bin/bash
MAILTO=root@example.com
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin

# For details see man 4 crontabs

# Example of job definition:
# .---------------- minute (0 - 59)
# | .------------- hour (0 - 23)
# | | .---------- day of month (1 - 31)
# | | | .------- month (1 - 12) OR jan,feb,mar,apr ...
# | | | | .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# | | | | |
# * * * * * user-name command to be executed

# backup using the rsbu program to the internal 4TB HDD and then 4TB external
01 01 * * * /usr/local/bin/rsbu -vbd1 ; /usr/local/bin/rsbu -vbd2

# Set the hardware clock to keep it in sync with the more accurate system clock
03 05 * * * /sbin/hwclock --systohc

# Perform monthly updates on the first of the month
# 25 04 1 * * /usr/bin/dnf -y update

The crontab command is used to view or edit the cron files.

The first three lines in the code above set up a default environment. The environment must be set to whatever is necessary for a given user because cron provides only a minimal environment of its own. The SHELL variable specifies the shell to use when commands are executed. This example specifies the Bash shell. The MAILTO variable sets the email address where cron job results will be sent. These emails can provide the status of the cron job (backups, updates, etc.) and consist of the output you would see if you ran the program manually from the command line. The third line sets up the PATH for the environment. Even though the path is set here, I always prepend the fully qualified path to each executable.
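
A job that works in an interactive shell can fail under cron because of the much smaller environment. One way to catch this early is to run the command with nearly everything stripped away. This is a rough sketch only: run_like_cron is a hypothetical helper, and the variables and PATH value it sets are assumptions chosen for illustration, not the defaults of any particular cron implementation.

```shell
# run_like_cron COMMAND: run COMMAND through /bin/sh with an almost empty
# environment, roughly the way a cron job sees it.
# The values passed to env are illustrative assumptions; check the man page
# of your cron for the environment it really provides.
run_like_cron() {
    env -i HOME="${HOME:-/}" SHELL=/bin/sh PATH=/usr/bin:/bin /bin/sh -c "$1"
}

# Shows what a job would see instead of the interactive PATH.
run_like_cron 'echo "PATH seen by the job: $PATH"'
```

If a script only works outside this wrapper, it probably depends on a variable or PATH entry that should be set explicitly at the top of the crontab or inside the script itself.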

There are several comment lines in the example above that detail the syntax required to define a cron job. I'll break those commands down, then add a few more to show you some more advanced capabilities of crontab files.

01 01 * * * /usr/local/bin/rsbu -vbd1 ; /usr/local/bin/rsbu -vbd2


This line in my /etc/crontab runs a script that performs backups for my systems.

This line runs my self-written Bash shell script, rsbu , that backs up all my systems. This job kicks off at 1:01 a.m. (01 01) every day. The asterisks (*) in positions three, four, and five of the time specification are like file globs, or wildcards, for other time divisions; they specify "every day of the month," "every month," and "every day of the week." This line runs my backups twice; one backs up to an internal dedicated backup hard drive, and the other backs up to an external USB drive that I can take to the safe deposit box.
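
The split between the five time fields and the command can be made visible with a few lines of shell. This is only an illustrative sketch for entries in the five-field form used above (no username field); describe_cron_line is a hypothetical helper and does no validation.

```shell
# describe_cron_line LINE: label the five time fields of a crontab entry
# and print the remainder of the line as the command.
describe_cron_line() {
    printf '%s\n' "$1" | {
        # read splits on whitespace; the last variable receives the rest.
        read -r minute hour dom month dow command
        printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow"
        printf 'command=%s\n' "$command"
    }
}

describe_cron_line '01 01 * * * /usr/local/bin/rsbu -vbd1 ; /usr/local/bin/rsbu -vbd2'
```

Everything after the fifth field, including the semicolon and the second rsbu call, belongs to the command that cron hands to the shell.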

The following line sets the hardware clock on the computer using the system clock as the source of an accurate time. This line is set to run at 5:03 a.m. (03 05) every day.

03 05 * * * /sbin/hwclock --systohc


This line sets the hardware clock using the system time as the source.

I was using the third and final cron job (commented out) to perform a dnf or yum update at 04:25 a.m. on the first day of each month, but I commented it out so it no longer runs.

# 25 04 1 * * /usr/bin/dnf -y update


This line used to perform a monthly update, but I've commented it out.

Other scheduling tricks

Now let's do some things that are a little more interesting than these basics. Suppose you want to run a particular job every Thursday at 3 p.m.:

00 15 * * Thu /usr/local/bin/mycronjob.sh


This line runs mycronjob.sh every Thursday at 3 p.m.

Or, maybe you need to run quarterly reports after the end of each quarter. The cron service has no option for "The last day of the month," so instead you can use the first day of the following month, as shown below. (This assumes that the data needed for the reports will be ready when the job is set to run.)

02 03 1 1,4,7,10 * /usr/local/bin/reports.sh


This cron job runs quarterly reports on the first day of the month after a quarter ends.
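
If a job really has to run on the last day of a month, a common workaround (not the approach the article takes) is to schedule it on days 28 through 31 and let a small test decide whether tomorrow is the 1st. The sketch below assumes GNU date for the -d option; is_last_day_of_month is a hypothetical helper name.

```shell
# is_last_day_of_month [DATE]: succeed when DATE (YYYY-MM-DD, default: today)
# is the last day of its month, i.e. when the following day is the 1st.
# Assumes GNU date; TZ=UTC keeps the one-day step free of DST surprises.
is_last_day_of_month() {
    day=${1:-$(date +%F)}
    [ "$(TZ=UTC date -d "$day +1 day" +%d)" = "01" ]
}

for d in 2019-02-28 2020-02-28 2020-02-29 2019-12-31; do
    if is_last_day_of_month "$d"; then
        echo "$d: last day of the month"
    else
        echo "$d: not the last day"
    fi
done
```

A wrapper script built around this test could be scheduled with a day-of-month field of 28-31 and simply exit early on the days when the test fails.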

The following shows a job that runs one minute past every hour between 9:01 a.m. and 5:01 p.m.

01 09-17 * * * /usr/local/bin/hourlyreminder.sh


Sometimes you want to run jobs at regular times during normal business hours.

I have encountered situations where I need to run a job every two, three, or four hours. That can be accomplished by dividing the hours by the desired interval, such as */3 for every three hours, or 6-18/3 to run every three hours between 6 a.m. and 6 p.m. Other intervals can be divided similarly; for example, the expression */15 in the minutes position means "run the job every 15 minutes."

*/5 08-18/2 * * * /usr/local/bin/mycronjob.sh


This cron job runs every five minutes during every other hour from 8 a.m. through the 6 p.m. hour.

One thing to note: A step value selects every Nth value counting from the start of the range. That's why, in this example, the job is set to run every five minutes (08:00, 08:05, 08:10, etc.) during even-numbered hours from 8 a.m. to 6 p.m., but not during any odd-numbered hours. For example, the job will not run at all from 7 p.m. to 7:59 a.m.
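
To check which values a range or step expression such as 08-18/2 or */15 selects, the field can be expanded by hand. The function below is a simplified sketch, not cron's own parser: expand_cron_field is a hypothetical name, and comma lists and month or weekday names are deliberately left out.

```shell
# expand_cron_field FIELD MIN MAX
# Print the values selected by one simple cron field: "*", "N" or "A-B",
# optionally followed by "/STEP". MIN and MAX are the limits used for "*".
expand_cron_field() (
    field=$1 min=$2 max=$3 step=1
    range=${field%%/*}
    case $field in
        */*) step=${field##*/} ;;
    esac
    case $range in
        '*') lo=$min hi=$max ;;
        *-*) lo=${range%-*} hi=${range#*-} ;;
        *)   lo=$range hi=$range ;;
    esac
    # Drop one leading zero so that "08" is not read as an invalid octal number.
    lo=${lo#0}; lo=${lo:-0}
    hi=${hi#0}; hi=${hi:-0}
    [ "$step" -ge 1 ] || exit 1
    out=
    v=$lo
    # Steps are counted from the start of the range.
    while [ "$v" -le "$hi" ]; do
        out="$out${out:+ }$v"
        v=$((v + step))
    done
    echo "$out"
)

expand_cron_field '08-18/2' 0 23    # hours field of the example above
expand_cron_field '*/15' 0 59       # a minutes field
```

For the hours field of the example this prints only even hours from 8 through 18, which is why no runs happen during odd-numbered hours.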

I am sure you can come up with many other possibilities based on these examples.

Limiting cron access

Regular users with cron access could make mistakes that, for example, might cause system resources (such as memory and CPU time) to be swamped. To prevent possible misuse, the sysadmin can limit user access by creating a /etc/cron.allow file that contains a list of all users with permission to create cron jobs. The root user cannot be prevented from using cron.
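
The allow file is a plain list with one user name per line, so checking whether a name is on it takes a single grep. The sketch below only tests membership in a given file; it does not reproduce cron's full decision logic, and cron_allowed is a hypothetical helper name.

```shell
# cron_allowed USER [FILE]: succeed if USER appears on a line of its own
# in FILE (default: /etc/cron.allow). A missing or unreadable file counts
# as "not listed" in this sketch.
cron_allowed() {
    grep -qxF -- "$1" "${2:-/etc/cron.allow}" 2>/dev/null
}

# Demonstration against a throwaway file instead of the real /etc/cron.allow.
allow_file=$(mktemp)
printf '%s\n' root student > "$allow_file"
for user in student guest; do
    if cron_allowed "$user" "$allow_file"; then
        echo "$user: listed"
    else
        echo "$user: not listed"
    fi
done
rm -f "$allow_file"
```

The -x and -F options make grep match whole lines literally, so a user named "stud" is not mistaken for "student".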

When non-root users are prevented from creating their own cron jobs, it may be necessary for root to add their cron jobs to a root-owned crontab. "But wait!" you say. "Doesn't that run those jobs as root?" Not necessarily. In the first example in this article, the username field shown in the comments can be used to specify the user ID a job is to have when it runs. (This field is read in system crontabs such as /etc/crontab and the files in /etc/cron.d; per-user tables edited with crontab -e have no such field.) This prevents the specified non-root user's jobs from running as root. The following example shows a job definition that runs a job as the user "student":

04 07 * * * student /usr/local/bin/mycronjob.sh


In a table that has no username field, that is, a per-user crontab, the job is run as the user that owns the crontab file, root in this case.

cron.d

The directory /etc/cron.d is where some applications, such as SpamAssassin and sysstat , install cron files. Because there is no spamassassin or sysstat user, these programs need a place to locate cron files, so they are placed in /etc/cron.d .

The /etc/cron.d/sysstat file below contains cron jobs that relate to system activity reporting (SAR). These cron files have the same format as a user cron file.

# Run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 1 1
# Generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib64/sa/sa2 -A

The sysstat package installs the /etc/cron.d/sysstat cron file to run programs for SAR.

The sysstat cron file has two lines that perform tasks. The first line runs the sa1 program every 10 minutes to collect data stored in special binary files in the /var/log/sa directory. Then, every night at 23:53, the sa2 program runs to create a daily summary.

Scheduling tips

Some of the times I set in the crontab files seem rather random -- and to some extent they are. Trying to schedule cron jobs can be challenging, especially as the number of jobs increases. I usually have only a few tasks to schedule on each of my computers, which is simpler than in some of the production and lab environments where I have worked.

One system I administered had around a dozen cron jobs that ran every night and an additional three or four that ran on weekends or the first of the month. That was a challenge, because if too many jobs ran at the same time -- especially the backups and compiles -- the system would run out of RAM and nearly fill the swap file, which resulted in system thrashing while performance tanked, so nothing got done. We added more memory and improved how we scheduled tasks. We also removed a task that was very poorly written and used large amounts of memory.

The crond service assumes that the host computer runs all the time. That means that if the computer is turned off during a period when cron jobs were scheduled to run, they will not run until the next time they are scheduled. This might cause problems if they are critical cron jobs. Fortunately, there is another option for running jobs at regular intervals: anacron .

anacron

The anacron program performs the same function as crond, but it adds the ability to run jobs that were skipped, such as if the computer was off or otherwise unable to run the job for one or more cycles. This is very useful for laptops and other computers that are turned off or put into sleep mode.

As soon as the computer is turned on and booted, anacron checks to see whether configured jobs missed their last scheduled run. If they have, those jobs run immediately, but only once (no matter how many cycles have been missed). For example, if a weekly job was not run for three weeks because the system was shut down while you were on vacation, it would be run soon after you turn the computer on, but only once, not three times.
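
The catch-up rule described above can be modeled in a few lines of shell. This is only a simplified sketch, not anacron's actual implementation: the function name `runs_due` and the use of epoch seconds as input are assumptions made for illustration.

```shell
#!/bin/sh
# Simplified model of the "run missed jobs once" rule (not anacron's real code).
# Arguments: last run (epoch seconds), now (epoch seconds), period in days.
# Prints how many times the job should run now: 0 or 1, never more.
runs_due() {
    last=$1 now=$2 period_days=$3
    elapsed_days=$(( (now - last) / 86400 ))
    if [ "$elapsed_days" -ge "$period_days" ]; then
        echo 1      # overdue by one cycle or by many: still a single run
    else
        echo 0      # not due yet
    fi
}

# A weekly job whose last run was 21 days ago.
runs_due 0 $((21 * 86400)) 7
# A weekly job whose last run was 3 days ago.
runs_due 0 $((3 * 86400)) 7
```

The point of the sketch is that the number of missed cycles never enters the result; only whether at least one full period has elapsed.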

The anacron program provides some easy options for running regularly scheduled tasks. Just install your scripts in the /etc/cron.[hourly|daily|weekly|monthly] directories, depending how frequently they need to be run.

How does this work? The sequence is simpler than it first appears.

1. The crond service runs the cron job specified in /etc/cron.d/0hourly .
# Run the hourly jobs
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
01 * * * * root run-parts /etc/cron.hourly

The contents of /etc/cron.d/0hourly cause the shell scripts located in /etc/cron.hourly to run.

1. The cron job specified in /etc/cron.d/0hourly runs the run-parts program once per hour.
2. The run-parts program runs all the scripts located in the /etc/cron.hourly directory.
3. The /etc/cron.hourly directory contains the 0anacron script, which runs the anacron program using the /etc/anacrontab configuration file shown here.
# /etc/anacrontab: configuration file for anacron

# See anacron(8) and anacrontab(5) for details.

SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
RANDOM_DELAY=45
# the jobs will be started during the following hours only
START_HOURS_RANGE=3-22

#period in days   delay in minutes   job-identifier   command
1         5    cron.daily     nice run-parts /etc/cron.daily
7         25   cron.weekly    nice run-parts /etc/cron.weekly
@monthly  45   cron.monthly   nice run-parts /etc/cron.monthly

The contents of the /etc/anacrontab file run the executable files in the cron.[daily|weekly|monthly] directories at the appropriate times.

1. The anacron program runs the programs located in /etc/cron.daily once per day; it runs the jobs located in /etc/cron.weekly once per week, and the jobs in cron.monthly once per month. Note the specified delay times in each line that help prevent these jobs from overlapping themselves and other cron jobs.
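
Each job line in the anacrontab file has four fields: period, delay, job identifier, and command. As a rough illustration of how such a line breaks down, here is a small sketch that splits one into its parts. The function name `parse_anacrontab_line` is hypothetical, and this is not how anacron itself reads the file.

```shell
#!/bin/sh
# Hypothetical helper: split one anacrontab job line into its four fields.
# The last variable given to read receives the rest of the line,
# so the command keeps its own spaces.
parse_anacrontab_line() {
    read -r period delay job_id command <<EOF
$1
EOF
    echo "period=$period delay=$delay id=$job_id command=$command"
}

parse_anacrontab_line "7 25 cron.weekly nice run-parts /etc/cron.weekly"
```

Read this way, the weekly entry means: every 7 days, wait 25 minutes (plus any random delay), then run everything in /etc/cron.weekly at reduced priority.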

Instead of placing complete Bash programs in the cron.X directories, I install them in the /usr/local/bin directory, which allows me to run them easily from the command line. Then I add a symlink in the appropriate cron directory, such as /etc/cron.daily .
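
A minimal sketch of that symlink approach follows. The helper name `link_cron_job` and the file name `mybackup.sh` are made up for illustration, and the demonstration runs in a scratch directory rather than the real /usr/local/bin and /etc/cron.daily. Dropping the extension from the link name is an assumption on my part: some run-parts implementations skip file names that contain dots.

```shell
#!/bin/sh
# Hypothetical helper: link a script kept elsewhere into a cron.X directory.
# Usage: link_cron_job /usr/local/bin/myjob.sh /etc/cron.daily
link_cron_job() {
    script=$1 crondir=$2
    [ -f "$script" ] || { echo "no such script: $script" >&2; return 1; }
    name=$(basename "$script")
    # Link name without extension, e.g. mybackup.sh -> mybackup
    ln -sf "$script" "$crondir/${name%%.*}"
}

# Demonstration in a scratch directory instead of the real system paths.
demo=$(mktemp -d)
mkdir -p "$demo/bin" "$demo/cron.daily"
printf '#!/bin/sh\necho backup done\n' > "$demo/bin/mybackup.sh"
chmod +x "$demo/bin/mybackup.sh"    # cron only runs executable files

link_cron_job "$demo/bin/mybackup.sh" "$demo/cron.daily"
readlink "$demo/cron.daily/mybackup"    # prints the path of the real script
```

The script stays in one place where it can be run by hand, and removing the job later is just a matter of deleting the link.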

The anacron program is not designed to run programs at specific times. Rather, it is intended to run programs at intervals that begin at the specified times, such as 3 a.m. (see the START_HOURS_RANGE line in the script just above) of each day, on Sunday (to begin the week), and on the first day of the month. If any one or more cycles are missed, anacron will run the missed jobs once, as soon as possible.

More on setting limits

I use most of these methods for scheduling tasks to run on my computers. All those tasks are ones that need to run with root privileges. It's rare in my experience that regular users really need a cron job. One case was a developer user who needed a cron job to kick off a daily compile in a development lab.

It is important to restrict access to cron functions by non-root users. However, there are circumstances when a user needs to set a task to run at pre-specified times, and cron can allow them to do that. Many users do not understand how to properly configure these tasks using cron and they make mistakes. Those mistakes may be harmless, but, more often than not, they can cause problems. By setting functional policies that cause users to interact with the sysadmin, individual cron jobs are much less likely to interfere with other users and other system functions.

It is possible to set limits on the total resources that can be allocated to individual users or groups, but that is an article for another time.

For more information, the man pages for cron , crontab , anacron , anacrontab , and run-parts all have excellent information and descriptions of how the cron system works.

Ben Cotton on 06 Nov 2017 Permalink

One problem I used to have in an old job was cron jobs that would hang for some reason. This old sysadvent post had some good suggestions for how to deal with that: http://sysadvent.blogspot.com/2009/12/cron-practices.html

Jesper Larsen on 06 Nov 2017 Permalink

Cron is definitely a good tool. But if you need to do more advanced scheduling then Apache Airflow is great for this.

Airflow has a number of advantages over Cron. The most important are: Dependencies (let tasks run after other tasks), nice web based overview, automatic failure recovery and a centralized scheduler. The disadvantages are that you will need to setup the scheduler and some other centralized components on one server and a worker on each machine you want to run stuff on.

You definitely want to use Cron for some stuff. But if you find that Cron is too limited for your use case I would recommend looking into Airflow.

Leslle Satenstein on 13 Nov 2017 Permalink

Hi David,
you have a well done article. Much appreciated. I make use of the @reboot crontab entry. With crontab and root. I run the following.

@reboot /bin/dofstrim.sh

I wanted to run fstrim for my SSD drive once and only once per week.
dofstrim.sh is a script that runs the "fstrim" program once per week, irrespective of the number of times the system is rebooted. I happen to have several Linux systems sharing one computer, and each system has a root crontab with that entry. Since I may hop from Linux to Linux in the day or several times per week, my dofstrim.sh only runs fstrim once per week, irrespective which Linux system I boot. I make use of a common partition to all Linux systems, a partition mounted as "/scratch" and the wonderful Linux command line "date" program.

The dofstrim.sh listing follows below.

#!/bin/bash
# run fstrim either once/week or once/day not once for every reboot
#
# Use the date function to extract today's day number or week number
# the day number range is 1..366, weekno is 1 to 53
#WEEKLY=0 #once per day
WEEKLY=1 #once per week
lockdir='/scratch/lock/'

if [[ $WEEKLY -eq 1 ]]; then
    dayno="$lockdir/dofstrim.weekno"
    today=$(date +%V)
else
    dayno="$lockdir/dofstrim.dayno"
    today=$(date +%j)
fi

prevval="000"

if [ -f "$dayno" ]; then
    prevval=$(cat "$dayno")
    if [ -z "$prevval" ]; then
        prevval="000"
    fi
else
    mkdir -p "$lockdir"
fi

if [ "$prevval" -ne "$today" ]; then
    /sbin/fstrim -a
    echo "$today" > "$dayno"
fi

I had thought to use anacron, but then fstrim would be run frequently as each linux's anacron would have a similar entry.

The "date" program produces a day number or a week number, depending upon the +%V or +%j

Leslle Satenstein on 13 Nov 2017 Permalink

Running a report on the last day of the month is easy if you use the date program. Use the date function from Linux as shown

*/9 15 28-31 * * [ $(date -d +'1 day' +\%d) -eq 1 ] && echo "Tomorrow is the first of month Today(now) is $(date)" >> /root/message

Once per day from the 28th to the 31st, the date function is executed. If the result of date +1day is the first of the month, today must be the last day of the month.

sgtrock on 14 Nov 2017 Permalink

Why not use crontab to launch something like Ansible playbooks instead of simple bash scripts? A lot easier to troubleshoot and manage these days. :-)

#### [Oct 25, 2019] Get inode number of a file on linux - Fibrevillage

###### Oct 25, 2019 | www.fibrevillage.com

Get inode number of a file on linux

An inode is a data structure in UNIX operating systems that contains important information pertaining to files within a file system. When a file system is created in UNIX, a set amount of inodes is created, as well. Usually, about 1 percent of the total file system disk space is allocated to the inode table.

How do we find a file's inode ?

ls -i Command: display inode

$ ls -i /etc/bashrc
131094 /etc/bashrc
131094 is the inode of /etc/bashrc.

Stat Command: display Inode

$ stat /etc/bashrc
  File: `/etc/bashrc'
  Size: 1386 Blocks: 8 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 131094 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2013-12-10 10:01:29.509908811 -0800
Modify: 2013-06-06 11:31:51.792356252 -0700
Change: 2013-06-06 11:31:51.792356252 -0700

find command: display inode

$ find ./ -iname sysfs_fc_tools.tar -printf '%p %i\n'
./sysfs_fc_tools.tar 28311964


Notes :

    %p stands for file path
%i stands for inode number

tree command: display inode under a directory
# tree -a -L 1 --inodes /etc
/etc
├── [ 132896]  a2ps
├── [ 132898]  a2ps.cfg
├── [ 132897]  a2ps-site.cfg
├── [ 133315]  acpi
...

Use case for inodes

find / -inum XXXXXX -print finds the full path of each file pointing to inode XXXXXX.

Although you could extend this example to rm the files it finds, I discourage doing so: there are security concerns with running rm from find, and on another file system the same inode number refers to a completely different file.
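
As a hedged illustration of that use case, the sketch below creates a file with a second hard link in a scratch directory and then lists every name that points at its inode. The helper name `paths_for_inode` is hypothetical. The `-xdev` option is added so that find stays on one filesystem, which addresses the concern that inode numbers repeat across filesystems.

```shell
#!/bin/sh
# Sketch: list every path under DIR, on DIR's own filesystem, that points at INODE.
# Usage: paths_for_inode DIR INODE
paths_for_inode() {
    # -xdev stops find from descending into other mounted filesystems
    find "$1" -xdev -inum "$2" -print
}

demo=$(mktemp -d)
echo data > "$demo/original"
ln "$demo/original" "$demo/hardlink"      # second name for the same inode

inode=$(ls -i "$demo/original" | awk '{print $1}')
paths_for_inode "$demo" "$inode" | sort   # lists both names
```

Searching from a specific directory instead of / also keeps the search fast and limits it to the files you actually care about.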

filesystem repair

If you have bad luck with your filesystem, most of the time you can run fsck to fix it. It helps to have the filesystem's inode information at hand.
This is another big topic; I'll cover it in another article.

#### [Oct 25, 2019] Howto Delete files by inode number by Erik

###### Feb 10, 2011 | erikimh.com
linux administration - tips, notes and projects

Ever mistakenly pipe output to a file with special characters that you couldn't remove?

-rw-r--r-- 1 eriks eriks 4 2011-02-10 22:37 --fooface

Good luck. Any command you pass this name to will interpret it as an option rather than a file, unless you take the extra step that rm's own error message below suggests. You do, however, have an inode for every file.

[eriks@jaded: ~]$ rm -f --fooface
rm: unrecognized option '--fooface'
Try `rm ./--fooface' to remove the file `--fooface'.
Try `rm --help' for more information.
[eriks@jaded: ~]$ rm -f '--fooface'
rm: unrecognized option '--fooface'
Try `rm ./--fooface' to remove the file `--fooface'.
Try `rm --help' for more information.

So now what, do you live forever with this annoyance of a file sitting inside your filesystem, never to be removed or touched again? Nah.

We can remove a file, simply by an inode number, but first we must find out the file inode number:

$ ls -il | grep foo

Output:

[eriks@jaded: ~]$ ls -il | grep foo
508160 drwxr-xr-x 3 eriks eriks 4096 2010-10-27 18:13 foo3
500724 -rw-r--r-- 1 eriks eriks 4 2011-02-10 22:37 --fooface
589907 drwxr-xr-x 2 eriks eriks 4096 2010-11-22 18:52 tempfoo
589905 drwxr-xr-x 2 eriks eriks 4096 2010-11-22 18:48 tmpfoo

The number you see prior to the file permission set is actually the inode # of the file itself.

Hint: 500724 is the inode number we want removed.

Now use find command to delete file by inode:

# find . -inum 500724 -exec rm -i {} \;

There she is.

[eriks@jaded: ~]$ find . -inum 500724 -exec rm -i {} \;
rm: remove regular file `./--fooface'? y

#### [Oct 25, 2019] unix - Remove a file on Linux using the inode number - Super User

###### Oct 25, 2019 | superuser.com

, Some other methods include:

escaping the special chars:

[~]$ rm \"la\*


use the find command and only search the current directory. The find command can search for inode numbers, and has a handy -delete switch:

[~]$ ls -i
7404301 "la*

[~]$ find . -maxdepth 1 -type f -inum 7404301
./"la*

[~]$ find . -maxdepth 1 -type f -inum 7404301 -delete
[~]$ ls -i
[~]$

, Maybe I'm missing something, but...

rm '"la*'

Anyways, filenames don't have inodes, files do. Trying to remove a file without removing all filenames that point to it will damage your filesystem.

#### [Oct 25, 2019] Linux - Unix Find Inode Of a File Command

###### Jun 21, 2012 | www.cyberciti.biz

... ... ..

stat Command: Display Inode

You can also use the stat command as follows:

$ stat fileName-Here
$ stat /etc/passwd

Sample outputs:

  File: `/etc/passwd'
  Size: 1644 Blocks: 8 IO Block: 4096 regular file
Device: fe01h/65025d Inode: 25766495 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-05-05 16:29:42.000000000 +0530
Modify: 2012-05-05 16:29:20.000000000 +0530
Change: 2012-05-05 16:29:21.000000000 +0530

#### [Oct 23, 2019] How To Record Everything You Do In Terminal - OSTechNix

###### Oct 23, 2019 | www.ostechnix.com

Run the following command to start the Terminal session recording.

$ script -a my_terminal_activities


Here, the -a flag appends the output to the file (or to typescript), retaining the prior contents. The above command records everything you do in the Terminal and appends the output to a file called 'my_terminal_activities' in your current working directory.

Sample output would be:

Script started, file is my_terminal_activities


Now, run some random Linux commands in your Terminal.

$ mkdir ostechnix
$ cd ostechnix/
$ touch hello_world.txt
$ cd ..
$ uname -r

After running all commands, end the 'script' command's session using command:

$ exit


After typing exit, you will see the following output.

exit
Script done, file is my_terminal_activities


As you see, the Terminal activities have been stored in a file called 'my_terminal_activities' in the current working directory.

You can also save the Terminal activities in a file in different location like below.

$ script -a /home/ostechnix/documents/myscripts.txt

All commands will be stored in /home/ostechnix/documents/myscripts.txt file.

To view your Terminal activities, just open this file in any text editor or simply display it using the 'cat' command.

$ cat my_terminal_activities


Sample output:

Script started on 2019-10-22 12:07:37+0530
sk@ostechnix:~$ mkdir ostechnix
sk@ostechnix:~$ cd ostechnix/
sk@ostechnix:~/ostechnix$ touch hello_world.txt
sk@ostechnix:~/ostechnix$ cd ..
sk@ostechnix:~$ uname -r
5.0.0-31-generic
sk@ostechnix:~$ exit
exit

Script done on 2019-10-22 12:08:10+0530


As you see in the above output, the script command has recorded all my Terminal activities, including the start and end time of the script command. Awesome, isn't it? The reason to use the script command is that it records not just the commands but also their output. To put it simply, the script command records everything you do in the Terminal.

Bonus tip:

As one of our readers, Mr. Alastair Montgomery, mentioned in the comment section, we could set up an alias which would timestamp the recorded sessions.

Create an alias for the script command like below.

$ alias rec='script -aq ~/term.log-$(date "+%Y%m%d-%H-%M")'


Now simply enter the following command to start recording the Terminal.

$ rec

Now, all your Terminal activities will be logged in a text file with timestamp, for example term.log-20191022-12-16 .

#### [Oct 09, 2019] The gzip Recovery Toolkit

###### Oct 09, 2019 | www.aaronrenn.com

So you thought you had your files backed up - until it came time to restore. Then you found out that you had bad sectors and you've lost almost everything because gzip craps out 10% of the way through your archive. The gzip Recovery Toolkit has a program - gzrecover - that attempts to skip over bad data in a gzip archive. This saved me from exactly the above situation. Hopefully it will help you as well.

I'm very eager for feedback on this program . If you download and try it, I'd appreciate an email letting me know what your results were. My email is arenn@urbanophile.com . Thanks.

ATTENTION

99% of "corrupted" gzip archives are caused by transferring the file via FTP in ASCII mode instead of binary mode. Please re-transfer the file in the correct mode first before attempting to recover from a file you believe is corrupted.

Disclaimer and Warning

This program is provided AS IS with absolutely NO WARRANTY. It is not guaranteed to recover anything from your file, nor is what it does recover guaranteed to be good data. The bigger your file, the more likely that something will be extracted from it. Also keep in mind that this program gets faked out and is likely to "recover" some bad data. Everything should be manually verified.

Downloading and Installing

Note that version 0.8 contains major bug fixes and improvements. See the ChangeLog for details. Upgrading is recommended. The old version is provided in the event you run into troubles with the new release.

You need the following packages:

First, build and install zlib if necessary. Next, unpack the gzrt sources. Then cd to the gzrt directory and build the gzrecover program by typing make . Install manually by copying to the directory of your choice.

Usage

Run gzrecover on a corrupted .gz file.
If you leave the filename blank, gzrecover will read from the standard input. Anything that can be read from the file will be written to a file with the same name, but with a .recovered appended (any .gz is stripped). You can override this with the -o option. The default filename when reading from the standard input is "stdin.recovered". To write recovered data to the standard output, use the -p option. (Note that -p and -o cannot be used together).

To get a verbose readout of exactly where gzrecover is finding bad bytes, use the -v option to enable verbose mode. This will probably overflow your screen with text so best to redirect the stderr stream to a file. Once gzrecover has finished, you will need to manually verify any data recovered as it is quite likely that our output file is corrupt and has some garbage data in it. Note that gzrecover will take longer than regular gunzip. The more corrupt your data the longer it takes. If your archive is a tarball, read on.

For tarballs, the tar program will choke because GNU tar cannot handle errors in the file format. Fortunately, GNU cpio (tested at version 2.6 or higher) handles corrupted files out of the box.

Here's an example:

$ ls *.gz
my-corrupted-backup.tar.gz
$ gzrecover my-corrupted-backup.tar.gz
$ ls *.recovered
my-corrupted-backup.tar.recovered
$ cpio -F my-corrupted-backup.tar.recovered -i -v

Note that newer versions of cpio can spew voluminous error messages to your terminal. You may want to redirect the stderr stream to /dev/null. Also, cpio might take quite a long while to run.

Copyright

The gzip Recovery Toolkit v0.8 Copyright (c) 2002-2013 Aaron M. Renn ( arenn@urbanophile.com )

The gzrecover program is licensed under the GNU General Public License .

#### [Oct 09, 2019] gzip - How can I recover files from a corrupted .tar.gz archive - Stack Overflow

###### Oct 09, 2019 | stackoverflow.com

George ,Jun 24, 2016 at 2:49

Are you sure that it is a gzip file? I would first run 'file SMS.tar.gz' to validate that.

Then I would read the The gzip Recovery Toolkit page.

JohnEye ,Oct 4, 2016 at 11:27

Recovery is possible but it depends on what caused the corruption.

If the file is just truncated, getting some partial result out is not too hard; just run

gunzip < SMS.tar.gz > SMS.tar.partial

which will give some output despite the error at the end.

If the compressed file has large missing blocks, it's basically hopeless after the bad block.

If the compressed file is systematically corrupted in small ways (e.g. transferring the binary file in ASCII mode, which smashes carriage returns and newlines throughout the file), it is possible to recover but requires quite a bit of custom programming, it's really only worth it if you have absolutely no other recourse (no backups) and the data is worth a lot of effort. (I have done it successfully.) I mentioned this scenario in a previous question .

The answers for .zip files differ somewhat, since zip archives have multiple separately-compressed members, so there's more hope (though most commercial tools are rather bogus, they eliminate warnings by patching CRCs, not by recovering good data). But your question was about a .tar.gz file, which is an archive with one big member.

, Here is one possible scenario that we encountered.
We had a tar.gz file that would not decompress; trying to unzip gave the error:

gzip -d A.tar.gz
gzip: A.tar.gz: invalid compressed data--format violated

I figured out that the file may have been originally uploaded over a non binary ftp connection (we don't know for sure).

The solution was relatively simple using the unix dos2unix utility

dos2unix A.tar.gz
dos2unix: converting file A.tar.gz to UNIX format ...
tar -xvf A.tar
file1.txt
file2.txt
....etc.

It worked!

This is one slim possibility, and maybe worth a try - it may help somebody out there.

#### [Sep 16, 2019] Artistic Style - Index

###### Sep 16, 2019 | astyle.sourceforge.net

Artistic Style 3.1

A Free, Fast, and Small Automatic Formatter for C, C++, C++/CLI, Objective‑C, C#, and Java Source Code

Project Page: http://astyle.sourceforge.net/
SourceForge: http://sourceforge.net/projects/astyle/

Artistic Style is a source code indenter, formatter, and beautifier for the C, C++, C++/CLI, Objective‑C, C# and Java programming languages.

When indenting source code, we as programmers have a tendency to use both spaces and tab characters to create the wanted indentation. Moreover, some editors by default insert spaces instead of tabs when pressing the tab key. Other editors (Emacs for example) have the ability to "pretty up" lines by automatically setting up the white space before the code on the line, possibly inserting spaces in code that up to now used only tabs for indentation.

The NUMBER of spaces for each tab character in the source code can change between editors (unless the user sets up the number to his liking...). One of the standard problems programmers face when moving from one editor to another is that code containing both spaces and tabs, which was perfectly indented, suddenly becomes a mess to look at. Even if you as a programmer take care to ONLY use spaces or tabs, looking at other people's source code can still be problematic.
To address this problem, Artistic Style was created – a filter written in C++ that automatically re-indents and re-formats C / C++ / Objective‑C / C++/CLI / C# / Java source files. It can be used from a command line, or it can be incorporated as a library in another program.

#### [Sep 16, 2019] Usage -- PrettyPrinter 0.18.0 documentation

###### Sep 16, 2019 | prettyprinter.readthedocs.io

Usage

Install the package with pip :

pip install prettyprinter

Then, instead of

from pprint import pprint

do

from prettyprinter import cpprint

for colored output. For colorless output, remove the c prefix from the function name:

from prettyprinter import pprint

#### [Sep 16, 2019] JavaScript code prettifier

###### Sep 16, 2019 | github.com

Announcement: Action required rawgit.com is going away .

An embeddable script that makes source-code snippets in HTML prettier.

• Works on HTML pages.
• Works even if code contains embedded links, line numbers, etc.
• Simple API: include some JS & CSS and add an onload handler.
• Lightweight: small download and does not block page from loading while running.
• Customizable styles via CSS. See the themes gallery .
• Supports all C-like, Bash-like, and XML-like languages. No need to specify the language.
• Extensible language handlers for other languages. You can specify the language.
• Widely used with good cross-browser support. Powers https://code.google.com/ and http://stackoverflow.com/

#### [Sep 16, 2019] Pretty-print for shell script

###### Sep 16, 2019 | stackoverflow.com

Benoit ,Oct 21, 2010 at 13:19

I'm looking for something similar to indent but for (bash) scripts. Console only, no colorizing, etc.

Do you know of one ?

Jamie ,Sep 11, 2012 at 3:00

Vim can indent bash scripts. But not reformat them before indenting.
Backup your bash script, open it with vim, type gg=GZZ and indent will be corrected. (Note for the impatient: this overwrites the file, so be sure to do that backup!)
Though, some bugs with << (expecting EOF as first character on a line) e.g.

EDIT: ZZ not ZQ

Daniel Martí ,Apr 8, 2018 at 13:52

A bit late to the party, but it looks like shfmt could do the trick for you.

Brian Chrisman ,Sep 9 at 7:47

In bash I do this:

reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3 | sed -e "s/^\s\s\s\s//"
}


this eliminates comments and reindents the script "bash way".

If you have HEREDOCS in your script, they got ruined by the sed in the previous function.

So use:

reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}") declare -f Zibri|head --lines=-1|tail --lines=+3" }  But all your script will have a 4 spaces indentation. Or you can do: reindent () { rstr=$(mktemp -u "XXXXXXXXXX");
source <(echo "Zibri () {";cat "$1"|sed -e "s/^\s\s\s\s/$rstr/"; echo "}");
echo '#!/bin/bash';
declare -f Zibri | head --lines=-1 | tail --lines=+3 | sed -e "s/^\s\s\s\s//;s/rstr/ /" }  which takes care also of heredocs. > , Found this http://www.linux-kheops.com/doc/perl/perl-aubert/fmt.script . Very nice, only one thing i took out is the [...]->test substitution. #### [Sep 16, 2019] A command-line HTML pretty-printer Making messy HTML readable - Stack Overflow ##### Notable quotes: ##### "... Have a look at the HTML Tidy Project: http://www.html-tidy.org/ ..." ###### Sep 16, 2019 | stackoverflow.com nisetama ,Aug 12 at 10:33 I'm looking for recommendations for HTML pretty printers which fulfill the following requirements: • Takes HTML as input, and then output a nicely formatted/correctly indented but "graphically equivalent" version of the given input HTML. • Must support command-line operation. • Must be open-source and run under Linux. > , Have a look at the HTML Tidy Project: http://www.html-tidy.org/ The granddaddy of HTML tools, with support for modern standards. There used to be a fork called tidy-html5 which since became the official thing. Here is its GitHub repository . Tidy is a console application for Mac OS X, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards. For your needs, here is the command line to call Tidy: #### [Sep 12, 2019] 9 Best File Comparison and Difference (Diff) Tools for Linux ###### Sep 12, 2019 | www.tecmint.com 3. Kompare Kompare is a diff GUI wrapper that allows users to view differences between files and also merge them. Some of its features include: 1. Supports multiple diff formats 2. Supports comparison of directories 3. Supports reading diff files 4. Customizable interface 5. 
Creating and applying patches to source files <img aria-describedby="caption-attachment-21311" src="https://www.tecmint.com/wp-content/uploads/2016/07/Kompare-Two-Files-in-Linux.png" alt="Kompare Tool - Compare Two Files in Linux" width="1097" height="701" srcset="https://www.tecmint.com/wp-content/uploads/2016/07/Kompare-Two-Files-in-Linux.png 1097w, https://www.tecmint.com/wp-content/uploads/2016/07/Kompare-Two-Files-in-Linux-768x491.png 768w" sizes="(max-width: 1097px) 100vw, 1097px" /> Kompare Tool – Compare Two Files in Linux Visit Homepage : https://www.kde.org/applications/development/kompare/ 4. DiffMerge DiffMerge is a cross-platform GUI application for comparing and merging files. It has two functionality engines, the Diff engine which shows the difference between two files, which supports intra-line highlighting and editing and a Merge engine which outputs the changed lines between three files. It has got the following features: 1. Supports directory comparison 2. File browser integration 3. Highly configurable <img aria-describedby="caption-attachment-21312" src="https://www.tecmint.com/wp-content/uploads/2016/07/DiffMerge-Compare-Files-in-Linux.png" alt="DiffMerge - Compare Files in Linux" width="1078" height="700" srcset="https://www.tecmint.com/wp-content/uploads/2016/07/DiffMerge-Compare-Files-in-Linux.png 1078w, https://www.tecmint.com/wp-content/uploads/2016/07/DiffMerge-Compare-Files-in-Linux-768x499.png 768w" sizes="(max-width: 1078px) 100vw, 1078px" /> DiffMerge – Compare Files in Linux Visit Homepage : https://sourcegear.com/diffmerge/ 5. Meld – Diff Tool Meld is a lightweight GUI diff and merge tool. It enables users to compare files, directories plus version controlled programs. Built specifically for developers, it comes with the following features: 1. Two-way and three-way comparison of files and directories 2. Update of file comparison as a users types more words 3. 
Makes merges easier using auto-merge mode and actions on changed blocks
4. Easy comparisons using visualizations
5. Supports Git, Mercurial, Subversion, Bazaar plus many more

Figure: Meld – A Diff Tool to Compare Files in Linux

Visit Homepage : http://meldmerge.org/

6. Diffuse – GUI Diff Tool

Diffuse is another popular, free, small and simple GUI diff and merge tool that you can use on Linux. Written in Python, it offers two major functionalities: file comparison and version control. It allows file editing, merging of files and output of the differences between files.

You can view a comparison summary, select lines of text in files using a mouse pointer, match lines in adjacent files and edit different files. Other features include:

1. Syntax highlighting
2. Keyboard shortcuts for easy navigation
3. Supports unlimited undo
4. Unicode support
5. Supports Git, CVS, Darcs, Mercurial, RCS, Subversion, SVK and Monotone

Figure: DiffUse – A Tool to Compare Text Files in Linux

Visit Homepage : http://diffuse.sourceforge.net/

7. XXdiff – Diff and Merge Tool

XXdiff is a free, powerful file and directory comparator and merge tool that runs on Unix-like operating systems such as Linux, Solaris, HP/UX, IRIX and DEC Tru64. One limitation of XXdiff is its lack of support for Unicode files and inline editing of diff files.

It has the following list of features:

1. Shallow and recursive comparison of two or three files, or two directories
2. Horizontal difference highlighting
3. Interactive merging of files and saving of resulting output
4. Supports merge reviews/policing
5. Supports external diff tools such as GNU diff, SGI diff, ClearCase cleardiff and many more
6. Extensible using scripts
7. Fully customizable using resource file plus many other minor features

Figure: xxdiff Tool

Visit Homepage : http://furius.ca/xxdiff/

8. KDiff3 – Diff and Merge Tool

KDiff3 is yet another cool, cross-platform diff and merge tool made from KDevelop . It works on all Unix-like platforms including Linux, as well as Mac OS X and Windows.

It can compare or merge two to three files or directories and has the following notable features:

1. Indicates differences line by line and character by character
2. Supports auto-merge
3. In-built editor to deal with merge-conflicts
4. Supports Unicode, UTF-8 and many other codecs
5. Allows printing of differences
6. Windows explorer integration support
7. Also supports auto-detection via byte-order-mark "BOM"
8. Supports manual alignment of lines
9. Intuitive GUI and many more

Figure: KDiff3 Tool for Linux

Visit Homepage : http://kdiff3.sourceforge.net/

9. TkDiff

TkDiff is also a cross-platform, easy-to-use GUI wrapper for the Unix diff tool. It provides a side-by-side view of the differences between two input files. It can run on Linux, Windows and Mac OS X.

Additionally, it has some other exciting features including diff bookmarks, a graphical map of differences for easy and quick navigation plus many more.

Visit Homepage : https://sourceforge.net/projects/tkdiff/

Having read this review of some of the best file and directory comparator and merge tools, you probably want to try out some of them. These may not be the only diff tools you can find on Linux, but they are known to offer some of the best features. You may also want to let us know of any other diff tools out there that you have tested and think deserve to be mentioned among the best.

#### [Sep 04, 2019] Basic Trap for File Cleanup

###### Sep 04, 2019 | www.putorius.net

Using a trap to clean up is simple enough. Here is an example of using trap to clean up a temporary file on exit of the script.
#!/bin/bash
trap "rm -f /tmp/output.txt" EXIT
yum -y update > /tmp/output.txt
if grep -qi "kernel" /tmp/output.txt; then
    mail -s "KERNEL UPDATED" user@example.com < /tmp/output.txt
fi

NOTE: It is important that the trap statement be placed at the beginning of the script to function properly. Any commands above the trap can exit and not be caught in the trap.

Now if the script exits for any reason, it will still run the rm command to delete the file. Here is an example of me sending SIGINT (CTRL+C) while the script was running.

# ./test.sh
^Cremoved '/tmp/output.txt'

NOTE: I added verbose ( -v ) output to the rm command so it prints "removed". The ^C signifies where I hit CTRL+C to send SIGINT.

This is a much cleaner and safer way to ensure the cleanup occurs when the script exits. Using EXIT ( 0 ) instead of a single defined signal (i.e. SIGINT – 2) ensures the cleanup happens on any exit, even successful completion of the script.

#### [Sep 04, 2019] Exec - Process Replacement Redirection in Bash by Steven Vona

###### Sep 02, 2019 | www.putorius.net

The Linux exec command is a bash builtin and a very interesting utility. It is not something most people who are new to Linux know. Most seasoned admins understand it but only use it occasionally. If you are a developer, programmer or DevOps engineer it is probably something you use more often. Let's take a deep dive into the builtin exec command, what it does and how to use it.

Basics of the Sub-Shell

In order to understand the exec command, you need a fundamental understanding of how sub-shells work.

... ... ...

What the Exec Command Does

In its most basic function the exec command changes the default behavior of creating a sub-shell to run a command. If you run exec followed by a command, that command will REPLACE the original process, it will NOT create a sub-shell.

An additional feature of the exec command is redirection and manipulation of file descriptors .
Explaining redirection and file descriptors is outside the scope of this tutorial. If these are new to you please read " Linux IO, Standard Streams and Redirection " to get acquainted with these terms and functions.

In the following sections we will expand on both of these functions and try to demonstrate how to use them.

How to Use the Exec Command with Examples

Let's look at some examples of how to use the exec command and its options.

Basic Exec Command Usage – Replacement of Process

If you call exec and supply a command without any options, it simply replaces the shell with command .

Let's run an experiment. First, I ran the ps command to find the process id of my second terminal window. In this case it was 17524. I then ran "exec tail" in that second terminal and checked the ps command again. If you look at the screenshot below, you will see the tail process replaced the bash process (same process ID).

Since the tail command replaced the bash shell process, the shell will close when the tail command terminates.

Exec Command Options

If the -l option is supplied, exec adds a dash at the beginning of the first (zeroth) argument given. So if we ran the following command:

exec -l tail -f /etc/redhat-release

It would produce the following output in the process list. Notice the highlighted dash in the CMD column.

The -c option causes the supplied command to run with an empty environment. Environmental variables like PATH are cleared before the command is run. Let's try an experiment. We know that the printenv command prints all the settings for a user's environment. So here we will open a new bash process, run the printenv command to show we have some variables set. We will then run printenv again but this time with the exec -c option.

In the example above you can see that an empty environment is used when using exec with the -c option. This is why there was no output to the printenv command when run with exec.
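The process replacement and the -c behaviour described above can also be checked from a script rather than from screenshots. The following is a minimal sketch, assuming bash, sh and printenv are installed; the function names same_pid_after_exec and env_after_exec_c are made up for this illustration and are not part of the original article.

```bash
# Hypothetical helper: print the PID of a child bash, then exec an sh
# in its place and print that PID too. Because exec replaces the
# process instead of forking, both lines should show the same number.
same_pid_after_exec() {
    bash -c 'echo "$$"; exec sh -c "echo \$\$"'
}

# Hypothetical helper: run printenv through "exec -c", that is, with an
# empty environment, so it should print nothing at all.
env_after_exec_c() {
    bash -c 'exec -c printenv'
}

same_pid_after_exec
env_after_exec_c
```

Each helper starts its own bash with bash -c, so the interactive shell you call them from is not replaced.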
The last option, -a [name], will pass name as the zeroth argument (argv[0]) to command . The command will still run as expected, but the name of the process will change. In this next example we opened a second terminal and ran the following command:

exec -a PUTORIUS tail -f /etc/redhat-release

Here is the process list showing the results of the above command:

As you can see, exec passed PUTORIUS as the zeroth argument to command , therefore it shows in the process list with that name.

Using the Exec Command for Redirection & File Descriptor Manipulation

The exec command is often used for redirection. When a file descriptor is redirected with exec it affects the current shell. It will exist for the life of the shell or until it is explicitly stopped.

If no command is specified, redirections may be used to affect the current shell environment. – Bash Manual

Here are some examples of how to use exec for redirection and manipulating file descriptors. As we stated above, a deep dive into redirection and file descriptors is outside the scope of this tutorial. Please read " Linux IO, Standard Streams and Redirection " for a good primer and see the resources section for more information.

Redirect all standard output (STDOUT) to a file:

exec >file

In the example animation below, we use exec to redirect all standard output to a file. We then enter some commands that should generate some output. We then use exec to redirect STDOUT to /dev/tty to restore standard output to the terminal. This effectively stops the redirection. Using the cat command we can see that the file contains all the redirected output.

Open a file as file descriptor 6 for writing:

exec 6> file2write

Open file as file descriptor 8 for reading:

exec 8< file2read

Copy file descriptor 5 to file descriptor 7:

exec 7<&5

Close file descriptor 8:

exec 8<&-

Conclusion

In this article we covered the basics of the exec command.
We discussed how to use it for process replacement, redirection and file descriptor manipulation.

In the past I have seen exec used in some interesting ways. It is often used as a wrapper script for starting other binaries. Using process replacement you can call a binary and when it takes over there is no trace of the original wrapper script in the process table or memory. I have also seen many System Administrators use exec when transferring work from one script to another. If you call a script inside of another script the original process stays open as a parent. You can use exec to replace that original script.

I am sure there are people out there using exec in some interesting ways. I would love to hear your experiences with exec. Please feel free to leave a comment below with anything on your mind.

#### [Sep 03, 2019] bash - How to convert strings like 19-FEB-12 to epoch date in UNIX - Stack Overflow

###### Feb 11, 2013 | stackoverflow.com

hellish ,Feb 11, 2013 at 3:45

In UNIX how to convert to epoch milliseconds date strings like:

19-FEB-12
16-FEB-12
05-AUG-09

I need this to compare these dates with the current time on the server.

To convert a date to seconds since the epoch:

date --date="19-FEB-12" +%s

Current epoch:

date +%s

So, since your dates are in the past:

NOW=$(date +%s)
THEN=$(date --date="19-FEB-12" +%s)
let DIFF=$NOW-$THEN
echo "The difference is: $DIFF"
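The same subtraction can be wrapped in a small function. This is only a sketch under stated assumptions: it relies on GNU date's -d option (BSD date needs a different invocation), it reads both dates as UTC so the result does not depend on the local timezone, and the name days_between is hypothetical.

```bash
# Hypothetical helper: whole days from date $1 to date $2, both read as UTC.
# Assumes GNU date; returns non-zero if either date cannot be parsed.
days_between() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    echo $(( (end - start) / 86400 ))
}

days_between 2012-02-16 2012-02-19    # prints 3
```

Passing now as the second argument, for example days_between 2012-02-19 now, gives the age of a date relative to the current time on the server.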


Using BSD's date command, you would need

$ date -j -f "%d-%B-%y" 19-FEB-12 +%s

Differences from GNU date :

1. -j prevents date from trying to set the clock
2. The input format must be explicitly set with -f
3. The input date is a regular argument, not an option (viz. -d )
4. When no time is specified with the date, use the current time instead of midnight.

#### [Sep 03, 2019] Linux - UNIX Convert Epoch Seconds To the Current Time - nixCraft

###### Sep 03, 2019 | www.cyberciti.biz

Print Current UNIX Time

Type the following command to display the seconds since the epoch:

date +%s

Sample outputs:

1268727836

Convert Epoch To Current Time

Type the command:

date -d @Epoch
date -d @1268727836
date -d "1970-01-01 1268727836 sec GMT"

Sample outputs:

Tue Mar 16 13:53:56 IST 2010

Please note that the @ feature only works with recent versions of date (GNU coreutils v5.3.0+). To convert a number of seconds back to a more readable form, use a command like this:

date -d @1268727836 +"%d-%m-%Y %T %z"

Sample outputs:

16-03-2010 13:53:56 +0530

#### [Sep 03, 2019] command line - How do I convert an epoch timestamp to a human readable format on the cli - Unix Linux Stack Exchange

###### Sep 03, 2019 | unix.stackexchange.com

Gilles ,Oct 11, 2010 at 18:14

date -d @1190000000

Replace 1190000000 with your epoch

Stefan Lasiewski ,Oct 11, 2010 at 18:04

$ echo 1190000000 | perl -pe 's/(\d+)/localtime($1)/e'
Sun Sep 16 20:33:20 2007

This can come in handy for those applications which use epoch time in the logfiles:

$ tail -f /var/log/nagios/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
[Thu May 13 10:15:46 2010] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;HOSTA;check_raid;0;check_raid.pl: OK (Unit 0 on Controller 0 is OK)

Stéphane Chazelas ,Jul 31, 2015 at 20:24

With bash-4.2 or above:

printf '%(%F %T)T\n' 1234567890

(where %F %T is the strftime() -type format)

That syntax is inspired by ksh93 . In ksh93 however, the argument is taken as a date expression where various and hardly documented formats are supported.

For a Unix epoch time, the syntax in ksh93 is:

printf '%(%F %T)T\n' '#1234567890'

ksh93 however seems to use its own algorithm for the timezone and can get it wrong. For instance, in Britain, it was summer time all year in 1970, but:

$ TZ=Europe/London bash -c 'printf "%(%c)T\n" 0'
Thu 01 Jan 1970 01:00:00 BST
$ TZ=Europe/London ksh93 -c 'printf "%(%c)T\n" "#0"'
Thu Jan 1 00:00:00 1970

DarkHeart ,Jul 28, 2014 at 3:56

Custom format with GNU date :

date -d @1234567890 +'%Y-%m-%d %H:%M:%S'

Or with GNU awk :

awk 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 1234567890); }'

The two I frequently use are:

$ perl -leprint\ scalar\ localtime\ 1234567890
Sat Feb 14 00:31:30 2009
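The log-file use case mentioned in this thread can also be handled without perl. The sketch below is an illustration, not code from any of the answers: it assumes GNU date, assumes log lines that start with a bracketed epoch value as in the nagios example, and prints UTC instead of local time so the result is reproducible. The function name humanize_epochs is hypothetical.

```bash
# Hypothetical filter: rewrite a leading "[epoch]" on each input line as
# "[YYYY-MM-DD HH:MM:SS]" in UTC; all other lines pass through unchanged.
humanize_epochs() {
    while IFS= read -r line; do
        rest=${line#\[}        # drop the opening bracket, if any
        epoch=${rest%%\]*}     # keep what precedes the first "]"
        case "$line" in
            \[*\]*)
                case "$epoch" in
                    ''|*[!0-9]*) printf '%s\n' "$line" ;;   # not a number
                    *) printf '[%s]%s\n' \
                           "$(date -u -d "@$epoch" '+%F %T')" "${rest#*\]}" ;;
                esac ;;
            *) printf '%s\n' "$line" ;;
        esac
    done
}

echo '[1234567890] EXTERNAL COMMAND: example' | humanize_epochs
```

It starts one date process per matching line, so for a busy log the perl one-liner or the bash printf form shown above will be faster.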


#### [Sep 03, 2019] Time conversion using Bash Vanstechelman.eu

###### Sep 03, 2019 | www.vanstechelman.eu

This article shows how you can obtain the UNIX epoch time (number of seconds since 1970-01-01 00:00:00 UTC) using the Linux bash "date" command. It also shows how you can convert a UNIX epoch time to a human readable time.

Obtain UNIX epoch time using bash
Obtaining the UNIX epoch time using bash is easy. Use the date command and instruct it to output the number of seconds since 1970-01-01 00:00:00 UTC. You can do this by passing a format string as parameter to the date command. The format string for UNIX epoch time is '%s'.

lode@srv-debian6:~$ date "+%s"
1234567890

To convert a specific date and time into UNIX epoch time, use the -d parameter. The next example shows how to convert the timestamp "February 20th, 2013 at 08:41:15" into UNIX epoch time.

lode@srv-debian6:~$ date "+%s" -d "02/20/2013 08:41:15"
1361346075

Converting UNIX epoch time to human readable time
Even though I didn't find it in the date manual, it is possible to use the date command to reformat a UNIX epoch time into a human readable time. The syntax is the following:

lode@srv-debian6:~$ date -d @1234567890
Sat Feb 14 00:31:30 CET 2009

The same thing can also be achieved using a bit of perl programming:

lode@srv-debian6:~$ perl -e 'print scalar(localtime(1234567890)), "\n"'
Sat Feb 14 00:31:30 2009

Please note that the printed time is formatted in the timezone for which your Linux system is configured. My system uses Central European Time (the CET in the output above), so you may get a different output for the same command.
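The timezone dependence is easy to see by setting TZ for a single command. This is a minimal sketch assuming GNU date; epoch_in_zone is a hypothetical helper name, and the TZ values are POSIX-style strings (UTC0 for UTC, CET-1 for a fixed offset of one hour east of UTC) so that no zoneinfo database is needed.

```bash
# Hypothetical helper: format an epoch value in a given timezone.
# $1 = TZ value, $2 = seconds since the epoch (GNU date assumed).
epoch_in_zone() {
    TZ=$1 date -d "@$2" '+%Y-%m-%d %H:%M:%S'
}

epoch_in_zone UTC0 1234567890     # 2009-02-13 23:31:30
epoch_in_zone CET-1 1234567890    # 2009-02-14 00:31:30, matching the output above
```

The epoch value is the same in both calls; only the rendering changes.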

#### [Sep 03, 2019] Run PerlTidy to beautify the code

##### "... Once I installed Code::TidyAll and placed those files in the root directory of the project, I could run tidyall -a . ..."
###### Sep 03, 2019 | perlmaven.com

The Code-TidyAll distribution provides a command line script called tidyall that will use Perl::Tidy to change the layout of the code.

This tandem needs two configuration files.

The .perltidyrc file contains the instructions to Perl::Tidy that describe the layout of a Perl file. We used the following file copied from the source code of the Perl Maven project.

-pbp
-nst
-et=4
--maximum-line-length=120

# Break a line after opening/before closing token.
-vt=0
-vtc=0

The tidyall command uses a separate file called .tidyallrc that describes which files need to be beautified.

[PerlTidy]
select = {lib,t}/**/*.{pl,pm,t}
select = Makefile.PL
select = {mod2html,podtree2html,pods2html,perl2html}
argv = --profile=$ROOT/.perltidyrc

[SortLines]
select = .gitignore

Once I installed Code::TidyAll and placed those files in the root directory of the project, I could run tidyall -a .

That created a directory called .tidyall.d/ where it stores cached versions of the files, and changed all the files that were matched by the select statements in the .tidyallrc file.

Then, I added .tidyall.d/ to the .gitignore file to avoid adding that subdirectory to the repository and ran tidyall -a again to make sure the .gitignore file is sorted.

#### [Sep 02, 2019] bash - Pretty-print for shell script

###### Oct 21, 2010 | stackoverflow.com

Benoit ,Oct 21, 2010 at 13:19

I'm looking for something similar to indent but for (bash) scripts. Console only, no colorizing, etc. Do you know of one?

Jamie ,Sep 11, 2012 at 3:00

Vim can indent bash scripts. But not reformat them before indenting. Backup your bash script, open it with vim, type gg=GZZ and indent will be corrected. (Note for the impatient: this overwrites the file, so be sure to do that backup!)

Though, some bugs with << (expecting EOF as first character on a line) e.g.

EDIT: ZZ not ZQ

Daniel Martí ,Apr 8, 2018 at 13:52

A bit late to the party, but it looks like shfmt could do the trick for you.

Brian Chrisman ,Aug 11 at 4:08

In bash I do this:

reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3 | sed -e "s/^\s\s\s\s//"
}

This eliminates comments and reindents the script the "bash way".

If you have HEREDOCS in your script, they get ruined by the sed in the previous function.

So use:

reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3
}

But then your whole script will have a four-space indentation.

Or you can do:

reindent () {
rstr=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 16 | head -n 1);
source <(echo "Zibri () {";cat "$1"|sed -e "s/^\s\s\s\s/$rstr/"; echo "}");
echo '#!/bin/bash';
declare -f Zibri | head --lines=-1 | tail --lines=+3 | sed -e "s/^\s\s\s\s//;s/$rstr/    /"
}

which also takes care of heredocs.

Pius Raeder ,Jan 10, 2017 at 8:35

Found this http://www.linux-kheops.com/doc/perl/perl-aubert/fmt.script . Very nice, only one thing I took out is the [...]->test substitution.

#### [Sep 02, 2019] mvdan-sh A shell parser, formatter, and interpreter (POSIX-Bash-mksh)

##### Written in Go language
###### Sep 02, 2019 | github.com

sh

A shell parser, formatter and interpreter. Supports POSIX Shell , Bash and mksh . Requires Go 1.11 or later.

Quick start

To parse shell scripts, inspect them, and print them out, see the syntax examples .

For high-level operations like performing shell expansions on strings, see the shell examples .

shfmt

Go 1.11 and later can download the latest v2 stable release:

cd $(mktemp -d); go mod init tmp; go get mvdan.cc/sh/cmd/shfmt


The latest v3 pre-release can be downloaded in a similar manner, using the /v3  module:

cd $(mktemp -d); go mod init tmp; go get mvdan.cc/sh/v3/cmd/shfmt

Finally, any older release can be built with their respective older Go versions by manually cloning, checking out a tag, and running go build ./cmd/shfmt .

shfmt formats shell programs. It can use tabs or any number of spaces to indent. See canonical.sh for a quick look at its default style.

You can feed it standard input, any number of files or any number of directories to recurse into. When recursing, it will operate on .sh and .bash files and ignore files starting with a period. It will also operate on files with no extension and a shell shebang.

shfmt -l -w script.sh

Typically, CI builds should use the command below, to error if any shell scripts in a project don't adhere to the format:

shfmt -d .

Use -i N to indent with a number of spaces instead of tabs. There are other formatting options - see shfmt -h . For example, to get the formatting appropriate for Google's Style guide, use shfmt -i 2 -ci .

Packages are available on Arch , CRUX , Docker , FreeBSD , Homebrew , NixOS , Scoop , Snapcraft , and Void .

Replacing bash -n

bash -n can be useful to check for syntax errors in shell scripts. However, shfmt >/dev/null can do a better job as it checks for invalid UTF-8 and does all parsing statically, including checking POSIX Shell validity:

$ echo '${foo:1 2}' | bash -n
$ echo '${foo:1 2}' | shfmt
1:9: not a valid arithmetic operator: 2
$ echo 'foo=(1 2)' | bash --posix -n
$ echo 'foo=(1 2)' | shfmt -p
1:5: arrays are a bash feature

gosh

cd $(mktemp -d); go mod init tmp; go get mvdan.cc/sh/v3/cmd/gosh


Experimental shell that uses interp . Work in progress, so don't expect stability just yet.

Fuzzing

This project makes use of go-fuzz to find crashes and hangs in both the parser and the printer. To get started, run:

git checkout fuzz
./fuzz


Caveats

• When indexing Bash associative arrays, always use quotes. The static parser will otherwise have to assume that the index is an arithmetic expression.
$ echo '${array[spaced string]}' | shfmt
1:16: not a valid arithmetic operator: string
$ echo '${array[dash-string]}' | shfmt
${array[dash - string]}

• $(( and (( ambiguity is not supported. Backtracking would complicate the parser and make streaming support via io.Reader impossible. The POSIX spec recommends to space the operands if $( ( is meant.
$ echo '$((foo); (bar))' | shfmt
1:1: reached ) without matching $(( with ))

• Some builtins like export  and let  are parsed as keywords. This is to allow statically parsing them and building their syntax tree, as opposed to just keeping the arguments as a slice of arguments.

JavaScript

A subset of the Go packages are available as an npm package called mvdan-sh . See the _js directory for more information.

Docker

To build a Docker image, checkout a specific version of the repository and run:

docker build -t my:tag -f cmd/shfmt/Dockerfile .


Related projects

• Alternative docker images - by jamesmstone , PeterDaveHello
• format-shell - Atom plugin for  shfmt 
• micro - Editor with a built-in plugin for shfmt 
• modd - A developer tool that responds to filesystem changes, using sh 
• shell-format - VS Code plugin for shfmt 
• vim-shfmt - Vim plugin for shfmt 

#### [Aug 26, 2019] linux - Avoiding accidental 'rm' disasters - Super User

###### Aug 26, 2019 | superuser.com


Mr_Spock ,May 26, 2013 at 11:30

Today, using sudo -s , I wanted to rm -R ./lib/ , but I actually rm -R /lib/ .

I had to reinstall my OS (Mint 15) and re-download and re-configure all my packages. Not fun.

How can I avoid similar mistakes in the future?

Vittorio Romeo ,May 26, 2013 at 11:55

First of all, stop executing everything as root . You never really need to do this. Only run individual commands with sudo if you need to. If a normal command doesn't work without sudo, just call sudo !! to execute it again.

If you're paranoid about rm , mv and other operations while running as root, you can add the following aliases to your shell's configuration file:



(or your operating system's equivalent).

You can check the value of $? immediately after calling rm to see if a file was actually removed or not.

vimdude ,May 28, 2014 at 18:10

Yes, -f is the most suitable option for this.

tripleee ,Jan 11 at 4:50

-f is the correct flag, but for the test operator, not rm

[ -f "$THEFILE" ] && rm "$THEFILE"

this ensures that the file exists and is a regular file (not a directory, device node etc...)

mahemoff ,Jan 11 at 4:41

\rm -f file will never report not found.

Idelic ,Apr 20, 2012 at 16:51

As far as rm -f doing "anything else", it does force ( -f is shorthand for --force ) silent removal in situations where rm would otherwise ask you for confirmation. For example, when trying to remove a file not writable by you from a directory that is writable by you.

Keith Thompson ,May 28, 2014 at 18:09

I had same issue for cshell. The only solution I had was to create a dummy file that matched pattern before "rm" in my script.

#### [Aug 26, 2019] shell - rm -rf return codes

###### Aug 26, 2019 | superuser.com

SheetJS ,Aug 15, 2013 at 2:50

Can anyone let me know the possible return codes for the command rm -rf other than zero, i.e. the possible return codes for failure cases? I want to know a more detailed reason for the failure of the command than just that the command failed (returned something other than 0).

Adrian Frühwirth ,Aug 14, 2013 at 7:00

To see the return code, you can use echo $? in bash.

To see the actual meaning, some platforms (like Debian Linux) have the perror binary available, which can be used as follows:

$ rm -rf something/; perror $?
rm: cannot remove `something/': Permission denied
OS error code   1:  Operation not permitted


The -f flag intentionally suppresses most errors, so rm -rf rarely fails. The most likely exit status you will see is 1, which will happen if you don't have permission to remove the file.

Adrian Frühwirth ,Aug 14, 2013 at 7:21

grabbed coreutils from git....

looking at exit we see...

openfly@linux-host:~/coreutils/src $ cat rm.c | grep -i exit
if (status != EXIT_SUCCESS)
exit (status);
/* Since this program exits immediately after calling 'rm', rm need not
atexit (close_stdin);
usage (EXIT_FAILURE);
exit (EXIT_SUCCESS);
usage (EXIT_FAILURE);
error (EXIT_FAILURE, errno, _("failed to get attributes of %s"),
exit (EXIT_SUCCESS);
exit (status == RM_ERROR ? EXIT_FAILURE : EXIT_SUCCESS);

Now looking at the status variable....

openfly@linux-host:~/coreutils/src $ cat rm.c | grep -i status
usage (int status)
if (status != EXIT_SUCCESS)
exit (status);
enum RM_status status = rm (file, &x);
assert (VALID_STATUS (status));
exit (status == RM_ERROR ? EXIT_FAILURE : EXIT_SUCCESS);


looks like there isn't much going on there with the exit status.

I see EXIT_FAILURE and EXIT_SUCCESS and not anything else.

so basically 0 and 1 / -1

To see specific exit() syscalls and how they occur in a process flow try this

openfly@linux-host:~/ $ strace rm -rf $whatever


fairly simple.

ref:

http://www.unix.com/man-page/Linux/EXIT_FAILURE/exit/
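Since the exit status alone only distinguishes success from failure, the detailed reason has to come from the message rm writes to standard error. The following is a minimal sketch of capturing both; the function name try_remove is hypothetical and not taken from the thread.

```bash
# Hypothetical helper: run rm -rf on one path, keeping both the exit
# status and the error text so the caller can log the actual reason.
try_remove() {
    if err=$(rm -rf -- "$1" 2>&1); then
        # Also reached when the path never existed, because -f hides that.
        echo "removed: $1"
    else
        status=$?
        echo "rm exited with status $status: $err" >&2
        return "$status"
    fi
}

scratch=$(mktemp -d)
touch "$scratch/file"
try_remove "$scratch"
```

The message text is localized and not a stable interface, so it is better suited to logs than to further parsing.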

#### [Aug 20, 2019] How to exclude file when using scp command recursively

###### Aug 12, 2019 | www.cyberciti.biz

I need to copy all the *.c files from local laptop named hostA to hostB including all directories. I am using the following scp command but do not know how to exclude specific files (such as *.out):

$ scp -r ~/projects/ user@hostB:/home/delta/projects/

How do I tell scp command to exclude particular file or directory at the Linux/Unix command line?

One can use scp command to securely copy files between hosts on a network. It uses ssh for data transfer and authentication purpose. Typical scp command syntax is as follows:

scp file1 user@host:/path/to/dest/
scp -r /path/to/source/ user@host:/path/to/dest/
scp [options] /dir/to/source/ user@host:/dir/to/dest/

Scp exclude files

I don't think you can filter or exclude files when using the scp command. However, there is a great workaround: exclude the files and copy the rest securely over ssh with rsync. This page explains how to filter or exclude files when copying a directory recursively.

How to use rsync command to exclude files

The syntax is:

rsync -av -e ssh --exclude='*.out' /path/to/source/ user@hostB:/path/to/dest/

Where,

1. -a : Recurse into directories i.e. copy all files and subdirectories. Also, turn on archive mode and all other options (-rlptgoD)
2. -v : Verbose output
3. -e ssh : Use ssh for remote shell so everything gets encrypted
4. --exclude='*.out' : exclude files matching PATTERN e.g. *.out or *.c and so on.

Example of rsync command

In this example copy all files recursively from ~/virt/ directory but exclude all *.new files:

$ rsync -av -e ssh --exclude='*.new' ~/virt/ root@centos7:/tmp
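When several patterns have to be excluded, the command line can be assembled by a small wrapper. This is a sketch and not part of the original article: the function name rsync_excluding and the RUN variable are hypothetical, and by default the function only prints the command it would run, one word per line, so it can be inspected before anything is copied.

```bash
# Hypothetical wrapper: rsync_excluding SRC DEST PATTERN...
# Builds "rsync -av -e ssh --exclude=PATTERN ... SRC DEST".
# Prints the words of the command unless RUN=1 is set in the environment.
rsync_excluding() {
    src=$1
    dest=$2
    shift 2
    # Turn every remaining argument into an --exclude option.
    for pattern in "$@"; do
        set -- "$@" "--exclude=$pattern"
        shift
    done
    set -- rsync -av -e ssh "$@" "$src" "$dest"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$@"
    fi
}

rsync_excluding ~/projects/ user@hostB:/home/delta/projects/ '*.out' '*.o'
```

Each pattern stays quoted as a single word, so the local shell never expands it and rsync receives it unchanged.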

#### [Aug 19, 2019] Moreutils - A Collection Of More Useful Unix Utilities - OSTechNix

##### Parallel is a really useful utility. RPM is installable from EPEL.
###### Aug 19, 2019 | www.ostechnix.com
• chronic – Runs a command quietly unless it fails.
• combine – Combine the lines in two files using boolean operations.
• errno – Look up errno names and descriptions.
• ifdata – Get network interface info without parsing ifconfig output.
• ifne – Run a program if the standard input is not empty.
• isutf8 – Check if a file or standard input is utf-8.
• lckdo – Execute a program with a lock held.
• mispipe – Pipe two commands, returning the exit status of the first.
• parallel – Run multiple jobs at once.
• pee – tee standard input to pipes.
• sponge – Soak up standard input and write to a file.
• ts – timestamp standard input.
• vidir – Edit a directory in your text editor.
• vipe – Insert a text editor into a pipe.
• zrun – Automatically uncompress arguments to command.

... ... ...

On RHEL , CentOS , Scientific Linux :
$ sudo yum install epel-release
$ sudo yum install moreutils


#### [Jul 29, 2019] Locate Command in Linux

###### Jul 25, 2019 | linuxize.com

... ... ...

The locate command also accepts patterns containing globbing characters such as the wildcard character * . When the pattern contains no globbing characters the command searches for *PATTERN* , that's why in the previous example all files containing the search pattern in their names were displayed.

The wildcard is a symbol used to represent zero, one or more characters. For example, to search for all .md files on the system you would use:

locate '*.md'
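A pattern containing * is best quoted, because an unquoted glob is expanded by the shell whenever it matches files in the current directory, and locate then receives file names instead of the pattern. The sketch below shows only that shell behaviour, with no locate database involved; count_args is a hypothetical helper.

```bash
# Hypothetical helper that just reports how many arguments it received.
count_args() {
    echo "$#"
}

demo_dir=$(mktemp -d)
touch "$demo_dir/a.md" "$demo_dir/b.md"

(cd "$demo_dir" && count_args *.md)      # the shell expands the glob: 2
(cd "$demo_dir" && count_args '*.md')    # the pattern arrives intact: 1

rm -rf "$demo_dir"
```

In a directory with no matching files bash passes the unquoted pattern through unchanged by default, which is why the unquoted form often appears to work.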


To limit the search results use the -n option followed by the number of results you want to be displayed. For example, the following command will search for all .py files and display only 10 results:

locate -n 10 '*.py'


By default, locate performs case-sensitive searches. The -i ( --ignore-case ) option tells locate to ignore case and run a case-insensitive search.

locate -i readme.md

/home/linuxize/p1/readme.md


To display the count of all matching entries, use the -c ( --count ) option. The following command would return the number of all files containing .bashrc in their names:

locate -c .bashrc

6


By default, locate doesn't check whether the found files still exist on the file system. If you deleted a file after the latest database update and the file matches the search pattern, it will still be included in the search results.

To display only the names of the files that exist at the time locate is run use the -e ( --existing ) option. For example, the following would return only the existing .json files:

locate -e '*.json'


If you need to run a more complex search you can use the -r ( --regexp ) option, which allows you to search using a basic regexp instead of patterns, or the --regex option, which makes locate treat the patterns as extended regexps. The -r option can be specified multiple times.
For example, to search for all .mp4 and .avi files on your system and ignore case you would run:

locate --regex -i "(\.mp4|\.avi)"


#### [Jul 26, 2019] Sort Command in Linux [10 Useful Examples] by Christopher Murray

##### "... What is probably missing in that article is a short warning about the effect of the current locale. It is a common mistake to assume that the default behavior is to sort ASCII texts according to the ASCII codes. ..."
###### Jul 12, 2019 | linuxhandbook.com
5. Sort by months [option -M]

Sort also has built-in functionality to arrange by month. It recognizes several formats based on locale-specific information. I tried to demonstrate some unique tests to show that it will arrange by date-day, but not year. Month abbreviations display before full names.

Here is the sample text file in this example:

March
Feb
February
April
August
July
June
November
October
December
May
September
1
4
3
6
01/05/19
01/10/19
02/06/18

Let's sort it by months using the -M option:

sort filename.txt -M


Here's the output you'll see:

01/05/19
01/10/19
02/06/18
1
3
4
6
Jan
Feb
February
March
April
May
June
July
August
September
October
November
December
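A smaller input makes the -M ordering easier to verify. This is a minimal sketch assuming GNU sort; it forces the C locale so the English month names are recognized regardless of the system language, and sort_months is a hypothetical name.

```bash
# Hypothetical helper: month sort with English month names (C locale).
sort_months() {
    LC_ALL=C sort -M
}

# Lines that are not month names ("7") sort before January.
printf '%s\n' March Feb December Jan 7 | sort_months
```

Only the first three letters of each line are compared with the month abbreviations, which is why March and Feb can be mixed in the same file.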


... ... ...

7. Sort Specific Column [option -k]

If you have a table in your file, you can use the -k option to specify which column to sort. I added some arbitrary numbers as a third column and will display the output sorted by each column. I've included several examples to show the variety of output possible. Options are added following the column number.

1. MX Linux 100
2. Manjaro 400
3. Mint 300
4. elementary 500
5. Ubuntu 200

sort filename.txt -k 2


This will sort the text on the second column in alphabetical order:

4. elementary 500
2. Manjaro 400
3. Mint 300
1. MX Linux 100
5. Ubuntu 200

sort filename.txt -k 3n


This will sort the text by the numerals on the third column.

1. MX Linux 100
5. Ubuntu 200
3. Mint 300
2. Manjaro 400
4. elementary 500

sort filename.txt -k 3nr


Same as the above command just that the sort order has been reversed.

4. elementary 500
2. Manjaro 400
3. Mint 300
5. Ubuntu 200
1. MX Linux 100
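One caveat about the sample data: fields are separated by blanks, so "MX Linux" counts as two fields and the third field of that line is "Linux", not 100. Numeric sort reads a non-numeric key as 0, which would explain why that line ends up first with -k 3n and last with -k 3nr. A more robust layout uses an explicit separator. The sketch below assumes a tab-separated copy of the data and GNU or POSIX sort; sort_by_third_field is a hypothetical name.

```bash
tab=$(printf '\t')

# Hypothetical helper: numeric sort on exactly the third tab-separated
# field (-k3,3 stops the key at the end of field 3).
sort_by_third_field() {
    LC_ALL=C sort -t "$tab" -k3,3n
}

printf '1.\tMX Linux\t100\n2.\tManjaro\t400\n5.\tUbuntu\t200\n' | sort_by_third_field
```

Writing the key as -k3,3 rather than -k3 also keeps anything after the third field out of the comparison.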

8. Sort and remove duplicates [option -u]

If you have a file with potential duplicates, the -u option will make your life much easier. Remember that sort will not make changes to your original data file. I chose to write the result, with the duplicates removed, to a new file. Below you'll see the input and then the contents of the new file after the command is run.

1. MX Linux
2. Manjaro
3. Mint
4. elementary
5. Ubuntu
1. MX Linux
2. Manjaro
3. Mint
4. elementary
5. Ubuntu
1. MX Linux
2. Manjaro
3. Mint
4. elementary
5. Ubuntu

sort filename.txt -u > filename_duplicates.txt


Here's the output file, sorted and without duplicates.

1. MX Linux
2. Manjaro
3. Mint
4. elementary
5. Ubuntu
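
As a small illustration of the difference between removing duplicates and listing them, here is a sketch using sort -u and the standard uniq -d. The function names and the three-line input are made up for the example.

```shell
# unique_lines: each distinct line once (what sort -u does).
unique_lines() {
  LC_ALL=C sort -u
}

# repeated_lines: only the lines that occur more than once.
# uniq compares adjacent lines, so the input is sorted first.
repeated_lines() {
  LC_ALL=C sort | uniq -d
}

printf 'Mint\nUbuntu\nMint\n' | unique_lines
printf 'Mint\nUbuntu\nMint\n' | repeated_lines
```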

9. Ignore case while sorting [option -f]

Depending on the locale, sort on many modern distros will already ignore case by default. If yours does not, adding the -f option will produce the expected results.

sort filename.txt -f


Here's the output where cases are ignored by the sort command:

alpha
alPHa
Alpha
ALpha
beta
Beta
BEta
BETA
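
To see what -f changes independently of the system locale, the sketch below forces the C locale, where sorting is by byte value and uppercase letters come before lowercase ones. The four-word input is a made-up sample.

```shell
sample() {
  printf 'beta\nAlpha\nalpha\nBeta\n'
}

# Byte order: all uppercase-initial words first.
plain_sorted() {
  sample | LC_ALL=C sort
}

# -f folds lowercase to uppercase for the comparison, so "Alpha" and
# "alpha" end up next to each other.
folded_sorted() {
  sample | LC_ALL=C sort -f
}

plain_sorted
folded_sorted
```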

10. Sort by human numeric values [option -h]

This option allows the comparison of alphanumeric values like 1k (i.e. 1000).

sort filename.txt -h


Here's the sorted output:

10.0
100
1000.0
1k
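
A short sketch with a few more suffixes, assuming GNU sort (the -h option is not part of POSIX). The values are made up. GNU sort compares the suffix before the number, so a plain number such as 2000 would still sort before 1K.

```shell
# Sizes in the style printed by tools such as du -h (hypothetical values).
human_sorted() {
  printf '1K\n100\n2M\n10\n5G\n' | sort -h
}

human_sorted
```

A common use is ordering disk usage reports, for example du -sh * | sort -h.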


I hope this tutorial helped you get the basic usage of the sort command in Linux. If you have some cool sort trick, why not share it with us in the comment section?

Christopher works as a Software Developer in Orlando, FL. He loves open source, Taco Bell, and a Chi-weenie named Max. Visit his website for more information or connect with him on social media.

John
The sort command option "k" specifies a field, not a column. In your example all five lines have the same character in column 2 – a "."

Stephane Chauveau

In gnu sort, the default field separator is 'blank to non-blank transition' which is a good default to separate columns. In his example, the "." is part of the first column so it should work fine. If --debug is used then the range of characters used as keys is dumped.

What is probably missing in that article is a short warning about the effect of the current locale. It is a common mistake to assume that the default behavior is to sort ASCII text according to the ASCII codes. For example, the command printf ".\nx\n0\nX\n@\në" | sort produces ". 0 @ X x ë" with LC_ALL=C but ". @ 0 ë x X" with LC_ALL=en_US.UTF-8.
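
The C-locale half of that comparison can be reproduced with a few lines. This is only a sketch: the non-ASCII character is left out to keep the example independent of terminal encoding, and the en_US.UTF-8 result is not shown because that locale may not be installed everywhere.

```shell
# With LC_ALL=C, sort compares raw byte values:
# "." (0x2E) < "0" (0x30) < "@" (0x40) < "X" (0x58) < "x" (0x78)
byte_order() {
  printf '.\nx\n0\nX\n@\n' | LC_ALL=C sort
}

byte_order
```

Setting LC_ALL=C for a single command, as above, is a common way to get reproducible ordering in scripts.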

#### [Jul 26, 2019] Cheat.sh Shows Cheat Sheets On The Command Line Or In Your Code Editor

##### "... The tool is developed by Igor Chubin, also known for its console-oriented weather forecast service wttr.in , which can be used to retrieve the weather from the console using only cURL or Wget. ..."
###### Jul 26, 2019 | www.linuxuprising.com

While it does have its own cheat sheet repository too, the project is actually concentrated around the creation of a unified mechanism to access well developed and maintained cheat sheet repositories.

The tool is developed by Igor Chubin, also known for its console-oriented weather forecast service wttr.in , which can be used to retrieve the weather from the console using only cURL or Wget.

It's worth noting that cheat.sh is not new. In fact it had its initial commit around May, 2017, and is a very popular repository on GitHub. But I personally only came across it recently, and I found it very useful, so I figured there must be some Linux Uprising readers who are not aware of this cool gem.

cheat.sh features & more

cheat.sh major features:

• Supports 58 programming languages , several DBMSes, and more than 1000 most important UNIX/Linux commands
• Very fast, returns answers within 100ms
• Simple curl / browser interface
• An optional command line client (cht.sh) is available, which allows you to quickly search cheat sheets and easily copy snippets without leaving the terminal
• Can be used from code editors, allowing inserting code snippets without having to open a web browser, search for the code, copy it, then return to your code editor and paste it. It supports Vim, Emacs, Visual Studio Code, Sublime Text and IntelliJ Idea
• Comes with a special stealth mode in which any text you select (adding it into the selection buffer of X Window System or into the clipboard) is used as a search query by cht.sh, so you can get answers without touching any other keys
The command line client features a special shell mode with a persistent queries context and readline support. It also has a query history, it integrates with the clipboard, supports tab completion for shells like Bash, Fish and Zsh, and it includes the stealth mode I mentioned in the cheat.sh features.

The web, curl and cht.sh (command line) interfaces all make use of https://cheat.sh/ but if you prefer, you can self-host it .

It should be noted that each editor plugin supports a different feature set (configurable server, multiple answers, toggle comments, and so on). You can view a feature comparison of each cheat.sh editor plugin on the Editors integration section of the project's GitHub page.

Want to contribute a cheat sheet? See the cheat.sh guide on editing or adding a new cheat sheet.

Interested in bookmarking commands instead? You may want to give Marker, a command bookmark manager for the console , a try.

cheat.sh curl / command line client usage examples
Examples of using cheat.sh using the curl interface (this requires having curl installed as you'd expect) from the command line:

Show the tar command cheat sheet:

curl cheat.sh/tar


Example with output:
$ curl cheat.sh/tar
# To extract an uncompressed archive:
tar -xvf /path/to/foo.tar

# To create an uncompressed archive:
tar -cvf /path/to/foo.tar /path/to/foo/

# To extract a .gz archive:
tar -xzvf /path/to/foo.tgz

# To create a .gz archive:
tar -czvf /path/to/foo.tgz /path/to/foo/

# To list the content of an .gz archive:
tar -ztvf /path/to/foo.tgz

# To extract a .bz2 archive:
tar -xjvf /path/to/foo.tgz

# To create a .bz2 archive:
tar -cjvf /path/to/foo.tgz /path/to/foo/

# To extract a .tar in specified Directory:
tar -xvf /path/to/foo.tar -C /path/to/destination/

# To list the content of an .bz2 archive:
tar -jtvf /path/to/foo.tgz

# To create a .gz archive and exclude all jpg,gif,... from the tgz
tar czvf /path/to/foo.tgz --exclude=\*.{jpg,gif,png,wmv,flv,tar.gz,zip} /path/to/foo/

# To use parallel (multi-threaded) implementation of compression algorithms:
tar -z ... -> tar -Ipigz ...
tar -j ... -> tar -Ipbzip2 ...
tar -J ... -> tar -Ipixz ...

cht.sh also works instead of cheat.sh:

curl cht.sh/tar

Want to search for a keyword in all cheat sheets? Use:

curl cheat.sh/~keyword

List the Python programming language cheat sheet for random list :

curl cht.sh/python/random+list

Example with output:

$ curl cht.sh/python/random+list
#  python - How to randomly select an item from a list?
#
#  Use random.choice
#  (https://docs.python.org/2/library/random.html#random.choice):

import random

foo = ['a', 'b', 'c', 'd', 'e']
print(random.choice(foo))

#  For cryptographically secure random choices (e.g. for generating a
#  passphrase from a wordlist), use random.SystemRandom
#  (https://docs.python.org/2/library/random.html#random.SystemRandom)
#  class:

import random

foo = ['battery', 'correct', 'horse', 'staple']
secure_random = random.SystemRandom()
print(secure_random.choice(foo))

#  [Pēteris Caune] [so/q/306400] [cc by-sa 3.0]


Replace python with some other programming language supported by cheat.sh, and random+list with the cheat sheet you want to show.
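
The URL pattern can be wrapped in a small helper. This is a sketch only: cht_url is a hypothetical function name, not part of cheat.sh, and it just prints the query URL without contacting the network.

```shell
# cht_url LANGUAGE WORD... : print the cheat.sh query URL for a topic.
cht_url() {
  lang=$1
  shift
  # Join the remaining words with "+", as in the URLs shown above.
  query=$(printf '%s+' "$@")
  printf 'cht.sh/%s/%s\n' "$lang" "${query%+}"
}

cht_url python random list
# To fetch the sheet (requires curl and network access):
#   curl "$(cht_url python random list)"
```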

Want to eliminate the comments from your answer? Add ?Q at the end of the query (below is an example using the same /python/random+list):

$ curl cht.sh/python/random+list?Q
import random

foo = ['a', 'b', 'c', 'd', 'e']
print(random.choice(foo))

import random

foo = ['battery', 'correct', 'horse', 'staple']
secure_random = random.SystemRandom()
print(secure_random.choice(foo))

For more flexibility and tab completion you can use cht.sh, the command line cheat.sh client; you'll find instructions for how to install it further down this article.

Examples of using the cht.sh command line client:

Show the tar command cheat sheet:

cht.sh tar

List the Python programming language cheat sheet for random list :

cht.sh python random list

There is no need to use quotes with multiple keywords.

You can start the cht.sh client in a special shell mode using:

cht.sh --shell

And then you can start typing your queries. Example:

$ cht.sh --shell
cht.sh> bash loop


If all your queries are about the same programming language, you can start the client in the special shell mode, directly in that context. As an example, start it with the Bash context using:
cht.sh --shell bash


Example with output:
$ cht.sh --shell bash
cht.sh/bash> loop
...........
cht.sh/bash> switch case

Want to copy the previously listed answer to the clipboard? Type c , then press Enter to copy the whole answer, or type C and press Enter to copy it without comments.

Type help in the cht.sh interactive shell mode to see all available commands. Also look under the Usage section from the cheat.sh GitHub project page for more options and advanced usage.

How to install cht.sh command line client

You can use cheat.sh in a web browser, from the command line with the help of curl and without having to install anything else, as explained above, as a code editor plugin, or using its command line client which has some extra features, which I already mentioned. The steps below are for installing this cht.sh command line client.

If you'd rather install a code editor plugin for cheat.sh, see the Editors integration page.

1. Install dependencies.

To install the cht.sh command line client, the curl command line tool will be used, so this needs to be installed on your system. Another dependency is rlwrap , which is required by the cht.sh special shell mode. Install these dependencies as follows.

• Debian, Ubuntu, Linux Mint, Pop!_OS, and any other Linux distribution based on Debian or Ubuntu:

sudo apt install curl rlwrap

• Fedora:

sudo dnf install curl rlwrap

• Arch Linux, Manjaro:

sudo pacman -S curl rlwrap

• openSUSE:

sudo zypper install curl rlwrap

The packages seem to be named the same on most (if not all) Linux distributions, so if your Linux distribution is not on this list, just install the curl and rlwrap packages using your distro's package manager.

2. Download and install the cht.sh command line interface.

You can install this either for your user only (so only you can run it), or for all users:

• Install it for your user only. The command below assumes you have a ~/.bin folder added to your PATH (and the folder exists). If you have some other local folder in your PATH where you want to install cht.sh, change the install path in the commands:

curl https://cht.sh/:cht.sh > ~/.bin/cht.sh
chmod +x ~/.bin/cht.sh

• Install it for all users (globally, in /usr/local/bin ):

curl https://cht.sh/:cht.sh | sudo tee /usr/local/bin/cht.sh
sudo chmod +x /usr/local/bin/cht.sh

If the first command appears to have frozen displaying only the cURL output, press the Enter key and you'll be prompted to enter your password in order to save the file to /usr/local/bin .

You may also download and install the cheat.sh command completion for Bash or Zsh:

• Bash:

mkdir ~/.bash.d
curl https://cheat.sh/:bash_completion > ~/.bash.d/cht.sh
echo ". ~/.bash.d/cht.sh" >> ~/.bashrc

• Zsh:

mkdir ~/.zsh.d
curl https://cheat.sh/:zsh > ~/.zsh.d/_cht
echo 'fpath=(~/.zsh.d/ $fpath)' >> ~/.zshrc


Open a new shell / terminal and it will load the cheat.sh completion.

#### [Jun 23, 2019] Utilizing multi core for tar+gzip-bzip compression-decompression

##### "... You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use. ..."
###### Jun 23, 2019 | stackoverflow.com

user1118764 , Sep 7, 2012 at 6:58

I normally compress using tar zcvf and decompress using tar zxvf (using gzip due to habit).

I've recently gotten a quad core CPU with hyperthreading, so I have 8 logical cores, and I notice that many of the cores are unused during compression/decompression.

Is there any way I can utilize the unused cores to make it faster?

Warren Severin , Nov 13, 2017 at 4:37

The solution proposed by Xiong Chiamiov above works beautifully. I had just backed up my laptop with .tar.bz2 and it took 132 minutes using only one cpu thread. Then I compiled and installed tar from source: gnu.org/software/tar I included the options mentioned in the configure step: ./configure --with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip I ran the backup again and it took only 32 minutes. That's better than 4X improvement! I watched the system monitor and it kept all 4 cpus (8 threads) flatlined at 100% the whole time. THAT is the best solution. – Warren Severin Nov 13 '17 at 4:37

Mark Adler , Sep 7, 2012 at 14:48

You can use pigz instead of gzip, which does gzip compression on multiple cores. Instead of using the -z option, you would pipe it through pigz:
tar cf - paths-to-archive | pigz > archive.tar.gz

By default, pigz uses the number of available cores, or eight if it could not query that. You can ask for more with -p n, e.g. -p 32. pigz has the same options as gzip, so you can request better compression with -9. E.g.

tar cf - paths-to-archive | pigz -9 -p 32 > archive.tar.gz
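
The pipe form can be wrapped so that it still works on machines without pigz. This is a sketch under stated assumptions: pick_gzip and archive_dir are hypothetical helper names, and the fallback to plain gzip simply loses the multi-core speedup.

```shell
# Pick a gzip-compatible compressor: pigz when installed, else plain gzip.
pick_gzip() {
  if command -v pigz >/dev/null 2>&1; then echo pigz; else echo gzip; fi
}

# archive_dir SRC_DIR OUT_FILE : tar writes to stdout, and the
# compressor runs as a separate process on the other end of the pipe.
archive_dir() {
  tar -C "$1" -cf - . | "$(pick_gzip)" > "$2"
}

# Demo on a throwaway directory (hypothetical file names).
demo=$(mktemp -d)
mkdir "$demo/src"
echo hello > "$demo/src/a.txt"
archive_dir "$demo/src" "$demo/out.tar.gz"
tar -tzf "$demo/out.tar.gz"      # list the archived members
```

Because pigz writes ordinary gzip output, the final tar -tzf listing works the same whichever compressor was picked.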

user788171 , Feb 20, 2013 at 12:43

How do you use pigz to decompress in the same fashion? Or does it only work for compression?

Mark Adler , Feb 20, 2013 at 16:18

pigz does use multiple cores for decompression, but only with limited improvement over a single core. The deflate format does not lend itself to parallel decompression.

The decompression portion must be done serially. The other cores for pigz decompression are used for reading, writing, and calculating the CRC. When compressing on the other hand, pigz gets close to a factor of n improvement with n cores.

Garrett , Mar 1, 2014 at 7:26

Mark Adler , Jul 2, 2014 at 21:29

Yes. 100% compatible in both directions.

Mark Adler , Apr 23, 2015 at 5:23

There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files.

Jen , Jun 14, 2013 at 14:34

You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use.

For example use:

tar -c --use-compress-program=pigz -f tar.file dir_to_zip

Valerio Schiavoni , Aug 5, 2014 at 22:38

Unfortunately by doing so the concurrent feature of pigz is lost. You can see for yourself by executing that command and monitoring the load on each of the cores. – Valerio Schiavoni Aug 5 '14 at 22:38

bovender , Sep 18, 2015 at 10:14

@ValerioSchiavoni: Not here, I get full load on all 4 cores (Ubuntu 15.04 'Vivid'). – bovender Sep 18 '15 at 10:14

Valerio Schiavoni , Sep 28, 2015 at 23:41

On compress or on decompress ? – Valerio Schiavoni Sep 28 '15 at 23:41

Offenso , Jan 11, 2017 at 17:26

I prefer tar cf - dir_to_zip | pv | pigz > tar.file . pv helps me estimate progress; you can skip it. Still, it is easier to write and remember. – Offenso Jan 11 '17 at 17:26

Maxim Suslov , Dec 18, 2014 at 7:31

Common approach

There is option for tar program:

-I, --use-compress-program PROG
filter through PROG (must accept -d)

You can use multithread version of archiver or compressor utility.

$ tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 paths_to_archive
$ tar --use-compress-program=pigz -cf OUTPUT_FILE.tar.gz paths_to_archive

Archiver must accept -d. If your replacement utility doesn't have this parameter and/or you need to specify additional parameters, then use pipes (add parameters if necessary):

$ tar cf - paths_to_archive | pbzip2 > OUTPUT_FILE.tar.bz2
$ tar cf - paths_to_archive | pigz > OUTPUT_FILE.tar.gz


Input and output of singlethread and multithread are compatible. You can compress using multithread version and decompress using singlethread version and vice versa.
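
A round trip makes the compatibility claim concrete. The sketch below assumes a tar that supports --use-compress-program (GNU tar does) and falls back to gzip when pigz is not installed; the directory and file names are made up.

```shell
comp=gzip
command -v pigz >/dev/null 2>&1 && comp=pigz

work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
echo "sample" > "$work/data/file.txt"

# Create: tar pipes the archive through "$comp".
tar --use-compress-program="$comp" -cf "$work/data.tar.gz" -C "$work" data
# Extract: tar runs the same program with -d added, which is why the
# replacement compressor must accept -d.
tar --use-compress-program="$comp" -xf "$work/data.tar.gz" -C "$work/restore"
# Cross-check with plain single-threaded gzip.
gzip -dc "$work/data.tar.gz" | tar -tf -
```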

p7zip

For p7zip for compression you need a small shell script like the following:

#!/bin/sh
case $1 in
  -d) 7za -txz -si -so e;;
   *) 7za -txz -si -so a .;;
esac 2>/dev/null

Save it as 7zhelper.sh. Here the example of usage:

$ tar -I 7zhelper.sh -cf OUTPUT_FILE.tar.7z paths_to_archive
$ tar -I 7zhelper.sh -xf OUTPUT_FILE.tar.7z

xz

Regarding multithreaded XZ support. If you are running version 5.2.0 or above of XZ Utils, you can utilize multiple cores for compression by setting -T or --threads to an appropriate value via the environmental variable XZ_DEFAULTS (e.g. XZ_DEFAULTS="-T 0" ).

This is a fragment of man for 5.1.0alpha version:

Multithreaded compression and decompression are not implemented yet, so this option has no effect for now.

However this will not work for decompression of files that haven't also been compressed with threading enabled. From man for version 5.2.2:

Threaded decompression hasn't been implemented yet. It will only work on files that contain multiple blocks with size information in block headers. All files compressed in multi-threaded mode meet this condition, but files compressed in single-threaded mode don't even if --block-size=size is used.

Recompiling with replacement

If you build tar from sources, then you can recompile with parameters

--with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip

After recompiling tar with these options you can check the output of tar's help:

$ tar --help | grep "lbzip2\|plzip\|pigz"
-j, --bzip2                filter the archive through lbzip2
--lzip                 filter the archive through plzip
-z, --gzip, --gunzip, --ungzip   filter the archive through pigz


user1985657 , Apr 28, 2015 at 20:57

I just found pbzip2 and mpibzip2 . mpibzip2 looks very promising for clusters or if you have a laptop and a multicore desktop computer for instance. – user1985657 Apr 28 '15 at 20:57

oᴉɹǝɥɔ , Jun 10, 2015 at 17:39

Processing STDIN may in fact be slower. – oᴉɹǝɥɔ Jun 10 '15 at 17:39

selurvedu , May 26, 2016 at 22:13

Plus 1 for xz option. It is the simplest, yet effective approach. – selurvedu May 26 '16 at 22:13

panticz.de , Sep 1, 2014 at 15:02

You can use the shortcut -I for tar's --use-compress-program switch, and invoke pbzip2 for bzip2 compression on multiple cores:
tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 DIRECTORY_TO_COMPRESS/


einpoklum , Feb 11, 2017 at 15:59

A nice TL;DR for @MaximSuslov's answer . – einpoklum Feb 11 '17 at 15:59
If you want to have more flexibility with filenames and compression options, you can use:
find /my/path/ -type f \( -name "*.sql" -o -name "*.log" \) -exec \
tar -P --transform='s@/my/path/@@g' -cf - {} + | \
pigz -9 -p 4 > myarchive.tar.gz

Step 1: find

find /my/path/ -type f \( -name "*.sql" -o -name "*.log" \) -exec

This command will look for the files you want to archive, in this case /my/path/*.sql and /my/path/*.log . Add as many -o -name "pattern" as you want inside the escaped parentheses, which group the name tests so that -type f and -exec apply to all of them.

-exec will execute the next command using the results of find : tar

Step 2: tar

tar -P --transform='s@/my/path/@@g' -cf - {} +

--transform is a simple string replacement parameter. It will strip the path of the files from the archive so the tarball's root becomes the current directory when extracting. Note that you can't use -C option to change directory as you'll lose benefits of find : all files of the directory would be included.

-P tells tar to use absolute paths, so it doesn't trigger the warning "Removing leading `/' from member names". The leading '/' will be removed by --transform anyway.

-cf - tells tar to create an archive and write it to standard output, so the tarball name is the one given later by the shell redirection

{} + passes every file that find found previously to tar

Step 3: pigz

pigz -9 -p 4

Use as many parameters as you want. In this case -9 is the compression level and -p 4 is the number of cores dedicated to compression. If you run this on a heavily loaded webserver, you probably don't want to use all available cores.

Step 4: archive name

> myarchive.tar.gz

Finally.
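
One caveat with -exec ... {} + is that find may run tar more than once when the file list is very long. A hedged alternative sketch, not the answer's own method: feed a NUL-separated file list to a single tar process with --null -T - (GNU tar options). archive_matching is a hypothetical helper name, and gzip is used when pigz is missing.

```shell
# archive_matching SRC_DIR OUT_FILE : archive only *.sql and *.log files.
archive_matching() {
  src=$1
  out=$2
  comp=gzip
  command -v pigz >/dev/null 2>&1 && comp=pigz
  (
    cd "$src" || exit 1
    # The escaped parentheses group the -name tests so that -type f
    # and -print0 apply to both patterns.
    find . -type f \( -name '*.sql' -o -name '*.log' \) -print0 |
      tar -cf - --null -T -
  ) | "$comp" -9 > "$out"
}

demo=$(mktemp -d)
mkdir "$demo/in"
touch "$demo/in/db.sql" "$demo/in/app.log" "$demo/in/notes.txt"
archive_matching "$demo/in" "$demo/out.tar.gz"
tar -tzf "$demo/out.tar.gz"
```

Running find from inside the source directory keeps member names relative, so no --transform is needed in this variant.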

#### [Jun 23, 2019] Test with rsync between two partitions

###### Jun 23, 2019 | www.fsarchiver.org

An important test is done using rsync. It requires two partitions: the original one, and a spare partition on which to restore the archive. It shows whether or not there are differences between the original and the restored filesystem. rsync is able to compare both the file contents and the file attributes (timestamps, permissions, owner, extended attributes, ACLs), so that's a very good test. The following command can be used to find out whether or not files are the same (data and attributes) on two file-systems:

rsync -axHAXnP /mnt/part1/ /mnt/part2/


#### [Jun 22, 2019] Using SSH and Tmux for screen sharing by Seth Kenlon

###### Jun 22, 2019 | www.redhat.com

Tmux is a screen multiplexer, meaning that it provides your terminal with virtual terminals, allowing you to switch from one virtual session to another. Modern terminal emulators feature a tabbed UI, making the use of Tmux seem redundant, but Tmux has a few peculiar features that still prove difficult to match without it.

First of all, you can launch Tmux on a remote machine, start a process running, detach from Tmux, and then log out. In a normal terminal, logging out would end the processes you started. Since those processes were started in Tmux, they persist even after you leave.

Secondly, Tmux can "mirror" its session on multiple screens. If two users log into the same Tmux session, then they both see the same output on their screens in real time.

Tmux is a lightweight, simple, and effective solution in cases where you're training someone remotely, debugging a command that isn't working for them, reviewing text, monitoring services or processes, or just avoiding the ten minutes it sometimes takes to read commands aloud over a phone clearly enough that your user is able to accurately type them.

To try this option out, you must have two computers. Assume one computer is owned by Alice, and the other by Bob. Alice remotely logs into Bob's PC and launches a Tmux session:

alice$ ssh bob.local
alice$ tmux

On his PC, Bob starts Tmux, attaching to the same session:

bob$ tmux attach

When Alice types, Bob sees what she is typing, and when Bob types, Alice sees what he's typing.

It's a simple but effective trick that enables interactive live sessions between computer users, but it is entirely text-based.

Collaboration

With these two applications, you have access to some powerful methods of supporting users. You can use these tools to manage systems remotely, as training tools, or as support tools, and in every case, it sure beats wandering around the office looking for somebody's desk. Get familiar with SSH and Tmux, and start using them today.

#### [Jun 10, 2019] Screen Command Examples To Manage Multiple Terminal Sessions

###### Jun 10, 2019 | www.ostechnix.com

Screen Command Examples To Manage Multiple Terminal Sessions by sk · Published June 6, 2019 · Updated June 7, 2019

GNU Screen is a terminal multiplexer (window manager). As the name says, Screen multiplexes the physical terminal between multiple interactive shells, so we can perform different tasks in each terminal session. All screen sessions run their programs completely independently. So, a program or process running inside a screen session will keep running even if the session is accidentally closed or disconnected. For instance, when upgrading an Ubuntu server via SSH, Screen will keep the upgrade process running in case your SSH session is terminated for any reason.

GNU Screen allows us to easily create multiple screen sessions, switch between different sessions, copy text between sessions, attach to or detach from a session at any time and so on. It is one of the important command line tools every Linux admin should learn and use wherever necessary. In this brief guide, we will see the basic usage of the Screen command with examples in Linux.

Installing GNU Screen

GNU Screen is available in the default repositories of most Linux operating systems.

To install GNU Screen on Arch Linux, run:

$ sudo pacman -S screen


On Debian, Ubuntu, Linux Mint:

$ sudo apt-get install screen

On Fedora:

$ sudo dnf install screen


On RHEL, CentOS:

$ sudo yum install screen

On SUSE/openSUSE:

$ sudo zypper install screen


Let us go ahead and see some screen command examples.

Screen Command Examples To Manage Multiple Terminal Sessions

The default prefix shortcut to all commands in Screen is Ctrl+a . You need to use this shortcut a lot when using Screen. So, just remember this keyboard shortcut.

Create new Screen session

Let us create a new Screen session and attach to it. To do so, type the following command in terminal:

screen


Now, run any program or process inside this session. The running process or program will keep running even if you're disconnected from this session.

Detach from Screen sessions

To detach from inside a screen session, press Ctrl+a and d . You don't have to press both key combinations at the same time. First press Ctrl+a and then press d . After detaching from a session, you will see output like the following.

[detached from 29149.pts-0.sk]


Here, 29149 is the screen ID and pts-0.sk is the name of the screen session. You can attach, detach and kill Screen sessions using either screen ID or name of the respective session.

Create a named session

You can also create a screen session with any custom name of your choice other than the default username like below.

screen -S ostechnix


The above command will create a new screen session with name "xxxxx.ostechnix" and attach to it immediately. To detach from the current session, press Ctrl+a followed by d .

Naming screen sessions can be helpful when you want to find which processes are running in which sessions. For example, when you set up a LAMP stack inside a session, you can simply name it like below.

screen -S lampstack

Create detached sessions

Sometimes, you might want to create a session, but don't want to attach it automatically. In such cases, run the following command to create detached session named "senthil" :

screen -S senthil -d -m


Or, shortly:

screen -dmS senthil


The above command will create a session called "senthil", but won't attach to it.

List Screen sessions

To list all running sessions (attached or detached), run:

screen -ls


Sample output:

There are screens on:
29700.senthil	(Detached)
29415.ostechnix	(Detached)
29149.pts-0.sk	(Detached)
3 Sockets in /run/screens/S-sk.


As you can see, I have three running sessions and all are detached.

Attach to Screen sessions

If you want to attach to a session at any time, for example 29415.ostechnix , simply run:

screen -r 29415.ostechnix


Or,

screen -r ostechnix


Or, just use the screen ID:

screen -r 29415


To verify if we are attached to the aforementioned session, simply list the open sessions and check.

screen -ls


Sample output:

There are screens on:
29700.senthil   (Detached)
29415.ostechnix (Attached)
29149.pts-0.sk  (Detached)
3 Sockets in /run/screens/S-sk.


As you see in the above output, we are currently attached to 29415.ostechnix session. To exit from the current session, press ctrl+a, d.

Create nested sessions

When we run "screen" command, it will create a single session for us. We can, however, create nested sessions (a session inside a session).

First, create a new session or attach to an opened session. I am going to create a new session named "nested".

screen -S nested


Now, press Ctrl+a and c inside the session to create another session (in Screen's own terminology, a new window inside the same session). Just repeat this to create any number of nested Screen sessions. Each session will be assigned a number. The numbers start from 0 .

You can move to the next session by pressing Ctrl+a n and move to the previous one by pressing Ctrl+a p .

Here is the list of important Keyboard shortcuts to manage nested sessions.

• Ctrl+a " – List all sessions
• Ctrl+a 0 – Switch to session number 0
• Ctrl+a n – Switch to next session
• Ctrl+a p – Switch to the previous session
• Ctrl+a S – Split current region horizontally into two regions
• Ctrl+a | – Split current region vertically into two regions
• Ctrl+a Q – Close all regions except the current one
• Ctrl+a X – Close the current region
• Ctrl+a \ – Kill all windows of the current session and terminate Screen
• Ctrl+a ? – Show keybindings. To quit this, press ENTER.
Lock sessions

Screen has an option to lock a screen session. To do so, press Ctrl+a and x . The session is then locked, and you need to enter your Linux password to unlock it.

Screen used by sk <sk> on ubuntuserver.

Logging sessions

You might want to log everything when you're in a Screen session. To do so, just press Ctrl+a and H .

Alternatively, you can enable the logging when starting a new session using -L parameter.

screen -L


From now on, all activities you've done inside the session will be recorded and stored in a file named screenlog.x in your $HOME directory. Here, x is a number. You can view the contents of the log file using the cat command or any text viewer application.

Kill Screen sessions

If a session is not required anymore, just kill it. To kill a detached session named "senthil":

screen -r senthil -X quit

Or,

screen -X -S senthil quit

Or,

screen -X -S 29415 quit

If there are no open sessions, you will see the following output:

$ screen -ls
No Sockets found in /run/screens/S-sk.


For more details, refer to the man pages.

$ man screen

There is also a similar command line utility named "Tmux" which does the same job as GNU Screen.

#### [Mar 13, 2019] Getting started with the cat command by Alan Formy-Duval

###### Mar 13, 2019 | opensource.com

Cat can also number a file's lines during output. There are two commands to do this, as shown in the help documentation:

-b, --number-nonblank number nonempty output lines, overrides -n
-n, --number number all output lines

If I use the -b command with the hello.world file, the output will be numbered like this:

$ cat -b hello.world
1 Hello World !

In the example above, there is an empty line. We can determine why this empty line appears by using the -n argument:

$ cat -n hello.world
1 Hello World !
2
$

Now we see that there is an extra empty line. These two arguments are operating on the final output rather than the file contents, so if we were to use the -n option with both files, numbering will count lines as follows:


$ cat -n hello.world goodbye.world
1 Hello World !
2
3 Good Bye World !
4
$
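
The difference between the two numbering options can be checked on a scratch copy of the file. This is a sketch that assumes hello.world holds one line of text followed by one empty line, as the numbered output above suggests; the temporary directory is only for the demonstration.

```shell
tmp=$(mktemp -d)
# One line of text followed by an empty line.
printf 'Hello World !\n\n' > "$tmp/hello.world"

cat -b "$tmp/hello.world"   # numbers only the non-empty line
cat -n "$tmp/hello.world"   # numbers both lines, revealing the empty one
```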

One other option that can be useful is -s for squeeze-blank . This argument tells cat to reduce repeated empty line output down to one line. This is helpful when reviewing files that have a lot of empty lines, because it effectively fits more text on the screen. Suppose I have a file with three lines that are spaced apart by several empty lines, such as in this example, greetings.world :

$ cat greetings.world
Greetings World !


Take me to your Leader !


We Come in Peace !
$
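
The effect of -s can be measured by counting lines before and after squeezing. In this sketch the number of empty lines between the greetings (three) is an assumption, since the article only says "several".

```shell
tmp=$(mktemp -d)
printf 'Greetings World !\n\n\n\nTake me to your Leader !\n\n\n\nWe Come in Peace !\n' \
  > "$tmp/greetings.world"

wc -l < "$tmp/greetings.world"           # line count of the original file
cat -s "$tmp/greetings.world" | wc -l    # line count after squeezing blank runs
```

Each run of empty lines collapses to a single empty line, so the three text lines stay separated.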

Using the -s option saves screen space:

$ cat -s greetings.world

Cat is often used to copy contents of one file to another file. You may be asking, "Why not just use cp ?" Here is how I could create a new file, called both.files , that contains the contents of the hello and goodbye files:

$ cat hello.world goodbye.world > both.files
$ cat both.files
Hello World !

Good Bye World !

$
zcat

There is another variation on the cat command known as zcat . This command is capable of displaying files that have been compressed with Gzip without needing to uncompress the files with the gunzip command. As an aside, this also preserves disk space, which is the entire reason files are compressed!

The zcat command is a bit more exciting because it can be a huge time saver for system administrators who spend a lot of time reviewing system log files. Where can we find compressed log files? Take a look at /var/log on most Linux systems. On my system, /var/log contains several files, such as syslog.2.gz and syslog.3.gz . These files are the result of the log management system, which rotates and compresses log files to save disk space and prevent logs from growing to unmanageable file sizes. Without zcat , I would have to uncompress these files with the gunzip command before viewing them. Thankfully, I can use zcat :

$ cd /var/log
$ ls *.gz
syslog.2.gz syslog.3.gz
$ zcat syslog.2.gz | more
Jan 30 00:02:26 workstation systemd[1850]: Starting GNOME Terminal Server...
Jan 30 00:02:26 workstation dbus-daemon[1920]: [session uid=2112 pid=1920] Successfully activated service 'org.gnome.Terminal'
Jan 30 00:02:26 workstation systemd[1850]: Started GNOME Terminal Server.
Jan 30 00:02:26 workstation org.gnome.Terminal.desktop[2059]: # watch_fast: "/org/gnome/terminal/legacy/" (establishing: 0, active: 0)
Jan 30 00:02:26 workstation org.gnome.Terminal.desktop[2059]: # unwatch_fast: "/org/gnome/terminal/legacy/" (active: 0, establishing: 1)
Jan 30 00:02:26 workstation org.gnome.Terminal.desktop[2059]: # watch_established: "/org/gnome/terminal/legacy/" (establishing: 0)
--More--

We can also pass both files to zcat if we want to review both of them uninterrupted. Due to how log rotation works, you need to pass the filenames in reverse order to preserve the chronological order of the log contents:

$ ls -l *.gz
-rw-r----- 1 syslog adm  196383 Jan 31 00:00 syslog.2.gz
-rw-r----- 1 syslog adm 1137176 Jan 30 00:00 syslog.3.gz
$ zcat syslog.3.gz syslog.2.gz | more
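The ordering rule for rotated logs can be checked with two tiny stand-in files. This is a sketch under stated assumptions: the file names mirror the listing above, but the log contents are invented placeholders.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Simulate two rotated logs: the higher number holds the older entries.
printf 'Jan 29 entry\n' | gzip -c > syslog.3.gz
printf 'Jan 30 entry\n' | gzip -c > syslog.2.gz

# Pass the oldest file first so the combined stream reads chronologically.
combined=$(zcat syslog.3.gz syslog.2.gz)
printf '%s\n' "$combined"

# gzip -dc is an equivalent spelling on systems where zcat is unavailable.
gzip -dc syslog.3.gz syslog.2.gz | grep -c 'entry'
```

Neither command modifies the compressed files on disk, so the space savings from rotation are preserved.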

The cat command seems simple but is very useful. I use it regularly. You also don't need to feed or pet it like a real cat. As always, I suggest you review the man pages (man cat) for the cat and zcat commands to learn more about how they can be used. You can also use the --help argument for a quick synopsis of command line arguments.

Victorhck on 13 Feb 2019 Permalink

and there's also a "tac" command, that is just a "cat" upside down!

~~~~~

tac both.files
Good Bye World!
Hello World!
~~~~~
Happy hacking! :)
Johan Godfried on 26 Feb 2019 Permalink

Interesting article but please don't misuse cat to pipe to more......

I am trying to teach people to use fewer pipes and here you go abusing cat to pipe to other commands. IMHO, 99.9% of the time this is not necessary!

Instead of "cat file | command" most of the time, you can use "command file" (yes, I am an old dinosaur from a time where memory was very expensive and forking multiple commands could fill it all up)
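The commenter's point can be illustrated with three equivalent ways to count lines. This is a small sketch; the sample file and variable names are made up for the illustration.

```shell
#!/bin/sh
sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"

# Extra process and pipe: cat only forwards the bytes.
via_cat=$(cat "$sample" | wc -l)

# The command opens the file itself (awk strips the file name wc prints).
via_arg=$(wc -l "$sample" | awk '{print $1}')

# The shell redirects stdin: no cat, and no file name in the output.
via_redirect=$(wc -l < "$sample")

echo "$via_cat $via_arg $via_redirect"
rm -f "$sample"
```

All three give the same count; the second and third forms simply start one process fewer.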

Uri Ran on 03 Mar 2019 Permalink

Run cat, then press keys to see the codes your shortcuts send. (Press Ctrl+C to kill the cat when you're done.)

For example, on my Mac, the key combination option-leftarrow is ^[^[[D and command-downarrow is ^[[B.

I learned it from https://stackoverflow.com/users/787216/lolesque in his answer to https://stackoverflow.com/questions/12382499/looking-for-altleftarrowkey...

Geordie on 04 Mar 2019 Permalink

cat is also useful to make (or append to) text files without an editor:

$ cat >> foo << "EOF"
> Hello World
> Another Line
> EOF
$
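One detail of the here-document above is worth a quick experiment: quoting the delimiter ("EOF") makes the shell take the body literally, while an unquoted delimiter allows variable expansion. The sketch below assumes a scratch file and a hypothetical variable called name.

```shell
#!/bin/sh
out=$(mktemp)
name=World

# Quoted delimiter: the body is taken literally, $name is not expanded.
cat >> "$out" << "EOF"
Hello $name
EOF

# Unquoted delimiter: the shell expands $name before cat sees the text.
cat >> "$out" << EOF
Hello $name
EOF

cat "$out"
```

The quoted form is the safer default when the text being written contains dollar signs or backticks, for example when generating another script.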

#### [Mar 01, 2019] Emergency reboot/shutdown using SysRq by Ilija Matoski

###### peakoilbarrel.com
As you know, Linux implements a mechanism to shut down and reboot gracefully: the daemons are stopped, usually one by one, and the file cache is synced to disk.

But sometimes the system will not reboot or shut down no matter how many times you issue the shutdown or reboot command.

If the server is close to you, you can always do a physical reset. But what if it is far away, where you can't reach it and a physical reset is not feasible? Or what if the OpenSSH server crashes and you cannot log in to the system again?

If you ever find yourself in a situation like that, there is another option to force the system to reboot or shutdown.

The magic SysRq key is a key combination understood by the Linux kernel, which allows the user to perform various low-level commands regardless of the system's state. It is often used to recover from freezes, or to reboot a computer without corrupting the filesystem.

| Description | QWERTY |
|---|---|
| Immediately reboot the system, without unmounting or syncing filesystems | b |
| Sync all mounted filesystems | s |
| Shut off the system | o |
| Send the SIGKILL signal to all processes except init | i |

So if you are in a situation where you cannot reboot or shutdown the server, you can force an immediate reboot by issuing

echo 1 > /proc/sys/kernel/sysrq
echo b > /proc/sysrq-trigger


If you want you can also force a sync before rebooting by issuing these commands

echo 1 > /proc/sys/kernel/sysrq
echo s > /proc/sysrq-trigger
echo b > /proc/sysrq-trigger


These are called magic commands, and they're pretty much synonymous with holding down Alt-SysRq and another key on older keyboards. Writing 1 into /proc/sys/kernel/sysrq tells the kernel to enable all SysRq functions (many distributions disable or restrict them by default). Writing b to /proc/sysrq-trigger is equivalent to pressing Alt-SysRq-b on a QWERTY keyboard.
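Because these commands act immediately, it can help to rehearse the sequence before relying on it. The following is a minimal sketch with a dry-run guard; SYSRQ_DRY_RUN and sysrq_send are hypothetical names invented for this example, not part of the kernel interface or any standard tool.

```shell
#!/bin/sh
# Sketch of the sync-then-reboot sequence with a dry-run guard.
# SYSRQ_DRY_RUN and sysrq_send are illustrative names only.
SYSRQ_DRY_RUN=${SYSRQ_DRY_RUN:-1}

sysrq_send() {
    # $1 is the single-letter SysRq command, e.g. s (sync) or b (reboot).
    if [ "$SYSRQ_DRY_RUN" = "1" ]; then
        echo "would run: echo $1 > /proc/sysrq-trigger"
    else
        echo "$1" > /proc/sysrq-trigger
    fi
}

# Show the current setting when the file is readable (Linux only).
if [ -r /proc/sys/kernel/sysrq ]; then
    echo "kernel.sysrq = $(cat /proc/sys/kernel/sysrq)"
fi

sysrq_send s   # flush dirty buffers to disk first
sysrq_send b   # then reboot immediately (only printed in dry-run mode)
```

With the default setting the script only prints what it would do. Running it as root with SYSRQ_DRY_RUN=0 performs the real sync and reboot, so that mode should be reserved for genuine emergencies.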

If you want to keep SysRq enabled all the time, you can do that with an entry in your server's sysctl.conf:

echo "kernel.sysrq = 1" >> /etc/sysctl.conf
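Appending with >> adds a duplicate line every time the command is repeated. A slightly more careful variant, sketched below, appends only when no kernel.sysrq entry exists yet. The helper name enable_sysrq_persistently is hypothetical, and the target file is a parameter so the logic can be tried on a scratch file first.

```shell
#!/bin/sh
# enable_sysrq_persistently is an illustrative helper, not a standard tool.
enable_sysrq_persistently() {
    conf=$1
    if grep -Eq '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf" 2>/dev/null; then
        echo "kernel.sysrq already set in $conf"
    else
        echo "kernel.sysrq = 1" >> "$conf"
    fi
}

scratch=$(mktemp)
enable_sysrq_persistently "$scratch"
enable_sysrq_persistently "$scratch"   # second call must not add a duplicate
grep -c 'kernel.sysrq' "$scratch"
```

On a real system the call would be enable_sysrq_persistently /etc/sysctl.conf, run as root, followed by sysctl -p to load the setting without waiting for a reboot.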


#### [Feb 11, 2019] Resuming rsync on an interrupted transfer

###### May 15, 2013 | stackoverflow.com

Glitches , May 15, 2013 at 18:06

I am trying to back up my file server to a remote file server using rsync. Rsync is not successfully resuming when a transfer is interrupted. I used the partial option but rsync doesn't find the file it already started because it renames it to a temporary file and when resumed it creates a new file and starts from beginning.

Here is my command:

rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"

When this command is run, a backup file named OldDisk.dmg from my local machine gets created on the remote machine as something like .OldDisk.dmg.SjDndj23 .

Now when the internet connection gets interrupted and I have to resume the transfer, I have to find where rsync left off by finding the temp file like .OldDisk.dmg.SjDndj23 and rename it to OldDisk.dmg so that it sees there already exists a file that it can resume.

How do I fix this so I don't have to manually intervene each time?

Richard Michael , Nov 6, 2013 at 4:26

TL;DR : Use --timeout=X (X in seconds) to change the default rsync server timeout, not --inplace .

The issue is the rsync server processes (of which there are two, see rsync --server ... in ps output on the receiver) continue running, to wait for the rsync client to send data.

If the rsync server processes do not receive data for a sufficient time, they will indeed time out, self-terminate and clean up by moving the temporary file to its "proper" name (e.g., no temporary suffix). You'll then be able to resume.

If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place; but rather, delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not SIGKILL (e.g., -9 ), but SIGTERM (e.g., pkill -TERM -x rsync - only an example, you should take care to match only the rsync processes concerned with your client).

Fortunately there is an easier way: use the --timeout=X (X in seconds) option; it is passed to the rsync server processes as well.

For example, if you specify rsync ... --timeout=15 ... , both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming.
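Since a short --timeout makes both ends exit cleanly and leaves the partial file ready for resuming, the manual "run it again" step can be wrapped in a loop. The sketch below is one possible way to do that, not something from the original answer: retry_until_ok is a hypothetical helper, and the host and paths in the comment are placeholders.

```shell
#!/bin/sh
# retry_until_ok is an illustrative helper: rerun a command until it
# succeeds or the attempt budget is used up.
retry_until_ok() {
    max_attempts=$1
    pause=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$pause"
    done
    echo "succeeded on attempt $attempt"
}

# Smoke check with a command that always succeeds.
retry_until_ok 3 0 true

# Intended use (host and paths are placeholders, not from a real setup):
#   retry_until_ok 20 30 rsync -avzP --timeout=15 -e "ssh -p 2222" \
#       /volume1/ user@backup-host:/backup/
```

The pause between attempts gives the server-side rsync processes time to hit their own timeout and move the temporary file into place before the next client connects.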

I'm not sure how long, by default, the various rsync processes will try to send/receive data before they die (it might vary with operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. Meaning, you could try to "ride out" the bad internet connection for 15-20 seconds.

If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process). Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process).

Interestingly, if you then clean up all rsync server processes (for example, stop your client, which will stop the new rsync servers, then SIGTERM the older rsync servers), it appears to merge (assemble) all the partial files into the new properly named file. So, imagine a long running partial copy which dies (and you think you've "lost" all the copied data), and a short running re-launched rsync (oops!). You can stop the second client, SIGTERM the first servers, it will merge the data, and you can resume.

Finally, a few short remarks:

• Don't use --inplace to work around this. You will undoubtedly have other problems as a result; see man rsync for the details.
• It's trivial, but -t in your rsync options is redundant, it is implied by -a .
• An already compressed disk image sent over rsync without compression might result in shorter transfer time (by avoiding double compression). However, I'm unsure of the compression techniques in both cases. I'd test it.
• As far as I understand --checksum / -c , it won't help you in this case. It affects how rsync decides if it should transfer a file. Though, after a first rsync completes, you could run a second rsync with -c to insist on checksums, to prevent the strange case that file size and modtime are the same on both sides, but bad data was written.

JamesTheAwesomeDude , Dec 29, 2013 at 16:50

Just curious: wouldn't SIGINT (aka ^C ) be 'politer' than SIGTERM ? – JamesTheAwesomeDude Dec 29 '13 at 16:50

Richard Michael , Dec 29, 2013 at 22:34

I didn't test how the server-side rsync handles SIGINT, so I'm not sure it will keep the partial file - you could check. Note that this doesn't have much to do with Ctrl-c ; it happens that your terminal sends SIGINT to the foreground process when you press Ctrl-c , but the server-side rsync has no controlling terminal. You must log in to the server and use kill . The client-side rsync will not send a message to the server (for example, after the client receives SIGINT via your terminal Ctrl-c ) - might be interesting though. As for anthropomorphizing, not sure what's "politer". :-) – Richard Michael Dec 29 '13 at 22:34

d-b , Feb 3, 2015 at 8:48

I just tried this timeout argument rsync -av --delete --progress --stats --human-readable --checksum --timeout=60 --partial-dir /tmp/rsync/ rsync://$remote:/ /src/ but then it timed out during the "receiving file list" phase (which in this case takes around 30 minutes). Setting the timeout to half an hour sort of defeats the purpose. Any workaround for this? – d-b Feb 3 '15 at 8:48

Cees Timmerman , Sep 15, 2015 at 17:10

@user23122 --checksum reads all data when preparing the file list, which is great for many small files that change often, but should be done on-demand for large files. – Cees Timmerman Sep 15 '15 at 17:10

#### [Feb 11, 2019] prsync command man page - pssh

##### Originally from Brent N. Chun ~ Intel Research Berkeley

###### Feb 11, 2019 | www.mankier.com

prsync -- parallel file sync program

Synopsis

prsync [-vAraz] [-h hosts_file] [-H [user@]host[:port]] [-l user] [-p par] [-o outdir] [-e errdir] [-t timeout] [-O options] [-x args] [-X arg] [-S args] local ... remote

Description

prsync is a program for copying files in parallel to a number of hosts using the popular rsync program. It provides features such as passing a password to ssh, saving output to files, and timing out.

Options

• -h host_file, --hosts host_file: Read hosts from the given host_file. Lines in the host file are of the form [user@]host[:port] and can include blank lines and comments (lines beginning with "#"). If multiple host files are given (the -h option is used more than once), then prsync behaves as though these files were concatenated together. If a host is specified multiple times, then prsync will connect the given number of times.
• -H [user@]host[:port], --host [user@]host[:port], -H "[user@]host[:port] [ [user@]host[:port] ... ]", --host "[user@]host[:port] [ [user@]host[:port] ... ]": Add the given host strings to the list of hosts. This option may be given multiple times, and may be used in conjunction with the -h option.
• -l user, --user user: Use the given username as the default for any host entries that don't specifically specify a user.
• -p parallelism, --par parallelism: Use the given number as the maximum number of concurrent connections.
• -t timeout, --timeout timeout: Make connections time out after the given number of seconds. With a value of 0, prsync will not timeout any connections.
• -o outdir, --outdir outdir: Save standard output to files in the given directory. Filenames are of the form [user@]host[:port][.num] where the user and port are only included for hosts that explicitly specify them. The number is a counter that is incremented each time for hosts that are specified more than once.
• -e errdir, --errdir errdir: Save standard error to files in the given directory. Filenames are of the same form as with the -o option.
• -x args, --extra-args args: Passes extra rsync command-line arguments (see the rsync(1) man page for more information about rsync arguments). This option may be specified multiple times. The arguments are processed to split on whitespace, protect text within quotes, and escape with backslashes. To pass arguments without such processing, use the -X option instead.
• -X arg, --extra-arg arg: Passes a single rsync command-line argument (see the rsync(1) man page for more information about rsync arguments). Unlike the -x option, no processing is performed on the argument, including word splitting. To pass multiple command-line arguments, use the option once for each argument.
• -O options, --options options: SSH options in the format used in the SSH configuration file (see the ssh_config(5) man page for more information). This option may be specified multiple times.
• -A, --askpass: Prompt for a password and pass it to ssh. The password may be used either to unlock a key or for password authentication. The password is transferred in a fairly secure manner (e.g., it will not show up in argument lists). However, be aware that a root user on your system could potentially intercept the password.
• -v, --verbose: Include error messages from rsync with the -i and \ options.
• -r, --recursive: Recursively copy directories.
• -a, --archive: Use rsync archive mode (rsync's -a option).
• -z, --compress: Use rsync compression.
• -S args, --ssh-args args: Passes extra SSH command-line arguments (see the ssh(1) man page for more information about SSH arguments). The given value is appended to the ssh command (rsync's -e option) without any processing.

Tips

The ssh_config file can include an arbitrary number of Host sections. Each host entry specifies ssh options which apply only to the given host. Host definitions can even behave like aliases if the HostName option is included. This ssh feature, in combination with pssh host files, provides a tremendous amount of flexibility.

Exit Status

The exit status codes from prsync are as follows:

• 0: Success
• 1: Miscellaneous error
• 2: Syntax or usage error
• 3: At least one process was killed by a signal or timed out.
• 4: All processes completed, but at least one rsync process reported an error (exit status other than 0).

Authors

Written by Brent N. Chun <bnc@theether.org> and Andrew McNabb <amcnabb@mcnabbs.org>.

https://github.com/lilydjwg/pssh

See Also

rsync(1), ssh(1), ssh_config(5), pssh(1), prsync(1), pslurp(1), pnuke(1)

#### [Jan 29, 2019] Backing things up with rsync

##### Notable quotes:

##### "... I RECURSIVELY DELETED ALL THE LIVE CORPORATE WEBSITES ON FRIDAY AFTERNOON AT 4PM! ..."

##### "... This is why it's ALWAYS A GOOD IDEA to use Midnight Commander or something similar to delete directories!! ..."

##### "... rsync with ssh as the transport mechanism works very well with my nightly LAN backups. I've found this page to be very helpful: http://www.mikerubel.org/computers/rsync_snapshots/ ..."
###### Jul 20, 2017 | www.linuxjournal.com

Anonymous on Fri, 11/08/2002 - 03:00.

The Subject, not the content, really brings back memories.

Imagine this: you're tasked with complete control over the network in a multi-million dollar company. You've had some experience in the real world of network maintenance, but mostly you've learned from breaking things at home.

Time comes to implement (yes this was a startup company) a backup routine. You carefully consider the best way to do it and decide copying data to a holding disk before the tape run would be perfect in the situation, faster restore if the holding disk is still alive.

So off you go configuring all your servers for ssh pass through, and create the rsync scripts. Then before the trial run you think it would be a good idea to create a local backup of all the websites.

You log on to the web server, create a temp directory and start testing your newly advanced rsync skills. After a couple of goes, you think you're ready for the real thing, but you decide to run the test one more time.

Everything seems fine so you delete the temp directory. You pause for a second and your mouth drops open wider than it has ever opened before, and a feeling of terror overcomes you. You want to hide in a hole and hope you didn't see what you saw.

I RECURSIVELY DELETED ALL THE LIVE CORPORATE WEBSITES ON FRIDAY AFTERNOON AT 4PM!

Anonymous on Sun, 11/10/2002 - 03:00.

This is why it's ALWAYS A GOOD IDEA to use Midnight Commander or something similar to delete directories!!

...Root for (5) years and never trashed a filesystem yet (knockwoody)...

Anonymous on Fri, 11/08/2002 - 03:00.

rsync with ssh as the transport mechanism works very well with my nightly LAN backups. I've found this page to be very helpful: http://www.mikerubel.org/computers/rsync_snapshots/

#### [Dec 05, 2018] How can I scroll up to see the past output in PuTTY?
###### Dec 05, 2018 | superuser.com

user1721949 , Dec 12, 2012 at 8:32

I have a script which, when I run it from PuTTY, scrolls the screen. Now, I want to go back to see the errors, but when I scroll up, I can see the past commands, but not the output of the command.

How can I see the past output?

Rico , Dec 13, 2012 at 8:24

Shift+PgUp/PgDn should work for scrolling without using the scrollbar.

user530079 , Jul 12, 2017 at 21:45

If shift pageup/pagedown fails, try this command: "reset", which seems to correct the display. – user530079 Jul 12 '17 at 21:45

RedGrittyBrick , Dec 12, 2012 at 9:31

If you don't pipe the output of your commands into something like less, you will be able to use Putty's scroll-bars to view earlier output.

Putty has settings for how many lines of past output it retains in its buffer.

If you use something like less, the output doesn't get into Putty's scroll buffer.

David Dai , Dec 14, 2012 at 3:31

why is putty different with the native linux console at this point? – David Dai Dec 14 '12 at 3:31

konradstrack , Dec 12, 2012 at 9:52

I would recommend using screen if you want to have good control over the scroll buffer on a remote shell.

You can change the scroll buffer size to suit your needs by setting:

defscrollback 4000

in ~/.screenrc, which will specify the number of lines you want to be buffered (4000 in this case).

Then you should run your script in a screen session, e.g. by executing screen ./myscript.sh or first executing screen and then ./myscript.sh inside the session.

It's also possible to enable logging of the console output to a file. You can find more info on the screen's man page.

,

From your description, it sounds like the "problem" is that you are using screen, tmux, or another window manager dependent on them (byobu). Normally you should be able to scroll back in putty with no issue. Exceptions include if you are in an application like less or nano that creates its own "window" on the terminal.

With screen and tmux you can generally scroll back with SHIFT + PGUP (same as you could from the physical terminal of the remote machine). They also both have a "copy" mode that frees the cursor from the prompt and lets you use arrow keys to move it around (for selecting text to copy with just the keyboard). It also lets you scroll up and down with the PGUP and PGDN keys. Copy mode under byobu using screen or tmux backends is accessed by pressing F7 (careful, F6 disconnects the session). To do so directly under screen you press CTRL + a then ESC or [ . You can use ESC to exit copy mode. Under tmux you press CTRL + b then [ to enter copy mode and ] to exit.

The simplest solution, of course, is not to use either. I've found both to be quite a bit more trouble than they are worth. If you would like to use multiple different terminals on a remote machine simply connect with multiple instances of putty and manage your windows using, er... Windows. Now forgive me but I must flee before I am burned at the stake for my heresy.

EDIT: almost forgot, some keys may not be received correctly by the remote terminal if putty has not been configured correctly. In your putty config check Terminal -> Keyboard. You probably want the function keys and keypad set to be either Linux or Xterm R6. If you are seeing strange characters on the terminal when attempting the above this is most likely the problem.

#### [Nov 21, 2018] Linux Shutdown Command 5 Practical Examples Linux Handbook

###### Nov 21, 2018 | linuxhandbook.com

Restart the system with shutdown command

There is a separate reboot command but you don't need to learn a new command just for rebooting the system. You can use the Linux shutdown command for rebooting as well.

To reboot a system using the shutdown command, use the -r option.

sudo shutdown -r

The behavior is the same as the regular shutdown command. It's just that instead of a shutdown, the system will be restarted.

So, if you used shutdown -r without any time argument, it will schedule a reboot after one minute.

You can schedule reboots the same way you did with shutdown.

sudo shutdown -r +30

You can also reboot the system immediately with shutdown command:

sudo shutdown -r now

4. Broadcast a custom message

If you are in a multi-user environment and there are several users logged on the system, you can send them a custom broadcast message with the shutdown command.

By default, all the logged users will receive a notification about scheduled shutdown and its time. You can customize the broadcast message in the shutdown command itself:

sudo shutdown 16:00 "systems will be shutdown for hardware upgrade, please save your work"

Fun Stuff: You can use the shutdown command with -k option to initiate a 'fake shutdown'. It won't shut down the system but the broadcast message will be sent to all logged on users.

5. Cancel a scheduled shutdown

If you scheduled a shutdown, you don't have to live with it. You can always cancel a shutdown with option -c.

sudo shutdown -c

And if you had broadcast a message about the scheduled shutdown, as a good sysadmin, you might also want to notify other users about cancelling the scheduled shutdown.

sudo shutdown -c "planned shutdown has been cancelled"

Halt vs Power off

Halt (option -H): terminates all processes and shuts down the CPU.

Power off (option -P): Pretty much like halt but it also turns off the unit itself (lights and everything on the system).

Historically, the earlier computers used to halt the system and then print a message like "it's ok to power off now" and then the computers were turned off through physical switches.

These days, halt should automatically power off the system thanks to ACPI.

These were the most common and the most useful examples of the Linux shutdown command. I hope you have learned how to shut down a Linux system via command line.
#### [Nov 15, 2018] Is Glark a Better Grep Linux.com The source for Linux information

###### Nov 15, 2018 | www.linux.com

Is Glark a Better Grep?

GNU grep is one of my go-to tools on any Linux box. But grep isn't the only tool in town. If you want to try something a bit different, check out glark, a grep alternative that might be better in some situations.

What is glark?

Basically, it's a utility that's similar to grep, but it has a few features that grep does not. This includes complex expressions, Perl-compatible regular expressions, and excluding binary files. It also makes showing contextual lines a bit easier. Let's take a look.

I installed glark (yes, annoyingly it's yet another *nix utility that has no initial cap) on Linux Mint 11. Just grab it with apt-get install glark and you should be good to go.

Simple searches work the same way as with grep: glark string filenames. So it's pretty much a drop-in replacement for those.

But you're interested in what makes glark special. So let's start with a complex expression, where you're looking for this or that term:

glark -r -o thing1 thing2 *

This will search the current directory and subdirectories for "thing1" or "thing2." When the results are returned, glark will colorize the results and each search term will be highlighted in a different color. So if you search for, say "Mozilla" and "Firefox," you'll see the terms in different colors.

You can also use this to see if something matches within a few lines of another term. Here's an example:

glark --and=3 -o Mozilla Firefox -o ID LXDE *

This was a search I was using in my directory of Linux.com stories that I've edited. I used three terms I knew were in one story, and one term I knew wouldn't be. You can also just use the --and option to spot two terms within X number of lines of each other, like so:

glark --and=3 term1 term2

That way, both terms must be present.

You'll note the --and option is a bit simpler than grep's context line options. However, glark tries to stay compatible with grep, so it also supports the -A, -B and -C options from grep.

Miss the grep output format? You can tell glark to use grep format with the --grep option.

Most, if not all, GNU grep options should work with glark.

Before and After

If you need to search through the beginning or end of a file, glark has the --before and --after options (short versions, -b and -a). You can use these as percentages or as absolute number of lines. For instance:

glark -a 20 expression *

That will find instances of expression after line 20 in a file.

The glark Configuration File

Note that you can have a ~/.glarkrc that will set common options for each use of glark (unless overridden at the command line). The man page for glark does include some examples, like so:

after-context: 1
before-context: 6
context: 5
file-color: blue on yellow
highlight: off
ignore-case: false
quiet: yes
text-color: bold reverse
line-number-color: bold
verbose: false
grep: true

Just put that in your ~/.glarkrc and customize it to your heart's content. Note that I've set mine to grep: false and added the binary-files: without-match option. You'll definitely want the quiet option to suppress all the notes about directories, etc. See the man page for more options. It's probably a good idea to spend about 10 minutes on setting up a configuration file.

Final Thoughts

One thing that I have noticed is that glark doesn't seem as fast as grep. When I do a recursive search through a bunch of directories containing (mostly) HTML files, I seem to get results a lot faster with grep.
This is not terribly important for most of the stuff I do with either utility. However, if you're doing something where performance is a major factor, then you may want to see if grep fits the bill better. Is glark "better" than grep? It depends entirely on what you're doing. It has a few features that give it an edge over grep, and I think it's very much worth trying out if you've never given it a shot. #### [Nov 13, 2018] Resuming rsync partial (-P/--partial) on a interrupted transfer ##### Notable quotes: ##### "... should ..." ###### May 15, 2013 | stackoverflow.com Glitches , May 15, 2013 at 18:06 I am trying to backup my file server to a remove file server using rsync. Rsync is not successfully resuming when a transfer is interrupted. I used the partial option but rsync doesn't find the file it already started because it renames it to a temporary file and when resumed it creates a new file and starts from beginning. Here is my command: rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"  When this command is ran, a backup file named OldDisk.dmg from my local machine get created on the remote machine as something like .OldDisk.dmg.SjDndj23 . Now when the internet connection gets interrupted and I have to resume the transfer, I have to find where rsync left off by finding the temp file like .OldDisk.dmg.SjDndj23 and rename it to OldDisk.dmg so that it sees there already exists a file that it can resume. How do I fix this so I don't have to manually intervene each time? Richard Michael , Nov 6, 2013 at 4:26 TL;DR : Use  --timeout=X  (X in seconds) to change the default rsync server timeout, not --inplace . The issue is the rsync server processes (of which there are two, see rsync --server ...  in ps  output on the receiver) continue running, to wait for the rsync client to send data. 
If the rsync server processes do not receive data for a sufficient time, they will indeed timeout, self-terminate and cleanup by moving the temporary file to it's "proper" name (e.g., no temporary suffix). You'll then be able to resume. If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place; but rather, delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not SIGKILL  (e.g., -9 ), but SIGTERM  (e.g.,  pkill -TERM -x rsync  - only an example, you should take care to match only the rsync processes concerned with your client). Fortunately there is an easier way: use the --timeout=X  (X in seconds) option; it is passed to the rsync server processes as well. For example, if you specify  rsync ... --timeout=15 ... , both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming. I'm not sure of the default timeout value of the various rsync processes will try to send/receive data before they die (it might vary with operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. Meaning, you could try to "ride out" the bad internet connection for 15-20 seconds. If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process). 
Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process). Interestingly, if you then clean up all rsync server processes (for example, stop your client which will stop the new rsync servers, then SIGTERM  the older rsync servers, it appears to merge (assemble) all the partial files into the new proper named file. So, imagine a long running partial copy which dies (and you think you've "lost" all the copied data), and a short running re-launched rsync (oops!).. you can stop the second client,  SIGTERM  the first servers, it will merge the data, and you can resume. Finally, a few short remarks: • Don't use --inplace  to workaround this. You will undoubtedly have other problems as a result, man rsync  for the details. • It's trivial, but -t  in your rsync options is redundant, it is implied by -a . • An already compressed disk image sent over rsync without compression might result in shorter transfer time (by avoiding double compression). However, I'm unsure of the compression techniques in both cases. I'd test it. • As far as I understand  --checksum  / -c , it won't help you in this case. It affects how rsync decides if it should transfer a file. Though, after a first rsync completes, you could run a second rsync with -c  to insist on checksums, to prevent the strange case that file size and modtime are the same on both sides, but bad data was written. JamesTheAwesomeDude , Dec 29, 2013 at 16:50 Just curious: wouldn't  SIGINT  (aka ^C ) be 'politer' than  SIGTERM ? – JamesTheAwesomeDude Dec 29 '13 at 16:50 Richard Michael , Dec 29, 2013 at 22:34 I didn't test how the server-side rsync handles SIGINT, so I'm not sure it will keep the partial file - you could check. 
Note that this doesn't have much to do with Ctrl-c ; it happens that your terminal sends SIGINT to the foreground process when you press Ctrl-c , but the server-side rsync has no controlling terminal. You must log in to the server and use kill . The client-side rsync will not send a message to the server (for example, after the client receives SIGINT via your terminal Ctrl-c ) - might be interesting though. As for anthropomorphizing, not sure what's "politer". :-) – Richard Michael Dec 29 '13 at 22:34

d-b , Feb 3, 2015 at 8:48

I just tried this timeout argument rsync -av --delete --progress --stats --human-readable --checksum --timeout=60 --partial-dir /tmp/rsync/ rsync://$remote:/ /src/ but then it timed out during the "receiving file list" phase (which in this case takes around 30 minutes). Setting the timeout to half an hour kind of defeats the purpose. Any workaround for this? – d-b Feb 3 '15 at 8:48

Cees Timmerman , Sep 15, 2015 at 17:10

@user23122 --checksum  reads all data when preparing the file list, which is great for many small files that change often, but should be done on-demand for large files. – Cees Timmerman Sep 15 '15 at 17:10
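
The --timeout advice in the answer above lends itself to a retry loop around the client. The following is only a sketch, not part of the original thread: retry_transfer and the RETRY_DELAY variable are hypothetical names, and the rsync command line in the closing comment merely combines options already discussed here ( --partial , --timeout ) with placeholder host and paths.

```shell
# retry_transfer MAX_ATTEMPTS COMMAND [ARGS...]
# Re-run COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
# RETRY_DELAY (seconds between attempts, default 5) is a hypothetical knob.
retry_transfer() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        attempt=$((attempt + 1))
        # Pause only if another attempt is still to come.
        [ "$attempt" -le "$max_attempts" ] && sleep "${RETRY_DELAY:-5}"
    done
    return 1
}

# Example (host and paths are placeholders):
#   retry_transfer 10 rsync -a --partial --timeout=15 /src/ user@host:/dst/
```

Because the timeout makes both ends exit cleanly, each new attempt should find the partial file in place rather than competing with stale server processes.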

#### [Nov 08, 2018] Can rsync resume after being interrupted?

###### Sep 15, 2012 | unix.stackexchange.com

Tim , Sep 15, 2012 at 23:36

I used rsync to copy a large number of files, but my OS (Ubuntu) restarted unexpectedly.

After reboot, I ran rsync again, but from the output on the terminal, I found that rsync still copied those already copied before. But I heard that rsync is able to find differences between source and destination, and therefore to just copy the differences. So I wonder in my case if rsync can resume what was left last time?

Gilles , Sep 16, 2012 at 1:56

Yes, rsync won't copy again files that it's already copied. There are a few edge cases where its detection can fail. Did it copy all the already-copied files? What options did you use? What were the source and target filesystems? If you run rsync again after it's copied everything, does it copy again? – Gilles Sep 16 '12 at 1:56

Tim , Sep 16, 2012 at 2:30

@Gilles: Thanks! (1) I think I saw rsync copied the same files again from its output on the terminal. (2) Options are same as in my other post, i.e. sudo rsync -azvv /home/path/folder1/ /home/path/folder2 . (3) Source and target are both NTFS, but source is an external HDD, and target is an internal HDD. (4) It is now running and hasn't finished yet. – Tim Sep 16 '12 at 2:30

jwbensley , Sep 16, 2012 at 16:15

There is also the --partial flag to resume partially transferred files (useful for large files) – jwbensley Sep 16 '12 at 16:15

Tim , Sep 19, 2012 at 5:20

@Gilles: What are some "edge cases where its detection can fail"? – Tim Sep 19 '12 at 5:20

Gilles , Sep 19, 2012 at 9:25

@Tim Off the top of my head, there's at least clock skew, and differences in time resolution (a common issue with FAT filesystems which store times in 2-second increments, the --modify-window option helps with that). – Gilles Sep 19 '12 at 9:25

DanielSmedegaardBuus , Nov 1, 2014 at 12:32

First of all, regarding the "resume" part of your question, --partial just tells the receiving end to keep partially transferred files if the sending end disappears as though they were completely transferred.

While transferring files, they are temporarily saved as hidden files in their target folders (e.g. .TheFileYouAreSending.lRWzDC ), or a specifically chosen folder if you set the --partial-dir switch. When a transfer fails and --partial is not set, this hidden file will remain in the target folder under this cryptic name, but if --partial is set, the file will be renamed to the actual target file name (in this case, TheFileYouAreSending ), even though the file isn't complete. The point is that you can later complete the transfer by running rsync again with either --append or --append-verify .

So, --partial doesn't itself resume a failed or cancelled transfer. To resume it, you'll have to use one of the aforementioned flags on the next run. So, if you need to make sure that the target won't ever contain files that appear to be fine but are actually incomplete, you shouldn't use --partial . Conversely, if you want to make sure you never leave behind stray failed files that are hidden in the target directory, and you know you'll be able to complete the transfer later, --partial is there to help you.

With regards to the --append switch mentioned above, this is the actual "resume" switch, and you can use it whether or not you're also using --partial . Actually, when you're using --append , no temporary files are ever created. Files are written directly to their targets. In this respect, --append gives the same result as --partial on a failed transfer, but without creating those hidden temporary files.

So, to sum up, if you're moving large files and you want the option to resume a cancelled or failed rsync operation from the exact point that rsync stopped, you need to use the --append or --append-verify switch on the next attempt.

As @Alex points out below, since version 3.0.0 rsync now has a new option, --append-verify , which behaves like --append did before that switch existed. You probably always want the behaviour of --append-verify , so check your version with rsync --version . If you're on a Mac and not using rsync from homebrew , you'll (at least up to and including El Capitan) have an older version and need to use --append rather than --append-verify . Why they didn't keep the behaviour on --append and instead named the newcomer --append-no-verify is a bit puzzling. Either way, --append on rsync before version 3 is the same as --append-verify on the newer versions.

--append-verify isn't dangerous: It will always read and compare the data on both ends and not just assume they're equal. It does this using checksums, so it's easy on the network, but it does require reading the shared amount of data on both ends of the wire before it can actually resume the transfer by appending to the target.

Second of all, you said that you "heard that rsync is able to find differences between source and destination, and therefore to just copy the differences."

That's correct, and it's called delta transfer, but it's a different thing. To enable this, you add the -c , or --checksum switch. Once this switch is used, rsync will examine files that exist on both ends of the wire. It does this in chunks, compares the checksums on both ends, and if they differ, it transfers just the differing parts of the file. But, as @Jonathan points out below, the comparison is only done when files are of the same size on both ends -- different sizes will cause rsync to upload the entire file, overwriting the target with the same name.

This requires a bit of computation on both ends initially, but can be extremely efficient at reducing network load if for example you're frequently backing up very large fixed-size files that often contain minor changes. Examples that come to mind are virtual hard drive image files used in virtual machines or iSCSI targets.

It is notable that if you use --checksum to transfer a batch of files that are completely new to the target system, rsync will still calculate their checksums on the source system before transferring them. Why I do not know :)

So, in short:

If you're often using rsync to just "move stuff from A to B" and want the option to cancel that operation and later resume it, don't use --checksum , but do use --append-verify .

If you're using rsync to back up stuff often, using --append-verify probably won't do much for you, unless you're in the habit of sending large files that continuously grow in size but are rarely modified once written. As a bonus tip, if you're backing up to storage that supports snapshotting such as btrfs or zfs , adding the --inplace switch will help you reduce snapshot sizes since changed files aren't recreated but rather the changed blocks are written directly over the old ones. This switch is also useful if you want to avoid rsync creating copies of files on the target when only minor changes have occurred.

When using --append-verify , rsync will behave just like it always does on all files that are the same size. If they differ in modification or other timestamps, it will overwrite the target with the source without scrutinizing those files further. --checksum will compare the contents (checksums) of every file pair of identical name and size.

UPDATED 2015-09-01 Changed to reflect points made by @Alex (thanks!)

UPDATED 2017-07-14 Changed to reflect points made by @Jonathan (thanks!)

Alex , Aug 28, 2015 at 3:49

According to the documentation --append does not check the data, but --append-verify does. Also, as @gaoithe points out in a comment below, the documentation claims --partial does resume from previous files. – Alex Aug 28 '15 at 3:49

DanielSmedegaardBuus , Sep 1, 2015 at 13:29

Thank you @Alex for the updates. Indeed, since 3.0.0, --append no longer compares the source to the target file before appending. Quite important, really! --partial does not itself resume a failed file transfer, but rather leaves it there for a subsequent --append(-verify) to append to it. My answer was clearly misrepresenting this fact; I'll update it to include these points! Thanks a lot :) – DanielSmedegaardBuus Sep 1 '15 at 13:29

Cees Timmerman , Sep 15, 2015 at 17:21

This says --partial is enough. – Cees Timmerman Sep 15 '15 at 17:21

DanielSmedegaardBuus , May 10, 2016 at 19:31

@CMCDragonkai Actually, check out Alexander's answer below about --partial-dir -- looks like it's the perfect bullet for this. I may have missed something entirely ;) – DanielSmedegaardBuus May 10 '16 at 19:31

Jonathan Y. , Jun 14, 2017 at 5:48

What's your level of confidence in the described behavior of --checksum ? According to the man it has more to do with deciding which files to flag for transfer than with delta-transfer (which, presumably, is rsync 's default behavior). – Jonathan Y. Jun 14 '17 at 5:48
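
To make the "verify, then append" idea discussed in this thread concrete, here is a small sketch that uses only standard tools on local files. It is not how rsync implements --append-verify (rsync works with checksums over the network and falls back to a normal transfer on mismatch); resume_append is a hypothetical helper that only illustrates the principle.

```shell
# resume_append SRC DST
# Sketch of the idea behind an appending resume: check that the bytes
# already in DST match the start of SRC, then append only the missing tail.
# Returns non-zero (and leaves DST untouched) if the prefix does not match.
resume_append() {
    src=$1
    dst=$2
    have=$(($(wc -c < "$dst")))     # bytes already present in DST
    total=$(($(wc -c < "$src")))    # size of the complete source
    [ "$have" -le "$total" ] || return 1
    # Verify the shared prefix before trusting it.
    head -c "$have" "$src" | cmp -s - "$dst" || return 1
    # Append everything after the first $have bytes of SRC.
    tail -c "+$((have + 1))" "$src" >> "$dst"
}
```

Skipping the cmp step would correspond to blindly trusting the partial file, which is the behaviour the comments above warn about for plain --append on newer rsync versions.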

#### [Nov 08, 2018] collectl

###### Nov 08, 2018 | collectl.sourceforge.net
Collectl now supports OpenStack Clouds
Colmux now part of collectl package
Looking for colplot ? It's now here!

Remember, to get lustre support contact Peter Piela to get his custom plugin.


There are a number of times in which you find yourself needing performance data. These can include benchmarking, monitoring a system's general health or trying to determine what your system was doing at some time in the past. Sometimes you just want to know what the system is doing right now. Depending on what you're doing, you often end up using different tools, each designed for that specific situation.

Unlike most monitoring tools that either focus on a small set of statistics, format their output in only one way, or run either interactively or as a daemon but not both, collectl tries to do it all. You can choose to monitor any of a broad set of subsystems which currently include buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp.

The following is an example taken while writing a large file and running the collectl command with no arguments. By default it shows cpu, network and disk stats in brief format . The key point of this format is all output appears on a single line making it much easier to spot spikes or other anomalies in the output:

collectl

#<--------CPU--------><-----------Disks-----------><-----------Network---------->
#cpu sys inter  ctxsw KBRead  Reads  KBWrit Writes netKBi pkt-in  netKBo pkt-out
37  37   382    188      0      0   27144    254     45     68       3      21
25  25   366    180     20      4   31280    296      0      1       0       0
25  25   368    183      0      0   31720    275      2     20       0       1
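
Because brief format puts every sample on one line, it is easy to post-process with standard tools. The sketch below is an illustration, not something shipped with collectl: heavy_writes is a hypothetical helper, and it assumes the default cpu/disk/network column layout shown above, where KBWrit is the seventh field.

```shell
# heavy_writes THRESHOLD
# Read collectl brief-format lines on stdin and print the KBWrit value
# (7th column in the default layout shown above) of every sample that
# exceeds THRESHOLD. Header lines start with '#' and are skipped.
heavy_writes() {
    awk -v limit="$1" '
        /^#/ { next }
        NF >= 7 && $7 + 0 > limit + 0 { print "KBWrit=" $7 }
    '
}

# Example: collectl | heavy_writes 30000
```

If other subsystems are selected with -s, the column positions change, so the field number would need adjusting.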

In this example, taken while writing to an NFS mounted filesystem, collectl displays interrupts, memory usage and nfs activity with timestamps. Keep in mind that you can mix and match any data and in the case of brief format you simply need to have a window wide enough to accommodate your output.
collectl -sjmf -oT

#         <-------Int--------><-----------Memory-----------><------NFS Totals------>
#Time     Cpu0 Cpu1 Cpu2 Cpu3 Free Buff Cach Inac Slab  Map  Reads Writes Meta Comm
08:36:52  1001   66    0    0   2G 201M 609M 363M 219M 106M      0      0    5    0
08:36:53   999 1657    0    0   2G 201M   1G 918M 252M 106M      0  12622    0    2
08:36:54  1001 7488    0    0   1G 201M   1G   1G 286M 106M      0  20147    0    2

You can also display the same information in verbose format , in which case you get a single line for each type of data at the expense of more screen real estate, as can be seen in this example of network data during NFS writes. Note how you can actually see the network traffic stall while waiting for the server to physically write the data.
collectl -sn --verbose -oT

# NETWORK SUMMARY (/sec)
#          KBIn  PktIn SizeIn  MultI   CmpI  ErrIn  KBOut PktOut  SizeO   CmpO ErrOut
08:46:35   3255  41000     81      0      0      0 112015  78837   1454      0      0
08:46:36      0      9     70      0      0      0     29     25   1174      0      0
08:46:37      0      2     70      0      0      0      0      2    134      0      0

In this last example we see what detail format looks like, with multiple lines of output for a particular type of data, which in this case is interrupts. We've also elected to show the time in msecs.
collectl -sJ -oTm

#              Int    Cpu0   Cpu1   Cpu2   Cpu3   Type            Device(s)
08:52:32.002   225       0      4      0      0   IO-APIC-level   ioc0
08:52:32.002   000    1000      0      0      0   IO-APIC-edge    timer
08:52:32.002   014       0      0     18      0   IO-APIC-edge    ide0
08:52:32.002   090       0      0      0  15461   IO-APIC-level   eth1

Collectl output can also be saved in a rolling set of logs for later playback or displayed interactively in a variety of formats. If all that isn't enough there are plugins that allow you to report data in alternate formats or even send them over a socket to remote tools such as ganglia or graphite. You can even create files in space-separated format for plotting with external packages like gnuplot. The one below was created with colplot, part of the collectl utilities project, which provides a web-based interface to gnuplot.

Are you a big user of the top command? Have you ever wanted to look across a cluster to see what the top processes are? Better yet, how about using iostat across a cluster? Or maybe vmstat or even looking at top network interfaces across a cluster? Look no more because if collectl reports it for one node, colmux can do it across a cluster AND you can sort by any column of your choice by simply using the right/left arrow keys.

Collectl and Colmux run on all Linux distros and are available in Red Hat and Debian repositories, so getting them may be as simple as running yum or apt-get. Note that since colmux has just been merged into the collectl V4.0.0 package it may not yet be available in the repository of your choice and you should install collectl-utils V4.8.2 or earlier to get it for the time being.

Collectl requires perl which is usually installed by default on all major Linux distros and optionally uses Time::Hires which is also usually installed and allows collectl to use fractional intervals and display timestamps in msec. The Compress::Zlib module is usually installed as well and if present the recorded data will be compressed and therefore use on average 90% less storage when recording to a file.

If you're still not sure if collectl is right for you, take a couple of minutes to look at the Collectl Tutorial to get a better feel for what collectl can do. Also be sure to check back and see what's new on the website, sign up for a Mailing List or watch the Forums .

 "I absolutely love it and have been using it extensively for months." Kevin Closson: Performance Architect, EMC "Collectl is indispensable to any system admin." Matt Heaton: President, Bluehost.com

#### [Nov 08, 2018] pexec utility is similar to parallel

###### Nov 08, 2018 | www.gnu.org

Welcome to the web page of the pexec program!

The main purpose of the program pexec is to execute the given command or shell script (e.g. parsed by /bin/sh ) in parallel on the local host or on remote hosts, while some of the execution parameters, namely the redirected standard input, output or error and environmental variables, can be varied. This program is therefore capable of replacing the classic shell loop iterators (e.g. for ~ in ~ done , in bash ) by executing the body of the loop in parallel. Thus, the program pexec implements shell level data parallelism in a fairly simple form. The capabilities of the program are extended with additional features, such as defining mutual exclusions, performing atomic command executions and implementing higher level resource and job control. See the complete manual for more details. See a brief Hungarian description of the program here .
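
To illustrate what "executing the body of the loop in parallel" means at the shell level, here is a plain sh sketch that uses only background jobs and wait . It does not use pexec and makes no claim about pexec's options; run_parallel is a hypothetical helper, and unlike pexec it has no limit on the number of simultaneous jobs and no remote execution.

```shell
# run_parallel COMMAND ITEM...
# Start COMMAND once per ITEM as a background job, then wait for all of
# them. Returns non-zero if any job failed.
run_parallel() {
    cmd=$1
    shift
    pids=""
    for item in "$@"; do
        "$cmd" "$item" &
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}

# Example: run_parallel gzip big1.log big2.log big3.log
```

The sequential equivalent would be a for loop calling the command on each item in turn; the only change here is the trailing & and the collection of exit statuses.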

The actual version of the program package is 1.0rc8 .

You may browse the package directory here (for FTP access, see this directory ). See the GNU summary page of this project here . The latest version of the program source package is pexec-1.0rc8.tar.gz . Here is another mirror of the package directory.

Please consider making donations to the author (via PayPal ) in order to help further development of the program or support the GNU project via the FSF .

#### [Nov 08, 2018] 15 Linux Split and Join Command Examples to Manage Large Files

###### Nov 08, 2018 | www.thegeekstuff.com

by Himanshu Arora on October 16, 2012

Linux split and join commands are very helpful when you are manipulating large files. This article explains how to use Linux split and join command with descriptive examples.

Join and split command syntax:

join [OPTION] FILE1 FILE2
split [OPTION] [INPUT [PREFIX]]

Linux Split Command Examples

1. Basic Split Example

Here is a basic example of split command.

$ split split.zip
$ ls
split.zip  xab  xad  xaf  xah  xaj  xal  xan  xap  xar  xat  xav  xax  xaz  xbb  xbd  xbf  xbh  xbj  xbl  xbn
xaa        xac  xae  xag  xai  xak  xam  xao  xaq  xas  xau  xaw  xay  xba  xbc  xbe  xbg  xbi  xbk  xbm  xbo


So we see that the file split.zip was split into smaller files with x** as file names, where ** is the two character suffix that is added by default. Also, by default each x** file would contain 1000 lines.

$ wc -l *
40947 split.zip
1000 xaa
1000 xab
1000 xac
1000 xad
1000 xae
1000 xaf
1000 xag
1000 xah
1000 xai
...
...
...


So the output above confirms that by default each x** file contains 1000 lines.

2. Change the Suffix Length using -a option

As discussed in example 1 above, the default suffix length is 2. But this can be changed by using the -a option.

As you see in the following example, it is using a suffix of length 5 on the split files.

$ split -a5 split.zip
$ ls
split.zip  xaaaac  xaaaaf  xaaaai  xaaaal  xaaaao  xaaaar  xaaaau  xaaaax  xaaaba  xaaabd  xaaabg  xaaabj  xaaabm
xaaaaa     xaaaad  xaaaag  xaaaaj  xaaaam  xaaaap  xaaaas  xaaaav  xaaaay  xaaabb  xaaabe  xaaabh  xaaabk  xaaabn
xaaaab     xaaaae  xaaaah  xaaaak  xaaaan  xaaaaq  xaaaat  xaaaaw  xaaaaz  xaaabc  xaaabf  xaaabi  xaaabl  xaaabo


Note: Earlier we also discussed other file manipulation utilities – tac, rev, paste .

3. Customize Split File Size using -b option

Size of each output split file can be controlled using the -b option.

In this example, the split files were created with a size of 200000 bytes.

$ split -b200000 split.zip

$ ls -lart
total 21084
drwxrwxr-x 3 himanshu himanshu 4096 Sep 26 21:20 ..
-rw-rw-r-- 1 himanshu himanshu 10767315 Sep 26 21:21 split.zip
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xad
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xac
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xab
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xaa
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xah
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xag
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xaf
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xae
-rw-rw-r-- 1 himanshu himanshu 200000 Sep 26 21:35 xar
...
...
...


4. Create Split Files with Numeric Suffix using -d option

As seen in the examples above, the output has the format of x** where ** are letters. You can change this to numbers using the -d option.

Here is an example. This has a numeric suffix on the split files.

$ split -d split.zip
$ ls
split.zip  x01  x03  x05  x07  x09  x11  x13  x15  x17  x19  x21  x23  x25  x27  x29  x31  x33  x35  x37  x39
x00        x02  x04  x06  x08  x10  x12  x14  x16  x18  x20  x22  x24  x26  x28  x30  x32  x34  x36  x38  x40


5. Customize the Number of Split Chunks using -n option

To get control over the number of chunks, use the -n option.

This example will create 50 chunks of split files.

$ split -n50 split.zip
$ ls
split.zip  xac  xaf  xai  xal  xao  xar  xau  xax  xba  xbd  xbg  xbj  xbm  xbp  xbs  xbv
xaa        xad  xag  xaj  xam  xap  xas  xav  xay  xbb  xbe  xbh  xbk  xbn  xbq  xbt  xbw
xab        xae  xah  xak  xan  xaq  xat  xaw  xaz  xbc  xbf  xbi  xbl  xbo  xbr  xbu  xbx


6. Avoid Zero Sized Chunks using -e option

While splitting a relatively small file into a large number of chunks, it's good to avoid zero sized chunks as they do not add any value. This can be done using the -e option.

Here is an example:

$ split -n50 testfile

$ ls -lart x*
-rw-rw-r-- 1 himanshu himanshu 0 Sep 26 21:55 xag
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xaf
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xae
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xad
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xac
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xab
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:55 xaa
-rw-rw-r-- 1 himanshu himanshu 0 Sep 26 21:55 xbx
-rw-rw-r-- 1 himanshu himanshu 0 Sep 26 21:55 xbw
-rw-rw-r-- 1 himanshu himanshu 0 Sep 26 21:55 xbv
...
...
...


So we see that lots of zero size chunks were produced in the above output. Now, let's use the -e option and see the results:

$ split -n50 -e testfile
$ ls
split.zip  testfile  xaa  xab  xac  xad  xae  xaf

$ ls -lart x*
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xaf
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xae
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xad
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xac
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xab
-rw-rw-r-- 1 himanshu himanshu 1 Sep 26 21:57 xaa


So we see that no zero sized chunk was produced in the above output.

7. Customize Number of Lines using -l option

Number of lines per output split file can be customized using the -l option.

As seen in the example below, split files are created with 20000 lines.

$ split -l20000 split.zip
$ ls
split.zip  testfile  xaa  xab  xac

$ wc -l x*
20000 xaa
20000 xab
947 xac
40947 total


8. Get Detailed Information using --verbose option

To get a diagnostic message each time a new split file is opened, use the --verbose option as shown below.

$ split -l20000 --verbose split.zip
creating file 'xaa'
creating file 'xab'
creating file 'xac'
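
The split examples above stop short of putting the pieces back together. Since the pieces are plain byte ranges of the original, concatenating them in name order restores the file. The following sketch shows that round trip; split_roundtrip is a hypothetical helper name, and it relies only on split -b , cat and cmp .

```shell
# split_roundtrip FILE CHUNK_BYTES PREFIX
# Split FILE into CHUNK_BYTES-sized pieces named PREFIXaa, PREFIXab, ...
# then check that concatenating the pieces reproduces FILE exactly.
split_roundtrip() {
    file=$1
    chunk=$2
    prefix=$3
    split -b "$chunk" "$file" "$prefix" || return 1
    # The shell expands PREFIX* in sorted order, which matches the
    # order in which split named the pieces.
    cat "$prefix"* | cmp -s - "$file"
}

# Example: split_roundtrip split.zip 200000 part_
```

To actually restore a file, redirect the cat output to a new name instead of piping it to cmp . Make sure the prefix does not match unrelated files in the same directory.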


#### [Nov 08, 2018] Utilizing multi core for tar+gzip-bzip compression-decompression

###### Nov 08, 2018 | stackoverflow.com

user1118764 , Sep 7, 2012 at 6:58

I normally compress using tar zcvf and decompress using tar zxvf (using gzip due to habit).

I've recently gotten a quad core CPU with hyperthreading, so I have 8 logical cores, and I notice that many of the cores are unused during compression/decompression.

Is there any way I can utilize the unused cores to make it faster?

Warren Severin , Nov 13, 2017 at 4:37

The solution proposed by Xiong Chiamiov above works beautifully. I had just backed up my laptop with .tar.bz2 and it took 132 minutes using only one cpu thread. Then I compiled and installed tar from source: gnu.org/software/tar I included the options mentioned in the configure step: ./configure --with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip I ran the backup again and it took only 32 minutes. That's better than 4X improvement! I watched the system monitor and it kept all 4 cpus (8 threads) flatlined at 100% the whole time. THAT is the best solution. – Warren Severin Nov 13 '17 at 4:37

Mark Adler , Sep 7, 2012 at 14:48

You can use pigz instead of gzip, which does gzip compression on multiple cores. Instead of using the -z option, you would pipe it through pigz:
tar cf - paths-to-archive | pigz > archive.tar.gz


By default, pigz uses the number of available cores, or eight if it could not query that. You can ask for more with -p n, e.g. -p 32. pigz has the same options as gzip, so you can request better compression with -9. E.g.

tar cf - paths-to-archive | pigz -9 -p 32 > archive.tar.gz


user788171 , Feb 20, 2013 at 12:43

How do you use pigz to decompress in the same fashion? Or does it only work for compression? – user788171 Feb 20 '13 at 12:43

Mark Adler , Feb 20, 2013 at 16:18

pigz does use multiple cores for decompression, but only with limited improvement over a single core. The deflate format does not lend itself to parallel decompression. The decompression portion must be done serially. The other cores for pigz decompression are used for reading, writing, and calculating the CRC. When compressing on the other hand, pigz gets close to a factor of n improvement with n cores. – Mark Adler Feb 20 '13 at 16:18

Garrett , Mar 1, 2014 at 7:26

The hyphen here is stdout (see this page ). – Garrett Mar 1 '14 at 7:26

Mark Adler , Jul 2, 2014 at 21:29

Yes. 100% compatible in both directions. – Mark Adler Jul 2 '14 at 21:29

Mark Adler , Apr 23, 2015 at 5:23

There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files. – Mark Adler Apr 23 '15 at 5:23
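
Since pigz output is compatible with gzip in both directions, a backup script can use pigz when it is installed and quietly fall back to gzip otherwise. This is a sketch along the lines of the pipe shown in the answer above; gzip_tool and make_tgz are hypothetical helper names.

```shell
# gzip_tool: print "pigz" if it is installed, otherwise "gzip".
gzip_tool() {
    if command -v pigz >/dev/null 2>&1; then
        echo pigz
    else
        echo gzip
    fi
}

# make_tgz ARCHIVE PATH...
# Pipe an uncompressed tar stream through whichever tool was found.
make_tgz() {
    archive=$1
    shift
    tar cf - "$@" | "$(gzip_tool)" > "$archive"
}

# Example: make_tgz archive.tar.gz paths-to-archive
```

Either way the result can be unpacked with an ordinary tar zxf , so the choice of compressor does not leak into the restore procedure.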

Jen , Jun 14, 2013 at 14:34

You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use.

For example use:

tar -c --use-compress-program=pigz -f tar.file dir_to_zip


ranman , Nov 13, 2013 at 10:01

This is an awesome little nugget of knowledge and deserves more upvotes. I had no idea this option even existed and I've read the man page a few times over the years. – ranman Nov 13 '13 at 10:01

Valerio Schiavoni , Aug 5, 2014 at 22:38

Unfortunately by doing so the concurrent feature of pigz is lost. You can see for yourself by executing that command and monitoring the load on each of the cores. – Valerio Schiavoni Aug 5 '14 at 22:38

bovender , Sep 18, 2015 at 10:14

@ValerioSchiavoni: Not here, I get full load on all 4 cores (Ubuntu 15.04 'Vivid'). – bovender Sep 18 '15 at 10:14

Valerio Schiavoni , Sep 28, 2015 at 23:41

On compress or on decompress ? – Valerio Schiavoni Sep 28 '15 at 23:41

Offenso , Jan 11, 2017 at 17:26

I prefer tar - dir_to_zip | pv | pigz > tar.file ; pv helps me estimate progress, and you can skip it. Still, it is easier to write and remember. – Offenso Jan 11 '17 at 17:26

Maxim Suslov , Dec 18, 2014 at 7:31

Common approach

There is an option for the tar program:

-I, --use-compress-program PROG
filter through PROG (must accept -d)


You can use a multithreaded version of the archiver or compressor utility.

$ tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 paths_to_archive
$ tar --use-compress-program=pigz -cf OUTPUT_FILE.tar.gz paths_to_archive


The archiver must accept -d. If your replacement utility doesn't have this parameter and/or you need to specify additional parameters, then use pipes (add parameters if necessary):

$ tar cf - paths_to_archive | pbzip2 > OUTPUT_FILE.tar.bz2
$ tar cf - paths_to_archive | pigz > OUTPUT_FILE.tar.gz


Input and output of singlethread and multithread are compatible. You can compress using multithread version and decompress using singlethread version and vice versa.
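
Besides pipes, another way to pass extra parameters is a small wrapper that honours the -d convention described above. The sketch below writes such a wrapper around gzip at level 9; write_helper and the wrapper file name are hypothetical, gzip merely stands in for whichever compressor you prefer, and the exact way tar invokes the program may differ between tar implementations.

```shell
# write_helper PATH
# Create a small wrapper at PATH that tar can call through
# --use-compress-program: tar passes -d when extracting, so the wrapper
# decompresses on -d and otherwise compresses at level 9.
write_helper() {
    cat > "$1" <<'EOF'
#!/bin/sh
case $1 in
    -d) exec gzip -dc ;;
    *)  exec gzip -9c ;;
esac
EOF
    chmod +x "$1"
}

# Example:
#   write_helper ./gz9.sh
#   tar --use-compress-program=./gz9.sh -cf OUTPUT_FILE.tar.gz paths_to_archive
```

The wrapper reads from standard input and writes to standard output in both modes, which is all tar expects from a compression filter.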

p7zip

To use p7zip for compression you need a small shell script like the following:

#!/bin/sh
case $1 in
  -d) 7za -txz -si -so e;;
   *) 7za -txz -si -so a .;;
esac 2>/dev/null


Save it as 7zhelper.sh. Here is an example of usage:

$ tar -I 7zhelper.sh -cf OUTPUT_FILE.tar.7z paths_to_archive
$ tar -I 7zhelper.sh -xf OUTPUT_FILE.tar.7z


xz

Regarding multithreaded XZ support: if you are running version 5.2.0 or above of XZ Utils, you can utilize multiple cores for compression by setting -T or --threads to an appropriate value via the environmental variable XZ_DEFAULTS (e.g. XZ_DEFAULTS="-T 0" ).

This is a fragment of man for the 5.1.0alpha version:

"Multithreaded compression and decompression are not implemented yet, so this option has no effect for now."

However this will not work for decompression of files that haven't also been compressed with threading enabled. From man for version 5.2.2:

"Threaded decompression hasn't been implemented yet. It will only work on files that contain multiple blocks with size information in block headers. All files compressed in multi-threaded mode meet this condition, but files compressed in single-threaded mode don't even if --block-size=size is used."

Recompiling with replacement

If you build tar from sources, then you can recompile with parameters

--with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip


After recompiling tar with these options you can check the output of tar's help:

$ tar --help | grep "lbzip2\|plzip\|pigz"
-j, --bzip2                filter the archive through lbzip2
--lzip                 filter the archive through plzip
-z, --gzip, --gunzip, --ungzip   filter the archive through pigz


user1985657 , Apr 28, 2015 at 20:41

This is indeed the best answer. I'll definitely rebuild my tar! – user1985657 Apr 28 '15 at 20:41

user1985657 , Apr 28, 2015 at 20:57

I just found pbzip2 and mpibzip2 . mpibzip2 looks very promising for clusters or if you have a laptop and a multicore desktop computer for instance. – user1985657 Apr 28 '15 at 20:57

oᴉɹǝɥɔ , Jun 10, 2015 at 17:39

This is a great and elaborate answer. It may be good to mention that multithreaded compression (e.g. with pigz ) is only enabled when it reads from the file. Processing STDIN may in fact be slower. – oᴉɹǝɥɔ Jun 10 '15 at 17:39

selurvedu , May 26, 2016 at 22:13

Plus 1 for xz option. It the simplest, yet effective approach. – selurvedu May 26 '16 at 22:13

panticz.de , Sep 1, 2014 at 15:02

You can use the shortcut -I for tar's --use-compress-program switch, and invoke pbzip2 for bzip2 compression on multiple cores:
tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 DIRECTORY_TO_COMPRESS/


einpoklum , Feb 11, 2017 at 15:59

A nice TL;DR for @MaximSuslov's answer . – einpoklum Feb 11 '17 at 15:59

,

If you want to have more flexibility with filenames and compression options, you can use:
find /my/path/ -type f \( -name "*.sql" -o -name "*.log" \) -exec \
tar -P --transform='s@/my/path/@@g' -cf - {} + | \
pigz -9 -p 4 > myarchive.tar.gz

Step 1: find

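find /my/path/ -type f \( -name "*.sql" -o -name "*.log" \) -exec

This command will look for the files you want to archive, in this case /my/path/*.sql and /my/path/*.log . Add as many -o -name "pattern" as you want inside the escaped parentheses, which group the name tests so that -type f and -exec apply to all of them.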

-exec will execute the next command using the results of find : tar

Step 2: tar

tar -P --transform='s@/my/path/@@g' -cf - {} +

--transform is a simple string replacement parameter. It will strip the path of the files from the archive so the tarball's root becomes the current directory when extracting. Note that you can't use -C option to change directory as you'll lose benefits of find : all files of the directory would be included.

-P tells tar to use absolute paths, so it doesn't trigger the warning "Removing leading '/' from member names". The leading '/' will be removed by --transform anyway.

-cf - tells tar to write the archive to standard output, so the tarball name can be specified later, at the end of the pipeline

{} + passes every file that find found previously

Step 3: pigz

pigz -9 -p 4

Use as many parameters as you want. In this case -9 is the compression level and -p 4 is the number of cores dedicated to compression. If you run this on a heavily loaded webserver, you probably don't want to use all available cores.

Step 4: archive name

> myarchive.tar.gz

Finally.
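The effect of grouping the -name tests can be checked in a scratch directory. The sketch below is only an illustration: the file names are made up, gzip stands in for pigz so that it runs where pigz is not installed, and tar is run from inside the directory instead of using --transform.

```shell
# Scratch directory with three hypothetical files (names are illustrative only)
work=$(mktemp -d)
touch "$work/a.sql" "$work/b.log" "$work/c.txt"

# Ungrouped: -exec binds to the last -name test only, so a.sql is skipped
ungrouped=$(find "$work" -type f -name "*.sql" -o -name "*.log" -exec echo {} + | wc -w)

# Grouped: the parentheses make -exec apply to files matching either pattern
grouped=$(find "$work" -type f \( -name "*.sql" -o -name "*.log" \) -exec echo {} + | wc -w)
echo "ungrouped matches: $ungrouped, grouped matches: $grouped"

# Same pipeline shape as above, with gzip standing in for pigz
archive="$work/out.tar.gz"
( cd "$work" && find . -type f \( -name "*.sql" -o -name "*.log" \) -exec tar -cf - {} + ) | gzip -9 > "$archive"
members=$(tar -tzf "$archive" | wc -l)
echo "archive members: $members"

rm -rf "$work"
```

Replacing gzip -9 with pigz -9 -p 4 should give the multi-core variant described in the steps above.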

#### [Nov 03, 2018] David Both

###### Jun 22, 2017 | opensource.com
...

Final thoughts

If you use an editor which makes automatic backups - emacs certainly is one such - then you may end up with a new version of the edited file, while the backup is the linked copy, because the editor simply renames the file to the backup name (with emacs, test.c would be renamed test.c~) and the new version when saved under the old name is no longer linked.

Symbolic links avoid this problem, so I tend to use them for source code where required.
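The editor behaviour described in this comment can be simulated in a scratch directory. This is a minimal sketch that assumes the editor backs up by renaming the original and then writes a fresh file under the old name; the file names are illustrative, and real editors may be configured to back up by copying instead.

```shell
work=$(mktemp -d)

echo "version 1" > "$work/test.c"
ln "$work/test.c" "$work/linked.c"        # second hard link to the same inode

# Simulate an editor that backs up by renaming, then saves under the old name
mv "$work/test.c" "$work/test.c~"
echo "version 2" > "$work/test.c"

# linked.c still refers to the old inode, which is now the backup
if [ "$work/linked.c" -ef "$work/test.c~" ]; then linked_to="backup"; else linked_to="other"; fi
if [ "$work/linked.c" -ef "$work/test.c" ]; then shares_new="yes"; else shares_new="no"; fi
linked_content=$(cat "$work/linked.c")
echo "linked.c is the same file as the $linked_to; shares new test.c: $shares_new"

rm -rf "$work"
```

A symlink created with ln -s test.c linked.c would have followed the name rather than the inode, and so would show the new version.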

#### [Nov 03, 2018] David Both

###### Nov 03, 2018 | opensource.com


In earlier articles, including An introduction to Linux's EXT4 filesystem; Managing devices in Linux; An introduction to Linux filesystems; and A Linux user's guide to Logical Volume Management, I have briefly mentioned an interesting feature of Linux filesystems that can make some tasks easier by providing access to files from multiple locations in the filesystem directory tree.

There are two types of Linux filesystem links: hard and soft. The difference between the two types of links is significant, but both types are used to solve similar problems. They both provide multiple directory entries (or references) to a single file, but they do it quite differently. Links are powerful and add flexibility to Linux filesystems because everything is a file .


I have found, for instance, that some programs required a particular version of a library. When a library upgrade replaced the old version, the program would crash with an error specifying the name of the old, now-missing library. Usually, the only change in the library name was the version number. Acting on a hunch, I simply added a link to the new library but named the link after the old library name. I tried the program again and it worked perfectly. And, okay, the program was a game, and everyone knows the lengths that gamers will go to in order to keep their games running.

In fact, almost all applications are linked to libraries using a generic name with only a major version number in the link name, while the link points to the actual library file that also has a minor version number. In other instances, required files have been moved from one directory to another to comply with the Linux Filesystem Hierarchy Standard (FHS), and there are links in the old directories for backwards compatibility with those programs that have not yet caught up with the new locations. If you do a long listing of the /lib64 directory, you can find many examples of both.

lrwxrwxrwx. 1 root root 36 Dec 8 2016 cracklib_dict.hwm -> ../../usr/share/cracklib/pw_dict.hwm
lrwxrwxrwx. 1 root root 36 Dec 8 2016 cracklib_dict.pwd -> ../../usr/share/cracklib/pw_dict.pwd
lrwxrwxrwx. 1 root root 36 Dec 8 2016 cracklib_dict.pwi -> ../../usr/share/cracklib/pw_dict.pwi
lrwxrwxrwx. 1 root root 27 Jun 9 2016 libaccountsservice.so.0 -> libaccountsservice.so.0.0.0
-rwxr-xr-x. 1 root root 288456 Jun 9 2016 libaccountsservice.so.0.0.0
lrwxrwxrwx 1 root root 15 May 17 11:47 libacl.so.1 -> libacl.so.1.1.0
-rwxr-xr-x 1 root root 36472 May 17 11:47 libacl.so.1.1.0
lrwxrwxrwx. 1 root root 15 Feb 4 2016 libaio.so.1 -> libaio.so.1.0.1
-rwxr-xr-x. 1 root root 6224 Feb 4 2016 libaio.so.1.0.0
-rwxr-xr-x. 1 root root 6224 Feb 4 2016 libaio.so.1.0.1
-rwxr-xr-x. 1 root root 816160 Jan 16 16:39 libakonadi-calendar.so.4.14.26

A few of the links in the /lib64 directory
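The naming pattern in this listing can be reproduced safely in a scratch directory. In the sketch below the library names are hypothetical and the "library" is an empty placeholder file. Whether a real program accepts a newer library under an older name depends on binary compatibility, which this does not test.

```shell
work=$(mktemp -d)

# Hypothetical library file, created as an empty placeholder
touch "$work/libdemo.so.1.2.0"

# Generic name carrying only the major version points at the real file
ln -s libdemo.so.1.2.0 "$work/libdemo.so.1"

# Compatibility link: an old, now-missing name pointing at the new file
ln -s libdemo.so.1.2.0 "$work/libdemo.so.1.1.0"

generic_target=$(readlink "$work/libdemo.so.1")
compat_target=$(readlink "$work/libdemo.so.1.1.0")
ls -l "$work"

rm -rf "$work"
```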

In the long listing of the /lib64 directory above, the entries whose filemode begins with the letter "l" are soft, or symbolic, links; the entries whose filemode begins with "-" are regular files.

In An introduction to Linux's EXT4 filesystem , I discussed the fact that each file has one inode that contains information about that file, including the location of the data belonging to that file. Figure 2 in that article shows a single directory entry that points to the inode. Every file must have at least one directory entry that points to the inode that describes the file. The directory entry is a hard link, thus every file has at least one hard link.

In Figure 1 below, multiple directory entries point to a single inode. These are all hard links. I have abbreviated the locations of three of the directory entries using the tilde ( ~ ) convention for the home directory, so that ~ is equivalent to /home/user in this example. Note that the fourth directory entry is in a completely different directory, /home/shared , which might be a location for sharing files between users of the computer.

Figure 1

Hard links are limited to files contained within a single filesystem. "Filesystem" is used here in the sense of a partition or logical volume (LV) that is mounted on a specified mount point, in this case /home . This is because inode numbers are unique only within each filesystem, and a different filesystem, for example, /var or /opt , will have inodes with the same number as the inode for our file.

Because all the hard links point to the single inode that contains the metadata about the file, all of these attributes are part of the file, such as ownerships, permissions, and the total number of hard links to the inode, and cannot be different for each hard link. It is one file with one set of attributes. The only attribute that can be different is the file name, which is not contained in the inode. Hard links to a single file/inode located in the same directory must have different names, due to the fact that there can be no duplicate file names within a single directory.

The number of hard links for a file is displayed with the ls -l command. If you want to display the actual inode numbers, the command ls -li does that.
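When the same values are needed in a script, parsing ls output is fragile. A minimal sketch, assuming GNU stat from coreutils (the -c format option is not available in BSD stat), reads the link count and inode number directly:

```shell
work=$(mktemp -d)
echo "data" > "$work/main.file.txt"
ln "$work/main.file.txt" "$work/link1.file.txt"

# %h = number of hard links, %i = inode number (GNU stat format codes)
links=$(stat -c '%h' "$work/main.file.txt")
inode_main=$(stat -c '%i' "$work/main.file.txt")
inode_link=$(stat -c '%i' "$work/link1.file.txt")
echo "links=$links inode(main)=$inode_main inode(link1)=$inode_link"

rm -rf "$work"
```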

The difference between a hard link and a soft link, also known as a symbolic link (or symlink), is that, while hard links point directly to the inode belonging to the file, soft links point to a directory entry, i.e., one of the hard links. Because soft links point to a hard link for the file and not the inode, they are not dependent upon the inode number and can work across filesystems, spanning partitions and LVs.

The downside to this is: If the hard link to which the symlink points is deleted or renamed, the symlink is broken. The symlink is still there, but it points to a hard link that no longer exists. Fortunately, the ls command highlights broken links with flashing white text on a red background in a long listing.
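The colour highlighting depends on how ls is configured, so a script cannot rely on it. A small sketch of a check that does not depend on colours, using only the standard -L and -e tests (file names are illustrative):

```shell
work=$(mktemp -d)
echo "data" > "$work/target.txt"
ln -s target.txt "$work/soft.txt"

rm "$work/target.txt"          # remove the name the symlink points to

# -L is true for any symlink; -e follows the link and fails if the target is gone
if [ -L "$work/soft.txt" ] && [ ! -e "$work/soft.txt" ]; then
    state="broken"
else
    state="ok"
fi
echo "soft.txt is $state"

rm -rf "$work"
```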

I think the easiest way to understand the use of and differences between hard and soft links is with a lab project that you can do. This project should be done in an empty directory as a non-root user . I created the ~/temp directory for this project, and you should, too. It creates a safe place to do the project and provides a new, empty directory to work in so that only files associated with this project will be located there.

Initial setup

First, create the temporary directory in which you will perform the tasks needed for this project. Ensure that the present working directory (PWD) is your home directory, then enter the following command.

mkdir temp


Change into ~/temp to make it the PWD with this command.

cd temp


To get started, we need to create a file we can link to. The following command does that and provides some content as well.

du -h > main.file.txt


Use the ls -l long list to verify that the file was created correctly. It should look similar to my results. Note that the file size is only 7 bytes, but yours may vary by a byte or two.

[dboth@david temp]$ ls -l
total 4
-rw-rw-r-- 1 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice the number "1" following the file mode in the listing. That number represents the number of hard links that exist for the file. For now, it should be 1 because we have not created any additional links to our test file.

Experimenting with hard links

Hard links create a new directory entry pointing to the same inode, so when hard links are added to a file, you will see the number of links increase. Ensure that the PWD is still ~/temp. Create a hard link to the file main.file.txt, then do another long list of the directory.

[dboth@david temp]$ ln main.file.txt link1.file.txt
[dboth@david temp]$ ls -l
total 8
-rw-rw-r-- 2 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 2 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice that both files have two links and are exactly the same size. The date stamp is also the same. This is really one file with one inode and two links, i.e., directory entries to it. Create a second hard link to this file and list the directory contents. You can create the link to either of the existing ones: link1.file.txt or main.file.txt.

[dboth@david temp]$ ln link1.file.txt link2.file.txt ; ls -l
total 16
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 link2.file.txt
-rw-rw-r-- 3 dboth dboth 7 Jun 13 07:34 main.file.txt

Notice that each new hard link in this directory must have a different name because two files -- really directory entries -- cannot have the same name within the same directory. Try to create another link with a target name the same as one of the existing ones.

[dboth@david temp]$ ln main.file.txt link2.file.txt
ln: failed to create hard link 'link2.file.txt': File exists

Clearly that does not work, because link2.file.txt already exists. So far, we have created only hard links in the same directory. So, create a link in your home directory, the parent of the temp directory in which we have been working so far.

[dboth@david temp]$ ln main.file.txt ../main.file.txt ; ls -l ../main*
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt

The ls command in the above listing shows that the main.file.txt file does exist in the home directory with the same name as the file in the temp directory. Of course, these are not different files; they are the same file with multiple links -- directory entries -- to the same inode. To help illustrate the next point, add a file that is not a link.

[dboth@david temp]$ touch unlinked.file ; ls -l
total 12
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link1.file.txt
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link2.file.txt
-rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt
-rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

Look at the inode number of the hard links and that of the new file using the -i option to the ls command.

[dboth@david temp]$ ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 7 Jun 13 07:34 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

Notice the number 657024 to the left of the file mode in the example above. That is the inode number, and all three file links point to the same inode. You can use the -i option to view the inode number for the link we created in the home directory as well, and that will also show the same value. The inode number of the file that has only one link is different from the others. Note that the inode numbers will be different on your system.
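Comparing inode numbers by eye works for a handful of files. In a script, the shell's -ef test does the same comparison. The sketch below uses illustrative file names in a scratch directory and also shows that data written through one name is visible through the other:

```shell
work=$(mktemp -d)
echo "short" > "$work/main.file.txt"
ln "$work/main.file.txt" "$work/link2.file.txt"

# -ef is true when both names refer to the same device and inode
if [ "$work/main.file.txt" -ef "$work/link2.file.txt" ]; then same="yes"; else same="no"; fi

# Writing through one name changes what every other name shows
echo "a somewhat longer line" > "$work/link2.file.txt"
seen_via_main=$(cat "$work/main.file.txt")
echo "same file: $same; main.file.txt now contains: $seen_via_main"

rm -rf "$work"
```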

Let's change the size of one of the hard-linked files.

[dboth@david temp]$ df -h > link2.file.txt ; ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

The file size of all the hard-linked files is now larger than before. That is because there is really only one file that is linked to by multiple directory entries.

I know this next experiment will work on my computer because my /tmp directory is on a separate LV. If you have a separate LV or a filesystem on a different partition (if you're not using LVs), determine whether or not you have access to that LV or partition. If you don't, you can try to insert a USB memory stick and mount it. If one of those options works for you, you can do this experiment.

Try to create a link to one of the files in your ~/temp directory in /tmp (or wherever your different filesystem directory is located).

[dboth@david temp]$ ln link2.file.txt /tmp/link3.file.txt

When the target is on a different filesystem, this command fails with an error (on Linux, ln reports "Invalid cross-device link"). Why does this error occur? The reason is that each separate mountable filesystem has its own set of inode numbers. Simply referring to a file by an inode number across the entire Linux directory structure can result in confusion because the same inode number can exist in each mounted filesystem.
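One way to anticipate this failure is to compare the device numbers of the two directories before linking. The following is a rough sketch, assuming GNU stat. Equal device numbers usually mean a hard link is possible, although some setups (bind mounts, for example) can still refuse it.

```shell
work=$(mktemp -d)
mkdir "$work/sub"

# %d prints the device number of the filesystem holding the path (GNU stat)
dev_a=$(stat -c '%d' "$work")
dev_b=$(stat -c '%d' "$work/sub")

if [ "$dev_a" = "$dev_b" ]; then
    verdict="same filesystem: a hard link is usually possible"
else
    verdict="different filesystems: use a symlink instead"
fi
echo "$verdict"

rmdir "$work/sub" "$work"
```

To check a real case, replace the two paths with the source directory and the intended link location, such as ~/temp and /tmp.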

There may be a time when you will want to locate all the hard links that belong to a single inode. You can find the inode number using the ls -li command. Then you can use the find command to locate all links with that inode number.

[dboth@david temp]$ find . -inum 657024
./main.file.txt
./link1.file.txt
./link2.file.txt

Note that the find command did not find all four of the hard links to this inode because we started at the current directory of ~/temp. The find command only finds files in the PWD and its subdirectories. To find all the links, we can use the following command, which specifies your home directory as the starting place for the search.

[dboth@david temp]$ find ~ -samefile main.file.txt
/home/dboth/temp/main.file.txt
/home/dboth/temp/link1.file.txt
/home/dboth/temp/link2.file.txt
/home/dboth/main.file.txt

You may see error messages if you do not have permissions as a non-root user. This command also uses the -samefile option instead of specifying the inode number. This works the same as using the inode number and can be easier if you know the name of one of the hard links.
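The difference the starting directory makes can be reproduced without touching the home directory. This sketch builds a small tree in a scratch directory (names are illustrative) and assumes a find that supports -samefile, as GNU find does, plus GNU stat for the inode lookup:

```shell
work=$(mktemp -d)
mkdir "$work/temp"
echo "data" > "$work/temp/main.file.txt"
ln "$work/temp/main.file.txt" "$work/temp/link1.file.txt"
ln "$work/temp/main.file.txt" "$work/main.file.txt"

# Searching only the subdirectory misses the link in the parent
in_subdir=$(find "$work/temp" -samefile "$work/temp/main.file.txt" | wc -l)

# Searching from the parent finds all three names
in_tree=$(find "$work" -samefile "$work/temp/main.file.txt" | wc -l)

# Equivalent search by inode number; -xdev keeps find on one filesystem
inode=$(stat -c '%i' "$work/temp/main.file.txt")
by_inum=$(find "$work" -xdev -inum "$inode" | wc -l)
echo "subdir: $in_subdir, whole tree: $in_tree, by inode: $by_inum"

rm -rf "$work"
```

Adding -xdev to an -inum search is a reasonable precaution, since the same inode number can belong to an unrelated file on another mounted filesystem.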

As you have just seen, creating hard links is not possible across filesystem boundaries; that is, from a filesystem on one LV or partition to a filesystem on another. Soft links are a means to answer that problem with hard links. Although they can accomplish the same end, they are very different, and knowing these differences is important.

Let's start by creating a symlink in our ~/temp directory to start our exploration.

[dboth@david temp]$ ln -s link2.file.txt link3.file.txt ; ls -li
total 12
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 link2.file.txt
658270 lrwxrwxrwx 1 dboth dboth 14 Jun 14 15:21 link3.file.txt -> link2.file.txt
657024 -rw-rw-r-- 4 dboth dboth 1157 Jun 14 14:14 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

The hard links, those that have the inode number 657024, are unchanged, and the number of hard links shown for each has not changed. The newly created symlink has a different inode, number 658270. The soft link named link3.file.txt points to link2.file.txt. Use the cat command to display the contents of link3.file.txt. The file mode information for the symlink starts with the letter "l", which indicates that this file is actually a symbolic link.

The size of the symlink link3.file.txt is only 14 bytes in the example above. That is the length of the target path, link2.file.txt, which is what the symlink stores. The symlink does not point to the target's inode; it holds the path of another directory entry, which makes it useful for creating links that span filesystem boundaries. So, let's create that link we tried before from the /tmp directory.

[dboth@david temp]$ ln -s /home/dboth/temp/link2.file.txt /tmp/link3.file.txt ; ls -l /tmp/link*
lrwxrwxrwx 1 dboth dboth 31 Jun 14 21:53 /tmp/link3.file.txt ->
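What a symlink actually stores can be inspected with readlink. The sketch below, with illustrative names in a scratch directory and assuming GNU stat, shows that the reported size of the link matches the length of the stored target path on typical Linux filesystems, while cat follows the link to the data:

```shell
work=$(mktemp -d)
echo "data" > "$work/link2.file.txt"
ln -s link2.file.txt "$work/link3.file.txt"

target=$(readlink "$work/link3.file.txt")     # path stored in the symlink
size=$(stat -c '%s' "$work/link3.file.txt")   # stat does not follow the link by default
via_link=$(cat "$work/link3.file.txt")        # cat follows the link to the data
echo "target=$target size=$size content=$via_link"

rm -rf "$work"
```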

There are some other things that you should consider when you need to delete links or the files to which they point.

First, let's delete the link main.file.txt . Remember that every directory entry that points to an inode is simply a hard link.

[dboth@david temp]$ rm main.file.txt ; ls -li
total 8
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 link1.file.txt
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 link2.file.txt
658270 lrwxrwxrwx 1 dboth dboth 14 Jun 14 15:21 link3.file.txt -> link2.file.txt
657863 -rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

The link main.file.txt was the first link created when the file was created. Deleting it now still leaves the original file and its data on the hard drive along with all the remaining hard links. To delete the file and its data, you would have to delete all the remaining hard links.

Now delete the link2.file.txt hard link.

[dboth@david temp]$ rm link2.file.txt ; ls -li
total 8
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 link1.file.txt
658270 lrwxrwxrwx 1 dboth dboth 14 Jun 14 15:21 link3.file.txt ->
657024 -rw-rw-r-- 3 dboth dboth 1157 Jun 14 14:14 main.file.txt
657863 -rw-rw-r-- 1 dboth dboth 0 Jun 14 08:18 unlinked.file

The unlink command can also be used to delete files and links. It is very simple and, unlike the rm command, has essentially no options and takes a single file name. It does, however, more accurately reflect the underlying process of deletion, in that it removes the link -- the directory entry -- to the file being deleted.
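The link-count bookkeeping behind deletion can be watched directly. A minimal sketch with illustrative file names in a scratch directory, assuming GNU stat for the link count:

```shell
work=$(mktemp -d)
echo "data" > "$work/main.file.txt"
ln "$work/main.file.txt" "$work/link1.file.txt"

before=$(stat -c '%h' "$work/link1.file.txt")
rm "$work/main.file.txt"                  # removes one directory entry only
after=$(stat -c '%h' "$work/link1.file.txt")
survivor=$(cat "$work/link1.file.txt")    # data is still reachable

unlink "$work/link1.file.txt"             # last link gone: the file is released
if [ -e "$work/link1.file.txt" ]; then remaining="yes"; else remaining="no"; fi
echo "links before=$before after=$after data='$survivor' still present=$remaining"

rmdir "$work"
```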

Final thoughts

I worked with both types of links for a long time before I began to understand their capabilities and idiosyncrasies. It took writing a lab project for a Linux class I taught to fully appreciate how links work. This article is a simplification of what I taught in that class, and I hope it speeds your learning curve.

About the author: David Both is a Linux and Open Source advocate who resides in Raleigh, North Carolina. He has been in the IT industry for over forty years and taught OS/2 for IBM where he worked for over 20 years. While at IBM, he wrote the first training course for the original IBM PC in 1981. He has taught RHCE classes for Red Hat and has worked at MCI Worldcom, Cisco, and the State of North Carolina. He has been working with Linux and Open Source Software for almost 20 years. David has written articles for...

#### [Nov 02, 2018] How to Recover from an Accidental SSH Disconnection on Linux RoseHosting

###### Nov 02, 2018 | www.rosehosting.com

... I can get a list of all previous screens using the command:

screen -ls


And this gives me the output as shown here:

As you can see, there is a screen session here with the name:

pts-0.test-centos-server

To reconnect to it, just type:

screen -r


And this will take you back to where you were before the SSH connection was terminated! It's an amazing tool that you need to use for all important operations as insurance against accidental terminations.

Manually Detaching Screens

When you break an SSH session, what actually happens is that the screen is automatically detached from it and exists independently. While this is great, you can also detach screens manually and have multiple screens existing at the same time.

For example, to detach a screen just type:

screen -d


And the current screen will be detached and preserved. However, all the processes inside it are still running, and all the states are preserved:

You can re-attach to a screen at any time using the "screen -r" command. To connect to a specific screen instead of the most recent, use:

screen -r [screenname]

Changing the Screen Names to Make Them More Relevant

By default, the screen names don't mean much. And when you have a bunch of them present, you won't know which screens contain which processes. Fortunately, renaming a screen is easy when inside one. Just type:

ctrl+a :

We saw in the previous article that "ctrl+a" is the trigger condition for screen commands. The colon (:) will take you to the bottom of the screen where you can type commands. To rename, use:

sessionname [newscreenname]


As shown here:

And now when you detach the screen, it will show with the new name like this:

Now you can have as many screens as you want without getting confused about which one is which!


#### [Oct 30, 2018] 10 tr Command Examples in Linux

###### Oct 30, 2018 | www.tecmint.com

8. Here is an example of breaking a single line of words (sentence) into multiple lines, where each word appears in a separate line.

$ echo "My UID is $UID"

My UID is 1000

$ echo "My UID is $UID" | tr " "  "\n"

My
UID
is
1000


9. Related to the previous example, you can also translate multiple lines of words into a single sentence as shown.

$ cat uid.txt
My
UID
is
1000

$ tr "\n" " " < uid.txt

My UID is 1000
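Examples 8 and 9 translate one character into another. Two other standard tr options, -s (squeeze repeats) and -d (delete), are often combined with that. The sketch below is illustrative only; the sample strings are made up and the variable names are arbitrary.

```shell
#!/bin/sh
sentence="My   UID  is 1000"   # note the repeated spaces

# -s squeezes each run of the listed character into a single occurrence.
squeezed=$(printf '%s\n' "$sentence" | tr -s ' ')

# With two sets, -s translates first and then squeezes the result,
# so a run of spaces becomes exactly one newline.
one_per_line=$(printf '%s\n' "$sentence" | tr -s ' ' '\n')

# -d deletes the listed characters instead of translating them.
digits_removed=$(printf '%s\n' "UID1000" | tr -d '0-9')

# POSIX character classes can be used as sets.
upper=$(printf '%s\n' "my uid" | tr '[:lower:]' '[:upper:]')

printf '%s\n' "$squeezed" "$one_per_line" "$digits_removed" "$upper"
```

Squeezing matters in example 8 whenever the input may contain more than one space between words; without -s each extra space would produce an empty line.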


10. It is also possible to translate just a single character, for instance a space into a " : " character, as follows.

$ echo "Tecmint.com =>Linux-HowTos,Guides,Tutorials" | tr " " ":"
Tecmint.com:=>Linux-HowTos,Guides,Tutorials

There are several sequence characters you can use with tr ; for more information, see the tr man page.

... ... ...

#### [Oct 29, 2018] Getting all the matches with 'grep -f' option

##### Perverted example, but interesting question.
###### Oct 29, 2018 | stackoverflow.com

Arturo ,Mar 24, 2017 at 8:59

I would like to find all the matches of the text I have in one file ('file1.txt') that are found in another file ('file2.txt') using the grep option -f, which tells grep to read the expressions to be found from a file.

'file1.txt'

a
a

'file2.txt'

a

When I run the command:

grep -f file1.txt file2.txt -w

I get the 'a' output only once. Instead, I would like to get it twice, because it occurs twice in my 'file1.txt' file. Is there a way to make grep (or any other Unix/Linux tool) output a match for each line it reads? Thanks in advance. Arturo

RomanPerekhrest ,Mar 24, 2017 at 9:02

the matches of the text - some exact text? should it compare line to line? – RomanPerekhrest Mar 24 '17 at 9:02

Arturo ,Mar 24, 2017 at 9:04

Yes it contains exact match. I added the -w options, following your input. Yes, it is a comparison line by line. – Arturo Mar 24 '17 at 9:04

Remko ,Mar 24, 2017 at 9:19

Grep works as designed, giving only one output line. You could use another approach:

while IFS= read -r pattern; do
    grep -e "$pattern" file2.txt
done < file1.txt


This would use every line in file1.txt as a pattern for the grep, thus resulting in the output you're looking for.
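The difference between the two approaches can be reproduced in a scratch directory. This is a sketch under a few assumptions: the files are recreated with the contents given in the question, and -F (fixed-string matching) is added on the assumption that the patterns are literal text; drop -F if file1.txt really holds regular expressions.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'a\na\n' > "$workdir/file1.txt"   # patterns, one of them repeated
printf 'a\n'    > "$workdir/file2.txt"   # text to search

# grep -f treats all patterns as one set, so a matching line prints once.
once=$(grep -w -f "$workdir/file1.txt" "$workdir/file2.txt")

# One grep per pattern line: a repeated pattern yields repeated output.
per_pattern=$(
  while IFS= read -r pattern; do
    grep -w -F -e "$pattern" "$workdir/file2.txt"
  done < "$workdir/file1.txt"
)

echo "grep -f output:"
printf '%s\n' "$once"
echo "per-pattern loop output:"
printf '%s\n' "$per_pattern"
```

Quoting "$pattern" keeps patterns containing spaces or glob characters intact. An empty line in file1.txt would match every line of file2.txt, so it may be worth filtering those out first.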

Arturo ,Mar 24, 2017 at 9:30

That did the trick!. Thank you. And it is even much faster than my previous grep command. – Arturo Mar 24 '17 at 9:30

ar7 ,Mar 24, 2017 at 9:12

When you use
grep -f pattern.txt file.txt


It means match the pattern found in pattern.txt in the file file.txt .

It is giving you only one output because that is all there is in the second file.

Try interchanging the files,

grep -f file2.txt file1.txt -w


Arturo ,Mar 24, 2017 at 9:17

I understand that, but still I would like to find a way to print a match each time a pattern (even a repeated one) from 'pattern.txt' is found in 'file.txt'. Even a tool or a script rather than 'grep -f' would suffice. – Arturo Mar 24 '17 at 9:17

#### [Oct 22, 2018] linux - If I rm -rf a symlink will the data the link points to get erased, too?

##### "... In other words, those symlink files will be deleted. The files they "point"/"link" to will not be touched. ..."
###### Oct 22, 2018 | unix.stackexchange.com

user4951 ,Jan 25, 2013 at 2:40

This is the contents of the /home3 directory on my system:
./   backup/    hearsttr@  lost+found/  randomvi@  sexsmovi@
../  freemark@  investgr@  nudenude@    romanced@  wallpape@

I want to clean this up but I am worried because of the symlinks, which point to another drive.

If I say rm -rf /home3 will it delete the other drive?

John Sui

rm -rf /home3 will delete all files and directories within home3 and home3 itself, including the symlink files, but will not "follow" (de-reference) those symlinks.

In other words, those symlink files will be deleted. The files they "point"/"link" to will not be touched.
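A cautious way to confirm this before touching real data is to rehearse it in a throwaway directory. The sketch below assumes nothing about the original system: "otherdrive" and "home3" are stand-in directory names created under a mktemp scratch area.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Stand-ins for the other drive and for /home3 (names are illustrative).
mkdir "$workdir/otherdrive" "$workdir/home3"
echo "keep me" > "$workdir/otherdrive/data.txt"
ln -s "$workdir/otherdrive" "$workdir/home3/link"

# Recursive removal deletes the symlink itself and does not descend into it.
rm -rf "$workdir/home3"

if [ -f "$workdir/otherdrive/data.txt" ]; then
  target_status="target intact"
else
  target_status="target removed"
fi
echo "$target_status"
```

Note that this rehearses only the case asked about, removing the parent directory. Removing a symlink named with a trailing slash is a different case and is discussed separately in the thread.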

###### Jan 25, 2012 | superuser.com
I have a directory like this:

$ ls -l
total 899166
drwxr-xr-x 12 me scicomp     324 Jan 24 13:47 data
-rw-r--r--  1 me scicomp   84188 Jan 24 13:47 lod-thin-1.000000-0.010000-0.030000.rda
drwxr-xr-x  2 me scicomp     808 Jan 24 13:47 log
lrwxrwxrwx  1 me scicomp      17 Jan 25 09:41 msg -> /home/me/msg

And I want to remove it using rm -r . However I'm scared rm -r will follow the symlink and delete everything in that directory (which is very bad). I can't find anything about this in the man pages. What would be the exact behavior of running rm -rf from a directory above this one?

LordDoskias ,Jan 25, 2012 at 16:43

How hard it is to create a dummy dir with a symlink pointing to a dummy file and execute the scenario? Then you will know for sure how it works! – LordDoskias Jan 25 '12 at 16:43

hakre ,Feb 4, 2015 at 13:09

X-Ref: If I rm -rf a symlink will the data the link points to get erased, too? ; Deleting a folder that contains symlinks – hakre Feb 4 '15 at 13:09

Susam Pal ,Jan 25, 2012 at 16:47

Example 1: Deleting a directory containing a soft link to another directory.

susam@nifty:~/so$ mkdir foo bar
susam@nifty:~/so$ touch bar/a.txt
susam@nifty:~/so$ ln -s /home/susam/so/bar/ foo/baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── foo
    └── baz -> /home/susam/so/bar/

3 directories, 1 file
susam@nifty:~/so$ rm -r foo
susam@nifty:~/so$ tree
.
└── bar
    └── a.txt

1 directory, 1 file
susam@nifty:~/so$


So, we see that the target of the soft-link survives.

Example 2: Deleting a soft link to a directory

susam@nifty:~/so$ ln -s /home/susam/so/bar baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── baz -> /home/susam/so/bar

2 directories, 1 file
susam@nifty:~/so$ rm -r baz
susam@nifty:~/so$ tree
.
└── bar
    └── a.txt

1 directory, 1 file
susam@nifty:~/so$

Only the soft link is deleted. The target of the soft-link survives.

Example 3: Attempting to delete the target of a soft-link

susam@nifty:~/so$ ln -s /home/susam/so/bar baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── baz -> /home/susam/so/bar

2 directories, 1 file
susam@nifty:~/so$ rm -r baz/
rm: cannot remove 'baz/': Not a directory
susam@nifty:~/so$ tree
.
├── bar
└── baz -> /home/susam/so/bar

2 directories, 0 files

The file in the target of the symbolic link does not survive.

The above experiments were done on a Debian GNU/Linux 9.0 (stretch) system.

Wyrmwood ,Oct 30, 2014 at 20:36

rm -rf baz/* will remove the contents – Wyrmwood Oct 30 '14 at 20:36

Buttle Butkus ,Jan 12, 2016 at 0:35

Yes, if you do rm -rf [symlink], then the contents of the original directory will be obliterated! Be very careful. – Buttle Butkus Jan 12 '16 at 0:35

frnknstn ,Sep 11, 2017 at 10:22

Your example 3 is incorrect! On each system I have tried, the file a.txt will be removed in that scenario. – frnknstn Sep 11 '17 at 10:22

Susam Pal ,Sep 11, 2017 at 15:20

@frnknstn You are right. I see the same behaviour you mention on my latest Debian system. I don't remember on which version of Debian I performed the earlier experiments. In my earlier experiments on an older version of Debian, either a.txt must have survived in the third example or I must have made an error in my experiment. I have updated the answer with the current behaviour I observe on Debian 9 and this behaviour is consistent with what you mention. – Susam Pal Sep 11 '17 at 15:20

Ken Simon ,Jan 25, 2012 at 16:43

Your /home/me/msg directory will be safe if you rm -rf the directory from which you ran ls. Only the symlink itself will be removed, not the directory it points to.

The only thing I would be cautious of, would be if you called something like "rm -rf msg/" (with the trailing slash.) Do not do that because it will remove the directory that msg points to, rather than the msg symlink itself.

Susam Pal ,Jan 25, 2012 at 16:54

"The only thing I would be cautious of, would be if you called something like "rm -rf msg/" (with the trailing slash.) Do not do that because it will remove the directory that msg points to, rather than the msg symlink itself." - I don't find this to be true. See the third example in my response below. – Susam Pal Jan 25 '12 at 16:54

Andrew Crabb ,Nov 26, 2013 at 21:52

I get the same result as @Susam ('rm -r symlink/' does not delete the target of symlink), which I am pleased about as it would be a very easy mistake to make. – Andrew Crabb Nov 26 '13 at 21:52

rm should remove files and directories. If the file is a symbolic link, the link is removed, not the target. It will not interpret a symbolic link. For example, consider the behavior when deleting 'broken links': rm exits with 0, not with non-zero to indicate failure.

#### [Oct 18, 2018] 'less' command clearing screen upon exit - how to switch it off?

##### Notable quotes:
##### "... To prevent less from clearing the screen upon exit, use -X . ..."
###### Oct 18, 2018 | superuser.com

Wojciech Kaczmarek ,Feb 9, 2010 at 11:21

How to force the less program to not clear the screen upon exit?

I'd like it to behave like git log command:

• it leaves the recently seen page on screen upon exiting
• it does not exit the less even if the content fits on one screen (try git log -1 )

Any ideas? I haven't found any suitable less options nor env variables in a manual, I suspect it's set via some env variable though.

sleske ,Feb 9, 2010 at 11:59

To prevent less from clearing the screen upon exit, use -X .

From the manpage:

-X or --no-init

Disables sending the termcap initialization and deinitialization strings to the terminal. This is sometimes desirable if the deinitialization string does something unnecessary, like clearing the screen.

As to less exiting if the content fits on one screen, that's option -F :

-F or --quit-if-one-screen

Causes less to automatically exit if the entire file can be displayed on the first screen.

-F is not the default though, so it's likely preset somewhere for you. Check the env var LESS .
markpasc ,Oct 11, 2010 at 3:44

This is especially annoying if you know about -F but not -X , as then moving to a system that resets the screen on init will make short files simply not appear, for no apparent reason. This bit me with ack when I tried to take my ACK_PAGER='less -RF' setting to the Mac. Thanks a bunch! – markpasc Oct 11 '10 at 3:44

sleske ,Oct 11, 2010 at 8:45

@markpasc: Thanks for pointing that out. I would not have realized that this combination would cause this effect, but now it's obvious. – sleske Oct 11 '10 at 8:45

Michael Goldshteyn ,May 30, 2013 at 19:28

This is especially useful for the man pager, so that man pages do not disappear as soon as you quit less with the 'q' key. That is, you scroll to the position in a man page that you are interested in only for it to disappear when you quit the less pager in order to use the info. So, I added:

export MANPAGER='less -s -X -F'

to my .bashrc to keep man page info up on the screen when I quit less, so that I can actually use it instead of having to memorize it. – Michael Goldshteyn May 30 '13 at 19:28

Michael Burr ,Mar 18, 2014 at 22:00

It kinda sucks that you have to decide when you start less how it must behave when you're going to exit. – Michael Burr Mar 18 '14 at 22:00

Derek Douville ,Jul 11, 2014 at 19:11

If you want any of the command-line options to always be default, you can add to your .profile or .bashrc the LESS environment variable. For example:

export LESS="-XF"

will always apply -X -F whenever less is run from that login session.

Sometimes commands are aliased (even by default in certain distributions). To check for this, type

alias

without arguments to see if it got aliased with options that you don't want. To run the actual command in your $PATH instead of an alias, just preface it with a back-slash :

\less


To see if a LESS environment variable is set in your environment and affecting behavior:

echo $LESS

dotancohen ,Sep 2, 2014 at 10:12

In fact, I add export LESS="-XFR" so that the colors show through less as well. – dotancohen Sep 2 '14 at 10:12

Giles Thomas ,Jun 10, 2015 at 12:23

Thanks for that! -XF on its own was breaking the output of git diff , and -XFR gets the best of both worlds -- no screen-clearing, but coloured git diff output. – Giles Thomas Jun 10 '15 at 12:23

#### [Oct 18, 2018] Isn't less just more

##### Highly recommended!
###### Oct 18, 2018 | unix.stackexchange.com

Bauna ,Aug 18, 2010 at 3:07

less is a lot more than more , for instance you have a lot more functionality:

g: go top of the file
G: go bottom of the file
/: search forward
?: search backward
N: show line number
: goto line
F: similar to tail -f, stop with ctrl+c
S: split lines

And I don't remember more ;-)

törzsmókus ,Feb 19 at 13:19

h : everything you don't remember ;) – törzsmókus Feb 19 at 13:19

KeithB ,Aug 18, 2010 at 0:36

There are a couple of things that I do all the time in less that don't work in more (at least the versions on the systems I use). One is using G to go to the end of the file, and g to go to the beginning. This is useful for log files, when you are looking for recent entries at the end of the file. The other is search, where less highlights the match, while more just brings you to the section of the file where the match occurs, but doesn't indicate where it is.

geoffc ,Sep 8, 2010 at 14:11

Less has a lot more functionality. You can use v to jump into the current $EDITOR. You can convert to tail -f mode with F as well as all the other tips others offered.

Ubuntu still has distinct less/more bins. At least mine does, or the more command is sending different arguments to less.

In any case, to see the difference, find a file that has more rows than you can see at one time in your terminal. Type cat , then the file name. It will just dump the whole file. Type more , then the file name. If on ubuntu, or at least my version (9.10), you'll see the first screen, then --More--(27%) , which means there's more to the file, and you've seen 27% so far. Press space to see the next page. less allows moving line by line, back and forth, plus searching and a whole bunch of other stuff.

Basically, use less . You'll probably never need more for anything. I've used less on huge files and it seems OK. I don't think it does crazy things like load the whole thing into memory ( cough Notepad). Showing line numbers could take a while, though, with huge files.

#### [Oct 18, 2018] What are the differences between most, more and less

##### Highly recommended!
###### Jun 29, 2013 | unix.stackexchange.com

Smith John ,Jun 29, 2013 at 13:16

more

more is an old utility. When the text passed to it is too large to fit on one screen, it pages it. You can scroll down but not up.

Some systems hardlink more to less , providing users with a strange hybrid of the two programs that looks like more and quits at the end of the file like more but has some less features such as backwards scrolling. This is a result of less 's more compatibility mode. You can enable this compatibility mode temporarily with LESS_IS_MORE=1 less ... .

more passes raw escape sequences by default. Escape sequences tell your terminal which colors to display.

less

less was written by a man who was fed up with more 's inability to scroll backwards through a file. He turned less into an open source project and over time, various individuals added new features to it. less is massive now. That's why some small embedded systems have more but not less . For comparison, less 's source is over 27000 lines long. more implementations are generally only a little over 2000 lines long.

In order to get less to pass raw escape sequences, you have to pass it the -r flag. You can also tell it to only pass ANSI escape characters by passing it the -R flag.

most

most is supposed to be more than less . It can display multiple files at a time. By default, it truncates long lines instead of wrapping them and provides a left/right scrolling mechanism. most's website has no information about most 's features. Its manpage indicates that it is missing at least a few less features such as log-file writing (you can use tee for this though) and external command running.

By default, most uses strange non-vi-like keybindings. man most | grep '\<vi.?\>' doesn't return anything so it may be impossible to put most into a vi-like mode.

most has the ability to decompress gzip-compressed files before reading. Its status bar has more information than less 's.

most passes raw escape sequences by default.

tifo ,Oct 14, 2014 at 8:44

Short answer: Just use less and forget about more

Longer version:

more is an old utility. You can't move freely through a file with more: you can press Space to page forward or Enter to advance line by line, and that is about it. less is more plus additional features. You can browse page-wise and line-wise, both up and down, and search.

Jonathan.Brink ,Aug 9, 2015 at 20:38

If "more" is lacking for you and you know a few vi commands use "less" – Jonathan.Brink Aug 9 '15 at 20:38

Wilko Fokken ,Jan 30, 2016 at 20:31

There is one single application whereby I prefer more to less :

To check my LATEST modified log files (in /var/log/ ), I use ls -AltF | more .

While less deletes the screen after exiting with q , more leaves those files and directories listed by ls on the screen, sparing me memorizing their names for examination.

(Should anybody know a parameter or configuration enabling less to keep its text after exiting, that would render this post obsolete.)

Jan Warchoł ,Mar 9, 2016 at 10:18

The parameter you want is -X (long form: --no-init ). From less ' manpage:

Disables sending the termcap initialization and deinitialization strings to the terminal. This is sometimes desirable if the deinitialization string does something unnecessary, like clearing the screen.
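Applied to the log-listing use case above, the -X option can be set once in a shell startup file. The following is only a sketch: the option strings are the ones quoted by the commenters in these threads, and recent_logs is a hypothetical helper name introduced here for illustration.

```shell
# Lines intended for ~/.profile or ~/.bashrc (assumed location).

# -X: do not clear the screen on exit; -F: quit at once if the text fits
# on one screen; -R: pass ANSI color escape sequences through.
export LESS="-XFR"

# Keep man page text on screen after quitting the pager.
export MANPAGER="less -s -X -F"

# Hypothetical helper: page the most recently modified entries of a
# directory (default /var/log) and leave the listing visible afterwards.
recent_logs() {
  ls -AltF "${1:-/var/log}" | less -X
}
```

Because LESS is read by every less invocation in the session, the explicit -X inside the helper is redundant once the variable is set; it is kept so the function also behaves the same when LESS is unset.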

#### [Oct 16, 2018] Taking Command of the Terminal with GNU Screen

##### "... Note that byobu doesn't actually do anything to screen itself. It's an elaborate (and pretty groovy) screen configuration customization. You could do something similar on your own by hacking your ~/.screenrc, but the byobu maintainers have already done it for you. ..."
###### Oct 16, 2018 | www.linux.com

Can I have a Copy of That?

Want a quick and dirty way to take notes of what's on your screen? Yep, there's a command for that. Run Ctrl-a h and screen will save a text file called "hardcopy.n" in your current directory that has all of the existing text. Want to get a quick snapshot of the top output on a system? Just run Ctrl-a h and there you go.

You can also save a log of what's going on in a window by using Ctrl-a H . This will create a file called screenlog.0 in the current directory. Note that it may have limited usefulness if you're doing something like editing a file in Vim, and the output can look pretty odd if you're doing much more than entering a few simple commands. To close a screenlog, use Ctrl-a H again.

Note if you want a quick glance at the system info, including hostname, system load, and system time, you can get that with Ctrl-a t .

Simplifying Screen with Byobu

If the screen commands seem a bit too arcane to memorize, don't worry. You can tap the power of GNU Screen in a slightly more user-friendly package called byobu . Basically, byobu is a souped-up screen profile originally developed for Ubuntu. Not using Ubuntu? No problem, you can find RPMs or a tarball with the profiles to install on other Linux distros or Unix systems that don't feature a native package.

Note that byobu doesn't actually do anything to screen itself. It's an elaborate (and pretty groovy) screen configuration customization. You could do something similar on your own by hacking your ~/.screenrc, but the byobu maintainers have already done it for you.

Since most of byobu is self-explanatory, I won't go into great detail about using it. You can launch byobu by running byobu . You'll see a shell prompt plus a few lines at the bottom of the screen with additional information about your system, such as the system CPUs, uptime, and system time. To get a quick help menu, hit F9 and then use the Help entry. Most of the commands you would use most frequently are assigned F keys as well. Creating a new window is F2, cycling between windows is F3 and F4, and detaching from a session is F6. To re-title a window use F8, and if you want to lock the screen use F12.

The only downside to byobu is that it's not going to be on all systems, and in a pinch it may help to know your way around plain-vanilla screen rather than byobu.

For an easy reference, here's a list of the most common screen commands that you'll want to know. This isn't exhaustive, but it should be enough for most users to get started using screen happily for most use cases.

• Start Screen: screen
• Detach Screen: Ctrl-a d
• Re-attach Screen: screen -x or screen -x PID
• Split Horizontally: Ctrl-a S
• Split Vertically: Ctrl-a |
• Move Between Regions: Ctrl-a Tab
• Name Window: Ctrl-a A
• Log Session: Ctrl-a H
• Note Session: Ctrl-a h

Finally, if you want help on GNU Screen, use the man page (man screen) and its built-in help with Ctrl-a :help. Screen has quite a few advanced options that are beyond an introductory tutorial, so be sure to check out the man page when you have the basics down.

#### [Oct 16, 2018] How To Use Linux Screen

###### Oct 16, 2018 | linuxize.com

Working with Linux Screen Windows

When you start a new screen session by default it creates a single window with a shell in it.

You can have multiple windows inside a Screen session.

To create a new window with a shell, type Ctrl+a  c . The first available number from the range  0...9  will be assigned to it.

Below are some of the most common commands for managing Linux Screen windows:

• Ctrl+a  c  Create a new window (with shell)
• Ctrl+a  "  List all windows
• Ctrl+a  0  Switch to window 0 (by number )
• Ctrl+a  A  Rename the current window
• Ctrl+a  S  Split current region horizontally into two regions
• Ctrl+a  |  Split current region vertically into two regions
• Ctrl+a  tab  Switch the input focus to the next region
• Ctrl+a  Ctrl+a  Toggle between the current and previous window
• Ctrl+a  Q  Close all regions but the current one
• Ctrl+a  X  Close the current region

Detach from Linux Screen Session

You can detach from the screen session at anytime by typing:

Ctrl+a  d 

The program running in the screen session will continue to run after you detach from the session.

To resume your screen session use the following command:

screen -r


In case you have multiple screen sessions running on your machine you will need to append the screen session ID after the -r  switch.

To find the session ID list the current running screen sessions with:

screen -ls

There are screens on:
10835.pts-0.linuxize-desktop   (Detached)
10366.pts-0.linuxize-desktop   (Detached)
2 Sockets in /run/screens/S-linuxize.


If you want to restore screen 10835.pts-0, then type the following command:

screen -r 10835
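Picking the session ID out of the listing by eye gets tedious with many sessions. As a rough sketch, the ID can be extracted with awk. The text below is the sample listing from this article pasted into a variable so the example runs without screen installed; on a live system the same filter would read from screen -ls instead. detached_ids is a hypothetical helper name, not part of screen.

```shell
#!/bin/sh
# Sample listing copied from the article (whitespace simplified).
sample_listing='There are screens on:
    10835.pts-0.linuxize-desktop   (Detached)
    10366.pts-0.linuxize-desktop   (Detached)
2 Sockets in /run/screens/S-linuxize.'

# Hypothetical helper: print the numeric ID of every detached session.
# The ID is the part of the first field before the first dot.
detached_ids() {
  awk '/\(Detached\)/ { sub(/\..*/, "", $1); print $1 }'
}

ids=$(printf '%s\n' "$sample_listing" | detached_ids)
first_id=$(printf '%s\n' "$ids" | head -n 1)

# Print the command rather than running it, since screen may be absent.
echo "would run: screen -r $first_id"
```

On a real machine the pipeline would start with screen -ls | detached_ids. The exact layout of that output may differ between screen versions, so the pattern is an assumption worth checking locally.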


When screen is started it reads its configuration parameters from /etc/screenrc  and ~/.screenrc  if the file is present. We can modify the default Screen settings according to our own preferences using the .screenrc  file.

Here is a sample ~/.screenrc  configuration with customized status line and few additional options:

~/.screenrc
# Turn off the welcome message
startup_message off

# Disable visual bell
vbell off

# Set scrollback buffer to 10000
defscrollback 10000

# Customize the status line
hardstatus alwayslastline
hardstatus string '%{= kG}[ %{G}%H %{g}][%= %{= kw}%?%-Lw%?%{r}(%{W}%n*%f%t%?(%u)%?%{r})%{w}%?%+


#### [Oct 14, 2018] Linux and Unix cut command tutorial with examples by George Ornbo

###### Jul 19, 2016 | shapeshed.com
... ... ...

How to cut by complement pattern

To cut by complement use the --complement option. Note this option is not available on the BSD version of cut . The --complement option selects the inverse of the options passed to cut.

In the following example the -c option is used to select the first character. Because the --complement option is also passed to cut, the first character is dropped and the second and third characters are output instead.

echo 'foo' | cut --complement -c 1
oo

How to modify the output delimiter

To modify the output delimiter use the --output-delimiter option. Note that this option is not available on the BSD version of cut . In the following example a semi-colon is converted to a space and the first, third and fourth fields are selected.

echo 'how;now;brown;cow' | cut -d ';' -f 1,3,4 --output-delimiter=' '
how brown cow
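Since both --complement and --output-delimiter are missing from BSD cut, the same results can be approximated with POSIX tools. This is a sketch of one possible workaround, not something taken from the original article; the variable names are arbitrary.

```shell
#!/bin/sh
# Inverse selection without --complement: name the range you want to keep.
# "Everything except character 1" is the same as "characters 2 to the end".
rest=$(echo 'foo' | cut -c 2-)

# Changing the output delimiter without --output-delimiter: awk splits on
# ';' (-F) and joins the printed fields with its default output separator,
# a single space.
fields=$(echo 'how;now;brown;cow' | awk -F';' '{ print $1, $3, $4 }')

printf '%s\n' "$rest" "$fields"
```

The awk form also allows reordering fields, which cut cannot do, at the cost of a slightly longer command.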


George Ornbo is a hacker, futurist, blogger and Dad based in Buckinghamshire, England. He is the author of Sams Teach Yourself Node.js in 24 Hours. He can be found in most of the usual places as shapeshed including Twitter and GitHub.

#### [Oct 12, 2018] How To Create And Maintain Your Own Man Pages by sk

###### Oct 09, 2018 | www.ostechnix.com

We have already discussed a few good alternatives to man pages. Those alternatives are mainly used for learning concise Linux command examples without having to go through the comprehensive man pages. If you're looking for a quick and dirty way to learn a Linux command, those alternatives are worth trying. Now, you might be thinking: how can I create my own man-like help pages for a Linux command? This is where "Um" comes in handy. Um is a command line utility used to easily create and maintain your own man pages that contain only what you've learned about a command so far.

By creating your own alternative to man pages, you can avoid lots of unnecessary, comprehensive details in a man page and include only what is necessary to keep in mind. If you ever wanted to create your own set of man-like pages, Um will definitely help. In this brief tutorial, we will see how to install the "Um" command line utility and how to create our own man pages.

Installing Um

Um is available for Linux and Mac OS. At present, it can only be installed using the Linuxbrew package manager on Linux systems. Refer to the Linuxbrew documentation if you haven't installed Linuxbrew yet.

Once Linuxbrew is installed, run the following command to install the Um utility.

$ brew install sinclairtarget/wst/um

If you see output something like the following, congratulations! Um has been installed and is ready to use.

[...]
==> Installing sinclairtarget/wst/um
==> Downloading https://github.com/sinclairtarget/um/archive/4.0.0.tar.gz
==> Downloading from https://codeload.github.com/sinclairtarget/um/tar.gz/4.0.0
-=#=# # #
==> Downloading https://rubygems.org/gems/kramdown-1.17.0.gem
######################################################################## 100.0%
==> gem install /home/sk/.cache/Homebrew/downloads/d0a5d978120a791d9c5965fc103866815189a4e3939
==> Caveats
Bash completion has been installed to:
/home/linuxbrew/.linuxbrew/etc/bash_completion.d
==> Summary
/home/linuxbrew/.linuxbrew/Cellar/um/4.0.0: 714 files, 1.3MB, built in 35 seconds
==> Caveats
==> openssl
A CA file has been bootstrapped using certificates from the SystemRoots
keychain. To add additional certificates (e.g. the certificates added in
the System keychain), place .pem files in
/home/linuxbrew/.linuxbrew/etc/openssl/certs

and run
/home/linuxbrew/.linuxbrew/opt/openssl/bin/c_rehash
==> ruby
Emacs Lisp files have been installed to:
/home/linuxbrew/.linuxbrew/share/emacs/site-lisp/ruby
==> um
Bash completion has been installed to:
/home/linuxbrew/.linuxbrew/etc/bash_completion.d

Before using Um to make your man pages, you need to enable bash completion for it. To do so, open your ~/.bash_profile file:

$ nano ~/.bash_profile


And, add the following lines in it:

if [ -f "$(brew --prefix)/etc/bash_completion.d/um-completion.sh" ]; then
  . "$(brew --prefix)/etc/bash_completion.d/um-completion.sh"
fi


Save and close the file. Run the following command to apply the changes.

$ source ~/.bash_profile

All done. Let us go ahead and create our first man page.

###### Create And Maintain Your Own Man Pages

Let us say you want to create your own man page for the "dpkg" command. To do so, run:

$ um edit dpkg

The above command will open a markdown template in your default editor:

Create a new man page

My default editor is Vi, so the above commands open it in the Vi editor. Now, start adding everything you want to remember about "dpkg" command in this template.

Here is a sample:

Add contents in dpkg man page

As you see in the above output, I have added a synopsis, a description and two options for the dpkg command. You can add as many sections as you want to the man pages. Make sure you have given proper and easily understandable titles for each section. Once done, save and quit the file (if you use the Vi editor, press the ESC key and type :wq ).

Finally, view your newly created man page using command:

$ um dpkg

View dpkg man page

As you can see, the dpkg man page looks exactly like the official man pages. If you want to edit or add more details to a man page, run the edit command again and add the details.

$ um edit dpkg

To view the list of newly created man pages using Um, run:

$ um list

All man pages will be saved under a directory named .um in your home directory.

If you don't want a particular page, simply delete it as shown below.

$ um rm dpkg

To view the help section and all available general options, run:

$ um --help
usage: um <page name>
       um <sub-command> [ARGS...]

The first form is equivalent to um read <page name>.

Subcommands:
  um (l)ist                 List the available pages for the current topic.
  um (r)ead <page name>     Read the given page under the current topic.
  um (e)dit <page name>     Create or edit the given page under the current topic.
  um rm <page name>         Remove the given page.
  um (t)opic [topic]        Get or set the current topic.
  um topics                 List all topics.
  um (c)onfig [config key]  Display configuration environment.
  um (h)elp [sub-command]   Display this help message, or the help message for a sub-command.

Configure Um

To view the current configuration, run:

$ um config
Options prefixed by '*' are set in /home/sk/.um/umconfig.
editor = vi
pager = less
pages_directory = /home/sk/.um/pages
default_topic = shell
pages_ext = .md

In this file, you can edit and change the values for pager , editor , default_topic , pages_directory , and pages_ext options as you wish. Say for example, if you want to save the newly created Um pages in your Dropbox folder, simply change the value of pages_directory directive and point it to the Dropbox folder in ~/.um/umconfig file.

pages_directory = /Users/myusername/Dropbox/um

And that's all for now. Hope this was useful. More good stuff to come. Stay tuned!

Cheers!


#### [Sep 27, 2018] bash - Conflict between pushd . and cd - - Unix Linux Stack Exchange

###### Sep 27, 2018 | unix.stackexchange.com

Bernhard ,Feb 21, 2012 at 12:07

I am a happy user of the cd - command to go to the previous directory. At the same time I like pushd . and popd .

However, when I want to remember the current working directory by means of pushd . , I lose the possibility to go to the previous directory by cd - . (As pushd . also performs cd . ).

How can I use pushd and still be able to use cd - ?

By the way: GNU bash, version 4.1.7(1)

Patrick ,Feb 21, 2012 at 12:39

Why not use pwd to figure out where you are? – Patrick Feb 21 '12 at 12:39

Bernhard ,Feb 21, 2012 at 12:46

I don't understand your question? The point is that pushd breaks the behavior of cd - that I want (or expect). I know perfectly well in which directory I am, but I want to increase the speed with which I change directories :) – Bernhard Feb 21 '12 at 12:46

jofel ,Feb 21, 2012 at 14:39

Do you know zsh ? It has really nice features like AUTO_PUSHD. – jofel Feb 21 '12 at 14:39

Theodore R. Smith ,Feb 21, 2012 at 16:26

+1 Thank you for teaching me about cd -! For most of a decade, I've been doing cd $OLDPWD instead. – Theodore R. Smith Feb 21 '12 at 16:26

Patrick ,Feb 22, 2012 at 1:58

@bernhard Oh, I misunderstood what you were asking. You were wanting to know how to store the current working directory. I was interpreting it as you wanted to remember (as in you forgot) your current working directory. – Patrick Feb 22 '12 at 1:58

Wojtek Rzepala ,Feb 21, 2012 at 12:32

You can use something like this:
push() {
    if [ "$1" = . ]; then
        old=$OLDPWD
        current=$PWD
        builtin pushd .
        cd "$old"
        cd "$current"
    else
        builtin pushd "$1"
    fi
}


If you name it pushd , then it will have precedence over the built-in as functions are evaluated before built-ins.

You need variables old and current as overwriting OLDPWD will make it lose its special meaning.

Bernhard ,Feb 21, 2012 at 12:41

This works perfectly for me. Is there no such feature in the built-in pushd? As I would always prefer a standard solution. Thanks for this function however, maybe I will leave out the argument and it's checking at some point. – Bernhard Feb 21 '12 at 12:41

bsd ,Feb 21, 2012 at 12:53

There is no such feature in the builtin. Your own function is the best solution because pushd and popd both call cd modifying $OLDPWD, hence the source of your problem. I would name the function saved and use it in the context you like too, that of saving cwd. – bsd Feb 21 '12 at 12:53

Wildcard ,Mar 29, 2016 at 23:08

You might also want to unset old and current after you're done with them. – Wildcard Mar 29 '16 at 23:08

Kevin ,Feb 21, 2012 at 16:11

A slightly more concise version of Wojtek's answer:

pushd () {
    if [ "$1" = . ]; then
        cd -
        builtin pushd -
    else
        builtin pushd "$1"
    fi
}

By naming the function pushd , you can use pushd as normal; you don't need to remember to use the function name.

Kevin's answer is excellent. I've written up some details about what's going on, in case people are looking for a better understanding of why their script is necessary to solve the problem.

The reason that pushd . breaks the behavior of cd - will be apparent if we dig into the workings of cd and the directory stack. Let's push a few directories onto the stack:

$ mkdir dir1 dir2 dir3
$ pushd dir1
~/dir1 ~
$ pushd ../dir2
~/dir2 ~/dir1 ~
$ pushd ../dir3
~/dir3 ~/dir2 ~/dir1 ~
$ dirs -v
0       ~/dir3
1       ~/dir2
2       ~/dir1
3       ~


Now we can try cd - to jump back a directory:

$ cd -
/home/username/dir2
$ dirs -v
0       ~/dir2
1       ~/dir2
2       ~/dir1
3       ~


We can see that cd - jumped us back to the previous directory, replacing stack ~0 with the directory we jumped into. We can jump back with cd - again:

$ cd -
/home/username/dir3
$ dirs -v
0       ~/dir3
1       ~/dir2
2       ~/dir1
3       ~


Notice that we jumped back to our previous directory, even though the previous directory wasn't actually listed in the directory stack. This is because cd uses the environment variable $OLDPWD to keep track of the previous directory:

$ echo $OLDPWD
/home/username/dir2

If we do pushd . we will push an extra copy of the current directory onto the stack:

$ pushd .
~/dir3 ~/dir3 ~/dir2 ~/dir1 ~
$ dirs -v
0       ~/dir3
1       ~/dir3
2       ~/dir2
3       ~/dir1
4       ~

In addition to making an extra copy of the current directory in the stack, pushd . has updated $OLDPWD :

$ echo $OLDPWD
/home/username/dir3


So cd - has lost its useful history, and will now just move you to the current directory - accomplishing nothing.
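The effect described above can be reproduced without pushd at all, since the culprit is the directory change itself. The following is a minimal sketch, assuming that /tmp and / exist (any two directories would do); the variable names before and after are only for illustration.

```shell
# Assumption: /tmp and / both exist; any two directories would do.
cd /tmp
cd /
before=$OLDPWD    # the directory that cd - would return to

cd .              # the same directory change that pushd . performs
after=$OLDPWD     # now identical to $PWD, so cd - goes nowhere

echo "before: $before"
echo "after:  $after"
```

This is why the wrapper functions above save or re-create $OLDPWD around the call to the builtin.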

#### [Sep 26, 2018] bash - removing or clearing stack of popd-pushd paths

###### Sep 26, 2018 | unix.stackexchange.com

chrisjlee ,Feb 9, 2012 at 6:24

After pushd ing too many times, I want to clear the whole stack of paths.

How would I popd all the items in the stack?

I'd like to popd without needing to know how many are in the stack?

The bash manual doesn't seem to cover this .

Why do I need to know this? I'm fastidious and want to clean out the stack.

jw013 ,Feb 9, 2012 at 6:39

BTW, the complete bash manual is over at gnu.org. If you use the all on one page version, it may be easier to find stuff there. – jw013 Feb 9 '12 at 6:39

jw013 ,Feb 9, 2012 at 6:37

dirs -c is what you are looking for.

Eliran Malka ,Mar 23, 2017 at 15:20

this does empty the stack, but does not restore the working directory from the stack bottom – Eliran Malka Mar 23 '17 at 15:20

Eliran Malka ,Mar 23, 2017 at 15:37

In order to both empty the stack and restore the working directory from the stack bottom, either:
• retrieve that directory from dirs , change to that directory, and then clear the stack:

cd "$(dirs -l -0)" && dirs -c

The -l option here will list full paths, to make sure we don't fail if we try to cd into ~ , and the -0 retrieves the first entry from the stack bottom.

@jw013 suggested making this command more robust, by avoiding path expansions:

pushd -0 && dirs -c

• or, popd until you encounter an error (which is the status of a popd call when the directory stack is empty):

while (($? == 0 )); do popd; done
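One caveat with the loop above is that its first test looks at the exit status of whatever command happened to run before it. A possible variant, sketched here under the assumption that the shell is bash, tests popd directly. The function name unwind_dirstack is hypothetical, not a builtin.

```shell
# Hypothetical helper (not a bash builtin): pop entries until popd fails.
# Assumes bash, where popd returns non-zero once the stack is empty.
unwind_dirstack() {
    while popd >/dev/null 2>&1; do
        :   # each successful popd has already changed directory
    done
}
```

After something like pushd /tmp followed by pushd / , calling unwind_dirstack should leave you in the directory you started from. As with the original loop, it stops at the first popd that fails, so a stack entry pointing at a removed directory can end the unwinding early.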


Chuck Wilbur ,Nov 14, 2017 at 18:21

The first method is exactly what I wanted. The second wouldn't work in my case since I had called pushd a few times, then removed one of the directories in the middle, then popd was failing when I tried to unroll. I needed to jump over all the buggered up stuff in the middle to get back to where I started. – Chuck Wilbur Nov 14 '17 at 18:21

Eliran Malka ,Nov 14, 2017 at 22:51

right @ChuckWilbur - if you scrambled the dir stack, popd won't save you :) – Eliran Malka Nov 14 '17 at 22:51

jw013 ,Dec 7, 2017 at 20:50

It's better to pushd -0 instead of cd "$(dirs ...)" . – jw013 Dec 7 '17 at 20:50

Eliran Malka ,Dec 11, 2017 at 13:56

@jw013 how so? that would mess with the dir stack even more (which we're trying to clear here..) – Eliran Malka Dec 11 '17 at 13:56

jw013 ,Dec 12, 2017 at 15:31

cd "$(...)" works in 90%, probably even 99% of use cases, but with pushd -0 you can confidently say 100%. There are so many potential gotchas and edge cases associated with expanding file/directory paths in the shell that the most robust thing to do is just avoid it altogether, which pushd -0 does very concisely.

There is no chance of getting caught by a bug with a weird edge case if you never take the risk. If you want further reading on the possible headaches involved with Unix file / path names, a good starting point is mywiki.wooledge.org/ParsingLs – jw013 Dec 12 '17 at 15:31

#### [Sep 25, 2018] Sorting Text

##### "... Fortunately, the GNU implementation in the coreutils package [1] remedies that deficiency via the --stable option ..."
###### Sep 25, 2018 | www.amazon.com
Like awk , cut , and join , sort views its input as a stream of records made up of fields of variable width, with records delimited by newline characters and fields delimited by whitespace or a user-specifiable single character.

sort

Usage
sort [ options ] [ file(s) ]
Purpose
Sort input lines into an order determined by the key field and datatype options, and the locale.
Major options
-b
Ignore leading whitespace.
-c
Check that input is correctly sorted. There is no output, but the exit code is nonzero if the input is not sorted.
-d
Dictionary order: only alphanumerics and whitespace are significant.
-g
General numeric value: compare fields as floating-point numbers. This works like -n , except that numbers may have decimal points and exponents (e.g., 6.022e+23 ). GNU version only.
-f
Fold letters implicitly to a common lettercase so that sorting is case-insensitive.
-i
Ignore nonprintable characters.
-k
Define the sort key field.
-m
Merge already-sorted input files into a sorted output stream.
-n
Compare fields as integer numbers.
-o outfile
Write output to the specified file instead of to standard output. If the file is one of the input files, sort copies it to a temporary file before sorting and writing the output.
-r
Reverse the sort order to descending, rather than the default ascending.
-t char
Use the single character char as the default field separator, instead of the default of whitespace.
-u
Unique records only: discard all but the first record in a group with equal keys. Only the key fields matter: other parts of the discarded records may differ.
Behavior
sort reads the specified files, or standard input if no files are given, and writes the sorted data on standard output.
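A few of the options listed above can be tried on a throwaway list of numbers. This is only a sketch: the sample data and the variable names are made up for illustration.

```shell
# Made-up sample data: one number per line, with a duplicate.
nums=$(printf '10\n9\n100\n9\n')

# Default comparison treats each line as text (byte order in the C locale).
lexical=$(printf '%s\n' "$nums" | LC_ALL=C sort)

# -n compares the lines as numbers instead.
numeric=$(printf '%s\n' "$nums" | sort -n)

# -r reverses the order and -u drops records with equal keys.
unique_desc=$(printf '%s\n' "$nums" | sort -n -r -u)

# -c prints nothing; only the exit status says whether the input is sorted.
if printf '%s\n' "$numeric" | sort -n -c; then
    check=sorted
else
    check=unsorted
fi

printf '%s\n' "$lexical" "--" "$numeric" "--" "$unique_desc" "--" "$check"
```

Comparing the first two results shows why -n matters: as text, 10 and 100 sort before 9.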
Sorting by Lines

In the simplest case, when no command-line options are supplied, complete records are sorted according to the order defined by the current locale. In the traditional C locale, that means ASCII order, but you can set an alternate locale as we described in Section 2.8 . A tiny bilingual dictionary in the ISO 8859-1 encoding translates four French words differing only in accents:

$ cat french-english                           Show the tiny dictionary
côte    coast
cote    dimension
coté    dimensioned
côté    side

To understand the sorting, use the octal dump tool, od , to display the French words in ASCII and octal:

$ cut -f1 french-english | od -a -b            Display French words in octal bytes
0000000   c   t   t   e  nl   c   o   t   e  nl   c   o   t   i  nl   c
        143 364 164 145 012 143 157 164 145 012 143 157 164 351 012 143
0000020   t   t   i  nl
        364 164 351 012
0000024

Evidently, with the ASCII option -a , od strips the high-order bit of characters, so the accented letters have been mangled, but we can see their octal values: é is octal 351 and ô is octal 364. On GNU/Linux systems, you can confirm the character values like this:

$ man iso_8859_1                               Check the ISO 8859-1 manual page
...
Oct   Dec   Hex   Char   Description
--------------------------------------------------------------------
...
351   233   E9     é     LATIN SMALL LETTER E WITH ACUTE
...
364   244   F4     ô     LATIN SMALL LETTER O WITH CIRCUMFLEX
...

First, sort the file in strict byte order:

$ LC_ALL=C sort french-english                 Sort in traditional ASCII order
cote    dimension
coté    dimensioned
côte    coast
côté    side

Notice that e (octal 145) sorted before é (octal 351), and o (octal 157) sorted before ô (octal 364), as expected from their numerical values. Now sort the text in Canadian-French order:
$ LC_ALL=fr_CA.iso88591 sort french-english    Sort in Canadian-French locale
côte    coast
cote    dimension
coté    dimensioned
côté    side

The output order clearly differs from the traditional ordering by raw byte values. Sorting conventions are strongly dependent on language, country, and culture, and the rules are sometimes astonishingly complex. Even English, which mostly pretends that accents are irrelevant, can have complex sorting rules: examine your local telephone directory to see how lettercase, digits, spaces, punctuation, and name variants like McKay and Mackay are handled.

Sorting by Fields

For more control over sorting, the -k option allows you to specify the field to sort on, and the -t option lets you choose the field delimiter. If -t is not specified, then fields are separated by whitespace and leading and trailing whitespace in the record is ignored. With the -t option, the specified character delimits fields, and whitespace is significant. Thus, a three-character record consisting of space-X-space has one field without -t , but three with -t ' ' (the first and third fields are empty).

The -k option is followed by a field number, or number pair, optionally separated by whitespace after -k . Each number may be suffixed by a dotted character position, and/or one of the modifier letters shown in the following table.

| Letter | Description |
|--------|-------------|
| b | Ignore leading whitespace. |
| d | Dictionary order. |
| f | Fold letters implicitly to a common lettercase. |
| g | Compare as general floating-point numbers. GNU version only. |
| i | Ignore nonprintable characters. |
| n | Compare as (integer) numbers. |
| r | Reverse the sort order. |

Fields and characters within fields are numbered starting from one. If only one field number is specified, the sort key begins at the start of that field, and continues to the end of the record ( not the end of the field). If a comma-separated pair of field numbers is given, the sort key starts at the beginning of the first field, and finishes at the end of the second field.

With a dotted character position, comparison begins (first of a number pair) or ends (second of a number pair) at that character position: -k2.4,5.6 compares starting with the fourth character of the second field and ending with the sixth character of the fifth field. If the start of a sort key falls beyond the end of the record, then the sort key is empty, and empty sort keys sort before all nonempty ones. When multiple -k options are given, sorting is by the first key field, and then, when records match in that key, by the second key field, and so on.

Note: While the -k option is available on all of the systems that we tested, sort also recognizes an older field specification, now considered obsolete, where fields and character positions are numbered from zero. The key start for character m in field n is defined by +n.m , and the key end by -n.m . For example, sort +2.1 -3.2 is equivalent to sort -k3.2,4.3 . If the character position is omitted, it defaults to zero. Thus, +4.0nr and +4nr mean the same thing: a numeric key, beginning at the start of the fifth field, to be sorted in reverse (descending) order.

Let's try out these options on a sample password file, sorting it by the username, which is found in the first colon-separated field:

$ sort -t: -k1,1 /etc/passwd               Sort by username
chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash
groucho:x:12503:2000:Groucho Marx:/home/groucho:/bin/sh
gummo:x:12504:3000:Gummo Marx:/home/gummo:/usr/local/bin/ksh93
harpo:x:12502:1000:Harpo Marx:/home/harpo:/bin/ksh
root:x:0:0:root:/root:/bin/bash
zeppo:x:12505:1000:Zeppo Marx:/home/zeppo:/bin/zsh
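The rule that a single field number extends the key to the end of the record is easy to overlook, so here is a small sketch of the difference. The two records and the variable names are made up for illustration, and LC_ALL=C is set only to make the text comparison predictable.

```shell
# Two made-up records: field 1 is identical, field 2 differs numerically.
data='a:9
a:10'

# Open-ended key: -k1 runs from field 1 to the end of the record, so the
# whole line is compared as text and the numeric second key is never needed.
open_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k1 -k2n)

# Bounded keys: field 1 only, then field 2 only, compared as numbers.
bounded=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k1,1 -k2,2n)

printf 'open-ended:\n%s\nbounded:\n%s\n' "$open_key" "$bounded"
```

With the open-ended key, a:10 comes out ahead of a:9 because the character 1 sorts before 9; with bounded keys the numeric comparison of the second field decides the order.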


For more control, add a modifier letter in the field selector to define the type of data in the field and the sorting order. Here's how to sort the password file by descending UID:

$ sort -t: -k3nr /etc/passwd               Sort by descending UID
zeppo:x:12505:1000:Zeppo Marx:/home/zeppo:/bin/zsh
gummo:x:12504:3000:Gummo Marx:/home/gummo:/usr/local/bin/ksh93
groucho:x:12503:2000:Groucho Marx:/home/groucho:/bin/sh
harpo:x:12502:1000:Harpo Marx:/home/harpo:/bin/ksh
chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
root:x:0:0:root:/root:/bin/bash

A more precise field specification would have been -k3nr,3 (that is, from the start of field three, numerically, in reverse order, to the end of field three), or -k3,3nr , or even -k3,3 -n -r , but sort stops collecting a number at the first nondigit, so -k3nr works correctly.

In our password file example, three users have a common GID in field 4, so we could sort first by GID, and then by UID, with:

$ sort -t: -k4n -k3n /etc/passwd           Sort by GID and UID
root:x:0:0:root:/root:/bin/bash
chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash
harpo:x:12502:1000:Harpo Marx:/home/harpo:/bin/ksh
zeppo:x:12505:1000:Zeppo Marx:/home/zeppo:/bin/zsh
groucho:x:12503:2000:Groucho Marx:/home/groucho:/bin/sh
gummo:x:12504:3000:Gummo Marx:/home/gummo:/usr/local/bin/ksh93
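The same two sorts can be written with explicitly bounded keys, which states the intent without relying on number parsing stopping at the colon. A minimal sketch follows, using a few made-up records in passwd layout rather than a real /etc/passwd.

```shell
# Made-up records in passwd layout (name:x:UID:GID); not a real /etc/passwd.
users='harpo:x:12502:1000
root:x:0:0
groucho:x:12503:2000
chico:x:12501:1000'

# Primary key: field 4 (GID), numeric. Secondary key: field 3 (UID), numeric.
by_gid_uid=$(printf '%s\n' "$users" | sort -t: -k4,4n -k3,3n)

# Descending UID, with the key closed at the end of field 3.
by_uid_desc=$(printf '%s\n' "$users" | sort -t: -k3,3nr)

printf 'by GID, then UID:\n%s\n\nby descending UID:\n%s\n' "$by_gid_uid" "$by_uid_desc"
```

Within GID 1000, chico precedes harpo because the second key breaks the tie on UID.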


The useful -u option asks sort to output only unique records, where unique means that their sort-key fields match, even if there are differences elsewhere. Reusing the password file one last time, we find:

$ sort -t: -k4n -u /etc/passwd             Sort by unique GID
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
chico:x:12501:1000:Chico Marx:/home/chico:/bin/bash
groucho:x:12503:2000:Groucho Marx:/home/groucho:/bin/sh
gummo:x:12504:3000:Gummo Marx:/home/gummo:/usr/local/bin/ksh93

Notice that the output is shorter: three users are in group 1000, but only one of them was output...

Sorting Text Blocks

Sometimes you need to sort data composed of multiline records. A good example is an address list, which is conveniently stored with one or more blank lines between addresses. For data like this, there is no constant sort-key position that could be used in a -k option, so you have to help out by supplying some extra markup. Here's a simple example:

$ cat my-friends                           Show address file

# SORTKEY: Schloß, Hans Jürgen
Hans Jürgen Schloß
Unter den Linden 78
D-10117 Berlin
Germany

Henley-on-Thames RG9 4AJ
UK

# SORTKEY: Brown, Kim
Kim Brown
1841 S Main Street
Westchester, NY 10502
USA


The sorting trick is to use the ability of awk to handle more-general record separators to recognize paragraph breaks, temporarily replace the line breaks inside each address with an otherwise unused character, such as an unprintable control character, and replace the paragraph break with a newline. sort then sees lines that look like this:

# SORTKEY: Schloß, Hans Jürgen^ZHans Jürgen Schloß^ZUnter den Linden 78^Z...

# SORTKEY: Brown, Kim^ZKim Brown^Z1841 S Main Street^Z...


Here, ^Z is a Ctrl-Z character. A filter step downstream from sort restores the line breaks and paragraph breaks, and the sort key lines are easily removed, if desired, with grep . The entire pipeline looks like this:

cat my-friends |                                         # Pipe in address file
  awk -v RS="" '{ gsub("\n", "^Z"); print }' |           # Convert addresses to single lines
  sort -f |                                              # Sort address bundles, ignoring case
  awk -v ORS="\n\n" '{ gsub("^Z", "\n"); print }' |      # Restore line structure
  grep -v '# SORTKEY'                                    # Remove markup lines


The gsub( ) function performs "global substitutions." It is similar to the s/x/y/g construct in sed . The RS variable is the input Record Separator. Normally, input records are separated by newlines, making each line a separate record. Using RS="" (the empty string) is a special case, whereby records are separated by blank lines; i.e., each block or "paragraph" of text forms a separate record. This is exactly the form of our input data. Finally, ORS is the Output Record Separator; each output record printed with print is terminated with its value. Its default is also normally a single newline; setting it here to "\n\n" preserves the input format with blank lines separating records. (More detail on these constructs may be found in Chapter 9 .)
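Because a literal Ctrl-Z is awkward to type and does not survive copying, one way to try the pipeline is to spell the character as the octal escape \032 inside the awk strings. The sketch below does that; the two address records are made up for illustration and are not the book's my-friends file.

```shell
# Made-up address records separated by a blank line.
addresses='# SORTKEY: Smith, Ann
Ann Smith
1 High Street

# SORTKEY: Brown, Kim
Kim Brown
1841 S Main Street'

# Join each paragraph into one line with Ctrl-Z (octal 032), sort the lines,
# restore the line breaks and blank-line separators, then drop the markup.
sorted=$(printf '%s\n' "$addresses" |
    awk -v RS="" '{ gsub("\n", "\032"); print }' |
    LC_ALL=C sort -f |
    awk -v ORS="\n\n" '{ gsub("\032", "\n"); print }' |
    grep -v '# SORTKEY')

printf '%s\n' "$sorted"
```

The Brown record should come out ahead of the Smith record, since the sort key line leads each joined record.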

The beauty of this approach is that we can easily include additional keys in each address that can be used for both sorting and selection: for example, an extra markup line of the form:

# COUNTRY: UK

in each address, and an additional pipeline stage of grep '# COUNTRY: UK ' just before the sort , would let us extract only the UK addresses for further processing.
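As a rough illustration of that extra selection stage, the sketch below adds a COUNTRY markup line to three made-up records and filters on it while each address is still a single line. The records are invented, and the \032 escape stands in for the Ctrl-Z character used in the text.

```shell
# Made-up address records, each carrying an extra "# COUNTRY:" markup line.
book='# SORTKEY: Smith, Ann
# COUNTRY: UK
Ann Smith
London

# SORTKEY: Brown, Kim
# COUNTRY: USA
Kim Brown
Westchester, NY

# SORTKEY: Adams, Joe
# COUNTRY: UK
Joe Adams
Leeds'

# Select UK records while each address is one line, then sort and restore.
# The final grep drops every markup line, not just the sort keys.
uk_only=$(printf '%s\n' "$book" |
    awk -v RS="" '{ gsub("\n", "\032"); print }' |
    grep '# COUNTRY: UK' |
    LC_ALL=C sort -f |
    awk -v ORS="\n\n" '{ gsub("\032", "\n"); print }' |
    grep -v '^# ')

printf '%s\n' "$uk_only"
```

Selecting before the sort keeps the sort input small, and the selection works on whole addresses because each one is still a single line at that point.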

You could, of course, go overboard and use XML markup to identify the parts of the address in excruciating detail:

<address>
<personalname>Hans Jürgen</personalname>
<familyname>Schloß</familyname>
<streetname>Unter den Linden</streetname>
<streetnumber>78</streetnumber>
<postalcode>D-10117</postalcode>
<city>Berlin</city>
<country>Germany</country>
</address>

With fancier data-processing filters, you could then please your post office by presorting your mail by country and postal code, but our minimal markup and simple pipeline are often good enough to get the job done.

4.1.4. Sort Efficiency

The obvious way to sort data requires comparing all pairs of items to see which comes first, and leads to algorithms known as bubble sort and insertion sort . These quick-and-dirty algorithms are fine for small amounts of data, but they certainly are not quick for large amounts, because their work to sort n records grows like n^2 . This is quite different from almost all of the filters that we discuss in this book: they read a record, process it, and output it, so their execution time is directly proportional to the number of records, n .

Fortunately, the sorting problem has had lots of attention in the computing community, and good sorting algorithms are known whose average complexity goes like n^(3/2) ( shellsort ), n log n ( heapsort , mergesort , and quicksort ), and for restricted kinds of data, n ( distribution sort ). The Unix sort command implementation has received extensive study and optimization: you can be confident that it will do the job efficiently, and almost certainly better than you can do yourself without learning a lot more about sorting algorithms.

4.1.5. Sort Stability

An important question about sorting algorithms is whether or not they are stable : that is, is the input order of equal records preserved in the output? A stable sort may be desirable when records are sorted by multiple keys, or more than once in a pipeline. POSIX does not require that sort be stable, and most implementations are not, as this example shows:

$ sort -t_ -k1,1 -k2,2 << EOF              Sort four lines by first two fields
> one_two
> one_two_three
> one_two_four
> one_two_five
> EOF
one_two
one_two_five
one_two_four
one_two_three

The sort fields are identical in each record, but the output differs from the input, so sort is not stable. Fortunately, the GNU implementation in the coreutils package [1] remedies that deficiency via the --stable option: its output for this example correctly matches the input.

[1] Available at ftp://ftp.gnu.org/gnu/coreutils/ .

#### [Sep 18, 2018] Getting started with Tmux

###### Sep 15, 2018 | linuxize.com

... ... ...

When Tmux is started it reads its configuration parameters from ~/.tmux.conf if the file is present.

Here is a sample ~/.tmux.conf configuration with a customized status line and a few additional options:

~/.tmux.conf

# Improve colors
set -g default-terminal 'screen-256color'

# Set scrollback buffer to 10000
set -g history-limit 10000

# Customize the status line
set -g status-fg green
set -g status-bg black

Basic Tmux Usage

Below are the most basic steps for getting started with Tmux:

1. On the command prompt, type tmux new -s my_session .
2. Run the desired program.
3. Use the key sequence Ctrl-b + d to detach from the session.
4. Reattach to the Tmux session by typing tmux attach-session -t my_session .

Conclusion

In this tutorial, you learned how to use Tmux. Now you can start creating multiple Tmux windows in a single session, split windows by creating new panes, navigate between windows, detach and resume sessions and personalize your Tmux instance using the .tmux.conf file.

There's lots more to learn about Tmux at the Tmux User's Manual page.

#### [Jul 05, 2018] Can rsync resume after being interrupted

##### Notable quotes:

##### "... as if it were successfully transferred ..."

###### Jul 05, 2018 | unix.stackexchange.com

Tim ,Sep 15, 2012 at 23:36

I used rsync to copy a large number of files, but my OS (Ubuntu) restarted unexpectedly.
After reboot, I ran rsync again, but from the output on the terminal, I found that rsync still copied those already copied before. But I heard that rsync is able to find differences between source and destination, and therefore to just copy the differences. So I wonder in my case if rsync can resume what was left last time?

Gilles ,Sep 16, 2012 at 1:56

Yes, rsync won't copy again files that it's already copied. There are a few edge cases where its detection can fail. Did it copy all the already-copied files? What options did you use? What were the source and target filesystems? If you run rsync again after it's copied everything, does it copy again? – Gilles Sep 16 '12 at 1:56

Tim ,Sep 16, 2012 at 2:30

@Gilles: Thanks! (1) I think I saw rsync copied the same files again from its output on the terminal. (2) Options are same as in my other post, i.e. sudo rsync -azvv /home/path/folder1/ /home/path/folder2 . (3) Source and target are both NTFS, but source is an external HDD, and target is an internal HDD. (4) It is now running and hasn't finished yet. – Tim Sep 16 '12 at 2:30

jwbensley ,Sep 16, 2012 at 16:15

There is also the --partial flag to resume partially transferred files (useful for large files) – jwbensley Sep 16 '12 at 16:15

Tim ,Sep 19, 2012 at 5:20

@Gilles: What are some "edge cases where its detection can fail"? – Tim Sep 19 '12 at 5:20

Gilles ,Sep 19, 2012 at 9:25

@Tim Off the top of my head, there's at least clock skew, and differences in time resolution (a common issue with FAT filesystems which store times in 2-second increments, the --modify-window option helps with that). – Gilles Sep 19 '12 at 9:25

DanielSmedegaardBuus ,Nov 1, 2014 at 12:32

First of all, regarding the "resume" part of your question, --partial just tells the receiving end to keep partially transferred files if the sending end disappears as though they were completely transferred.

While transferring files, they are temporarily saved as hidden files in their target folders (e.g. .TheFileYouAreSending.lRWzDC ), or a specifically chosen folder if you set the --partial-dir switch. When a transfer fails and --partial is not set, this hidden file will remain in the target folder under this cryptic name, but if --partial is set, the file will be renamed to the actual target file name (in this case, TheFileYouAreSending ), even though the file isn't complete. The point is that you can later complete the transfer by running rsync again with either --append or --append-verify .

So, --partial doesn't itself resume a failed or cancelled transfer. To resume it, you'll have to use one of the aforementioned flags on the next run. So, if you need to make sure that the target won't ever contain files that appear to be fine but are actually incomplete, you shouldn't use --partial . Conversely, if you want to make sure you never leave behind stray failed files that are hidden in the target directory, and you know you'll be able to complete the transfer later, --partial is there to help you.

With regards to the --append switch mentioned above, this is the actual "resume" switch, and you can use it whether or not you're also using --partial . Actually, when you're using --append , no temporary files are ever created. Files are written directly to their targets. In this respect, --append gives the same result as --partial on a failed transfer, but without creating those hidden temporary files.

So, to sum up, if you're moving large files and you want the option to resume a cancelled or failed rsync operation from the exact point that rsync stopped, you need to use the --append or --append-verify switch on the next attempt.

As @Alex points out below, since version 3.0.0 rsync now has a new option, --append-verify , which behaves like --append did before that switch existed. You probably always want the behaviour of --append-verify , so check your version with rsync --version . If you're on a Mac and not using rsync from homebrew , you'll (at least up to and including El Capitan) have an older version and need to use --append rather than --append-verify . Why they didn't keep the behaviour on --append and instead named the newcomer --append-no-verify is a bit puzzling. Either way, --append on rsync before version 3 is the same as --append-verify on the newer versions.

--append-verify isn't dangerous: It will always read and compare the data on both ends and not just assume they're equal. It does this using checksums, so it's easy on the network, but it does require reading the shared amount of data on both ends of the wire before it can actually resume the transfer by appending to the target.

Second of all, you said that you "heard that rsync is able to find differences between source and destination, and therefore to just copy the differences."

That's correct, and it's called delta transfer, but it's a different thing. To enable this, you add the -c , or --checksum switch. Once this switch is used, rsync will examine files that exist on both ends of the wire. It does this in chunks, compares the checksums on both ends, and if they differ, it transfers just the differing parts of the file. But, as @Jonathan points out below, the comparison is only done when files are of the same size on both ends -- different sizes will cause rsync to upload the entire file, overwriting the target with the same name.

This requires a bit of computation on both ends initially, but can be extremely efficient at reducing network load if for example you're frequently backing up very large fixed-size files that often contain minor changes. Examples that come to mind are virtual hard drive image files used in virtual machines or iSCSI targets.

It is notable that if you use --checksum to transfer a batch of files that are completely new to the target system, rsync will still calculate their checksums on the source system before transferring them. Why I do not know :)

So, in short:

If you're often using rsync to just "move stuff from A to B" and want the option to cancel that operation and later resume it, don't use --checksum , but do use --append-verify .

If you're using rsync to back up stuff often, using --append-verify probably won't do much for you, unless you're in the habit of sending large files that continuously grow in size but are rarely modified once written. As a bonus tip, if you're backing up to storage that supports snapshotting such as btrfs or zfs , adding the --inplace switch will help you reduce snapshot sizes since changed files aren't recreated but rather the changed blocks are written directly over the old ones. This switch is also useful if you want to avoid rsync creating copies of files on the target when only minor changes have occurred.

When using --append-verify , rsync will behave just like it always does on all files that are the same size. If they differ in modification or other timestamps, it will overwrite the target with the source without scrutinizing those files further. --checksum will compare the contents (checksums) of every file pair of identical name and size.

UPDATED 2015-09-01 Changed to reflect points made by @Alex (thanks!)

UPDATED 2017-07-14 Changed to reflect points made by @Jonathan (thanks!)

Alex ,Aug 28, 2015 at 3:49

According to the documentation --append does not check the data, but --append-verify does. Also, as @gaoithe points out in a comment below, the documentation claims --partial does resume from previous files. – Alex Aug 28 '15 at 3:49

DanielSmedegaardBuus ,Sep 1, 2015 at 13:29

Thank you @Alex for the updates. Indeed, since 3.0.0, --append no longer compares the source to the target file before appending. Quite important, really! --partial does not itself resume a failed file transfer, but rather leaves it there for a subsequent --append(-verify) to append to it. My answer was clearly misrepresenting this fact; I'll update it to include these points! Thanks a lot :) – DanielSmedegaardBuus Sep 1 '15 at 13:29

Cees Timmerman ,Sep 15, 2015 at 17:21

This says --partial is enough. – Cees Timmerman Sep 15 '15 at 17:21

DanielSmedegaardBuus ,May 10, 2016 at 19:31

@CMCDragonkai Actually, check out Alexander's answer below about --partial-dir -- looks like it's the perfect bullet for this. I may have missed something entirely ;) – DanielSmedegaardBuus May 10 '16 at 19:31

Jonathan Y. ,Jun 14, 2017 at 5:48

What's your level of confidence in the described behavior of --checksum ? According to the man it has more to do with deciding which files to flag for transfer than with delta-transfer (which, presumably, is rsync 's default behavior). – Jonathan Y. Jun 14 '17 at 5:48

Alexander O'Mara ,Jan 3, 2016 at 6:34

TL;DR: Just specify a partial directory as the rsync man pages recommends:

--partial-dir=.rsync-partial

Longer explanation:

There is actually a built-in feature for doing this using the --partial-dir option, which has several advantages over the --partial and --append-verify / --append alternative.

Excerpt from the rsync man pages:

--partial-dir=DIR

A better way to keep partial files than the --partial option is to specify a DIR that will be used to hold the partial data (instead of writing it out to the destination file). On the next transfer, rsync will use a file found in this dir as data to speed up the resumption of the transfer and then delete it after it has served its purpose.

Note that if --whole-file is specified (or implied), any partial-dir file that is found for a file that is being updated will simply be removed (since rsync is sending files without using rsync's delta-transfer algorithm).

Rsync will create the DIR if it is missing (just the last dir -- not the whole path). This makes it easy to use a relative path (such as "--partial-dir=.rsync-partial") to have rsync create the partial-directory in the destination file's directory when needed, and then remove it again when the partial file is deleted.

If the partial-dir value is not an absolute path, rsync will add an exclude rule at the end of all your existing excludes. This will prevent the sending of any partial-dir files that may exist on the sending side, and will also prevent the untimely deletion of partial-dir items on the receiving side. An example: the above --partial-dir option would add the equivalent of "-f '-p .rsync-partial/'" at the end of any other filter rules.

By default, rsync uses a random temporary file name which gets deleted when a transfer fails. As mentioned, using --partial you can make rsync keep the incomplete file as if it were successfully transferred , so that it is possible to later append to it using the --append-verify / --append options. However there are several reasons this is sub-optimal.

1. Your backup files may not be complete, and without checking the remote file which must still be unaltered, there's no way to know.
2. If you are attempting to use --backup and --backup-dir , you've just added a new version of this file that never even existed before to your version history.

However if we use --partial-dir , rsync will preserve the temporary partial file, and resume downloading using that partial file next time you run it, and we do not suffer from the above issues.

trs ,Apr 7, 2017 at 0:00

This is really the answer. Hey everyone, LOOK HERE!! – trs Apr 7 '17 at 0:00

JKOlaf ,Jun 28, 2017 at 0:11

I agree this is a much more concise answer to the question. the TL;DR: is perfect and for those that need more can read the longer bit. Strong work. – JKOlaf Jun 28 '17 at 0:11

N2O ,Jul 29, 2014 at 18:24

You may want to add the -P option to your command.
From the man page: --partial By default, rsync will delete any partially transferred file if the transfer is interrupted. In some circumstances it is more desirable to keep partially transferred files. Using the --partial option tells rsync to keep the partial file which should make a subsequent transfer of the rest of the file much faster. -P The -P option is equivalent to --partial --progress. Its pur- pose is to make it much easier to specify these two options for a long transfer that may be interrupted.  So instead of: sudo rsync -azvv /home/path/folder1/ /home/path/folder2  Do: sudo rsync -azvvP /home/path/folder1/ /home/path/folder2  Of course, if you don't want the progress updates, you can just use --partial , i.e.: sudo rsync --partial -azvv /home/path/folder1/ /home/path/folder2  gaoithe ,Aug 19, 2015 at 11:29 @Flimm not quite correct. If there is an interruption (network or receiving side) then when using --partial the partial file is kept AND it is used when rsync is resumed. From the manpage: "Using the --partial option tells rsync to keep the partial file which should <b>make a subsequent transfer of the rest of the file much faster</b>." – gaoithe Aug 19 '15 at 11:29 DanielSmedegaardBuus ,Sep 1, 2015 at 14:11 @Flimm and @gaoithe, my answer wasn't quite accurate, and definitely not up-to-date. I've updated it to reflect version 3 + of rsync . It's important to stress, though, that --partial does not itself resume a failed transfer. See my answer for details :) – DanielSmedegaardBuus Sep 1 '15 at 14:11 guettli ,Nov 18, 2015 at 12:28 @DanielSmedegaardBuus I tried it and the -P is enough in my case. Versions: client has 3.1.0 and server has 3.1.1. I interrupted the transfer of a single large file with ctrl-c. I guess I am missing something. – guettli Nov 18 '15 at 12:28 Yadunandana ,Sep 16, 2012 at 16:07 I think you are forcibly calling the rsync and hence all data is getting downloaded when you recall it again. 
use --progress option to copy only those files which are not copied and --delete option to delete any files if already copied and now it does not exist in source folder... rsync -avz --progress --delete -e /home/path/folder1/ /home/path/folder2  If you are using ssh to login to other system and copy the files, rsync -avz --progress --delete -e "ssh -o UserKnownHostsFile=/dev/null -o \ StrictHostKeyChecking=no" /home/path/folder1/ /home/path/folder2  let me know if there is any mistake in my understanding of this concept... Fabien ,Jun 14, 2013 at 12:12 Can you please edit your answer and explain what your special ssh call does, and why you advice to do it? – Fabien Jun 14 '13 at 12:12 DanielSmedegaardBuus ,Dec 7, 2014 at 0:12 @Fabien He tells rsync to set two ssh options (rsync uses ssh to connect). The second one tells ssh to not prompt for confirmation if the host he's connecting to isn't already known (by existing in the "known hosts" file). The first one tells ssh to not use the default known hosts file (which would be ~/.ssh/known_hosts). He uses /dev/null instead, which is of course always empty, and as ssh would then not find the host in there, it would normally prompt for confirmation, hence option two. Upon connecting, ssh writes the now known host to /dev/null, effectively forgetting it instantly :) – DanielSmedegaardBuus Dec 7 '14 at 0:12 DanielSmedegaardBuus ,Dec 7, 2014 at 0:23 ...but you were probably wondering what effect, if any, it has on the rsync operation itself. The answer is none. It only serves to not have the host you're connecting to added to your SSH known hosts file. Perhaps he's a sysadmin often connecting to a great number of new servers, temporary systems or whatnot. I don't know :) – DanielSmedegaardBuus Dec 7 '14 at 0:23 moi ,May 10, 2016 at 13:49 "use --progress option to copy only those files which are not copied" What? 
– moi May 10 '16 at 13:49

Paul d'Aoust ,Nov 17, 2016 at 22:39

There are a couple of errors here; one is very serious: --delete will delete files in the destination that don't exist in the source. The less serious one is that --progress doesn't modify how things are copied; it just gives you a progress report on each file as it copies. (I fixed the serious error; replaced it with --remove-source-files .) – Paul d'Aoust Nov 17 '16 at 22:39

#### [Jun 24, 2018] Three Ways to Script Processes in Parallel by Rudis Muiznieks

###### Sep 02, 2015 | www.codeword.xyz

I was recently troubleshooting some issues we were having with Shippable , trying to get a bunch of our unit tests to run in parallel so that our builds would complete faster. I didn't care what order the different processes completed in, but I didn't want the shell script to exit until all the spawned unit test processes had exited. I ultimately wasn't able to satisfactorily solve the issue we were having, but I did learn more than I ever wanted to know about how to run processes in parallel in shell scripts. So here I shall impart unto you the knowledge I have gained. I hope someone else finds it useful!

Wait

The simplest way to achieve what I wanted was to use the wait command. You simply fork all of your processes with & , and then follow them with a wait command. Behold:

#!/bin/sh

/usr/bin/my-process-1 --args1 &
/usr/bin/my-process-2 --args2 &
/usr/bin/my-process-3 --args3 &
wait
echo all processes complete

It's really as easy as that. When you run the script, all three processes will be forked in parallel, and the script will wait until all three have completed before exiting. Anything after the wait command will execute only after the three forked processes have exited.

Pros

Damn, son! It doesn't get any simpler than that!

Cons

With a bare wait there is no way to determine the exit codes of the processes you forked (you would have to save each PID from $! and wait on it individually). That was a deal-breaker for my use case, since I needed to know if any of the tests failed and return an error code from the parent shell script if they did.

Another downside is that output from the processes will be all mish-mashed together, which makes it difficult to follow. In our situation, it was basically impossible to determine which unit tests had failed because they were all spewing their output at the same time.

GNU Parallel

There is a super nifty program called GNU Parallel that does exactly what I wanted. It works kind of like xargs in that you can give it a collection of arguments to pass to a single command which will all be run, only this will run them in parallel instead of in serial like xargs does (OR DOES IT??</foreshadowing>). It is super powerful, and all the different ways you can use it are beyond the scope of this article, but here's a rough equivalent to the example script above:

#!/bin/sh

parallel /usr/bin/my-process-{} --args{} ::: 1 2 3
echo all processes complete

The official "10 seconds installation" method for the latest version of GNU Parallel (from the README) is as follows:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

Pros

If any of the processes returns a non-zero exit code, parallel will return a non-zero exit code. This means you can use $? in your shell script to detect if any of the processes failed. Nice! GNU Parallel also (by default) collates the output of each process together, so you'll see the complete output of each process as it completes instead of a mash-up of all the output combined together as it's produced. Also nice!

I am such a damn fanboy I might even buy an official GNU Parallel mug and t-shirt . Actually I'll probably save the money and get the new Star Wars Battlefront game when it comes out instead. But I did seriously consider the parallel schwag for a microsecond or so.

Cons

Literally none.

Xargs

So it turns out that our old friend xargs has supported parallel processing all along! Who knew? It's like the nerdy chick in the movies who gets a makeover near the end and it turns out she's even hotter than the stereotypical hot cheerleader chicks who were picking on her the whole time. Just pass it a -Pn argument and it will run your commands using up to n processes at a time. Check out this mega-sexy equivalent to the above scripts:

#!/bin/sh

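# -I{} already runs one command per input line, so -n1 is dropped
# (GNU xargs treats -n and -I as mutually exclusive).
printf '1\n2\n3\n' | xargs -P3 -I{} /usr/bin/my-process-{} --args{}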
echo all processes complete
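
To see what xargs reports when one of the parallel commands fails, here is a minimal sketch. It assumes an xargs that supports -P (GNU or BSD); the sh -c job is a hypothetical stand-in for /usr/bin/my-process-N and is written to fail only for the input 2.

```shell
#!/bin/sh
# Hypothetical stand-in job: exits 1 for input "2", 0 otherwise.
# The input value is passed as $1 instead of being pasted into the script text.
printf '1\n2\n3\n' | xargs -P3 -I{} sh -c '[ "$1" != 2 ]' sh {}
status=$?    # exit status of xargs, the last command in the pipeline

# GNU xargs documents status 123 when any command exits with 1-125;
# other implementations may report a different non-zero value.
if [ "$status" -ne 0 ]; then
    echo "at least one job failed (xargs exit status: $status)"
fi
```

Testing for "non-zero" rather than for one specific number keeps the check portable across xargs implementations.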


Pros

xargs returns a non-zero exit code if any of the processes fails, so you can again use $? in your shell script to detect errors. The difference is that GNU xargs returns 123 whenever any command exits with a status between 1 and 125, whereas GNU Parallel by default returns the number of jobs that failed. Another pro is that xargs is most likely already installed on your preferred distribution of Linux.

Cons

I have read reports that the non-GNU version of xargs does not support parallel processing, so you may or may not be out of luck with this option if you're on AIX or a BSD or something. xargs also has the same problem as the wait solution where the output from your processes will be all mixed together.

Another con is that xargs is a little less flexible than parallel in how you specify the processes to run. You have to pipe your values into it, and if you use the -I argument for string-replacement then your values have to be separated by newlines (which is more annoying when running it ad-hoc). It's still pretty nice, but nowhere near as flexible or powerful as parallel . Also there's no place to buy an xargs mug and t-shirt. Lame!

And The Winner Is

After determining that the Shippable problem we were having was completely unrelated to the parallel scripting method I was using, I ended up sticking with parallel for my unit tests. Even though it meant one more dependency on our build machine, the ease

#### [Jun 23, 2018] Queuing tasks for batch execution with Task Spooler by Ben Martin

###### Aug 12, 2008 | www.linux.com

The Task Spooler project allows you to queue up tasks from the shell for batch execution. Task Spooler is simple to use and requires no configuration. You can view and edit queued commands, and you can view the output of queued commands at any time. Task Spooler has some similarities with other delayed and batch execution projects, such as " at ."
While both Task Spooler and at handle multiple queues and allow the execution of commands at a later point, the at project handles output from commands by emailing the results to the user who queued the command, while Task Spooler allows you to get at the results from the command line instead. Another major difference is that Task Spooler is not aimed at executing commands at a specific time, but rather at simply adding to and executing commands from queues.

The main repositories for Fedora, openSUSE, and Ubuntu do not contain packages for Task Spooler. There are packages for some versions of Debian, Ubuntu, and openSUSE 10.x available along with the source code on the project's homepage. In this article I'll use a 64-bit Fedora 9 machine and install version 0.6 of Task Spooler from source.

Task Spooler does not use autotools to build, so to install it, simply run make; sudo make install . This will install the main Task Spooler command ts and its manual page into /usr/local.

A simple interaction with Task Spooler is shown below. First I add a new job to the queue and check the status. As the command is a very simple one, it is likely to have been executed immediately. Executing ts by itself with no arguments shows the executing queue, including tasks that have completed. I then use ts -c to get at the stdout of the executed command. The -c option uses cat to display the output file for a task. Using ts -i shows you information about the job. To clear finished jobs from the queue, use the ts -C command, not shown in the example.

$ ts echo "hello world"
6

$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/1]
6 finished /tmp/ts-out.QoKfo9 0 0.00/0.00/0.00 echo hello world
$ ts -c 6
hello world

$ ts -i 6
Command: echo hello world
Enqueue time: Tue Jul 22 14:42:22 2008
Start time: Tue Jul 22 14:42:22 2008
End time: Tue Jul 22 14:42:22 2008
Time run: 0.003336s

The -t option operates like tail -f , showing you the last few lines of output and continuing to show you any new output from the task. If you would like to be notified when a task has completed, you can use the -m option to have the results mailed to you, or you can queue another command to be executed that just performs the notification. For example, I might add a tar command and want to know when it has completed. The commands below will create a tarball and use libnotify commands to create an unobtrusive popup window on my desktop when the tarball creation is complete. The popup will be dismissed automatically after a timeout.

$ ts tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
11
$ ts notify-send "tarball creation" "the long running tar creation process is complete."
12
$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/1]
11 finished /tmp/ts-out.O6epsS 0 4.64/4.31/0.29 tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
12 finished /tmp/ts-out.4KbPSE 0 0.05/0.00/0.02 notify-send tarball creation the long... is complete.

Notice in the output above, toward the far right of the header information, the run=0/1  line. This tells you that Task Spooler is executing nothing, and can possibly execute one task. Task spooler allows you to execute multiple tasks at once from your task queue to take advantage of multicore CPUs. The -S  option allows you to set how many tasks can be executed in parallel from the queue, as shown below.

$ ts -S 2
$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/2]
6 finished /tmp/ts-out.QoKfo9 0 0.00/0.00/0.00 echo hello world

If you have two tasks that you want to execute with Task Spooler but one depends on the other having already been executed (and perhaps that the previous job has succeeded too) you can handle this by having one task wait for the other to complete before executing. This becomes more important on a quad core machine when you might have told Task Spooler that it can execute three tasks in parallel. The commands shown below create an explicit dependency, making sure that the second command is executed only if the first has completed successfully, even when the queue allows multiple tasks to be executed. The first command is queued normally using ts . I use a subshell to execute the commands by having ts  explicitly start a new bash shell. The second command uses the -d  option, which tells ts  to execute the command only after the successful completion of the last command that was appended to the queue. When I first inspect the queue I can see that the first command (28) is executing. The second command is queued but has not been added to the list of executing tasks because Task Spooler is aware that it cannot execute until task 28 is complete. The second time I view the queue, both tasks have completed.

$ ts bash -c "sleep 10; echo hi"
28
$ ts -d echo there
29
$ ts
ID State Output E-Level Times(r/u/s) Command [run=1/2]
28 running /tmp/ts-out.hKqDva bash -c sleep 10; echo hi
29 queued (file) && echo there
$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/2]
28 finished /tmp/ts-out.hKqDva 0 10.01/0.00/0.01 bash -c sleep 10; echo hi
29 finished /tmp/ts-out.VDtVp7 0 0.00/0.00/0.00 && echo there
$ cat /tmp/ts-out.hKqDva
hi
$ cat /tmp/ts-out.VDtVp7
there

You can also explicitly set dependencies on other tasks as shown below. Because the ts  command prints the ID of a new task to the console, the first command puts that ID into a shell variable for use in the second command. The second command passes the task ID of the first task to ts, telling it to wait for the task with that ID to complete before returning. Because this is joined with the command we wish to execute with the &&  operation, the second command will execute only if the first one has finished and succeeded.

The first time we view the queue you can see that both tasks are running. The first task will be in the sleep  command that we used explicitly to slow down its execution. The second command will be executing ts , which will be waiting for the first task to complete. One downside of tracking dependencies this way is that the second command is added to the running queue even though it cannot do anything until the first task is complete.

$ FIRST_TASKID=$(ts bash -c "sleep 10; echo hi")
$ ts sh -c "ts -w $FIRST_TASKID && echo there"
25
$ ts
ID State Output E-Level Times(r/u/s) Command [run=2/2]
24 running /tmp/ts-out.La9Gmz bash -c sleep 10; echo hi
25 running /tmp/ts-out.Zr2n5u sh -c ts -w 24 && echo there
$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/2]
24 finished /tmp/ts-out.La9Gmz 0 10.01/0.00/0.00 bash -c sleep 10; echo hi
25 finished /tmp/ts-out.Zr2n5u 0 9.47/0.00/0.01 sh -c ts -w 24 && echo there
$ ts -c 24
hi
$ ts -c 25
there

Wrap-up

Task Spooler allows you to convert a shell command to a queued command by simply prepending ts to the command line. One major advantage of using ts over something like the at command is that you can effectively run tail -f on the output of a running task and also get at the output of completed tasks from the command line. The utility's ability to execute multiple tasks in parallel is very handy if you are running on a multicore CPU. Because you can explicitly wait for a task, you can set up very complex interactions where you might have several tasks running at once and have jobs that depend on multiple other tasks to complete successfully before they can execute.

Because you can make explicitly dependent tasks take up slots in the actively running task queue, you can effectively delay the execution of the queue until a time of your choosing. For example, if you queue up a task that waits for a specific time before returning successfully and have a small group of other tasks that are dependent on this first task to complete, then no tasks in the queue will execute until the first task completes.

#### [Jun 23, 2018] at, batch, atq, and atrm examples

###### Jun 23, 2018 | www.computerhope.com

at -m 01:35 < my-at-jobs.txt

Run the commands listed in the ' my-at-jobs.txt ' file at 1:35 AM. All output from the job will be mailed to the user running the task. When this command has been successfully entered you should receive a prompt similar to the example below:

commands will be executed using /bin/sh
job 1 at Wed Dec 24 00:22:00 2014

at -l

This command will list each of the scheduled jobs in a format like the following:

1 Wed Dec 24 00:22:00 2003

...this is the same as running the command atq .

at -r 1

Deletes job 1 . This command is the same as running the command atrm 1 .

atrm 23

Deletes job 23. This command is the same as running the command at -r 23 .
#### [Jun 23, 2018] Bash script processing limited number of commands in parallel

###### Jun 23, 2018 | stackoverflow.com

AL-Kateb ,Oct 23, 2013 at 13:33

I have a bash script that looks like this:
#!/bin/bash
wget LINK1 >/dev/null 2>&1
wget LINK2 >/dev/null 2>&1
wget LINK3 >/dev/null 2>&1
wget LINK4 >/dev/null 2>&1
# ..
# ..
wget LINK4000 >/dev/null 2>&1

But processing each line until the command is finished and then moving to the next one is very time consuming. I want to process, for instance, 20 lines at once, and when they're finished process another 20 lines.

I thought of wget LINK1 >/dev/null 2>&1 & to send the command to the background and carry on, but there are 4000 lines here, which means I will have performance issues, not to mention being limited in how many processes I should start at the same time, so this is not a good idea.

One solution that I'm thinking of right now is checking whether one of the commands is still running or not; for instance, after 20 lines I can add this loop:
while [ $(ps -ef | grep KEYWORD | grep -v grep | wc -l) -gt 0 ]; do
sleep 1
done


Of course in this case I will need to append & to the end of the line! But I'm feeling this is not the right way to do it.

So how do I actually group each 20 lines together and wait for them to finish before going to the next 20 lines? This script is dynamically generated, so I can do whatever math I want on it while it's being generated. But it DOES NOT have to use wget; that was just an example, so any solution that is wget specific is not gonna do me any good.

kojiro ,Oct 23, 2013 at 13:46

wait is the right answer here, but your while [ $(ps would be much better written while pkill -0 $KEYWORD – using proctools, that is, for legitimate reasons to check if a process with a specific name is still running. – kojiro Oct 23 '13 at 13:46

VasyaNovikov ,Jan 11 at 19:01

I think this question should be re-opened. The "possible duplicate" QA is all about running a finite number of programs in parallel. Like 2-3 commands. This question, however, is focused on running commands in e.g. a loop. (see "but there are 4000 lines"). – VasyaNovikov Jan 11 at 19:01

robinCTS ,Jan 11 at 23:08

@VasyaNovikov Have you read all the answers to both this question and the duplicate? Every single answer to this question here, can also be found in the answers to the duplicate question. That is precisely the definition of a duplicate question. It makes absolutely no difference whether or not you are running the commands in a loop. – robinCTS Jan 11 at 23:08

VasyaNovikov ,Jan 12 at 4:09

@robinCTS there are intersections, but questions themselves are different. Also, 6 of the most popular answers on the linked QA deal with 2 processes only. – VasyaNovikov Jan 12 at 4:09

Dan Nissenbaum ,Apr 20 at 15:35

I recommend reopening this question because its answer is clearer, cleaner, better, and much more highly upvoted than the answer at the linked question, though it is three years more recent. – Dan Nissenbaum Apr 20 at 15:35

devnull ,Oct 23, 2013 at 13:35

Use the wait built-in:
process1 &
process2 &
process3 &
process4 &
wait
process5 &
process6 &
process7 &
process8 &
wait


For the above example, 4 processes process1 .. process4 would be started in the background, and the shell would wait until those are completed before starting the next set ..

From the manual :

wait [jobspec or pid ...]


Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.
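
Because wait with a PID returns that child's exit status, a script can still find out which background jobs failed. A minimal sketch, assuming a hypothetical job function that stands in for the real commands and simply exits with the status it is given:

```shell
#!/bin/sh
# Hypothetical stand-in for a real job: exits with the status passed as $1.
job() { return "$1"; }

pids=""
job 0 & pids="$pids $!"    # $! is the PID of the most recent background job
job 3 & pids="$pids $!"
job 0 & pids="$pids $!"

failed=0
for pid in $pids; do
    # wait PID returns that child's exit status, even if it has already exited
    wait "$pid" || failed=$((failed + 1))
done
echo "failed jobs: $failed"
```

Waiting on each saved PID in turn, instead of calling a bare wait, is what makes the individual exit statuses available.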

kojiro ,Oct 23, 2013 at 13:48

So basically i=0; waitevery=4; for link in "${links[@]}"; do wget "$link" & (( ++i%waitevery==0 )) && wait; done >/dev/null 2>&1 – kojiro Oct 23 '13 at 13:48

rsaw ,Jul 18, 2014 at 17:26

Unless you're sure that each process will finish at the exact same time, this is a bad idea. You need to start up new jobs to keep the current total jobs at a certain cap .... parallel is the answer. – rsaw Jul 18 '14 at 17:26

DomainsFeatured ,Sep 13, 2016 at 22:55

Is there a way to do this in a loop? – DomainsFeatured Sep 13 '16 at 22:55

Bobby ,Apr 27, 2017 at 7:55

I've tried this but it seems that variable assignments done in one block are not available in the next block. Is this because they are separate processes? Is there a way to communicate the variables back to the main process? – Bobby Apr 27 '17 at 7:55

choroba ,Oct 23, 2013 at 13:38

See parallel . Its syntax is similar to xargs , but it runs the commands in parallel.

chepner ,Oct 23, 2013 at 14:35

This is better than using wait , since it takes care of starting new jobs as old ones complete, instead of waiting for an entire batch to finish before starting the next. – chepner Oct 23 '13 at 14:35

Mr. Llama ,Aug 13, 2015 at 19:30

For example, if you have the list of links in a file, you can do cat list_of_links.txt | parallel -j 4 wget {} which will keep four wget s running at a time. – Mr. Llama Aug 13 '15 at 19:30

0x004D44 ,Nov 2, 2015 at 21:42

There is a new kid in town called pexec which is a replacement for parallel . – 0x004D44 Nov 2 '15 at 21:42

mat ,Mar 1, 2016 at 21:04

Not to be picky, but xargs can also parallelize commands. – mat Mar 1 '16 at 21:04

Vader B ,Jun 27, 2016 at 6:41

In fact, xargs can run commands in parallel for you. There is a special -P max_procs command-line option for that. See man xargs .
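
A minimal sketch of that approach follows. The URLs and temporary files are placeholders, and echo stands in for the real command so the example is harmless to run.

```shell
#!/bin/sh
# Placeholder input: one (hypothetical) URL per line.
list=$(mktemp)
out=$(mktemp)
printf 'http://example.com/file%s\n' 1 2 3 4 5 6 > "$list"

# -n 1 : pass one URL to each command invocation
# -P 3 : keep at most three invocations running at once
# echo is a stand-in; swap it for the real command (e.g. wget -q).
xargs -n 1 -P 3 echo fetched < "$list" > "$out"

lines=$(wc -l < "$out" | tr -d ' ')
echo "commands run: $lines"
rm -f "$list"
```

Unlike fixed batches, xargs starts a new command as soon as one of the running ones exits, so the number of active processes stays at the cap.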

> ,

You can run 20 processes and use the command:
wait


Your script will wait and continue when all your background jobs are finished.
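
Written as a loop, the batch-then-wait idea might look like the sketch below. The batch size, the item list, and the echo job are all placeholders (the question uses batches of 20 and wget).

```shell
#!/bin/sh
batch=4                  # jobs per batch (assumed value for the demo)
count=0
outdir=$(mktemp -d)

for n in 1 2 3 4 5 6 7 8 9 10; do
    # Placeholder job; in the question this would be one wget call.
    ( echo "item $n" > "$outdir/$n.out" ) &
    count=$((count + 1))
    if [ $((count % batch)) -eq 0 ]; then
        wait             # block until every job in this batch has exited
    fi
done
wait                     # catch the final, possibly partial, batch

done_jobs=$(ls "$outdir" | wc -l | tr -d ' ')
echo "completed: $done_jobs"
```

Each batch lasts as long as its slowest job, which is the drawback raised in the comments above.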

#### [Jun 23, 2018] parallelism - correct xargs parallel usage

###### Jun 23, 2018 | unix.stackexchange.com

Yan Zhu ,Apr 19, 2015 at 6:59

I am using xargs to call a python script to process about 30 million small files. I hope to use xargs to parallelize the process. The command I am using is:
find ./data -name "*.json" -print0 |
xargs -0 -I{} -P 40 python Convert.py {} > log.txt


Basically, Convert.py will read in a small json file (4kb), do some processing and write to another 4kb file. I am running on a server with 40 CPU cores. And no other CPU-intense process is running on this server.

By monitoring htop (btw, is there any other good way to monitor the CPU performance?), I find that -P 40 is not as fast as expected. Sometimes all cores will freeze and decrease almost to zero for 3-4 seconds, then will recover to 60-70%. Then I try to decrease the number of parallel processes to -P 20-30 , but it's still not very fast. The ideal behavior should be linear speed-up. Any suggestions for the parallel usage of xargs ?

Ole Tange ,Apr 19, 2015 at 8:45

You are most likely hit by I/O: The system cannot read the files fast enough. Try starting more than 40: This way it will be fine if some of the processes have to wait for I/O. – Ole Tange Apr 19 '15 at 8:45

Fox ,Apr 19, 2015 at 10:30

What kind of processing does the script do? Any database/network/io involved? How long does it run? – Fox Apr 19 '15 at 10:30

PSkocik ,Apr 19, 2015 at 11:41

I second @OleTange. That is the expected behavior if you run as many processes as you have cores and your tasks are IO bound. First the cores will wait on IO for their task (sleep), then they will process, and then repeat. If you add more processes, then the additional processes that currently aren't running on a physical core will have kicked off parallel IO operations, which will, when finished, eliminate or at least reduce the sleep periods on your cores. – PSkocik Apr 19 '15 at 11:41

Bichoy ,Apr 20, 2015 at 3:32

1- Do you have hyperthreading enabled? 2- in what you have up there, log.txt is actually overwritten with each call to convert.py ... not sure if this is the intended behavior or not. – Bichoy Apr 20 '15 at 3:32

Ole Tange ,May 11, 2015 at 18:38

xargs -P and > is opening up for race conditions because of the half-line problem gnu.org/software/parallel/ Using GNU Parallel instead will not have that problem. – Ole Tange May 11 '15 at 18:38
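
If GNU Parallel is not an option, one possible workaround (an assumption on my part, not something proposed in the thread) is to give every job its own output file and merge the files afterwards. A minimal sketch with placeholder inputs and a stand-in job:

```shell
#!/bin/sh
logdir=$(mktemp -d)

# Each job writes to its own file, named after its input item,
# so output from concurrent jobs cannot interleave.
printf '%s\n' a b c d |
    xargs -P 4 -I{} sh -c 'echo "processed $1" > "$2/$1.log"' sh {} "$logdir"

# Merge the per-job logs once everything has finished.
cat "$logdir"/*.log > "$logdir/all.txt"
merged=$(wc -l < "$logdir/all.txt" | tr -d ' ')
echo "log lines: $merged"
```

This costs one small file per job, which may matter at the scale of millions of inputs.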

James Scriven ,Apr 24, 2015 at 18:00

I'd be willing to bet that your problem is python . You didn't say what kind of processing is being done on each file, but assuming you are just doing in-memory processing of the data, the running time will be dominated by starting up 30 million python virtual machines (interpreters).

If you can restructure your python program to take a list of files, instead of just one, you will get a huge improvement in performance. You can then still use xargs to further improve performance. For example, 40 processes, each processing 1000 files:

find ./data -name "*.json" -print0 |
xargs -0 -L1000 -P 40 python Convert.py


This isn't to say that python is a bad/slow language; it's just not optimized for startup time. You'll see this with any virtual machine-based or interpreted language. Java, for example, would be even worse. If your program was written in C, there would still be a cost of starting a separate operating system process to handle each file, but it would be much less.

From there you can fiddle with -P to see if you can squeeze out a bit more speed, perhaps by increasing the number of processes to take advantage of idle processors while data is being read/written.
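
The batching effect can be seen on a small scale with the sketch below. It uses -n (maximum arguments per command) instead of -L, ten empty placeholder files, and an sh -c job standing in for a Convert.py that accepts several file names; all of these are assumptions made for the demonstration.

```shell
#!/bin/sh
data=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8 9 10; do
    : > "$data/$i.json"      # ten empty placeholder files
done

# -n 4 : hand up to four file names to each process
# -P 2 : run at most two such processes at a time
# The sh -c job stands in for "python Convert.py file1 file2 ...".
batches=$(find "$data" -name '*.json' -print0 |
    xargs -0 -n 4 -P 2 sh -c 'echo "one process got $# files"' sh |
    wc -l | tr -d ' ')
echo "processes started: $batches"
```

Ten files in groups of at most four means three processes are started instead of ten, which is where the saving in interpreter start-up time comes from.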

Stephen ,Apr 24, 2015 at 13:03

So firstly, consider the constraints:

What is the constraint on each job? If it's I/O, you can probably get away with multiple jobs per CPU core up until you hit the limit of I/O, but if it's CPU intensive, it's going to be worse than pointless running more jobs concurrently than you have CPU cores.
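As a rough illustration of that rule of thumb, the sketch below picks a starting value for `-P`. The helper names and the factor of 2 for I/O-bound work are assumptions chosen for illustration, not measured values; `nproc` and `getconf _NPROCESSORS_ONLN` are assumed to be available on the host.

```shell
# Number of online CPU cores, with fallbacks if nproc is missing.
cpu_count() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Hypothetical helper: suggest a job count for "io" or "cpu" bound work.
suggest_jobs() {
    cores=$(cpu_count)
    case $1 in
        io) echo $((cores * 2)) ;;  # oversubscribe; the factor 2 is an arbitrary starting point
        *)  echo "$cores" ;;        # CPU-bound: no more jobs than cores
    esac
}
```

The result could then be used as `xargs -P "$(suggest_jobs io)"` and tuned from there by measurement.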

My understanding of these things is that GNU Parallel would give you better control over the queue of jobs etc.

See GNU parallel vs & (I mean background) vs xargs -P for a more detailed explanation of how the two differ.

,

As others said, check whether you're I/O-bound. Also, xargs' man page suggests using -n with -P , you don't mention the number of Convert.py processes you see running in parallel.

As a suggestion, if you're I/O-bound, you might try using an SSD block device, or try doing the processing in a tmpfs (of course, in this case you should check for enough memory, avoiding swap due to tmpfs pressure (I think), and the overhead of copying the data to it in the first place).

#### [Jun 23, 2018] Linux/Bash, how to schedule commands in a FIFO queue?

###### Jun 23, 2018 | superuser.com

Andrei ,Apr 10, 2013 at 14:26

I want the ability to schedule commands to be run in a FIFO queue. I DON'T want them to be run at a specified time in the future as would be the case with the "at" command. I want them to start running now, but not simultaneously. The next scheduled command in the queue should be run only after the first command finishes executing. Alternatively, it would be nice if I could specify a maximum number of commands from the queue that could be run simultaneously; for example if the maximum number of simultaneous commands is 2, then only at most 2 commands scheduled in the queue would be taken from the queue in a FIFO manner to be executed, the next command in the remaining queue being started only when one of the currently 2 running commands finishes.

I've heard task-spooler could do something like this, but this package doesn't appear to be well supported/tested and is not in the Ubuntu standard repositories (Ubuntu being what I'm using). If that's the best alternative then let me know and I'll use task-spooler, otherwise, I'm interested to find out what's the best, easiest, most tested, bug-free, canonical way to do such a thing with bash.

UPDATE:

Simple solutions like ; or && from bash do not work. I need to schedule these commands from an external program, when an event occurs. I just don't want to have hundreds of instances of my command running simultaneously, hence the need for a queue. There's an external program that will trigger events where I can run my own commands. I want to handle ALL triggered events, I don't want to miss any event, but I also don't want my system to crash, so that's why I want a queue to handle my commands triggered from the external program.

Andrei ,Apr 11, 2013 at 11:40

http://vicerveza.homeunix.net/~viric/soft/ts/

Does the trick very well. Hopefully it will be included in Ubuntu's package repos.

Hennes ,Apr 10, 2013 at 15:00

Use ;

For example:
ls ; touch test ; ls

That will list the directory. Only after ls has finished will it run touch test, which creates a file named test. Only after that has finished will it run the next command (in this case another ls, which shows the old contents and the newly created file).

Similar commands are || and && .

; will always run the next command.

&& will only run the next command if the first returned success.
Example: rm -rf *.mp3 && echo "Success! All MP3s deleted!"

|| will only run the next command if the first command returned a failure (non-zero) return value. Example: rm -rf *.mp3 || echo "Error! Some files could not be deleted! Check permissions!"
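The difference between && and || can be tried out safely with true and false instead of rm. This is a small sketch; `report` is a hypothetical helper that runs the given command twice, once on each side of the two operators.

```shell
# Hypothetical helper: show which chaining operator fires for a command.
report() {
    # && runs the echo only when the command exits with status 0.
    "$@" >/dev/null 2>&1 && echo "&& branch: '$*' succeeded"
    # || runs the echo only when the command exits with a non-zero status.
    "$@" >/dev/null 2>&1 || echo "|| branch: '$*' failed"
}
```

Calling `report true` prints only the && line and `report false` prints only the || line; with ; in place of either operator the echo would run in both cases.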

If you want to run a command in the background, append an ampersand ( & ).
Example:
make bzimage &
mp3blaster sound.mp3 &
make mytestsoftware ; ls ; firefox ; make clean

This runs two commands in the background (a kernel build, which will take some time, and a program to play some music). In the foreground it runs another compile job and, once that has finished, ls, firefox and make clean (all sequentially).

For more details, see man bash

[Edit after comment]

in pseudo code, something like this?

Program run_queue:

While(true)
{
Wait_for_a_signal();

While( queue not empty )
{
run next command from the queue.
remove this command from the queue.
// If commands were added to the queue during execution then
// the queue is not empty, keep processing them all.
}
// Queue is now empty, returning to wait_for_a_signal
}

//
// Wait forever on commands and add them to a queue
// Signal run_queue when something gets added.
//
{
While(true)
{
Wait_for_event();
Append command to queue
signal run_queue
}
}
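The pseudocode above can be approximated in plain shell with a queue file. The sketch below makes several assumptions: `QUEUE_FILE`, `enqueue`, `run_queue`, `lock` and `unlock` are hypothetical names, every queued command fits on one line, and an atomic `mkdir` stands in for a proper lock (a stale lock directory is left behind if a process is killed while holding it). It drains the queue once instead of waiting for a signal.

```shell
# Location of the queue; override QUEUE_FILE to share it between programs.
QUEUE_FILE=${QUEUE_FILE:-${TMPDIR:-/tmp}/cmd_queue.$$}

# mkdir either creates the directory or fails, atomically, so it works as a lock.
lock()   { until mkdir "$QUEUE_FILE.lock" 2>/dev/null; do sleep 1; done; }
unlock() { rmdir "$QUEUE_FILE.lock"; }

# Append one command line to the end of the queue.
enqueue() {
    lock
    printf '%s\n' "$*" >> "$QUEUE_FILE"
    unlock
}

# Run queued commands one at a time, oldest first, until the queue is empty.
run_queue() {
    while :; do
        lock
        if [ -s "$QUEUE_FILE" ]; then
            # Take the first line and remove it from the queue.
            IFS= read -r cmd < "$QUEUE_FILE"
            tail -n +2 "$QUEUE_FILE" > "$QUEUE_FILE.tmp"
            mv "$QUEUE_FILE.tmp" "$QUEUE_FILE"
            unlock
        else
            unlock
            break
        fi
        # The lock is released while the command runs, so new commands
        # can be enqueued during execution and are picked up next.
        sh -c "$cmd"
    done
}
```

The external program would call `enqueue` for every event, while a single long-running worker could poll with something like `while :; do run_queue; sleep 1; done`.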


terdon ,Apr 10, 2013 at 15:03

The easiest way would be to simply run the commands sequentially:
cmd1; cmd2; cmd3; cmdN


If you want the next command to run only if the previous command exited successfully, use && :

cmd1 && cmd2 && cmd3 && cmdN


That is the only bash native way I know of doing what you want. If you need job control (setting a number of parallel jobs etc), you could try installing a queue manager such as TORQUE but that seems like overkill if all you want to do is launch jobs sequentially.

psusi ,Apr 10, 2013 at 15:24

You are looking for at 's twin brother: batch . It uses the same daemon but instead of scheduling a specific time, the jobs are queued and will be run whenever the system load average is low.

mpy ,Apr 10, 2013 at 14:59

Apart from dedicated queuing systems (like the Sun Grid Engine ) which you can also use locally on one machine and which offer dozens of possibilities, you can use something like
 command1 && command2 && command3


which is the other extreme -- a very simple approach. The latter provides neither multiple simultaneous processes nor gradual filling of the "queue".

Bogdan Dumitru ,May 3, 2016 at 10:12

I went on the same route searching, trying out task-spooler and so on. The best of the best is this:

GNU Parallel --semaphore --fg It also has -j for parallel jobs.
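A hedged sketch of how that could be wrapped for use from an external program: `sem` is the short form of `parallel --semaphore`, while `run_limited` and the semaphore name `cmdqueue` are hypothetical. If `sem` is not installed, this sketch falls back to running the command directly, which removes the limit. It caps concurrency only and makes no promise about the order in which waiting callers are served.

```shell
# Hypothetical wrapper: run a command under a named semaphore so that at most
# two such commands run at once, however many callers invoke the wrapper.
run_limited() {
    if command -v sem >/dev/null 2>&1; then
        # --fg waits for the command to finish; --id names the shared semaphore.
        sem --fg -j 2 --id cmdqueue "$@"
    else
        echo "sem not found; running without a limit" >&2
        "$@"
    fi
}
```

An event handler could then call, for example, `run_limited ./handle_event.sh arg`, where `handle_event.sh` is a placeholder for the real command.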

#### [Jun 23, 2018] Task Spooler

##### "... doesn't work anymore ..."
###### Jun 23, 2018 | vicerveza.homeunix.net

As in freshmeat.net :

task spooler is a Unix batch system where the tasks spooled run one after the other. The amount of jobs to run at once can be set at any time. Each user in each system has his own job queue. The tasks are run in the correct context (that of enqueue) from any shell/process, and its output/results can be easily watched. It is very useful when you know that your commands depend on a lot of RAM, a lot of disk use, give a lot of output, or for whatever reason it's better not to run them all at the same time, while you want to keep your resources busy for maximum benefit. Its interface allows using it easily in scripts.

For your first contact, you can read an article at linux.com , which I like as overview, guide and examples (original url) . On more advanced usage, don't neglect the TRICKS file in the package.

Features

I wrote Task Spooler because I didn't have any comfortable way of running batch jobs in my linux computer. I wanted to:

• Queue jobs from different terminals.
• Use it locally in my machine (not as in network queues).
• Have a good way of seeing the output of the processes (tail, errorlevels, ...).
• Easy use: almost no configuration.
• Easy to use in scripts.

At the end, after some time using and developing ts , it can do something more:

• It works in most systems I use and some others, like GNU/Linux, Darwin, Cygwin, and FreeBSD.
• No configuration at all for a simple queue.
• Good integration with renice, kill, etc. (through ts -p and process groups).
• Have any amount of queues identified by name, writing a simple wrapper script for each (I use ts2, tsio, tsprint, etc).
• Control how many jobs may run at once in any queue (taking advantage of multicores).
• It never removes the result files, so they can be reached even after we've lost the ts task list.
• Transparent if used as a subprogram with -nf .
• Optional separation of stdout and stderr.

You can look at an old (but representative) screenshot of ts-0.2.1 if you want.

Mailing list

I created a GoogleGroup for the program. You can find the archive and the join methods in the taskspooler google group page .

Alessandro Öhler once maintained a mailing list for discussing newer functionalities and interchanging use experiences. I think this doesn't work anymore , but you can look at the old archive or even try to subscribe .

How it works

The queue is maintained by a server process. This server process is started if it isn't there already. The communication goes through a unix socket usually in /tmp/ .

When the user requests a job (using a ts client), the client waits for the server message to know when it can start. When the server allows starting, this client usually forks and runs the command with the proper environment, because it is the client that runs the job and not the server (unlike 'at' or 'cron'). So the client's ulimits, environment, pwd, etc. apply.

When the job finishes, the client notifies the server. At this time, the server may notify any waiting client, and stores the output and the errorlevel of the finished job.

Moreover, the client can take advantage of much information from the server: when a job finishes, where the job output goes, etc.

Look at the version repository if you are interested in its development.

Андрей Пантюхин (Andrew Pantyukhin) maintains the BSD port .

Alessandro Öhler provided a Gentoo ebuild for 0.4 , which with simple changes I updated to the ebuild for 0.6.4 . Moreover, the Gentoo Project Sunrise already has also an ebuild ( maybe old ) for ts .

Alexander V. Inyukhin maintains unofficial debian packages for several platforms. Find the official packages in the debian package system .

Pascal Bleser packed the program for SuSE and openSuSE in RPMs for various platforms .

Gnomeye maintains the AUR package .

Eric Keller wrote a nodejs web server showing the status of the task spooler queue ( github project ).

Manual

Look at its manpage (v0.6.1). Here you also have a copy of the help for the same version:

usage: ./ts [action] [-ngfmd] [-L <lab>] [cmd...]
Env vars:
TS_SOCKET  the path to the unix socket used by the ts command.
TS_MAILTO  where to mail the result (on -m). Local user by default.
TS_MAXFINISHED  maximum finished jobs in the queue.
TS_ONFINISH  binary called on job end (passes jobid, error, outfile, command).
TS_ENV  command called on enqueue. Its output determines the job information.
TS_SAVELIST  filename which will store the list, if the server dies.
TS_SLOTS   amount of jobs which can run at once, read on server start.
Actions:
-K       kill the task spooler server
-C       clear the list of finished jobs
-l       show the job list (default action)
-S [num] set the number of max simultanious jobs of the server.
-t [id]  tail -f the output of the job. Last run if not specified.
-c [id]  cat the output of the job. Last run if not specified.
-p [id]  show the pid of the job. Last run if not specified.
-o [id]  show the output file. Of last job run, if not specified.
-i [id]  show job information. Of last job run, if not specified.
-s [id]  show the job state. Of the last added, if not specified.
-r [id]  remove a job. The last added, if not specified.
-w [id]  wait for a job. The last added, if not specified.
-u [id]  put that job first. The last added, if not specified.
-U <id-id>  swap two jobs in the queue.
-h       show this help
-V       show the program version
-n       don't store the output of the command.
-g       gzip the stored output (if not -n).
-f       don't fork into background.
-m       send the output by e-mail (uses sendmail).
-d       the job will be run only if the job before ends well
-L <lab> name this task with a label, to be distinguished on listing.

Thanks
• To Raúl Salinas, for his inspiring ideas
• To Alessandro Öhler, the first non-acquaintance user, who proposed and created the mailing list.
• To Андрей Пантюхин (Andrew Pantyukhin), who created the BSD port .
• To the useful, although sometimes uncomfortable, UNIX interface.
• To Alexander V. Inyukhin, for the debian packages.
• To Pascal Bleser, for the SuSE packages.
• To Sergio Ballestrero, who sent code and motivated the development of a multislot version of ts.
• To GNU, an ugly but working and helpful ol' UNIX implementation.

#### [Jun 23, 2018] bash - Shell Scripting Using xargs to execute parallel instances of a shell function

###### Jun 23, 2018 | stackoverflow.com

Gnats ,Jul 23, 2010 at 19:33

I'm trying to use xargs in a shell script to run parallel instances of a function I've defined in the same script. The function times the fetching of a page, and so it's important that the pages are actually fetched concurrently in parallel processes, and not in background processes (if my understanding of this is wrong and there's negligible difference between the two, just let me know).

The function is:

function time_a_url ()
{
     oneurltime=$($time_command -p wget -p $1 -O /dev/null 2>&1 1>/dev/null | grep real | cut -d" " -f2)
     echo "Fetching $1 took $oneurltime seconds."
}

How does one do this with an xargs pipe in a form that can take number of times to run time_a_url in parallel as an argument? And yes, I know about GNU parallel, I just don't have the privilege to install software where I'm writing this.

Dennis Williamson ,Jul 23, 2010 at 23:03

Here's a demo of how you might be able to get your function to work:

$ f() { echo "[$@]"; }
$ export -f f
$ echo -e "b 1\nc 2\nd 3 4" | xargs -P 0 -n 1 -I{} bash -c f\ \{\}
[b 1]
[d 3 4]
[c 2]

The keys to making this work are to export the function so the bash that xargs spawns will see it and to escape the space between the function name and the escaped braces. You should be able to adapt this to work in your situation. You'll need to adjust the arguments for -P and -n (or remove them) to suit your needs.

You can probably get rid of the grep and cut. If you're using the Bash builtin time, you can specify an output format using the TIMEFORMAT variable. If you're using GNU /usr/bin/time, you can use the --format argument. Either of these will allow you to drop the -p also.

You can replace this part of your wget command: 2>&1 1>/dev/null with -q. In any case, you have those reversed. The correct order would be >/dev/null 2>&1.

Lee Netherton ,Aug 30, 2011 at 16:32

I used xargs -P0 -n1 -I{} bash -c "f {}" which still works, and seems a little tidier. – Lee Netherton Aug 30 '11 at 16:32

tmpvar ,Jul 24, 2010 at 15:21

On Mac OS X:

xargs: max. processes must be >0 (for: xargs -P [>0])

f() { echo "[$@]"; }
export -f f

echo -e "b 1\nc 2\nd 3 4" | sed 's/ /\\ /g' | xargs -P 10 -n 1 -I{} bash -c f\ \{\}

echo -e "b 1\nc 2\nd 3 4" | xargs -P 10 -I '{}' bash -c 'f "$@"' arg0 '{}'

,

If you install GNU Parallel on another system, you will see the functionality is in a single file (called parallel). You should be able to simply copy that file to your own ~/bin.

#### [Jun 13, 2018] parsync - a parallel rsync wrapper for large data transfers by Harry Mangalam

###### Jan 22, 2017 | nac.uci.edu

v1.67 (Mac Beta)

1. Download

If you already know you want it, get it here: parsync+utils.tar.gz (contains parsync plus the kdirstat-cache-writer, stats, and scut utilities below). Extract it into a dir on your $PATH and after verifying the other dependencies below, give it a shot.

While parsync is developed for and tested on Linux, the latest version of parsync has been modified to (mostly) work on the Mac (tested on OSX 10.9.5). A number of the Linux-specific dependencies have been removed and there are a number of Mac-specific workarounds.

Thanks to Phil Reese < preese@stanford.edu > for the code mods needed to get it started. It's the same package and instructions for both platforms.

2. Dependencies

parsync requires the following utilities to work:

• stats - self-writ Perl utility for providing descriptive stats on STDIN
• scut - self-writ Perl utility like cut that allows regex split tokens
• kdirstat-cache-writer (included in the tarball mentioned above), requires a non-default Perl utility: URI::Escape qw(uri_escape)

sudo yum install perl-URI  # CentOS-like

sudo apt-get install liburi-perl  # Debian-like

parsync needs to be installed only on the SOURCE end of the transfer and uses whatever rsync is available on the TARGET. It uses a number of Linux-specific utilities, so if you're transferring between Linux and a FreeBSD host, install parsync on the Linux side. In fact, as currently written, it will only PUSH data to remote targets; it will not pull data as rsync itself can do. This will probably change in the near future.

3. Overview

rsync is a fabulous data mover. Possibly more bytes have been moved (or have been prevented from being moved) by rsync than by any other application. So what's not to love?

For transferring large, deep file trees, rsync will pause while it generates lists of files to process. Since Version 3, it does this pretty fast, but on sluggish filesystems, it can take hours or even days before it will start to actually exchange rsync data.

Second, due to various bottlenecks, rsync will tend to use less than the available bandwidth on high speed networks. Starting multiple instances of rsync can improve this significantly. However, on such transfers, it is also easy to overload the available bandwidth, so it would be nice to both limit the bandwidth used if necessary and also to limit the load on the system.

parsync tries to satisfy all these conditions and more by:

• using the kdirstat-cache-writer utility from the beautiful kdirstat directory browser, which can produce lists of files very rapidly
• allowing re-use of the cache files so generated.
• doing crude loadbalancing of the number of active rsyncs, suspending and un-suspending the processes as necessary.
• using rsync's own bandwidth limiter (--bwlimit) to throttle the total bandwidth.
• making rsync's own vast option selection available as a pass-thru (though limited to those compatible with the --files-from option).

Only use for LARGE data transfers

The main use case for parsync is really only very large data transfers thru fairly fast network connections (>1Gb/s). Below this speed, a single rsync can saturate the connection, so there's little reason to use parsync, and in fact the overhead of testing the existence of and starting more rsyncs tends to worsen its performance on small transfers to slightly less than rsync alone.
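The bandwidth throttling mentioned in the overview comes down to simple arithmetic: according to parsync's own help text, the total limit given with --maxbw is divided by the number of rsync processes (NP) and handed to each rsync as --bwlimit. The sketch below only illustrates that division; `per_rsync_bwlimit` is a hypothetical helper and not parsync's actual code.

```shell
# Hypothetical helper: split a total bandwidth cap (KB/s) evenly across NP rsyncs.
per_rsync_bwlimit() {
    total_kbps=$1
    np=$2
    # Integer division, so the sum of the shares never exceeds the stated total.
    echo $((total_kbps / np))
}
```

For a total cap of 1000 KB/s and four processes, each rsync would be started with `--bwlimit=250`.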
Beyond this introduction, parsync's internal help is about all you'll need to figure out how to use it; below is what you'll see when you type parsync -h. There are still edge cases where parsync will fail or behave oddly, especially with small data transfers, so I'd be happy to hear of such misbehavior or suggestions to improve it.

Download the complete tarball of parsync, plus the required utilities here: parsync+utils.tar.gz

Unpack it, move the contents to a dir on your $PATH, chmod it executable, and try it out.

parsync --help

or just

parsync

Below is what you should see:

4. parsync help

parsync version 1.67 (Mac compatibility beta) Jan 22, 2017
by Harry Mangalam <hjmangalam@gmail.com> || <harry.mangalam@uci.edu>

parsync is a Perl script that wraps Andrew Tridgell's miraculous 'rsync' to
provide some load balancing and parallel operation across network connections
to increase the amount of bandwidth it can use.

parsync is primarily tested on Linux, but (mostly) works on MaccOSX
as well.

parsync needs to be installed only on the SOURCE end of the
transfer and only works in local SOURCE -> remote TARGET mode
(it won't allow remote local SOURCE <- remote TARGET, emitting an
error and exiting if attempted).

It uses whatever rsync is available on the TARGET. It uses a number
of Linux-specific utilities so if you're transferring between Linux
and a FreeBSD host, install parsync on the Linux side.

The only native rsync option that parsync uses is '-a' (archive) &
'-s' (respect bizarro characters in filenames).
If you need more, then it's up to you to provide them via
'--rsyncopts'. parsync checks to see if the current system load is
too heavy and tries to throttle the rsyncs during the run by
monitoring and suspending / continuing them as needed.

It uses the very efficient (also Perl-based) kdirstat-cache-writer
from kdirstat to generate lists of files which are summed and then
crudely divided into NP jobs by size.

It appropriates rsync's bandwidth throttle mechanism, using '--maxbw'
as a passthru to rsync's 'bwlimit' option, but divides it by NP so
as to keep the total bw the same as the stated limit. It monitors and
shows network bandwidth, but can't change the bw allocation mid-job.
It can only suspend rsyncs until the load decreases below the cutoff.
If you suspend parsync (^Z), all rsync children will suspend as well,
regardless of current state.

Unless changed by '--interface', it tried to figure out how to set the
interface to monitor. The transfer will use whatever interface routing
provides, normally set by the name of the target. It can also be used for
non-host-based transfers (between mounted filesystems) but the network
bandwidth continues to be (usually pointlessly) shown.

[[NB: Between mounted filesystems, parsync sometimes works very poorly for
reasons still mysterious. In such cases (monitor with 'ifstat'), use 'cp'
or 'tnc' (https://goo.gl/5FiSxR) for the initial data movement and a single
rsync to finalize. I believe the multiple rsync chatter is interfering with
the transfer.]]

It only works on dirs and files that originate from the current dir (or
specified via "--rootdir"). You cannot include dirs and files from
discontinuous or higher-level dirs.

** the ~/.parsync files **
The ~/.parsync dir contains the cache (*.gz), the chunk files (kds*), and the
time-stamped log files. The cache files can be re-used with '--reusecache'
(which will re-use ALL the cache and chunk files. The log files are
datestamped and are NOT overwritten.

** Odd characters in names **
parsync will sometimes refuse to transfer some oddly named files, altho
recent versions of rsync allow the '-s' flag (now a parsync default)
which tries to respect names with spaces and properly escaped shell
characters. Filenames with embedded newlines, DOS EOLs, and other
odd chars will be recorded in the log files in the ~/.parsync dir.

** Because of the crude way that files are chunked, NP may be
adjusted slightly to match the file chunks. ie '--NP 8' -> '--NP 7'.
If so, a warning will be issued and the rest of the transfer will be
automatically adjusted.

OPTIONS
=======
[i] = integer number
[f] = floating point number
[s] = "quoted string"
( ) = the default if any

--NP [i] (sqrt(#CPUs)) ............... number of rsync processes to start
      optimal NP depends on many vars. Try the default and incr as needed
--startdir [s] (pwd) .. the directory it works relative to. If you omit
      it, the default is the CURRENT dir. You DO have
      to specify target dirs. See the examples below.
--maxbw [i] (unlimited) .......... in KB/s max bandwidth to use (--bwlimit
      passthru to rsync). maxbw is the total BW to be used, NOT per rsync.
--maxload [f] (NP+2) ........ max total system load - if sysload > maxload,
      sleeps an rsync proc for 10s
--checkperiod [i] (5) .......... sets the period in seconds between updates
--rsyncopts [s] ... options passed to rsync as a quoted string (CAREFUL!)
      this opt triggers a pause before executing to verify the command.
--interface [s] ............. network interface to /monitor/, not nec use.
      default: /sbin/route -n | grep "^0.0.0.0" | rev | cut -d' ' -f1 | rev
      above works on most simple hosts, but complex routes will confuse it.
--reusecache .......... don't re-read the dirs; re-use the existing caches
--email [s] ..................... email address to send completion message
      (requires working mail system on host)
--barefiles ..... set to allow rsync of individual files, as oppo to dirs
--nowait ................ for scripting, sleep for a few s instead of wait
--version ................................. dumps version string and exits
--help ......................................................... this help

Examples
========
-- Good example 1 --
% parsync --maxload=5.5 --NP=4 --startdir='/home/hjm' dir1 dir2 dir3 hjm@remotehost:~/backups

where
= "--startdir='/home/hjm'" sets the working dir of this operation to
  '/home/hjm' and dir1 dir2 dir3 are subdirs from '/home/hjm'
= the target "hjm@remotehost:~/backups" is the same target rsync would use
= "--NP=4" forks 4 instances of rsync
= "--maxload=5.5" will start suspending rsync instances when the 5m system
  load gets to 5.5 and then unsuspending them when it goes below it.

It uses 4 instances to rsync dir1 dir2 dir3 to hjm@remotehost:~/backups

-- Good example 2 --
% parsync --rsyncopts="--ignore-existing" --reusecache --NP=3 --barefiles *.txt /mount/backups/txt

where
= "--rsyncopts='--ignore-existing'" is an option passed thru to rsync
  telling it not to disturb any existing files in the target directory.
= "--reusecache" indicates that the filecache shouldn't be re-generated,
  uses the previous filecache in ~/.parsync
= "--NP=3" for 3 copies of rsync (with no "--maxload", the default is 4)
= "--barefiles" indicates that it's OK to transfer barefiles instead of
  recursing thru dirs.
= "/mount/backups/txt" is the target - a local disk mount instead of a
  network host.

It uses 3 instances to rsync *.txt from the current dir to "/mount/backups/txt".

-- Error Example 1 --
% pwd
/home/hjm  # executing parsync from here

% parsync --NP4 --compress /usr/local /media/backupdisk

why this is an error:
= '--NP4' is not an option (parsync will say "Unknown option: np4")
  It should be '--NP=4'
= if you were trying to rsync '/usr/local' to '/media/backupdisk', it will
  fail since there is no /home/hjm/usr/local dir to use as a source.
  This will be shown in the log files in
  ~/.parsync/rsync-logfile-<datestamp>_#
  as a spew of "No such file or directory (2)" errors
= the '--compress' is a native rsync option, not a native parsync option.
  You have to pass it to rsync with "--rsyncopts='--compress'"

The correct version of the above command is:

% parsync --NP=4 --rsyncopts='--compress' --startdir=/usr local /media/backupdisk

-- Error Example 2 --
% parsync --start-dir /home/hjm mooslocal hjm@moo.boo.yoo.com:/usr/local

why this is an error:
= this command is trying to PULL data from a remote SOURCE to a
  local TARGET. parsync doesn't support that kind of operation yet.

The correct version of the above command is:

# ssh to hjm@moo, install parsync, then:
% parsync --startdir=/usr local hjm@remote:/home/hjm/mooslocal

#### [Jun 02, 2018] How to run Linux commands simultaneously with GNU Parallel

###### Jun 02, 2018 | www.techrepublic.com

Scratching the surface

We've only just scratched the surface of GNU Parallel. I highly recommend you give the official GNU Parallel tutorial a read, and watch this video tutorial series on YouTube, so you can understand the complexities of the tool (of which there are many). But this will get you started on a path to helping your data center Linux servers use commands with more efficiency.

#### [Jun 02, 2018] Parallelise rsync using GNU Parallel

###### Jun 02, 2018 | unix.stackexchange.com

Mandar Shinde ,Mar 13, 2015 at 6:51

I have been using a rsync script to synchronize data at one host with the data at another host. The data has numerous small-sized files that contribute to almost 1.2TB.

In order to sync those files, I have been using rsync command as follows:

rsync -avzm --stats --human-readable --include-from proj.lst /data/projects REMOTEHOST:/data/

The contents of proj.lst are as follows:

+ proj1
+ proj1/*
+ proj1/*/*
+ proj1/*/*/*.tar
+ proj1/*/*/*.pdf
+ proj2
+ proj2/*
+ proj2/*/*
+ proj2/*/*/*.tar
+ proj2/*/*/*.pdf
...
...
...
- *

As a test, I picked up two of those projects (8.5GB of data) and I executed the command above. Being a sequential process, it took 14 minutes 58 seconds to complete. So, for 1.2TB of data it would take several hours.
If I could run multiple rsync processes in parallel (using &, xargs or parallel), it would save time.

I tried the command below with parallel (after cd-ing to the source directory) and it took 12 minutes 37 seconds to execute:

parallel --will-cite -j 5 rsync -avzm --stats --human-readable {} REMOTEHOST:/data/ ::: .

This should have taken 5 times less time, but it didn't. I think I'm going wrong somewhere.

How can I run multiple rsync processes in order to reduce the execution time?

Ole Tange ,Mar 13, 2015 at 7:25

Are you limited by network bandwidth? Disk iops? Disk bandwidth? – Ole Tange Mar 13 '15 at 7:25

Mandar Shinde ,Mar 13, 2015 at 7:32

If possible, we would want to use 50% of total bandwidth. But, parallelising multiple rsyncs is our first priority. – Mandar Shinde Mar 13 '15 at 7:32

Ole Tange ,Mar 13, 2015 at 7:41

Can you let us know your: Network bandwidth, disk iops, disk bandwidth, and the bandwidth actually used? – Ole Tange Mar 13 '15 at 7:41

Mandar Shinde ,Mar 13, 2015 at 7:47

In fact, I do not know about above parameters. For the time being, we can neglect the optimization part. Multiple rsyncs in parallel is the primary focus now. – Mandar Shinde Mar 13 '15 at 7:47

Mandar Shinde ,Apr 11, 2015 at 13:53

Following steps did the job for me:

1. Run the rsync --dry-run first in order to get the list of files that would be affected.

rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log

2. I fed the output of cat transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:

cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log

Here, the --relative option ( link ) ensured that the directory structure for the affected files, at the source and destination, remains the same (inside the /data/ directory), so the command must be run in the source folder (in the example, /data/projects).

Sandip Bhattacharya ,Nov 17, 2016 at 21:22

That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them.

rm backups.*
split -l 3000 backup.list backups.
ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/

– Sandip Bhattacharya Nov 17 '16 at 21:22

Mike D ,Sep 19, 2017 at 16:42

How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/ . – Mike D Sep 19 '17 at 16:42

Cheetah ,Oct 12, 2017 at 5:31

On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them. – Cheetah Oct 12 '17 at 5:31

Mikhail ,Apr 10, 2017 at 3:28

I would strongly discourage anybody from using the accepted answer; a better solution is to crawl the top level directory and launch a proportional number of rsync operations.

I have a large zfs volume and my source was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1.
The source drive was mounted like:

mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0

Using a single rsync process:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod

the io meter reads:

StoragePod 30.0T 144T 0 1.61K 0 130M
StoragePod 30.0T 144T 0 1.61K 0 130M
StoragePod 30.0T 144T 0 1.62K 0 130M

In synthetic benchmarks (crystal disk), performance for sequential write approaches 900 MB/s, which means the link is saturated. 130MB/s is not very good, and is the difference between waiting a weekend and two weeks.

So, I built the file list and tried to run the sync again (I have a 64 core machine):

cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log

and it had the same performance!

StoragePod 29.9T 144T 0 1.63K 0 130M
StoragePod 29.9T 144T 0 1.62K 0 130M
StoragePod 29.9T 144T 0 1.56K 0 129M

As an alternative I simply ran rsync on the root folders:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell

This actually boosted performance:

StoragePod 30.1T 144T 13 3.66K 112K 343M
StoragePod 30.1T 144T 24 5.11K 184K 469M
StoragePod 30.1T 144T 25 4.30K 196K 373M

In conclusion, as @Sandip Bhattacharya brought up, write a small script to get the directories and parallel that. Alternatively, pass a file list to rsync. But don't create new instances for each file.
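One way to read that advice is a small script that starts one copy process per top-level directory. The sketch below is an illustration under assumptions: `sync_top_level_dirs` and `SYNC_CMD` are hypothetical names, `rsync -a` is only a default that can be swapped out, and `find` is assumed to support -mindepth, -maxdepth and -print0. Plain files sitting directly in the source directory are not covered and would need a final single rsync pass.

```shell
# Command used for each directory; deliberately left unquoted below so that
# its options are split into separate words.
SYNC_CMD=${SYNC_CMD:-rsync -a}

# Hypothetical helper: copy every top-level directory of SRC into DEST,
# running up to JOBS copies at the same time.
sync_top_level_dirs() {
    src=$1
    dest=$2
    jobs=$3
    # One copy process per directory, not per file.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I '{}' $SYNC_CMD '{}' "$dest/"
}
```

A call such as `sync_top_level_dirs /mnt/Datahoarder_Mount/Mikhail /StoragePod 4` would then keep up to four rsyncs busy, one per directory. This only pays off when the data is spread fairly evenly across several directories.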
Julien Palard, May 25, 2016 at 14:15

I personally use this simple one:

ls -1 | parallel rsync -a {} /destination/directory/

This is only useful when you have more than a few non-near-empty directories; otherwise almost every rsync terminates early and the last one does all the work alone.

Ole Tange, Mar 13, 2015 at 7:25

A tested way to do the parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync

rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections. The following will start one rsync per big file in src-dir to dest-dir on the server fooserver:

cd src-dir; find . -type f -size +100000 | \
  parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
  rsync -s -Havessh {} fooserver:/dest-dir/{}

The directories created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:

rsync -Havessh src-dir/ fooserver:/dest-dir/

If you are unable to push data, but need to pull them and the files are called digits.png (e.g. 000000.png) you might be able to do:

seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/

Mandar Shinde, Mar 13, 2015 at 7:34

Any other alternative in order to avoid find ? – Mandar Shinde Mar 13 '15 at 7:34

Ole Tange, Mar 17, 2015 at 9:20

Limit the -maxdepth of find. – Ole Tange Mar 17 '15 at 9:20

Mandar Shinde, Apr 10, 2015 at 3:47

If I use --dry-run option in rsync , I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process? – Mandar Shinde Apr 10 '15 at 3:47

Ole Tange, Apr 10, 2015 at 5:51

cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; rsync -s -Havessh {} fooserver:/dest-dir/{} – Ole Tange Apr 10 '15 at 5:51

Mandar Shinde, Apr 10, 2015 at 9:49

Can you please explain the mkdir -p /dest-dir/{//}\; part?
Especially the {//} thing is a bit confusing. – Mandar Shinde Apr 10 '15 at 9:49 , For multi destination syncs, I am using parallel rsync -avi /path/to/source ::: host1: host2: host3:  Hint: All ssh connections are established with public keys in ~/.ssh/authorized_keys #### [Jun 02, 2018] Parallelizing rsync ###### Jun 02, 2018 | www.gnu.org rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections. The following will start one rsync per big file in src-dir to dest-dir on the server fooserver :  cd src-dir; find . -type f -size +100000 | \ parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \ rsync -s -Havessh {} fooserver:/dest-dir/{}  The dirs created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:  rsync -Havessh src-dir/ fooserver:/dest-dir/  If you are unable to push data, but need to pull them and the files are called digits.png (e.g. 000000.png) you might be able to do:  seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/  #### [May 28, 2018] TIP 7-zip s XZ compression on a multiprocessor system is often faster and compresses better than gzip linuxadmin ###### May 28, 2018 | www.reddit.com TyIzaeL line"> [–] kristopolous 4 years ago (4 children) I did this a while back also. Here's a graph: http://i.imgur.com/gPOQBfG.png X axis is compression level (min to max) Y is the size of the file that was compressed I forget what the file was. TyIzaeL 4 years ago (3 children) That is a great start (probably better than what I am doing). Do you have time comparisons as well? kristopolous 4 years ago (1 child) http://www.reddit.com/r/linuxquestions/comments/1gdvnc/best_file_compression_format/caje4hm there's the post TyIzaeL 4 years ago (0 children) Very nice. I might work on something similar to this soon next time I'm bored. kristopolous 4 years ago (0 children) nope. 
TyIzaeL 4 years ago (0 children) That's a great point to consider among all of this. Compression is always a tradeoff between how much CPU and memory you want to throw at something and how much space you would like to save. In my case, hammering the server for 3 minutes in order to take a backup is necessary because the uncompressed data would bottleneck at the LAN speed. randomfrequency 4 years ago (0 children) You might want to play with 'pigz' - it's gzip, multi-threaded. You can 'pv' to restrict the rate of the output, and it accepts signals to control the rate limiting. rrohbeck 4 years ago (1 child) Also pbzip2 -1 to -9 and pigz -1 to -9. With -9 you can surely make backup CPU bound. I've given up on compression though: rsync is much faster than straight backup and I use btrfs compression/deduplication/snapshotting on the backup server. TyIzaeL 4 years ago (0 children) pigz -9 is already on the chart as pigz --best. I'm working on adding the others though. TyIzaeL 4 years ago (0 children) I'm running gzip, bzip2, and pbzip2 now (not at the same time, of course) and will add results soon. But in my case the compression keeps my db dumps from being IO bound by the 100mbit LAN connection. For example, lzop in the results above puts out 6041.632 megabits in 53.82 seconds for a total compressed data rate of 112 megabits per second, which would make the transfer IO bound. Whereas the pigz example puts out 3339.872 megabits in 81.892 seconds, for an output data rate of 40.8 megabits per second. This is just on my dual-core box with a static file, on the 8-core server I see the transfer takes a total of about three minutes. It's probably being limited more by the rate at which the MySQL server can dump text from the database, but if there was no compression it'd be limited by the LAN speed. If we were dumping 2.7GB over the LAN directly, we would need 122mbit/s of real throughput to complete it in three minutes. 
Shammyhealz 4 years ago (2 children) I thought the best compression was supposed to be LZMA? Which is what the .7z archives are. I have no idea of the relative speed of LZMA and gzip TyIzaeL 4 years ago (1 child) xz archives use the LZMA2 format (which is also used in 7z archives). LZMA2 speed seems to range from a little slower than gzip to much slower than bzip2, but results in better compression all around. primitive_screwhead 4 years ago (0 children) However LZMA2 decompression speed is generally much faster than bzip2, in my experience, though not as fast as gzip. This is why we use it, as we decompress our data much more often than we compress it, and the space saving/decompression speed tradeoff is much more favorable for us than either gzip of bzip2. crustang 4 years ago (2 children) I mentioned how 7zip was superior to all other zip programs in /r/osx a few days ago and my comment was burried in favor of the the osx circlejerk .. it feels good seeing this data. I love 7zip RTFMorGTFO 4 years ago (1 child) Why... Tar supports xz, lzma, lzop, lzip, and any other kernel based compression algorithms. Its also much more likely to be preinstalled on your given distro. crustang 4 years ago (0 children) I've used 7zip at my old job for a backup of our business software's database. We needed speed, high level of compression, and encryption. Portability wasn't high on the list since only a handful of machines needed access to the data. All machines were multi-processor and 7zip gave us the best of everything given the requirements. I haven't really looked at anything deeply - including tar, which my old boss didn't care for. #### [May 28, 2018] RPM RedHat EL 6 p7zip 9.20.1 x86_64 rpm ###### May 28, 2018 | rpm.pbone.net p7zip rpm build for : RedHat EL 6 . For other distributions click p7zip .  
Name : p7zip Version : 9.20.1 Vendor : Dag Apt Repository, http://dag_wieers_com/apt/ Release : 1.el6.rf Date : 2011-04-20 15:23:34 Group : Applications/Archiving Source RPM : p7zip-9.20.1-1.el6.rf.src.rpm Size : 14.84 MB Packager : Dag Wieers < dag_wieers_com> Summary : Very high compression ratio file archiver Description : p7zip is a port of 7za.exe for Unix. 7-Zip is a file archiver with a very high compression ratio. The original version can be found at http://www.7-zip.org/. RPM found in directory: /mirror/apt.sw.be/redhat/el6/en/x86_64/rpmforge/RPMS Content of RPM Changelog Provides Requires Download  ftp.univie.ac.at p7zip-9.20.1-1.el6.rf.x86_64.rpm ftp.rediris.es p7zip-9.20.1-1.el6.rf.x86_64.rpm ftp.icm.edu.pl p7zip-9.20.1-1.el6.rf.x86_64.rpm ftp.pbone.net p7zip-9.20.1-1.el6.rf.x86_64.rpm ftp.pbone.net p7zip-9.20.1-1.el6.rf.x86_64.rpm ftp.pbone.net p7zip-9.20.1-1.el6.rf.x86_64.rpm ftp.is.co.za p7zip-9.20.1-1.el6.rf.x86_64.rpm #### [May 28, 2018] TIL pigz exists A parallel implementation of gzip for modern multi-processor, multi-core machines linux ###### May 28, 2018 | www.reddit.com submitted 3 years ago by msiekkinen y unvoted"> [–] tangre 3 years ago (74 children) Why wouldn't gzip be updated with this functionality instead? Is there a point in keeping it separate? ilikerackmounts 3 years ago (59 children) There are certain file sizes were pigz makes no difference, in general you need at least 2 cores to feel the benefits, there are quite a few reasons. That being said, pigz and its bzip counterpart pbzip2 can be symlinked in place when emerged with gentoo and using the "symlink" use flag. adam@eggsbenedict ~$ eix pigz
[I] app-arch/pigz
Available versions:  2.2.5 2.3 2.3.1 (~)2.3.1-r1 {static symlink |test}
Installed versions:  2.3.1-r1(02:06:01 01/25/14)(symlink -static -|test)
Homepage:            http://www.zlib.net/pigz/
Description:         A parallel implementation of gzip

msiekkinen 3 years ago (38 children)

in general you need at least 2 cores to feel the benefits

Is it even possible to buy any single core cpus outside of some kind of specialized embedded system these days?

exdirrk 3 years ago (5 children)
Virtualization.
tw4 3 years ago (2 children)
Yes, but nevertheless it's possible to allocate only one.
too_many_secrets 3 years ago (0 children)

Giving a VM more than one CPU is quite a rare circumstance.

Depends on your circumstances. It's rare that we have any VMs with a single CPU, but we have thousands of servers and a lot of things going on.

FakingItEveryDay 3 years ago (0 children)
You can, but often shouldn't. I can only speak for vmware here, other hypervisors may work differently. Generally you want to size your VMware vm's so that they are around 80% cpu utilization. When any VM with multiple cores needs compute power the hypervisor will make it wait to until it can free that number of CPUs, even if the task in the VM only needs one core. This makes the multi-core VM slower by having to wait longer to do it's work, as well as makes other VMs on the hypervisor slower as they must all wait for it to finish before they can get a core allocated.

#### [May 28, 2018] Solaris: Parallel Compression/Decompression

##### "... compare gzip's 52s decompression time with pigz's 18s ..."
###### May 28, 2018 | hadafq8.wordpress.com

Posted on January 26, 2015 by Sandeep Shenoy

This topic is not Solaris specific, but certainly helps Solaris users who are frustrated with the single-threaded implementation of all officially supported compression tools such as compress, gzip, zip. pigz (pig-zee) is a parallel implementation of gzip that is well suited to the latest multi-processor, multi-core machines. By default, pigz breaks up the input into multiple chunks of size 128 KB, and compresses each chunk in parallel with the help of light-weight threads. The number of compress threads is set by default to the number of online processors. The chunk size and the number of threads are configurable. Compressed files can be restored to their original form using the -d option of the pigz or gzip tools. As per the man page, decompression is not parallelized out of the box, but may show some improvement compared to the existing old tools. The following example demonstrates the advantage of using pigz over gzip in compressing and decompressing a large file.
Original file, and the target hardware.

$ ls -lh PT8.53.04.tar
-rw-r--r-- 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar

$ psrinfo -pv
The physical processor has 8 cores and 64 virtual processors (0-63)
The core has 8 virtual processors (0-7)
The core has 8 virtual processors (56-63)
SPARC-T5 (chipid 0, clock 3600 MHz)

gzip compression.

$ time gzip --fast PT8.53.04.tar
real 3m40.125s
user 3m27.105s
sys 0m13.008s

$ ls -lh PT8.53*
-rw-r--r-- 1 psft dba 3.1G Feb 28 14:03 PT8.53.04.tar.gz

/* the following prstat, vmstat outputs show that gzip is compressing the tar file using a single thread, hence low CPU utilization. */

$ prstat -p 42510
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
42510 psft 2616K 2200K cpu16 10 0 0:01:00 1.5% gzip/1

$ prstat -m -p 42510
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
42510 psft 95 4.6 0.0 0.0 0.0 0.0 0.0 0.0 0 35 7K 0 gzip/1

$ vmstat 2
r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
0 0 0 776242104 917016008 0 7 0 0 0 0 0 0 0 52 52 3286 2606 2178 2 0 98
1 0 0 776242104 916987888 0 14 0 0 0 0 0 0 0 0 0 3851 3359 2978 2 1 97
0 0 0 776242104 916962440 0 0 0 0 0 0 0 0 0 0 0 3184 1687 2023 1 0 98
0 0 0 775971768 916930720 0 0 0 0 0 0 0 0 0 39 37 3392 1819 2210 2 0 98
0 0 0 775971768 916898016 0 0 0 0 0 0 0 0 0 0 0 3452 1861 2106 2 0 98

pigz compression.

$ time ./pigz PT8.53.04.tar
real 0m25.111s <== wall clock time is 25s compared to gzip's 3m 40s
user 17m18.398s
sys 0m37.718s

/* the following prstat, vmstat outputs show that pigz is compressing the tar file using many threads, hence a busy system with high CPU utilization. */

$ prstat -p 49734
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
49734 psft 59M 58M sleep 11 0 0:12:58 38% pigz/66

$ vmstat 2
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
0 0 0 778097840 919076008 6 113 0 0 0 0 0 0 0 40 36 39330 45797 74148 61 4 35
0 0 0 777956280 918841720 0 1 0 0 0 0 0 0 0 0 0 38752 43292 71411 64 4 32
0 0 0 777490336 918334176 0 3 0 0 0 0 0 0 0 17 15 46553 53350 86840 60 4 35
1 0 0 777274072 918141936 0 1 0 0 0 0 0 0 0 39 34 16122 20202 28319 88 4 9
1 0 0 777138800 917917376 0 0 0 0 0 0 0 0 0 3 3 46597 51005 86673 56 5 39

$ ls -lh PT8.53.04.tar.gz
-rw-r--r-- 1 psft dba 3.0G Feb 28 14:03 PT8.53.04.tar.gz

$ gunzip PT8.53.04.tar.gz <== shows that the pigz compressed file is compatible with gzip/gunzip

$ ls -lh PT8.53*
-rw-r--r-- 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar

Decompression.

$ time ./pigz -d PT8.53.04.tar.gz
real 0m18.068s
user 0m22.437s
sys 0m12.857s

$ time gzip -d PT8.53.04.tar.gz
real 0m52.806s <== compare gzip's 52s decompression time with pigz's 18s
user 0m42.068s
sys 0m10.736s

$ ls -lh PT8.53.04.tar
-rw-r--r-- 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar
Of course, other tools such as Parallel BZIP2 (PBZIP2), a parallel implementation of the bzip2 tool, are worth a try too. The idea here is to highlight the fact that there are better tools out there to get the job done quickly compared to the existing/old tools that are bundled with the operating system distribution.
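Because pigz output is gzip-compatible, a script can use pigz when it is installed and fall back to gzip otherwise. The following is a minimal sketch of that idea; the function names are made up for this example, and it assumes only that gzip is on the PATH.

```shell
# pick_gzip: print "pigz" when it is on PATH, otherwise "gzip".
pick_gzip() {
    if command -v pigz >/dev/null 2>&1; then
        echo pigz
    else
        echo gzip
    fi
}

# compress_stream: compress stdin to stdout with whichever tool was found.
compress_stream() {
    "$(pick_gzip)" -c
}
```

A typical use would be `tar cf - somedir | compress_stream > somedir.tar.gz` (somedir is a placeholder); either tool's output can be unpacked with plain gunzip.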

#### [Apr 22, 2018] Happy Sysadmin Appreciation Day 2016 Opensource.com

###### Apr 22, 2018 | opensource.com

Necessity is frequently the mother of invention. I knew very little about BASH scripting but that was about to change rapidly. Working with the existing script and using online help forums, search engines, and some printed documentation, I set up a Linux network-attached storage computer running on Fedora Core. I learned how to create an SSH keypair and configure that along with rsync to move the backup file from the email server to the storage server. That worked well for a few days until I noticed that the storage server's disk space was rapidly disappearing. What was I going to do?

That's when I learned more about Bash scripting. I modified my rsync command to delete backed up files older than ten days. In both cases I learned that a little knowledge can be a dangerous thing, but in each case my experience and confidence as a Linux user and system administrator grew, and due to that I functioned as a resource for others. On the plus side, we soon realized that the disk-to-disk backup system was superior to tape when it came to restoring email files. In the long run it was a win but there was a lot of uncertainty and anxiety along the way.

#### [Apr 04, 2018] The gzip Recovery Toolkit

###### Apr 04, 2018 | www.urbanophile.com

So you thought you had your files backed up - until it came time to restore. Then you found out that you had bad sectors and you've lost almost everything because gzip craps out 10% of the way through your archive. The gzip Recovery Toolkit has a program - gzrecover - that attempts to skip over bad data in a gzip archive. This saved me from exactly the above situation. Hopefully it will help you as well.

I'm very eager for feedback on this program. If you download and try it, I'd appreciate an email letting me know what your results were. My email is arenn@urbanophile.com . Thanks.

ATTENTION

99% of "corrupted" gzip archives are caused by transferring the file via FTP in ASCII mode instead of binary mode. Please re-transfer the file in the correct mode first before attempting to recover from a file you believe is corrupted.

Disclaimer and Warning

This program is provided AS IS with absolutely NO WARRANTY. It is not guaranteed to recover anything from your file, nor is what it does recover guaranteed to be good data. The bigger your file, the more likely that something will be extracted from it. Also keep in mind that this program gets faked out and is likely to "recover" some bad data. Everything should be manually verified.

Note that version 0.8 contains major bug fixes and improvements. See the ChangeLog for details. Upgrading is recommended. The old version is provided in the event you run into troubles with the new release.

You need the following packages:

First, build and install zlib if necessary. Next, unpack the gzrt sources. Then cd to the gzrt directory and build the gzrecover program by typing make . Install manually by copying to the directory of your choice.

Usage

Run gzrecover on a corrupted .gz file. If you leave the filename blank, gzrecover will read from the standard input. Anything that can be read from the file will be written to a file with the same name, but with a .recovered appended (any .gz is stripped). You can override this with the -o option. The default filename when reading from the standard input is "stdin.recovered". To write recovered data to the standard output, use the -p option. (Note that -p and -o cannot be used together).

To get a verbose readout of exactly where gzrecover is finding bad bytes, use the -v option to enable verbose mode. This will probably overflow your screen with text, so it is best to redirect the stderr stream to a file. Once gzrecover has finished, you will need to manually verify any data recovered, as it is quite likely that your output file is corrupt and has some garbage data in it. Note that gzrecover will take longer than regular gunzip. The more corrupt your data, the longer it takes. If your archive is a tarball, read on.

For tarballs, the tar program will choke because GNU tar cannot handle errors in the file format. Fortunately, GNU cpio (tested at version 2.6 or higher) handles corrupted files out of the box.

Here's an example:

$ ls *.gz
my-corrupted-backup.tar.gz
$ gzrecover my-corrupted-backup.tar.gz
$ ls *.recovered
my-corrupted-backup.tar.recovered
$ cpio -F my-corrupted-backup.tar.recovered -i -v


Note that newer versions of cpio can spew voluminous error messages to your terminal. You may want to redirect the stderr stream to /dev/null. Also, cpio might take quite a long while to run.
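Before reaching for gzrecover it is worth confirming that the archive really is damaged (and, as noted above, re-transferring it in binary mode first). A minimal sketch using the standard `gzip -t` integrity test; the function name check_gz is made up for this example and is not part of the toolkit.

```shell
# check_gz FILE: print "ok" if gzip can read FILE end to end, else "damaged".
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo ok
    else
        echo damaged
    fi
}
```

Only files reported as damaged would then be handed to gzrecover.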

The gzip Recovery Toolkit v0.8
Copyright (c) 2002-2013 Aaron M. Renn ( arenn@urbanophile.com )

#### [Jan 14, 2018] How to remount filesystem in read write mode under Linux

###### Jan 14, 2018 | kerneltalks.com

Most of the time, on newly created file systems or NFS file systems, we see an error like the one below:

root@kerneltalks # touch file1
touch: cannot touch 'file1': Read-only file system

This is because the file system is mounted read-only. In such a scenario you have to remount it in read-write mode. Before that, we will see how to check whether a file system is mounted read-only, and then how to remount it as a read-write file system.

How to check if file system is read only

To confirm that a file system is mounted in read-only mode, use the command below:

# cat /proc/mounts | grep datastore
/dev/xvdf /datastore ext3 ro,seclabel,relatime,data=ordered 0 0

Grep for your mount point in /proc/mounts and observe the fourth column, which shows all the options used by the mounted file system. Here ro denotes that the file system is mounted read-only.

You can also get these details using the mount -v command:

root@kerneltalks # mount -v |grep datastore
/dev/xvdf on /datastore type ext3 (ro,relatime,seclabel,data=ordered)

In this output, the file system options are listed in parentheses in the last column.
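The same check can be scripted by reading the options field of /proc/mounts directly. This is a sketch under the assumption of the usual Linux /proc/mounts layout (device, mount point, type, options); the function name mount_mode and its optional second argument are invented here so the logic can be tried against a sample file.

```shell
# mount_mode MOUNTPOINT [MOUNTS_FILE]: print "ro" or "rw" for MOUNTPOINT.
# Reads /proc/mounts unless another file is given; fails if nothing matches.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")          # field 4 holds the mount options
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
        }
        END { if (mode == "") exit 1; print mode }
    ' "${2:-/proc/mounts}"
}
```

For example, `mount_mode /datastore` would print ro for the mount shown above.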

Re-mount file system in read-write mode

To remount the file system in read-write mode, use the command below:

root@kerneltalks # mount -o remount,rw /datastore
root@kerneltalks # mount -v |grep datastore
/dev/xvdf on /datastore type ext3 (rw,relatime,seclabel,data=ordered)

Observe that after remounting, the option ro changed to rw . The file system is now mounted read-write and you can write files to it.

Note : It is recommended to fsck the file system before remounting it.

You can check the file system by running fsck on its volume.

root@kerneltalks # df -h /datastore
Filesystem Size Used Avail Use% Mounted on
/dev/xvda2 10G 881M 9.2G 9% /
root@kerneltalks # fsck /dev/xvdf
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
/dev/xvdf: clean, 12/655360 files, 79696/2621440 blocks

Sometimes corrections need to be made on the file system that require a reboot, to make sure no processes are accessing the file system.

#### [Jan 14, 2018] Linux yes Command Tutorial for Beginners (with Examples)

###### Jan 14, 2018 | www.howtoforge.com

You can see that the user has to type 'y' for each query. It's in situations like these that yes can help. For the above scenario specifically, you can use yes in the following way:

yes | rm -ri test

Q3. Is there any use of yes when it's used alone?

Yes, there's at least one use: to see how well a computer system handles a high load. The reason is that the tool utilizes 100% of the processor on systems that have a single processor. If you want to apply this test on a system with multiple processors, you need to run a yes process for each processor.
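The "one yes process per processor" idea can be sketched as a small function that starts the processes, waits, and cleans up afterwards. This is an illustration only: the name burn_cpus is made up, and the processor count has to be supplied by the caller.

```shell
# burn_cpus COUNT SECONDS: run COUNT copies of yes for SECONDS, then stop them.
burn_cpus() {
    pids=""
    i=0
    while [ "$i" -lt "$1" ]; do
        yes >/dev/null &            # each copy keeps one processor busy
        pids="$pids $!"
        i=$((i + 1))
    done
    sleep "$2"
    kill $pids                      # unquoted on purpose: one PID per word
    wait $pids 2>/dev/null || true  # reap the killed processes
    echo "stopped $1 yes processes"
}
```

On a system with GNU coreutils, `burn_cpus "$(nproc)" 30` would load every processor for 30 seconds; nproc is an assumption here and may not exist elsewhere.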

#### [Dec 09, 2017] How to rsync only a specific list of files - Stack Overflow

##### "... The filenames that are read from the FILE are all relative to the source dir ..."
###### Dec 09, 2017 | stackoverflow.com

ash, May 11, 2015 at 20:05

There is a flag --files-from that does exactly what you want. From man rsync :
--files-from=FILE

Using this option allows you to specify the exact list of files to transfer (as read from the specified FILE or - for standard input). It also tweaks the default behavior of rsync to make transferring just the specified files and directories easier:

• The --relative (-R) option is implied, which preserves the path information that is specified for each item in the file (use --no-relative or --no-R if you want to turn that off).
• The --dirs (-d) option is implied, which will create directories specified in the list on the destination rather than noisily skipping them (use --no-dirs or --no-d if you want to turn that off).
• The --archive (-a) option's behavior does not imply --recursive (-r), so specify it explicitly, if you want it.
• These side-effects change the default state of rsync, so the position of the --files-from option on the command-line has no bearing on how other options are parsed (e.g. -a works the same before or after --files-from, as does --no-R and all other options).

The filenames that are read from the FILE are all relative to the source dir -- any leading slashes are removed and no ".." references are allowed to go higher than the source dir. For example, take this command:

rsync -a --files-from=/tmp/foo /usr remote:/backup

If /tmp/foo contains the string "bin" (or even "/bin"), the /usr/bin directory will be created as /backup/bin on the remote host. If it contains "bin/" (note the trailing slash), the immediate contents of the directory would also be sent (without needing to be explicitly mentioned in the file -- this began in version 2.6.4). In both cases, if the -r option was enabled, that dir's entire hierarchy would also be transferred (keep in mind that -r needs to be specified explicitly with --files-from, since it is not implied by -a). Also note that the effect of the (enabled by default) --relative option is to duplicate only the path info that is read from the file -- it does not force the duplication of the source-spec path (/usr in this case).

In addition, the --files-from file can be read from the remote host instead of the local host if you specify a "host:" in front of the file (the host must match one end of the transfer). As a short-cut, you can specify just a prefix of ":" to mean "use the remote end of the transfer". For example:

rsync -a --files-from=:/path/file-list src:/ /tmp/copy

This would copy all the files specified in the /path/file-list file that was located on the remote "src" host.

If the --iconv and --protect-args options are specified and the --files-from filenames are being sent from one host to another, the filenames will be translated from the sending host's charset to the receiving host's charset.

NOTE: sorting the list of files in the --files-from input helps rsync to be more efficient, as it will avoid re-visiting the path elements that are shared between adjacent entries. If the input is not sorted, some path elements (implied directories) may end up being scanned multiple times, and rsync will eventually unduplicate them after they get turned into file-list elements.

Nicolas Mattia, Feb 11, 2016 at 11:06

Note that you still have to specify the directory where the files listed are located, for instance: rsync -av --files-from=file-list . target/ for copying files from the current dir. – Nicolas Mattia Feb 11 '16 at 11:06

ash, Feb 12, 2016 at 2:25

Yes, and to reiterate: The filenames that are read from the FILE are all relative to the source dir . – ash Feb 12 '16 at 2:25

Michael ,Nov 2, 2016 at 0:09

if the files-from file has anything starting with .. rsync appears to ignore the .. giving me an error like rsync: link_stat "/home/michael/test/subdir/test.txt" failed: No such file or directory (in this case running from the "test" dir and trying to specify "../subdir/test.txt" which does exist. – Michael Nov 2 '16 at 0:09

xxx,

With --files-from=, give / as the source directory if you want to keep the absolute paths intact. So your command would become something like below:
rsync -av --files-from=/path/to/file / /tmp/

This is useful when there are a large number of files and you want to copy all of them to some path x. You would find the files and write the output to a file like below:

find /var/* -name '*.log' > file
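Since the names in a --files-from list are read relative to the source directory, and a sorted list helps rsync, the list can be generated in exactly that form. A minimal sketch; the function name make_file_list and the paths in the usage note are placeholders.

```shell
# make_file_list SRC PATTERN: print files under SRC matching PATTERN,
# as sorted paths relative to SRC (the form --files-from expects).
make_file_list() {
    ( cd "$1" && find . -type f -name "$2" | sed 's|^\./||' | LC_ALL=C sort )
}
```

It could be used as `make_file_list /var '*.log' > /tmp/list` followed by `rsync -av --files-from=/tmp/list /var remote:/backup`. Quoting the pattern keeps the shell from expanding it before find sees it.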

#### [Nov 13, 2017] 20 Sed (Stream Editor) Command Examples for Linux Users

###### Nov 13, 2017 | www.linuxtechi.com

20 Sed (Stream Editor) Command Examples for Linux Users

by Pradeep Kumar · Published November 9, 2017 · Updated November 9, 2017

The sed command, or Stream Editor, is a very powerful utility offered by Linux/Unix systems. It is mainly used for text substitution (find & replace), but it can also perform other text manipulations like insertion, deletion, search etc. With sed, we can edit complete files without actually having to open them. Sed also supports the use of regular expressions, which makes it an even more powerful text manipulation tool.

In this article, we will learn to use the sed command with the help of some examples. The basic syntax for using the sed command is,

sed OPTIONS... [SCRIPT] [INPUTFILE...]

Now let's see some examples.

Example :1) Displaying partial text of a file

With sed, we can view only some part of a file rather than the whole file. To see some lines of the file, use the following command,

[linuxtechi@localhost ~]$ sed -n 22,29p testfile.txt

Here, option 'n' suppresses printing of the whole file and option 'p' prints only lines 22 to 29.

Example :2) Display all except some lines

To display all content of a file except for some portion, use the following command,

[linuxtechi@localhost ~]$ sed 22,29d testfile.txt

Option 'd' will remove the mentioned lines from output.

Example :3) Display every 3rd line starting with Nth line

To display every 3rd line starting with line number 2 (or any other line), use the following command. The first~step address form used here is a GNU sed extension.

[linuxtechi@localhost ~]$ sed -n '2~3p' file.txt

Example :4) Deleting a line using sed command

To delete a line with sed from a file, use the following command,

[linuxtechi@localhost ~]$ sed Nd testfile.txt

where 'N' is the line number and option 'd' deletes that line. To delete the last line of the file, use

[linuxtechi@localhost ~]$ sed '$d' testfile.txt

Example :5) Deleting a range of lines

To delete a range of lines from the file, run

[linuxtechi@localhost ~]$ sed '29,34d' testfile.txt

This will print testfile.txt without lines 29 to 34.

Example :6) Deleting lines other than the mentioned

To delete lines other than the mentioned lines from a file, we will use '!'

[linuxtechi@localhost ~]$ sed '29,34!d' testfile.txt

Here '!' negates the address, so the command applies to every line that is not in the range: all lines other than 29-34 are deleted and only lines 29-34 are printed.
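The address forms from Examples 3 to 6 are easy to try on a numbered input. A small sketch, assuming GNU sed (for the first~step form) and the seq and paste utilities; the variable names are arbitrary.

```shell
# Ten numbered lines stand in for a real file; paste joins the result on one line.
every_third=$(seq 1 10 | sed -n '2~3p' | paste -sd' ' -)   # every 3rd line from line 2
kept_range=$(seq 1 10 | sed '3,5!d' | paste -sd' ' -)      # everything outside 3-5 deleted
without_last=$(seq 1 10 | sed '$d' | paste -sd' ' -)       # last line deleted

printf '%s\n' "$every_third" "$kept_range" "$without_last"
# prints:
# 2 5 8
# 3 4 5
# 1 2 3 4 5 6 7 8 9
```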

Example :7) Adding blank lines

To add a blank line after every line, we will use option 'G',

[linuxtechi@localhost ~]$ sed G testfile.txt

Example :8) Search and Replacing a string using sed

To search & replace a string from the file, we will use the following example,

[linuxtechi@localhost ~]$ sed 's/danger/safety/' testfile.txt

Here option 's' searches for the word 'danger' and replaces it with 'safety' on every line, for the first occurrence only.

Example :9) Search and replace a string from whole file using sed

To replace every occurrence of the word in the file, we will use option 'g' with 's',

[linuxtechi@localhost ~]$ sed 's/danger/safety/g' testfile.txt

Example :10) Replace the nth occurrence of string pattern

We can also substitute only the nth occurrence of a string on each line. For example, to replace 'danger' with 'safety' only on its second occurrence,

[linuxtechi@localhost ~]$ sed 's/danger/safety/2' testfile.txt

To replace the 2nd and all later occurrences of 'danger' on every line (GNU sed), use

[linuxtechi@localhost ~]$ sed 's/danger/safety/2g' testfile.txt

Example :11) Replace a string on a particular line

To replace a string only on a particular line, use

[linuxtechi@localhost ~]$ sed '4 s/danger/safety/' testfile.txt


This will only substitute the string from 4th line of the file. We can also mention a range of lines instead of a single line,

[linuxtechi@localhost ~]$sed '4-9 s/danger/safety/' testfile.txt  Example :12) Add a line after/before the matched search To add a new line with some content after every pattern match, use option 'a' , [linuxtechi@localhost ~]$ sed '/danger/a "This is new line with text after match"' testfile.txt


To add a new line with some content a before every pattern match, use option 'i',

[linuxtechi@localhost ~]$sed '/danger/i "This is new line with text before match" ' testfile.txt  Example :13) Change a whole line with matched pattern To change a whole line to a new line when a search pattern matches we need to use option 'c' with sed, [linuxtechi@localhost ~]$ sed '/danger/c "This will be the new line" ' testfile.txt


So when the pattern matches 'danger', whole line will be changed to the mentioned line.
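The substitution flags and the line address described above can be compared side by side. The following is a minimal sketch, assuming a made-up sample line rather than the article's testfile.txt; the variable names are illustrative only.

```shell
# Hypothetical sample text, so the demo does not depend on any real file.
line='danger one danger two danger three'

first=$(printf '%s\n' "$line" | sed 's/danger/safety/')     # first occurrence only
all=$(printf '%s\n' "$line" | sed 's/danger/safety/g')      # every occurrence
second=$(printf '%s\n' "$line" | sed 's/danger/safety/2')   # second occurrence only

# A line address restricts the substitution to that line (here, line 2).
addressed=$(printf 'danger\ndanger\n' | sed '2 s/danger/safety/')

printf '%s\n' "$first" "$all" "$second" "$addressed"
```

Piping a known string through sed like this is a quick way to check an expression before pointing it at a real file.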

Up until now we were only using simple expressions with sed, now we will discuss some advanced uses of sed with regex,

Example :14) Running multiple sed commands

If we need to perform multiple sed expressions, we can use option '-e' to chain the sed commands,

[linuxtechi@localhost ~]$ sed -e 's/danger/safety/g' -e 's/hate/love/' testfile.txt

Example :15) Making a backup copy before editing a file

To create a backup copy of a file before we edit it, use option '-i.bak',

[linuxtechi@localhost ~]$ sed -i.bak -e 's/danger/safety/g' testfile.txt


This will create a backup copy of the file with the extension .bak. You can also use another extension if you like.
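To see the in-place edit and its backup together, here is a small sketch that works in a throwaway directory. The directory and file contents are made up for the demonstration; only the sed invocation comes from the example above.

```shell
# Work in a temporary directory so no real file is touched.
workdir=$(mktemp -d)
printf 'danger ahead\n' > "$workdir/testfile.txt"

# Edit in place, keeping the original content as testfile.txt.bak.
sed -i.bak -e 's/danger/safety/g' "$workdir/testfile.txt"

edited=$(cat "$workdir/testfile.txt")
backup=$(cat "$workdir/testfile.txt.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

rm -rf "$workdir"
```

The backup holds the pre-edit content, so a bad expression can be undone by moving the .bak file back into place.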

Example :16) Delete a file line starting with & ending with a pattern

To delete a file line starting with a particular string & ending with another string, use

[linuxtechi@localhost ~]$ sed -e '/^danger.*stops$/d' testfile.txt

This will delete every line with 'danger' at the start & 'stops' at the end; it can have any number of words in between, '.*' defines that part. (By contrast, 's/danger.*stops//g' would only remove the matched text & leave the rest of the line in place.)

Example :17) Appending lines

To add some content before every line with sed & regex, use

[linuxtechi@localhost ~]$ sed -e 's/.*/testing sed &/' testfile.txt


So now every line will have 'testing sed' before it.

Example :18) Removing all commented lines & empty lines

To remove all commented lines i.e. lines with # & all the empty lines, use

[linuxtechi@localhost ~]$sed -e 's/#.*//;/^$/d' testfile.txt


To only remove commented lines, use

[linuxtechi@localhost ~]$ sed -e 's/#.*//' testfile.txt

Example :19) Get list of all usernames from /etc/passwd

To get the list of all usernames from /etc/passwd file, use

[linuxtechi@localhost ~]$ sed 's/\([^:]*\).*/\1/' /etc/passwd

A complete list of all usernames will be generated on screen as output.
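The username extraction relies on a capture group that keeps everything before the first colon. The sketch below runs that idea on two made-up passwd-style records (not read from the real /etc/passwd) and compares it with cut(1), which is shown only as an equivalent, not as the article's method.

```shell
# Hypothetical passwd-style records, so the demo does not depend on the host.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# Keep the text before the first colon, using a capture group.
with_sed=$(printf '%s\n' "$sample" | sed 's/\([^:]*\).*/\1/')

# The same first-field extraction with cut.
with_cut=$(printf '%s\n' "$sample" | cut -d: -f1)

printf '%s\n' "$with_sed"
```

Both commands print one username per line; cut is arguably easier to read when the data is strictly colon-delimited.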

Example :20) Prevent overwriting of system links with sed command

The 'sed -i' command has been known to remove symbolic links & create only regular files in place of the link file. So to avoid such a situation & prevent 'sed -i' from destroying the links, use the '--follow-symlinks' option with the command being executed.

Let's assume I want to disable SELinux on CentOS or RHEL servers,

[linuxtechi@localhost ~]# sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
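The effect of the option can be checked safely on a temporary symlink instead of the real SELinux configuration. This is a sketch that assumes GNU sed (the --follow-symlinks option is a GNU extension); the file names under the temporary directory are made up.

```shell
# Build a throwaway file and a symlink pointing at it.
workdir=$(mktemp -d)
printf 'SELINUX=enforcing\n' > "$workdir/config"
ln -s config "$workdir/selinux"

# GNU sed only: edit the link's target and leave the link itself in place.
sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/' "$workdir/selinux"

# Record whether the symlink survived and what the target now contains.
if [ -L "$workdir/selinux" ]; then link_state=kept; else link_state=replaced; fi
target_content=$(cat "$workdir/config")
printf '%s %s\n' "$link_state" "$target_content"

rm -rf "$workdir"
```

Running the same commands without --follow-symlinks shows the difference: the link is then replaced by a regular file and the original target stays unchanged.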

These were some examples to show sed; we can use them as a reference & employ them as & when needed. If you have any queries related to this or any article, do share with us.

#### [Nov 09, 2017] TERM strings by Tom Ryder

###### Jan 26, 2013 | sanctum.geek.nz

A certain piece of very misleading advice is often given online to users having problems with the way certain command-line applications are displaying in their terminals. This is to suggest that the user change the value of their TERM environment variable from within the shell, doing something like this:

$ TERM=xterm-256color

This misinformation sometimes extends to suggesting that users put the forced TERM change into their shell startup scripts. The reason this is such a bad idea is that it forces your shell to assume what your terminal is, and thereby disregards the initial terminal identity string sent by the emulator. This leads to a lot of confusion when one day you need to connect with a very different terminal emulator.

Accounting for differences

All terminal emulators are not created equal. Certainly, not all of them are xterm(1), although many other terminal emulators do a decent but not comprehensive job of copying it. The value of the TERM environment variable is used by the system running the shell to determine what the terminal connecting to it can and cannot do, what control codes to send to the program to use those features, and how the shell should understand the input of certain key codes, such as the Home and End keys. These things in particular are common causes of frustration for new users who turn out to be using a forced TERM string.

Instead, focus on these two guidelines for setting TERM:

1. Avoid setting TERM from within the shell, especially in your startup scripts like .bashrc or .bash_profile. If that ever seems like the answer, then you are probably asking the wrong question! The terminal identification string should always be sent by the terminal emulator you are using; if you do need to change it, then change it in the settings for the emulator.
2. Always use an appropriate TERM string that accurately describes what your choice of terminal emulator can and cannot display. Don't make an rxvt(1) terminal identify itself as xterm; don't make a linux console identify itself as vt100; and don't make an xterm(1) compiled without 256 color support refer to itself as xterm-256color.

In particular, note that sometimes for compatibility reasons, the default terminal identification used by an emulator is given as something generic like xterm, when in fact a more accurate or comprehensive terminal identity file is more than likely available for your particular choice of terminal emulator with a little searching. An example that surprises a lot of people is the availability of the putty terminal identity file, when the application defaults to presenting itself as an imperfect xterm(1) emulator.

Configuring your emulator's string

Before you change your terminal string in its settings, check whether the default it uses is already the correct one, with one of these:

$ echo $TERM
$ tset -q


Most builds of rxvt(1) , for example, should already use the correct TERM string by default, such as rxvt-unicode-256color for builds with 256 colors and Unicode support.
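Beyond printing the TERM value, it can help to confirm that the system actually has a terminfo(5) entry for it. The sketch below uses infocmp(1) from ncurses for that; the helper name have_terminfo is made up for this example, and the fallback to "dumb" when TERM is unset is an assumption.

```shell
# Report whether the local terminfo database has an entry for a given name.
# infocmp(1) ships with ncurses and exits non-zero for unknown terminal types.
have_terminfo() {
    infocmp "$1" >/dev/null 2>&1
}

if have_terminfo "${TERM:-dumb}"; then
    echo "terminfo entry found for ${TERM:-dumb}"
else
    echo "no terminfo entry for ${TERM:-dumb}"
fi
```

A missing entry usually means the definition needs to be installed on that machine, not that TERM should be overridden in the shell.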

Where to configure which TERM string your terminal uses will vary depending on the application. For xterm(1) , your .Xresources file should contain a definition like the below:

XTerm*termName: xterm-256color


For rxvt(1) , the syntax is similar:

URxvt*termName: rxvt-unicode-256color


Other GTK and Qt emulators sometimes include the setting somewhere in their preferences. Look for mentions of xterm , a common fallback default.

For Windows PuTTY, it's configurable under the "Connection > Data" section:

More detail about configuring PuTTY for connecting to modern systems can be found in my article on configuring PuTTY .

Testing your TERM string

On GNU/Linux systems, an easy way to test the terminal capabilities (particularly effects like colors and reverse video) is using the msgcat(1) utility:

$ msgcat --color=test

This will output a large number of tests of various features to the terminal, so that you can check their appearance is what you expect.

Finding appropriate terminfo(5) definitions

On GNU/Linux systems, the capabilities and behavior of various terminal types are described using terminfo(5) files, usually installed as part of the ncurses package. These files are often installed in /lib/terminfo or /usr/share/terminfo, in subdirectories by first letter.

In order to use a particular TERM string, an appropriate file must exist in one of these directories. On Debian-derived systems, a large collection of terminal types can be installed to the system with the ncurses-term package.

For example, the following variants of the rxvt terminal emulator are all available:

$ cd /usr/share/terminfo/r
$ ls rxvt*
rxvt-16color rxvt-256color rxvt-88color rxvt-color rxvt-cygwin rxvt-cygwin-native rxvt+pcfkeys rxvt-unicode-256color rxvt-xpm

Private and custom terminfo(5) files

If you connect to a system that doesn't have a terminfo(5) definition to match the TERM definition for your particular terminal, you might get a message similar to this on login:

setterm: rxvt-unicode-256color: unknown terminal type
tput: unknown terminal "rxvt-unicode-256color"
$


If you're not able to install the appropriate terminal definition system-wide, one technique is to use a private .terminfo directory in your home directory containing the definitions you need:

$ cd ~/.terminfo
$ find
.
./x
./x/xterm-256color
./x/xterm
./r
./r/rxvt-256color
./r/rxvt-unicode-256color
./r/rxvt
./s
./s/screen
./s/screen-256color
./p
./p/putty-256color
./p/putty


You can copy this to your home directory on the servers you manage with a tool like scp :

$ scp -r .terminfo server:

TERM and multiplexers

Terminal multiplexers like screen(1) and tmux(1) are special cases, and they cause perhaps the most confusion to people when inaccurate TERM strings are used. The tmux FAQ even opens by saying that most of the display problems reported by people are due to incorrect TERM settings, and a good portion of the codebase in both multiplexers is dedicated to negotiating the differences between terminal capacities.

This is because they are "terminals within terminals", and provide their own functionality only within the bounds of what the outer terminal can do. In addition to this, they have their own type for terminals within them; both of them use screen and its variants, such as screen-256color.

It's therefore very important to check that both the outer and inner definitions for TERM are correct. In .screenrc it usually suffices to use a line like the following:

term screen

Or in .tmux.conf:

set-option -g default-terminal screen

If the outer terminals you use consistently have 256 color capabilities, you may choose to use the screen-256color variant instead.

If you follow all of these guidelines, your terminal experience will be much smoother, as your terminal and your system will understand each other that much better. You may find that this fixes a lot of struggles with interactive tools like vim(1), for one thing, because if the application is able to divine things like the available color space directly from terminal information files, it saves you from having to include nasty hacks on the t_Co variable in your .vimrc.

#### [Nov 09, 2017] PuTTY configuration by Tom Ryder

###### Dec 22, 2012 | sanctum.geek.nz

PuTTY is a terminal emulator with a free software license, including an SSH client. While it has cross-platform ports, it's used most frequently on Windows systems, because they otherwise lack a built-in terminal emulator that interoperates well with Unix-style TTY systems.

While it's very popular and useful, PuTTY's defaults are quite old, and are chosen for compatibility reasons rather than to take advantage of all the features of a more complete terminal emulator. For new users, this is likely an advantage as it can avoid confusion, but more advanced users who need to use a Windows client to connect to a modern GNU/Linux system may find the defaults frustrating, particularly when connecting to a more capable and custom-configured server.

Here are a few of the problems with the default configuration:

• It identifies itself as an xterm(1), when terminfo(5) definitions are available named putty and putty-256color, which more precisely define what the terminal can and cannot do, and their various custom escape sequences.
• It only allows 16 colors, where most modern terminals are capable of using 256; this is partly tied into the terminal type definition.
• It doesn't use UTF-8 by default, which should be used whenever possible for reasons of interoperability and compatibility, and is well-supported by modern locale definitions on GNU/Linux.
• It uses Courier New, a workable but rather harsh monospace font, which should be swapped out for something more modern if available.
• It uses audible terminal bells, which tend to be annoying.
• Its default palette based on xterm(1) is rather garish and harsh; softer colors are more pleasant to read.

All of these things are fixable.

Terminal type

Usually the most important thing in getting a terminal working smoothly is to make sure it identifies itself correctly to the machine to which it's connecting, using an appropriate $TERM string. By default, PuTTY identifies itself as an xterm(1) terminal emulator, which most systems will support.

However, there's a terminfo(5) definition for putty and putty-256color available as part of ncurses , and if you have it available on your system then you should use it, as it slightly more precisely describes the features available to PuTTY as a terminal emulator.

You can check that you have the appropriate terminfo(5) definition installed by looking in /usr/share/terminfo/p :

$ ls -1 /usr/share/terminfo/p/putty*
/usr/share/terminfo/p/putty
/usr/share/terminfo/p/putty-256color
/usr/share/terminfo/p/putty-sco
/usr/share/terminfo/p/putty-vt100

On Debian and Ubuntu systems, these files can be installed with:

# apt-get install ncurses-term

If you can't install the files via your system's package manager, you can also keep a private repository of terminfo(5) files in your home directory, in a directory called .terminfo:

$ ls -1 $HOME/.terminfo/p
putty
putty-256color

Once you have this definition installed, you can instruct PuTTY to identify with that $TERM string in the Connection > Data section:

Here, I've used putty-256color ; if you don't need or want a 256 color terminal you could just use putty .

Once connected, make sure that your $TERM string matches what you specified, and hasn't been mangled by any of your shell or terminal configurations:

$ echo $TERM
putty-256color

Color space

Certain command line applications like Vim and Tmux can take advantage of a full 256 colors in the terminal. If you'd like to use this, set PuTTY's $TERM string to putty-256color as outlined above, and select Allow terminal to use xterm 256-colour mode in Window > Colours.

You can test this is working by using a 256 color application, or by trying out the terminal colours directly in your shell using tput :

$ for ((color = 0; color <= 255; color++)); do
> tput setaf "$color"
> printf "test"
> done


If you see the word test in many different colors, then things are probably working. Type reset to fix your terminal after this:

$ reset

Using UTF-8

If you're connecting to a modern GNU/Linux system, it's likely that you're using a UTF-8 locale. You can check which one by typing locale. In my case, I'm using the en_NZ locale with UTF-8 character encoding:

$ locale
LANG=en_NZ.UTF-8
LANGUAGE=en_NZ:en
LC_CTYPE="en_NZ.UTF-8"
LC_NUMERIC="en_NZ.UTF-8"
LC_TIME="en_NZ.UTF-8"
LC_COLLATE="en_NZ.UTF-8"
LC_MONETARY="en_NZ.UTF-8"
LC_MESSAGES="en_NZ.UTF-8"
LC_PAPER="en_NZ.UTF-8"
LC_NAME="en_NZ.UTF-8"
LC_TELEPHONE="en_NZ.UTF-8"
LC_MEASUREMENT="en_NZ.UTF-8"
LC_IDENTIFICATION="en_NZ.UTF-8"
LC_ALL=
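
If you would rather test the encoding in a script than read the full listing, the standard locale charmap query prints just the codeset name. The following is a sketch; the helper name is_utf8 and the accepted spellings of the codeset are assumptions for this example.

```shell
# Succeed when the given codeset name is a spelling of UTF-8
# (case and hyphenation vary between systems).
is_utf8() {
    case $1 in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# "locale charmap" prints the codeset of the current locale, e.g. UTF-8.
charmap=$(locale charmap 2>/dev/null)
if is_utf8 "$charmap"; then
    echo "UTF-8 locale in use"
else
    echo "non-UTF-8 codeset: ${charmap:-unknown}"
fi
```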


If the output of locale does show you're using a UTF-8 character encoding, then you should configure PuTTY to interpret terminal output using that character set; it can't detect it automatically (which isn't PuTTY's fault; it's a known hard problem). You do this in the Window > Translation section:

While you're in this section, it's best to choose the Use Unicode line drawing code points option as well. Line-drawing characters are most likely to work properly with this setting for UTF-8 locales and modern fonts:

If Unicode and its various encodings is new to you, I highly recommend Joel Spolsky's classic article about what programmers should know about both.

Fonts

Courier New is a workable monospace font, but modern Windows systems include Consolas , a much nicer terminal font. You can change this in the Window > Appearance section:

There's no reason you can't use another favourite Bitmap or TrueType font instead once it's installed on your system; DejaVu Sans Mono , Inconsolata , and Terminus are popular alternatives. I personally favor Ubuntu Mono .

Bells

Terminal bells by default in PuTTY emit the system alert sound. Most people find this annoying; some sort of visual bell tends to be much better if you want to use the bell at all. Configure this in Terminal > Bell

Given the purpose of the alert is to draw attention to the window, I find that using a flashing taskbar icon works well; I use this to draw my attention to my prompt being displayed after a long task completes, or if someone mentions my name or directly messages me in irssi(1) .

Another option is using the Visual bell (flash window) option, but I personally find this even worse than the audible bell.

Default palette

The default colours for PuTTY are rather like those used in xterm(1) , and hence rather harsh, particularly if you're used to the slightly more subdued colorscheme of terminal emulators like gnome-terminal(1) , or have customized your palette to something like Solarized .

If you have decimal RGB values for the colours you'd prefer to use, you can enter those in the Window > Colours section, making sure that Use system colours and Attempt to use logical palettes are unchecked:

There are a few other default annoyances in PuTTY, but the above are the ones that seem to annoy advanced users most frequently. Dag Wieers has a similar post with a few more defaults to fix.

#### [Nov 09, 2017] Searching files

##### "... Unix purists might be displeased with my even mentioning a relatively new Perl script alternative to classic grep , but I don't believe that the Unix philosophy or using Unix as an IDE is dependent on sticking to the same classic tools when alternatives with the same spirit that solve new problems are available. ..."
###### sanctum.geek.nz

More often than attributes of a set of files, however, you want to find files based on their contents, and it's no surprise that grep, in particular grep -R, is useful here. This searches the current directory tree recursively for anything matching 'someVar':

$ grep -FR someVar .

Don't forget the case insensitivity flag either, since by default grep matches case-sensitively:

$ grep -iR somevar .


Also, you can print a list of files that match without printing the matches themselves with grep -l:

$ grep -lR someVar .

If you write scripts or batch jobs using the output of the above, use a while loop with read to handle spaces and other special characters in filenames:

grep -lR someVar . | while IFS= read -r file; do
    head "$file"
done
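
A line-based read loop still cannot cope with filenames that contain newlines. One alternative, sketched here under the assumption that GNU grep and an xargs with -0 support are available, is to have grep end each filename with a NUL byte. The temporary directory and file names are made up for the demonstration.

```shell
# Create a throwaway tree with one matching file whose name contains a space.
workdir=$(mktemp -d)
printf 'someVar = 1\n' > "$workdir/my file.txt"
printf 'nothing here\n' > "$workdir/other.txt"

# -Z (GNU grep) ends each filename with a NUL byte; xargs -0 splits on it,
# so whitespace inside a name cannot break it apart.
matches=$(grep -lRZ someVar "$workdir" | xargs -0 -n 1 basename)
printf '%s\n' "$matches"

rm -rf "$workdir"
```

Here basename merely stands in for whatever command should run once per matching file.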


If you're using version control for your project, this often includes metadata in the .svn, .git, or .hg directories. This is dealt with easily enough by excluding (grep -v) anything matching an appropriate fixed (grep -F) string:

$ grep -R someVar . | grep -vF .svn

Some versions of grep include --exclude and --exclude-dir options, which may be tidier.

With all this said, there's a very popular alternative to grep called ack, which excludes this sort of stuff for you by default. It also allows you to use Perl-compatible regular expressions (PCRE), which are a favourite for many programmers. It has a lot of utilities that are generally useful for working with source code, so while there's nothing wrong with good old grep since you know it will always be there, if you can install ack I highly recommend it. There's a Debian package called ack-grep, and being a Perl script it's otherwise very simple to install.

Unix purists might be displeased with my even mentioning a relatively new Perl script alternative to classic grep, but I don't believe that the Unix philosophy or using Unix as an IDE is dependent on sticking to the same classic tools when alternatives with the same spirit that solve new problems are available.

#### [Nov 01, 2017] Cron best practices by Tom Ryder

###### May 08, 2016 | sanctum.geek.nz

The time-based job scheduler cron(8) has been around since Version 7 Unix, and its crontab(5) syntax is familiar even for people who don't do much Unix system administration. It's standardised, reasonably flexible, simple to configure, and works reliably, and so it's trusted by both system packages and users to manage many important tasks.

However, like many older Unix tools, cron(8)'s simplicity has a drawback: it relies upon the user to know some detail of how it works, and to correctly implement any other safety checking behaviour around it. Specifically, all it does is try and run the job at an appropriate time, and email the output. For simple and unimportant per-user jobs, that may be just fine, but for more crucial system tasks it's worthwhile to wrap a little extra infrastructure around it and the tasks it calls.
There are a few ways to make the way you use cron(8) more robust if you're in a situation where keeping track of the running job is desirable.

Apply the principle of least privilege

The sixth column of a system crontab(5) file is the username of the user as which the task should run:

0 * * * * root cron-task

To the extent that is practical, you should run the task as a user with only the privileges it needs to run, and nothing else. This can sometimes make it worthwhile to create a dedicated system user purely for running scheduled tasks relevant to your application.

0 * * * * myappcron cron-task

This is not just for security reasons, although those are good ones; it helps protect you against nasties like scripting errors attempting to remove entire system directories.

Similarly, for tasks with database systems such as MySQL, don't use the administrative root user if you can avoid it; instead, use or even create a dedicated user with a unique random password stored in a locked-down ~/.my.cnf file, with only the needed permissions. For a MySQL backup task, for example, only a few permissions should be required, including SELECT, SHOW VIEW, and LOCK TABLES.

In some cases, of course, you really will need to be root. In particularly sensitive contexts you might even consider using sudo(8) with appropriate NOPASSWD options, to allow the dedicated user to run only the appropriate tasks as root, and nothing else.

Test the tasks

Before placing a task in a crontab(5) file, you should test it on the command line, as the user configured to run the task and with the appropriate environment set. If you're going to run the task as root, use something like su or sudo -i to get a root shell with the user's expected environment first:

$ sudo -i -u cronuser
$ cron-task

Once the task works on the command line, place it in the crontab(5) file with the timing settings modified to run the task a few minutes later, and then watch /var/log/syslog with tail -f to check that the task actually runs without errors, and that the task itself completes properly:

May 7 13:30:01 yourhost CRON[20249]: (you) CMD (cron-task)

This may seem pedantic at first, but it becomes routine very quickly, and it saves a lot of hassles down the line as it's very easy to make an assumption about something in your environment that doesn't actually hold in the one that cron(8) will use. It's also a necessary acid test to make sure that your crontab(5) file is well-formed, as some implementations of cron(8) will refuse to load the entire file if one of the lines is malformed.

If necessary, you can set arbitrary environment variables for the tasks at the top of the file:

MYVAR=myvalue

0 * * * * you cron-task

Don't throw away errors or useful output

You've probably seen tutorials on the web where in order to keep the crontab(5) job from sending standard output and/or standard error emails every five minutes, shell redirection operators are included at the end of the job specification to discard both the standard output and standard error. This kluge is particularly common for running web development tasks by automating a request to a URL with curl(1) or wget(1):

*/5 * * * * root curl https://example.com/cron.php >/dev/null 2>&1

Ignoring the output completely is generally not a good idea, because unless you have other tasks or monitoring ensuring the job does its work, you won't notice problems (or know what they are), when the job emits output or errors that you actually care about.

In the case of curl(1), there are just way too many things that could go wrong, that you might notice far too late:

• The script could get broken and return 500 errors.
• The URL of the cron.php task could change, and someone could forget to add a HTTP 301 redirect.
• Even if a HTTP 301 redirect is added, if you don't use -L or --location for curl(1), it won't follow it.
• The client could get blacklisted, firewalled, or otherwise impeded by automatic or manual processes that falsely flag the request as spam.
• If using HTTPS, connectivity could break due to cipher or protocol mismatch.

The author has seen all of the above happen, in some cases very frequently.

As a general policy, it's worth taking the time to read the manual page of the task you're calling, and to look for ways to correctly control its output so that it emits only the output you actually want. In the case of curl(1), for example, I've found the following formula works well:

curl -fLsS -o /dev/null http://example.com/

• -f : If the HTTP response code is an error, emit an error message rather than the 404 page.
• -L : If there's an HTTP 301 redirect given, try to follow it.
• -sS : Don't show progress meter (-S stops -s from also blocking error messages).
• -o /dev/null : Send the standard output (the actual page returned) to /dev/null.

This way, the curl(1) request should stay silent if everything is well, per the old Unix philosophy Rule of Silence.

You may not agree with some of the choices above; you might think it important to e.g. log the complete output of the returned page, or to fail rather than silently accept a 301 redirect, or you might prefer to use wget(1). The point is that you take the time to understand in more depth what the called program will actually emit under what circumstances, and make it match your requirements as closely as possible, rather than blindly discarding all the output and (worse) the errors. Work with Murphy's law; assume that anything that can go wrong eventually will.

Send the output somewhere useful

Another common mistake is failing to set a useful MAILTO at the top of the crontab(5) file, as the specified destination for any output and errors from the tasks.
cron(8) uses the system mail implementation to send its messages, and typically, default configurations for mail agents will simply deliver the message to an mbox file in /var/mail/$USER, which the user may never read. This defeats much of the point of mailing output and errors.

This is easily dealt with, though; ensure that you can send a message to an address you actually do check from the server, perhaps using mail(1) :

$ printf '%s\n' 'Test message' | mail -s 'Test subject' you@example.com

Once you've verified that your mail agent is correctly configured and that the mail arrives in your inbox, set the address in a MAILTO variable at the top of your file:

MAILTO=you@example.com

0 * * * * you cron-task-1
*/5 * * * * you cron-task-2

If you don't want to use email for routine output, another method that works is sending the output to syslog with a tool like logger(1):

0 * * * * you cron-task | logger -it cron-task

Alternatively, you can configure aliases on your system to forward system mail destined for you on to an address you check. For Postfix, you'd use an aliases(5) file.

I sometimes use this setup in cases where the task is expected to emit a few lines of output which might be useful for later review, but send stderr output via MAILTO as normal. If you'd rather not use syslog, perhaps because the output is high in volume and/or frequency, you can always set up a log file /var/log/cron-task.log, but don't forget to add a logrotate(8) rule for it!

Put the tasks in their own shell script file

Ideally, the commands in your crontab(5) definitions should only be a few words, in one or two commands. If the command is running off the screen, it's likely too long to be in the crontab(5) file, and you should instead put it into its own script. This is a particularly good idea if you want to reliably use features of bash or some other shell besides POSIX/Bourne /bin/sh for your commands, or even a scripting language like Awk or Perl; by default, cron(8) uses the system's /bin/sh implementation for parsing the commands.

Because crontab(5) files don't allow multi-line commands, and have other gotchas like the need to escape percent signs % with backslashes, keeping as much configuration out of the actual crontab(5) file as you can is generally a good idea.
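
The percent-sign gotcha is a good illustration of why a script file helps. The sketch below shows what such a script might contain; the file name cron-task, the stamp helper, and the backup file naming are all hypothetical.

```shell
#!/bin/sh
# Hypothetical ~/.local/bin/cron-task: the work lives here, not in the crontab.

# In a script, % needs no escaping; written directly in a crontab line,
# each % in this date format would have to become \%.
stamp() {
    date +%Y-%m-%d
}

backup_name="backup-$(stamp).tar.gz"
echo "would write $backup_name"
```

The crontab entry then shrinks to the schedule, the user, and the script name, which keeps the file easy to read and hard to break.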
If you're running cron(8) tasks as a non-system user, and can't add scripts into a system bindir like /usr/local/bin, a tidy method is to start your own, and include a reference to it as part of your PATH. I favour ~/.local/bin, and have seen references to ~/bin as well. Save the script in ~/.local/bin/cron-task, make it executable with chmod +x, and include the directory in the PATH environment definition at the top of the file:

PATH=/home/you/.local/bin:/usr/local/bin:/usr/bin:/bin
MAILTO=you@example.com

0 * * * * you cron-task

Having your own directory with custom scripts for your own purposes has a host of other benefits, but that's another article.

Avoid /etc/crontab

If your implementation of cron(8) supports it, rather than having an /etc/crontab file a mile long, you can put tasks into separate files in /etc/cron.d:

$ ls /etc/cron.d
system-a
system-b
raid-maint


This approach allows you to group the configuration files meaningfully, so that you and other administrators can find the appropriate tasks more easily; it also allows you to make some files editable by some users and not others, and reduces the chance of edit conflicts. Using sudoedit(8) helps here too. Another advantage is that it works better with version control; if I start collecting more than a few of these task files or to update them more often than every few months, I start a Git repository to track them:

$ cd /etc/cron.d
$ sudo git init
$ sudo git add --all
$ sudo git commit -m "First commit"


If you're editing a crontab(5) file for tasks related only to the individual user, use the crontab(1) tool; you can edit your own crontab(5) by typing crontab -e, which will open your $EDITOR to edit a temporary file that will be installed on exit. This will save the files into a dedicated directory, which on my system is /var/spool/cron/crontabs.

On the systems maintained by the author, it's quite normal for /etc/crontab never to change from its packaged template.

Include a timeout

cron(8) will normally allow a task to run indefinitely, so if this is not desirable, you should consider either using options of the program you're calling to implement a timeout, or including one in the script. If there's no option for the command itself, the timeout(1) command wrapper in coreutils is one possible way of implementing this:

0 * * * * you timeout 10s cron-task

Greg's wiki has some further suggestions on ways to implement timeouts.

Include file locking to prevent overruns

cron(8) will start a new process regardless of whether its previous runs have completed, so if you wish to avoid overlapping runs of a long-running task, on GNU/Linux you could use the flock(1) wrapper for the flock(2) system call to set an exclusive lockfile, in order to prevent the task from running more than one instance in parallel.

0 * * * * you flock -nx /var/lock/cron-task cron-task

Greg's wiki has some more in-depth discussion of the file locking problem for scripts in a general sense, including important information about the caveats of "rolling your own" when flock(1) is not available.

If it's important that your tasks run in a certain order, consider whether it's necessary to have them in separate tasks at all; it may be easier to guarantee they're run sequentially by collecting them in a single shell script.
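
The timeout and locking wrappers can also be combined in a single helper. This is a sketch that assumes flock(1) from util-linux and timeout(1) from GNU coreutils are installed; the function name run_guarded and the temporary lock file are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: run a command under an exclusive lock and a time limit.
# Usage: run_guarded LOCKFILE LIMIT COMMAND [ARGS...]
run_guarded() {
    lock=$1
    limit=$2
    shift 2
    # flock -n gives up at once if another run still holds the lock;
    # timeout stops the command (exit status 124) if it exceeds the limit.
    flock -n "$lock" timeout "$limit" "$@"
}

lockfile=$(mktemp)
run_guarded "$lockfile" 10s true
echo "exit status: $?"
rm -f "$lockfile"
```

Because flock passes on the exit status of the command it runs, a status of 124 can be used to tell a timed-out run apart from an ordinary failure.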
Do something useful with exit statuses

If your cron(8) task or commands within its script exit non-zero, it can be useful to run commands that handle the failure appropriately, including cleanup of appropriate resources, and sending information to monitoring tools about the current status of the job. If you're using Nagios Core or one of its derivatives, you could consider using send_nsca to send passive checks reporting the status of jobs to your monitoring server. I've written a simple script called nscaw to do this for me:

0 * * * * you nscaw CRON_TASK -- cron-task

Consider alternatives to cron(8)

If your machine isn't always on and your task doesn't need to run at a specific time, but rather needs to run once daily or weekly, you can install anacron and drop scripts into the cron.hourly, cron.daily, cron.monthly, and cron.weekly directories in /etc, as appropriate. Note that on Debian and Ubuntu GNU/Linux systems, the default /etc/crontab contains hooks that run these, but they run only if anacron(8) is not installed.

If you're using cron(8) to poll a directory for changes and run a script if there are such changes, on GNU/Linux you could consider using a daemon based on inotifywait(1) instead.

Finally, if you require more advanced control over when and how your task runs than cron(8) can provide, you could perhaps consider writing a daemon to run on the server consistently and fork processes for its task. This would allow running a task more often than once a minute, as an example. Don't get too bogged down into thinking that cron(8) is your only option for any kind of asynchronous task management!

#### [Nov 01, 2017] Listing files

###### www.tecmint.com

Using ls is probably one of the first commands an administrator will learn for getting a simple list of the contents of the directory. Most administrators will also know about the -a and -l switches, to show all files including dot files and to show more detailed data about files in columns, respectively.
There are other switches to GNU ls which are less frequently used, some of which turn out to be very useful for programming:

• -t - List files in order of last modification date, newest first. This is useful for very large directories when you want to get a quick list of the most recent files changed, maybe piped through head or sed 10q. Probably most useful combined with -l. If you want the oldest files, you can add -r to reverse the list.
• -X - Group files by extension; handy for polyglot code, to group header files and source files separately, or to separate source files from directories or build files.
• -v - Naturally sort version numbers in filenames.
• -S - Sort by filesize.
• -R - List files recursively. This one is good combined with -l and piped through a pager like less.

Since the listing is text like anything else, you could, for example, pipe the output of this command into a vim process, so you could add explanations of what each file is for and save it as an inventory file or add it to a README:

$ ls -XR | vim -


This kind of stuff can even be automated by make with a little work, which I'll cover in another article later in the series.
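As a small illustration of the -t switch described above, the following sketch wraps it in a shell function. The name newest_files and its default of ten entries are assumptions made for this example; they are not something ls itself provides.

```shell
# newest_files DIR [COUNT]: print the COUNT most recently modified entries
# in DIR, newest first. Hypothetical helper name; COUNT defaults to 10.
newest_files() {
    ls -t -- "$1" | head -n "${2:-10}"
}
```

Something like `newest_files /var/log 5` would then list the five most recently changed entries in /var/log.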

#### [Nov 01, 2017] Default grep options by Tom Ryder

###### May 18, 2012 | sanctum.geek.nz

When you're searching a set of version-controlled files for a string with grep , particularly if it's a recursive search, it can get very annoying to be presented with swathes of results from the internals of the hidden version control directories like .svn or .git , or include metadata you're unlikely to have wanted in files like .gitmodules .

GNU grep uses an environment variable named GREP_OPTIONS to define a set of options that are always applied to every call to grep . This comes in handy when exported in your .bashrc file to set a "standard" grep environment for your interactive shell. Here's an example of a definition of GREP_OPTIONS that excludes a lot of patterns which you'd very rarely if ever want to search with grep :

GREP_OPTIONS=
for pattern in .cvs .git .hg .svn; do
GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
done
export GREP_OPTIONS


Note that --exclude-dir is a relatively recent addition to the options for GNU grep , but it should only be missing on very legacy GNU/Linux machines by now. If you want to keep your .bashrc file compatible, you could apply a little extra hackery to make sure the option is available before you set it up to be used:

GREP_OPTIONS=
if grep --help | grep -- --exclude-dir &>/dev/null; then
for pattern in .cvs .git .hg .svn; do
GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
done
fi
export GREP_OPTIONS


Similarly, you can ignore single files with --exclude . There's also --exclude-from=FILE if your list of excluded patterns starts getting too long.
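Recent GNU grep releases no longer honour GREP_OPTIONS (the variable was declared obsolescent in grep 2.21), so on newer systems the same exclusions may need to be passed explicitly. A minimal sketch of one way to do that with a shell function; the name vcgrep is made up for this example and is not part of grep:

```shell
# vcgrep: run grep with version-control directories excluded.
# Hypothetical wrapper name; "command" bypasses any grep alias or function.
vcgrep() {
    command grep --exclude-dir=.cvs --exclude-dir=.git \
                 --exclude-dir=.hg --exclude-dir=.svn "$@"
}
```

With this defined in .bashrc, `vcgrep -rn pattern .` searches recursively while skipping the listed directories, and plain grep stays untouched.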

Other useful options available in GNU grep that you might wish to add to this environment variable include:

• --color -- On appropriate terminal types, highlight the pattern matches in output, among other color changes that make results more readable
• -s -- Suppresses error messages about files not existing or being unreadable; helps if you find this behaviour more annoying than useful.
• -E, -F, or -P -- Pick a favourite "mode" for grep ; devotees of PCRE may find adding -P for grep 's experimental PCRE support makes grep behave in a much more pleasing way, even though it's described in the manual as being experimental and incomplete

If you don't want to use GREP_OPTIONS , you could instead simply set up an  alias :

alias grep='grep --exclude-dir=.git'


You may actually prefer this method as it's essentially functionally equivalent, but if you do it this way, when you want to call grep without your standard set of options, you only have to prepend a backslash to its call:

$ \grep pattern file

Commenter Andy Pearce also points out that using this method can avoid some build problems where GREP_OPTIONS would interfere.

Of course, you could solve a lot of these problems simply by using ack but that's another post.

#### [Oct 31, 2017] Bash job control by Tom Ryder

###### Jan 31, 2012 | sanctum.geek.nz

Oftentimes you may wish to start a process on the Bash shell without having to wait for it to actually complete, but still be notified when it does. Similarly, it may be helpful to temporarily stop a task while it's running without actually quitting it, so that you can do other things with the terminal. For these kinds of tasks, Bash's built-in job control is very useful.

Backgrounding processes

If you have a process that you expect to take a long time, such as a long cp or scp operation, you can start it in the background of your current shell by adding an ampersand to it as a suffix:

$ cp -r /mnt/bigdir /home &
[1] 2305


This will start the copy operation as a child process of your bash instance, but will return you to the prompt to enter any other commands you might want to run while that's going.

The output from this command shown above gives both the job number of 1, and the process ID of the new task, 2305. You can view the list of jobs for the current shell with the builtin jobs :

$ jobs
[1]+  Running  cp -r /mnt/bigdir /home &

If the job finishes or otherwise terminates while it's backgrounded, you should see a message in the terminal the next time you update it with a newline:

[1]+  Done  cp -r /mnt/bigdir /home &

Foregrounding processes

If you want to return a job in the background to the foreground, you can type fg :

$ fg
cp -r /mnt/bigdir /home &


If you have more than one job backgrounded, you should specify the particular job to bring to the foreground with a parameter to fg :

$ fg %1

In this case, for shorthand, you can optionally omit fg and it will work just the same:

$ %1

Suspending processes

To temporarily suspend a process, you can press Ctrl+Z:

$ cp -r /mnt/bigdir /home
^Z
[1]+  Stopped  cp -r /mnt/bigdir /home

You can then continue it in the foreground or background with fg %1 or bg %1 respectively, as above.

This is particularly useful while in a text editor; instead of quitting the editor to get back to a shell, or dropping into a subshell from it, you can suspend it temporarily and return to it with fg once you're ready.

Dealing with output

While a job is running in the background, it may still print its standard output and standard error streams to your terminal. You can head this off by redirecting both streams to /dev/null for verbose commands:

$ cp -rv /mnt/bigdir /home &>/dev/null


However, if the output of the task is actually of interest to you, this may be a case where you should fire up another terminal emulator, perhaps in GNU Screen or tmux , rather than using simple job control.
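A middle ground between discarding the output and opening another terminal is to send it to a log file and collect the exit status later with wait. The sketch below assumes a stand-in function, long_task, in place of a real long-running cp; the function and the log file are made up for the example.

```shell
# long_task is a hypothetical stand-in for a slow command such as cp -r.
long_task() {
    sleep 1
    echo "copy finished"
}

logfile=$(mktemp)
long_task >"$logfile" 2>&1 &    # run in the background, both streams to the log
task_pid=$!                     # PID of the most recent background job

# ... other work can happen here while the job runs ...

wait "$task_pid"                # block until the job ends; returns its exit status
status=$?
echo "job exited with status $status; output was:"
cat "$logfile"
```

The redirection here uses the portable `>file 2>&1` form, which behaves like Bash's `&>file` shorthand.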

Suspending SSH sessions

As a special case, you can suspend an SSH session using an SSH escape sequence . Type a newline followed by a ~ character, and finally press Ctrl+Z to background your SSH session and return to the terminal from which you invoked it.

tom@conan:~$ ssh crom
tom@crom:~$ ~^Z [suspend ssh]
[1]+  Stopped  ssh crom
tom@conan:~$

You can then resume it as you would any job by typing fg :

tom@conan:~$ fg %1
ssh crom
tom@crom:~$

#### [Oct 31, 2017] Elegant Awk usage by Tom Ryder

##### It's better to use Perl for this purpose...

###### Feb 06, 2012 | sanctum.geek.nz

For many system administrators, Awk is used only as a way to print specific columns of data from programs that generate columnar output, such as netstat or ps . For example, to get a list of all the IP addresses and ports with open TCP connections on a machine, one might run the following:

# netstat -ant | awk '{print $5}'

This works pretty well, but among the data you actually wanted it also includes the fifth word of the opening explanatory note, and the heading of the fifth column:

and
0.0.0.0:*
205.188.17.70:443
172.20.0.236:5222
72.14.203.125:5222

There are varying ways to deal with this.

Matching patterns

One common way is to pipe the output further through a call to grep , perhaps to only include results with at least one number:

# netstat -ant | awk '{print $5}' | grep '[0-9]'

In this case, it's instructive to use the awk call a bit more intelligently by setting a regular expression which the applicable line must match in order for that field to be printed, with the standard / characters as delimiters. This eliminates the need for the call to grep :

# netstat -ant | awk '/[0-9]/ {print $5}'

We can further refine this by ensuring that the regular expression should only match data in the fifth column of the output, using the ~ operator:

# netstat -ant | awk '$5 ~ /[0-9]/ {print $5}'
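
To try the field-specific match without a live system, the filter can be run against canned input. In the sketch below the sample netstat output, including its addresses, is made up for illustration, and sample_netstat is a hypothetical helper name.

```shell
# Hard-coded sample of netstat -ant output (invented addresses).
sample_netstat() {
    cat <<'EOF'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:22           198.51.100.7:51234      ESTABLISHED
EOF
}

# Print the fifth field only on lines where that field contains a digit.
# The two header lines have "and" and "Address" as their fifth words,
# so they are skipped.
sample_netstat | awk '$5 ~ /[0-9]/ {print $5}'
```

This prints only the two foreign-address fields, 0.0.0.0:* and 198.51.100.7:51234.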

Skipping lines

Another approach you could take to strip the headers out might be to use sed to skip the first two lines of the output:

# netstat -ant | awk '{print $5}' | sed 1,2d

However, this can also be incorporated into the awk call, using the NR variable and making it part of a conditional checking the line number is greater than two:

# netstat -ant | awk 'NR>2 {print $5}'

Combining and excluding patterns

Another common idiom on systems that don't have the special pgrep command is to filter ps output for a string, but exclude the grep process itself from the output with grep -v grep :

# ps -ef | grep apache | grep -v grep | awk '{print $2}'

If you're using Awk to get columnar data from the output, in this case the second column containing the process ID, both calls to grep can instead be incorporated into the awk call:

# ps -ef | awk '/apache/ && !/awk/ {print $2}'


Again, this can be further refined if necessary to ensure you're only matching the expressions against the command name by specifying the field number for each comparison:

# ps -ef | awk '$8 ~ /apache/ && $8 !~ /awk/ {print $2}'

If you're used to using Awk purely as a column filter, the above might help to increase its utility for you and allow you to write shorter and more efficient command lines.

The Awk Primer on Wikibooks is a really good reference for using Awk to its fullest for the sorts of tasks for which it's especially well-suited.

#### [Oct 31, 2017] Counting with grep and uniq by Tom Ryder

###### Feb 18, 2012 | sanctum.geek.nz

A common idiom in Unix is to count the lines of output in a file or pipe with wc -l :

$ wc -l example.txt
43
$ ps -e | wc -l
97

Sometimes you want to count the number of lines of output from a grep call, however. You might do it this way:

$ ps -ef | grep apache | wc -l
6


But grep has built-in counting of its own, with the -c option:

$ ps -ef | grep -c apache
6

The above is more a matter of good style than efficiency, but another tool with a built-in counting option that could save you time is the oft-used uniq . The below example shows a use of uniq to filter a sorted list into unique rows:

$ ps -ef | awk '{print $1}' | sort | uniq
105
daemon
lp
mysql
nagios
postfix
root
snmp
tom
UID
www-data

If it would be useful to know in this case how many processes were being run by each of these users, you can include the -c option for uniq :

$ ps -ef | awk '{print $1}' | sort | uniq -c
      1 105
      1 daemon
      1 lp
      1 mysql
      1 nagios
      2 postfix
     78 root
      1 snmp
      7 tom
      1 UID
      5 www-data

You could even sort this output itself to show the users running the most processes first with sort -rn :

$ ps -ef | awk '{print $1}' | sort | uniq -c | sort -rn
     78 root
      8 tom
      5 www-data
      2 postfix
      1 UID
      1 snmp
      1 nagios
      1 mysql
      1 lp
      1 daemon
      1 105

Incidentally, if you're not counting results and really do just want a list of unique users, you can leave out the uniq and just add the -u flag to sort :

$ ps -ef | awk '{print $1}' | sort -u
105
daemon
lp
mysql
nagios
postfix
root
snmp
tom
UID
www-data

The above means I actually find myself using uniq with no options quite seldom.

#### [Oct 31, 2017] 256 colour terminals by Tom Ryder

##### Notable quotes:

##### "... An earlier version of this post suggested changing the TERM definition in .bashrc , which is generally not a good idea, even if bounded with conditionals as my example was. You should always set the terminal string in the emulator itself if possible, if you do it at all. ..."

##### "... Similarly, to use 256 colours in GNU Screen, add the following to your .screenrc : ..."

###### February 23, 2012 | sanctum.geek.nz

Using 256 colours in terminals is well-supported in GNU/Linux distributions these days, and also in Windows terminal emulators like PuTTY.
Using 256 colours is great for Vim colorschemes in particular, but also very useful for Tmux colouring or any other terminal application where a slightly wider colour space might be valuable. Be warned that once you get this going reliably, there's no going back if you spend a lot of time in the terminal.

Xterm

To set this up for xterm or emulators that use xterm as the default value for $TERM , such as xfce4-terminal or gnome-terminal , it generally suffices to check the options for your terminal emulator to ensure that it will allow 256 colors, and then use the TERM string xterm-256color for it.

An earlier version of this post suggested changing the TERM definition in .bashrc , which is generally not a good idea, even if bounded with conditionals as my example was. You should always set the terminal string in the emulator itself if possible, if you do it at all.

Be aware that older systems may not have terminfo definitions for this terminal, but you can always copy them in using a private .terminfo directory if need be.
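
One quick way to see whether the emulator really renders the extended palette is to print all 256 background colours using the `48;5;N` colour escape sequence. This is only a sketch, and the function name show_256_colours is made up for the example; on a terminal limited to fewer colours, many of the cells will look identical or wrong.

```shell
# show_256_colours: print colour numbers 0-255, each on its own background
# colour, 16 per row. Hypothetical helper name.
show_256_colours() {
    i=0
    while [ "$i" -lt 256 ]; do
        # ESC[48;5;Nm selects background colour N; ESC[0m resets attributes.
        printf '\033[48;5;%dm%4d\033[0m' "$i" "$i"
        i=$((i + 1))
        if [ $((i % 16)) -eq 0 ]; then
            printf '\n'
        fi
    done
}
```

Running `show_256_colours` in the shell prints a 16 by 16 grid.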

Tmux

To use 256 colours in Tmux, you should set the default terminal in .tmux.conf to be screen-256color :

set -g default-terminal "screen-256color"


This will allow you to use color definitions like colour231 in your status lines and other configurations. Again, this particular terminfo definition may not be present on older systems, so you should copy it into ~/.terminfo/s/screen-256color on those systems if you want to use it everywhere.

GNU Screen

Similarly, to use 256 colours in GNU Screen, add the following to your .screenrc :

term screen-256color

Vim

With the applicable options from the above set, you should not need to change anything in Vim to be able to use 256-color colorschemes. If you're wanting to write or update your own 256-colour compatible scheme, it should either begin with set t_Co=256 , or more elegantly, check that the value of the corresponding option, &t_Co , is 256 before trying to use any of the extra colour set.

The Vim Tips Wiki contains a detailed reference of the colour codes for schemes in 256-color terminals.

#### [Oct 22, 2017] Unix text editing - sed, tr, cut, od

###### Oct 22, 2017 | seismo.berkeley.edu

A tr script to remove all non-printing characters from a file is below. Non-printing characters may be invisible, but they cause problems when printing the file or sending it via electronic mail. Run it from the Unix command prompt, everything on one line:

> tr -d '\001'-'\011''\013''\014''\016'-'\037''\200'-'\377' < filein > fileout

This tr script deletes all characters with octal values from 001 to 011, characters 013 and 014, characters from 016 to 037, and characters from 200 to 377. All other characters are printable and are copied from filein to fileout. Please remember that you cannot fold the line containing the tr command: everything must be on one line, however long it is. In practice, this script solves some mysterious Unix printing problems.

Type the line starting with tr above into a text file named "f127.TR". Print the file on screen with the cat f127.TR command, replace "filein" and "fileout" with your own file names (they must not be the same file), then copy and paste the line and run it. Please remember that this does not solve the Unix end-of-file problem, that is, the character '\000', also known as a 'null', in the file. Nor does it handle the binary file problem, that is, a file starting with two zeroes '\060' and '\060'.

Sometimes invisible characters cause havoc. This tr command line converts tab characters into hashes (#) and formfeed characters into stars (*).

> tr '\011\014' '#*'  < filein > fileout

The numeric value of tabulate is 9, hex 09, octal 011 and in C-notation it is \t or \011. Formfeed is 12, hex 0C, octal 014 and in C-notation it is \f or \014. Please note, tr replaces character from the first (leftmost) group with corresponding character in the second group. Characters in octal format, like \014 are counted as one character each.
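
Both tr commands can be tried on short strings produced with printf before being pointed at real files. In this sketch the function names are hypothetical and the octal ranges are written inside a single pair of quotes, which should give tr the same set as the quoted pieces shown earlier.

```shell
# Delete control characters (octal 001-011, 013, 014, 016-037) and bytes
# 200-377. LC_ALL=C makes tr treat its input as raw bytes.
strip_nonprinting() {
    LC_ALL=C tr -d '\001-\011\013\014\016-\037\200-\377'
}

# Make tabs and form feeds visible: TAB becomes #, FF becomes *.
show_tabs_formfeeds() {
    tr '\011\014' '#*'
}

printf 'ab\007c\td\n' | strip_nonprinting     # BEL and TAB removed: prints abcd
printf 'a\tb\fc\n' | show_tabs_formfeeds      # prints a#b*c
```

Note that the newline (octal 012) and carriage return (015) are deliberately left out of the deleted ranges, so line structure survives.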

#### [Oct 01, 2017] How to Use Script Command To Record Linux Terminal Session

###### Oct 01, 2017 | linoxide.com

script command records a shell session for you so that you can look at the output that you saw at the time and you can even record with timing so that you can have a real-time playback. It is really useful and comes in handy in the strangest kind of times and places.

The script command keeps a log of your actions for various tasks. It records everything in a session, both the things you type and the things you see. To use it, just type script on the terminal and type exit when finished. Everything between the script and the exit command is logged to the file. This includes the confirmation messages from script itself.

script makes a typescript of everything printed on your terminal. If a file argument is given, script saves all dialogue in the indicated file in the current directory. If no file name is given, the typescript is saved in the default file typescript. To record what you are doing in the current shell session, just use the command below

# script shell_record1
Script started, file is shell_record1


It indicates that a file shell_record1 is created. Let's check the file

# ls -l shell_*
-rw-r--r-- 1 root root 0 Jun 9 17:50 shell_record1


After completion of your task, you can enter exit or Ctrl-d to close down the script session and save the file.

# exit
exit
Script done, file is shell_record1


You can see that script indicates the filename.

2. Check the content of a recorded terminal session

When you use the script command, it records everything in a session: the things you type and all of your output. As the output is saved into a file, you can check its content after exiting a recorded session. You can simply use a text editor or a text file viewer command.

# cat shell_record1
Script started on Fri 09 Jun 2017 06:23:41 PM UTC
[root@centos-01 ~]# date
Fri Jun 9 18:23:46 UTC 2017
[root@centos-01 ~]# uname -a
Linux centos-01 3.10.0-514.16.1.el7.x86_64 #1 SMP Wed Apr 12 15:04:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@centos-01 ~]# whoami
root
[root@centos-01 ~]# pwd
/root
[root@centos-01 ~]# exit
exit

Script done on Fri 09 Jun 2017 06:25:11 PM UTC


When you view the file you will notice that script also stores line feeds and backspaces. It also indicates the time of the recording at the top and the end of the file.

3. Record several terminal sessions

You can record as many terminal sessions as you want. When you finish one recording, just begin another. This can be helpful if you want to record several configurations that you are doing, to show them to your team or students for example. You just need to name each recording file.

For example, let us assume that you have to do OpenLDAP , DNS , Machma configurations. You will need to record each configuration. To do this, just create a recording file for each configuration and exit when it is finished.

# script openldap_record
...............
configuration step
..............
# exit


When you have finished with the first configuration, begin to record the next configuration

# script machma_record
............
configuration steps
.............
# exit


And so on for the others. Note that if you run the script command with an existing filename, the file will be replaced, so you will lose everything in it.

Now, let us imagine that you have begun the Machma configuration but have to abort it in order to finish the DNS configuration because of some emergency. Now you want to continue the Machma configuration where you left off. This means you want to record the next steps into the existing file machma_record without deleting its previous content; to do this, use the script -a command to append the new output to the file.

This is the content of our recorded file

Now if we want to continue our recording in this file without deleting the content already present, we will do

# script -a machma_record
Script started, file is machma_record


Now continue the configuration, then exit when finished and let's check the content of the recorded file.

Note the new time of the new record which appears. You can see that the file has the previous and actual records.

4. Replay a linux terminal session

We have seen that it is possible to see the content of the recorded file with commands that display text file content. The script command also makes it possible to watch the recorded session like a video: you review exactly what you did, step by step, with the timing at which you entered the commands. In other words, you play back the recorded terminal session.

To do this, you have to use the --timing option of the script command when you start the recording.

# script --timing=file_time shell_record1
Script started, file is shell_record1


See that the file into which to record is shell_record1. When the record is finished, exit normally

# exit
exit
Script done, file is shell_record1


Let's check the content of file_time

# cat file_time
0.807440 49
0.030061 1
116.131648 1
0.226914 1
0.033997 1
0.116936 1
0.104201 1
0.392766 1
0.301079 1
0.112105 2
0.363375 152


The --timing option outputs timing data to the file indicated. Each line contains two fields separated by a space: how much time elapsed since the previous output, and how many characters were output this time. This information can be used to replay typescripts with realistic typing and output delays.
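
Because the timing file is plain two-column text, it can be summarised with awk. The sketch below reuses the sample values listed above; the helper name timing_summary and the temporary file are assumptions made for the example.

```shell
# timing_summary FILE: total the delays (column 1, seconds) and the
# character counts (column 2) of a script(1) timing file.
timing_summary() {
    awk '{ seconds += $1; chars += $2 }
         END { printf "%.2f seconds, %d characters\n", seconds, chars }' "$1"
}

# Sample data copied from the file_time listing, written to a temporary file.
timing_file=$(mktemp)
cat >"$timing_file" <<'EOF'
0.807440 49
0.030061 1
116.131648 1
0.226914 1
0.033997 1
0.116936 1
0.104201 1
0.392766 1
0.301079 1
0.112105 2
0.363375 152
EOF

timing_summary "$timing_file"    # prints: 118.62 seconds, 211 characters
```

Almost all of the elapsed time in this sample comes from the single 116-second pause on the third line, which is the kind of idle gap a replay reproduces faithfully.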

Now to replay the terminal session, we use scriptreplay command instead of script command with the same syntax when recording the session. Look below

# scriptreplay --timing=file_time shell_record1


You will see the recorded session played back as if you were watching a video of everything you were doing. You can also pass the timing file as a plain argument instead of writing out --timing=file_time. Look below

# scriptreplay file_time shell_record1


So you understand that the first parameter is the timing file and the second is the recorded file.

Conclusion

The script command can be your go-to tool for documenting your work and showing others what you did in a session. It can be used as a way to log what you are doing in a shell session. When you run script, a new shell is forked. It reads standard input and output for your terminal tty and stores the data in a file.

#### [Aug 28, 2017] Rsync over ssh with root access on both sides

###### Aug 28, 2017 | serverfault.com

I have one older ubuntu server, and one newer debian server and I am migrating data from the old one to the new one. I want to use rsync to transfer data across to make final migration easier and quicker than the equivalent tar/scp/untar process.

As an example, I want to sync the home folders one at a time to the new server. This requires root access at both ends as not all files at the source side are world readable and the destination has to be written with correct permissions into /home. I can't figure out how to give rsync root access on both sides.

I've seen a few related questions, but none quite match what I'm trying to do.

I have sudo set up and working on both servers.

Tim Abell , asked Apr 28 '10 at 9:18

Accepted answer:

Actually you do NOT need to allow root authentication via SSH to run rsync as Antoine suggests. The transport and system authentication can be done entirely over user accounts as long as you can run rsync with sudo on both ends for reading and writing the files.

As a user on your destination server you can suck the data from your source server like this:

sudo rsync -aPe ssh --rsync-path='sudo rsync' boron:/home/fred /home/


The user you run as on both servers will need passwordless* sudo access to the rsync binary, but you do NOT need to enable ssh login as root anywhere. If the user you are using doesn't match on the other end, you can add user@boron: to specify a different remote user.

Good luck.

*or you will need to have entered the password manually inside the timeout window.

Answered Apr 28 '10 at 22:06 by Caleb (edited Jun 30 '10 at 13:51)

Although this is an old question I'd like to add a word of CAUTION to this accepted answer. From my understanding, allowing passwordless "sudo rsync" is equivalent to opening the root account to remote login. This is because with this it is very easy to gain full root access, e.g. because all system files can be downloaded, modified and replaced without a password. – Ascurion Jan 8 '16 at 16:30

If your data is not highly sensitive, you could use tar and socat. In my experience this is often faster than rsync over ssh.

You need socat or netcat on both sides.

On the target host, go to the directory where you would like to put your data, after that run: socat TCP-LISTEN:4444 - | tar xzf -

Once the target host is listening, start the transfer on the source like this: tar czf - /home/fred | socat - TCP:ip-of-remote-server:4444

For this setup you'll need a reliable connection between the 2 servers.

Answered Apr 28 '10 at 21:20 by Jeroen Moors

Good point. In a trusted environment, you'll pick up a lot of speed by not encrypting. It might not matter on small files, but with GBs of data it will. – pboin May 18 '10 at 10:53

Ok, I've pieced together all the clues to get something that works for me.

Let's call the servers "src" & "dst".

Set up a key pair for root on the destination server, and copy the public key to the source server:

dest $ sudo -i
dest # ssh-keygen
dest # exit
dest $ scp /root/id_rsa.pub src:

Add the public key to root's authorized keys on the source server

src $ sudo -i
src # cp /home/tim/id_rsa.pub .ssh/authorized_keys

Back on the destination server, pull the data across with rsync:

dest $ sudo -i
dest # rsync -aP src:/home/fred /home/

#### [Aug 28, 2017] Unix Rsync Copy Hidden Dot Files and Directories Only by Vivek Gite

###### Feb 06, 2014 | www.cyberciti.biz
Published November 9, 2012; last updated February 6, 2014

How do I use the rsync tool to copy only the hidden files and directories (such as ~/.ssh/, ~/.foo, and so on) from the /home/jobs directory to the /mnt/usb directory under a Unix-like operating system?

The rsync program is used for synchronizing files over a network or local disks. To view or display only hidden files with ls command:

ls -ld ~/.??*

OR

ls -ld ~/.[^.]*

rsync not synchronizing all hidden .dot files?

In this example, you used the pattern .[^.]* or .??* to select and display only hidden files using ls command . You can use the same pattern with any Unix command including rsync command. The syntax is as follows to copy hidden files with rsync:

rsync -av /path/to/dir/.??* /path/to/dest
rsync -avzP /path/to/dir/.??* /mnt/usb
rsync -avzP $HOME/.??* user1@server1.cyberciti.biz:/path/to/backup/users/u/user1
rsync -avzP ~/.[^.]* user1@server1.cyberciti.biz:/path/to/backup/users/u/user1
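
Both patterns are expanded by the shell before rsync ever runs, and neither of them catches every possible dot file, which may be worth checking before relying on them for a backup. The sketch below builds a throwaway directory (the file names are made up) and shows what each pattern picks up; it writes `[!.]`, the POSIX spelling of `[^.]`.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/.a" "$demo_dir/.bashrc" "$demo_dir/..odd" "$demo_dir/visible"

# .??*   : a dot followed by at least two more characters
# .[!.]* : a dot followed by any character except another dot
at_least_three=$(cd "$demo_dir" && echo .??*)
no_second_dot=$(cd "$demo_dir" && echo .[!.]*)

echo ".??*   matches: $at_least_three"   # .bashrc and ..odd, but not .a
echo ".[!.]* matches: $no_second_dot"    # .a and .bashrc, but not ..odd
rm -rf "$demo_dir"
```

In practice two-character names such as .a and names starting with two dots are rare, so either pattern is usually good enough for home directories.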

In this example, copy all hidden files from my home directory to /mnt/test:

rsync -avzP ~/.[^.]* /mnt/test

Vivek Gite is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector.

#### [Aug 28, 2017] rsync doesn't copy files with restrictive permissions

###### Aug 28, 2017 | superuser.com
Trying to copy files with rsync, it complains:
rsync: send_files failed to open "VirtualBox/Machines/Lubuntu/Lubuntu.vdi" \
(in media): Permission denied (13)


That file is not copied. Indeed the file permissions of that file are very restrictive on the server side:

-rw-------    1 1000     1000     3133181952 Nov  1  2011 Lubuntu.vdi


I call rsync with

sudo rsync -av --fake-super root@sheldon::media /mnt/media


The rsync daemon runs as root on the server. root can copy that file (of course). rsyncd has "fake super = yes" set in /etc/rsyncd.conf.

What can I do so that the file is copied without changing the permissions of the file on the server?

Torsten Bronger , asked Dec 29 '12 at 10:15

If you use RSync as daemon on destination, please post grep rsync /var/log/daemon to improve your question – F. Hauri Dec 29 '12 at 13:23

As you appear to have root access to both servers have you tried a: --force ?

Alternatively you could bypass the rsync daemon and try a direct sync e.g.

rsync -optg --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose --recursive --delete-after --force  root@sheldon::media /mnt/media

Answered Dec 29 '12 at 13:21 by arober11 (edited Jan 2 '13 at 10:55)

Using ssh means encryption, which makes things slower. --force does only affect directories, if I read the man page correctly. – Torsten Bronger Jan 1 '13 at 23:08

Unless you're using ancient kit, the CPU overhead of encrypting / decrypting the traffic shouldn't be noticeable, but you will lose 10-20% of your bandwidth through the encapsulation process. Then again 80% of a working link is better than 100% of a non working one :) – arober11 Jan 2 '13 at 10:52

I do have an "ancient kit". ;-) (Slow ARM CPU on a NAS.) But I now mount the NAS with NFS and use rsync (with "sudo") locally. This solves the problem (and is even faster). However, I still think that my original problem must be solvable using the rsync protocol (remote, no ssh). – Torsten Bronger Jan 4 '13 at 7:55

#### [Aug 28, 2017] Using rsync under target user to copy home directories

###### Aug 28, 2017 | unix.stackexchange.com

nixnotwin , asked Sep 21 '12 at 5:11

On my Ubuntu server there are about 150 shell accounts. All usernames begin with the prefix u12.. I have root access and I am trying to copy a directory named "somefiles" to all the home directories. After copying the directory the user and group ownership of the directory should be changed to user's. Username, group and home-dir name are same. How can this be done?

Gilles , answered Sep 21 '12 at 23:44

Do the copying as the target user. This will automatically make the target files owned by that user. Make sure that the original files are world-readable (or at least readable by all the target users). Run chmod afterwards if you don't want the copied files to be world-readable.

getent passwd |
awk -F : '$1 ~ /^u12/ {print $1}' |
while IFS= read -r user; do
    su "$user" -c 'cp -Rp /original/location/somefiles ~/'
done

#### [Aug 28, 2017] rsync over SSH preserve ownership only for www-data owned files

###### Aug 28, 2017 | stackoverflow.com

jeffery_the_wind , asked Mar 6 '12 at 15:36

I am using rsync to replicate a web folder structure from a local server to a remote server. Both servers are ubuntu linux. I use the following command, and it works well:

rsync -az /var/www/ user@10.1.1.1:/var/www/

The usernames for the local system and the remote system are different. From what I have read it may not be possible to preserve all file and folder owners and groups. That is OK, but I would like to preserve owners and groups just for the www-data user, which does exist on both servers.

Is this possible? If so, how would I go about doing that? Thanks!

** EDIT ** There is some mention of rsync being able to preserve ownership and groups on remote file syncs here: http://lists.samba.org/archive/rsync/2005-August/013203.html

** EDIT 2 ** I ended up getting the desired effect thanks to many of the helpful comments and answers here. Assuming the IP of the source machine is 10.1.1.2 and the IP of the destination machine is 10.1.1.1, I can use this line from the destination machine:

sudo rsync -az user@10.1.1.2:/var/www/ /var/www/

This preserves the ownership and groups of the files that have a common user name, like www-data. Note that using rsync without sudo does not preserve these permissions.

ghoti , answered Mar 6 '12 at 19:01

You can also sudo the rsync on the target host by using the --rsync-path option:

# rsync -av --rsync-path="sudo rsync" /path/to/files user@targethost:/path

This lets you authenticate as user on targethost, but still get privileged write permission through sudo . You'll have to modify your sudoers file on the target host to avoid sudo's request for your password. See man sudoers or run sudo visudo for instructions and samples.
You mention that you'd like to retain the ownership of files owned by www-data, but not other files. If this is really true, then you may be out of luck unless you implement chown  or a second run of rsync  to update permissions. There is no way to tell rsync to preserve ownership for just one user . That said, you should read about rsync's --files-from  option. rsync -av /path/to/files user@targethost:/path find /path/to/files -user www-data -print | \ rsync -av --files-from=- --rsync-path="sudo rsync" /path/to/files user@targethost:/path  I haven't tested this, so I'm not sure exactly how piping find's output into --files-from=-  will work. You'll undoubtedly need to experiment. xato , answered Mar 6 '12 at 15:39 As far as I know, you cannot chown  files to somebody else than you, if you are not root. So you would have to rsync  using the www-data  account, as all files will be created with the specified user as owner. So you need to chown  the files afterwards. user2485267 , answered Jun 14 '13 at 8:22 I had a similar problem and cheated the rsync command, rsync -avz --delete root@x.x.x.x:/home//domains/site/public_html/ /home/domains2/public_html && chown -R wwwusr:wwwgrp /home/domains2/public_html/ the && runs the chown against the folder when the rsync completes successfully (1x '&' would run the chown regardless of the rsync completion status) Graham , answered Mar 6 '12 at 15:51 The root users for the local system and the remote system are different. What does this mean? The root user is uid 0. How are they different? Any user with read permission to the directories you want to copy can determine what usernames own what files. Only root can change the ownership of files being written . You're currently running the command on the source machine, which restricts your writes to the permissions associated with user@10.1.1.1. Instead, you can try to run the command as root on the target machine. Your read access on the source machine isn't an issue. 
So on the target machine (10.1.1.1), assuming the source is 10.1.1.2: # rsync -az user@10.1.1.2:/var/www/ /var/www/  Make sure your groups match on both machines. Also, set up access to user@10.1.1.2 using a DSA or RSA key, so that you can avoid having passwords floating around. For example, as root on your target machine, run: # ssh-keygen -d  Then take the contents of the file /root/.ssh/id_dsa.pub  and add it to ~user/.ssh/authorized_keys  on the source machine. You can ssh user@10.1.1.2  as root from the target machine to see if it works. If you get a password prompt, check your error log to see why the key isn't working. ghoti , answered Mar 6 '12 at 18:54 Well, you could skip the challenges of rsync altogether, and just do this through a tar tunnel. sudo tar zcf - /path/to/files | \ ssh user@remotehost "cd /some/path; sudo tar zxf -"  You'll need to set up your SSH keys as Graham described. Note that this handles full directory copies, not incremental updates like rsync. The idea here is that: • you tar up your directory, • instead of creating a tar file, you send the tar output to stdout, • that stdout is piped through an SSH command to a receiving tar on the other host, • but that receiving tar is run by sudo, so it has privileged write access to set usernames. #### [Aug 28, 2017] rsync and file permissions ###### Aug 28, 2017 | superuser.com up vote down vote favorite I'm trying to use rsync to copy a set of files from one system to another. I'm running the command as a normal user (not root). On the remote system, the files are owned by apache and when copied they are obviously owned by the local account (fred). My problem is that every time I run the rsync command, all files are re-synched even though they haven't changed. I think the issue is that rsync sees the file owners are different and my local user doesn't have the ability to change ownership to apache, but I'm not including the -a  or -o  options so I thought this would not be checked. 
If I run the command as root, the files come over owned by apache and do not come a second time if I run the command again. However I can't run this as root for other reasons. Here is the command:

/usr/bin/rsync --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose root@server.example.com:/src/dir/ /local/dir

Fred Snertz, asked May 2 '11 at 23:43

Comments:

• Why can't you run rsync as root? On the remote system, does fred have read access to the apache-owned files? (chrishiestand, May 3 '11 at 0:32)
• Ah, I left out the fact that there are ssh keys set up so that local fred can become remote root, so yes fred/root can read them. I know this is a bit convoluted but it's real. (Fred Snertz, May 3 '11 at 14:50)
• Always be careful when root can ssh into the machine. But if you have password and challenge response authentication disabled it's not as bad. (chrishiestand, May 3 '11 at 17:32)

chrishiestand, answered May 3 '11 at 17:48

Here's the answer to your problem:

-c, --checksum

This changes the way rsync checks if the files have been changed and are in need of a transfer. Without this option, rsync uses a "quick check" that (by default) checks if each file's size and time of last modification match between the sender and receiver. This option changes this to compare a 128-bit checksum for each file that has a matching size. Generating the checksums means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is prior to any reading that will be done to transfer changed files), so this can slow things down significantly.

The sending side generates its checksums while it is doing the file-system scan that builds the list of the available files. The receiver generates its checksums when it is scanning for changed files, and will checksum any file that has the same size as the corresponding sender's file: files with either a changed size or a changed checksum are selected for transfer.

Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred, but that automatic after-the-transfer verification has nothing to do with this option's before-the-transfer "Does this file need to be updated?" check.

For protocol 30 and beyond (first supported in 3.0.0), the checksum used is MD5. For older protocols, the checksum used is MD4.

So run:

/usr/bin/rsync -c --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose root@server.example.com:/src/dir/ /local/dir

Note there may be a time+disk churn tradeoff by using this option. Personally, I'd probably just sync the file's mtimes too:

/usr/bin/rsync -t --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose root@server.example.com:/src/dir/ /local/dir

Comment:

• Awesome. Thank you. Looks like the second option is going to work for me and I found the first very interesting. (Fred Snertz, May 3 '11 at 18:40)

#### [Aug 28, 2017] Why does rsync fail to copy files from /sys in Linux?

##### Notable quotes:

##### "... pseudo file system ..."

##### "... pseudo filesystems ..."

###### Aug 28, 2017 | unix.stackexchange.com

Eugene Yarmash, asked Apr 24 '13 at 16:35

I have a bash script which uses rsync to back up files in Archlinux. I noticed that rsync failed to copy a file from /sys, while cp worked just fine:

# rsync /sys/class/net/enp3s1/address /tmp
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
ERROR: address failed verification -- update discarded.
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]
# cp /sys/class/net/enp3s1/address /tmp ## this works

I wonder why rsync fails, and whether it is possible to copy the file with it?

mattdm, answered Apr 24 '13 at 18:20

Rsync has code which specifically checks if a file is truncated during read and gives this error: ENODATA. I don't know why the files in /sys have this behavior, but since they're not real files, I guess it's not too surprising. There doesn't seem to be a way to tell rsync to skip this particular check.

I think you're probably better off not rsyncing /sys and using specific scripts to cherry-pick out the particular information you want (like the network card address).

Runium, answered Apr 25 '13 at 0:23

First off, /sys is a pseudo file system. If you look at /proc/filesystems you will find a list of registered file systems where quite a few have nodev in front. This indicates they are pseudo filesystems. This means they exist on a running kernel as a RAM-based filesystem. Further, they do not require a block device.

$ cat /proc/filesystems
nodev   sysfs
nodev   rootfs
nodev   bdev
...
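
As a minimal sketch (not part of the original answer), the nodev entries can be pulled out of that listing with awk. The function name `pseudo_filesystems` and its optional file argument are illustrative assumptions, added so the same filter can also be tried on a saved copy of the listing.

```shell
#!/bin/sh
# List filesystem types the kernel marks as "nodev" (no backing block device).
# Hypothetical helper: reads /proc/filesystems unless another file is given.
pseudo_filesystems() {
    awk '$1 == "nodev" { print $2 }' "${1:-/proc/filesystems}"
}

# Demo: only meaningful on Linux, where /proc/filesystems exists.
if [ -r /proc/filesystems ]; then
    pseudo_filesystems | head -n 5
fi
```

Lines for disk-backed filesystems such as ext4 have an empty first column, so awk sees the type name as field 1 and the filter skips them.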


At boot the kernel mounts this filesystem and updates its entries as needed, e.g. when new hardware is found during boot or by udev.

In /etc/mtab  you typically find the mount by:

sysfs /sys sysfs rw,noexec,nosuid,nodev 0 0


For a nice paper on the subject, read Patrick Mochel's "The sysfs Filesystem".

stat of /sys files

If you go into a directory under /sys and do an ls -l you will notice that all files have the same size, typically 4096 bytes. This is the size reported by sysfs, not the length of the content.
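
The gap between the reported size and the real content can be checked directly. The following is a sketch only: the helper name `size_vs_read` and the choice of /sys/class/net/lo/address as a demo file are assumptions, and `stat -c` assumes GNU stat.

```shell
#!/bin/sh
# Compare the size a file reports via stat with the bytes a read returns.
# Hypothetical helper; assumes GNU stat (the -c option).
size_vs_read() {
    reported=$(stat -c %s "$1") || return 1
    # Read through a pipe on purpose, so wc has to count bytes actually
    # delivered instead of trusting the size from stat.
    actual=$(cat -- "$1" | wc -c) || return 1
    # Arithmetic expansion strips any padding wc adds to its output.
    echo "reported=$reported actual=$((actual))"
}

# Demo on a sysfs attribute, if this machine has one.
f=/sys/class/net/lo/address
if [ -r "$f" ]; then
    size_vs_read "$f"
fi
```

On a regular file both numbers agree; on a sysfs attribute the reported value is usually the page size while the read returns only the short string.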

:/sys/devices/pci0000:00/0000:00:19.0/net/eth2$ ls -l
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_assign_type
-r--r--r-- 1 root root 4096 Apr 24 20:09 address
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_len
...

Further, you can do a stat on a file and notice another distinct feature: it occupies 0 blocks. Also the inode of the root (stat /sys) is 1. /stat/fs typically has inode 2, etc.

rsync vs. cp

The easiest explanation for rsync's failure to synchronize pseudo files is perhaps by example.

Say we have a file named address that is 18 bytes. An ls or stat of the file reports 4096 bytes.

rsync

1. Opens file descriptor, fd.
2. Uses fstat(fd) to get information such as size.
3. Sets out to read size bytes, i.e. 4096. That would be line 253 of the code linked by @mattdm. read_size == 4096
   1. Ask; read: 4096 bytes.
   2. A short string is read, i.e. 18 bytes. nread == 18
   3. read_size = read_size - nread (4096 - 18 = 4078)
   4. Ask; read: 4078 bytes.
   5. 0 bytes read (as the first read consumed all bytes in the file).
   6. nread == 0, line 255
   7. Unable to read 4096 bytes. Zero out buffer.
   8. Set error ENODATA.
   9. Return.
4. Report error.
5. Retry. (Above loop.)
6. Fail.
7. Report error.
8. FINE.

During this process it actually reads the entire file. But with no reliable size available it cannot validate the result, so failure is the only option.

cp

1. Opens file descriptor, fd.
2. Uses fstat(fd) to get information such as st_size (also uses lstat and stat).
3. Checks if the file is likely to be sparse, that is, whether the file has holes etc.

   copy.c:1010
   /* Use a heuristic to determine whether SRC_NAME contains any sparse
    * blocks. If the file has fewer blocks than would normally be
    * needed for a file of its size, then at least one of the blocks in
    * the file is a hole. */
   sparse_src = is_probably_sparse (&src_open_sb);

   As stat reports the file to have zero blocks it is categorized as sparse.
4. Tries to read the file by extent-copy (a more efficient way to copy normal sparse files), and fails.
5. Copies by sparse-copy.
   1. Starts out with max read size of MAXINT. Typically 18446744073709551615 bytes on a 64 bit system.
   2. Ask; read 4096 bytes. (Buffer size allocated in memory from stat information.)
   3. A short string is read, i.e. 18 bytes.
   4. Check if a hole is needed, nope.
   5. Write buffer to target.
   6. Subtract 18 from max read size.
   7. Ask; read 4096 bytes.
   8. 0 bytes as all got consumed in the first read.
   9. Return success.
6. All OK. Update flags for file.
7. FINE.

Another answer:

Might be related, but extended attribute calls will fail on sysfs:

[root@hypervisor eth0]# lsattr address
lsattr: Inappropriate ioctl for device While reading flags on address
[root@hypervisor eth0]#

Looking at my strace it looks like rsync tries to pull in extended attributes by default:

22964 <... getxattr resumed> , 0x7fff42845110, 132) = -1 ENODATA (No data available)

I tried finding a flag to give rsync to see if skipping extended attributes resolves the issue but wasn't able to find anything (--xattrs turns them on at the destination).

#### [Aug 28, 2017] Rsync doesn't copy everything

###### Aug 28, 2017 | ubuntuforums.org

Scormen May 31st, 2009, 10:09 AM

Hi all,

I'm having some trouble with rsync. I'm trying to sync my local /etc directory to a remote server, but this won't work.

The problem is that it doesn't seem to copy all the files. The local /etc dir contains 15MB of data; after an rsync, the remote backup contains only 4.6MB of data.

Rsync is run by root. I'm using this command:

rsync --rsync-path="sudo rsync" -e "ssh -i /root/.ssh/backup" -avz --delete --delete-excluded -h --stats /etc kris@192.168.1.3:/home/kris/backup/laptopkris

I hope someone can help. Thanks!

Kris

Scormen May 31st, 2009, 11:05 AM

I found that if I do a local sync, everything goes fine. But if I do a remote sync, it copies only 4.6MB.

Any idea?
LoneWolfJack May 31st, 2009, 05:14 PM

never used rsync on a remote machine, but "sudo rsync" looks wrong. you probably can't call sudo like that so the ssh connection needs to have the proper privileges for executing rsync.

just an educated guess, though.

Scormen May 31st, 2009, 05:24 PM

Thanks for your answer.

In /etc/sudoers I have added the next line, so "sudo rsync" will work.

kris ALL=NOPASSWD: /usr/bin/rsync

I also tried without --rsync-path="sudo rsync", but without success.

I have also tried on the server to pull the files from the laptop, but that doesn't work either.

LoneWolfJack May 31st, 2009, 05:30 PM

in the rsync help file it says that --rsync-path is for the path to rsync on the remote machine, so my guess is that you can't use sudo there as it will be interpreted as a path.

so you will have to do --rsync-path="/path/to/rsync" and make sure the ssh login has root privileges if you need them to access the files you want to sync.

--rsync-path="sudo rsync" probably fails because
a) sudo is interpreted as a path
b) the space isn't escaped
c) sudo probably won't allow itself to be called remotely

again, this is not more than an educated guess.

Scormen May 31st, 2009, 05:45 PM

I understand what you mean, so I tried also:

rsync -Cavuhzb --rsync-path="/usr/bin/rsync" -e "ssh -i /root/.ssh/backup" /etc kris@192.168.1.3:/home/kris/backup/laptopkris

Then I get this error:

sending incremental file list
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/pap": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/provider": Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.crt" -> "/etc/ssl/certs/ssl-cert-snakeoil.pem" failed: Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.key" -> "/etc/ssl/private/ssl-cert-snakeoil.key" failed: Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ppp/peers/provider": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ssl/private/ssl-cert-snakeoil.key": Permission denied (13)

sent 86.85K bytes received 306 bytes 174.31K bytes/sec
total size is 8.71M speedup is 99.97
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1058) [sender=3.0.5]

And the same command with "root" instead of "kris". Then, I get no errors, but I still don't have all the files synced.

Scormen June 1st, 2009, 09:00 AM

Sorry for this bump. I'm still having the same problem.

Any idea?

Thanks.

binary10 June 1st, 2009, 10:36 AM

Maybe there's a nicer way, but you could place /usr/bin/rsync into a private protected area, set the owner to root, put the setuid bit on it, and change your rsync-path argument like this:

# on the remote side, aka kris@192.168.1.3
mkdir priv-area
# protect it from normal users running a priv version of rsync
chmod 700 priv-area
cd priv-area
cp -p /usr/local/bin/rsync ./rsync-priv
sudo chown 0:0 ./rsync-priv
sudo chmod +s ./rsync-priv
ls -ltra # rsync-priv should now be 'bold-red' in bash

Looking at your flags, you've specified a cvs ignore factor (-C), ignore files that are updated on the target (-u), and you're specifying a backup of removed files (-b).

rsync -Cavuhzb --rsync-path="/home/kris/priv-area/rsync-priv" -e "ssh -i /root/.ssh/backup" /etc kris@192.168.1.3:/home/kris/backup/laptopkris

From those qualifiers you're not going to be getting everything sync'd. It's doing what you're telling it to do.

If you really wanted to perform a like for like backup (not keeping stuff that's been changed/deleted from the source), I'd go for something like the following.

rsync --archive --delete --hard-links --one-file-system --acls --xattrs --dry-run -i --rsync-path="/home/kris/priv-area/rsync-priv" --rsh="ssh -i /root/.ssh/backup" /etc/ kris@192.168.1.3:/home/kris/backup/laptopkris/etc/

Remove the --dry-run and -i when you're happy with the output, and it should do what you want. A word of warning: I get a bit nervous when not seeing a trailing slash (/) on directories, as it could lead to all sorts of funnies if you end up using rsync on softlinks.

Scormen June 1st, 2009, 12:19 PM

Thanks for your help, binary10.

I've tried what you have said, but still, I only receive 4.6MB on the remote server. Thanks for the warning, I'll note that!

Did someone already try to rsync their own /etc to a remote system? Just to know if this strange thing only happens to me...

Thanks.

binary10 June 1st, 2009, 01:22 PM

Ok so I've gone back and looked at your original post, how are you calculating 15MB of data under etc - via a du -hsx /etc/ ??

I do daily drive to drive backup copies via rsync and drive to network copies, and have used them recently for restoring.

Sure, my du -hsx /etc/ reports 17MB of data of which 10MB gets transferred via an rsync. My backup drives still operate.

rsync 3.0.6 has some fixes to do with ACLs and special devices rsyncing between solaris, but I think 3.0.5 is still ok with ubuntu to ubuntu systems.

Here is my test doing exactly what you're probably trying to do. I even check the remote end.

binary10@jsecx25:~/bin-priv$ ./rsync --archive --delete --hard-links --one-file-system --stats --acls --xattrs --human-readable --rsync-path="~/bin/rsync-priv-os-specific" --rsh="ssh" /etc/ rsyncbck@10.0.0.21:/home/kris/backup/laptopkris/etc/

Number of files: 3121
Number of files transferred: 1812
Total file size: 10.04M bytes
Total transferred file size: 10.00M bytes
Literal data: 10.00M bytes
Matched data: 0 bytes
File list size: 109.26K
File list generation time: 0.002 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 10.20M

sent 10.20M bytes received 38.70K bytes 4.09M bytes/sec
total size is 10.04M speedup is 0.98

binary10@jsecx25:~/bin-priv$ sudo du -hsx /etc/
17M /etc/
binary10@jsecx25:~/bin-priv$

And then on the remote system I do the du -hsx:

binary10@lenovo-n200:/home/kris/backup/laptopkris/etc$ cd ..
binary10@lenovo-n200:/home/kris/backup/laptopkris$ sudo du -hsx etc
17M etc
binary10@lenovo-n200:/home/kris/backup/laptopkris$

Scormen June 1st, 2009, 01:35 PM

"how are you calculating 15MB of data under etc - via a du -hsx /etc/ ??"

Indeed, on my laptop I see:

root@laptopkris:/home/kris# du -sh /etc/
15M /etc/

If I do the same thing after a fresh sync to the server, I see:

root@server:/home/kris# du -sh /home/kris/backup/laptopkris/etc/
4.6M /home/kris/backup/laptopkris/etc/

On both sides, I have installed Ubuntu 9.04, with version 3.0.5 of rsync. So strange...

binary10 June 1st, 2009, 01:45 PM

it does seem a bit odd.

I'd start doing a few diffs from the outputs of

find etc/ -printf "%f %s %p %Y\n" | sort

and see what type of files are missing.

- edit - Added the %Y file type.

Scormen June 1st, 2009, 01:58 PM

Hmm, it's getting stranger. Now I see that I have all my files on the server, but they don't have their full size (bytes).

I have uploaded the files, so you can look into them.

Laptop: http://www.linuxontdekt.be/files/laptop.files
Server: http://www.linuxontdekt.be/files/server.files

binary10 June 1st, 2009, 02:16 PM

If you look at the files that are different, i.e. the ssl ones, they are links to local files elsewhere, i.e. linked to /usr and not within /etc/. In other words, they are different on your laptop and the server.

Scormen June 1st, 2009, 02:25 PM

I understand that soft links are just copied, and not the "full file".

But you have run the same command to test, a few posts ago. How is it possible that you can see the full 15MB?

binary10 June 1st, 2009, 02:34 PM

I was starting to think that this was a bug with du.

The de-referencing is a bit topsy.

If you rsync copy the remote backup back to a new location on the laptop and do the du command, I wonder if you'll end up with 15MB again.

Scormen June 1st, 2009, 03:20 PM

Good tip.

On the server side, the backup of the /etc was still 4.6MB. I have rsynced it back to the laptop, to a new directory.

If I go on the laptop to that new directory and do a du, it says 15MB.

binary10 June 1st, 2009, 03:34 PM

I think you've now confirmed that rsync DOES copy everything; it was just that du was confusing what you had expected by counting the end link sizes.

You might also think about what you're copying; maybe you need more than just /etc. Of course it depends on what you are trying to do with the backup :)

enjoy.

Scormen June 1st, 2009, 03:37 PM

Yeah, it seems to work well. So, the "problem" was just the soft links, which couldn't be counted on the server side?

binary10 June 1st, 2009, 04:23 PM

The links were copied as links, as per the design of --archive in rsync.

The contents that the links point to were different between your two systems, these being files that reside outside of /etc/, in /usr, and so du was reporting them differently.

Scormen June 1st, 2009, 05:36 PM

Okay, I got it. Many thanks for the support, binary10!

Scormen June 1st, 2009, 05:59 PM

Just to know, is it possible to copy the data from these links as real, hard data?

Thanks.

binary10 June 2nd, 2009, 09:54 AM

Yep, absolutely.

You should then look at the other possibilities:

-L, --copy-links transform symlink into referent file/dir
--copy-unsafe-links only "unsafe" symlinks are transformed
--safe-links ignore symlinks that point outside the source tree
-k, --copy-dirlinks transform symlink to a dir into referent dir
-K, --keep-dirlinks treat symlinked dir on receiver as dir

But then you'll have to start questioning why you are backing them up like that, especially stuff under /etc/. If you ever wanted to restore it you'd be restoring full files and not symlinks. The restore result could be a nightmare and could create future issues (upgrades etc), let alone that your backup will be significantly larger; it could be 150MB instead of 4MB.

Scormen June 2nd, 2009, 10:04 AM

Okay, now I'm sure what it's doing :)

Is it also possible to show on a system the "real disk usage" of e.g. that /etc directory? So, without the links, so that we get an output of 4.6MB.

Thank you very much for your help!

binary10 June 2nd, 2009, 10:22 AM

What does the following respond with?

sudo du --apparent-size -hsx /etc

If you want the real answer, then only the result from a dry-run rsync will be good enough for you.

sudo rsync --dry-run --stats -h --archive /etc/ /tmp/etc/

#### [Aug 27, 2017] Diff A Directory Recursively, Ignoring All Binary Files

##### It is now possible to use -r to recursively compare directories

###### Aug 27, 2017 | stackoverflow.com

diff -r dir1/ dir2/ | sed '/Binary\ files\ /d' >outputfile

This recursively compares dir1 to dir2; sed removes the lines for binary files (those beginning with "Binary files "), and the result is redirected to outputfile.

#### [Aug 14, 2017] Cut command on RHEL 6.8 compatibility issues

##### Notable quotes:

##### "... Much better: change your scripts. Run the following fix_cut script on your scripts: ..."

###### Aug 14, 2017 | www.unix.com

Vikram Jain, 06-29-2016

Cut command on RHEL 6.8 compatibility issues

We have a lot of scripts using cut as:

cut -c 0-8

This works for cut (GNU coreutils) 5.97, but does not work for cut (GNU coreutils) 8.4.
Gives error:

Code:

cut: fields and positions are numbered from 1
Try `cut --help' for more information.

The position needs to start with 1 for the later version of cut and this is causing an issue.

Is there a way where I can have multiple cut versions installed and use the older version of cut for the user which runs the script? Or any other workaround without having to change the scripts?

Thanks.

Don Cragun

What are you trying to do when you invoke

Code:

cut -c 0-8

with your old version of cut?

With that old version of cut, is there any difference in the output produced by the two pipelines:

Code:

echo 0123456789abcdef | cut -c 0-8

and:

Code:

echo 0123456789abcdef | cut -c 1-8

or do they produce the same output?

Vikram Jain, 06-30-2016

I am trying to get a value from the 1st line of the file and check if that value is a valid date or not.

Below is the output for the cut command from the new version:

Code:

$ echo 0123456789abcdef | cut -c 0-8
cut: fields and positions are numbered from 1
$ echo 0123456789abcdef | cut -c 1-8
01234567

With the old version, both have the same results:

Code:

$ echo 0123456789abcdef | cut -c 0-8
01234567
$ echo 0123456789abcdef | cut -c 1-8
01234567

Scrutinizer, 06-30-2016

The use of 0 is not according to specification. Alternatively, you can just omit it, which should work across versions:

Code:

$ echo 0123456789abcdef | cut -c -8
01234567

If you cannot adjust the scripts, you could perhaps create a wrapper script for cut, so that the 0 gets stripped.

Vikram Jain, 06-30-2016

Yes, I don't want to adjust my scripts.
A wrapper for cut looks like something that would work.

Could you please tell me how I would use it, that is, how I would make sure that the wrapper is called and not the cut command which causes the issue?

Don Cragun

The only way to make sure that your wrapper is always called instead of the OS supplied utility is to move the OS supplied utility to a different location and install your wrapper in the location where your OS installed cut originally.

Of course, once you have installed this wrapper, your code might or might not work properly (depending on the quality of your wrapper) and no one else on your system will be able to look at the diagnostics produced by scripts that have bugs in the way they specify field and character ranges so they can identify and fix their code.

My personal opinion is that you should spend time fixing your scripts that call cut -c 0.... , cut -f 0... , and lots of other possible misuses of 0 that are now correctly diagnosed as errors by the new version of cut instead of debugging code to be sure that it changes all of the appropriate 0 characters in its argument list to 1 characters and doesn't change any 0 characters that are correctly specified and do not reference a character 0 or field 0.
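
Before changing anything, it can help to see how many scripts are affected. The following is a rough sketch rather than anything posted in the thread: the function name `find_zero_cut` is hypothetical, and the pattern is a heuristic that only recognizes a literal cut call with a -c or -f list starting at 0 on the same line.

```shell
#!/bin/sh
# Print matching lines with line numbers (and file names when several
# files are given) for cut calls whose -c or -f list starts at 0.
# Hypothetical helper; the pattern is a rough heuristic, not a shell parser.
find_zero_cut() {
    grep -nE 'cut[[:space:]]+(-[^[:space:]]*[[:space:]]+)*-[cf][[:space:]]*0+-' "$@"
}

# Demo on a throwaway script.
demo=$(mktemp)
cat > "$demo" <<'EOF'
head -1 "$1" | cut -c 0-8
head -1 "$1" | cut -c 1-8
cut -d: -f 0-2 /etc/passwd
EOF
find_zero_cut "$demo"
rm -f "$demo"
```

Lists such as 10-20 are left alone because the pattern requires the zero to come directly after the option and its optional whitespace.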

MadeInGermany, 06-30-2016

An update of "cut" will overwrite your wrapper.

Much better: change your scripts. Run the following fix_cut script on your scripts:

Code:

#!/bin/sh
# fix_cut
PATH=/bin:/usr/bin
PRE="\b(cut\s+(-\S*\s+)*-[cf]\s*0*)0-"
for arg
do
perl -ne 'exit 1 if m/'"$PRE"'/' "$arg" || {
perl -i -pe 's/'"$PRE"'/${1}1-/g' "$arg"
}
done

Example: fix all .sh scripts

Code:

fix_cut *.sh

#### [Jul 17, 2017] Setup Centralized Rsyslog Server On CentOS 7

###### Jul 17, 2017 | www.linuxtoday.com

Install and configure Rsyslog server and client configuration on CentOS 7 server.

• YUM configuration in Linux (Mar 24, 2017, 06:00) kerneltalks: Learn YUM configuration in Linux.
• How to Install a DHCP Server in CentOS, RHEL and Fedora (Mar 26, 2017, 14:00) tecmint: In this tutorial, we will cover how to install and configure a DHCP server in CentOS/RHEL and Fedora distributions.
• How to find process using high memory in Linux (Mar 26, 2017, 10:00) KernelTalks: Learn how to find process using high memory on linux server.
• 8 Practical Examples of Linux Xargs Command for Beginners (Mar 27, 2017, 13:00) HowToForge: The Linux xargs command may not be a hugely popular command line tool, but this doesn't take away the fact that it's extremely useful.
• Using vi-mode in your shell (Mar 27, 2017, 11:00) opensource.com: Get an introduction to using vi-mode for line editing at the command line.
• Notepadqq Source Code Editor for Linux (Mar 27, 2017, 10:00) Notepadqq is a free, open source code editor and Notepad replacement, that helps developers to work more efficiently.
• 14 Practical Examples of Linux Find Command for Beginners (Mar 27, 2017, 04:00) HowToForge: Find is one of the most frequently used Linux commands, and it offers a plethora of features in the form of command line options.

#### [Jul 16, 2017] How to use a man page Faster than a Google search

###### Jul 16, 2017 | opensource.com

It's easy to get into the habit of googling anything you want to know about a command or operation in Linux, but I'd argue there's something even better: a living and breathing, complete reference, the man pages, which is short for manual pages.
The history of man pages predates Linux, going all the way back to the early days of Unix. According to Wikipedia, Dennis Ritchie and Ken Thompson wrote the first man pages in 1971, well before the days of personal computers, around the time when many calculators in use were the size of toaster ovens. Man pages also have a reputation for being terse and, in a way, have a language of their own. Just like Unix and Linux, the man pages have not been static, and they continue to be developed and maintained just like the kernel.

Man pages are divided into sections referenced by numbers:

1. General user commands
2. System calls
3. Library functions
4. Special files and drivers
5. File formats
6. Games and screensavers
7. Miscellanea
8. System administration commands and daemons

Even so, users generally don't need to know the section where a particular command lies to find what they need.

The files are formatted in a way that may look odd to many users today. Originally, they were written in an old form of markup called troff because they were designed to be printed through a PostScript printer, so they included formatting for headers and other layout aspects. In Linux, groff is used instead.

On my Fedora system, the man pages are located in /usr/share/man with subdirectories (like man1 for Section 1 commands) as well as additional subdirectories for translations of the man pages. If you look up the man page for the command man, you'll see the file man.1.gz, which is the man page compressed with the gzip utility.

To access a man page, type a command such as

man man

which shows the man page for man itself. This uncompresses the man page, interprets the formatting commands, and displays the results with less, so navigation is the same as when you use less.

All man pages should have the following subsections: Name, Synopsis, Description, Examples, and See Also.
Many have additional sections, like Options, Exit Status, Environment, Bugs, Files, Author, Reporting Bugs, History, and Copyright.

Breaking down a man page

To explain how to interpret a typical man page, let's use the man page for ls as an example. Under Name, we see

ls - list directory contents

which tells us what ls means in the simplest terms. Under Synopsis, we begin to see the terseness:

ls [OPTION]... [FILE]...

Any element that occurs inside brackets is optional. The above synopsis means you can legitimately type ls and nothing else. The ellipsis after each element indicates that you can include as many options as you want (as long as they're compatible with each other) and as many files as you want. You can specify a directory name, and you can also use * as a wildcard. For example:

ls Documents/*.txt

Under Description, we see a more verbose description of what the command does, followed by a list of the available options for the command. The first option for ls is

-a, --all do not ignore entries starting with .

If we want to use this option, we can either type the short form, -a, or the long form, --all. Not all options have two forms (e.g., --author), and even when they do, they aren't always so obviously related (e.g., -F, --classify). When you want to use multiple options, you can either type the short forms with spaces in between or type them after a single hyphen with no spaces (as long as they do not require further sub-options). Therefore,

ls -a -d -l

and

ls -adl

are equivalent.

The command tar is somewhat unusual, presumably due to its long history, in that it doesn't require a hyphen at all for the short form. Therefore,

tar -cvf filearchive.tar thisdirectory/

and

tar cvf filearchive.tar thisdirectory/

are both legitimate.

On the ls man page, after Description are Author, Reporting Bugs, Copyright, and See Also. The See Also section will often suggest related man pages, so it is generally worth a glance.
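The equivalence of separate and bundled short options can be checked with the shell's own option parser. The following is only a minimal sketch: `parse_flags` is a hypothetical helper invented for this illustration, it accepts just the three letters a, d and l, and it says nothing about how ls itself is implemented.

```shell
#!/bin/sh
# parse_flags: hypothetical helper that records which of -a, -d, -l were given.
# It illustrates POSIX short-option parsing in general, not the internals of ls.
parse_flags() {
    OPTIND=1                      # reset the parser so the function can be reused
    seen=""
    while getopts "adl" opt; do   # "adl" lists the accepted short options
        case "$opt" in
            a|d|l) seen="$seen$opt" ;;
            *)     return 2 ;;    # unknown option
        esac
    done
    printf '%s\n' "$seen"
}

parse_flags -a -d -l    # prints: adl
parse_flags -adl        # prints: adl
```

Both calls report the same three options, which is why the two spellings of the ls command behave identically.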
After all, there is much more to man pages than just commands. Certain commands that are specific to Bash and not system commands, like alias, cd, and a number of others, are listed together in a single BASH_BUILTINS man page. While the documentation for these is even more terse and compact, overall it contains similar information.

I find that man pages offer a lot of good, usable information, especially when I need a command I haven't used recently, and I need to brush up on the options and requirements. This is one place where the man pages' much-maligned terseness is actually very beneficial.

About the author: Greg Pittman - Greg is a retired neurologist in Louisville, Kentucky, with a long-standing interest in computers and programming, beginning with Fortran IV in the 1960s. When Linux and open source software came along, it kindled a commitment to learning more, and eventually contributing. He is a member of the Scribus Team.

#### [Jun 29, 2017] printf Command

#### [Feb 25, 2017] 5 basic cURL command examples

###### Feb 25, 2017 | www.rosehosting.com

cURL is a very useful command line tool for transferring data from or to a server. cURL supports various protocols like FILE, HTTP, HTTPS, IMAP, IMAPS, LDAP, DICT, LDAPS, TELNET, FTP, FTPS, GOPHER, RTMP, RTSP, SCP, SFTP, POP3, POP3S, SMB, SMBS, SMTP, SMTPS, and TFTP. cURL can be used in many different and interesting ways. With this tool you can download, upload and manage files, check your email, or even update your status on some of the social media websites or check the weather outside. This article covers five of the most useful and basic uses of the cURL tool on any Linux VPS.

1. Check URL

One of the most common and simplest uses of cURL is typing the command itself, followed by the URL you want to check:

curl https://domain.com

This command will display the content of the URL on your terminal.

2.
Save the output of the URL to a file

The output of the cURL command can be easily saved to a file by adding the -o option to the command, as shown below:

curl -o website https://domain.com
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 41793 0 41793 0 0 275k 0 --:--:-- --:--:-- --:--:-- 2.9M

In this example, the output will be saved to a file named 'website' in the current working directory.

3. Download files with cURL

You can download files with cURL by adding the -O option to the command. It saves files on the local server with the same names as on the remote server:

curl -O https://domain.com/file.zip

In this example, the 'file.zip' zip archive will be downloaded to the current working directory. You can also download the file under a different name by adding the -o option to cURL:

curl -o archive.zip https://domain.com/file.zip

This way the 'file.zip' archive will be downloaded and saved as 'archive.zip'. cURL can also be used to download multiple files with a single command, as shown in the example below:

curl -O https://domain.com/file.zip -O https://domain.com/file2.zip

cURL can also be used to download files securely via SSH using the following command:

curl -u user sftp://server.domain.com/path/to/file

Note that you have to use the full path of the file you want to download.

4. Get HTTP header information from a website

You can easily get HTTP header information from any website you want by adding the -I option (capital 'i') to cURL:

curl -I http://domain.com
HTTP/1.1 200 OK
Date: Sun, 16 Oct 2016 23:37:15 GMT
Server: Apache/2.4.23 (Unix)
X-Powered-By: PHP/5.6.24
Connection: close
Content-Type: text/html; charset=UTF-8

5.
Access an FTP server

To access your FTP server with cURL use the following command:

curl ftp://ftp.domain.com --user username:password

cURL will connect to the FTP server and list all files and directories in the user's home directory.

You can download a file via FTP:

curl ftp://ftp.domain.com/file.zip --user username:password

and upload a file to the FTP server:

curl -T file.zip ftp://ftp.domain.com/ --user username:password

You can check the cURL manual page to see all available cURL options and functionalities:

man curl

#### [Feb 20, 2017] Using rsync to back up your Linux system

###### Feb 20, 2017 | opensource.com

Another interesting option, and my personal favorite because it increases the power and flexibility of rsync immensely, is the --link-dest option. The --link-dest option allows a series of daily backups that take up very little additional space for each day and also take very little time to create.

Specify the previous day's target directory with this option and a new directory for today. rsync then creates today's new directory and, in it, a hard link for each file in yesterday's directory. So we now have a bunch of hard links to yesterday's files in today's directory. No new files have been created or duplicated. Just a bunch of hard links have been created. Wikipedia has a very good description of hard links.
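The claim that a hard link adds a second name without duplicating data can be tried locally, without rsync. This is a minimal sketch under stated assumptions: the function name `hardlink_demo` and the directory names yesterday and today are made up for the illustration, and it assumes `mktemp -d` is available and the temporary filesystem supports hard links.

```shell
#!/bin/sh
# hardlink_demo: hypothetical helper that creates one file, hard-links it under
# a second name, and reports whether both names share a single inode.
hardlink_demo() {
    demo=$(mktemp -d) || return 1
    mkdir "$demo/yesterday" "$demo/today"
    echo "unchanged content" > "$demo/yesterday/file.txt"

    # Create a second directory entry for the same data; nothing is copied.
    ln "$demo/yesterday/file.txt" "$demo/today/file.txt"

    # ls -i prints the inode number in front of the file name.
    a=$(ls -i "$demo/yesterday/file.txt" | awk '{print $1}')
    b=$(ls -i "$demo/today/file.txt" | awk '{print $1}')
    rm -rf "$demo"

    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo "same inode"
    else
        echo "different inodes"
    fi
}

hardlink_demo
```

When the two names report the same inode, the file's data exists once on disk, which is why a day of unchanged files costs almost no extra space.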
After creating the target directory for today with this set of hard links to yesterday's target directory, rsync performs its sync as usual, but when a change is detected in a file, the target hard link is replaced by a copy of the file from yesterday and the changes to the file are then copied from the source to the target.

So now our command looks like the following.

rsync -aH --delete --link-dest=yesterdaystargetdir sourcedir todaystargetdir

There are also times when it is desirable to exclude certain directories or files from being synchronized. For this, there is the --exclude option. Use this option and the pattern for the files or directories you want to exclude. You might want to exclude browser cache files so your new command will look like this.

rsync -aH --delete --exclude Cache --link-dest=yesterdaystargetdir sourcedir todaystargetdir

Note that each file pattern you want to exclude must have a separate exclude option.

rsync can sync files with remote hosts as either the source or the target. For the next example, let's assume that the source directory is on a remote computer with the hostname remote1 and the target directory is on the local host. Even though SSH is the default communications protocol used when transferring data to or from a remote host, I always add the ssh option. The command now looks like this.

rsync -aH -e ssh --delete --exclude Cache --link-dest=yesterdaystargetdir remote1:sourcedir todaystargetdir

This is the final form of my rsync backup command.

rsync has a very large number of options that you can use to customize the synchronization process. For the most part, the relatively simple commands that I have described here are perfect for making backups for my personal needs. Be sure to read the extensive man page for rsync to learn about more of its capabilities as well as the options discussed here.
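The final command can be wrapped so that the two target directories are passed in rather than typed by hand. What follows is a sketch, not the author's script: the function name `daily_backup`, the `RSYNC` override variable, the /backups root and the date-named directories are all assumptions made for illustration. By default the function only prints the command it would run.

```shell
#!/bin/sh
# daily_backup SOURCE BACKUP_ROOT TODAY YESTERDAY
# Hypothetical wrapper around the final rsync command from the text.
# RSYNC defaults to "echo rsync", so the command is printed, not executed.
daily_backup() {
    src=$1
    root=$2
    today=$3
    yesterday=$4
    # An absolute --link-dest path avoids ambiguity; rsync resolves a relative
    # one against the destination directory.
    ${RSYNC:-echo rsync} -aH -e ssh --delete --exclude Cache \
        --link-dest="$root/$yesterday" "$src" "$root/$today"
}

daily_backup remote1:sourcedir /backups 2017-02-20 2017-02-19
# prints: rsync -aH -e ssh --delete --exclude Cache --link-dest=/backups/2017-02-19 remote1:sourcedir /backups/2017-02-20
```

Calling it as `RSYNC=rsync daily_backup ...` would run the real transfer; keeping the print-only default makes it easy to inspect the command first.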
#### [Feb 14, 2017] switching from gnu screen to tmux (updated) Linux~ized

##### Ability to watch the other user screen is a very valuable option...

###### Feb 14, 2017 | www.linuxized.com

ed says: June 16, 2010 at 15:15

screen is really cool, and does some things that I've yet to find counterparts to with tmux, such as the -x option:

#### [Feb 12, 2017] HowTo Use rsync For Transferring Files Under Linux or UNIX

###### Feb 12, 2017 | www.cyberciti.biz

So what is unique about the rsync command?

It can perform differential uploads and downloads (synchronization) of files across the network, transferring only data that has changed. The rsync remote-update protocol allows rsync to transfer just the differences between two sets of files across the network connection.

How do I install rsync?

Use any one of the following commands to install rsync. If you are using Debian or Ubuntu Linux, type the following command:

# apt-get install rsync

OR

sudo apt-get install rsync
If you are using Red Hat Enterprise Linux (RHEL) / CentOS 4.x or older version, type the following command:
# up2date rsync 
If you are using RHEL / CentOS 5.x or newer (or Fedora Linux), type the following command:
# yum install rsync 

Always use rsync over ssh

Since rsync itself does not encrypt data in transit, it is recommended that you run rsync over an ssh session. This gives you a secure remote connection. Now let us see some examples of the rsync command.

Common rsync command options
• --delete : delete files that don't exist on sender (system)
• -v : Verbose (try -vv for more detailed information)
• -e "ssh options" : specify the ssh as remote shell
• -a : archive mode
• -r : recurse into directories
• -z : compress file data
Task : Copy file from a local computer to a remote server

Copy file from /www/backup.tar.gz to a remote server called openbsd.nixcraft.in
$ rsync -v -e ssh /www/backup.tar.gz jerry@openbsd.nixcraft.in:~

Output:

Password:
sent 19099 bytes received 36 bytes 1093.43 bytes/sec
total size is 19014 speedup is 0.99

Please note that the symbol ~ indicates the user's home directory (/home/jerry).

Task : Copy file from a remote server to a local computer

Copy file /home/jerry/webroot.txt from a remote server openbsd.nixcraft.in to a local computer's /tmp directory:

$ rsync -v -e ssh jerry@openbsd.nixcraft.in:~/webroot.txt /tmp

Task: Synchronize a local directory with a remote directory

$ rsync -r -a -v -e "ssh -l jerry" --delete /local/webroot openbsd.nixcraft.in:/webroot

Task: Synchronize a remote directory with a local directory

$ rsync -r -a -v -e "ssh -l jerry" --delete openbsd.nixcraft.in:/webroot/ /local/webroot

Task: Synchronize a local directory with a remote rsync server or vice versa

$ rsync -r -a -v --delete rsync://rsync.nixcraft.in/cvs /home/cvs

OR

$ rsync -r -a -v --delete /home/cvs rsync://rsync.nixcraft.in/cvs

Task: Mirror a directory between my "old" and "new" web server/ftp

You can mirror a directory between the "old" (my.old.server.com) and "new" web servers with the following command (assuming that ssh keys are set up for passwordless authentication):
$ rsync -zavrR --delete --links --rsh="ssh -l vivek" my.old.server.com:/home/lighttpd /home/lighttpd

Other options – rdiff and rdiff-backup

The rdiff command uses the rsync algorithm. A utility called rdiff-backup has been created which is capable of maintaining a backup mirror of a file or directory over the network, on another server. rdiff-backup stores incremental rdiff deltas with the backup, with which it is possible to recreate any backup point. Next time I will write about these utilities.

rsync for Windows Server/XP/7/8

Please note if you are using MS-Windows, try any one of the program:

=> Official rsync documentation

#### [Feb 12, 2017] How to Sync Two Apache Web Servers-Websites Using Rsync

###### Feb 12, 2017 | www.tecmint.com
The purpose of mirroring your web server with Rsync is that, if your main web server fails, your backup server can take over and reduce your website's downtime. This way of creating a web server backup is effective for small and medium-sized web businesses.

Advantages of Syncing Web Servers

The main advantages of creating a web server backup with rsync are as follows:

1. Rsync syncs only those bytes and blocks of data that have changed.
2. Rsync has the ability to check and delete those files and directories at backup server that have been deleted from the main web server.
3. It takes care of permissions, ownerships and special attributes while copying data remotely.
4. It also supports SSH protocol to transfer data in an encrypted manner so that you will be assured that all data is safe.
5. Rsync uses compression and decompression method while transferring data which consumes less bandwidth.
How To Sync Two Apache Web Servers

Let's proceed with setting up rsync to create a mirror of your web server. Here, I'll be using two servers.

Main Server
2. Hostname : webserver.example.com
Backup Server
2. Hostname : backup.example.com
Step 1: Install Rsync Tool

In this case, the web server data of webserver.example.com will be mirrored on backup.example.com . To do so, we first need to install Rsync on both servers with the following command.

[root@tecmint]# yum install rsync        [On Red Hat based systems]
[root@tecmint]# apt-get install rsync    [On Debian based systems]

Step 2: Create a User to run Rsync

We could set up rsync with the root user, but for security reasons you can create an unprivileged user on the main web server (i.e. webserver.example.com) to run rsync.

[root@tecmint]# useradd tecmint
[root@tecmint]# passwd tecmint


Here I have created a user " tecmint " and assigned a password to user.

Step 3: Test Rsync Setup

It's time to test your rsync setup on your backup server (i.e. backup.example.com ) and to do so, please type following command.

[root@backup www]# rsync -avzhe ssh tecmint@webserver.example.com:/var/www/ /var/www

Sample Output
tecmint@webserver.example.com's password:
receiving incremental file list
sent 128 bytes  received 32.67K bytes  5.96K bytes/sec
total size is 12.78M  speedup is 389.70


You can see that your rsync is now working absolutely fine and syncing data. I have used " /var/www " to transfer; you can change the folder location according to your needs.

Now we are done with the rsync setup, and it's time to set up a cron job for rsync. Because we are going to use rsync over the SSH protocol, ssh will ask for authentication, and since cron cannot supply a password interactively, the job would fail. For cron to work smoothly, we need to set up passwordless ssh logins for rsync.

Here in this example, I am doing it as root to preserve file ownerships as well, you can do it for alternative users too.

First, we'll generate a public and private key with the following command on the backup server (i.e. backup.example.com ).

[root@backup]# ssh-keygen -t rsa -b 2048


When you run this command, don't provide a passphrase: press Enter at both passphrase prompts so that the rsync cron job will not need a password to sync data.

Sample Output
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
9a:33:a9:5d:f4:e1:41:26:57:d0:9a:68:5b:37:9c:23 root@backup.exmple.com
The key's randomart image is:
+--[ RSA 2048]----+
|          .o.    |
|           ..    |
|        ..++ .   |
|        o=E *    |
|       .Sooo o   |
|       =.o o     |
|      * . o      |
|     o +         |
|    . .          |
+-----------------+


Now our public and private keys have been generated, and we have to share the public key with the main server so that the main web server recognizes this backup machine and allows it to log in without asking for a password while syncing data.

[root@backup html]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@webserver.example.com


Now try logging into the machine with " ssh root@webserver.example.com ", and check .ssh/authorized_keys there.

[root@backup html]# ssh root@webserver.example.com


Now, we are done with sharing keys. To know more in-depth about SSH password less login , you can read our article on it.

Step 5: Schedule Cron To Automate Sync

Let's setup a cron for this. To setup a cron, please open crontab file with the following command.

[root@backup ~]# crontab -e


It will open the current user's crontab (root's, in this example) in your default editor. In this example, I am writing a cron entry that runs every 5 minutes to sync the data.

*/5        *        *        *        *   rsync -avzhe ssh root@webserver.example.com:/var/www/ /var/www/


The above cron entry and rsync command sync " /var/www/ " from the main web server to the backup server every 5 minutes . You can change the schedule and folder location according to your needs.
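To see how cron reads such an entry, it can help to split the line into its five time fields and the command. The helper below is a hypothetical illustration only (`cron_fields` is not a cron tool); it relies on the shell's `read`, which assigns whitespace-separated words to the named variables and puts the remainder of the line into the last one.

```shell
#!/bin/sh
# cron_fields LINE: hypothetical helper that labels the fields of a crontab entry.
cron_fields() {
    printf '%s\n' "$1" | {
        # Five schedule fields, then everything else is the command.
        read -r minute hour dom month dow command
        printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\ncommand=%s\n' \
            "$minute" "$hour" "$dom" "$month" "$dow" "$command"
    }
}

cron_fields '*/5 * * * * rsync -avzhe ssh root@webserver.example.com:/var/www/ /var/www/'
# prints:
# minute=*/5 hour=* day-of-month=* month=* day-of-week=*
# command=rsync -avzhe ssh root@webserver.example.com:/var/www/ /var/www/
```

The `*/5` in the minute field is what makes the job fire every 5 minutes; the remaining asterisks mean every hour, day, month and weekday.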

#### [Feb 12, 2017] How to Use rsync to Synchronize Files Between Servers Linux Server Training 101

###### Feb 12, 2017 | www.youtube.com
soundtraining.net

Great demonstration and very easy to follow Don! Just a note to anyone who might come across this and start using it in production based systems is that you certainly would not want to be rsyncing with root accounts. In addition you would use key based auth with SSH as an additional layer of security. Just my 2cents ;-)

curtis shaw 11 months ago

Best rsync tutorial on the web. Thanks.

#### [Feb 12, 2017] An Easy Way To Monitor A Website From Command Line In Linux

###### OSTechNix

We all know that ping command will tell you instantly whether the website is live or down. Usually, we all check whether a website is up or down like below.

ping ostechnix.com -c 3

Sample output:

PING ostechnix.com (64.90.37.180) 56(84) bytes of data.
64 bytes from ostechnix.com (64.90.37.180): icmp_seq=1 ttl=51 time=376 ms
64 bytes from ostechnix.com (64.90.37.180): icmp_seq=2 ttl=51 time=374 ms

--- ostechnix.com ping statistics ---
3 packets transmitted, 2 received, 33% packet loss, time 2000ms
rtt min/avg/max/mdev = 374.828/375.471/376.114/0.643 ms

But would you run this command every time you want to check whether your website is up or down? You could create a script to check your website status at periodic intervals. But wait, it's not necessary! Here is a simple command that will monitor the site at a regular interval.

watch -n 1 curl -I http://DOMAIN_NAME/

For those who don't know, the watch command runs any command repeatedly at a given interval.
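If only the status matters rather than the full header dump, the same curl request can be reduced to its HTTP status code. This is a sketch under assumptions: the function names `http_status`, `is_up` and `check_site` are invented for the illustration, and treating every 2xx or 3xx code as "up" is a simplification you may want to tighten.

```shell
#!/bin/sh
# http_status URL: print only the numeric HTTP status code of a HEAD request.
# curl prints 000 when it could not get a response at all.
http_status() {
    curl -s -o /dev/null -I -w '%{http_code}' "$1"
}

# is_up CODE: succeed for 2xx and 3xx codes (an assumed definition of "up").
is_up() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# check_site URL: print a one-line verdict for the given URL.
check_site() {
    code=$(http_status "$1")
    if is_up "$code"; then
        echo "$1 is up (HTTP $code)"
    else
        echo "$1 looks down (HTTP $code)"
    fi
}

# Usage (needs network access):
#   check_site https://www.ostechnix.com/
```

Saved as a script that ends with a `check_site` call, this can itself be run under watch in place of the raw `curl -I` command.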

Example:

Let us check if ostechnix.com site is live or down. To do so, run:

watch -n 1 curl -I https://www.ostechnix.com/

Sample output:

Every 1.0s: curl -I https://www.ostechnix.com/ sk: Thu Dec 22 17:37:24 2016

% Total % Received % Xferd Average Speed Time Time Time Current
`