The batch command is one of the three commands that constitute a simple batch scheduler
in Unix. The other two are atq and atrm.
There is also the atrun script, which lets you specify the load threshold below which batch
jobs are allowed to run.
The at utility, best known for running commands at a specified time, also has a queue
feature and can be asked to start running commands now. It reads the command to run from standard input:
echo 'command1 --option arg1 arg2' | at -q myqueue now
echo 'command2 ...' | at -q myqueue now
The batch command is equivalent to at -q b -m now (-m
meaning the command output, if any, will be mailed to you, as cron does). Not all Unix variants
support queue names (-q myqueue); you may be limited to a single queue called b.
Linux's at is limited to single-letter queue names.
By default, batch jobs run when the load average drops below 0.8, or below the value specified
in the invocation of atrun: atrun [-l load_avg] [-d]
The latter is a shell script containing an invocation of /usr/sbin/atd
with the -s option (it exists for compatibility with other Unixes).
In Linux it is extremely primitive, and alternative batch systems should
be used for anything more complex than submitting a series of jobs with equal
By default, batch submits jobs into queue b, which
exists specifically for batch jobs (queue a is used for at command
submissions and queue c for crontab submissions).
Since queue c is reserved for cron jobs, it can not be
used with the -q option.
Execution of submitted jobs can be delayed by limits on the number of jobs allowed
to run concurrently. Submitting jobs into
several queues permits running several job streams in parallel.
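The idea of parallel streams can be sketched as follows. This is a minimal illustration rather than a recommended setup: DRY_RUN, make_job, submit and the streams.log path are hypothetical names invented for the example, and queues d and e are arbitrary single-letter choices. By default the script only prints what it would submit; setting DRY_RUN=0 pipes each job into at, which assumes at is installed and its daemon is running.

```shell
#!/bin/sh
# Sketch: feed two independent job streams into separate at queues.
# DRY_RUN, make_job, submit and streams.log are hypothetical names for this example.
DRY_RUN=${DRY_RUN:-1}

make_job() {
    # Emit the command text that at would read from standard input.
    printf 'echo "stream %s started" >> "$HOME/streams.log"\n' "$1"
}

submit() {
    queue=$1
    if [ "$DRY_RUN" = 1 ]; then
        # Default: only report the command line that would be used.
        echo "would run: at -q $queue now"
    else
        make_job "$queue" | at -q "$queue" now
    fi
}

# Queues d and e are arbitrary; a, b and c already have special roles.
plan=$(for q in d e; do submit "$q"; done)
echo "$plan"
```

Each queue is processed independently, so a long job in one queue does not hold up jobs submitted to the other.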
batch [-V] [-q queue] [-f file]
[-mv] [TIME] executes commands
when system load levels permit; in other words, when the load average
drops below 1.5, or the value specified in the invocation of atd.
-V prints the version number to standard error.
-q queue uses the specified queue. A queue designation
consists of a single letter; valid queue designations range from
a to z and A to Z. The b queue is the
default for batch. Queues with higher letters run with
increased niceness. The special queue "=" is reserved for jobs which
are currently running. If a job is submitted to an uppercase-letter queue, it is treated as if it had been submitted to batch
at the specified time. Once the time is reached, the batch processing rules with respect to
load average apply.
-m Sends mail to the user when the job has completed.
-f file Reads the job from file rather than standard input.
-v Shows the time the job will be executed. Times displayed
will be in the format "1997-02-20 14:50" unless the environment variable
POSIXLY_CORRECT is set; then, it will be "Thu Feb 20 14:50:00
TIME Several keywords and relative time forms are accepted (see the at
command for details):
now - Indicates the current day and time. Invoking at now will submit an at-job
for potentially immediate execution.
midnight - Indicates the time 12:00 am (00:00).
noon - Indicates the time 12:00 pm.
The optional increment after the time specification in the
at command lets you specify an offset from that time. It should be
a number preceded by a plus sign (+) and suffixed by one of the following:
minutes, hours, days, weeks, months, or years.
The spacing is quite flexible as long as there are no ambiguities.
0815am Jan 24
now "+ 1day"
5 pm FRIday
The singular forms are also accepted, for example
now + 1 minute
The keyword next can be used as an equivalent to an increment
of + 1.
The Solaris forms of batch (/usr/bin/batch and /usr/xpg4/bin/batch) are respectively equivalent to:
/usr/bin/at -q b [-p project] now
/usr/xpg4/bin/at -q b -m [-p project] now
At the same time, at is quite a different animal from cron: at
preserves the environment in which it was invoked, while cron does not (it
executes commands in its own "cron" environment, and you should not expect
that PATH and other variables will be preserved).
The at utility is pipable: it can read commands from standard
input and submit a job to be executed immediately (as in the example below)
or at a later time.
echo "perl myjob" | at now
The at-job is executed in a separate invocation of the shell, running
in a separate process group with no controlling terminal, except that the
environment variables, current working directory, file creation mask (see
umask(1)), and system resource limits (for sh and ksh
ulimit(1)) in effect when the at utility is executed are retained
and used when the at-job is executed.
When the at-job is submitted, the at_job_id and scheduled
time are written to standard error. The at_job_id is an identifier
that is a string consisting solely of alphanumeric characters and the period
character. The at_job_id is assigned by the system when the job
is scheduled such that it uniquely identifies a particular job.
User notification and the processing of the job's standard output and
standard error are described under the -m option.
As with cron, two files that list users one per line, similar
to the cron control files, control the behavior of the command:
Users are permitted to use at and batch (see
below) if their name appears in the file /usr/lib/cron/at.allow.
If this file exists, all users who need to use the at command must be explicitly
listed in it. If that file does not exist, the file /usr/lib/cron/at.deny
is checked to determine if the user should be denied access to
at. If neither file exists, only a user with the solaris.jobs.user
authorization is allowed to submit a job.
If only at.deny exists and is empty, global usage is
permitted.
cron and at jobs are not executed if the
user's account is locked. Only accounts which are not locked as defined in
shadow(4) will have their job or process executed. For example,
webservd is a "locked" user under
Solaris 10, and you need to unlock it before it is possible to use at
to submit jobs as this user. In such cases /var/cron/log will
contain an entry like the following:
! bad user (webservd) Fri Apr 21 14:47:49
That can also happen with human accounts if password aging is turned
on. Apparently, if the password expires, cron jobs do not run.
sh(1) is used to execute the at-job.
-f file Specifies the path of a file to be used as the source of the at-job,
instead of standard input.
-l (The letter ell.) Reports all jobs scheduled for the invoking user if
no at_job_id operands are specified. If at_job_ids
are specified, reports only information for these jobs.
-m Sends mail to the invoking user after the at-job has run, announcing
its completion. Standard output and standard error produced by the at-job
are mailed to the user as well, unless redirected elsewhere. Mail is
sent even if the job produces no output.
If -m is
not used, the job's standard output and standard error are provided to
the user by means of mail, unless they are redirected elsewhere; if
there is no such output to provide, the user is not notified of the
job's completion.
-p project Specifies under which project the at or batch job
is run. When used with the -l option, limits the search
to that particular project. Values for project are interpreted
first as a project name, and then as a possible project ID, if entirely
numeric. By default, the user's current project is used.
-q queuename Specifies in which queue to schedule a job for submission. When used
with the -l option, limits the search to that particular
queue. Values for queuename are limited to the lower case
letters a through z. By default, at-jobs are scheduled
in queue a. In contrast, queue b is reserved for batch
jobs. Since queue c is reserved for cron jobs, it cannot be
used with the -q option.
-r at_job_id Removes the jobs with the specified at_job_id operands that
were previously scheduled by the at utility.
-t time Submits the job to be run at the time specified by the time
option-argument, which must have the format as specified by the
The following operands are supported:
at_job_id The name reported by a previous invocation of the at utility
at the time the job was scheduled.
timespec Submit the job to be run at the date and time specified. All of the
timespec operands are interpreted as if they were separated
by space characters and concatenated. The date and time are interpreted
as being in the timezone of the user (as determined by the TZ
variable), unless a timezone name appears as part of time
In the "C" locale, the following describes the three parts
of the time specification string. All of the values from the LC_TIME
categories in the "C" locale are recognized in a case-insensitive manner.
The time can be specified as one, two or four digits.
One- and two-digit numbers are taken to be hours, four-digit numbers
to be hours and minutes. The time can alternatively be specified
as two numbers separated by a colon, meaning hour:minute.
An AM/PM indication (one of the values from the am_pm keywords
in the LC_TIME locale category) can follow the time; otherwise,
a 24-hour clock time is understood. A timezone name of GMT, UCT,
or ZULU (case insensitive) can follow to specify that the time is
in Coordinated Universal Time. Other timezones can be specified
using the TZ environment variable. The time
field can also be one of the following tokens in the "C" locale:
midnight Indicates the time 12:00 am (00:00).
noon Indicates the time 12:00 pm.
now Indicates the current day and time. Invoking at now submits an at-job for potentially immediate execution
(that is, subject only to unspecified scheduling delays).
An optional date can be specified as either a month name
(one of the values from the mon or abmon keywords
in the LC_TIME locale category) followed by a day number
(and possibly year number preceded by a comma) or a day of the week
(one of the values from the day or abday keywords
in the LC_TIME locale category). Two special days are recognized
in the "C" locale:
today Indicates the current day.
tomorrow Indicates the day following the current day.
If no date is given, today is assumed if
the given time is greater than the current time, and tomorrow
is assumed if it is less. If the given month is less than the current
month (and no year is given), next year is assumed.
The optional increment is a number preceded by a plus
sign (+) and suffixed by one of the following: minutes,
hours, days, weeks, months,
or years. (The singular forms are also accepted.) The keyword
next is equivalent to an increment number of + 1.
For example, the following are equivalent commands:
at 2pm + 1 week
at 2pm next week
The format of the at command line shown here is guaranteed
only for the "C" locale. Other locales are not supported for midnight,
noon, now, mon, abmon, day,
abday, today, tomorrow, minutes,
hours, days, weeks, months, years, and next.
Since the commands run in a separate shell invocation, running in a separate
process group with no controlling terminal, open file descriptors, traps
and priority inherited from the invoking environment are lost.
Use the Bash shell in Linux to manage foreground and background processes. You can use Bash's job control functions and
signals to give you more flexibility in how you run commands. We show you how.
All About Processes
Whenever a program is executed in a Linux or Unix-like operating
system, a process is started. "Process" is the name for the internal representation of the executing program in the
computer's memory. There is a process for every active program. In fact, there is a process for nearly everything that
is running on your computer. That includes the components of your graphical desktop environment
(GDE) and the programs that
are launched at start-up.
Why nearly everything that is running? Well, Bash built-ins such as cd, pwd, and alias do
not need to have a process launched (or "spawned") when they are run. Bash executes these commands within the instance
of the Bash shell that is running in your terminal window. These commands are fast precisely because they don't need to
have a process launched for them to execute. (You can type help in
a terminal window to see the list of Bash built-ins.)
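One way to see the distinction from the command line is sketched below. It relies on the standard behavior of command -v, which prints a bare name for something the shell handles itself and a full path for an external program; kind_of is a hypothetical helper name used only for this illustration.

```shell
# kind_of is a hypothetical helper for this example.
# command -v prints a bare name (or an alias definition) for things the shell
# resolves internally, and an absolute path for an external program.
kind_of() {
    location=$(command -v "$1") || { echo "not found"; return; }
    case $location in
        /*) echo "external program at $location" ;;
        *)  echo "handled inside the shell" ;;
    esac
}

cd_kind=$(kind_of cd)
echo "cd: $cd_kind"
echo "sleep: $(kind_of sleep)"
```

Note that the second branch also matches shell functions and aliases, not only built-ins, so treat the result as a rough classification.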
Processes can be running in the foreground, in which case they take
over your terminal until they have completed, or they can be run in the background. Processes that run in the background
don't dominate the terminal window and you can continue to work in it. Or at least, they don't dominate the terminal
window if they don't generate screen output.
A Messy Example
We'll start a simple ping trace
running. We're going to ping the
How-To Geek domain. This will execute as a foreground process.
We get the expected results, scrolling down the terminal window. We
can't do anything else in the terminal window while ping is
running. To terminate the command hit Ctrl+C.
The visible effect of the Ctrl+C is
highlighted in the screenshot. ping gives
a short summary and then stops.
Let's repeat that. But this time we'll hit Ctrl+Z instead of Ctrl+C.
The task won't be terminated. It will become a background task. We get control of the terminal window returned to us.
The visible effect of hitting Ctrl+Z is
highlighted in the screenshot.
This time we are told the process is stopped. Stopped doesn't mean
terminated. It's like a car at a stop sign. We haven't scrapped it and thrown it away. It's still on the road,
stationary, waiting to go. The process is now a background job.
The jobs command will list the jobs
that have been started in the current terminal session. And because jobs are (inevitably) processes,
we can also use the ps command
to see them. Let's use both commands and compare their outputs. We'll use the T
(terminal) option to only list the processes that are running in this terminal window. Note that there is no need to use
a hyphen with the T option.
The jobs command tells us:
[1]: The number in square brackets is the job number. We can use this to refer to the job when we need to control it with
job control commands.
+: The plus sign shows
that this is the job that will be acted upon if we use a job control command without a specific job number. It is
called the default job. The default job is always the one most recently added to the list of jobs.
Stopped: The process is not running.
The rest of the line is the command line that launched the process.
The ps command tells us:
PID: The process ID of the process. Each process has a unique ID.
TTY: The pseudo-teletype (terminal window) that the process was executed from.
STAT: The status of the process.
TIME: The amount of CPU time consumed by the process.
CMD: The command that launched the process.
These are common values for the STAT column:
D: Uninterruptible sleep. The process is in a waiting state, usually waiting for input or output, and cannot be
interrupted.
T: Stopped by a job control signal.
Z: A zombie process. The process has been terminated but hasn't been "cleaned down" by its parent process.
The value in the STAT column can be followed by one of these extra
indicators:
<: High-priority task (not nice to other processes).
N: Low-priority (nice to other processes).
L: The process has pages locked into memory (typically used by real-time processes).
s: A session leader. A session leader is a process that has launched process groups. A shell is a session leader.
+: A foreground process.
We can see that Bash has a state of Ss.
The uppercase "S" tells us the Bash shell is sleeping, and it is interruptible. As soon as we need it, it will respond.
The lowercase "s" tells us that the shell is a session leader.
The ping command has a state of T.
This tells us that ping has
been stopped by a job control signal. In this example, that was the Ctrl+Z we
used to put it into the background.
The ps T
command has a state of R+,
in which the R stands for running. The + indicates
that this process is a member of the foreground group. So the ps T
command is running in the foreground.
The bg Command
The bg command
is used to resume a background process. It can be used with or without a job number. If you use it without a job number
the default job is resumed. The process still runs in the background. You cannot send any input to it.
If we issue the bg command,
we will resume our ping command.
The ping command
resumes and we see the scrolling output in the terminal window once more. The name of the command that has been
restarted is displayed for you. This is highlighted in the screenshot.
But we have a problem. The task is running in the background and
won't accept input. So how do we stop it? Ctrl+C doesn't
do anything. We can see it when we type it but the background task doesn't receive those keystrokes so it keeps pinging.
In fact, we're now in a strange blended mode. We can type in the
terminal window but what we type is quickly swept away by the scrolling output from the ping command.
Anything we type takes effect in the foreground.
To stop our background task we need to bring it to the foreground
and then stop it.
The fg Command
The fg command
will bring a background task into the foreground. Just like the bg command,
it can be used with or without a job number. Using it with a job number means it will operate on a specific job. If it
is used without a job number the last command that was sent to the background is used.
If we type fg, our ping command
will be brought to the foreground. The characters we type are mixed up with the output from the ping command,
but they are operated on by the shell as if they had been entered on the command line as usual. And in fact, from the
Bash shell's point of view, that is exactly what has happened.
And now that we have the ping command
running in the foreground once more, we can use Ctrl+C to kill it.
We Need to Send the Right Signals
That wasn't exactly pretty. Evidently running a process in the
background works best when the process doesn't produce output and doesn't require input.
But, messy or not, our example did accomplish:
Putting a process into the background.
Restoring the process to a running state in the background.
Returning the process to the foreground.
Terminating the process.
When you use Ctrl+C and Ctrl+Z,
you are sending signals to the process. These are shorthand ways
of using the kill command. Use kill -l
at the command line to list the available signals. kill isn't
the only source of these signals. Some of them are raised automatically by other processes within the system.
Here are some of the commonly used ones.
SIGHUP: Signal 1. Automatically sent to a process when the terminal it is running in is closed.
SIGINT: Signal 2. Sent to a process when you hit Ctrl+C.
The process is interrupted and told to terminate.
SIGQUIT: Signal 3. Sent to a process if the user sends a quit signal (Ctrl+\ by default).
SIGKILL: Signal 9. The process is immediately killed and will not attempt to close down cleanly. The process does not go down
gracefully.
SIGTERM: Signal 15. This is the default signal sent by kill.
It is the standard program termination signal.
SIGTSTP: Signal 20. Sent to a process when you use Ctrl+Z.
It stops the process and puts it in the background.
We must use the kill command
to issue signals that do not have key combinations assigned to them.
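The practical difference between signal 15 and signal 9 can be sketched with two identical child shells. This is an illustrative example, not part of the original article: the variable names polite and forced are made up, and the one-second sleeps are only there to give each child time to install its handler. A process may catch signal 15 and clean up, while signal 9 never reaches the handler.

```shell
# Child 1 installs a handler for SIGTERM (15), so it can clean up and exit normally.
sh -c 'trap "echo cleaning up; exit 0" TERM; while :; do sleep 1; done' &
polite=$!
sleep 1                        # give the child time to install its trap
kill -TERM "$polite"
term_status=0
wait "$polite" || term_status=$?

# Child 2 has the same handler but receives SIGKILL (9), which cannot be caught.
sh -c 'trap "echo cleaning up; exit 0" TERM; while :; do sleep 1; done' &
forced=$!
sleep 1
kill -KILL "$forced"
kill_status=0
wait "$forced" || kill_status=$?

echo "SIGTERM exit status: $term_status"   # 0: the handler ran and exited cleanly
echo "SIGKILL exit status: $kill_status"   # 137: 128 + signal number 9
```

The shell reports a process killed by a signal as 128 plus the signal number, which is why the second status is 137.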
Further Job Control
A process moved into the background by using Ctrl+Z is
placed in the stopped state. We have to use the bg command
to start it running again. To launch a program as a running background process is simple. Append an ampersand (&) to
the end of the command line.
Although it is best that background processes do not write to the
terminal window, we're going to use examples that do. We need to have something in the screenshots that we can refer to.
This command will start an endless loop as a background process:
while true; do echo "How-To Geek Loop Process"; sleep 3; done &
We are told the job number and process ID of the process. Our
job number is 1, and the process ID is 1979. We can use these identifiers to control the process.
The output from our endless loop starts to appear in the terminal
window. As before, we can use the command line but any commands we issue are interspersed with the output from the loop.
To stop our process we can use jobs to
remind ourselves what the job number is, and then use kill.
jobs reports that our process is job number 1. To use that number with kill we
must precede it with a percent sign (%).
kill sends the SIGTERM signal,
signal number 15, to the process and it is terminated. When the Enter key is next pressed, a status of the job is shown.
It lists the process as "terminated." If the process does not respond to the kill command
you can take it up a notch. Use kill with SIGKILL,
signal number 9. Just put -9 between the kill command and
the job number.
kill -9 %1
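The same sequence can be scripted by using the process ID instead of the job number. The sketch below is an assumption-laden illustration (sleep 300 is just a stand-in for a long-running task): it captures the PID from $!, uses kill -0 to check whether the process exists without sending a real signal, and then reads the exit status reported by wait.

```shell
sleep 300 &                    # stand-in for any long-running background task
pid=$!                         # $! holds the PID of the most recent background job

# kill -0 sends no signal; it only tests whether the process can be signalled.
if kill -0 "$pid" 2>/dev/null; then before="running"; else before="gone"; fi

kill "$pid"                    # no signal named, so SIGTERM (15) is sent
status=0
wait "$pid" || status=$?       # reap the process and capture its exit status

if kill -0 "$pid" 2>/dev/null; then after="running"; else after="gone"; fi

echo "before=$before after=$after status=$status"
# prints: before=running after=gone status=143
```

A status of 143 is 128 + 15, the shell's way of reporting that the process was ended by SIGTERM.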
Things We've Covered
Ctrl+C: Sends SIGINT, signal 2, to the process -- if it is accepting input -- and tells it to terminate.
Ctrl+\: Sends SIGQUIT, signal 3, to the process -- if it is accepting input -- and tells it to quit.
Ctrl+Z: Sends SIGTSTP, signal 20, to the process and tells it to stop (suspend) and become a background process.
jobs: Lists the background jobs and shows their job number.
bg job_number: Restarts a background process. If you don't provide a job number the last process that was turned into a background
task is used.
fg job_number: Brings a background process into the foreground and restarts it. If you don't provide a job number the last process
that was turned into a background task is used.
commandline &: Adding an ampersand (&) to
the end of a command line executes that command as a running background task.
kill %job_number: Sends SIGTERM, signal 15, to the process to terminate it.
kill -9 %job_number: Sends SIGKILL, signal 9, to the process and terminates it abruptly.
In this quick tutorial, I want to look at the jobs command
and a few of the ways that we can manipulate the jobs running on our systems. In short, controlling jobs lets you
suspend and resume processes started in your Linux shell.
The jobs command
will list all jobs on the system; active, stopped, or otherwise. Before I explore the command and output, I'll create
a job on my system.
I will use the sleep job
as it won't change my system in any meaningful way.
First, I issued the sleep command,
and then I received the job number [1]. I
then immediately stopped the job by using Ctrl+Z.
Next, I run the jobs command
to view the newly created job:
[tcarrigan@rhel ~]$ jobs
[1]+ Stopped sleep 500
You can see that I have a single stopped job
identified by the job number [1].
Other options to know for this command
-l - list PIDs in addition to default info
-n - list only processes that have changed since the last notification
-p - list PIDs only
-r - show only running jobs
-s - show only stopped jobs
Next, I'll resume the sleep job
in the background. To do this, I use the bg command. The bg command
has a pretty simple syntax, as seen here:
bg [JOB_SPEC]
Where JOB_SPEC can be any of the following:
%n - where n
is the job number
%abc - refers to a job started by a command beginning with abc
%?abc - refers to a job started by a command containing abc
%- - specifies the previous job
NOTE: bg and fg operate
on the current job if no JOB_SPEC is provided.
I can move this job to the background by
using the job number [1].
[tcarrigan@rhel ~]$ bg %1
[1]+ sleep 500 &
You can see now that I have a single running
job in the background.
[tcarrigan@rhel ~]$ jobs
[1]+ Running sleep 500 &
Now, let's look at how to move a background
job into the foreground. To do this, I use the fg command.
The command syntax is the same for the foreground command as with the background command.
Refer to the above bullets for details on JOB_SPEC.
I have started a new sleep in the background:
[tcarrigan@rhel ~]$ sleep 500 &
Now, I'll move it to the foreground by using
the following command:
[tcarrigan@rhel ~]$ fg %2
The fg command has now brought my system back into a sleep state.
While I realize that the jobs presented here
were trivial, these concepts can be applied to more than just the sleep command.
If you run into a situation that requires it, you now have the knowledge to move running or stopped jobs from the
foreground to background and back again.
The Task Spooler project allows you
to queue up tasks from the shell for batch execution. Task Spooler is simple to use and requires no
configuration. You can view and edit queued commands, and you can view the output of queued commands at any
time.
Task Spooler has some similarities with other delayed and batch execution projects, such as "at."
While both Task Spooler and at handle multiple queues and allow the execution of commands at a later point, the
at project handles output from commands by emailing the results to the user who queued the command, while Task
Spooler allows you to get at the results from the command line instead. Another major difference is that Task
Spooler is not aimed at executing commands at a specific time, but rather at simply adding to and executing
commands from queues.
The main repositories for Fedora, openSUSE, and Ubuntu do not contain packages for Task Spooler. There are
packages for some versions of Debian, Ubuntu, and openSUSE 10.x available along with the source code on the
project's homepage. In this article I'll use a 64-bit Fedora 9 machine and install version 0.6 of Task Spooler
from source. Task Spooler does not use autotools to build, so to install it, simply run
make; sudo make install. This will install the main Task Spooler command
ts and its manual page into /usr/local.
A simple interaction with Task Spooler is shown below. First I add a new job to the queue and check the
status. As the command is a very simple one, it is likely to have been executed immediately. Executing ts by
itself with no arguments shows the executing queue, including tasks that have completed. I then use
ts -c to get at the stdout of the executed command. The -c option uses cat
to display the
output file for a task. Using ts -i
shows you information about the job. To clear finished jobs
from the queue, use the ts -C
command, not shown in the example.
$ ts echo "hello world"
ID State Output E-Level Times(r/u/s) Command [run=0/1]
6 finished /tmp/ts-out.QoKfo9 0 0.00/0.00/0.00 echo hello world
The -t option operates like
tail -f, showing you the last few lines of output and
continuing to show you any new output from the task. If you would like to be notified when a task has
completed, you can use the -m
option to have the results mailed to you, or you can queue another
command to be executed that just performs the notification. For example, I might add a tar command and want to
know when it has completed. The below commands will create a tarball and use the
notify-send command to create an
unobtrusive popup window on my desktop when the tarball creation is complete. The popup will be dismissed
automatically after a timeout.
$ ts tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
$ ts notify-send "tarball creation" "the long running tar creation process is complete."
ID State Output E-Level Times(r/u/s) Command [run=0/1]
11 finished /tmp/ts-out.O6epsS 0 4.64/4.31/0.29 tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
12 finished /tmp/ts-out.4KbPSE 0 0.05/0.00/0.02 notify-send tarball creation the long... is complete.
Notice in the output above, toward the far right of the header information, the [run=0/1] field.
This tells you that Task Spooler is executing nothing, and can possibly execute one task. Task Spooler allows
you to execute multiple tasks at once from your task queue to take advantage of multicore CPUs. The -S
option allows you to set how many tasks can be executed in parallel from the queue, as shown below.
$ ts -S 2
ID State Output E-Level Times(r/u/s) Command [run=0/2]
6 finished /tmp/ts-out.QoKfo9 0 0.00/0.00/0.00 echo hello world
If you have two tasks that you want to execute with Task Spooler but one depends on the other having already
been executed (and perhaps that the previous job has succeeded too) you can handle this by having one task wait
for the other to complete before executing. This becomes more important on a quad core machine when you might
have told Task Spooler that it can execute three tasks in parallel. The commands shown below create an explicit
dependency, making sure that the second command is executed only if the first has completed successfully, even
when the queue allows multiple tasks to be executed. The first command is queued normally using ts.
I use a subshell to execute the commands by having ts
explicitly start a new bash shell. The
second command uses the -d
option, which tells ts
to execute the command only after
the successful completion of the last command that was appended to the queue. When I first inspect the queue I
can see that the first command (28) is executing. The second command is queued but has not been added to the
list of executing tasks because Task Spooler is aware that it cannot execute until task 28 is complete. The
second time I view the queue, both tasks have completed.
$ ts bash -c "sleep 10; echo hi"
$ ts -d echo there
ID State Output E-Level Times(r/u/s) Command [run=1/2]
28 running /tmp/ts-out.hKqDva bash -c sleep 10; echo hi
29 queued (file) && echo there
ID State Output E-Level Times(r/u/s) Command [run=0/2]
28 finished /tmp/ts-out.hKqDva 0 10.01/0.00/0.01 bash -c sleep 10; echo hi
29 finished /tmp/ts-out.VDtVp7 0 0.00/0.00/0.00 && echo there
$ cat /tmp/ts-out.hKqDva
$ cat /tmp/ts-out.VDtVp7
You can also explicitly set dependencies on other tasks as shown below. Because the ts command
prints the ID of a new task to the console, the first command puts that ID into a shell variable for use in the
second command. The second command passes the task ID of the first task to ts, telling it to wait for the task
with that ID to complete before returning. Because this is joined with the command we wish to execute with the
&& operation, the second command will execute only if the first one has finished
and succeeded.
The first time we view the queue you can see that both tasks are running. The first task will be in the sleep
command that we used explicitly to slow down its execution. The second command will be executing
ts, which will be waiting for the first task to complete. One downside of tracking
dependencies this way is that the second command is added to the running queue even though it cannot do
anything until the first task is complete.
$ FIRST_TASKID=`ts bash -c "sleep 10; echo hi"`
$ ts sh -c "ts -w $FIRST_TASKID && echo there"
ID State Output E-Level Times(r/u/s) Command [run=2/2]
24 running /tmp/ts-out.La9Gmz bash -c sleep 10; echo hi
25 running /tmp/ts-out.Zr2n5u sh -c ts -w 24 && echo there
ID State Output E-Level Times(r/u/s) Command [run=0/2]
24 finished /tmp/ts-out.La9Gmz 0 10.01/0.00/0.00 bash -c sleep 10; echo hi
25 finished /tmp/ts-out.Zr2n5u 0 9.47/0.00/0.01 sh -c ts -w 24 && echo there
$ ts -c 24
$ ts -c 25
Task Spooler allows you to convert a shell command to a queued command by simply prepending ts
to the command line. One major advantage of using ts over something like the at
command is that
you can effectively run tail -f
on the output of a running task and also get at the output of
completed tasks from the command line. The utility's ability to execute multiple tasks in parallel is very
handy if you are running on a multicore CPU. Because you can explicitly wait for a task, you can set up very
complex interactions where you might have several tasks running at once and have jobs that depend on multiple
other tasks to complete successfully before they can execute.
Because you can make explicitly dependent tasks take up slots in the actively running task queue, you can
effectively delay the execution of the queue until a time of your choosing. For example, if you queue up a task
that waits for a specific time before returning successfully and have a small group of other tasks that are
dependent on this first task to complete, then no tasks in the queue will execute until the first task
completes.
Run the commands listed in the 'my-at-jobs.txt' file at 1:35 AM. All output from the job will be mailed to the
user running the task. When this command has been successfully entered you should receive a prompt similar to the example below:
commands will be executed using /bin/sh
job 1 at Wed Dec 24 00:22:00 2014
at -l
This command will list each of the scheduled jobs in a format like the following:
1 Wed Dec 24 00:22:00 2003
...this is the same as running the command atq.
at -r 1
Deletes job 1. This command is the same as running the command atrm 1.
atrm 23
Deletes job 23. This command is the same as running the command at -r 23.
But processing each line until the command is finished then moving to the next one is very
time consuming, I want to process for instance 20 lines at once then when they're finished
another 20 lines are processed.
I thought of wget LINK1 >/dev/null 2>&1 & to send the command
to the background and carry on, but there are 4000 lines here this means I will have
performance issues, not to mention being limited in how many processes I should start at the
same time so this is not a good idea.
One solution that I'm thinking of right now is checking whether one of the commands is
still running or not, for instance after 20 lines I can add this loop:
Of course in this case I will need to append & to the end of the line! But I'm feeling
this is not the right way to do it.
So how do I actually group each 20 lines together and wait for them to finish before going
to the next 20 lines, this script is dynamically generated so I can do whatever math I want
on it while it's being generated, but it DOES NOT have to use wget, it was just an example so
any solution that is wget specific is not gonna do me any good.
wait is the right answer here, but your while [ $(ps would be much
better written while pkill -0 $KEYWORD – using proctools that is, for legitimate reasons to
check if a process with a specific name is still running. – kojiro
Oct 23 '13 at 13:46
I think this question should be re-opened. The "possible duplicate" QA is all about running a
finite number of programs in parallel. Like 2-3 commands. This question, however, is
focused on running commands in e.g. a loop. (see "but there are 4000 lines"). –
Jan 11 at 19:01
@VasyaNovikov Have you read all the answers to both this question and the
duplicate? Every single answer to this question here, can also be found in the answers to the
duplicate question. That is precisely the definition of a duplicate question. It makes
absolutely no difference whether or not you are running the commands in a loop. –
Jan 11 at 23:08
I recommend reopening this question because its answer is clearer, cleaner, better, and much
more highly upvoted than the answer at the linked question, though it is three years more
recent. – Dan Nissenbaum
Apr 20 at 15:35
Wait until the child process specified by each process ID pid or job specification
jobspec exits and return the exit status of the last command waited for. If a job spec is
given, all processes in the job are waited for. If no arguments are given, all currently
active child processes are waited for, and the return status is zero. If neither jobspec
nor pid specifies an active child process of the shell, the return status is 127.
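As a minimal sketch of the behaviour quoted above: start a group of background jobs, then call wait with no arguments so the shell blocks until all of them have exited. The helper name run_in_batches and the batch size are illustrative assumptions, not part of the original thread.

```bash
#!/usr/bin/env bash
# Sketch: run commands read from stdin in groups of N, waiting for each
# group to finish before starting the next one.
# run_in_batches is a hypothetical helper name.
run_in_batches() {
    local batch_size=$1 count=0 cmd
    while IFS= read -r cmd; do
        [ -z "$cmd" ] && continue          # skip blank lines
        bash -c "$cmd" </dev/null &        # start the job in the background
        count=$((count + 1))
        if [ "$count" -ge "$batch_size" ]; then
            wait                           # no arguments: wait for all children
            count=0
        fi
    done
    wait                                   # wait for the final, partial group
}
```

It could be called as run_in_batches 20 < commands.txt, where commands.txt holds one command per line. Each group takes as long as its slowest member, so slots sit idle while the last job of a group finishes.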
Unless you're sure that each process will finish at the exact same time, this is a bad idea.
You need to start up new jobs to keep the current total jobs at a certain cap... parallel is the answer.
Jul 18 '14 at 17:26
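The comment above asks for a fixed cap on running jobs, with a new job started as soon as any running one exits. A minimal sketch of that in plain bash, assuming bash 4.3 or later for wait -n; the helper name run_with_cap is hypothetical. GNU parallel or xargs -P provide the same behaviour without hand-written loops.

```bash
#!/usr/bin/env bash
# Sketch: keep at most max_jobs commands running at once. As soon as one
# exits, the next command from stdin is started.
# Requires bash 4.3+ for `wait -n`. run_with_cap is a hypothetical name.
run_with_cap() {
    local max_jobs=$1 cmd
    while IFS= read -r cmd; do
        [ -z "$cmd" ] && continue
        # If the cap is reached, block until any one job exits.
        while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do
            wait -n
        done
        bash -c "$cmd" </dev/null &
    done
    wait    # wait for whatever is still running
}
```

Usage would look like run_with_cap 20 < commands.txt.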
I've tried this but it seems that variable assignments done in one block are not available in
the next block. Is this because they are separate processes? Is there a way to communicate
the variables back to the main process? – Bobby
Apr 27 '17 at 7:55
Basically, Convert.py will read in a small json file (4kb), do some
processing and write to another 4kb file. I am running on a server with 40 CPU cores. And no
other CPU-intense process is running on this server.
By monitoring htop (btw, is there any other good way to monitor the CPU performance?), I
find that -P 40 is not as fast as expected. Sometimes all cores will freeze and
decrease almost to zero for 3-4 seconds, then will recover to 60-70%. Then I try to decrease
the number of parallel processes to -P 20-30, but it's still not very fast. The
ideal behavior should be linear speed-up. Any suggestions for the parallel usage of xargs?
You are most likely hit by I/O: The system cannot read the files fast enough. Try starting
more than 40: This way it will be fine if some of the processes have to wait for I/O. –
Apr 19 '15 at 8:45
I second @OleTange. That is the expected behavior if you run as many processes as you have
cores and your tasks are IO bound. First the cores will wait on IO for their task (sleep),
then they will process, and then repeat. If you add more processes, then the additional
processes that currently aren't running on a physical core will have kicked off parallel IO
operations, which will, when finished, eliminate or at least reduce the sleep periods on your
cores. – PSkocik
Apr 19 '15 at 11:41
1- Do you have hyperthreading enabled? 2- in what you have up there, log.txt is actually
overwritten with each call to convert.py ... not sure if this is the intended behavior or
not. – Bichoy
Apr 20 '15 at 3:32
I'd be willing to bet that your problem is python. You didn't say what kind of processing is
being done on each file, but assuming you are just doing in-memory processing of the data,
the running time will be dominated by starting up 30 million python virtual machines.
If you can restructure your python program to take a list of files, instead of just one,
you will get a huge improvement in performance. You can then still use xargs to further
improve performance. For example, 40 processes, each processing 1000 files:
This isn't to say that python is a bad/slow language; it's just not optimized for startup
time. You'll see this with any virtual machine-based or interpreted language. Java, for
example, would be even worse. If your program was written in C, there would still be a cost
of starting a separate operating system process to handle each file, but it would be much
From there you can fiddle with -P to see if you can squeeze out a bit more
speed, perhaps by increasing the number of processes to take advantage of idle processors
while data is being read/written.
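The command that accompanied this answer is not preserved here. The following is only a sketch of the batching idea it describes: xargs -n sets how many file names each worker invocation receives and -P sets how many invocations run at once. The helper name run_batched, the ./data directory and the *.json pattern are assumptions for illustration, and the worker must accept several file names on its command line.

```bash
#!/usr/bin/env bash
# Sketch: hand files to a worker in batches so that one worker process
# handles many files instead of one.
#   $1 = directory to search, $2 = files per invocation (xargs -n),
#   $3 = parallel invocations (xargs -P), remaining args = worker command.
# -print0, -0 and -P are GNU/BSD extensions, not POSIX.
# run_batched is a hypothetical helper name.
run_batched() {
    local dir=$1 per_batch=$2 procs=$3
    shift 3
    find "$dir" -type f -name '*.json' -print0 |
        xargs -0 -n "$per_batch" -P "$procs" "$@"
}
```

For the case in the answer this might be invoked as run_batched ./data 1000 40 python Convert.py, provided Convert.py has been changed to loop over all of its arguments.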
What is the constraint on each job? If it's I/O you can probably get away with
multiple jobs per CPU core up till you hit the limit of I/O, but if it's CPU intensive, it's
going to be worse than pointless running more jobs concurrently than you have CPU cores.
My understanding of these things is that GNU Parallel would give you better control over
the queue of jobs etc.
As others said, check whether you're I/O-bound. Also, xargs' man page suggests using
-n with -P, and you don't mention the number of
Convert.py processes you see running in parallel.
As a suggestion, if you're I/O-bound, you might try using an SSD block device, or try
doing the processing in a tmpfs (of course, in this case you should check for enough memory,
avoiding swap due to tmpfs pressure (I think), and the overhead of copying the data to it in
the first place).
I want the ability to schedule commands to be run in a FIFO queue. I DON'T want them to be
run at a specified time in the future as would be the case with the "at" command. I want them
to start running now, but not simultaneously. The next scheduled command in the queue should
be run only after the first command finishes executing. Alternatively, it would be nice if I
could specify a maximum number of commands from the queue that could be run simultaneously;
for example if the maximum number of simultaneous commands is 2, then only at most 2 commands
scheduled in the queue would be taken from the queue in a FIFO manner to be executed, the
next command in the remaining queue being started only when one of the currently 2 running
I've heard task-spooler could do something like this, but this package doesn't appear to
be well supported/tested and is not in the Ubuntu standard repositories (Ubuntu being what
I'm using). If that's the best alternative then let me know and I'll use task-spooler,
otherwise, I'm interested to find out what's the best, easiest, most tested, bug-free,
canonical way to do such a thing with bash.
Simple solutions like ; or && from bash do not work. I need to schedule these
commands from an external program, when an event occurs. I just don't want to have hundreds
of instances of my command running simultaneously, hence the need for a queue. There's an
external program that will trigger events where I can run my own commands. I want to handle
ALL triggered events, I don't want to miss any event, but I also don't want my system to
crash, so that's why I want a queue to handle my commands triggered from the external
That will list the directory. Only after ls has run it will run touch
test which will create a file named test. And only after that has finished it will run
the next command. (In this case another ls which will show the old contents and
the newly created file).
Similar commands are || and && .
; will always run the next command.
&& will only run the next command if the first returned success.
Example: rm -rf *.mp3 && echo "Success! All MP3s deleted!"
|| will only run the next command if the first command returned a failure
(non-zero) return value. Example: rm -rf *.mp3 || echo "Error! Some files could not be
deleted! Check permissions!"
If you want to run a command in the background, append an ampersand (&).
Example: make bzimage & mp3blaster sound.mp3 & make mytestsoftware ; ls ; firefox ; make clean
This will run two commands in the background (in this case a kernel build, which will take some
time, and a program to play some music). In the foreground it runs another compile job
and, once that is finished, ls, firefox and a make clean (all sequentially).
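The separators described above can be tried safely with true and false standing in for commands that succeed and fail. This is only an illustrative sketch; the function name demo_separators is made up for the example.

```bash
#!/usr/bin/env bash
# Sketch of the command separators, using `true` and `false` as stand-ins
# for a command that succeeds and one that fails.
demo_separators() {
    false ; echo "semicolon: runs regardless"
    true  && echo "and: runs because the first command succeeded"
    false && echo "and: never printed"
    false || echo "or: runs because the first command failed"
    true  || echo "or: never printed"
    sleep 1 &                        # background job; the shell continues at once
    echo "background job started"
    wait                             # block until the background job exits
}
```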
For more details, see man bash
[Edit after comment]
in pseudo code, something like this?
While( queue not empty )
    run next command from the queue.
    remove this command from the queue.
    // If commands were added to the queue during execution then
    // the queue is not empty, keep processing them all.
// Queue is now empty, returning to wait_for_a_signal

// Wait forever on commands and add them to a queue
// Signal run_queue when something gets added.
Append command to queue
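One possible way to turn the pseudocode above into bash is a spool file guarded by a lock directory. This is a sketch under stated assumptions, not the answer author's implementation: QUEUE_FILE, lock_queue, unlock_queue, enqueue and run_queue are all hypothetical names, the lock relies on mkdir being atomic, and a stale lock left by a killed process is not handled.

```bash
#!/usr/bin/env bash
# Sketch: a FIFO command queue kept in a plain file, one command per line.
# All names here are hypothetical. Stale locks are not cleaned up.
QUEUE_FILE=${QUEUE_FILE:-$(mktemp)}

# mkdir either creates the directory or fails, so it can serve as a lock.
lock_queue()   { until mkdir "$QUEUE_FILE.lock" 2>/dev/null; do sleep 1; done; }
unlock_queue() { rmdir "$QUEUE_FILE.lock"; }

enqueue() {
    # Append one command line to the end of the queue.
    lock_queue
    printf '%s\n' "$*" >> "$QUEUE_FILE"
    unlock_queue
}

run_queue() {
    local cmd
    while true; do
        lock_queue
        if [ ! -s "$QUEUE_FILE" ]; then
            unlock_queue
            return 0                 # queue is empty, go back to waiting
        fi
        IFS= read -r cmd < "$QUEUE_FILE"                 # take the first line
        tail -n +2 "$QUEUE_FILE" > "$QUEUE_FILE.tmp"     # drop it from the queue
        mv "$QUEUE_FILE.tmp" "$QUEUE_FILE"
        unlock_queue
        bash -c "$cmd"               # run outside the lock so producers can append
    done
}
```

The external program would call enqueue for every event, while a single worker calls run_queue in a loop (for example with a short sleep between passes). Commands added while a job is running are picked up on the next iteration, as in the pseudocode.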
The easiest way would be to simply run the commands sequentially:
cmd1; cmd2; cmd3; cmdN
If you want the next command to run only if the previous command exited
successfully, use && :
cmd1 && cmd2 && cmd3 && cmdN
That is the only bash native way I know of doing what you want. If you need job control
(setting a number of parallel jobs etc), you could try installing a queue manager such as
TORQUE, but that seems like overkill if all you want to do is launch jobs sequentially.
task spooler is a Unix batch system where the spooled tasks run one after the other. The
number of jobs to run at once can be set at any time. Each user on each system has their own
job queue. The tasks are run in the correct context (that of enqueue) from any shell/process,
and their output/results can be easily watched. It is very useful when you know that your
commands depend on a lot of RAM, a lot of disk use, give a lot of output, or for whatever
reason it's better not to run them all at the same time, while you want to keep your
resources busy for maximum benefit. Its interface allows using it easily in scripts.
Alessandro Öhler once maintained a mailing list for discussing newer functionalities
and interchanging use experiences. I think this doesn't work anymore, but you can
look at the old archive or even try to subscribe.
The queue is maintained by a server process. This server process is started if it isn't
there already. The communication goes through a unix socket, usually in /tmp/.
When the user requests a job (using a ts client), the client waits for the server message to
know when it can start. When the server allows starting, this client usually forks and runs
the command with the proper environment, because the client runs the job and not
the server, unlike in 'at' or 'cron'. So the ulimits, environment, pwd, etc. apply.
When the job finishes, the client notifies the server. At this time, the server may notify
any waiting client, and stores the output and the errorlevel of the finished job.
Moreover, the client can take advantage of much information from the server: when a job
finishes, where the job output goes, etc.
Eric Keller wrote a nodejs web server showing the status of the task spooler queue.
Look at its manpage (v0.6.1). Here you also
have a copy of the help for the same version:
usage: ./ts [action] [-ngfmd] [-L <lab>] [cmd...]
Env vars:
  TS_SOCKET       the path to the unix socket used by the ts command.
  TS_MAILTO       where to mail the result (on -m). Local user by default.
  TS_MAXFINISHED  maximum finished jobs in the queue.
  TS_ONFINISH     binary called on job end (passes jobid, error, outfile, command).
  TS_ENV          command called on enqueue. Its output determines the job information.
  TS_SAVELIST     filename which will store the list, if the server dies.
  TS_SLOTS        amount of jobs which can run at once, read on server start.
Actions:
  -K          kill the task spooler server
  -C          clear the list of finished jobs
  -l          show the job list (default action)
  -S [num]    set the number of max simultanious jobs of the server.
  -t [id]     tail -f the output of the job. Last run if not specified.
  -c [id]     cat the output of the job. Last run if not specified.
  -p [id]     show the pid of the job. Last run if not specified.
  -o [id]     show the output file. Of last job run, if not specified.
  -i [id]     show job information. Of last job run, if not specified.
  -s [id]     show the job state. Of the last added, if not specified.
  -r [id]     remove a job. The last added, if not specified.
  -w [id]     wait for a job. The last added, if not specified.
  -u [id]     put that job first. The last added, if not specified.
  -U <id-id>  swap two jobs in the queue.
  -h          show this help
  -V          show the program version
Options adding jobs:
  -n          don't store the output of the command.
  -g          gzip the stored output (if not -n).
  -f          don't fork into background.
  -m          send the output by e-mail (uses sendmail).
  -d          the job will be run only if the job before ends well
  -L <lab>    name this task with a label, to be distinguished on listing.
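A short usage sketch built only from the options listed in the help text above. TS_BIN and ts_demo are hypothetical names introduced for this example: the binary is called ts upstream, but some distributions install it under another name (Debian packages it as tsp), so the sketch keeps the name in a variable.

```bash
#!/usr/bin/env bash
# Sketch: queue a few jobs with task spooler, using only flags from the
# help text above. TS_BIN and ts_demo are hypothetical names.
TS_BIN=${TS_BIN:-ts}

ts_demo() {
    "$TS_BIN" -S 2                   # allow at most two jobs to run at once
    "$TS_BIN" -L first sleep 5       # enqueue a labelled job; returns immediately
    "$TS_BIN" sleep 5                # enqueue a second job
    "$TS_BIN" -d echo "runs only if the previous job ended well"
    "$TS_BIN" -w                     # wait for the last job added
    "$TS_BIN" -l                     # show the job list and job states
}
```

Calling ts_demo with task spooler installed should queue the three jobs and block until the last one has finished.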
To Raúl Salinas, for his inspiring ideas
To Alessandro Öhler, the first non-acquaintance user, who proposed and created the mailing list.
To Пантюхину, who created the BSD port.
To the useful, although sometimes uncomfortable, UNIX interface.
To Alexander V. Inyukhin, for the debian packages.
To Pascal Bleser, for the SuSE packages.
To Sergio Ballestrero, who sent code and motivated the development of a multislot version
To GNU, an ugly but working and helpful ol' UNIX implementation.
After some research I found out that Solaris logs crontab messages
to /var/cron/log (which is actually pretty predictable logging for Solaris).
The log entries for the updating of the sunfreeware mirror looked something like this:
> CMD: /usr/local/bin/update-sunfreeware
> ftp 21022 c Fri Jul 30 06:00:00 2004
! bad user (ftp) Fri Jul 30 06:00:00 2004
So cron is complaining about a bad user here. The user does exist,
with entries in /etc/passwd and /etc/shadow. But wait, the
FTP account is locked. I considered that normal behaviour, but guess
what: cron expects a usable password entry there, otherwise the account
is treated as a bad user!