Softpanorama |
May the source be with you, but remember the KISS principle ;-)
From user-space, processes are represented by process identifiers (PIDs). From the user's perspective, a PID is a numeric value that uniquely identifies the process. A PID doesn't change during the life of a process, but PIDs can be reused after a process dies, so it's not always ideal to cache them.
In user-space, you can create processes in any of several ways. You can execute a program (which results in the creation of a new process) or, within a program, you can invoke a fork or exec system call. The fork call results in the creation of a child process, while an exec call replaces the current process context with the new program. I discuss each of these methods to understand how they work.
For this article, I build the description of processes by first showing the kernel representation of processes and how they're managed in the kernel, then review the various means by which processes are created and scheduled on one or more processors, and finally, what happens if they die.
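To make the fork/exec split concrete, here is a minimal user-space sketch. The helper name run_program is illustrative, not something from the article; it only assumes the standard POSIX calls fork, execv, and waitpid.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program at `path` with arguments `argv` in a child process.
 * Returns the child's exit status (0-255), or -1 if the child could not
 * be created or did not exit normally. run_program is a hypothetical name.
 */
int run_program(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = fork();              /* creates the child process */
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: replace the current process context with the new program. */
        execv(path, argv);
        _exit(127);                  /* reached only if execv failed */
    }

    /* Parent: fork returned the child's PID, so wait for that child. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_program("/bin/sh", args) with args set to { "sh", "-c", "exit 7", NULL } should return 7 on a system that has /bin/sh. The value 127 for a failed exec follows the shell convention and is a choice of this sketch.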
Read more by Tim Jones on developerWorks
Within the Linux kernel, a process is represented by a rather large structure called task_struct. This structure contains all of the necessary data to represent the process, along with a plethora of other data for accounting and to maintain relationships with other processes (parents and children). A full description of the task_struct is beyond the scope of this article, but a portion of task_struct is shown in Listing 1. This code contains the specific elements this article explores. Note that task_struct resides in ./linux/include/linux/sched.h.
Listing 1. A small portion of task_struct
struct task_struct {
    volatile long state;
    void *stack;
    unsigned int flags;

    int prio, static_prio;

    struct list_head tasks;

    struct mm_struct *mm, *active_mm;

    pid_t pid;
    pid_t tgid;

    struct task_struct *real_parent;

    char comm[TASK_COMM_LEN];

    struct thread_struct thread;

    struct files_struct *files;

    ...
};
In Listing 1, you can see several items that you'd expect, such as the state of execution, a stack, a set of flags, the parent process, the thread of execution (of which there can be many), and open files. I explore these later in the article but will introduce a few here.
The state variable is a set of bits that indicate the state of the task. The most common states indicate that the process is running or in a run queue about to be running (TASK_RUNNING), sleeping (TASK_INTERRUPTIBLE), sleeping and not wakeable by signals (TASK_UNINTERRUPTIBLE), stopped (TASK_STOPPED), or a few others. A complete list of these state values is available in ./linux/include/linux/sched.h.
The flags word defines a large number of indicators, covering everything from whether the process is being created (PF_STARTING) or exiting (PF_EXITING) to whether the process is currently allocating memory (PF_MEMALLOC). The name of the executable (excluding the path) occupies the comm (command) field.
Each process is also given a priority (called static_prio), but the actual priority of the process is determined dynamically based on load and other factors. The lower the priority value, the higher its actual priority.
The tasks field provides the linked-list capability. It contains a prev pointer (pointing to the previous task) and a next pointer (pointing to the next task).
The process's address space is represented by the mm and active_mm fields. The mm represents the process's memory descriptors, while the active_mm is the previous process's memory descriptors (an optimization to improve context switch times).
Finally, the thread_struct identifies the stored state of the process. This element depends on the particular architecture on which Linux is running, but you can see an example of this in ./linux/include/asm-i386/processor.h. In this structure, you'll find the storage for the process when it is switched from the executing context (hardware registers, program counter, and so on).
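Several of these fields are visible from user-space without a kernel module. The sketch below assumes the /proc/<pid>/stat layout documented in proc(5), which begins with the pid, the comm in parentheses, the state character, and the parent's pid. The struct and function names are illustrative only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* User-space view of a few task_struct fields (names are hypothetical). */
struct task_view {
    pid_t pid;       /* corresponds to task_struct.pid */
    pid_t ppid;      /* pid of the parent task */
    char state;      /* one-letter state, e.g. 'R' or 'S' */
    char comm[64];   /* executable name, as in task_struct.comm */
};

/* Fill *out from /proc/<pid>/stat. Returns 0 on success, -1 on error. */
int read_task_view(pid_t pid, struct task_view *out)
{
    assert(out != NULL);

    char path[64], line[1024];
    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    char *ok = fgets(line, sizeof line, fp);
    fclose(fp);
    if (ok == NULL)
        return -1;

    /* comm may itself contain spaces or ')', so use the last ')'. */
    char *lparen = strchr(line, '(');
    char *rparen = strrchr(line, ')');
    if (lparen == NULL || rparen == NULL || rparen < lparen)
        return -1;

    size_t len = (size_t)(rparen - lparen - 1);
    if (len >= sizeof out->comm)
        len = sizeof out->comm - 1;
    memcpy(out->comm, lparen + 1, len);
    out->comm[len] = '\0';

    int pid_field, ppid_field;
    if (sscanf(line, "%d", &pid_field) != 1)
        return -1;
    if (sscanf(rparen + 1, " %c %d", &out->state, &ppid_field) != 2)
        return -1;

    out->pid = (pid_t)pid_field;
    out->ppid = (pid_t)ppid_field;
    return 0;
}

/* Convenience wrapper for the calling process. */
int read_own_task_view(struct task_view *out)
{
    return read_task_view(getpid(), out);
}
```

After a successful read_own_task_view(&tv), tv.pid should equal getpid() and tv.ppid should equal getppid(), while tv.comm holds the executable name without its path.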
Maximum processes
Although processes are dynamically allocated within Linux, certain maximums are observed. The maximum is represented in the kernel by a symbol called max_threads, which can be found in ./linux/kernel/fork.c. You can change this value from user-space through the proc file system at /proc/sys/kernel/threads-max.
Now, let's explore how you manage processes within Linux. In most cases, processes are dynamically created and represented by a dynamically allocated task_struct. One exception is the init process itself, which always exists and is represented by a statically allocated task_struct. You can see an example of this in ./linux/arch/i386/kernel/init_task.c.
All processes in Linux are collected in two different ways. The first is a hash table, which is hashed by the PID value; the second is a circular doubly linked list. The circular list is ideal for iterating through the task list. As the list is circular, there's no head or tail; but as the init_task always exists, you can use it as an anchor point to iterate further. Let's look at an example of this to walk through the current set of tasks.
The task list is not accessible from user-space, but you can easily solve that problem by inserting code into the kernel in the form of a module. A very simple program is shown in Listing 2 that iterates the task list and provides a small amount of information about each task (name, pid, and parent name). Note here that the module uses printk to emit the output. To view the output, you need to view the /var/log/messages file with the cat utility (or tail -f /var/log/messages in real time). The next_task function is a macro in sched.h that simplifies the iteration of the task list (returns a task_struct reference of the next task).
Listing 2. Simple kernel module to emit task information (procsview.c)
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/sched.h>

int init_module( void )
{
  /* Set up the anchor point */
  struct task_struct *task = &init_task;

  /* Walk through the task list, until we hit the init_task again */
  do {

    printk( KERN_INFO "*** %s [%d] parent %s\n",
            task->comm, task->pid, task->parent->comm );

  } while ( (task = next_task(task)) != &init_task );

  return 0;
}

void cleanup_module( void )
{
  return;
}
You can compile this module with the Makefile shown in Listing 3. When compiled, you can insert the kernel object with insmod procsview.ko and remove it with rmmod procsview.
Listing 3. Makefile to build the kernel module
obj-m += procsview.o

KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
After insertion, /var/log/messages displays output as shown below. You can see here the idle task (called swapper) and the init task (pid 1).
Nov 12 22:19:51 mtj-desktop kernel: [8503.873310] *** swapper [0] parent swapper
Nov 12 22:19:51 mtj-desktop kernel: [8503.904182] *** init [1] parent swapper
Nov 12 22:19:51 mtj-desktop kernel: [8503.904215] *** kthreadd [2] parent swapper
Nov 12 22:19:51 mtj-desktop kernel: [8503.904233] *** migration/0 [3] parent kthreadd
...
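The do/while walk in Listing 2 works because the task list is circular and init_task serves as a fixed anchor. The following user-space sketch reproduces that pattern with a hand-rolled ring. It is not the kernel's list_head implementation, and the names node, node_add_before, and ring_collect are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A ring node standing in for a task; pid is the only payload. */
struct node {
    struct node *next, *prev;
    int pid;
};

/* A node that points at itself is a valid one-element circular list. */
void node_init(struct node *n, int pid)
{
    n->next = n->prev = n;
    n->pid = pid;
}

/* Link n in just before anchor, which is the "end" of the ring. */
void node_add_before(struct node *anchor, struct node *n)
{
    n->prev = anchor->prev;
    n->next = anchor;
    anchor->prev->next = n;
    anchor->prev = n;
}

/*
 * Visit every node exactly once, starting at anchor and stopping when the
 * walk arrives back at anchor. Stores up to max pids and returns the total
 * number of nodes in the ring.
 */
int ring_collect(const struct node *anchor, int *pids, int max)
{
    assert(anchor != NULL && pids != NULL);

    int count = 0;
    const struct node *n = anchor;
    do {
        if (count < max)
            pids[count] = n->pid;
        count++;
    } while ((n = n->next) != anchor);
    return count;
}
```

A do/while loop is used rather than a while loop because the anchor itself must be visited before the termination test can compare against it, which is the same reason Listing 2 is written that way.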
Note that it's also possible to identify the currently running task. Linux maintains a symbol called current that is the currently running process (of type task_struct). If at the end of init_module you add the line:
printk( KERN_INFO "Current task is %s [%d]\n", current->comm, current->pid );
you would see:
Nov 12 22:48:45 mtj-desktop kernel: [10233.323662] Current task is insmod [6538]
Note that the current task is insmod, because the init_module function executes within the context of the execution of the insmod command. The current symbol actually refers to a function (get_current) and can be found in an arch-specific header (for example, ./linux/include/asm-i386/current.h).
System call functions
You've probably seen a pattern with the system calls. In many cases, system calls are named sys_* and provide some of the initial functionality to implement the call (such as error checking or user-space activities). The real work is often delegated to another function called do_*.
So, let's walk through the creation of a process from user-space. The underlying mechanism is the same for user-space tasks and kernel tasks, as both eventually rely on a function called do_fork to create the new process. In the case of creating a kernel thread, the kernel calls a function called kernel_thread (see ./linux/arch/i386/kernel/process.c), which performs some initialization, then calls do_fork.
A similar action occurs for user-space process creation. In user-space, a program calls fork, which results in a system call to the kernel function called sys_fork (see ./linux/arch/i386/kernel/process.c). The function relationships are shown graphically in Figure 1.
Figure 1. Function hierarchy for process creation
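The hop from a user-space call into a sys_* entry point can be observed directly with the syscall(2) wrapper that glibc provides on Linux. The sketch below uses getpid rather than fork because it has no side effects. The name raw_getpid is illustrative, and the sketch assumes a Linux system where SYS_getpid is defined.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Enter the kernel by system call number instead of going through the
 * usual C library wrapper. raw_getpid is a hypothetical helper name.
 */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

raw_getpid() and the ordinary getpid() should report the same value, since both end up in the same kernel entry point for the calling process.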
From Figure 1, you can see that do_fork provides the basis for process creation. You can find the do_fork function in ./linux/kernel/fork.c (along with the partner function, copy_process).
The do_fork function begins with a call to alloc_pidmap, which allocates a new PID. Next, do_fork checks to see whether the debugger is tracing the parent process. If it is, the CLONE_PTRACE flag is set in the clone_flags in preparation for forking. The do_fork function then continues with a call to copy_process, passing the flags, stack, registers, parent process, and newly allocated PID.
The copy_process function is where the new process is created as a copy of the parent. This function performs all actions except for starting the process, which is handled later. The first step in copy_process is validation of the CLONE flags to ensure that they're consistent. If they're not, an EINVAL error is returned. Next, the Linux Security Module (LSM) is consulted to see whether the current task may create a new task. To learn more about LSMs in the context of Security-Enhanced Linux (SELinux), check out the Resources section.
Next, the dup_task_struct function (found in ./linux/kernel/fork.c) is called, which allocates a new task_struct and copies the current process's descriptors into it. After a new thread stack is set up, some state information is initialized and control returns to copy_process. Back in copy_process, some housekeeping is performed in addition to several other limit and security checks, including a variety of initialization on your new task_struct. A sequence of copy functions is then invoked, each copying an individual aspect of the process: open file descriptors (copy_files), signal information (copy_sighand and copy_signal), process memory (copy_mm), and finally the thread (copy_thread).
The new task is then assigned to a processor, with some additional checking based on the processors on which the process is allowed to execute (cpus_allowed). After the priority of the new process inherits the priority of the parent, a small amount of additional housekeeping is performed, and control returns to do_fork. At this point, your new process exists but is not yet running. The do_fork function fixes this with a call to wake_up_new_task. This function, which you can find in ./linux/kernel/sched.c, initializes some of the scheduler housekeeping information, places the new process in a run queue, then wakes it up for execution. Finally, upon returning to do_fork, the PID value is returned to the caller and the process is complete.
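Two effects of the copy steps above are easy to observe from user-space: the child gets its own copy of the parent's memory, and it inherits the parent's open file descriptors. The sketch below is an illustration under those assumptions; the function name fork_memory_demo and the values 1 and 42 are arbitrary.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that changes a variable and reports the new value through
 * a pipe it inherited from the parent. On success returns 0 and stores
 * what each side saw. fork_memory_demo is a hypothetical name.
 */
int fork_memory_demo(int *parent_value, int *child_value)
{
    assert(parent_value != NULL && child_value != NULL);

    int fds[2];
    int counter = 1;

    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: fds[] are copies of the parent's descriptors. */
        close(fds[0]);
        counter = 42;            /* changes only the child's copy */
        ssize_t n = write(fds[1], &counter, sizeof counter);
        _exit(n == (ssize_t)sizeof counter ? 0 : 1);
    }

    /* Parent: read the child's value, then reap the child. */
    close(fds[1]);
    int from_child = 0;
    ssize_t n = read(fds[0], &from_child, sizeof from_child);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || n != (ssize_t)sizeof from_child)
        return -1;

    *parent_value = counter;     /* untouched by the child's assignment */
    *child_value = from_child;
    return 0;
}
```

After a successful call, the parent still sees 1 while the child reported 42, which shows that the assignment happened in a separate address space.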
While a process exists in Linux, it can potentially be scheduled through the Linux scheduler. Although outside of the scope of this article, the Linux scheduler maintains a set of lists for each priority level on which task_struct references reside. Tasks are invoked through the schedule function (available in ./linux/kernel/sched.c), which determines the best process to run based on load and prior process execution history. You can learn more about the Linux version 2.6 scheduler in Resources.
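From user-space, the usual handle on a task's priority is its nice value, where a larger value means a lower priority and the range is -20 to 19. The sketch below assumes that a forked child inherits its parent's nice value and that raising it needs no special privilege. The function name nice_in_child and the exit-status encoding are inventions of this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child, raise its nice value by `increment`, and report the
 * nice values of parent and child. Returns 0 on success, -1 on error.
 * nice_in_child is a hypothetical helper name.
 */
int nice_in_child(int increment, int *parent_nice, int *child_nice)
{
    assert(parent_nice != NULL && child_nice != NULL);

    /* getpriority can legitimately return -1, so errno signals errors. */
    errno = 0;
    int before = getpriority(PRIO_PROCESS, 0);
    if (before == -1 && errno != 0)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        errno = 0;
        if (nice(increment) == -1 && errno != 0)
            _exit(255);
        errno = 0;
        int after = getpriority(PRIO_PROCESS, 0);
        if (after == -1 && errno != 0)
            _exit(255);
        _exit(after + 20);       /* map -20..19 onto 0..39 */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) == 255)
        return -1;

    *parent_nice = before;
    *child_nice = WEXITSTATUS(status) - 20;
    return 0;
}
```

With an increment of 5, the child's nice value should be the parent's value plus 5, capped at 19, while the parent's own value is unchanged.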
Process destruction can be driven by several events: normal process termination, a signal, or a call to the exit function. However process exit is driven, the process ends through a call to the kernel function do_exit (available in ./linux/kernel/exit.c). This process is shown graphically in Figure 2.
Figure 2. Function hierarchy for process destruction
The purpose behind do_exit is to remove all references to the current process from the operating system (for all resources that are not shared). The destruction process first indicates that the process is exiting by setting the PF_EXITING flag. Other aspects of the kernel use this indication to avoid manipulating this process while it's being removed. The cycle of detaching the process from the various resources that it attained during its life is performed through a series of calls, from exit_mm (to remove memory pages) to exit_keys (which disposes of per-thread session and process security keys). The do_exit function performs various accountings for the disposal of the process, then a series of notifications (for example, to signal the parent that the child is exiting) is performed through a call to exit_notify. Finally, the process state is changed to PF_DEAD, and the schedule function is called to select a new process to execute. Note that if signalling is required to the parent (or the process is being traced), the task will not completely disappear. If no signalling is necessary, a call to release_task will actually reclaim the memory that the process used.
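The parent-side view of this notification is the status word returned by waitpid, which distinguishes a normal exit from termination by a signal. The sketch below is illustrative: the struct child_end and the function reap_child are made-up names, and SIGKILL is used only because it cannot be blocked or ignored.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* How a child ended, as seen by its parent (hypothetical type). */
struct child_end {
    int exited;      /* nonzero if the child called exit */
    int exit_code;   /* valid when exited is nonzero */
    int signaled;    /* nonzero if a signal terminated the child */
    int signal_no;   /* valid when signaled is nonzero */
};

/*
 * Fork a child that either exits with status 3 or kills itself with
 * SIGKILL, then reap it and record how it ended. Returns 0 on success.
 */
int reap_child(int use_signal, struct child_end *out)
{
    assert(out != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (use_signal)
            raise(SIGKILL);      /* does not return */
        _exit(3);
    }

    /* Until this call returns, the dead child lingers as a zombie. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    out->exited = WIFEXITED(status);
    out->exit_code = out->exited ? WEXITSTATUS(status) : 0;
    out->signaled = WIFSIGNALED(status);
    out->signal_no = out->signaled ? WTERMSIG(status) : 0;
    return 0;
}
```

Calling reap_child(0, &e) should report a normal exit with code 3, while reap_child(1, &e) should report termination by SIGKILL. In both cases the child's task is only fully released once the parent has collected the status.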
Linux continues to evolve, and one area that will see further innovation and optimization is process management. While keeping true to UNIX principles, Linux continues to push the boundaries. New processor architectures, symmetrical multiprocessing (SMP), and virtualization will drive new advances in this area of the kernel. One example is the new O(1) scheduler introduced in Linux version 2.6, which provides scalability for systems with large numbers of tasks. Another is the updated threading model using the Native POSIX Thread Library (NPTL), which enables efficient threading beyond the prior LinuxThreads model. You can learn more about these innovations and what's ahead in Resources.
Learn
- One of the most innovative aspects of the 2.6 kernel is its O(1) scheduler. It allows Linux to scale to very large numbers of processes without the typical overhead. You can learn more about the 2.6 kernel scheduler in "Inside the Linux Scheduler" (developerWorks, June 2006).
- For a great look at memory management in Linux, check out Mel Gorman's Understanding the Linux Virtual Memory Manager (Prentice Hall, 2004), which is available in PDF form. This book provides a detailed but accessible presentation of memory management in Linux, including a chapter on process address spaces.
- For a nice introduction to process management, see Performance Tuning for Linux: An Introduction to Kernels (Prentice Hall, 2005). A sample chapter is available from IBM Press.
- Linux provides an interesting approach to system calls that involves transitioning between user-space and the kernel (separate address spaces). You can read more about this in "Kernel command using Linux system calls" (developerWorks, March 2007).
- In this article, you saw cases in which the kernel checked the security capabilities of the caller. The basic interface between the kernel and the security framework is called the Linux Security Module. To explore this module in the context of SELinux, read "Anatomy of Security-Enhanced Linux (SELinux)" (developerWorks, April 2008).
- The Portable Operating System Interface (POSIX) standard for threads defines a standard application programming interface (API) for creating and managing threads. You can find implementations for POSIX on Linux, Sun Solaris, and even non-UNIX-based operating systems.
- The Native POSIX Thread Library is a threading implementation in the Linux kernel for efficiently executing POSIX threads. This technology was introduced into the 2.6 kernel, where the prior implementation was called LinuxThreads.
- Read "TASK_KILLABLE: New process state in Linux" (developerWorks, September 2008) for an introduction to a useful alternative to the TASK_UNINTERRUPTIBLE and TASK_INTERRUPTIBLE process states.
F**king Beagle on Suse 10
Ron Albright
2006-03-25, 10:19 am
How do I stop it, forever. I figured out how to kill the Beagle process
that were taking up 500MB of my memory but there are still process
starting every night by root and suing to another uid and they never exit.
What is starting these things and how do I stop them? I can't find
anything in the rc scripts or crontabs. Short of uninstalling it where can
I find information on what's starting anything related to Beagle? I can
find all kinds of information on installing and using it but nothing on
stopping it. Any pointers would be greatly appreciated.
Nico Kadel-Garcia 2006-03-25, 10:19 am
Ron Albright wrote:
> How do I stop it, forever. I figured out how to kill the Beagle
> process that were taking up 500MB of my memory but there are still
> process starting every night by root and suing to another uid and
> they never exit. What is starting these things and how do I stop
> them? I can't find anything in the rc scripts or crontabs. Short of
> uninstalling it where can I find information on what's starting
> anything related to Beagle? I can find all kinds of information on
> installing and using it but nothing on stopping it. Any pointers
> would be greatly appreciated.
rpm -e beagle? It seems to be an RPM package.
J. Clarke 2006-03-25, 10:19 am
Ron Albright wrote:
> How do I stop it, forever. I figured out how to kill the Beagle process
> that were taking up 500MB of my memory but there are still process
> starting every night by root and suing to another uid and they never exit.
> What is starting these things and how do I stop them? I can't find
> anything in the rc scripts or crontabs. Short of uninstalling it where can
> I find information on what's starting anything related to Beagle? I can
> find all kinds of information on installing and using it but nothing on
> stopping it. Any pointers would be greatly appreciated.
You need to find what is starting beagled and either induce it to quit
starting beagled or have it start beagled with "beagled
--disable-scheduler". Once beagled is running it does its own scheduling.
Best thing to do about it IMO is remove the whole package.
--
--John
to email, dial "usenet" and validate
(was jclarke at eye bee em dot net)
Anatomy of Linux dynamic libraries
Dynamically linked shared libraries are an important aspect of
GNU/Linux. They allow executables to dynamically access external
functionality at run time and thereby reduce their overall
memory footprint (by bringing functionality in when it's
needed). This article investigates the process of creating and
using dynamic libraries, provides details on the various tools
for exploring them, and explores how these libraries work under
the hood. 20 Aug 2008
Anatomy of Linux loadable kernel modules
Linux loadable kernel modules, introduced in version 1.2 of the
kernel, are one of the most important innovations in the Linux
kernel. They provide a kernel that is both scalable and dynamic.
Discover the ideas behind loadable modules, and learn how these
independent objects dynamically become part of the Linux kernel. 16 Jul 2008
Anatomy of Linux journaling file systems
In recent history, journaling file systems were viewed as an
oddity and thought of primarily in terms of research. But today,
a journaling file system (ext3) is the default in Linux.
Discover the ideas behind journaling file systems, and learn how
they provide better integrity in the face of a power failure or
system crash. Learn about the various journaling file systems in
use today, and peek into the next generation of journaling file
systems. 04 Jun 2008
Anatomy of Linux flash file systems
You've probably heard of Journaling Flash File System (JFFS) and
Yet Another Flash File System (YAFFS), but do you know what it
means to have a file system that assumes an underlying flash
device? This article introduces you to flash file systems for
Linux, and explores how they care for their underlying
consumable devices (flash parts) through wear leveling, and
identifies the various flash file systems available along with
their fundamental designs. 20 May 2008
Anatomy of Security-Enhanced Linux (SELinux)
Linux has been described as one of the most secure operating
systems available, but the National Security Agency (NSA) has
taken Linux to the next level with the introduction of
Security-Enhanced Linux (SELinux). SELinux takes the existing
GNU/Linux operating system and extends it with kernel and
user-space modifications to make it bullet-proof. If you're
running a 2.6 kernel today, you might be surprised to know that
you're using SELinux right now! This article explores the ideas
behind SELinux and how it's implemented. 29 Apr 2008
Anatomy of real-time Linux architectures
It's not that Linux isn't fast or efficient, but in some cases
fast just isn't good enough. What's needed instead is the
ability to deterministically meet scheduling deadlines with
specific tolerances. Discover the various real-time Linux
alternatives and how they achieve real time -- from the early
architectures that mimic virtualization solutions to the options
available today in the standard 2.6 kernel. 15 Apr 2008
Anatomy of the Linux SCSI subsystem
The Small Computer Systems Interface (SCSI) is a collection of
standards that define the interface and protocols for
communicating with a large number of devices (predominantly
storage related). Linux provides a SCSI subsystem to permit
communication with these devices. Linux is a great example of a
layered architecture that joins high-level drivers, such as disk
or CD-ROM drivers, to a physical interface such as Fibre Channel
or Serial Attached SCSI (SAS). This article introduces you to
the Linux SCSI subsystem and discusses where this subsystem is
going in the future. 14 Nov 2007
Anatomy of Linux synchronization methods
In your Linux education, you may have learned about concurrency,
critical sections, and locking, but how do you use these
concepts within the kernel? This article reviews the locking
mechanisms available within the 2.6 kernel, including atomic
operators, spinlocks, reader/writer locks, and kernel
semaphores. It also explores where each mechanism is most
applicable for building safe and efficient kernel code. 31 Oct 2007
Anatomy of the Linux file system
When it comes to file systems, Linux is the Swiss Army knife of
operating systems. Linux supports a large number of file
systems, from journaling to clustering to cryptographic. Linux
is a wonderful platform for using standard and more exotic file
systems and also for developing file systems. This article
explores the virtual file system (VFS) -- sometimes called the
virtual filesystem switch -- in the Linux kernel and then
reviews some of the major structures that tie file systems
together. 30 Oct 2007
Anatomy of the Linux networking stack
One of the greatest features of the Linux operating system is
its networking stack. It was initially a derivative of the BSD
stack and is well organized with a clean set of interfaces. Its
interfaces range from the protocol agnostics, such as the common
sockets layer interface or the device layer, to the specific
interfaces of the individual networking protocols. This article
explores the structure of the Linux networking stack from the
perspective of its layers and also examines some of its major
structures. 27 Jun 2007
Anatomy of the Linux kernel
The Linux kernel is the core of a large and complex operating
system, and while it's huge, it is well organized in terms of
subsystems and layers. In this article, you explore the general
structure of the Linux kernel and get to know its major
subsystems and core interfaces. Where possible, you get links to
other IBM articles to help you dig deeper. 06 Jun 2007
Anatomy of the Linux slab allocator
Good operating system performance depends in part on the
operating system's ability to efficiently manage resources. In
the old days, heap memory managers were the norm, but
performance suffered due to fragmentation and the need for
memory reclamation. Today, the Linux kernel uses a method that
originated in Solaris but has been used in embedded systems for
quite some time, allocating memory as objects based on their
size. This article explores the ideas behind the slab allocator
and examines its interfaces and their use.