Softpanorama
(slightly skeptical)
Open Source Software Educational Society
May the source be with you, but remember the KISS principle ;-)
Unix/Linux Internals Courses
At the center of the UNIX onion is a program called the kernel. Although
you are unlikely to deal with the kernel directly, it is absolutely crucial
to the operation of the UNIX system.
The kernel provides the essential services that make up the heart of
UNIX systems; it allocates memory, keeps track of the physical location
of files on the computer's hard disks, loads and executes binary programs
such as shells, and schedules the task swapping without which UNIX systems
would be incapable of doing more than one thing at a time. The kernel accomplishes
all these tasks by providing an interface between the other programs running
under its control and the physical hardware of the computer; this interface,
the system call interface, effectively insulates the other programs on the
UNIX system from the complexities of the computer. For example, when a running
program needs access to a file, it cannot simply open the file; instead
it issues a system call which asks the kernel to open the file. The kernel
takes over and handles the request, then notifies the program whether the
request succeeded or failed. To read data in from the file takes another
system call; the kernel determines whether or not the request is valid,
and if it is, the kernel reads the required block of data and passes it
back to the program. Unlike DOS (and some other operating systems), UNIX
system programs do not have access to the physical hardware of the
computer. All they see are the kernel services, provided by the system call
interface.
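The open-then-read sequence described above can be sketched in C roughly as follows. This is a minimal illustration, not code from the original text: the helper name read_prefix is hypothetical, and only the standard open(2), read(2), and close(2) calls are assumed.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `cap` bytes from the start of `path` using raw system calls.
 * Returns the number of bytes read, or -1 if the kernel refused a request.
 * (read_prefix is a hypothetical helper name used for illustration.) */
ssize_t read_prefix(const char *path, char *buf, size_t cap)
{
    assert(path != NULL && buf != NULL);

    /* open(2): ask the kernel for access; it answers with a descriptor or -1. */
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* read(2): a second, separate request; the kernel validates it and
     * copies the data into our buffer. */
    ssize_t n = read(fd, buf, cap);

    close(fd);
    return n;
}
```

Each step is a separate request to the kernel, and each can fail on its own: a missing file makes open return -1 before any read is attempted.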
The system call interface is an example of an API, or application programming
interface. An API is a set of system calls with strictly defined parameters,
which allow an application (or other program) to request access to a service;
it literally acts as an interface. (For example, a large database system
might provide an API that allows programmers to write external programs
that request services from the database.)
IBM poster explaining virtual memory 1978
- If it's there and you can see it--it's real
- If it's not there and you can see it--it's virtual
- If it's there and you can't see it--it's transparent
- If it's not there and you can't see it--you erased it!
Linux is a very dynamic system with constantly
changing computing needs. The representation of the
computational needs of Linux centers around the
common abstraction of the process. Processes
can be short-lived (a command executed from the
command line) or long-lived (a network service). For
this reason, the general management of processes and
their scheduling is very important.
From user-space, processes are represented by
process identifiers (PIDs). From the user's
perspective, a PID is a numeric value that uniquely
identifies the process. A PID doesn't change during
the life of a process, but PIDs can be reused after
a process dies, so it's not always ideal to cache
them.
In user-space, you can create processes in any of
several ways. You can execute a program (which
results in the creation of a new process) or, within
a program, you can invoke a fork or
exec system call. The fork
call results in the creation of a child process,
while an exec call replaces the current
process context with the new program. I discuss each
of these methods to understand how they work.
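As a rough sketch of how fork and exec combine in practice, the C fragment below creates a child, replaces the child's image with another program, and collects the exit status in the parent. The function name run_and_wait and the use of exit code 127 for a failed exec are illustrative assumptions, not something the article prescribes.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` with argument vector `argv` in a child process.
 * Returns the child's exit status, or -1 on failure.
 * (run_and_wait is a hypothetical helper name.) */
int run_and_wait(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = fork();          /* duplicate the calling process */
    if (pid == -1)
        return -1;               /* fork failed: no child was created */

    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execv(path, argv);
        _exit(127);              /* reached only if execv failed */
    }

    /* Parent: wait for that specific child to terminate. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the PID returned by fork is only meaningful until waitpid reaps the child; after that the number may be reused, which is the caching hazard mentioned above.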
For this article, I build the description of
processes by first showing the kernel representation
of processes and how they're managed in the kernel,
then review the various means by which processes are
created and scheduled on one or more processors, and
finally, what happens if they die.
That's a very questionable approach. Standardization is the most powerful
thing in computing. Actually, Apple is indirectly subsidized by Microsoft,
as it uses the same Intel-based architecture.
Conformity is a powerful instinct. There’s safety in numbers.
You have to be different to be better, but different is scary.
So of course there’s some degree of herd mentality in every industry.
But I think it’s more pronounced, to a pathological degree, in the PC
hardware industry. It was at the root of long-standing punditry holding
that Apple should license the Mac OS to other PC makers, or that Apple
should dump Mac OS and make Windows PCs. On the surface, those two old
canards seem contradictory — one arguing that Apple should be a hardware
company, the other arguing that it should be a software company. But
at their root they’re the same argument: that Apple should stop being
different, and either act just like other PC makers (and sell computers
running Windows) or else act just like Microsoft (and sell licenses
to its OS).
No one argues those two points any more. But it’s the same herd mentality
that led to the rash of Apple needs to get in the “netbook” game
punditry that I
claim-checked earlier this week. I could have linked to a dozen
others. The argument, though, is the same: everyone else is making netbooks,
so Apple should, too. Why? Because everyone else is.
I think there’s a simple reason why the herd mentality is worse in
the PC industry: Microsoft. In fact, I think it used to be worse. A
decade ago the entire computing industry — all facets of it — was dominated
by a herd mentality that boiled down to Get behind Microsoft and
follow their lead, or else you’ll get stomped. That’s no longer
true in application software. The web, and Google in particular, have
put an end to that.
But the one area where Microsoft still reigns supreme is in PC operating
systems. PC hardware makers are crippled. They can’t stand apart from
the herd even if they want to. Their OS choices are: (a) the same version
of Windows that every other PC maker includes; or (b) the same open
source Linux distributions that every other PC maker could include but
which no customers want to buy.
Apple’s ability to produce innovative hardware is inextricably intertwined
with its ability to produce innovative software. The iPhone is an even
better example than the Mac.
It’s not just that Apple is different among computer makers.
It’s that Apple is the only one that even can be different,
because it’s the only one that has its own OS. Part of the industry-wide
herd mentality is an assumption that no one else can make a computer
OS — that anyone can make a computer but only Microsoft can make an
OS. It should be embarrassing to companies like Dell and Sony, with
deep pockets and strong brand names, that they’re stuck selling computers
with the same copy of Windows installed as the no-name brands.
And then there’s HP, a company with one of the best names and proudest
histories in the industry. Apple made news this week for the design
and tech specs of its all-new iMacs, which start at $1199. HP made news
this week for unveiling a Windows 7 launch bundle at Best Buy that includes
a desktop PC and two laptops, all for $1199. That might be
great for Microsoft, but how is it good for HP that their brand now
stands for bargain basement prices?
Operating systems aren’t mere components like RAM or CPUs; they’re
the single most important part of the computing experience. Other than
Apple, there’s not a single PC maker that controls the most important
aspect of its computers. Imagine how much better the industry would
be if there were more than one computer maker trying to move the state
of the art forward.
[Jul 22, 2008]
UNDELETED
by Ralf Spenneberg
Linux Magazine Online
Modern filesystems make forensic file recovery much more difficult.
Tools like Foremost and Scalpel identify data structures and carve files
from a hard disk image.
IT experts and investigators have many reasons for reconstructing
deleted files. Whether an intruder has deleted a log to conceal an attack
or a user has destroyed a digital photo collection with an accidental
rm -rf, you might someday face the need to recover deleted data. In
the past, recovery experts could easily retrieve a lost file because
an earlier generation of filesystems simply deleted the directory entry.
The meta information that described the physical location of the data
on the disk was preserved, and tools like The Coroner’s Toolkit (TCT
[1]) and The Sleuth Kit (TSK [2]) could uncover the information necessary
for restoring the file. Today, many filesystems delete the full set
of meta information, leaving the data blocks. Putting these pieces together
correctly is called file carving – forensic experts carve the raw data
off the disk and reconstruct the files from it. The more fragmented
the filesystem, the harder this task becomes.
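To make the idea of carving concrete, here is a minimal sketch in C that scans a raw buffer for a JPEG start marker (FF D8 FF) and the next end marker (FF D9). It is not how Foremost or Scalpel are implemented; the function name carve_jpeg is hypothetical, and the sketch assumes the file is stored contiguously, which is exactly the assumption that fragmentation breaks.

```c
#include <assert.h>
#include <stddef.h>

/* Find the first JPEG-looking byte range in a raw disk image buffer.
 * On success returns 1 and stores the offset and length of the range;
 * returns 0 if no header/footer pair is found.
 * Assumes the file is contiguous (hypothetical helper, illustration only). */
int carve_jpeg(const unsigned char *img, size_t n, size_t *start, size_t *len)
{
    assert(img != NULL && start != NULL && len != NULL);

    for (size_t i = 0; i + 3 <= n; i++) {
        /* Header signature: FF D8 FF */
        if (img[i] != 0xFF || img[i + 1] != 0xD8 || img[i + 2] != 0xFF)
            continue;

        /* Footer signature: FF D9, searched after the header */
        for (size_t j = i + 3; j + 2 <= n; j++) {
            if (img[j] == 0xFF && img[j + 1] == 0xD9) {
                *start = i;
                *len = j + 2 - i;
                return 1;
            }
        }
        return 0;   /* header found, but no footer anywhere after it */
    }
    return 0;
}
```

Nothing here consults filesystem metadata: the range is recovered purely from the data blocks, which is why carving still works after the meta information has been deleted.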
On UNIX® systems, each system and end-user task is contained within
a process. The system creates new processes all the time and processes
die when a task finishes or something unexpected happens. Here,
learn how to control processes and use a number of commands to peer
into your system.
At a recent street fair, I was mesmerized by the one-man band. Yes,
I am easily amused, but I was impressed nonetheless. Combining harmonica,
banjo, cymbals, and a kick drum -- at mouth, lap, knees, and foot, respectively
-- the veritable solo symphony gave a rousing performance of the Led
Zeppelin classic "Stairway to Heaven" and a moving interpretation of
Beethoven's Fifth Symphony. By comparison, I'm lucky if I can pat my
head and rub my tummy in tandem. (Or is it pat my tummy and rub my head?)
Lucky for you, the UNIX® operating system is much more like the one-man
band than your clumsy columnist. UNIX is exceptional at juggling many
tasks at once, all the while orchestrating access to the system's finite
resources (memory, devices, and CPUs). In lay terms, UNIX can readily
walk and chew gum at the same time.
This month, let's probe a little deeper than usual to examine how
UNIX manages to do so many things simultaneously. While spelunking,
let's also glimpse the internals of your shell to see how job-control
commands, such as Control-C (terminate) and Control-Z (suspend), are
implemented. Headlamps on! To the bat cave!
This is just a discussion. You need to read the report first. It
contains a lot of interesting information.
Microsoft Research has released part
of a report on the "Singularity" kernel they've been working on as part
of their planned shift to network computing.
The
report includes some performance comparisons that show Singularity
beating everything else on a 1.8GHz AMD Athlon-based machine.
What's noteworthy about it is that Microsoft
compared Singularity to FreeBSD and Linux as well as Windows/XP - and
almost every result shows Windows losing to the two Unix variants.
For example, they show the number of
CPU cycles needed to "create and start a process" as 1,032,000 for FreeBSD,
719,000 for Linux, and 5,376,000 for Windows/XP. Similarly they provide
four graphs comparing raw disk I/O and show the Unix variants beating
Windows/XP in three (and a half) of the four cases.
Oddly, however, it's the cases in which
they report Windows/XP as beating Unix that are the most interesting.
There are three examples of this: one in which they count the CPU cycles
needed for a "thread yield" as 911 for FreeBSD, 906 for Linux, and 753
for Windows XP; one in which they count CPU cycles for a "2 thread wait-set
ping pong" as 4,707 for FreeBSD, 4,041 for Linux, and 1,658 for Windows/XP;
and, one in which they report that "for the sequential read operations,
Windows XP performed significantly better than the other systems for
block sizes less than 8 kilobytes."
So how did they get these results?
The sequential tests read or wrote
512MB of data from the same portion of the hard disk. The random
read and write tests performed 1000 operations on the same sequences
of blocks on the disk. The tests were single threaded and performed
synchronous raw I/O. Each test was run seven times and the results
averaged.
umm…
The Unix thread tests ran on user-space
scheduled pthreads. Kernel scheduled threads performed significantly
worse. The "wait-set ping pong" test measured the cost of switching
between two threads in the same process through a synchronization
object. The "2 message ping pong" measured the cost of sending a
1-byte message from one process to another and then back to the
original process. On Unix, we used sockets, on Windows, a named
pipe, and on Singularity, a channel.
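For readers who want to see what a socket-based message ping pong looks like on Unix, here is a minimal C sketch. It is not the benchmark code from the report: the function name ping_pong, the choice of socketpair with AF_UNIX stream sockets, and the use of fork for the second process are all assumptions made for illustration.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Bounce a 1-byte message between a parent and a child process.
 * Returns the number of completed round trips, or -1 on setup failure.
 * (ping_pong is a hypothetical name; the socket type is an assumption.) */
int ping_pong(int rounds)
{
    assert(rounds >= 0);

    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {                       /* child: echo each byte back */
        char c;
        close(sv[0]);
        while (read(sv[1], &c, 1) == 1)
            if (write(sv[1], &c, 1) != 1)
                break;
        _exit(0);
    }

    close(sv[1]);                         /* parent keeps sv[0] */
    int done = 0;
    char c = 'x';
    for (int i = 0; i < rounds; i++) {
        if (write(sv[0], &c, 1) != 1 || read(sv[0], &c, 1) != 1)
            break;
        done++;
    }
    close(sv[0]);                         /* EOF lets the child exit */
    waitpid(pid, NULL, 0);
    return done;
}
```

Timing the loop and dividing by the number of rounds would give a per-round-trip cost; the point of the discussion that follows is that sockets are only one of several IPC mechanisms available on Unix.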
So why is this interesting? Because their
test methods reflect Windows internals, not Unix kernel design. There
are better, faster ways of doing these things in Unix, but these guys
- among the best and brightest programmers working at Microsoft - either
didn't know or didn't care.
If all the basics are the same, what has changed? Well, these things:
- Number of system calls
- Languages we use
- Subsystems we program
- Need for portability
- Relevance of UNIX standards
More System Calls
The number of system
calls has quadrupled, more or less, depending on
what you mean by "system call." The first edition
of Advanced UNIX Programming focused on only
about 70 genuine kernel system calls—for example,
open, read, and write; but
not library calls like fopen, fread,
and fwrite. The second edition includes about
300. (There are about 1,100 standard function calls
in all, but many of those are part of the Standard
C Library or are obviously not kernel facilities.)
Today's UNIX has threads, real-time signals, asynchronous
I/O, and new interprocess-communication features
(POSIX IPC), none of which existed 20 years ago.
This has caused, or been caused by, the evolution
of UNIX from an educational and research system
to a universal operating system. It shows up in
embedded systems (parking meters, digital video
recorders); inside Macintoshes; on a few million
web servers; and is even becoming a desktop system
for the masses. All of these uses were unanticipated
in 1984.
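The distinction drawn in this section between kernel system calls (open, read, write) and library calls (fopen, fread, fwrite) can be observed directly. The sketch below writes five bytes with fwrite and asks the kernel, via fstat, how large the file is before and after fflush. The function name is hypothetical, and the "before" value depends on the C library's buffering: on common implementations it is 0, because the data is still sitting in the library's user-space buffer.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict compiler modes */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Write 5 bytes through the C library and report the file size as the
 * kernel sees it before and after fflush(). Returns 0 on success, -1 on error.
 * (Hypothetical helper for illustration.) */
int kernel_view_of_buffered_write(off_t *before, off_t *after)
{
    assert(before != NULL && after != NULL);

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);    /* request full buffering explicitly */

    struct stat st;
    int ok = fwrite("hello", 1, 5, fp) == 5      /* library call: fills a buffer */
          && fstat(fileno(fp), &st) == 0;        /* system call: ask the kernel */
    if (ok) {
        *before = st.st_size;
        /* fflush() is what finally issues the write(2) system call. */
        ok = fflush(fp) == 0 && fstat(fileno(fp), &st) == 0;
    }
    if (ok)
        *after = st.st_size;

    fclose(fp);
    return ok ? 0 : -1;
}
```

The library call returns successfully before the kernel has been told anything, which is one reason counts of "system calls" and "standard function calls" differ so much.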
More Languages
In 1984, UNIX applications
were usually programmed in C, occasionally mixed
with shell scripts, Awk, and Fortran. C++ was just
emerging; it was implemented as a front end to the
C compiler. Today, C is no longer the principal
UNIX application language, although it's still important
for low-level programming and as a reference language.
(All the examples in both books are written in C.)
C++ is efficient enough to have replaced C when
the application requirements justify the extra effort,
but many projects use Java instead, and I've never
met a programmer who didn't prefer it over C++.
Computers are fast enough so that interpretive scripting
languages have become important, too, led by Perl
and Python. Then there are the web languages: HTML,
JavaScript, and the various XML languages, such
as XSLT.
Even if you're working
in one of these modern languages, though, you still
need to know what's going on "down below," because
UNIX still defines—and, to a degree, limits—what
the higher-level languages can do. This is a challenge
for many students who want to learn UNIX, but don't
want to learn C. And for their teachers, who tire
of debugging memory problems and explaining the
distinction between declarations and definitions.
TIP
To enable students
to learn UNIX without first learning C, I developed
a Java-to-UNIX system-call interface that I
call Jtux. It allows almost all of the
UNIX system calls to be executed from Java,
using the same arguments and datatypes as the
official C calls. You can find out more about
Jtux and download its source code from
http://basepath.com/aup/.
More Subsystems
The third area of
change is that UNIX is both more visible than ever
(sold by Wal-Mart!) and more hidden, underneath
subsystems like J2EE and web servers, Apache, Oracle,
and desktops such as KDE or GNOME. Many application
programmers are programming for these subsystems,
rather than for UNIX directly. What's more, the
subsystems themselves are usually insulated from
UNIX by a thin portability layer that has different
implementations for different operating systems.
Thus, many UNIX system programmers these days are
working on middleware, rather than on the end-user
applications that are several layers higher up.
More Portability
The fourth change
is the requirement for portability between UNIX
systems, including Linux and the BSD-derivatives,
one of which is the Macintosh OS X kernel (Darwin).
Portability was of some interest in 1984, but today
it's essential. No developer wants to be locked
into a commercial version of UNIX without the possibility
of moving to Linux or BSD, and no Linux developer
wants to be locked into only one distribution. Platforms
like Java help a lot, but only serious attention
to the kernel APIs, along with careful testing,
will ensure that the code is really portable. Indeed,
you almost never hear a developer say that he or
she is writing for XYZ's UNIX. It's much more common
to hear "UNIX and Linux," implying that the vendor
choice will be made later. (The three biggest proprietary
UNIX hardware companies—Sun, HP, and IBM—are all
strong supporters of Linux.)
More Complete Standards
The requirement for
portability is connected with the fifth area of
change, the role of standards. In 1984, a UNIX standards
effort was just starting. The IEEE's POSIX group
hadn't yet been formed. Its first standard, which
emerged in 1988, was a tremendous effort of exceptional
quality and rigor, but it was of very little use
to real-world developers because it left out too
many APIs, such as those for interprocess communication
and networking. That minimalist approach to standards
changed dramatically when The Open Group was formed
from the merger of X/Open and the Open Software
Foundation in 1996. Its objective was to include
all the APIs that the important applications were
using, and to specify them as well as time allowed—which
meant less precisely than POSIX did. They even named
one of their standards Spec 1170, the number
being the total of 926 APIs, 70 headers, and 174
commands. Quantity over quality, maybe, but the
result meant that for the first time programmers
would find in the standard the APIs they really
needed. Today, The Open Group's
Single UNIX Specification is the best guide
for UNIX programmers who need to write portably.
The following tutorial describes various
common methods for reading and writing files and directories on a Unix
system. Part of the information is common C knowledge, and is repeated
here for completeness. Other information is Unix-specific, although
DOS programmers will find some of it similar to what they saw in various
DOS compilers. If you are a proficient C programmer, and know everything
about the standard I/O functions, its buffering operations, and know
functions such as fseek() or fread(), you
may skip the standard C library I/O functions section. If in doubt,
at least skim through this section, to catch up on things you might
not be familiar with, and at least look at the
standard C library examples.
This document is copyright (c) 1998-2002
by guy keren.
The material in this document is provided AS IS, without any expressed
or implied warranty, or claim of fitness for a particular purpose. Neither
the author nor any contributors shall be liable for any damages incurred
directly or indirectly by using the material contained in this document.
Permission to copy this document (electronically or on paper, for personal
or organization internal use) or publish it on-line is hereby granted,
provided that the document is copied as-is, this copyright notice is
preserved, and a link to the original document is written in the document's
body, or in the page linking to the copy of this document.
Permission to make translations of this document is also granted, under
these terms - assuming the translation preserves the meaning of the
text, the copyright notice is preserved as-is, and a link to the original
document is written in the document's body, or in the page linking to
the copy of this document.
For any questions about the document and its license, please
contact the author.
[July 28, 2004]
FreeBSD system programming Nathan Boeger (nboeger
at khmere.com)
Mana Tominaga (mana at dumaexplorer.com)
Copyright (C) 2001,2002,2003,2004 Nathan Boeger and Mana Tominaga
Reasons why Reiser4 is
great for you:
- Reiser4 is the fastest filesystem,
and here are the benchmarks.
- Reiser4 is an atomic filesystem,
which means that your filesystem operations either entirely occur,
or they entirely don't, and they don't corrupt due to half occurring.
We do this without significant performance losses, because we invented
algorithms to do it without copying the data twice.
- Reiser4 uses dancing trees, which
obsolete the balanced tree algorithms used in databases (see farther
down). This makes Reiser4 more space efficient than other filesystems
because we squish small files together rather than wasting space
due to block alignment like they do. It also means that Reiser4
scales better than any other filesystem. Do you want a million files
in a directory, and want to create them fast? No problem.
- Reiser4 is based on plugins, which
means that it will attract many outside contributors, and you'll
be able to upgrade to their innovations without reformatting your
disk. If you like to code, you'll really like plugins....
- Reiser4 is architected for military
grade security. You'll find it is easy to audit the code, and that
assertions guard the entrance to every function.
V3 of reiserfs is used as the default
filesystem for SuSE, Lindows, and Gentoo. We don't touch the V3 code
except to fix a bug, and as a result we don't get bug reports for the
current mainstream kernel version. It shipped before the other journaling
filesystems for Linux, and is the most stable of them as a result of
having been out the longest. We must caution that just as Linux 2.6
is not yet as stable as Linux 2.4, it will also be some substantial
time before V4 is as stable as V3.
This web site compares and
contrasts operating systems. It originally started out on a small server
in the engineering department of Ohio State University to answer a single
question: “On technical considerations only, how does Rhapsody (also known
as Mac OS X Server) stack up as a server operating system (especially in
comparison to Windows NT)?” The web site now compares and contrasts server
operating systems and will in the near future expand to compare other kinds
of operating systems.
For non-technical persons: A general overview of operating systems for
non-technical people is located at: kinds of operating systems. Brief
summaries of operating systems are located at: summaries of operating
systems. There is an entire section of pages on individual operating
systems, all formatted in the same order for easy comparison. The holistic
area looks at operating systems from a holistic point of view and particular
subjects in that presentation may be useful for comparison. Some of the
charts and tables may also be useful for specific comparisons.
For technical persons: The system components area goes into detail about
the inner workings of an operating system and the individual operating
systems pages provide some technical information.
This site is organized as an unbalanced tree structure, with cyclic graph
hyperlinks and a sequential traversal path through the tree.
A long time ago, my undergraduate operating-systems
class required that we cross-compile a small, standalone system and
upload it to a PDP-11 minicomputer. We could do some limited debugging
at the console if the program didn't crash. The development environment
was poor; it was painful and time-consuming to get things working, but
the experience was an overall confidence builder. I feel there is a
huge advantage for a student to control the operations of a computer
directly.
Another approach for teaching operating
systems is to provide a controlled runtime and development environment
using a simulator. Several universities teach operating-system concepts
using the Nachos simulator (<http://www.cs.washington.edu/homes/tom/nachos>).
The advantage is that the instructor can easily control much of the
environment for assignments, and the students don't waste time with
crashes, kernel builds, and rebooting. These kinds of systems can be
very simplistic and lack realism.
As a private pilot, I know that aviation
simulation goes only so far. You need to spend some time in the sky,
in the air-traffic-control system, in the weather, and with the attendant
dangers, to absorb and appreciate the training fully. A two-hour actual
flight lesson is often fatiguing and draining; but the same amount of
time in a simulator is more like a classroom experience. Similarly,
students sense the difference between working in a safe simulator environment
and working on a real kernel. Lessons with the latter seem more dramatic.
is now available for free as a collection of Acrobat
files.
This is a detailed guide to kernel configuration, compilation,
upgrades, and troubleshooting for ix86-based systems.
Nice tutorial
Like any time-sharing system, Linux achieves
the magical effect of an apparent simultaneous execution of multiple
processes by switching from one process to another in a very short time
frame. Process switch itself was discussed in
Chapter 3, Processes;
this chapter deals with scheduling, which
is concerned with when to switch and which process to choose.
The chapter consists of three parts.
The section "Scheduling
Policy" introduces the choices made by Linux to schedule processes
in the abstract. The section "The
Scheduling Algorithm" discusses the data structures used to implement
scheduling and the corresponding algorithm. Finally, the section "System
Calls Related to Scheduling" describes the system calls that affect
process scheduling.
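As a small taste of the scheduling-related system calls the chapter refers to, the C sketch below queries the calling process's scheduling policy and the static priority range of a policy. It uses only the POSIX sched_getscheduler, sched_get_priority_min, and sched_get_priority_max calls; the two helper names are hypothetical and the sketch is not taken from the book.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Return a printable name for the calling process's scheduling policy.
 * (Hypothetical helper; policies outside the three POSIX ones map to "unknown".) */
const char *current_policy_name(void)
{
    switch (sched_getscheduler(0)) {   /* pid 0 means "the caller" */
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

/* Store the lowest and highest static priority available under `policy`.
 * Returns 0 on success, -1 if the kernel rejects the policy. */
int priority_range(int policy, int *lo, int *hi)
{
    assert(lo != NULL && hi != NULL);

    int min = sched_get_priority_min(policy);
    int max = sched_get_priority_max(policy);
    if (min == -1 || max == -1)
        return -1;

    *lo = min;
    *hi = max;
    return 0;
}
```

A process can also give up the CPU voluntarily with sched_yield(), which asks the scheduler to pick another runnable process: the "when to switch" half of the problem described above.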
VMware enables you to run a Virtual Machine, which is VMware's
version of an emulated state of Windows, Linux, or FreeBSD. You heard
me right: on VMware, not only can you do Windows, but also Linux and
FreeBSD. That means if you need to test out that new version of Linux,
but you don't want to format your drive just to test it out, VMware
can just create a virtual drive and you're on your way to seeing what
the latest version of your favorite distribution has to offer.
To date, VMware has been pretty much a development product, but thanks
to demand for a stable, versatile operating environment, VMware has
upped the ante and created their best version of VMware yet: 2.0.2.
If you've used a package like Connectix VirtualPC for the Macintosh
PowerPC, you'll notice many likenesses it shares with VMware. The website
may really hype VMware up and make it sound like there is no loss of
performance, but the simple fact is that you do lose clock speed, RAM
and hard disk speed, just like you would with any piece of emulation
software.
In fact, you can tell both VMware and VirtualPC are designed along
the same lines. The configuration is much the same, except one is obviously
more PC-fied, while one is more Mac-centric.
What it comes down to, though, is compatibility. VMware does a much
better job at emulating x86 hardware, probably since it's operating
on top of x86 hardware. That's a logical assumption, right? Enough with
guesswork, let's take a look at what's really going on.

Here we see how it really works. A typical PC works like we see on the
left. I think the diagram oversimplifies things in a way, but it will
do the job.
Essentially, VMware interfaces directly with most of your system hardware,
which is one way it achieves pretty good performance even on low-end
machines. Don't get me wrong: you still won't get the full speed of
your PC out of VMware, because things like hard disk access (where it
looks to be hurting the most) are still done through the operating
system.

This is how it all happens. This diagram shows you the devices which
VMware still needs to call through the OS: disk, memory and CPU.

Once again, VMware has a few tricks up its sleeve. One great thing about
VMware is that you can utilize your local network to get access to your
Windows or Linux filesystem. In fact, you can even use a regular network
along with your local network at the same time, so you don't need to
sacrifice anything with the networking setup.
Coming from a hybrid Sys V and BSD system, the first
time I began maintaining a BSD system I was immediately plunged into
making system level changes and finding out very specific information
about the system. There is a tool for just such a task, sysctl.
Along with that, however, I had come across an unusual program that
needed access to such information as well. The program needed the information
"hard coded", something I did not like. Luckily, the sysctl
calls are easily accessible (and extraordinarily well documented) via
a simple system subroutine. This article will cover two aspects of
sysctl:
- Some examples using the sysctl command.
- Examples with sample code on using the sysctl subroutines.
Note: Examples were drawn from all three free BSDs (I have run all
three of them at one time or another): NetBSD, FreeBSD and OpenBSD.
The sysctl Command (Facility)
It might be more correct to call sysctl
a facility or utility rather than just a command.
The official short definition is:
sysctl - get or set kernel state
In reality (typical of BSD design, which is a good thing) sysctl has
been extended to do a great many things and show all sorts of great
information. I say this because, judging by the short definition, one
would think all you can do with it is examine kernel parameters and
perhaps modify others . . . well, that and:
- get specific hardware information
- get and set a wide variety of kernel level network parameters
- get and set dependency information
- . . . and the list goes on
Really, the well documented man page of
man 8 sysctl has all the information you need. Let us take
a look at some sample usages:
First, how about the OS type:
$ sysctl kern.ostype
kern.ostype = NetBSD
Here is a sample looking at the clockrate:
$ sysctl kern.clockrate
kern.clockrate = tick = 10000, tickadj = 40, hz = 100, profhz = 100, stathz = 100
A very important (and often modified) parameter on systems is
ye olde ip forwarding (where 1 is on and 0 is off):
$ sysctl net.inet.ip.forwarding
net.inet.ip.forwarding = 0
Now some real quick hardware gathering examples
that show us the following information, respectively:
- machine type
- specific model information
- number of processors
$ sysctl hw.machine
hw.machine = sparc
$ sysctl hw.model
hw.model = SUNW,SPARCstation-5, MB86904 @ 110 MHz, on-chip FPU
$ sysctl hw.ncpu
hw.ncpu = 1
Another quick note: all of the examples
were done in userland.
We have seen the ease of use of the sysctl
command, but the subroutine offers great access at a low level to
even more information.
Using the sysctl Subroutine(s)
Note: The next section requires a basic
understanding of the C programming language.
The sysctl function allows programmatic access to a wide array of
information about the system itself, the kernel, and the network; in
this respect it is very similar in nature to its command counterpart.
It should also be quite obvious that this is in fact the function that
the sysctl command primarily uses (duh). This raises the question: why
is this important to know or understand? The name of the game is
understanding; seeing how to directly access the sysctl function is one
of the many steps to systems programming enlightenment. In short, it
shows what actually happens when you use the sysctl command.
Additionally, using the function can help develop or extend utilities.
The reason sysctl is so wonderful at this is that it is so closely
linked to the core operating system. Again, I must reiterate the BSD
philosophy of extension versus new: it is better to extend a
pre-existing piece of software than to encourage the development of a
completely new one. Nevertheless, the sysctl function could be useful
for building new utilities (and is no doubt employed in many existing
programs).
Well, let us get to it, shall we? For the sake of
simplicity, the code examples will follow some of the examples shown
in the command section of this article. The best way to illustrate
a usage is a case study, so let us create one. For posterity, we
will acknowledge the great forecoders by using the example that
comes from the BSD Programmer's Manual and an additional one that
does not:
We have a program that, for some odd reason,
needs to know the following information:
- the number of processes allowed on the
system (the one from the manual)
- the number of cpus (perhaps 3rd party
licensing software :) )
Getting the Number of Processes
One thing I believe in is paying due respect,
and as such, we will peruse one of the examples in the BSD documentation,
how to snag the number of processes allowed on the system:
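. . .
#include <sys/param.h>
#include <sys/sysctl.h>
. . .
/* Returns the maximum number of processes allowed on the system. */
int get_processes_max(void) {
    int mib[2], maxproc;
    size_t len;
    mib[0] = CTL_KERN;       /* top-level category: kernel */
    mib[1] = KERN_MAXPROC;   /* value within that category */
    len = sizeof(maxproc);
    sysctl(mib, 2, &maxproc, &len, NULL, 0);
    return maxproc;
}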
It is important, at this point, to understand
what it is we are accessing and how it is done. To think in C terms,
we are looking at this (again, noted in the man page):
int sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp, size_t newlen);
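If you look carefully across the function prototype
for sysctl you will see how the arguments in the call
satisfy it: mib and 2 are name and namelen, &maxproc and &len are
oldp and oldlenp, and NULL and 0 for newp and newlen mean that no
new value is being set.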
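The oldp/oldlenp pair in the prototype also carries values that are not plain integers. As a minimal sketch, assuming a BSD-style system where CTL_KERN and KERN_OSTYPE are defined, the function below fetches the OS type string (the value that sysctl kern.ostype printed earlier). The name get_ostype is made up for illustration, and the uname() fallback is included only so the sketch still builds on systems without the BSD sysctl call; it is not part of the sysctl interface.

```c
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

#if defined(__NetBSD__) || defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__APPLE__)
#include <sys/param.h>
#include <sys/sysctl.h>
#define HAVE_BSD_SYSCTL 1
#endif

/*
 * Hypothetical helper: copies the OS type name into buf as a
 * NUL-terminated string. Returns 0 on success, -1 on failure.
 */
int get_ostype(char *buf, size_t buflen)
{
    if (buf == NULL || buflen == 0)
        return -1;

#ifdef HAVE_BSD_SYSCTL
    int mib[2] = { CTL_KERN, KERN_OSTYPE };
    size_t len = buflen;    /* in: size of buf; out: bytes written */

    /* Fails with -1 if the buffer is too small for the string. */
    if (sysctl(mib, 2, buf, &len, NULL, 0) == -1)
        return -1;
    buf[buflen - 1] = '\0'; /* defensive termination */
    return 0;
#else
    /* Assumption: no BSD sysctl here, so fall back to POSIX uname(). */
    struct utsname u;

    if (uname(&u) < 0)
        return -1;
    snprintf(buf, buflen, "%s", u.sysname);
    return 0;
#endif
}
```

A caller would pass a local buffer, for example char name[64]; followed by get_ostype(name, sizeof(name)), and print name only when the call returns 0.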
For the second value, the number of CPUs, our function really
does not have to look much different from the maxproc one:
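. . .
#include <sys/param.h>
#include <sys/sysctl.h>
. . .
/* Returns the number of CPUs reported by the kernel. */
int get_num_cpu(void) {
    int mib[2], num_cpu;
    size_t len;
    mib[0] = CTL_HW;     /* top-level category: hardware */
    mib[1] = HW_NCPU;    /* value within that category */
    len = sizeof(num_cpu);
    sysctl(mib, 2, &num_cpu, &len, NULL, 0);
    return num_cpu;
}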
Basically, what we are looking at is access to
data structures, nothing more really. The great thing about it is
the ease of access: this is much simpler than writing endless
routines for direct file-level access. Instead, using this function,
we can get a great deal of information about the system with a minimal
and safe level of exertion.
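Both functions above ignore the return value of sysctl, so a failed call would hand back an uninitialized integer. The following is a hedged sketch of a more defensive variant of the CPU count lookup. The function name get_num_cpu_checked is hypothetical, and the sysconf(_SC_NPROCESSORS_ONLN) branch is an assumption added only so the sketch compiles on systems without the BSD call.

```c
#include <stdio.h>
#include <unistd.h>

#if defined(__NetBSD__) || defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__APPLE__)
#include <sys/param.h>
#include <sys/sysctl.h>
#define HAVE_BSD_SYSCTL 1
#endif

/*
 * Hypothetical helper: returns the number of CPUs, or -1 if the
 * value could not be read.
 */
int get_num_cpu_checked(void)
{
#ifdef HAVE_BSD_SYSCTL
    int mib[2] = { CTL_HW, HW_NCPU };
    int num_cpu = 0;
    size_t len = sizeof(num_cpu);

    /* sysctl returns -1 and sets errno on failure. */
    if (sysctl(mib, 2, &num_cpu, &len, NULL, 0) == -1) {
        perror("sysctl hw.ncpu");
        return -1;
    }
    return num_cpu;
#else
    /* Assumption: no BSD sysctl here, so ask sysconf() instead. */
    long n = sysconf(_SC_NPROCESSORS_ONLN);

    return n > 0 ? (int)n : -1;
#endif
}
```

The caller can then tell a real answer from a failure by checking for -1 before using the result.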
This Is Just The Beginning
Doubtless, if this article was something new to
you, then the door that lies before you is a great one indeed. BSD
presents an unparalleled opportunity to delve into the inner workings
of UNIX itself. Continue on and look to programming guides
and documentation to lead the way; you will not be disappointed.
As for my material, I will also open the door, and we shall see
in the long run what lies on the other side.
What About sysctl for Linux
To the best of my knowledge, the system parameters
associated with sysctl can be viewed and modified under
/proc/sys (for the most part) on
Linux systems.
When programmatic access is required, it is recommended to use
/proc as well.
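As a rough illustration of programmatic access through /proc, the sketch below reads a single integer from a /proc/sys-style file using ordinary stdio calls. The helper name read_proc_long is hypothetical, and the sketch assumes the file holds one decimal number on its first line, which is true of many but not all entries under /proc/sys.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: reads one decimal integer from the first line
 * of a /proc/sys-style file. Returns 0 on success, -1 on failure.
 */
int read_proc_long(const char *path, long *out)
{
    char line[64];
    char *end;
    FILE *fp;
    long value;

    if (path == NULL || out == NULL)
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof(line), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    errno = 0;
    value = strtol(line, &end, 10);
    if (errno != 0 || end == line)  /* overflow or no digits found */
        return -1;

    *out = value;
    return 0;
}
```

On a Linux system, calling read_proc_long("/proc/sys/net/ipv4/ip_forward", &v) would be the rough counterpart of the net.inet.ip.forwarding query shown earlier, with 1 meaning on and 0 meaning off.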
"POSIX (Portable Operating System
Interface) threads are a great way to increase the responsiveness and
performance of your code."
The title is really just a fancy
way of saying that I am going to attempt to describe the whole VM enchilada,
hopefully in a way that everyone can follow. For the last year I have
concentrated on a number of major kernel subsystems within FreeBSD,
with the VM and Swap subsystems being the most interesting and NFS being
'a necessary chore'. I rewrote only small portions of the code. In the
VM arena the only major rewrite I have done is to the swap subsystem.
Most of my work was cleanup and maintenance, with only moderate code
rewriting and no major algorithmic adjustments within the VM subsystem.
The bulk of the VM subsystem's theoretical base remains unchanged and
a lot of the credit for the modernization effort in the last few years
belongs to John Dyson and David Greenman. Not being a historian like
Kirk I will not attempt to tag all the various features with people's
names, since I will invariably get it wrong.
Before moving along to the actual
design let's spend a little time on the necessity of maintaining and
modernizing any long-living codebase. In the programming world, algorithms
tend to be more important than code and it is precisely due to BSD's
academic roots that a great deal of attention was paid to algorithm
design from the beginning. More attention paid to the design generally
leads to a clean and flexible codebase that can be fairly easily modified,
extended, or replaced over time. While BSD is considered an 'old' operating
system by some people, those of us who work on it tend to view it more
as a 'mature' codebase which has various components modified, extended,
or replaced with modern code. It has evolved, and FreeBSD is at the
bleeding edge no matter how old some of the code might be. This is an
important distinction to make and one that is unfortunately lost to
many people. The biggest error a programmer can make is to not
learn from history, and this is precisely the error that many other
modern operating systems have made. NT is the best example of
this, and the consequences have been dire. Linux also makes this mistake
to some degree -- enough that we BSD folk can make small jokes about
it every once in a while, anyway (grin). Linux's problem is simply
one of a lack of experience and history to compare ideas against, a
problem that is easily and rapidly being addressed by the Linux community
in the same way it has been addressed in the BSD community -- by continuous
code development. The NT folk, on the other hand, repeatedly
make the same mistakes solved by UNIX decades ago and then spend years
fixing them. Over and over again. They have a severe case of 'not
designed here' and 'we are always right because our marketing department
says so'. I have little tolerance for anyone who cannot learn from history.
Much of the apparent complexity
of the FreeBSD design, especially in the VM/Swap subsystem, is a direct
result of having to solve serious performance issues that occur under
various conditions. These issues are not due to bad algorithmic design
but instead arise from environmental factors. In any direct comparison
between platforms, these issues become most apparent when system resources
begin to get stressed. As I describe FreeBSD's VM/Swap subsystem the
reader should always keep two points in mind. First, the most important
aspect of performance design is what is known as "Optimizing the Critical
Path". It is often the case that performance optimizations add a little
bloat to the code in order to make the critical path perform better.
Second, a solid, generalized design outperforms a heavily-optimized
design over the long run. While a generalized design may end
up being slower than a heavily-optimized design when they are first
implemented, the generalized design tends to be easier to adapt
to changing conditions and the heavily-optimized design winds
up having to be thrown away. Any codebase that will survive and
be maintainable for years must therefore be designed properly from the
beginning even if it costs some performance. Twenty years ago
people were still arguing that programming in assembly was better than
programming in a high-level language because it produced code that was
ten times as fast. Today, the fallibility of that argument is obvious
-- as are the parallels to algorithmic design and code generalization.
Last month, we started a new series of Linux
kernel internals.
In that first part, we looked at how Linux manages processes and
why in many ways Linux is better at creating and maintaining processes
than many commercial Unixes.
This series on Linux internals is by the way the fruit
of a tight collaboration with some of the most experienced kernel hackers
in the Linux project. Without the contribution of people like
Andrea Arcangeli in Italy (VM contributor
and SuSE employee), Ingo
Molnar (scheduler contributor) and many others, this series wouldn't
be possible. Many thanks to all of them, but especially to Andrea Arcangeli
who has shown a lot of patience in answering many of my questions.
"Kernel Development" page can be useful too. Archives are
here.
VMware's system emulator lets you run up to five OSs on
one box simultaneously
Rawn Shah checks out VMware's latest
system emulator, version 1.1. It promises to let you run a Linux host
OS, then switch -- without rebooting -- among up to four other guest
OSs that operate inside virtual hardware created by VMware. (2,100 words)
[July 25, 1999]
Welcome to VMware Inc.
- Virtual Platform Technology VMware software initially comes
in two flavors, depending on the user's host operating system:
VMware
for Linux, and
VMware for Windows NT.
VMware
for Linux (time-limited demo) -- run DOS-, FreeBSD-, Windows 3.x, 9x
and NT 4.0-applications easily under Linux. VMware is included in SuSE distribution:
http://linuxpr.com/releases/176.html
One user with a 450 MHz Celeron and 256MB of RAM, who gave the
virtual machine 64MB, reported quite a positive experience. After he
installed the NT SVGA driver from VMware, the screen became
noticeably faster and modes above 800x600 were supported (VMware
recommends X 3.3.3.2). In this configuration Visio works
satisfactorily (screen redrawing is a little slow in non-full-screen
mode), but generally it is OK. Being able to work on a single
computer instead of two outweighs the small inconveniences
described.
[May 27, 1999]
Linux Memory
Management subsystem; main page
[March 2,1999]
Linux Kernel Mailing List, Archive by Week by thread
[Feb.12,1999]
www8.pair.com
-- the ultimate OS
Uniform
Driver Interface (UDI)
Universal
Serial Bus (USB)
Kernel
Traffic (http://www.kt.opensrc.org/)
-- information on new kernel developments
In case of broken links,
please try a Google search. If you find the page, please notify
us of the new location.
See also University Courses
Linux Documentation Project Guides(see also
Linux Guides):
The Linux Kernel Hackers' Guide
, freely redistributable collection
of documents; version 0.7 by Michael K. Johnson is available in
HTML and
HTML (tarred and gzipped).
The Linux Kernel
, freely redistributable book by David A. Rusling.
Version 0.8-2 is available in
HTML,
HTML (tarred and gzipped),
DVI, LaTeX source, PDF, and PostScript.
The Linux Programmer's Guide
, version 0.4 by B. Scott Burkett,
Sven Goldt, John D. Harper, Sven van der Meer and Matt Welsh, is available
in
HTML,
HTML (tarred and gzipped),
LaTeX source, PDF and PostScript.
Linux
Kernel Glossary
Operating Systems -- introduction to OS by Sharon Heimansohn,
sheimans@klingon.cs.iupui.edu.
see also other modules from
Department of Computer
and Information Science of IUPUI (Indiana University / Purdue University
Indianapolis):
The Mythical Man-Month. Essays on
Software Engineering by Frederick Brooks Jr. Anniversary Edition. Contains
a fascinating account of the creation of OS/360 -- a real classic.
- A Quarter Century of Unix
  Peter H. Salus / Paperback / Published 1994
- Casting the Net : From Arpanet to Internet and Beyond (Unix and Open
  Systems Series)
  Peter H. Salus / Paperback / Published 1995
- Hard Drive : Bill Gates and the Making of the Microsoft Empire
  James Wallace, et al / Paperback
- Overdrive : Bill Gates and the Race to Control Cyberspace
  James Wallace / Paperback / Published 1998 -- not as good as the
  previous one but still interesting
Other synchronization primitives
Ada Tasking
Java
Atomic Transactions
Bankers algorithm
Dining Philosophers
Lecture Notes
Distributed case and databases
Etc.
Deadlock...
The Deadly Embrace (Millersville University) Dr. Roger W. Webster
(contains the picture from SG)
- Memory Management: Address Translation
- Everything you want to know about Memory Management -- The
  Memory Management Glossary. Useful definitions of terms
- Paging and TLB
- 80386 Memory Management
- More on descriptor registers in 80386
- Memory Management (11 pages of Memory Management Descriptions)
- Sample Questions on Memory Management.
- Sun memory management -- Why doesn't Sun's OS free unused memory?
  Adrian Cockcroft tackles this question in the first of his monthly performance
  columns for SunWorld Online. Cockcroft, Sun's performance guru,
  has heard and answered this and countless other questions during his
  years as a systems engineer. Once he explains how Solaris 1 and 2 handle
  your computer's memory, you'll probably be relieved.
- The Linux Cache Flush Architecture
  David Miller wrote this document explaining how Linux tries
  to flush caches optimally, and more importantly, how people porting
  Linux can write code to use the architecture.
- Linux Memory Management
  This chapter is rather old; it was originally written when Linux
  was only a year old, and was updated a year later. The first section,
  on Linux's memory management code, is out of date by now, but may
  still provide some sort of understanding of the basic structure
  that will help you navigate through more recent kernels. The second
  section, an overview of 80386 memory management, is still mostly
  applicable; there are a few assumptions that should not get in your
  way in general.
- 80386 Memory Management
  Linux's memory management was originally conceived for Intel's
  80386 processor, which has fairly rich and relatively easy-to-use
  memory management features. During the port to the Alpha, Linux's
  memory management was abstracted in a way that has been successfully
  applied to many different processors, including the memory management
  units (MMUs) that are supplied with the 386, Alpha, Sparc (there
  are several different MMUs for the Sparc; most are supported), Motorola
  68K, PowerPC, ARM, and MIPS CPUs.
Paging vs. Segmentation, Multilevel Page Tables, Paging Along with
Segmentation
Capability Addressing, Protection Capabilities, Single Virtual Address
Space, & Protection Rings
Distributed Shared Memory, & The Mach VM
Memory Consistency, & Consistency Models Requiring & Not Requiring
Synchronization Operations
NUMA vs NORMA, Replication Of Memory, Achieving Sequential Consistency,
& Synchronization in DSM Systems
Management of Available Storage, Swapping and Paging, & Inverted Page
Tables
Performance of Demand Paging, Replacement Strategies, Stack Algorithms
and Priority Lists, Approximations to LRU Replacement, Page vs. Segment
Replacement, & Page Replacement in DSM Systems
Locality of Reference, User-Level Memory Managers,The Working Set
Model, Load Control in UNIX, & Performance of Paging Algorithms
- Working Set Model (Bershad)
- Working Set Model (Carthy)
- "Important Papers on Locality of Reference," Sivan Toledo
- "Locality", Cornell Theory Center
- "Distributed Shared Memory: Concepts and Systems," Jelica Protic, Milo
  Tomasevic, and Veljko Milutinovic. IEEE Parallel & Distributed Technology,
  Summer 1996.
- "Incorporating Memory Management into User-Level Network Interfaces,"
  Anindya Basu, Matt Welsh, and Thorsten von Eicken. Department
  of Computer Science, Cornell University.
Caching
SunWorld Online - January - CacheFS and Solstice AutoClient
May 1995
- OPERATING SYSTEMS
IMPLEMENTING LOADABLE KERNEL MODULES FOR LINUX -- Matt Welsh
Blox Data AB
DLX
HAL91
See also Linux for limited hardware (installations
on FAT disk drives):
JOS - Java VM-based OS
Real time OSes
Advanced systems programming and realtime systems Realtime operating systems
and device programming
Windows NT Architecture, Part 1
Sample Chapter from Inside Windows NT®, Second Edition by David A. Solomon,
based on the original edition by Helen Custer.
Inside the Windows 2000 Kernel
Windows NT File System Internals A Developer's Guide Chapter 4. The NT I-O
Manager
Copyright © 1996-2009 by Dr. Nikolai Bezroukov.
www.softpanorama.org was
created as a service to the UN Sustainable Development Networking Programme (SDNP)
in the author's free time.
This document is an industrial compilation designed and created
exclusively for educational use and is placed under the copyright of the
Open Content License (OPL).
Site uses AdSense, so you need to be aware of the Google privacy policy. Copyright of original materials belongs to their respective owners. Quotes are made
for educational purposes only in compliance with the fair use doctrine.
Disclaimer:
- The statements, views and opinions presented on
this web page are those of the author and are not endorsed by, nor do they necessarily
reflect, the opinions of the author's present and former employers, SDNP or any other
organization the author may be associated with.
- We do not warrant the correctness of the information provided or its
fitness for any purpose.
- This site is in no way associated with, nor does it endorse, cybersquatters
using
the term "softpanorama" with other main or country domains (e.g. softpanorama.com) with
bad faith intent to profit from the goodwill belonging to
someone else.
Last modified:
January 15, 2010
Re:wishfull thinking
(Score:5, Informative)