Softpanorama
May the source be with you, but remember the KISS principle ;-)
Although the new SMF feature is totally different from the previous boot and daemon management within the Solaris OS, it includes many welcome changes. The system boots faster and can recover from errors, such as hardware failures, that cause services to fail. SMF has some knowledge of the state of the system and its services; in particular, it understands the dependencies between services.
This understanding of dependencies allows a new level of service functionality -- if a service fails, SMF can restart not only that particular service but all of the services that depend on it. Thus SMF can fully restore the system to a given run level, even if a core service fails.
With the -m verbose option, SMF outputs a line for
each service that it's starting, which can help reassure those new to the Solaris
10 OS that everything is working. Gone, however, are the days of grepping through
/var/adm/messages in hopes of finding an error that
is actually labeled with the name of the service that is having a problem. Rather,
each service has its own persistent log file. These are in /var/svc/log
for the most part, with pre-single-user milestone service logs in
/etc/svc/volatile. The system reaches the "login" prompt
much more quickly now, as only the services depended on by login need to start
before login is started.
SMF has improved several aspects of the Solaris administrative model; here are some of the most notable examples:
Services can be viewed (using the svcs(1) command) and managed (using svcadm(1M) and svccfg(1M)).
More information is available about services that are not running, including an explanation of why (using svcs -x), as well as individual, persistent log files for each service.
Services are enabled and disabled using a supported tool (svcadm(1M)), allowing the changes to persist across upgrades and patches.
Administrative tasks can be delegated to non-root users (see the smf_security(5) man page).
Despite these changes, compatibility with existing administrative practices has been preserved wherever possible. For example, most site-local and ISV-supplied rc scripts still work as usual.
Enabling and Disabling Services
Releases prior to the Solaris 10 OS haven't offered a good way to permanently
disable a service. The typical method used is to rename the relevant rc script to
a name that won't get executed, but that change will be overlooked the next time
the system is upgraded. Furthermore, inetd-based services are enabled and disabled
by a totally different method -- editing a configuration file. Under SMF, both types
of services can be configured using the svcadm(1M) command,
and the changes will persist if the machine is upgraded. Here's a comparison of
how to enable and disable some services.
The last argument to svcadm in these examples is the
Fault Managed Resource Identifier (FMRI) of the service.
Note that svcadm should only be used for SMF services
-- legacy rc script-controlled services work the same as in past releases.
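As a minimal sketch of the svcadm usage described above, the following prints the commands for permanently disabling and re-enabling a service. The show helper is a hypothetical dry-run wrapper (it is not part of SMF), included only so the snippet can be tried on a machine without svcadm; the FMRI network/smtp:sendmail is borrowed from the examples later in this guide.

```shell
# Hypothetical dry-run wrapper: prints the command unless SMF_RUN=1 is set,
# in which case the command is actually executed.
show() {
    if [ "${SMF_RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Permanently disable, then permanently re-enable, the sendmail instance.
# The last argument is the (abbreviated) FMRI of the service.
show svcadm disable network/smtp:sendmail
show svcadm enable network/smtp:sendmail
```

Run as shown, the snippet only echoes the two command lines; setting SMF_RUN=1 on a Solaris 10 system would execute them with the caller's privileges.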
Stopping, Starting, and Restarting Services
Traditionally, services have been started by an rc script run at boot, run with
the argument start. Some rc scripts provide a
stop option, and a few also allow
restart. In SMF, these tasks are all accomplished with the
svcadm(1M) command, as shown in the following table.
The -t option to svcadm enable
and svcadm disable indicates that the requested action
should be temporary -- it will not affect whether the service is started the next
time that the system boots. This is in contrast to the
Enabling and Disabling Services example.
As with the enabling and disabling of services, svcadm
should not be used to control rc script-controlled services; they continue to work
the same as in past releases.
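One way to picture the correspondence between the old rc script arguments and svcadm is the small helper below. It is an illustrative sketch only: rc_to_svcadm is a made-up name, and the mapping assumes that a one-off start or stop corresponds to the temporary (-t) form of enable and disable, as described above.

```shell
# Print the svcadm command that corresponds to a traditional rc-script action.
# Usage: rc_to_svcadm <start|stop|restart> <FMRI>
rc_to_svcadm() {
    case "$1" in
        start)   echo "svcadm enable -t $2" ;;   # start now, boot setting unchanged
        stop)    echo "svcadm disable -t $2" ;;  # stop now, boot setting unchanged
        restart) echo "svcadm restart $2" ;;
        *)       echo "unknown action: $1" >&2; return 1 ;;
    esac
}

rc_to_svcadm stop network/smtp:sendmail
rc_to_svcadm start network/smtp:sendmail
rc_to_svcadm restart network/smtp:sendmail
```

The helper only prints command lines; it does not change any service state.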
Observing the Boot Process
As mentioned in the Notable Changes section of the QuickStart guide, the boot process is much quieter by default than in previous releases of Solaris. This was done to reduce the amount of uninformative "chatter" that could obscure real problems occurring during boot.
Some new boot options have been added to control the verbosity of boot. One that
you may find particularly useful is -m verbose, which
prints a line of information when each service attempts to start up. This is similar
to the default boot mode for some other UNIX-based and UNIX-like operating systems.
Verbose boot looks like this:
{1} ok boot -m verbose
Rebooting with command: boot -m verbose
Boot device: /pci@1c,600000/scsi@2/disk@0,0:a File and args: -m verbose
SunOS Release 5.10 Version Generic 64-bit
Copyright 1983-2004 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
[ network/pfil:default starting (pfil) ]
[ network/loopback:default starting (Loopback network interface) ]
[ system/filesystem/root:default starting (Root filesystem mount) ]
Oct 18 13:53:02/13: system start time was Mon Oct 18 13:52:57 2004
[ network/physical:default starting (Physical network interfaces) ]
[ system/filesystem/usr:default starting (/usr and / mounted read/write) ]
( more service messages elided )
[ system/filesystem/local:default starting (Local filesystem mounts) ]
[ network/ntp:default starting (network time protocol (NTP)) ]
[ system/utmp:default starting (utmpx monitoring) ]
[ system/filesystem/local:default starting (Local filesystem mounts) ]
[ system/console-login:default starting (Console login) ]
demobox console login: checking ufs filesystems
/dev/rdsk/c0t0d0s7: is logging.
Oct 18 13:53:14/50: system/system-log:default starting
Oct 18 13:53:14/51: network/inetd:default starting
Oct 18 13:53:14/52: system/cron:default starting
( more service messages elided )
The order of the service start messages may change from boot to boot, because SMF starts services in parallel according to their dependency relationships.
If a service fails to start successfully, warning messages will be printed in addition to the start message. Here's an example where the NTP service failed to start up:
[ system/filesystem/local:default starting (Local filesystem mounts) ]
[ network/ntp:default starting (network time protocol (NTP)) ]
Oct 25 13:58:42/49 ERROR: svc:/network/ntp:default:
Method "/lib/svc/method/xntp" failed with exit status 96.
Oct 25 13:58:42 svc.startd[4]: svc:/network/ntp:default:
Method "/lib/svc/method/xntp" failed with exit status 96.
[ network/ntp:default misconfigured (see 'svcs -x' for details) ]
[ system/utmp:default starting (utmpx monitoring) ]
( more service messages elided )
The first two error messages would appear during both normal boot and verbose
boot; the last one (network/ntp:default misconfigured ...)
would only appear during verbose boot.
Discovering What's Going Wrong
Until now, the Solaris OS has not had a single, comprehensive place to look for problems with system
services. Some solutions exist to help catch and diagnose these problems, ranging
from coreadm(1M) logging to site-specific monitoring
scripts to comprehensive products such as Sun Cluster. The new
svcs(1) command includes an "explain" option (svcs
-x), which prints out detailed, solution-driven messages about the services
that are not running. svcs -x shows when and why the
service failed, provides pointers to more information about the problem, and lists
what other services are affected by this problem.
Let's continue with the example of the NTP service failing to start up:
# svcs -x
svc:/network/ntp:default (Network Time Protocol (NTP).)
State: maintenance since Mon Oct 18 13:58:42 2004
Reason: Start method exited with $SMF_EXIT_ERR_CONFIG.
See: http://sun.com/msg/SMF-8000-KS
See: ntpq(1M)
See: ntpdate(1M)
See: xntpd(1M)
Impact: 0 services are not running.
The NTP service has been placed into maintenance mode because the startup script
indicated a problem with the service's configuration. Further information about
the service failure is available in the service's log file in the
/var/svc/log directory (or the /etc/svc/volatile
directory). The log file name is based on the short form of the FMRI, with instances
of "/" replaced by "-". So the log file for the svc:/network/ntp:default
service is /var/svc/log/network-ntp:default.log. In this example, the
log file quickly led to the conclusion that the NTP daemon's configuration file,
/etc/inet/ntp.conf, had been removed.
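The naming rule just described can be sketched as a small shell function. smf_log_path is a hypothetical helper name, not an SMF command; it assumes the FMRI is given in its full svc:/ form or in the short form, and takes an optional second argument for the log directory.

```shell
# Derive the log file path for a service instance from its FMRI.
# Usage: smf_log_path <FMRI> [log-directory]
smf_log_path() {
    short=${1#svc:/}                           # drop the "svc:/" prefix, if present
    name=$(printf '%s' "$short" | tr '/' '-')  # replace each "/" with "-"
    echo "${2:-/var/svc/log}/${name}.log"
}

smf_log_path svc:/network/ntp:default
smf_log_path svc:/system/filesystem/local:default
```

For the NTP service this prints /var/svc/log/network-ntp:default.log, matching the path given above. Passing /etc/svc/volatile as the second argument gives the location used for early-boot logs.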
Another example shows SMF's ability to track dependencies and point out problems
relating to disabled services. We use the -v option in
this example to see the list of impacted services.
# svcs -x -v
svc:/application/print/server:default (LP Print Service)
State: disabled since Mon Oct 18 16:17:27 2004
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: man -M /usr/share/man -s 1M lpsched
Impact: 1 service is not running:
svc:/application/print/rfc1179:default
Here, the application/print/server:default service
has been explicitly disabled, but another service that depends on it (application/print/rfc1179:default)
was not disabled. So the disabling of the first service has kept the second one
from running.
Observing Services
In earlier versions of Solaris, the only way to see what services were available
was to use the ps(1) command and list all the active
processes on the system, and then look around for the names of processes that matched
the names of service applications. Unfortunately, it's very difficult to track things
this way since most systems have many processes, and new services are introduced
with each new version of Solaris and when other software packages are added. To
further complicate the situation, many modern services are no longer implemented
as single processes. Some services are implemented as collections of processes,
multithreaded processes, or both simultaneously.
The new svcs(1) command makes it much easier to observe
the status of a system service. The -p option shows all
the processes associated with a service:
% svcs -p network/smtp:sendmail
STATE STIME FMRI
online 18:20:30 svc:/network/smtp:sendmail
18:20:30 655 sendmail
18:20:30 657 sendmail
% ps -fp 655,657
UID PID PPID C STIME TTY TIME CMD
root 655 1 0 18:20:30 ? 0:01 /usr/lib/sendmail -bd -q15m
smmsp 657 1 0 18:20:30 ? 0:00 /usr/lib/sendmail -Ac -q15m
The -d option shows what other services this service
depends on, and the -D option shows what other services
depend on this service:
% svcs -d network/smtp:sendmail
STATE STIME FMRI
online 18:20:14 svc:/system/identity:domain
online 18:20:26 svc:/network/service:default
online 18:20:27 svc:/system/filesystem/local:default
online 18:20:27 svc:/milestone/name-services:default
online 18:20:27 svc:/system/system-log:default
online 18:20:30 svc:/system/filesystem/autofs:default
% svcs -D network/smtp:sendmail
STATE STIME FMRI
online 18:20:32 svc:/milestone/multi-user:default
We can see that sendmail requires networking, local file systems, name services,
the syslog daemon, and the automount daemon to be running before it will run, and
sendmail itself must be running before the multi-user milestone can be reached.
The service start times (the STIME column) illustrate
that these dependencies have been followed.
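Because svcs prints one service per line in STATE, STIME, and FMRI columns, its output is easy to filter with standard tools. The sketch below lists every service that is not online. The function name not_online is hypothetical, and the sample input is illustrative rather than captured from a real system; it expects plain svcs output (without -p, whose process lines use a different layout).

```shell
# Print the FMRI of every service whose STATE column is not "online".
# Reads svcs-style output (STATE STIME FMRI) on standard input; the header
# line and any line whose third field is not a svc: FMRI are ignored.
not_online() {
    awk '$1 != "online" && $3 ~ /^svc:/ { print $3 }'
}

# Illustrative sample input.
not_online <<'EOF'
STATE          STIME    FMRI
online         18:20:14 svc:/system/identity:domain
disabled       16:17:27 svc:/application/print/server:default
online         18:20:27 svc:/system/system-log:default
EOF
```

On a live system the same filter could be fed from svcs -a, which also lists disabled services.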
Changing Run Levels
SMF has introduced the concept of milestones, which supplant the traditional
notion of run levels. Run levels provide a basic description of the set of services
running on the machine, traditionally grouped as the services necessary for one
user to log in on the machine console (run level S), and for multiple users to log
in to the machine (run levels 2 and 3). These system states are represented in SMF
as milestones, which are stable services that represent a group of other services.
svcs -d can be used to see what services must be running
before a milestone is reached.
svcadm(1M) is now the preferred method of setting
the system's default run level. This is done with the milestone
subcommand and the FMRI of a valid milestone, as seen in Table 3.
The -d option indicates that the default milestone
should be set to the named FMRI. Without the -d option,
svcadm milestone transitions the system to the named
milestone immediately.
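The relationship between traditional run levels and milestones can be sketched as follows. runlevel_to_milestone is a hypothetical helper, and the pairing it encodes (S to single-user, 2 to multi-user, 3 to multi-user-server) is an assumption based on the milestone names used in this guide; the snippet only prints the resulting svcadm command.

```shell
# Map a traditional run level to the FMRI of the corresponding milestone.
runlevel_to_milestone() {
    case "$1" in
        S|s) echo "svc:/milestone/single-user:default" ;;
        2)   echo "svc:/milestone/multi-user:default" ;;
        3)   echo "svc:/milestone/multi-user-server:default" ;;
        *)   echo "no milestone for run level: $1" >&2; return 1 ;;
    esac
}

# Command that would make the run level 3 equivalent the default milestone.
echo "svcadm milestone -d $(runlevel_to_milestone 3)"
```

Dropping the -d from the printed command gives the form that transitions the running system immediately instead of changing the default.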
The boot process has been updated to be aware of milestones. In addition to the
traditional boot -s (boot into single-user mode), you
now have boot -m milestone=<milestone>
to boot to the named milestone. <milestone>
can be single-user, multi-user,
or multi-user-server, as well as the special milestones
all (all enabled services online) and
none (no services at all). The none
milestone can be very useful in repairing systems that have failures early in the
boot process.
Booting to the single-user milestone (with -m milestone=single-user)
is slightly different from using the old boot -s. When
the system is explicitly booted to a milestone, exiting the console administrative
shell will not transition the system to multi-user mode, as
boot -s does. To move to multi-user mode after boot -m
milestone=single-user, use the command svcadm milestone
milestone/multi-user-server:default.
Enabling, Disabling, and Monitoring Legacy Services
Services that are started by traditional rc scripts (referred to as legacy services)
will generally continue to work as they always have. They will show up in the output
of svcs(1), with an FMRI based on the path name of their
rc script, but they cannot be controlled by svcadm(1M).
They should be stopped and started by running the rc script directly.
As mentioned in the Notable Changes section of the guide, rc scripts may not run at exactly the same point in boot as they had in earlier versions of Solaris. In particular, problems may arise for scripts that depend on running before certain rc scripts provided in the Solaris OS. The vast majority of scripts should continue to work without any trouble, though.
Adding New Services to inetd.conf
The Internet services daemon, inetd(1M), has been
rewritten as part of SMF. It stores all of its configuration data in the SMF database,
rather than /etc/inet/inetd.conf, allowing the SMF tools
to be used to control and observe inetd-based services.
Most inetd-based services that ship with the Solaris OS will no longer
have entries in inetd.conf.
To provide compatibility for services that have not been converted to SMF, entries
can still be added to inetd.conf using the same syntax
as always, and the new inetconv(1M) command will convert
the new services to SMF services. inetconv should always
be run after editing /etc/inet/inetd.conf; it can be
run without any arguments.
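Before running inetconv, it can be handy to check which entries in inetd.conf are actually active. The sketch below prints the service name of each uncommented entry; active_inetd_entries and the myservice entry are hypothetical and serve only to illustrate the traditional inetd.conf line format.

```shell
# Print the service name (first field) of each active entry in
# inetd.conf-style input, skipping blank lines and "#" comments.
active_inetd_entries() {
    awk '!/^[ \t]*#/ && NF > 0 { print $1 }'
}

# Hypothetical site-local entry in the traditional inetd.conf syntax.
active_inetd_entries <<'EOF'
# locally added service
myservice stream tcp nowait root /opt/local/bin/myserviced myserviced
EOF

# On a real system: active_inetd_entries < /etc/inet/inetd.conf
# then convert the new entries by running: inetconv
```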
SMF and Predictive Self-Healing.
Last modified: August 12, 2009