Softpanorama |
May the source be with you, but remember the KISS principle ;-)
SarCheck by Aptitune Corporation is a Linux & UNIX performance analysis and performance tuning tool. It is designed to help you with performance management on most Solaris, Linux, AIX, and HP-UX systems by making recommendations and explaining them with Plain English text, supporting graphs, and tables. You can order a FREE evaluation copy right here. More information about SarCheck can be found further down on this page or on our high-bandwidth narrated product tour.
SarCheck analyzes the output of sar, the /proc filesystem, and ps, and reads additional information from the kernel. It identifies problem areas and, if necessary, recommends changes to the system's tunable kernel parameters and to hardware resources such as CPUs, disks, and memory. Recommendations should be taken with a grain of salt. I like SarCheck's philosophy: in true Unix style, it uses the built-in tools (sar and ps) to collect data. The driver for the analyzer is a very simple shell script (/usr/local/bin/sarcheck).
Another good point is that SarCheck doesn't make any changes itself. It just produces recommendations, which might be right or wrong, and it explains each recommendation it makes. SarCheck can generate its report in HTML. If you're used to looking at nothing more than vmstat, sar, and uptime output, you're going to be amazed. If you're used to looking at one of the many "drill-down graph generator" tools, you'll love the fact that SarCheck gives real answers.
The tools you need to collect that data are shipped with the system: all you have to do is run /usr/lib/sa/sar_enable -y (if you haven't done it already) to start sar collecting data. Sar will quietly collect performance information regularly, and will store days or even a full month's worth of data in the /var/adm/sa directory for your later review.
Don't turn sar on when you have a problem; turn it on when your system is running normally. The baseline data you collect is valuable: it shows you what your system should look like.
Sar doesn't make your system slow, as some people think. It runs from cron every twenty minutes and performs its work very quickly. Try:
timex /usr/lib/sa/sa1 to verify that for yourself.
Sar will use about 1000 blocks for a whole week's worth of data, and it normally only stores two weeks' worth, and won't keep more than a month. It is entirely self-limiting.
By default, on current versions, the script /usr/lib/sa/sa2 removes files older than 7 days from /var/adm/sa. You can modify that script by commenting out the "find" at the end of it. This will cause a full month's worth of sar data to be kept. Because sar files are named with the day of the month only, new data will overwrite old each new month, and the directory will not continue to grow.
To analyze today's activity, just type "sar" followed by any flags desired. For example, "sar -r" reports memory usage. To analyze a different day, look in /var/adm/sa for files named sa01, sa02, etc. Those represent data from the day of the month indicated by the numeric part of the name. Type "sar -r -f sa01" to analyze data from the 1st, for example.
The "sar01", "sar02" files are complete reports run with all of sar's options turned on. They are ASCII text; you can view them or print them directly.
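The day-of-month naming scheme described above can be sketched in a few lines of Python. This is only an illustration: the helper names `sa_data_file` and `sar_command` are hypothetical, and the directory is the /var/adm/sa path given in the text, which may differ on other systems.

```python
from datetime import date
from pathlib import PurePosixPath

# Directory named in the text; other systems may keep sar data elsewhere.
SA_DIR = PurePosixPath("/var/adm/sa")


def sa_data_file(day):
    """Return the sar data file for a date, e.g. sa01 for the 1st."""
    return SA_DIR / f"sa{day.day:02d}"


def sar_command(day, flags=("-r",)):
    """Build the argument list for 'sar <flags> -f <data file>'."""
    return ["sar", *flags, "-f", str(sa_data_file(day))]


if __name__ == "__main__":
    # Only the day of the month is used, so the same file name
    # comes back every month and old data is overwritten.
    print(" ".join(sar_command(date(2005, 8, 1))))
    print(sa_data_file(date(2005, 8, 1)) == sa_data_file(date(2005, 9, 1)))
```

Because only the day number goes into the file name, the 1st of August and the 1st of September map to the same file, which is why the directory stops growing after a month.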
SarCheck spots memory leaks and run-away processes. It analyzes disk access, CPU bottlenecks and the usual buffers and tables. In addition to making recommendations for improving performance, it also estimates how much load you can add given your current resources. That's handy for growing companies.
Timed evals are available from their website, and a free Ultra-Lite version is available from Skunkware. Recently, SarCheck has made their 3.2v4.2 version available free: see http://www.sarcheck.com/sc324.htm
One of the great things about the Linux operating system is that it has a tool for practically everything. Anything you want to know can be ascertained as long as you have the talent to memorize some arcane commands and understand how to pipe them together while selectively cutting and pasting from multiple sets of output. Linux vendors try to make their high-end offerings stand out from other implementations by adding all-in-one administration tools that reduce complex actions to a few point-and-click operations. Those vendors, however, are missing a very important operation: system analysis and recommendations based on usage, load, and related variables. That is where SarCheck comes in.
Advertised as "an inexpensive tool ... that helps system administrators with Linux and UNIX performance tuning," it does more than that might at first imply. Yes, it takes the output from a number of utilities (ps, sar, and the /proc information, among others) and puts them into a single HTML report, but that report is unlike most you are probably used to seeing in that it is written in straightforward language that is easily understandable (there goes job security). There are a few spelling errors ("ahow" instead of "show") in the report and documentation, but nothing that can't be easily understood at first glance.
The report is generated from the output of these commands over the course of the day (cron entries do the dirty work) and is divided into three key categories: Resource Analysis, Capacity Planning, and Summary of Statistics. The following sections look at each of these categories and give examples from a SUSE Linux Enterprise Server 9 system.
Resource Analysis
The Resource Analysis section was the most verbose of the three every time I ran it, on this machine and on others. In my opinion, this is also the most useful section because it not only analyzes what is there but also makes recommendations.
The following output shows the results generated in one report:
Average CPU utilization was only 0.3 percent. This indicates that spare capacity exists within the CPU. If any performance problems were seen during the monitoring period, they were not caused by a lack of CPU power. CPU utilization peaked at 2.02 percent from 11:09:30 to 11:10:00. A CPU upgrade is not recommended because the current CPU had significant unused capacity.
Average time spent waiting for I/O was 0.21 percent. I/O wait time peaked at 0.41 percent from 11:09:30 to 11:10:00. Traditional UNIX thresholds for this column ahow that values in excess of 7 - 15 percent are high enough to suggest an I/O bottleneck. Insufficient data exists to determine if these thresholds are valid on a multiprocessor Linux system.
Average time spent servicing interrupts was 0.013 percent. The time spent servicing interrupts peaked at 0.03 percent from 11:09:30 to 11:10:00.
Average time spent servicing softirqs was 0.0 percent. The time spent servicing softirqs peaked at 0.00 percent from 11:00:00 to 11:09:30.
CPU number 0 was busy for an average of 0.32 percent of the time. During the peak interval from 11:09:30 to 11:10:00, this CPU was 2.43 percent busy. The CPU was busy with user work 0.23 percent of the time and was busy with system work 0.07 percent of the time. The sys/usr ratio on this CPU was 0.29:1. This is below the threshold of 2.50:1.
CPU number 1 was busy for an average of 0.23 percent of the time. During the peak interval from 11:09:30 to 11:10:00, this CPU was 1.61 percent busy. The CPU was busy with user work 0.14 percent of the time and was busy with system work 0.06 percent of the time. The sys/usr ratio on this CPU was 0.39:1.
Individual CPU Statistics

CPU#  Average %Busy  Average %User  Average %Sys  Average %Nice  Peak %Busy
0     0.32           0.23           0.07          0.02           2.43
1     0.23           0.14           0.06          0.03           1.61
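The per-CPU figures above are consistent with the busy percentage being the sum of the user, sys, and nice percentages. The short check below illustrates that reading. It is an observation about these particular numbers, not documented SarCheck behaviour, and the function name `busy_from_parts` is made up for the example.

```python
# Averages copied from the Individual CPU Statistics table.
cpu_stats = {
    0: {"busy": 0.32, "user": 0.23, "sys": 0.07, "nice": 0.02},
    1: {"busy": 0.23, "user": 0.14, "sys": 0.06, "nice": 0.03},
}


def busy_from_parts(row):
    """Assumed relationship: %busy = %user + %sys + %nice."""
    return round(row["user"] + row["sys"] + row["nice"], 2)


if __name__ == "__main__":
    for cpu, row in cpu_stats.items():
        print(cpu, busy_from_parts(row), row["busy"])
```

The sys/usr ratios quoted in the report (0.29:1 and 0.39:1) do not reproduce exactly from these rounded averages, which suggests the report computes them from unrounded data.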
The values of the min-timeslice and max-timeslice parameters were 10000 and 300000, respectively.
Capacity Planning
The average amount of free memory was 10626.7 pages or 41.5 megabytes. The minimum amount of free memory was 9929 pages or 38.79 megabytes at 12:30:00. The /proc/sys/vm/freepages data was not found. This is normal on some kernels and kernel version 2.6.5-7.97-smp was in use at the time. A roughly horizontal line was detected in the free memory statistics at 39.2 megabytes. This may indicate the approximate point at which the operating system gets more aggressive about freeing up pages of memory. The average swap out rate during those intervals when free memory appeared to be at a memory threshold was 0.00 per second and the average CPU utilization was 0.01 percent.
No memory bottleneck was seen and the system has sufficient memory.
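The page and megabyte figures in the free memory paragraph line up if a page is 4 KiB. That page size is an assumption (the report does not state it), and `pages_to_megabytes` is a hypothetical helper used only to show the arithmetic.

```python
PAGE_SIZE_BYTES = 4096  # assumed page size; not stated in the report


def pages_to_megabytes(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    print(round(pages_to_megabytes(10626.7), 1))  # average free memory
    print(round(pages_to_megabytes(9929), 2))     # minimum free memory
```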
The value of the page_cluster parameter was 3. This means that 8 pages are read at once. Values of 2 or 3 are typically better for systems will small memory sizes, systems where response time is important, and systems where most I/O is not sequential. There may be an advantage in raising this value if you want to speed up the performance of programs which do mostly sequential I/O.
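The report's statement that a page_cluster of 3 means 8 pages per read follows from the value being a power-of-two exponent. A minimal sketch, with `pages_per_read` as an illustrative name:

```python
def pages_per_read(page_cluster):
    """Pages read at once for a given page_cluster value (2 ** value)."""
    if page_cluster < 0:
        raise ValueError("page_cluster must not be negative")
    return 1 << page_cluster


if __name__ == "__main__":
    for value in (2, 3, 4):
        print(value, pages_per_read(value))
```

Each step up in the parameter doubles the read size, so the values of 2 and 3 mentioned in the report correspond to 4 and 8 pages.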
The data needed to calculate page in and page out rates was not present in /proc/stat.
The data needed to calculate swap in and swap out rates was not present in /proc/stat.
The /proc/sys/vm/swappiness file was seen. The swappiness value was 60.
The size of swap space was 1004.02 megabytes. The peak amount of swap space in use was 0.948 percent of the total. This is a very small amount of the total available swap space. The amount of swap space in use did not change during the monitoring period.
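The swap figures can be cross-checked against the Summary of Statistics with one line of arithmetic. The helper name `swap_in_use_megabytes` is hypothetical and the inputs are the two numbers quoted in the report.

```python
def swap_in_use_megabytes(total_megabytes, percent_in_use):
    """Swap in use, in megabytes, from the total size and a percentage."""
    return total_megabytes * percent_in_use / 100.0


if __name__ == "__main__":
    # 0.948 percent of 1004.02 megabytes
    print(round(swap_in_use_megabytes(1004.02, 0.948), 2))
```

The result rounds to 9.52 megabytes, which is the peak swap figure listed in the summary table.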
There was one swap partition seen in /proc/swaps.
According to data collected from /proc/partitions, the system-wide disk I/O rate averaged 0.76 per second and peaked at 2.13 per second from 11:09:30 to 11:10:00. The read rate averaged 0.0009 per second and peaked at 0.007 per second from 11:00:00 to 11:09:30. The write rate averaged 0.76 per second and peaked at 2.13 per second from 11:09:30 to 11:10:00.
The value of the logging_level parameter was 0. A value of zero disables logging. This has the least overhead but provides no information about SCSI activity.
The I/O rate on disk device hda averaged 0.83 per second and peaked at 2.13 per second from 11:09:30 to 11:10:00. The read rate averaged 0.0010 per second. The write rate averaged 0.83 per second. This disk was busy for an average of 0.46 percent of the time and was 0.87 percent busy at peak times.
The noatime option was not specified on any of the mounted filesystems. Since no disk bottleneck was seen, there is probably no reason to change this. Disk activity on the busiest disk device, hda, peaked at only 0.87 percent busy from 11:09:30 to 11:10:00.
The value of the ctrl-alt-del parameter was 0. The value of 0 is better in almost all cases because it prevents an immediate reboot if the ctrl, alt, and delete keys are pressed simultaneously.
There were an average of 1023.66 interrupts per second and the peak interrupt rate seen was 1118.27 per second from 11:09:30 to 11:10:00.
The Capacity Planning section contains information identifying the licensing of the software (serial number, any expiration information, and so on), as well as "a rudimentary linear capacity planning model." Right or wrong, estimates made assume that if you increase the workload, you will increase the usage of each and every resource the same (hence the linear part of the model).
The more you run this tool, and the more you study the resource analysis under different loads, the more accurate you can get with capacity planning.
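To make the linear assumption concrete, here is a minimal sketch of what such a model could look like. It is not SarCheck's actual algorithm, which the article does not describe: the function names, the 100 percent ceiling, and the choice of inputs (peak CPU and peak disk busy figures from the report) are all assumptions for illustration.

```python
def max_load_multiplier(peak_pct, ceiling_pct=100.0):
    """Largest workload multiplier that keeps every resource at or below
    the ceiling, assuming usage of each resource scales linearly."""
    return min(
        (ceiling_pct / used for used in peak_pct.values() if used > 0),
        default=float("inf"),
    )


def project(peak_pct, multiplier):
    """Projected utilization of each resource at a workload multiplier."""
    return {name: used * multiplier for name, used in peak_pct.items()}


if __name__ == "__main__":
    # Peak figures taken from the report: CPU 2.02%, disk hda 0.87% busy.
    peaks = {"cpu": 2.02, "disk hda": 0.87}
    print(f"{max_load_multiplier(peaks):.1f}x")
    print(project(peaks, 10))
```

Under these assumptions the first resource to saturate sets the limit, here the CPU at roughly 49 times the measured peak. A real workload rarely scales every resource equally, which is why the estimate improves as you collect data under different loads.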
Summary of Statistics
The Summary of Statistics section offers a comprehensive table showing pertinent information on the system. The following table is from the slow SLES server that I ran it on (note: I changed the MAC Address value in the table). By looking at this, it is immediately apparent that there is no sizable load on the server and it is possible to start loading it up for the company intranet:
Statistics for system: lab_linux

Statistic                           Value                          Peak interval (start to end)  Date of peak interval
Statistics collected on:            08/26/2005
MAC Address:                        xx:xx:xx:xx:xx:xx
Average combined CPU utilization:   0.28%
Average user CPU utilization:       0.19%
Average sys CPU utilization:        0.06%
Average 'nice' CPU utilization:     0.03%
Peak combined CPU utilization:      2.02%                          11:09:30 to 11:10:00          08/26/2005
Peak 'not nice' CPU utilization:    1.89%                          11:09:30 to 11:10:00          08/26/2005
Average time in I/O wait:           0.21%
Average time servicing interrupts:  0.01%
Average time servicing softirqs:    0.00%
Average swap space in use:          9.52 megabytes
Peak swap space in use:             9.52 megabytes
Average amount of free memory:      10627 pages or 41.5 megabytes
Minimum amount of free memory:      9929 pages or 38.79 megabytes  at 12:30:00                   08/26/2005
Average system-wide I/O rate:       0.76/sec
Peak system-wide I/O rate:          2.13/sec                       11:09:30 to 11:10:00          08/26/2005
Average read rate:                  0.0009/sec
Peak read rate:                     0.007/sec                      11:00:00 to 11:09:30          08/26/2005
Average write rate:                 0.76/sec
Peak write rate:                    2.13/sec                       11:09:30 to 11:10:00          08/26/2005
Disk device w/highest peak:         hda
Avg pct busy for that disk:         0.46%
Peak pct busy for that disk:        0.87%                          11:09:30 to 11:10:00          08/26/2005
Average Interrupt rate:             1023.66/sec
Peak Interrupt rate:                1118.27/sec                    11:09:30 to 11:10:00          08/26/2005
Approx CPU capacity remaining:      100%+
Approx I/O bandwidth remaining:     100%+
Can memory support add'l load:      Yes
More About the Product
SarCheck is available for a number of platforms, with Linux being but one. If sar does not exist in your implementation (true with many distributions of Linux), its functionality is handled by other SarCheck utilities to give the same results. Make sure, however, that you have added gnuplot — SarCheck can use it to produce graphs and include them in the report. To use SarCheck, you must manually create a number of directories and add entries into the crontab file.
An evaluation copy can be obtained for free by filling out and submitting the order form on the Web site. After you finish the form, someone manually reads your request and then emails you (within a day or so) a build with an expiration date hard-coded in it. Be prepared to start testing it immediately upon arrival as that expiration is cast in stone and is a shorter time period than many other vendors offer.
Current pricing is $450 for a 1-year subscription, and $112.50 for each subsequent year. Discounts are available for quantity purchases ranging from $2,075 for six to $4,795 for 24.
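The pricing above is easy to compare with a little arithmetic. The sketch below assumes that the quantity prices are totals for the whole bundle (the article does not say so explicitly) and uses hypothetical helper names.

```python
FIRST_YEAR = 450.00  # 1-year subscription price from the article
RENEWAL = 112.50     # price of each subsequent year


def subscription_cost(years):
    """Total cost of one subscription kept for a number of years."""
    if years < 1:
        raise ValueError("years must be at least 1")
    return FIRST_YEAR + RENEWAL * (years - 1)


def per_license(bundle_total, quantity):
    """Price per license, assuming the bundle price is a total."""
    return round(bundle_total / quantity, 2)


if __name__ == "__main__":
    print(subscription_cost(3))
    print(per_license(2075, 6), per_license(4795, 24))
```

On that reading, a bundle of six works out to about $345.83 per license and a bundle of 24 to about $199.79, against $450 for a single copy.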
More Information
Aptitune Corporation
P.O. Box 1033
Plaistow, NH 03865
Fax: (603) 382-4247
Web Home Page: http://www.sarcheck.com
Sales: sales@sarcheck.com
Emmett Dulaney is the author of several books on Linux, Unix, and certification. He is a former partner in Mercury Technical Solutions, and can be reached at edulaney@iquest.net.
Copyright © 1996-2008 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: June 02, 2008