Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Softpanorama University
Performance Monitoring and Tuning

News

freshmeat.net Project details for sysstat

The sysstat package contains the sar, sadf, iostat, mpstat, and pidstat commands for Linux. The sar command collects and reports system activity information. The statistics reported by sar concern I/O transfer rates, paging activity, process-related activities, interrupts, network activity, memory and swap space utilization, CPU utilization, kernel activities, and TTY statistics, among others. The sadf command may be used to display data collected by sar in various formats. The iostat command reports CPU statistics and I/O statistics for tty devices and disks. The pidstat command reports statistics for Linux processes. The mpstat command reports global and per-processor statistics.

Release focus: Minor bugfixes

Changes:
mpstat and sar didn't parse /proc/interrupts correctly when some CPUs had been disabled. This is now fixed. This release also fixes a bug in pidstat which caused confusion between PID and TID, resulting in erroneous statistics values being displayed. The iconfig script has been updated: Help for the --enable-compress-manpg parameter is now available, help for the --enable-install-cron parameter has been updated, and the parameter cron_interval has been added.

DNet eWEEK Sprint puts backbone flow under surveillance

Aiming to provide increasingly higher-quality IP and Internet services at lower prices, Sprint Corp. has begun its most comprehensive study to date of traffic behavior on its Internet backbone.

After a year of developing its own test equipment, the carrier late last month began collecting data at its San Jose, Calif., Internet POP (point of presence), the first of many sites slated for testing.

Sprint plans to use the data from the testing, called the Internet Measurement Study, to ensure that its network can handle ever-increasing customer traffic volume and to discover which network monitoring tools will be needed in future network equipment.

"Very little is known about the detailed behavior of Internet backbones," said Bryan Lyles, chief scientist at Sprint, in Kansas City, Mo. "Very fine-grained studies are what we need to make rational decisions on the equipment that goes into the network -- even the standards that go into it."

Sprint hopes the multimillion-dollar, multiyear study will enable it to keep its equipment costs as low as possible and ensure that its network delivers optimal performance.

"The goal is to make sure we make the best use of capital and the other resources we put into the network and to keep our customers happy," Lyles said.

Performance, performance, performance

As the Internet's importance to a company's bottom line increases, users expect ISPs (Internet service providers) or other data carriers to meet increasingly stringent service performance goals.

At Quebecor Printing (USA) Inc., which is installing an IP-based VPN (virtual private network) at its many locations, "class of service will include bandwidth allocation and prioritization for certain applications," said Terry Bush, vice president of data communications, in Greenwich, Conn.

At its bigger printing facilities, the company is installing multiple 1.5M-bps circuits to handle growth in its data traffic because IP bandwidth is more efficient and flexible in a VPN than in more conventional network designs, Bush said. Nevertheless, Quebecor demands service levels that rival private network solutions and has a service-level agreement that specifies zero packet loss and a round-trip, coast-to-coast network delay of less than 75 milliseconds, Bush said.

Sprint isn't alone among carriers and ISPs in its quest to improve Internet service. For example, "2001 will probably be the last year that we will buy narrowband switches," said Fred Briggs, chief technical officer at WorldCom Inc., in Clinton, Miss.

Solaris Developer Connection

Chat Title: Solaris Utilities for Monitoring System Performance
Guest Speakers: James Liu and Karpagam Narayanan

This is a moderated forum

LizA: Welcome to the Solaris Live Chat, "Solaris Utilities for Monitoring System Performance" with James Liu and Karpagam Narayanan. James was our first Solaris Live! guest and we're very happy to have him back. James is ready to answer your questions on software development and benchmark formation strategies and configuration, scaling analysis, processor management, thread libraries, and so on. He is joined by Karpagam Narayanan, who has lots of experience with all the standard tools like Virtual Adrian (aka SE Toolkit), disk partitioning, network bandwidth trunking, and other things that get your app to run faster on Solaris[tm]. Karpagam and James, let's say that I'm new to Solaris and I want to know what CPU a process takes. Is there a command that shows me this?

jamesliu: I'll take this one. A number of commands can show this. You can use prstat which is bundled with Solaris 8 and is probably easiest. If you have the freeware top... you can use this too.

LizA: What does NLWP mean in prstat?

karpagam: NLWP refers to the number of light weight processes, or LWP, associated with the process.

LizA: How does someone find out which processors are online or off line?

jamesliu: You can find out using the psrinfo command. The -v option gives you a lot of info on the processors.

LizA: I need to increase the file descriptors on my server...I bumped up the ulimit but it still doesn't work. What else do I need to do?

karpagam: Increase the rlim_fd_max and rlim_fd_cur parameters in /etc/system. Remember that these take effect after you reboot.

jamesliu: LizA, you can also gain some efficiencies if your problem is related to using network file descriptors (i.e. sockets). You can tune the tcp/ip parameters using the ndd /dev/tcp command to shorten the tcp_time_wait_interval.
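
To confirm which descriptor limits a process actually sees after a ulimit or /etc/system change, one option is to ask from inside a process. The following is a minimal sketch, assuming a Unix system where Python's standard resource module is available; the function names fd_limits and describe are hypothetical. The soft and hard values should correspond to rlim_fd_cur and rlim_fd_max respectively.

```python
import resource


def fd_limits():
    """Return (soft, hard) file descriptor limits for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


def describe(value):
    """Render a limit, showing 'unlimited' for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = fd_limits()
    # The soft limit is the one open() and socket() run into; it can be
    # raised up to the hard limit without extra privileges.
    print("soft limit:", describe(soft))
    print("hard limit:", describe(hard))
```

If the soft limit printed here is still the old value, the change has not reached the process, for example because the server was started from a shell with a lower ulimit.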

tefluid: I'm interested in optimizing application servers in order to run Java[tm] engines such as BEA WebLogic and ATG Dynamo. What advice can you give on profiling the system to best determine where the bottlenecks lie?

karpagam: This is a Java on Solaris question. Java has a profiling tool called hprof that can be included in the command line. Run java -Xrunhprof:help for more info on this. The output shows you the methods that take the most CPU time...

karpagam: tefluid, There is a HAT (Heap Analysis Tool) also available. There are also 3rd party GUI tools available. Optimizeit and JProbe are two of them.

LizA: I heard that in Solaris you can allocate certain processors to work on only one process. Will that help, too?

jamesliu: LizA, you can in fact dedicate certain processors to a specific process. The command to use is psrset. For folks like Tefluid, binding the JVM PID to a processor set and excluding interrupts can possibly give a boost in performance.

Craki: I have a farm of Sybase database boxes all on Solaris 8. Where can I start in making sure that everything that can be optimized for database operations is?

karpagam: Craki, I would always start with the db monitoring tools. Once you are sure that you do not have any issues go through the system parameters...

karpagam: Craki, Start by looking into shared memory, semaphores and message queue parameters first in the /etc/system. Then look into disk, network, NFS, swapping/paging, memory, CPU, filesystem, and TCP, one at a time...

karpagam: Craki, do look in http://www.sun.com/sun-on-net/performance/perftools-solaris8.pdf for more info on Solaris tools

Zartaj: I am interested in performance comparisons between Sun Solaris and Wintel. The problem is it is not easy to decide what is the right pair to compare. I have a UE250 450MHz with Solaris 8 and a P3 733 MHz with Windows 2000. I have seen the Wintel box consistently outperform the UE250. But is that a fair comparison? In general if I have a Sun system how do I determine what is the equivalent Wintel system to compare. Going by price alone, Wintel seems to have the edge.

jamesliu: Zartaj, it is often a race for more MIPS/MFLOPS, etc. in the hardware area. I don't know which benchmarks you run, but for the apps that are important to Sun's customers, Sun consistently tunes our applications to outscale and outperform anything on the market. It all depends on the use. In your particular case, it may in fact be that Wintel has better price performance. For many of Sun's core customers, our value proposition is reliability, availability and scalability. We've competed well on this philosophy for about 18 years and I predict we'll continue. As for your particulars, perhaps we can communicate offline and discuss how to improve your performance.

alexc: We use some scripts to automate gathering info from ps. We also use sar. We notice that total CPU utilization (by adding up ps info) is usually quite a bit less than what is stated by sar. Why is there a discrepancy?

karpagam: Alexc, I am not sure what ps you are referring to - /usr/ucb/ps? In what version of Solaris? I do not know the time interval that ps uses for data gathering. If you are in Solaris 8, try using prstat. There are a lot of parameters that can come into play here - interval, versions, options for the tools, etc...

LizA: What do I need in order to look at mpstat? What do the columns mutexes and context switching mean?

karpagam: LizA, the smtx column counts mutex spins, which occur when a lot of CPUs are trying to grab the same resource lock. Only one CPU will be successful at any time. We do not want this to happen a lot...

jamesliu: LizA, context switching is also something that, done too often, expends resources... What you want to do is to limit these values to certain levels. smtx, for example is best below 500 per CPU per second. Context switches ... you can check at http://www.setoolkit.com.
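
The smtx guideline above (below 500 per CPU per second) can be checked mechanically against captured mpstat output. This is a rough sketch, assuming the usual Solaris mpstat column layout; the sample text, the constant SMTX_LIMIT and the function name busy_mutex_cpus are all made up for illustration.

```python
SMTX_LIMIT = 500  # guideline quoted in the chat: mutex spins per CPU per second

# Illustrative sample only; real mpstat output will differ.
MPSTAT_SAMPLE = """\
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   12   0  340   410  210  520   14   30  120    0  1800   35  10   2  53
  1    8   0  290   105    2  610   22   41  742    0  2500   40  30   1  29
"""


def busy_mutex_cpus(mpstat_text, limit=SMTX_LIMIT):
    """Return {cpu_id: smtx} for CPUs whose smtx value exceeds limit."""
    flagged = {}
    columns = None
    for line in mpstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "CPU":
            columns = fields  # the header repeats before each interval report
            continue
        if columns is None or len(fields) != len(columns):
            continue  # ignore anything that does not match the header layout
        row = dict(zip(columns, fields))
        smtx = int(row["smtx"])
        if smtx > limit:
            flagged[int(row["CPU"])] = smtx
    return flagged


if __name__ == "__main__":
    print(busy_mutex_cpus(MPSTAT_SAMPLE))
```

Looking columns up by header name rather than by position keeps the sketch tolerant of releases that add or reorder columns.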

Zartaj: I'd like to know what tools are available for shared library profiling? Shared libraries cannot be instrumented for prof or gprof. And the LD_PROFILE variable can be used only for one shared library at a time. So how do I go about profiling all shared libraries being used by an app?

karpagam: Zartaj, You can try using truss and sotruss. truss gives shared library activity and entry/exit trace of user-level function calls. sotruss is good and has less noise than truss...

dmdebertin: Are there any particular columns in vmstat (or other command) output that could indicate hardware or software problems? What are some things to look for that could indicate problems, and what is harmless?

jamesliu: DMDebertin, if your CPU percentage is high but system usage is low, most of the CPU is consumed by your app. You may want to think about tuning your code in this case. If system time is high, check out more with mpstat and look at context switch and smtx values.

Emory2: Could you please compare the performance of a 24 CPU SunFire 6800 to the performance of a 24 CPU IBM S80 (configured with the same amount of RAM).

karpagam: Emory2, For what workload? You can consider looking into the TPC-C, TPC-D, or SPEC standard benchmark pages that match your workload.

LizA: How do I monitor the network?

karpagam: LizA, the primary tool you can use is netstat. There are options like -in for cumulative data, -s for TCP/UDP stats, -I for specific interface. I like to put in netstat -in in a while loop...

jamesliu: LizA, Sun also provides some scripts for tuning your network drivers. http://www.sun.com has these scripts. Search for "network tuning" or "syn flood" and you should see some docs on how to tune your network interface.

karpagam: LizA, netstat -a gives a lot more information on the sockets/ports open. Look for ESTABLISHED and TIME_WAIT.
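
Because netstat -in reports cumulative counters, running it in a loop is mostly useful for the difference between successive snapshots. The sketch below shows that subtraction, assuming the common Solaris netstat -in column names; the two snapshots and the function names are invented for illustration.

```python
COUNTERS = ("Ipkts", "Ierrs", "Opkts", "Oerrs", "Collis")

# Two illustrative snapshots taken some seconds apart (made-up numbers).
BEFORE = """\
Name  Mtu  Net/Dest      Address        Ipkts  Ierrs Opkts  Oerrs Collis Queue
lo0   8232 127.0.0.0     127.0.0.1      1000   0     1000   0     0      0
hme0  1500 192.168.1.0   192.168.1.10   52000  0     48000  2     10     0
"""
AFTER = """\
Name  Mtu  Net/Dest      Address        Ipkts  Ierrs Opkts  Oerrs Collis Queue
lo0   8232 127.0.0.0     127.0.0.1      1020   0     1020   0     0      0
hme0  1500 192.168.1.0   192.168.1.10   52750  0     48600  2     14     0
"""


def parse_netstat_in(text):
    """Map interface name -> {counter: cumulative value}."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {}
    header = lines[0]
    table = {}
    for fields in lines[1:]:
        if len(fields) != len(header):
            continue  # skip rows that do not match the header layout
        row = dict(zip(header, fields))
        table[row["Name"]] = {name: int(row[name]) for name in COUNTERS}
    return table


def deltas(before, after):
    """Per-interface counter increase between two parsed snapshots."""
    result = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue  # interface appeared between snapshots
        result[name] = {key: new[key] - old[key] for key in COUNTERS}
    return result


if __name__ == "__main__":
    print(deltas(parse_netstat_in(BEFORE), parse_netstat_in(AFTER)))
```

Rising Ierrs, Oerrs or Collis deltas between snapshots are usually more telling than the absolute totals, which accumulate from boot.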

LizA: netstat -a tells me that I have over 8000 connections. But I have only 3000 sessions open. They have a time_wait status on more than half of them. Is that something to do with my application?

jamesliu: LizA, Regarding netstat output, you'll probably have lots of network sessions still waiting to close. The default setting on Solaris is 240 seconds. You can use ndd /dev/tcp to set the tcp_time_wait_interval to a lower value so that these connections close down more quickly. Say 30 seconds is good. Be careful not to set this too low as slow connections (e.g. modems) might get dropped.
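
To see how many connections sit in each state, the netstat -a output can be tallied by its State column. This is a small sketch, assuming the state is the last field on each TCP line; the sample text and the function name count_states are illustrative only.

```python
from collections import Counter

TCP_STATES = {
    "LISTEN", "ESTABLISHED", "TIME_WAIT", "CLOSE_WAIT", "FIN_WAIT_1",
    "FIN_WAIT_2", "SYN_SENT", "SYN_RCVD", "CLOSING", "LAST_ACK",
}

# Illustrative excerpt only; real netstat -a output will differ.
NETSTAT_SAMPLE = """\
TCP
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q  State
-------------------- -------------------- ----- ------ ----- ------ -------
      *.80                 *.*                0      0 24576      0 LISTEN
host.80              10.0.0.5.41022        8760      0 24820      0 ESTABLISHED
host.80              10.0.0.6.41555        8760      0 24820      0 TIME_WAIT
host.80              10.0.0.7.40210        8760      0 24820      0 TIME_WAIT
"""


def count_states(netstat_text):
    """Count TCP connections per state, keyed by the last field of each line."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    for state, total in count_states(NETSTAT_SAMPLE).most_common():
        print(state, total)
```

A TIME_WAIT count well above the ESTABLISHED count matches the situation described above. Note that ndd is generally documented as taking this interval in milliseconds, so 30 seconds would be written as 30000; check the documentation for your release before changing it.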

Zartaj: I believe a 32-bit process can only use around 3GB out of a possible 4GB. So is it useful to have more than 4GB physical memory on a system that allows it?

karpagam: Zartaj, What you need to look into is how much your application uses/needs. Are you running 64-bit Oracle and need more than 4GB SGA? Use pmap to tell you the process footprint and calculate on that basis.

Zartaj: In the Solaris Multithreading Guide, it recommends against thread-pooling saying it is cheaper to create threads as needed. Do you agree with that?

jamesliu: Zartaj, in general I would agree that threads are relatively cheaper to create than to pool. Pooling creates many potential opportunities for contention. However, in some cases, such as Java, the threading model may be more amenable to pooling since there is a Java layer there.

jd: The way I understand load average to be calculated, it is incremented by 1 for every CPU's worth of time spent. (Ex. a 10 CPU system with 10% user time as shown by vmstat will report a load avg. of 1). High system time (as shown in vmstat) causes load to jump very high in some cases; I have seen load avg. of 30 on a 10 CPU system with 40% system time/10% user time. I would like to know how the system comes up with that load avg.

jamesliu: jd, I couldn't tell you exactly how the algorithm works. It's been a while since I've touched on it. Karpagam?

karpagam: jd, A high system time of that ratio clearly shows that there is a bottleneck. Did you check to see how your disks are doing? You also might want to see in mpstat/top/prstat/statit what the utilization per processor is.

Craki: I find that whenever a box has fairly high uptime, reported memory usage is higher than it should be. My DBAs see this and start getting worried about the boxes not being big enough. Is this a Solaris behavioral quirk?

jamesliu: Craki, I can't be certain, but our experience shows that in uptimes of 60+ days, the memory footprint remains stable on many of our servers. The most common area of memory growth over time we've seen has perhaps been in memory leaks on the application or windowing side. Many windowing apps or servers or windows managers do in fact leak lots of memory. This may be the cause of growth over time.

jd: I am not asking about a problem in particular, I have just seen the load avg. jump like that and am curious as to how it's calculated.

karpagam: jd, Did you see this on Solaris 8?

Emory2: Does anyone know if there is a working version of "proctool" for Solaris 8? One version that we tested did not work for multiprocessors.

karpagam: Emory2, you can use the proc tools in /usr/proc/bin, right? pmap, ptree, ptime, pldd, etc...

jd: I have seen it on 2.6 and 8; the most recent was on 8 where a Java programmer had an app. that went crazy with creating/deleting threads.

jamesliu: jd, I guess you're still asking about how the load average is computed. Again, I can't tell you off hand since it's been a while since I've touched the algorithms. But I can imagine that any process that creates/destroys lots of threads is a contrived and somewhat unique situation. Perhaps we can work offline to discuss optimization and development techniques to reduce the CPU utilization.
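
Neither guest explains the calculation. On Unix systems in general, the load average is an exponentially damped average of the run queue length (threads running or waiting to run), not a sum of CPU utilization, which would be consistent with a burst of short-lived threads pushing the figure well above the CPU count. The sketch below shows that classic form only; it is not taken from Solaris kernel code, and the 5-second sampling interval, the 60-second window and the function names are assumptions.

```python
import math


def next_load(load, run_queue, interval=5.0, window=60.0):
    """One update step of an exponentially damped average.

    load      -- previous load average
    run_queue -- threads running or runnable at this sample
    interval  -- seconds between samples (assumed value)
    window    -- averaging period in seconds (60 for a 1-minute average)
    """
    decay = math.exp(-interval / window)
    return load * decay + run_queue * (1.0 - decay)


def simulate(run_queue_samples, interval=5.0, window=60.0):
    """Feed a series of run queue lengths through the average, starting at 0."""
    load = 0.0
    for run_queue in run_queue_samples:
        load = next_load(load, run_queue, interval, window)
    return load


if __name__ == "__main__":
    # 30 runnable threads held for one minute, sampled every 5 seconds.
    print(round(simulate([30] * 12), 2))
```

Under this model, a sustained run queue of 30 threads drives the average toward 30 regardless of how many CPUs the machine has.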

LizA: Are there any special libraries I can use to improve performance?

jamesliu: LizA, there are a number of libraries that might boost performance. Some are in Solaris 8, some are third party. If you have a thread intensive application and have high smtx values, due to schedlock, you may want to put /usr/lib/lwp, which holds an alternate thread library, at the top of your LD_LIBRARY_PATH. If your app. is memory allocation intensive, there are 3 ISV solutions that replace the bundled malloc on Solaris and improve performance.

alexc: question about threading, etc., ... the way I understand it, some programmers use multiple processes to do threading (spawning child processes) and some use threads within a single process. Clearly, multiple processes can run on multiple processes simultaneously. However, can threads within a single process run on more than one processor simultaneously?

alexc: Rather, multiple processes can use multiple processORs, but can threads within a single process do the same?

jamesliu: Alexc, absolutely. Threads do run on multiple processors on Solaris. As do multiple processes with multiple threads. Solaris supports scheduling that allows a many-to-many relationship between threads or processes and processORs.

Craki: Can you recommend a centralized monitoring/management package? I've done a small deployment of Sun[tm] Management Center and liked it. Would Big Brother be a good solution as well?

karpagam: Sun Management Center is very good. If you want to monitor database statistics also, I know that a lot of folks use Foglight from Quest Software. I do not know about Big Brother - sorry.

LizA: We're about out of time. Thanks to Karpagam and James...and all of you who asked such great questions. Karpagam and James, do you have a few parting words?

jamesliu: It has again been a pleasure. I'd be pleased to field questions in this forum again soon. -JCL

jamesliu: Note to all, if you're running any of the vmstat or mpstat, just make sure you put a time interval like 5 seconds and exclude the first entry in your computations. - jcl
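
The first line printed by vmstat or mpstat summarizes activity since boot rather than the chosen interval, which is why it distorts averages. A tiny sketch of the advice, with a hypothetical function name and made-up sample values:

```python
def average_skipping_first(samples):
    """Mean of the samples with the first (since-boot) entry left out."""
    usable = samples[1:]
    if not usable:
        raise ValueError("need at least two samples")
    return sum(usable) / len(usable)


if __name__ == "__main__":
    # Idle percentages from successive reports; the first is the boot summary.
    idle = [97, 40, 50, 60]
    print(average_skipping_first(idle))
```

Including the first entry in this made-up series would give an average idle of 61.75 instead of 50.0, making the machine look less busy than it was during the measured intervals.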

karpagam: Thanks everyone for all the wonderful questions. It has been a pleasure. Thanks LizA for running this forum smoothly :)

LizA: Be sure to join us again on June 21, at 10 a.m. PDT, when our guest is Rich Teer and the topic is "Secure C Programming."

 


Books

System Performance Tuning

Oracle and Unix Performance Tuning
Ahmed Alomari / Paperback / Published 1997
AIX Performance Tuning
Frank Waters / Paperback / Published 1996
Optimizing Unix for Performance
Amir H. Majidimehr / Paperback / Published 1995
Solaris Performance Administration : Performance Measurement, Fine Tuning, and Capacity Planning for Releases 2.5.1 and 2.6
H. Frank Cervone / Paperback / Published 1998
Sun Performance and Tuning : Java and the Internet
Adrian Cockcroft, et al / Paperback / Published 1998
System Performance Tuning (Nutshell Handbooks)
Michael Kosta Loukides, Mike Loukides / Paperback / Published 1991
UNIX Performance Tuning; Sys Admin-Essential Reference Series
Sys Admin Magazine (Editor) / Paperback / Published 1997
HP-UX Tuning and Performance : Concepts, Tools and Methods (Hewlett-Packard Professional Books)
Robert F. Sauers, Peter Weygant / Paperback / Published 1999
Sun Performance and Tuning : Sparc & Solaris
Adrian Cockcroft / Paperback / Published 1994
Taming UNIX : UNIX Performance Management Series
Robert A. Lund / Spiral-bound / Published 1997

Tutorials

Monitoring Performance and System Tuning - USAIL.  Good.

Monitoring Performance with iostat and vmstat

Making the Most of NFS

Finding Disk Hogs

UNIX Performance Management: It doesn't have to cost a fortune - Jaqui Lynch, Boston College. Slides only.


Papers

Other Cockcroft columns at www.sun.com

Performance

Troubleshooting Tips

System performance

From the SGI Admin Guide - last I checked the CPU spends most of its time waiting for something to do

 


Table 5-3: Indications of an I/O-Bound System

Field                                                        Value              sar Option
%busy (% time disk is busy)                                  >85                sar -d
%rcache (reads in buffer cache)                              low, <85           sar -b
%wcache (writes in buffer cache)                             low, <60           sar -b
%wio (idle CPU waiting for disk I/O)                         dev. system >30    sar -u
                                                             fileserver >80

Table 5-5: Indications of Excessive Swapping/Paging

Field                                                        Value              sar Option
bswot/s (transfers from memory to disk swap area)            >200               sar -w
bswin/s (transfers to memory)                                >200               sar -w
%swpocc (time swap queue is occupied)                        >10                sar -q
rflt/s (page reference fault)                                >0                 sar -t
freemem (average pages for user processes)                   <100               sar -r

Indications of a CPU-Bound System

Field                                                        Value              sar Option
%idle (% of time CPU has no work to do)                      <5                 sar -u
runq-sz (processes in memory waiting for CPU)                >2                 sar -q
%runocc (% run queue occupied and processes not executing)   >90                sar -q
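
The thresholds in these tables lend themselves to a simple rule check. The sketch below encodes them as data and reports which ones a set of sar readings crosses; the rule list mirrors the tables, while the function name diagnose, the category labels and the sample readings are hypothetical. The %wio rule uses the development-system value of 30 rather than the fileserver value of 80.

```python
import operator

# (field, comparison, threshold, category), copied from the tables above.
RULES = [
    ("%busy", operator.gt, 85, "I/O"),
    ("%rcache", operator.lt, 85, "I/O"),
    ("%wcache", operator.lt, 60, "I/O"),
    ("%wio", operator.gt, 30, "I/O"),  # dev. system value; fileserver is 80
    ("bswot/s", operator.gt, 200, "swap"),
    ("bswin/s", operator.gt, 200, "swap"),
    ("%swpocc", operator.gt, 10, "swap"),
    ("rflt/s", operator.gt, 0, "swap"),
    ("freemem", operator.lt, 100, "swap"),
    ("%idle", operator.lt, 5, "CPU"),
    ("runq-sz", operator.gt, 2, "CPU"),
    ("%runocc", operator.gt, 90, "CPU"),
]


def diagnose(sample):
    """Return (category, field) pairs whose threshold is crossed.

    sample maps sar field names to readings; missing fields are skipped.
    Results follow the order of RULES.
    """
    hits = []
    for field, compare, threshold, category in RULES:
        if field in sample and compare(sample[field], threshold):
            hits.append((category, field))
    return hits


if __name__ == "__main__":
    readings = {"%busy": 92, "%rcache": 95, "%idle": 3, "runq-sz": 1}
    print(diagnose(readings))
```

Keeping the thresholds in a table rather than in nested if statements makes it easy to swap in site-specific values, such as the fileserver limit for %wio.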

hypermail /usr/local/src/src/hypermail - mailing list to web page converter; grep hypermail /etc/aliases shows which lists use hypermail

pwck, grpck should be run weekly to make sure ok; grpck produces a ton of errors

can use local man pages - text only - see Ch3 User Services
put in /usr/local/manl (try /usr/man/local/manl) suffix .l
long ones pack -> pack program.1;mv program.1.z /usr/man/local/mannl/program.z

 




Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. If you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: November 18, 2007