Even with sufficient memory, most database servers will perform large amounts of disk I/O to bring data records into memory and flush modified data to disk. Therefore, it is important to configure sufficient numbers of disk drives to match the CPU processing power being used.
In general, a minimum of 10 high-speed disk drives is required for each Xeon processor. Optimal configurations can require more than 50 10K-RPM disk drives per Xeon CPU. With most database applications, more drives equals greater performance.
The main factors affecting performance include:
From this simple rule stem the following recommendations:
What partition layout to choose? In the Linux community, the partitioning of a disk subsystem engenders vast discussion. The partition layout is often dictated by application needs, systems management considerations, and personal preference rather than performance, so in most cases it is already a given. The only suggestion we want to give here is to use a swap partition. Swap partitions, as opposed to swap files, have a performance benefit because there is no file system overhead. Ideally, the swap partition should be on a separate disk drive (preferably solid state). A large swap area can be split in two, with each half on a different drive.
What file system to use? The RHEL 5.6 installer limits the choice of file systems to ext2, ext3, and ext4. It defaults to ext3, which is acceptable in most cases, but we encourage you to consider ext4. To allow anaconda to manipulate ext4 file systems, you need to start the 5.6 installer with the "ext4" parameter on the command line:
Smaller file systems that have no focus on integrity (for example, a Web server cluster) and systems with a strict need for performance (high-performance computing environments) can benefit from the performance of the ext2 file system. ext2 does not have the overhead of journaling, and while ext3 and ext4 have undergone tremendous improvements, there is still a difference. Also note that ext2 file systems can be upgraded easily.
On SUSE, ReiserFS can be used for applications that use many small files, such as
or other applications that use synchronous I/O.
When using ext3 with many files in one directory, consider enabling hashed b-tree directory indexing (the dir_index feature):
# mkfs.ext3 -O dir_index
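The mkfs option above only applies to newly created file systems. As a rough sketch, the helper below (the name has_dir_index is hypothetical) checks whether an existing file system already has the feature by reading `tune2fs -l` output on standard input; /dev/xxx is a placeholder device, as elsewhere on this page.

```shell
# Hypothetical helper: succeed if "dir_index" appears in the
# "Filesystem features:" line of `tune2fs -l` output read from stdin.
has_dir_index() {
    awk '
        /^Filesystem features:/ {
            # Fields 1-2 are the label; the features start at field 3.
            for (i = 3; i <= NF; i++)
                if ($i == "dir_index") found = 1
        }
        END { exit !found }
    '
}

# Usage (as root, /dev/xxx is a placeholder):
#   tune2fs -l /dev/xxx | has_dir_index && echo "dir_index enabled"
# Enabling it on an existing, unmounted file system:
#   tune2fs -O dir_index /dev/xxx
#   e2fsck -fD /dev/xxx    # rebuilds indexes for existing directories
```

Running e2fsck with -D after tune2fs matters because the feature flag alone only affects directories created or grown afterwards.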
When using ext3 with multiple threads appending to files in the same directory, consider turning on block reservation (preallocation):
# mount -o reservation
You can benefit from using dedicated logging devices:
mkreiserfs -j /dev/xxx -s 8193 /dev/xxy
reiserfstune --journal-new-device /dev/xxx -s 8193 /dev/xxy
mke2fs -O journal_dev /dev/xxx
mke2fs -j -J device=/dev/xxx,size=8193 /dev/xxy
tune2fs -J device=/dev/xxx,size=8193 /dev/xxy
File System Tuning
Split file systems based on data access patterns.
Consider disabling atime updates on files and directories:
# mount -o noatime,nodiratime
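To see which file systems would benefit, one option is to scan the mount table for ext2/ext3/ext4 mounts that lack noatime. The helper below is a minimal sketch (the function name and the /data mount point in the usage note are illustrative assumptions); it reads /proc/mounts-style lines on standard input.

```shell
# Hypothetical helper: print mount points of ext2/ext3/ext4 file systems
# whose option list (field 4) does not contain "noatime".
# Input format: device mountpoint fstype options dump pass
mounts_without_noatime() {
    awk '$3 ~ /^ext[234]$/ && $4 !~ /(^|,)noatime(,|$)/ { print $2 }'
}

# Usage:
#   mounts_without_noatime < /proc/mounts
# Applying the option to a mounted file system without unmounting it
# (as root; /data is an illustrative mount point):
#   mount -o remount,noatime /data
```

To make the setting permanent, the same option would go into the options column of the matching /etc/fstab entry.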
Per-request service deadline
Block Layer Tunables
Block read-ahead buffer: the default is 128. Increase to 512 for fast storage (SCSI disks or RAID). This may speed up streaming reads considerably.
Number of requests: the default is 128. Increase to 256 with the CFQ scheduler for fast storage. This increases throughput at a minor latency expense.
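The text does not say where these two tunables live. On 2.6 and later kernels they are normally exposed per device under /sys/block/<device>/queue, as read_ahead_kb and nr_requests; the sketch below assumes that mapping. The function name set_queue_tunable and the SYSFS_ROOT override (useful for trying the function against a scratch directory) are hypothetical, and sda is only an example device.

```shell
# Hypothetical helper: write a value to a block-queue tunable.
# Assumes the tunables are files under $SYSFS_ROOT/block/<dev>/queue/.
# SYSFS_ROOT defaults to /sys and exists only to allow dry runs.
set_queue_tunable() {
    dev=$1
    name=$2
    value=$3
    path="${SYSFS_ROOT:-/sys}/block/$dev/queue/$name"
    if [ ! -w "$path" ]; then
        echo "cannot write $path" >&2
        return 1
    fi
    echo "$value" > "$path"
}

# Usage (as root, sda is an example device):
#   cat /sys/block/sda/queue/read_ahead_kb      # show the current value
#   set_queue_tunable sda read_ahead_kb 512
#   set_queue_tunable sda nr_requests 256
```

Values written this way do not survive a reboot, so they are usually reapplied from a boot script.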
The value stored in /proc/sys/vm/dirty_background_ratio defines the percentage of main memory that dirty pages may occupy before the pdflush daemon starts writing them out to disk.
If larger flushes are desired, raising the value above the default of 10% will cause less frequent flushes.
For example, the value can be changed to 25 as follows:
# sysctl -w vm.dirty_background_ratio=25
The default value of 10 means that modified (dirty) data accumulates in memory until it amounts to roughly 10% of the server's RAM, at which point pdflush begins writing it out in the background.
The vm.dirty_ratio threshold, at which a process that generates dirty data is itself forced to write it out to disk, can be set to 20% of system memory as follows:
# sysctl -w vm.dirty_ratio=20
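To get a feel for what these percentages mean in absolute terms, the following sketch converts a ratio into an approximate threshold in kB. The function name dirty_threshold_kb is hypothetical, and using MemTotal as the base is a simplifying assumption: the kernel applies the ratio to available (free plus reclaimable) memory, so real thresholds are somewhat lower.

```shell
# Hypothetical helper: approximate dirty-page threshold in kB.
#   $1 = ratio in percent (e.g. vm.dirty_background_ratio)
#   $2 = total memory in kB (optional; defaults to MemTotal from /proc/meminfo)
# Simplification: uses total memory rather than the kernel's notion of
# available memory, so the result is an upper bound.
dirty_threshold_kb() {
    ratio=$1
    total_kb=${2:-$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)}
    echo $(( total_kb * ratio / 100 ))
}

# Usage:
#   sysctl vm.dirty_background_ratio vm.dirty_ratio   # show current settings
#   dirty_threshold_kb 10                             # 10% of this machine's RAM
#   dirty_threshold_kb 25 16777216                    # 25% of a 16 GB server
```

Settings made with sysctl -w are lost at reboot; to keep them, the same key=value pairs can be added to /etc/sysctl.conf.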
The disk subsystem is often the most important aspect of server performance, and it is usually the most common bottleneck. However, problems can be hidden by other factors, such as lack of memory. Applications are considered to be I/O-bound when CPU cycles are wasted simply waiting for I/O tasks to finish.
The most common disk bottleneck is having too few disks. Most disk configurations are based on capacity requirements, not performance. The least expensive solution is to purchase the smallest number of the largest-capacity disks possible. However, this places more user data on each disk, causing greater I/O rates to the physical disk and allowing disk bottlenecks to occur.
The second most common problem is having too many logical disks on the same array, which increases seek time and greatly lowers performance.
We discuss the disk subsystem in greater detail in 15.9, "Tuning the file system" on page 480.
As with the other components of the Linux system we discussed, disk metrics are important when identifying performance bottlenecks. Some of the values that may point to a disk bottleneck are:
Iowait -- This is the time the CPU spends waiting for an I/O to occur.
Average queue length -- This is the number of outstanding I/O requests. In general, when the value is higher than 2 to 3, it means there may be a disk I/O bottleneck. This applies to systems with a single disk. In disk arrays, however, the queue length may be different and not necessarily indicate a Linux bottleneck; it may be under the control of the I/O controller using cache or other methods.
Average wait -- This is a measurement of the average time in ms that it takes for an I/O request to be serviced. The wait time consists of the actual I/O operation and the time it waits in the I/O queue.
Transfers per second -- This refers to the number of I/O operations per second (reads and writes).
Blocks read/write per second -- This refers to the reads/writes per second in blocks of 512 bytes in the kernel 2.6 style.
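These values can be read from the extended device report of iostat (part of the sysstat package). As a minimal sketch, the filter below flags devices whose average queue length exceeds a threshold, using the "2 to 3" rule of thumb from the text. The function name flag_busy_disks is hypothetical; the queue-length column is located by its header name because older sysstat versions label it avgqu-sz and newer ones aqu-sz.

```shell
# Hypothetical helper: read `iostat -dx` style output on stdin and print
# "<device> <queue length>" for devices above the threshold ($1, default 2).
flag_busy_disks() {
    threshold=${1:-2}
    awk -v t="$threshold" '
        /^Device/ {
            # Find the queue-length column in the header line.
            col = 0
            for (i = 1; i <= NF; i++)
                if ($i == "avgqu-sz" || $i == "aqu-sz") col = i
            next
        }
        # Data lines: compare numerically against the threshold.
        col && NF >= col && $col + 0 > t { print $1, $col }
    '
}

# Usage (two 5-second samples; the first report covers the time since boot):
#   iostat -dx 5 2 | flag_busy_disks 2
```

As the text notes, a high queue length on a device backed by a disk array is not necessarily a bottleneck, so the output is a starting point for investigation rather than a verdict.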
Tuning IBM System x Servers for Performance
Tuning Red Hat Enterprise Linux on IBM Eserver xSeries Servers, IBM Redpaper July 2005
Last modified: March 12, 2019