|
Softpanorama |
May the source be with you, but remember the KISS principle ;-)
The most common multipathed environment today is a Fibre Channel (FC) Storage Area Network (SAN). These beasts can be found in most datacenters.
The multipathing layer sits above the protocols (FCP or iSCSI) and determines whether the devices discovered on the target represent separate devices or are just separate paths to the same device. In this case, Device Mapper (DM) is the multipathing layer for Linux.
To determine which SCSI devices/paths correspond to the same LUN, DM initiates a SCSI Inquiry. The inquiry response carries, among other things, the LUN serial number. Regardless of the number of paths a LUN is associated with, its serial number is always the same. This is how multipathing software determines which paths, and how many, are associated with each LUN.
Novell has a useful chapter about multipathing: Managing Multipath I/O for Devices.
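The grouping step described above can be sketched in shell. This is only an illustration of the idea, not what the multipath tools do internally: it assumes a two-column listing of "device identifier" pairs on standard input, and the function name and the short second identifier in the sample are made up for the example.

```shell
#!/bin/sh
# Group SCSI paths by LUN identifier (serial number / WWID).
# Input on stdin: one "device identifier" pair per line.
# Output: "identifier path_count device,device,..." sorted by identifier.
paths_per_wwid() {
    awk '
        NF >= 2 {
            count[$2]++
            devs[$2] = (devs[$2] == "" ? $1 : devs[$2] "," $1)
        }
        END {
            for (w in count) print w, count[w], devs[w]
        }
    ' | sort
}

# Sample input: four paths that report the same identifier, plus one
# local disk with its own (hypothetical) identifier.
paths_per_wwid <<'EOF'
sdl  360080480000290100601544032363831
sdw  360080480000290100601544032363831
sdbr 360080480000290100601544032363831
sdcc 360080480000290100601544032363831
sda  3600508b1001c5a2f
EOF
```

Every identifier that shows up with a count greater than one is a single LUN reached over several paths, which is exactly the set of devices a multipath map would be built from.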
This section describes how to manage failover and path load balancing for multiple paths between the servers and block storage devices.
gurus
I have the total opposite problem of what most folks are describing here.
I have a PE8650 with 2 QLE2460s. One does exactly what I want, but the other acts weird.
It shows 26 devices although I only have 25 devices masked to this WWPN. The extra one has a * on it, which means the OS doesn't see it. The OS never discovers it either, yet the HBA insists it's there.
Here are the details.
1. Relevant output of /proc/scsi/qla2xxx/2
before zoning:
scsi-qla1-adapter-port=2100001b321734f0;
FC Port Information:
SCSI LUN Information:
(Id:Lun) * - indicates lun is not registered with the OS.
As you can see, there is no visible target and therefore no devices.
on the dmx I have a masking entry for 25 devices for this wwpn (6 GK and 19 real devices)
2100001b321734f0 Fibre JOLIE_QLE_P2 2100001b321734f0 005E:0063
0070
0098
009C
00A0
00F1
012B
0155
0169
016D
0175
0179
0181
0185
0199
019D
01A7
01BD
01C1
01C7
When I create the zone and activate it /proc/scsi/qla2xxx/2 immediately shows 26 devices :
FC Port Information:
scsi-qla1-port-0=5006048ad5f0c350:5006048ad5f0c350:010000:81;
SCSI LUN Information:
(Id:Lun) * - indicates lun is not registered with the OS.
( 0: 0): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:25): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:26): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:27): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:28): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:29): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:30): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:31): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:32): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:33): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:34): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:35): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:36): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:37): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:38): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:39): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:40): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:41): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:42): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:43): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:44): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:45): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:46): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:47): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:48): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:49): Total reqs 0, Pending reqs 0, flags 0x0*, 1:0:81 00
At this point they're all not registered with the OS.
The first weird thing is how I can see 26, but the worst thing is that after rebooting the box the output looks like this:
FC Port Information:
scsi-qla1-port-0=5006048ad5f0c350:5006048ad5f0c350:010000:81;
SCSI LUN Information:
(Id:Lun) * - indicates lun is not registered with the OS.
( 0: 0): Total reqs 3, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:25): Total reqs 278, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:26): Total reqs 102, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:27): Total reqs 102, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:28): Total reqs 102, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:29): Total reqs 102, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:30): Total reqs 102, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:31): Total reqs 181, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:32): Total reqs 170, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:33): Total reqs 170, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:34): Total reqs 181, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:35): Total reqs 181, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:36): Total reqs 170, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:37): Total reqs 181, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:38): Total reqs 170, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:39): Total reqs 159, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:40): Total reqs 170, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:41): Total reqs 167, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:42): Total reqs 178, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:43): Total reqs 167, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:44): Total reqs 178, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:45): Total reqs 167, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:46): Total reqs 178, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:47): Total reqs 185, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:48): Total reqs 5817, Pending reqs 0, flags 0x0, 1:0:81 00
( 0:49): Total reqs 241, Pending reqs 0, flags 0x0, 1:0:81 00
As you can see 25 devices are there plus one extra one.
Problem is simply what is this, why do I see it and why does the OS not see it then ?
The other qle card on the same host only sees 25 devices and has no * just like desired.
On closer inspection I realize that this is lun0, which should only be visible via QLE port 1, but it seems stuck on QLE port 2 as well. It is almost as if the SCSI layer still thinks it's there when it isn't anymore. But shouldn't a reboot fix this, since udev will kick in?
According to symcli lun 0 and lun 19 are the same device (gatekeeper 005E). Something seems double visible but I can't see where.
PP is happy too and sees the 25 devices I want it to see.
Has anyone seen this ?
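When the LUN list is this long, the entries flagged with * are easy to miss. A small filter can pull them out. This is a sketch that assumes the line layout shown in the listings above; the function name is made up, and the /proc/scsi/qla2xxx files only exist with older qla2xxx drivers.

```shell
#!/bin/sh
# Print the Id:Lun of every LUN that the driver knows about but that is
# not registered with the OS (its flags field ends in "*").
# Input on stdin: text in the layout of /proc/scsi/qla2xxx/<host>.
unregistered_luns() {
    sed -n 's/^( *\([0-9][0-9]*\): *\([0-9][0-9]*\)):.*flags 0x[0-9a-fA-F]*\*.*/\1:\2/p'
}

# Three lines copied from the post-reboot listing above.
unregistered_luns <<'EOF'
(Id:Lun) * - indicates lun is not registered with the OS.
( 0: 0): Total reqs 3, Pending reqs 0, flags 0x0*, 1:0:81 00
( 0:25): Total reqs 278, Pending reqs 0, flags 0x0, 1:0:81 00
EOF
```

On the affected host the same check would be `unregistered_luns < /proc/scsi/qla2xxx/2`; an empty result means every LUN the HBA reports is also registered with the OS.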
Over the past couple of years a flurry of OS Native multipathing solutions have become available. As a result we are seeing a trend towards these solutions and away from vendor specific multipathing software.
The latest OS Native multipathing solution is Device Mapper-Multipath (DM-Multipath), available with Red Hat Enterprise Linux 4.0 U2 and SuSE SLES 9.0 PS2.
I had the opportunity to configure it in my lab a couple of days ago and I was pleasantly surprised at how easy it was to configure. Before I show how it's done, let me talk a little about how it works.
The multipathing layer sits above the protocols (FCP or iSCSI) and determines whether the devices discovered on the target represent separate devices or are just separate paths to the same device. In this case, Device Mapper (DM) is the multipathing layer for Linux.
To determine which SCSI devices/paths correspond to the same LUN, DM initiates a SCSI Inquiry. The inquiry response carries, among other things, the LUN serial number. Regardless of the number of paths a LUN is associated with, its serial number is always the same. This is how multipathing software determines which paths, and how many, are associated with each LUN.
Before you get started, make sure the following are installed:
- device-mapper-1.01-1.6 RPM
- multipath-tools-0.4.5-0.11
- Netapp FCP Linux Host Utilities 3.0
Make a copy of the /etc/multipath.conf file. Edit the original file and make sure only the following entries are uncommented. If you don't have the Netapp section, add it.
defaults {
    user_friendly_names yes
}
#
devnode_blacklist {
    devnode "sd[a-b]$"
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]"
    devnode "^cciss!c[0-9]d[0-9]*"
}
devices {
    device {
        vendor "NETAPP "
        product "LUN"
        path_grouping_policy group_by_prio
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout "/opt/netapp/santools/mpath_prio_ontap /dev/%n"
        features "1 queue_if_no_path"
        path_checker readsector0
        failback immediate
    }
}
The devnode_blacklist lists the devices for which you do not want multipathing enabled. So if you have a couple of local SCSI drives (e.g. sda and sdb), the first entry in the blacklist will exclude them. The same goes for IDE drives (hd).
Add the multipath service to the boot sequence by entering the following:
chkconfig --add multipathd
chkconfig multipathd on
Multipathing on Linux is Active/Active with a Round-Robin algorithm.
The path_grouping_policy is group_by_prio. It assigns paths
into Path Groups based on path priority values. Each path is given a priority
(high value = high priority) based on a callout program written by Netapp
Engineering (part of the FCP Linux Host Utilities 3.0).
The priority values for each path in a Path Group are summed and you obtain
a group priority value. The paths belonging to the Path Group with the higher
priority value are used for I/O.
If a path fails, the value of the failed path is subtracted from the Path
Group priority value. If the Path Group priority value is still higher than
the values of the other Path Groups, I/O will continue within that Path
Group. If not, I/O will switch to the Path Group with highest priority.
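The arithmetic above can be tried out with a short sketch. It is not the multipath-tools implementation: the input format ("group priority state"), the function name and the priority values are all invented for the illustration, and ties between groups are not handled.

```shell
#!/bin/sh
# Pick the path group that would carry I/O under group_by_prio.
# Input on stdin: one path per line as "group priority state",
# where state is "up" or "down". Failed paths contribute nothing.
# Output: "group total" for the group with the highest summed priority.
active_group() {
    awk '
        NF < 3 { next }
        { if (!($1 in sum)) sum[$1] = 0 }
        $3 == "up" { sum[$1] += $2 }
        END {
            best = ""
            for (g in sum)
                if (best == "" || sum[g] > sum[best]) best = g
            if (best != "") print best, sum[best]
        }
    '
}

# Group A has two paths of priority 4, group B two paths of priority 1.
# One path in A has failed: A still sums to 4, B to 2, so A stays active.
active_group <<'EOF'
A 4 down
A 4 up
B 1 up
B 1 up
EOF
```

Marking the second A path as down as well drops group A to 0, and the function then reports group B, which mirrors the switch to the other Path Group described above.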
Create and map some LUNs to the host. If you are using the latest Qlogic
or Emulex drivers, then run the respective utilities they provide to discover
the LUN:
Posted by Nick Triantos at 10:19 AM
Labels: Multipathing
Date: Tue, 23 May 2006 14:57:52 +0200
Message-ID: <CA1361E9F77A4243A99E04D98F5CC724023E8E9C@ZARDPEXCH001.medscheme.com>
From: "Stephen Hughes" <stephenh@Medscheme.co.za>
Subject: RE: [suse-sles-e] Adding Disk on the fly
Hi Group,
Thanks for all your help with this matter. I managed to use the
rescan-scsi-bus.sh command on one of my servers to add a SAN attached
disk, but now I've assigned more disk to another server I have running
SLES9. I run the rescan-scsi-bus.sh command with the various switches
but it still does not pick up my new disk. Below is the output from my
lsscsi command as well as the command I ran to try and pick up the disk.
The new disk according to my Navisphere client should come in after
/dev/sdbp.
I also looked at some of the other replies but I don't have the rescan
file to echo a "1" to as suggested in one of the replies:
"# echo 1 > /sys/bus/scsi/devices/0:0:0:0/rescan"
mamba:/usr/local/bin # lsscsi
[0:0:6:0] process DELL 1x4 U2W SCSI BP 1.27 -
[0:2:0:0] disk MegaRAID LD0 RAID0 69356R 161J /dev/sda
[1:0:0:0] disk EMC SYMMETRIX 5671 /dev/sdb
[1:0:0:1] disk EMC SYMMETRIX 5671 /dev/sdc
[1:0:0:2] disk EMC SYMMETRIX 5671 /dev/sdd
[1:0:0:3] disk EMC SYMMETRIX 5671 /dev/sde
[1:0:0:4] disk EMC SYMMETRIX 5671 /dev/sdf
[1:0:0:5] disk EMC SYMMETRIX 5671 /dev/sdg
[1:0:0:6] disk EMC SYMMETRIX 5671 /dev/sdh
[1:0:0:7] disk EMC SYMMETRIX 5671 /dev/sdi
[1:0:0:8] disk EMC SYMMETRIX 5671 /dev/sdj
[1:0:0:9] disk EMC SYMMETRIX 5671 /dev/sdk
[1:0:0:10] disk EMC SYMMETRIX 5671 /dev/sdl
[1:0:0:11] disk EMC SYMMETRIX 5671 /dev/sdm
[1:0:0:12] disk EMC SYMMETRIX 5671 /dev/sdn
[1:0:0:13] disk EMC SYMMETRIX 5671 /dev/sdo
[1:0:0:14] disk EMC SYMMETRIX 5671 /dev/sdp
[1:0:0:15] disk EMC SYMMETRIX 5671 /dev/sdq
[1:0:0:16] disk EMC SYMMETRIX 5671 /dev/sdr
[1:0:0:17] disk EMC SYMMETRIX 5671 /dev/sds
[1:0:0:18] disk EMC SYMMETRIX 5671 /dev/sdt
[1:0:0:19] disk EMC SYMMETRIX 5671 /dev/sdu
[2:0:0:0] disk EMC SYMMETRIX 5671 /dev/sdv
[2:0:0:1] disk EMC SYMMETRIX 5671 /dev/sdw
[2:0:0:2] disk EMC SYMMETRIX 5671 /dev/sdx
[2:0:0:3] disk EMC SYMMETRIX 5671 /dev/sdy
[2:0:0:4] disk EMC SYMMETRIX 5671 /dev/sdz
[2:0:0:5] disk EMC SYMMETRIX 5671 /dev/sdaa
[2:0:0:6] disk EMC SYMMETRIX 5671 /dev/sdab
[2:0:0:7] disk EMC SYMMETRIX 5671 /dev/sdac
[2:0:0:8] disk EMC SYMMETRIX 5671 /dev/sdad
[2:0:0:9] disk EMC SYMMETRIX 5671 /dev/sdae
[2:0:0:10] disk EMC SYMMETRIX 5671 /dev/sdaf
[2:0:0:11] disk EMC SYMMETRIX 5671 /dev/sdag
[2:0:0:12] disk EMC SYMMETRIX 5671 /dev/sdah
[2:0:0:13] disk EMC SYMMETRIX 5671 /dev/sdai
[2:0:0:14] disk EMC SYMMETRIX 5671 /dev/sdaj
[2:0:0:15] disk EMC SYMMETRIX 5671 /dev/sdak
[2:0:0:16] disk EMC SYMMETRIX 5671 /dev/sdal
[2:0:0:17] disk EMC SYMMETRIX 5671 /dev/sdam
[2:0:0:18] disk EMC SYMMETRIX 5671 /dev/sdan
[3:0:0:0] disk DGC LUNZ 0219 /dev/sdao
[3:0:0:28] disk DGC RAID 5 0219 /dev/sdap
[3:0:1:0] tape STK T9940B 1.34 /dev/st0
[3:0:2:0] tape STK T9940B 1.34 /dev/st1
[3:0:3:0] tape SEAGATE ULTRIUM06242-XXX 1613 /dev/st2
[3:0:3:1] tape SEAGATE ULTRIUM06242-XXX 1613 /dev/st3
[3:0:4:0] tape SEAGATE ULTRIUM06242-XXX 1613 /dev/st4
[3:0:4:1] tape SEAGATE ULTRIUM06242-XXX 1536 /dev/st5
[3:0:5:0] disk DGC RAID 5 0219 /dev/sdaq
[3:0:5:1] disk DGC RAID 5 0219 /dev/sdar
[3:0:5:2] disk DGC RAID 5 0219 /dev/sdas
[3:0:5:3] disk DGC RAID 5 0219 /dev/sdat
[3:0:5:4] disk DGC RAID 5 0219 /dev/sdau
[3:0:5:5] disk DGC RAID 5 0219 /dev/sdav
[3:0:5:6] disk DGC RAID 5 0219 /dev/sdaw
[3:0:5:7] disk DGC RAID 5 0219 /dev/sdax
[3:0:5:8] disk DGC RAID 5 0219 /dev/sday
[3:0:5:9] disk DGC RAID 5 0219 /dev/sdaz
[3:0:5:10] disk DGC RAID 5 0219 /dev/sdba
[3:0:5:11] disk DGC RAID 5 0219 /dev/sdbb
[3:0:5:12] disk DGC RAID 5 0219 /dev/sdbc
[3:0:5:13] disk DGC RAID 5 0219 /dev/sdbd
[3:0:5:14] disk DGC RAID 5 0219 /dev/sdbe
[3:0:5:15] disk DGC RAID 5 0219 /dev/sdbf
[3:0:5:16] disk DGC RAID 5 0219 /dev/sdbg
[3:0:5:17] disk DGC RAID 5 0219 /dev/sdbh
[3:0:5:18] disk DGC RAID 5 0219 /dev/sdbi
[3:0:5:19] disk DGC RAID 5 0219 /dev/sdbj
[3:0:5:20] disk DGC RAID 5 0219 /dev/sdbk
[3:0:5:21] disk DGC RAID 5 0219 /dev/sdbl
[3:0:5:22] disk DGC RAID 5 0219 /dev/sdbm
[3:0:5:23] disk DGC RAID 5 0219 /dev/sdbn
[3:0:5:24] disk DGC RAID 5 0219 /dev/sdbo
[3:0:5:25] disk DGC RAID 5 0219 /dev/sdbp
Command I executed: (26,27,28,29 because I'm adding 4 LUNS)
mamba:/usr/local/bin # rescan-scsi-bus.sh --hosts=3 --ids=5 --luns=26,27,28,29
Thanks
Stephen
-----Original Message-----
From: Matt Gillard [mailto:Matt.Gillard@colesmyer.com.au]
Sent: 04 May 2006 08:06 AM
To: Stephen Hughes; Denis Brown; suse-sles-e@suse.com
Subject: RE: [suse-sles-e] Adding Disk on the fly
/bin/rescan-scsi-bus.sh is what you are after.
Customers are always looking for ways to bring the cost of their Linux deployments down and make management easier for their staff. One of the best options they have, at least in my opinion, is to get rid of third-party multipath I/O solutions for SAN and disk management.
I was at one of my customers the other day helping them set up the MPIO that is built into SLES 10. While I was there I took a few notes on what we did to get things working in their environment. These same instructions should work with other SANs that can handle multipath I/O.
SLES 10 supports a lot of SANs right out of the box and detects them automatically, so you don't really need an /etc/multipath.conf. My customer likes to be able to change the blacklist for the various types of hardware they use, and wanted user-friendly names. To do this I created a multipath.conf for them that looked like the following…
## /etc/multipath.conf file for SLES 10
## You may find a full copy of this file, with comments, here..
## /usr/share/doc/packages/multipath-tools/multipath.conf
# Setup user friendly names
# name : user_friendly_names
# scope : multipath
# desc : If set to "yes", using the bindings file
# /var/lib/multipath/bindings to assign a persistent and
# unique alias to the multipath, in the form of mpath<n>.
# If set to "no" use the WWID as the alias. In either case
# this will be overridden by any specific aliases in this
# file.
# values : yes|no
# default : no
defaults {
    user_friendly_names yes
}
# Setup the blacklisted devices….
# name : blacklist
# scope : multipath & multipathd
# desc : list of device names that are not multipath candidates
# default : cciss, fd, hd, md, dm, sr, scd, st, ram, raw, loop
#
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z][[0-9]*]"
    devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
If you're curious about which platforms SLES 10 supports out of the box, a list is in the SLES documentation.
Assuming you already have your LUNs assigned, and once you have the /etc/multipath.conf file in place, there are a few services that need to be started to make all this work.
# service boot.multipath start
# service multipathd start
That should start the daemons and load the kernel modules that you need. To check that, do an lsmod to see if you see dm_multipath and multipath. Once that is done you can check your setup to see if it is correct…
# multipath -v2 -d
create: mpath10 (360080480000290100601544032363831) EMC,SYMMETRIX
[size=200G][features=0][hwhandler=0]
\_ round-robin 0 [prio=4][undef]
\_ 11:0:0:39 sdbr 68:80 [undef][ready]
\_ 11:0:1:39 sdcc 69:0 [undef][ready]
\_ 10:0:0:39 sdl 8:176 [undef][ready]
\_ 10:0:1:39 sdw 65:96 [undef][ready]
create: mpath11 (360080480000290100601544032363832) EMC,SYMMETRIX
[size=400G][features=0][hwhandler=0]
\_ round-robin 0 [prio=4][undef]
\_ 11:0:0:40 sdbs 68:96 [undef][ready]
\_ 11:0:1:40 sdcd 69:16 [undef][ready]
\_ 10:0:0:40 sdm 8:192 [undef][ready]
\_ 10:0:1:40 sdx 65:112 [undef][ready]
That is what it looks like for the EMC Symmetrix I was working with so your mileage may vary.
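With dozens of LUNs the dry-run output gets long, and what you usually want to know is whether every map has the number of paths you expect. Here is a rough sketch that condenses it. It assumes the "create:" layout shown above, which differs between multipath-tools versions, and the function name is made up.

```shell
#!/bin/sh
# Summarise "multipath -v2 -d" output as "map_name path_count".
# A map starts at a "create:" line; a path is any following line that
# contains an H:B:T:L address followed by an sd device name.
maps_and_paths() {
    awk '
        $1 == "create:" { name = $2; order[++n] = name; paths[name] = 0; next }
        name != "" && /[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd[a-z]+/ { paths[name]++ }
        END { for (i = 1; i <= n; i++) print order[i], paths[order[i]] }
    '
}

# First map from the output above.
maps_and_paths <<'EOF'
create: mpath10 (360080480000290100601544032363831) EMC,SYMMETRIX
[size=200G][features=0][hwhandler=0]
\_ round-robin 0 [prio=4][undef]
\_ 11:0:0:39 sdbr 68:80 [undef][ready]
\_ 11:0:1:39 sdcc 69:0 [undef][ready]
\_ 10:0:0:39 sdl 8:176 [undef][ready]
\_ 10:0:1:39 sdw 65:96 [undef][ready]
EOF
```

On a live system the input would come from `multipath -v2 -d | maps_and_paths`; a map that reports fewer paths than the others is the one to look at first.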
Once you have the devices showing up correctly you need to make sure the multi path modules load on reboot. To do that run the following commands…
# chkconfig multipathd on
# chkconfig boot.multipath on
The next thing is to configure LVM to scan these devices so you can use them in your volume groups. To do this you will need to edit /etc/lvm/lvm.conf in the following places…
filter = [ "a|/dev/disk/by-id/.*|", "r|.*|" ]
types = [ "device-mapper", 253 ]
The above limits the devices that LVM will scan to only the devices that show up by-id. If you're using LVM to manage other disks that are not in that directory (think local SCSI drives), you will need to make sure those are still available by adjusting your filter more like this…
filter = [ "a|/dev/disk/by-id/.*|", "a|/dev/sda1$|", "r|.*|" ]
Once that is done, do a lvmdiskscan to get LVM to see the new drives.
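It helps to remember how LVM reads a filter list: the patterns are tried in order, the first one that matches a device path decides whether it is accepted (a) or rejected (r), and a path that matches nothing is accepted. The sketch below imitates that rule so you can try a filter against a few device names before editing lvm.conf. It is a simplification: the function name is made up, it assumes every pattern is delimited with |, and it uses grep -E rather than LVM's own regex engine.

```shell
#!/bin/sh
# Emulate first-match evaluation of an LVM filter list for one device.
# Usage: lvm_filter_verdict DEVICE 'a|regex|' 'r|regex|' ...
# Prints "accept" or "reject".
lvm_filter_verdict() {
    dev=$1
    shift
    for rule in "$@"; do
        action=${rule%%|*}      # leading "a" or "r"
        regex=${rule#?|}        # drop the action and the opening delimiter
        regex=${regex%|}        # drop the closing delimiter
        if printf '%s\n' "$dev" | grep -Eq -- "$regex"; then
            if [ "$action" = "a" ]; then echo accept; else echo reject; fi
            return 0
        fi
    done
    echo accept                 # nothing matched
}

# A raw path is rejected by the catch-all rule at the end...
lvm_filter_verdict /dev/sdb 'a|/dev/disk/by-id/.*|' 'r|.*|'
# ...while the local partition is let through by the extra accept rule.
lvm_filter_verdict /dev/sda1 'a|/dev/disk/by-id/.*|' 'a|/dev/sda1$|' 'r|.*|'
```

The order matters: if the catch-all reject rule came first, no later accept rule would ever be reached.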
Another thing customers often ask for is how to have SLES scan for new LUNs on the SAN without rebooting. With SLES 10 it's as simple as passing a few parameters to the sys file system.
# echo 1 > /sys/class/fc_host/host<number>/issue_lip
That will make the kernel aware of the new devices at a very low level, but the devices are not yet usable. To make them usable do the following…
# echo "- - -" > /sys/class/scsi_host/host<number>/scan
That will scan all devices and add the new ones for you. All of this information is in the SLES 10 Storage Administration Guide, including various ways to recover from issues.
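If you do not know which host number the new LUNs hang off, the scan can simply be sent to every SCSI host. This is a minimal sketch of that loop. The function name is made up, and the optional argument for the sysfs root is there only so the loop can be tried against a scratch directory; on a real system leave it out and run it as root.

```shell
#!/bin/sh
# Ask every SCSI host to rescan all channels, targets and LUNs by
# writing "- - -" to its sysfs scan file.
# Usage: rescan_all_scsi_hosts [SYSFS_ROOT]   (default: /sys)
rescan_all_scsi_hosts() {
    sysfs_root=${1:-/sys}
    for scan in "$sysfs_root"/class/scsi_host/host*/scan; do
        [ -e "$scan" ] || continue      # the glob matched nothing
        echo "- - -" > "$scan"
        echo "rescanned ${scan%/scan}"
    done
}
```

Calling `rescan_all_scsi_hosts` with no argument walks /sys/class/scsi_host; afterwards `multipath -v2 -d` should list the new paths.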
Also since SP1 SLES10 has been able to boot a mpio device from the SAN. The doc for doing that in SP1 is located here.
Have fun and enjoy..
Last 3 posts by Daniel
- Fun with your SAN and Multi-path - May 6th, 2008
2 Responses to “ Fun with your SAN and Multi-path ”
Comments:
How to setup / use multipathing on SLES
Information
Preamble: The procedure described within this article is only supported on SLES9 SP2 level and higher. Earlier releases may not work as expected.
1. Introduction
The Multipath IO (MPIO) support in SLES9 (SP2) is based on the Device Mapper (DM) multipath module of the Linux kernel, and the multipath-tools user-space package. These have been enhanced and integrated into SLES9 SP2 by SUSE Development.
DM MPIO is the preferred form of MPIO on SLES9 and the only option completely supported by Novell/SUSE.
DM MPIO features automatic configuration of the MPIO subsystem for a large variety of setups. Active/passive or active/active (with round-robin load balancing) configurations of up to 8 paths to each device are supported.
The framework is extensible both via specific hardware handlers (see below) or via more sophisticated load balancing algorithms than round-robin.
The user-space component takes care of automatic path discovery and grouping, as well as automated path retesting, so that a previously failed path is automatically reinstated when it becomes healthy again. This minimizes, if not obviates, the need for administrator attention in a production environment.
2. Supported configurations
- Supported hardware: Architectures
MPIO is supported on all seven architectures: IA32, AMD64/EM64T, IPF/IA64, p-Series (32-bit/64-bit), z-Series (31-bit and 64-bit).
- Supported hardware: Storage subsystems
The multipath-tools package is currently aware of the following storage subsystems:
- 3Pardata VV
- Compaq HSV110 / MSA1000
- DDN SAN MultiDirector
- DEC HSG80
- EMC CLARiiON CX
- FSC CentricStor
- HP HSV110 / A6189A / Open-
- Hitachi DF400 / DF500 / DF600
- IBM 3542 / ProFibre 4000R
- NETAPP
- SGI TP9100 / TP9300 / TP9400 / TP9500
- STK OPENstorage DS280
- SUN StorEdge 3510 / T4
In general, most other storage subsystems should work; however, the ones above will be detected automatically. Others might require an appropriate entry in the /etc/multipath.conf devices section.
Storage arrays which require special commands on fail-over from one path to the other, or require special non-standard error handling, might require more extensive support; however, the DM framework has hooks for hardware handlers, and one such handler for the EMC CLARiiON CX family of arrays is already provided.
- Hardware support: Host bus adapters
- Qlogic
- Emulex
- LSI
In general, all Fibre Channel / SCSI cards should work, as our MPIO implementation is above the device layer.
- Supported software configurations summary
Currently, DM MPIO is not available for either the root or the boot partition, as the boot loader does not know how to handle MPIO.
All auxiliary data partitions such as /home or application data can be placed on an MPIO device.
LVM2 is supported on top of DM MPIO. See the setup notes.
Partitions are supported in combination with DM MPIO, but have limitations. See the setup notes.
Software RAID on top of DM MPIO is also supported; however, note that auto-discovery is not available and that you will need to set up /etc/raidtab (if using raidtools) or /etc/mdadm.conf (if using mdadm) correctly.
3. Installation notes
- Software installation
Upgrade a system to SLES9 SP2 level (or more recent) and install the multipath-tools package.
- Changing system configuration
Using an editor of your choice, within /etc/sysconfig/hotplug set this value:
HOTPLUG_USE_SUBFS=no
In addition to the above change, configure the system to load the device drivers for the controllers the MPIO devices are connected to from within the INITRD. The boot scripts will only detect MPIO devices if the modules for the respective controllers are loaded at boot time. To achieve this, add the needed driver module to the variable INITRD_MODULES in the file /etc/sysconfig/kernel.
Example:
Your system contains a RAID controller that is accessed by the cciss driver and you are using ReiserFS as a filesystem. The MPIO devices will be connected to a Qlogic controller accessed by the driver qla2xxx, which is not yet configured to be used on this system. The mentioned entry within /etc/sysconfig/kernel will then probably look like this:
INITRD_MODULES="cciss reiserfs"
Using an editor, you would now change this entry to:
INITRD_MODULES="cciss reiserfs qla2xxx"
When you have applied this change, you will need to recreate the INITRD on your system to reflect it. Simply run this command:
mkinitrd
If you are using GRUB as the boot manager, you do not need to make any further changes. Upon the next reboot the needed driver will be loaded within the INITRD. If you are using LILO as the boot manager, remember to run it once to update the boot record.
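The INITRD_MODULES edit described above can also be scripted, which is handy when the same change has to be made on many servers. The following is a sketch under a few assumptions: the function name is made up, the variable is expected on a single line in the form INITRD_MODULES="...", and the file argument exists only so the function can be tried on a copy first.

```shell
#!/bin/sh
# Append a driver module to INITRD_MODULES unless it is already listed.
# Usage: add_initrd_module MODULE [FILE]   (default: /etc/sysconfig/kernel)
add_initrd_module() {
    module=$1
    file=${2:-/etc/sysconfig/kernel}
    # Already present as a whole word inside the quotes? Nothing to do.
    if grep -Eq "^INITRD_MODULES=\"([^\"]* )?${module}( [^\"]*)?\"" "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Rewrite the line with the module appended, then copy the result
    # back so that the original file keeps its permissions.
    sed "s/^INITRD_MODULES=\"\(.*\)\"/INITRD_MODULES=\"\1 ${module}\"/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For the example above the call would be `add_initrd_module qla2xxx`. It does not rebuild anything by itself, so mkinitrd still has to be run afterwards.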
- Configuring multipath-tools
If your system is one of those listed above, no further configuration should be required.
You might otherwise have to create /etc/multipath.conf (see the examples under /usr/share/doc/packages/multipath-tools/) and add an appropriate devices entry for your storage subsystem.
One particularly interesting option in the /etc/multipath-tools.conf file is the "polling_interval" which defines the frequency of the path checking that can be configured.
Alternatively, you might choose to blacklist certain devices which you do not want multipath-tools to scan.
You can then run:
multipath -v2 -d
to perform a 'dry-run' with this configuration. This will only scan the devices and print what the setup would look like.
The output will look similar to:
3600601607cf30e00184589a37a31d911
[size=127 GB][features="0"][hwhandler="1 emc"]
\_ round-robin 0 [first]
  \_ 1:0:1:2 sdav 66:240 [ready ]
  \_ 0:0:1:2 sdr 65:16 [ready ]
\_ round-robin 0
  \_ 1:0:0:2 sdag 66:0 [ready ]
  \_ 0:0:0:2 sdc 8:32 [ready ]
showing you the name of the MPIO device, its size, the features and hardware handlers involved, as well as the (in this case, two) priority groups (PG). For each PG, it shows whether it is the first (highest priority) one, the scheduling policy used to balance IO within the group, and the paths contained within the PG. For each path, its physical address (host:bus:target:lun), device node name and major:minor number are shown, and of course whether the path is currently active or not.
Paths are grouped into priority groups; there's always just one priority group in active use. To model an active/active configuration, all paths end up in the same group; to model active/passive, the paths which should not be active in parallel will be placed in several distinct priority groups. This normally happens completely automatically on device discovery.
- Enabling the MPIO components
Now run
/etc/init.d/boot.multipath start
/etc/init.d/multipathd start
as user root. The multipath devices should now show up automatically under /dev/disk/by-name/; the default naming will be the WWN of the Logical Unit, which you can override via /etc/multipath.conf to suit your tastes.
Run
insserv boot.multipath multipathd
to integrate the multipath setup into the boot sequence.
From now on all access to the devices should go through the MPIO layer.
- Querying MPIO status
To query the current MPIO status, run
multipath -l
This will output the current status of the multipath maps in a format similar to the command already explained above:
3600601607cf30e00184589a37a31d911
[size=127 GB][features="0"][hwhandler="1 emc"]
\_ round-robin 0 [active][first]
  \_ 1:0:1:2 sdav 66:240 [ready ][active]
  \_ 0:0:1:2 sdr 65:16 [ready ][active]
\_ round-robin 0 [enabled]
  \_ 1:0:0:2 sdag 66:0 [ready ][active]
  \_ 0:0:0:2 sdc 8:32 [ready ][active]
However, it includes additional information about which priority group is active, disabled or enabled, as well as for each path whether it is currently active or not.
- Tuning the fail-over with specific HBAs
HBA timeouts are typically setup for non-MPIO environments, where longer timeouts make sense - as the only alternative would be to error out the IO and propagate the error to the application. However, with MPIO, some faults (like cable failures) should be propagated upwards as fast as possible so that the MPIO layer can quickly take action and redirect the IO to another, healthy path.
For the QLogic 2xxx family of HBAs, the following setting in /etc/modprobe.conf.local is thus recommended:
options qla2xxx qlport_down_retry=1 ql2xfailover=0 ql2xretrycount=5
- Managing IO in error situations
In certain scenarios, where the driver, the HBA or the fabric experiences spurious errors, it is advisable to configure DM MPIO to queue all IO in case of errors leading to loss of all paths, and never propagate errors upwards.
This can be achieved by setting
defaults { default_features "1 queue_if_no_path" }
in /etc/multipath.conf.
As this will lead to IO being queued forever unless a path is reinstated, make sure that multipathd is running and works for your scenario. Otherwise, IO might be stalled forever on the affected MPIO device, until reboot or until you manually issue
dmsetup message 3600601607cf30e00184589a37a31d911 0 fail_if_no_path
(substituting the correct map name), which will immediately cause all queued IO to fail. You can reactivate the queue_if_no_path feature by issuing
dmsetup message 3600601607cf30e00184589a37a31d911 0 queue_if_no_path
You can also use these two commands to switch between both modes for testing, before committing the setting to your /etc/multipath.conf.
4. Using the MPIO devices
- Using the whole MPIO devices directly
If you want to use the whole LUs directly (if for example you're using the SAN features to partition your storage), you can simply use the /dev/disk/by-name/xxx names directly for mkfs, /etc/fstab, your application, etc.
- Using LVM2 on top of the MPIO devices
To make LVM2 recognize the MPIO devices as possible Physical Volumes (PVs), you will have to modify /etc/lvm/lvm.conf. You will also want to modify it so that it does not scan and use the physical paths, but only accesses your MPIO storage via the MPIO layer.
Thus, change the "filter" entry in lvm.conf as follows and add the types extension to make LVM2 recognize them:
filter = [ "a|/dev/disk/by-name/.*|", "r|.*|" ]
types = [ "device-mapper", 1 ]
This will allow LVM2 to only scan the by-name paths and reject everything else. (If you are also using LVM2 on non-MPIO devices, you will of course need to make the necessary adjustments to suit your setup.)
You can then use pvcreate and the other LVM2 commands as usual on the /dev/disk/by-name/ path.
- Partitions on top of MPIO devices
It is not currently possible to partition the MPIO devices themselves. However, if the underlying physical device is partitioned, the MPIO device will reflect those partitions and the MPIO layer will provide /dev/disk/by-name/<name>p1 ... pN devices so you can access the
Novell Doc SLES 10 Storage Administration Guide - SLES 10 Storage Administration Guide
This guide provides information about how to manage storage devices on a SUSE® Linux Enterprise Server 10 Support Pack 2 server with an emphasis on using the Enterprise Volume Management System (EVMS) 2.5.5 or later to manage devices. Related storage administration issues are also covered as noted below.
Multipath I-O - Wikipedia, the free encyclopedia
Linux Multipath Howto (RHEL 4) - SWiK
Storage- Linux Native Multipathing (Device Mapper-Multipath)
The Linux multipath implementation
- up2date device-mapper-multipath
- Edit /etc/multipath.conf
For detailed information, see: "SAN Persistent Binding and Multipathing in the 2.6 Kernel"
- modprobe dm-multipath
- modprobe dm-round-robin
- service multipathd start
- multipath -v2
Will show multipath luns and groups. Look for the multipath group number, this is the dm-# listed in /proc/partitions. The multipath lun is accessed via the /dev/dm-# device entry.
- Format each SCSI DEVICE:
- sfdisk /dev/sdX
- (Optional) Create multipath devices for each partition:
(not needed if using LVM, since you will just mount the logical volume device)
- kpartx -a /dev/dm-#
- Enable multipath to start on bootup:
- chkconfig multipathd on
Configuring Linux to enable Multipath I/O
Linux multipath IO (MPIO) using multipath-tools | Calivia
Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Site uses AdSense so you need to be aware of Google privacy policy. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Disclaimer:
Created May 16, 1996; Last modified: August 15, 2009
July 17th, 2008 at 11:52 am
I would just add, for IBM DS8000, DS6000, ESS, or SVC 4.2 disk systems, you need to use the multipath.conf file located in a table here: http://www-1.ibm.com/support/docview.wss?rs=540&context=ST52G7&dc=D430&uid=ssg1S4000107&loc=en_US&cs=utf-8&lang=en
July 24th, 2008 at 10:58 am
Daniel - Nice article, helped me with MPIO. I’m trying MPIO with EMC CX-300 / SLES 10 SP2 and Xen. Thanks -Bruce
Think you have one typo:
# service boot.multipth start
should be
# service boot.multipath start