GPFS on Red Hat


Adapted from IBM General Parallel File System - Wikipedia and IBM Redbooks Implementing the IBM General Parallel File System (GPFS) in a Cross Platform Environment

Introduction

The IBM General Parallel File System (GPFS) is a high performance shared-disk file management solution that provides fast, reliable access to data from multiple nodes in a cluster environment. It was initially designed (1998) for AIX on RS/6000 systems.

GPFS allows applications on multiple nodes to share file data. It stripes data across multiple disks for higher performance. GPFS is based on a shared disk model which provides lower overhead access to disks not directly attached to the application nodes and uses a distributed locking protocol to provide full data coherence for access from any node.

It offers many of the standard POSIX file system interfaces, allowing most applications to execute without modification or recompilation. These capabilities are available while allowing high speed access to the same data from all nodes of the cluster and providing full data coherence for operations occurring on the various nodes. GPFS attempts to continue operation across various node and component failures, assuming that sufficient resources exist to continue.

GPFS can exploit IP over InfiniBand and Remote Direct Memory Access (RDMA) for file system traffic, which is especially useful in large clusters with high-bandwidth interconnects. As a cluster file system, GPFS provides a global namespace, shared file system access among GPFS clusters, simultaneous file access from multiple nodes, high recoverability and data availability through replication, the ability to make changes while a file system is mounted, and simplified administration even in large environments.
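As a rough sketch (assuming an InfiniBand HCA named mlx4_0; adjust verbsPorts to the adapter and port actually present on your nodes), RDMA is typically enabled cluster-wide with mmchconfig,

mmchconfig verbsRdma=enable

mmchconfig verbsPorts="mlx4_0/1"

Note. the GPFS daemons usually have to be restarted (mmshutdown/mmstartup) before these settings take effect; verify the current values with "mmlsconfig".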

The same file can be accessed concurrently from multiple nodes. GPFS is designed to provide high availability through advanced clustering technologies, dynamic file system management and data replication. GPFS can continue to provide data access even when the cluster experiences storage or node malfunctions. GPFS scalability and performance are designed to meet the needs of data intensive applications such as engineering design, digital media, data mining, relational databases, financial analytics, seismic data processing, scientific research and scalable file serving.

GPFS differs from other file systems in several respects, among them the striping of file data across many disks, distributed metadata and distributed locking with full data coherence, and scalability to very large clusters and file systems.

In addition to providing filesystem storage capabilities, GPFS provides tools for management and administration of the GPFS cluster and allows for shared access to file systems from remote GPFS clusters.
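For example, mounting a file system owned by another GPFS cluster involves exchanging the keys generated by mmauth and registering the remote cluster and file system. The commands below are only an illustrative sketch: the cluster names, key file paths, remote device name and mount point are assumptions, and the public key files produced by 'mmauth genkey' have to be copied between the clusters out of band first.

on the cluster that owns the file system,

mmauth genkey new

mmauth add clusterB.example.com -k /tmp/clusterB_id_rsa.pub

mmauth grant clusterB.example.com -f gpfs1

on the cluster that wants to access it,

mmauth genkey new

mmremotecluster add clusterA.example.com -n gpfs1,gpfs2 -k /tmp/clusterA_id_rsa.pub

mmremotefs add rgpfs1 -f gpfs1 -C clusterA.example.com -T /rgpfs

mmmount rgpfs1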

GPFS is proprietary software licensed by IBM; it is neither free nor open source. It is offered, among other packagings, as part of the IBM System Cluster 1350. The most recent release of GPFS is 3.5.

 History

GPFS began as the Tiger Shark file system, a research project at IBM's Almaden Research Center, as early as 1993. Tiger Shark was initially designed to support high throughput multimedia applications. This design turned out to be well suited to scientific computing.

Another ancestor of GPFS is IBM's Vesta filesystem, developed as a research project at IBM's Thomas J. Watson Research Center between 1992 and 1995. Vesta introduced the concept of file partitioning to accommodate the needs of parallel applications that run on high-performance multicomputers with parallel I/O subsystems. With partitioning, a file is not a sequence of bytes, but rather multiple disjoint sequences that may be accessed in parallel. The partitioning abstracts away the number and type of I/O nodes hosting the filesystem, and it allows a variety of logically partitioned views of files, regardless of the physical distribution of data within the I/O nodes. The disjoint sequences are arranged to correspond to individual processes of a parallel application, allowing for improved scalability.

Vesta was commercialized as the PIOFS filesystem around 1994 and was succeeded by GPFS around 1998.[8] The main difference between the older and newer filesystems was that GPFS replaced the specialized interface offered by Vesta/PIOFS with the standard Unix API: all the features to support high performance parallel I/O were hidden from users and implemented under the hood.[3][8] Today, GPFS is used by many of the top 500 supercomputers listed on the Top 500 Supercomputing Sites web site. Since its inception, GPFS has been successfully deployed for many commercial applications, including digital media, grid analytics and scalable file service.

In 2010, IBM released a version of GPFS that included a capability known as GPFS-SNC, where SNC stands for Shared Nothing Cluster. This allows GPFS to be used as a filesystem for locally attached disks on a cluster of network-connected servers, rather than requiring that disks be shared over a SAN with dedicated servers. GPFS-SNC is suitable for workloads with high data locality.

Architecture

GPFS provides high performance by allowing data to be accessed over multiple computers at once. Most existing file systems are designed for a single server environment, and adding more file servers does not improve performance. GPFS provides higher input/output performance by "striping" blocks of data from individual files over multiple disks, and reading and writing these blocks in parallel. Other features provided by GPFS include high availability, support for heterogeneous clusters, disaster recovery, security, DMAPI, HSM and ILM.

According to Schmuck and Haskin, a file that is written to the filesystem is broken up into blocks of a configured size, less than 1 megabyte each. These blocks are distributed across multiple filesystem nodes, so that a single file is fully distributed across the disk array. This results in high reading and writing speeds for a single file, as the combined bandwidth of the many physical drives is high. On its own, striping would make the filesystem vulnerable to disk failures: any one disk failing would be enough to lose data. To prevent data loss, the filesystem nodes have RAID controllers, so that multiple copies of each block are written to the physical disks of the individual nodes. It is also possible to opt out of RAID-replicated blocks, and instead store two copies of each block on different filesystem nodes.
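In terms of the administration commands shown later on this page, the block size and GPFS-level replication are chosen when the file system is created. The line below is only an illustrative sketch (the block size and replica counts are arbitrary, and replication additionally requires disks in at least two failure groups),

mmcrfs gpfs1 -F /etc/diskdef.txt -B 256K -m 2 -M 2 -r 2 -R 2 -A yes -T /gpfs

Note. '-B' sets the block size, '-m'/'-M' the default and maximum number of metadata replicas, '-r'/'-R' the same for data replicas.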

Other features of the filesystem include distributed metadata and directory indices, distributed locking, and the ability to perform maintenance while the file system is mounted.

It is interesting to compare this with Hadoop's HDFS filesystem, which is designed to store similar or greater quantities of data on commodity hardware — that is, datacenters without RAID disks and a Storage Area Network (SAN).

  1. HDFS also breaks files up into blocks, and stores them on different filesystem nodes.
  2. HDFS does not expect reliable disks, so instead stores copies of the blocks on different nodes. The failure of a node containing a single copy of a block is a minor issue, dealt with by re-replicating another copy of the set of valid blocks, to bring the replication count back up to the desired number. In contrast, while GPFS supports recovery from a lost node, it is a more serious event, one that may include a higher risk of data being (temporarily) lost.
  3. GPFS makes the location of the data transparent: applications are not expected to know or care where the data lies. In contrast, Google GFS and Hadoop HDFS both expose that location, so that MapReduce programs can be run near the data.
  4. GPFS supports full POSIX filesystem semantics. Neither HDFS nor GFS supports full POSIX compliance.
  5. GPFS distributes its directory indices and other metadata across the filesystem. Hadoop, in contrast, keeps this on the Primary and Secondary Namenodes, large servers which must store all index information in RAM.
  6. GPFS breaks files up into small blocks. Hadoop HDFS likes blocks of 64 MB or more, as this reduces the storage requirements of the Namenode. Small blocks or many small files fill up a filesystem's indices fast, limiting the filesystem's size.

Information Lifecycle Management (ILM) tools

Storage pools allow for the grouping of disks within a file system. Tiers of storage can be created by grouping disks based on performance, locality or reliability characteristics. For example, one pool could be high performance fibre channel disks and another more economical SATA storage.
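Disks are assigned to a pool when they are added to the file system. Following the disk descriptor style used in the RHEL5 walk-through below, a sketch might look like this (the extra disk '/dev/sdc', the failure group, the NSD name and the pool name 'sata' are all assumptions, and the exact descriptor format depends on the GPFS release),

vi /etc/satadisk.txt

like,

/dev/sdc:gpfs1,gpfs2::dataOnly:2:nsd_sata1:sata

apply,

mmcrnsd -F /etc/satadisk.txt

mmadddisk gpfs1 -F /etc/satadisk.txt

Note. only the 'system' pool may hold metadata, so disks placed in a user-defined pool such as 'sata' are marked dataOnly.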

A fileset is a sub-tree of the file system namespace and provides a way to partition the namespace into smaller, more manageable units. Filesets provide an administrative boundary that can be used to set quotas and be specified in a policy to control initial data placement or data migration. Data in a single fileset can reside in one or more storage pools. Where the file data resides and how it is migrated is based on a set of rules in a user defined policy.
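For example (a sketch only; the fileset name, junction path and use of quotas are assumptions), a fileset is created and linked into the namespace, and can then be given its own quota,

mmcrfileset gpfs1 projectA

mmlinkfileset gpfs1 projectA -J /gpfs/projectA

mmchfs gpfs1 -Q yes

mmedquota -j gpfs1:projectA

Note. '-Q yes' enables quota enforcement on the file system; 'mmedquota -j' opens an editor where block and inode limits for the fileset can be set.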

There are two types of user defined policies in GPFS: File placement and File management. File placement policies direct file data as files are created to the appropriate storage pool. File placement rules are determined by attributes such as file name, the user name or the fileset. File management policies allow the file's data to be moved or replicated or files deleted. File management policies can be used to move data from one pool to another without changing the file's location in the directory structure. File management policies are determined by file attributes such as last access time, path name or size of the file.

The GPFS policy processing engine is scalable and can run on many nodes at once. This allows management policies to be applied to a single file system containing billions of files and to complete in a few hours.
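As an illustration of the policy language (a sketch only, reusing the hypothetical 'system' and 'sata' pools from above; the rule names, file name pattern and thresholds are arbitrary), create a policy file,

vi /etc/policy.txt

like,

RULE 'logs_on_sata' SET POOL 'sata' WHERE UPPER(NAME) LIKE '%.LOG'

RULE 'default' SET POOL 'system'

RULE 'age_out' MIGRATE FROM POOL 'system' THRESHOLD(80,60) TO POOL 'sata' WHERE DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 30

install the placement rules,

mmchpolicy gpfs1 /etc/policy.txt

and run the management (migration) rules, for example from cron,

mmapplypolicy gpfs1 -P /etc/policy.txt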


Old News ;-)

IBM GPFS configuration on RHEL5

Introduction

We're installing a two-node GPFS cluster: gpfs1 and gpfs2. Both are RHEL5 systems that see the same shared disk as '/dev/sdb'. We're not using the GPFS NSD client/server feature; both nodes access the shared disk directly.

Installation

On each node, make sure you have these packages installed,

rpm -q \
libstdc++ \
compat-libstdc++-296 \
compat-libstdc++-33 \
libXp \
imake \
gcc-c++ \
kernel \
kernel-headers \
kernel-devel \
xorg-x11-xauth

On each node, make sure all node names resolve and that root can log in via SSH to every node, including itself,

cat /etc/hosts

ssh-keygen -t dsa -P ''

copy and paste the public key from each node,

cat .ssh/id_dsa.pub

into a single authorized_keys2 file, identical on all the nodes,

vi ~/.ssh/authorized_keys2

then check that each node can connect to every node, including itself,

ssh gpfs1

ssh gpfs2

On each node, extract and install IBM Java,

./gpfs_install-3.2.1-0_i386 --text-only

rpm -ivh /usr/lpp/mmfs/3.2/ibm-java2-i386-jre-5.0-4.0.i386.rpm

extract again and install the GPFS RPMs,

./gpfs_install-3.2.1-0_i386 --text-only

rpm -ivh /usr/lpp/mmfs/3.2/gpfs*.rpm


On each node, get the latest GPFS update (http://www14.software.ibm.com/webapp/set2/sas/f/gpfs/download/home.html) and install it,

mkdir /usr/lpp/mmfs/3.2.1-13

tar xvzf gpfs-3.2.1-13.i386.update.tar.gz -C /usr/lpp/mmfs/3.2.1-13

rpm -Uvh /usr/lpp/mmfs/3.2.1-13/*.rpm

On each node, prepare the portability layer build,

#mv /etc/redhat-release /etc/redhat-release.dist

#echo 'Red Hat Enterprise Linux Server release 5.3 (Tikanga)' > /etc/redhat-release

cd /usr/lpp/mmfs/src

export SHARKCLONEROOT=/usr/lpp/mmfs/src

rm config/site.mcr

make Autoconfig

check for these values in the configuration,

grep ^LINUX_DISTRIBUTION config/site.mcr

grep 'define LINUX_DISTRIBUTION_LEVEL' config/site.mcr

grep 'define LINUX_KERNEL_VERSION' config/site.mcr

Note. LINUX_KERNEL_VERSION should be "2061899" for kernel "2.6.18-128.1.10.el5".


On each node, build it,

make clean

make World

make InstallImages

On each node, edit the PATH,

vi ~/.bashrc

add this line,

PATH=$PATH:/usr/lpp/mmfs/bin

apply,

source ~/.bashrc

On any one node, create the cluster,

mmcrcluster -N gpfs1:quorum,gpfs2:quorum -p gpfs1 -s gpfs2 -r /usr/bin/ssh -R /usr/bin/scp

Note. gpfs1 as primary configuration server, gpfs2 as secondary


On any one node, start the cluster on all the nodes,

mmstartup -a

On any one node, create the NSD,

vi /etc/diskdef.txt

like,

/dev/sdb:gpfs1,gpfs2::::

apply,

mmcrnsd -F /etc/diskdef.txt

On any one node, create the filesystem,

mmcrfs gpfs1 -F /etc/diskdef.txt -A yes -T /gpfs

Note. '-A yes' for automount

Note. check for changes in '/etc/fstab'


On any one node, mount /gpfs on all the nodes,

mmmount /gpfs -a

On any one node, check that you can access the GUI,

/etc/init.d/gpfsgui start

Note. if you need to change the default ports, edit these files and change "80" and "443" to the ports you want,

#vi /usr/lpp/mmfs/gui/conf/config.properties

#vi /usr/lpp/mmfs/gui/conf/webcontainer.properties

wait a few seconds (Java is starting) and go to the node's GUI URL,

https://gpfs2/ibm/console/

On each node, you can now disable the GUI to save some RAM,

/etc/init.d/gpfsgui stop

chkconfig gpfsgui off

and make sure gpfs is enabled everywhere,

chkconfig --list | grep gpfs

Note. also make sure the shared disk shows up at boot.
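For example (a sketch; the device name is the one used above), after a reboot you can verify that the shared device is visible and that GPFS maps the NSD to it,

cat /proc/partitions | grep sdb

mmlsnsd -m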

Usage

For troubleshooting, watch the logs with,

tail -F /var/log/messages | grep 'mmfs:'

On any one node, to start the cluster and mount the file system on all the nodes,

mmstartup -a

mmmount /gpfs -a

Note. "mmshutdown" to stop the cluster.


Show cluster information,

mmlscluster

#mmlsconfig

#mmlsnode

#mmlsmgr


Show file systems and mounts,

#mmlsnsd

#mmlsdisk gpfs1

mmlsmount all

Show file system options,

mmlsfs gpfs1 -a

To disable automount,

mmchfs gpfs1 -A no

To re-enable automount,

mmchfs gpfs1 -A yes

References

Install and configure General Parallel File System (GPFS) on xSeries : http://www.ibm.com/developerworks/eserver/library/es-gpfs/

General Parallel File System (GPFS) : http://www.ibm.com/developerworks/wikis/display/hpccentral/General+Parallel+File+System+(GPFS)

Managing File Systems : http://www.ibm.com/developerworks/wikis/display/hpccentral/Managing+File+Systems

GPFS V3.1 Problem Determination Guide : http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/com.ibm.cluster.gpfs.doc/gpfs31/bl1pdg1117.html

GPFS : http://csngwinfo.in2p3.fr/mediawiki/index.php/GPFS

GPFS V3.2 and GPFS V3.1 FAQs : http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/com.ibm.cluster.gpfs.doc/gpfs_faqs/gpfsclustersfaq.html

Install and configure General Parallel File System (GPFS) on xSeries

Outdated document related to RHEL 4.0
Mar 21, 2006 | IBM

A file system describes the way information is stored on a hard disk; examples include ext2, ext3, ReiserFS, and JFS. The General Parallel File System (GPFS) is another type of file system, available for clustered environments, and is designed for high throughput and high fault tolerance.

This article discusses a simple case of GPFS implementation. To keep things easy, you'll use machines with two hard disks -- the first hard disk is used for a Linux® installation and the second is left "as is" (in raw format).



