We assume that the installation is performed on RHEL 6.5 or 6.6 and that NFS is used for sharing files with the master host.
The degree of sharing is not that important, but generally $SGE_ROOT/$SGE_CELL should be shared. The efficiency considerations cited by many are overblown: without careful measurements that identify the real bottleneck, you can fall into the classic trap of "premature optimization". As Donald Knuth put it, "Premature optimization is the root of all evil." Long before him, Talleyrand gave the following advice to young diplomats: "First and foremost, not too much zeal." Just substitute "novice SGE administrators" for "young diplomats".
The same applies to the choice between classic spooling and Berkeley DB. Without measurements, selecting Berkeley DB is fool's gold.
We will use Son of Grid Engine as an example. Installation of other flavors of Grid Engine is very similar, although some, such as Oracle Grid Engine and Univa, are distributed as tar files, not RPMs (see Oracle Grid Engine for some information about tar-based installation). RPMs cause a lot of pain with dependencies. The problem is that for Son of Grid Engine 8.1.8 the dependencies are not documented, which turns this task into a painful trial-and-error process.
Installation consists of two main parts:
Some people also install a shadow master on a different host, which serves as a backup head node. This might help with some maintenance tasks, but it does not buy you much in terms of reliability. It is simpler to keep a cold spare and move the drives and network cable over if the head node hardware fails. The chances of that are pretty slim with modern hardware: the head node is not a heavily loaded server operating at the limit of its temperature regime, even if it simultaneously serves as the NFS server. The head node can benefit from using SSDs. For simplicity we will not install a shadow master. Remember that here we are discussing small clusters.
The master host installation procedure creates the appropriate directory hierarchy that the master daemon requires and starts the Grid Engine master daemon sge_qmaster on the master host (aka head node). The master host is also registered as a host with administrative and submit privileges.
If you do not plan to run jobs on the master host (and for a sizable cluster you should not), you do not need to install an execution host on the head node. If you do install it, you can later remove it from the list of running daemons and shut it down.
If at any time during the installation you think something went wrong, you can quit the installation procedure and restart it. The whole installation of the master daemon takes less than 30 minutes.
The RPMs shipped with SGE are not real RPMs: additional installation using the SGE installation scripts is still required. They are more like a proxy for tar files. In addition to storing files, the RPMs serve as a tool to check for the presence of the necessary libraries and Perl modules. An attempt to install binaries compiled for RHEL will fail on SLES because of library problems. Welcome to RPM hell...
The content of an RPM can be viewed using the rpm -qlp file command.
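Since the dependencies are not documented, it can help to look at what a package declares before trying to install it. The sketch below wraps two standard rpm queries in a helper; the function name inspect_rpm is made up for this illustration and the package file name is whatever RPM you downloaded.

```shell
#!/bin/bash
# List the files and the declared dependencies of an RPM file
# without installing it.
# Usage: inspect_rpm PACKAGE.rpm   (inspect_rpm is a hypothetical helper name)
inspect_rpm() {
    local pkg="$1"
    if [ ! -f "$pkg" ]; then
        echo "inspect_rpm: no such file: $pkg" >&2
        return 1
    fi
    echo "== Files in $pkg =="
    rpm -qlp "$pkg"    # -q query, -l list files, -p operate on a package file
    echo "== Requirements of $pkg =="
    rpm -qpR "$pkg"    # -R lists the capabilities the package depends on
}
```

For example, inspect_rpm gridengine-8.1.8-1.el6.x86_64.rpm prints the file list followed by the libraries and Perl modules the package requires.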
You can relocate the installation directory from the default location /opt/sge during installation using the RPM option --prefix, for example:
rpm -iv --prefix=/opt/sge/sge gridengine-8.1.8-1.el6.x86_64.rpm
But /opt/sge is good enough for most cases.
After that, the "real" installation should be performed as described below.
For an example, see the instructions for installation of Son of Grid Engine from RPMs.
RPM installation creates the installation directory. You can also create it beforehand, probably as a separate partition. For example:
After that you need to create and export the SGE_ROOT environment variable.
./util/setfileperm.sh $SGE_ROOT
This step is important, as it creates several SUID-root executables.
Note: this utility requires that the gridengine-execd-8.1.8-1.el6.x86_64 RPM be installed on the master host too.
WARNING WARNING WARNING
-----------------------
We will set the the file ownership and permission to
UserID: 0
GroupID: 0
In directory: /opt/sge
We will also install the following binaries as SUID-root:
$SGE_ROOT/utilbin//rlogin
$SGE_ROOT/utilbin//rsh
$SGE_ROOT/utilbin//testsuidroot
$SGE_ROOT/bin//sgepasswd
$SGE_ROOT/bin//authuser
Do you want to set the file permissions (yes/no) [NO] >> yes
Verifying and setting file permissions and owner in >bin<
Verifying and setting file permissions and owner in >examples<
Verifying and setting file permissions and owner in >inst_sge<
Verifying and setting file permissions and owner in >install_execd<
Verifying and setting file permissions and owner in >install_qmaster<
Verifying and setting file permissions and owner in >lib<
Verifying and setting file permissions and owner in >mpi<
Verifying and setting file permissions and owner in >pvm<
Verifying and setting file permissions and owner in >util<
Verifying and setting file permissions and owner in >utilbin<
Verifying and setting file permissions and owner in >doc<
Verifying and setting file permissions and owner in >man<
Verifying and setting file permissions and owner in >qmon<
Verifying and setting file permissions and owner in >start_gui_installer<
Your file permissions were set
Full installation includes the following tasks:
The GUI installer does not work in Son of Grid Engine 8.1.8, so you need to use the command-line installer.
# echo $SGE_ROOT
Check the content of /etc/services. It must contain the two ports used by SGE. By default these are 6444 and 6445. You can change them to higher ports if necessary (typically people use the defaults, but your mileage may vary; for example, if you need two SGE instances, they should use different ports).
If the lines are missing (the RPMs add them), add them manually:
sge_qmaster 6444/tcp # Grid Engine Qmaster Service
sge_qmaster 6444/udp # Grid Engine Qmaster Service
sge_execd 6445/tcp # Grid Engine Execution Service
sge_execd 6445/udp # Grid Engine Execution Service
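The check for the four service entries can be scripted. The following is a minimal sketch that assumes the default ports 6444 and 6445; the function name missing_sge_services is invented for this example, and it only reports what is missing rather than editing the file.

```shell
#!/bin/bash
# Print every Grid Engine service entry that is absent from a services file.
# Usage: missing_sge_services [FILE]   (FILE defaults to /etc/services)
# missing_sge_services is a hypothetical helper name; ports are the defaults.
missing_sge_services() {
    local file="${1:-/etc/services}"
    local entry name port_proto
    for entry in "sge_qmaster 6444/tcp" "sge_qmaster 6444/udp" \
                 "sge_execd 6445/tcp" "sge_execd 6445/udp"; do
        name="${entry%% *}"        # service name, e.g. sge_qmaster
        port_proto="${entry##* }"  # port and protocol, e.g. 6444/tcp
        # The entry must start the line and be followed by whitespace,
        # a comment, or the end of the line.
        if ! grep -Eq "^${name}[[:space:]]+${port_proto}([[:space:]]|#|\$)" "$file"; then
            echo "$entry"
        fi
    done
}
```

Running missing_sge_services with no arguments prints nothing when /etc/services already contains all four lines.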
Set an environment variable and then install the qmaster as such:
export SGE_ROOT=/opt/sge
cd $SGE_ROOT
./install_qmaster
Now go through the interactive install process:
You will now be asked about port configuration for the master. Normally you would choose the default (2), which uses the /etc/services file.
Enter "classic" for classic spooling (berkeleydb may be more appropriate for large clusters).
Accept the default spool directory or specify a different folder (for example, if you wish to use a shared or local folder outside of SGE_ROOT).
Now that we are back to a shell (finally) we need to add a few things to our root .bashrc so that we can access the SGE binaries. Add the following lines to /root/.bashrc
# SGE settings
export SGE_ROOT=/opt/sge
export SGE_CELL=default
if [ -e "$SGE_ROOT/$SGE_CELL" ]
then
    . "$SGE_ROOT/$SGE_CELL/common/settings.sh"
fi
And then be sure to re-source your .bashrc.
Now we can add our own username as an admin so that we can manage the system without becoming root.
qconf -am <myusername>
NOTE: You can automate steps listed below by creating a small script:
#!/bin/bash
#
# Post-install operations for the SGE master host
#
. "$SGE_ROOT/default/common/settings.sh"
# Add sgemaster.$SGE_CLUSTER_NAME (or whatever your cluster name is) to the default services on run levels 3 and 5
chkconfig sgemaster.$SGE_CLUSTER_NAME on
# On the master host: start the sge_qmaster service
service sgemaster.$SGE_CLUSTER_NAME start
# Add the necessary commands to /etc/profile
echo ". $SGE_ROOT/default/common/settings.sh" >> /etc/profile
# chkconfig sgemaster.$SGE_CLUSTER_NAME on
sgemaster.p6444 0:off 1:off 2:on 3:on 4:on 5:on 6:off
# service sgemaster.$SGE_CLUSTER_NAME start
NOTE: The first start takes two to three minutes or more. It is really slow even on a very fast server.
- Run the following command:
ps -ef | grep sge
- You should see that the sge_qmaster daemon is running.
- If you do not see similar output, the daemon required on the master host is not running. Restart the daemon by hand. For example, on Linux you can use the service command:
/sbin/service sgemaster.$SGE_CLUSTER_NAME start
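Note that ps -ef | grep sge also matches the grep process itself, so it is a poor test inside a script. As a rough sketch, the check can compare exact process names instead; the helper names has_process and qmaster_running are made up for this illustration.

```shell
#!/bin/bash
# Succeeds if one of the process names read from standard input is
# exactly the given name. has_process is a hypothetical helper name.
has_process() {
    grep -qxF -- "$1"
}

# Succeeds if sge_qmaster appears in the process table.
# "ps -e -o comm=" prints one bare command name per line, without a header.
qmaster_running() {
    ps -e -o comm= | has_process sge_qmaster
}
```

A start-up script could then use it as: qmaster_running || /sbin/service sgemaster.$SGE_CLUSTER_NAME start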
Most SGE installations share the whole /opt/sge directory from the master host, with Apps/sge as a subdirectory. It should be mounted under the same name on the execution hosts. See Usage of NFS in Grid Engine.
/opt/sge 10.194.186.254(rw,no_root_squash) 10.194.181.26(rw,no_root_squash)
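The entry above belongs in /etc/exports on the NFS server. When the list of execution hosts grows, the line can be generated instead of typed. This is only a sketch: exports_line is a made-up helper name, and it hard-codes the same rw,no_root_squash options used above.

```shell
#!/bin/bash
# Print an /etc/exports line that shares one directory with the given
# hosts, using the options rw,no_root_squash for each of them.
# Usage: exports_line DIR HOST [HOST...]   (hypothetical helper name)
exports_line() {
    local dir="$1" host line
    shift
    line="$dir"
    for host in "$@"; do
        line="$line $host(rw,no_root_squash)"
    done
    printf '%s\n' "$line"
}
```

Calling exports_line /opt/sge 10.194.186.254 10.194.181.26 reproduces the line shown above.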
Check if you can mount it on one of the execution hosts:
# cat /etc/fstab | grep "/opt/sge"
m17:/opt/sge /opt/sge nfs rw,hard,intr,tcp,rsize=32768,wsize=32768 1 2
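A plain grep for "/opt/sge" also matches commented-out lines and longer paths such as /opt/sge2. A slightly stricter check compares the mount point and file system type fields. The sketch below assumes the standard fstab column order; has_nfs_entry is a hypothetical helper name.

```shell
#!/bin/bash
# Succeeds if FILE (in fstab format) contains a non-comment entry that
# mounts MOUNTPOINT with file system type nfs.
# Usage: has_nfs_entry MOUNTPOINT [FILE]   (FILE defaults to /etc/fstab)
# has_nfs_entry is a hypothetical helper name.
has_nfs_entry() {
    local mountpoint="$1" file="${2:-/etc/fstab}"
    # Field 2 is the mount point and field 3 the file system type.
    awk -v mp="$mountpoint" '
        $1 !~ /^#/ && $2 == mp && $3 == "nfs" { found = 1 }
        END { exit !found }
    ' "$file"
}
```

On an execution host, has_nfs_entry /opt/sge && mount /opt/sge mounts the share only if the entry is present.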
If the master host is simultaneously the NFS server, restart the NFS daemon on the qmaster host so that it rereads the exports file:
# service nfs restart
Shutting down NFS mountd: [ OK ]
Shutting down NFS daemon: [ OK ]
Shutting down NFS quotas: [ OK ]
Shutting down NFS services: [ OK ]
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]
Create a passwordless login environment.
Tip: If you have already configured it, just copy the authorized_keys file from an already configured execution host.
cd /root/.ssh
scp sge01:/root/.ssh/authorized_keys .
Check ssh access from the master host to the node on which you install the execution host (b5 in the example below):
root@m17: # ssh b5
The authenticity of host 'b5 (10.194.181.46)' can't be established.
RSA key fingerprint is 18:35:6e:96:11:77:27:fc:ac:1c:8e:46:36:2b:ae:2b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'b5,10.194.181.46' (RSA) to the list of known hosts.
Last login: Thu Jul 26 08:29:41 2012 from sge_master.firma.net
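Once the host key has been accepted interactively as shown above, the passwordless login can be verified from a script. This is a minimal sketch using the standard OpenSSH options BatchMode and ConnectTimeout; the function name and the SSH_CMD override variable are inventions of this example.

```shell
#!/bin/bash
# Succeeds only if the current user can log in to HOST without being
# prompted for a password.
# Usage: can_ssh_without_password HOST
# SSH_CMD is a hypothetical override for the ssh binary, useful for dry runs.
can_ssh_without_password() {
    local host="$1"
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # ConnectTimeout=5 bounds the wait for an unreachable node.
    "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true
}
```

For example: can_ssh_without_password b5 || echo "b5: key-based login is not working"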
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusively for research and educational purposes. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.
Copyright © 1996-2016 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.
Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: October 11, 2015