Softpanorama

May the source be with you, but remember the KISS principle ;-)

Installation of the Son of Grid Engine 8.1.8 RPMs for Execution Host


Introduction

We assume that the installation is performed on RHEL 6.5 or 6.6 and that NFS is used for sharing files with the master host.

The degree of sharing is not that important, but generally $SGE_ROOT/$SGE_CELL should be shared. The efficiency considerations cited by many are overblown: without careful measurements that identify the real bottleneck you might fall into the classic trap of "premature optimization". As Donald Knuth put it, "premature optimization is the root of all evil", and long before him Talleyrand gave the following advice to young diplomats: "First and foremost, not too much zeal". Just substitute "novice SGE administrators" for "young diplomats".

The same applies to the choice between classic spooling and Berkeley DB. Without measurements, selecting Berkeley DB is fool's gold.

Installation of dependencies

There are three dependencies (hwloc, jemalloc and perl-XML-Simple) that need to be installed first, so that the rest of the installation can proceed with just the standard RHEL repositories.

========================================================================================================================
 Package                   Arch             Version                  Repository                                    Size
========================================================================================================================
Installing:
 gridengine                x86_64           8.1.8-1.el6              /gridengine-8.1.8-1.el6.x86_64                40 M
 perl-XML-Simple           noarch           2.18-6.el6               /perl-XML-Simple-2.18-6.el6.noarch           155 k
Installing for dependencies:
 hwloc                     x86_64           1.5-3.el6_5              rhel-x86_64-server-6                         1.4 M
 jemalloc                  x86_64           3.6.0-1.el6              epel                                         100 k

Transaction Summary
========================================================================================================================
Install       4 Package(s)
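Before starting, it can help to see which of these dependencies are already on the host. The following is a minimal sketch, not part of Grid Engine or yum: the function name missing_pkgs is hypothetical, and it only compares a list of installed package names against the names you pass in.

```shell
#!/bin/sh
# missing_pkgs (hypothetical helper): read installed package names, one per
# line, on stdin and print every required package (given as arguments) that
# does not appear in that list.
missing_pkgs() (
    installed=$(cat)
    for pkg in "$@"; do
        # -x matches the whole line, -F treats the name as a fixed string
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
)

# Typical use on a RHEL-like host (not run here):
#   rpm -qa --qf '%{NAME}\n' | missing_pkgs hwloc jemalloc perl-XML-Simple
```

An empty result means all the named packages are already installed.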

Installation of Grid Engine RPMs

The first thing to do is to create necessary directories and mount NFS share. If this is the first execution host in the cluster you need to decide how much of $SGE_ROOT tree you want to share.

For small installations the master host can also serve as the NFS server, but of course you can get better results with a dedicated server. Decide how much you need to share. The simplest SGE installations for small clusters share either the whole $SGE_ROOT tree or only $SGE_ROOT/$SGE_CELL.

For an RPM-based installation the second method is preferable because the binaries are already installed on the host: it makes no sense to put executables on NFS, so the minimum to share is the $SGE_ROOT/$SGE_CELL/common directory. Of course, in this case you need to install the executables on each execution host. But since installing the prerequisite RPMs is the larger part of the trouble (and you need to do it in any case), two more RPMs do not make much difference.

Installation of the RPMs should be carried out using YUM, as any additional software dependencies will then be resolved automatically from the configured repositories (in the transcript below, xterm is pulled in this way). You need the following RPMs to be put in some directory:

 # ll
total 17644
-rw-r--r--  1 root    root    14941856 Nov  5 10:30 gridengine-8.1.8-1.el6.x86_64.rpm
-rw-r--r--  1 root    root     1440192 Nov  5 10:30 gridengine-execd-8.1.8-1.el6.x86_64.rpm
-rw-r--r--  1 root    root     1490312 Sep 22 06:33 hwloc-1.5-3.el6_5.x86_64.rpm
-rw-r--r--  1 root    root      102624 Apr  1  2014 jemalloc-3.6.0-1.el6.x86_64.rpm
-rw-r--r--  1 root    root       74068 Nov  6 12:39 perl-XML-Simple-2.18-6.el6.noarch.rpm

If you put all the necessary RPMs into a single directory, you can install the execution host from that directory with the command

yum install *.rpm  
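Because the wildcard installs whatever happens to be in the directory, a quick check that all five RPM files from the listing are present can save a failed transaction. This is a sketch under the assumption that the files keep their upstream names; the helper name missing_rpm_files is hypothetical.

```shell
#!/bin/sh
# missing_rpm_files (hypothetical helper): print the package names from the
# listing above for which no matching .rpm file exists in the given
# directory (default: the current directory).
missing_rpm_files() (
    dir=${1:-.}
    for name in gridengine gridengine-execd hwloc jemalloc perl-XML-Simple; do
        # Require a digit after "<name>-" so that "gridengine" does not
        # match "gridengine-execd-...".
        set -- "$dir"/"$name"-[0-9]*.rpm
        [ -e "$1" ] || printf '%s\n' "$name"
    done
)

# Example (not run here): missing_rpm_files /root/sge-rpms
```

If the function prints nothing, the directory contains at least one file for each of the five packages.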

You will see the following messages:

[0]root@lustwzb2: # yum install *.rpm
Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
This system is receiving updates from RHN Classic or RHN Satellite.
rhel-x86_64-server-6                                                                             | 1.8 kB     00:00
Setting up Install Process
Examining gridengine-8.1.8-1.el6.x86_64.rpm: gridengine-8.1.8-1.el6.x86_64
Marking gridengine-8.1.8-1.el6.x86_64.rpm to be installed
Examining gridengine-execd-8.1.8-1.el6.x86_64.rpm: gridengine-execd-8.1.8-1.el6.x86_64
Marking gridengine-execd-8.1.8-1.el6.x86_64.rpm to be installed
Examining hwloc-1.5-3.el6_5.x86_64.rpm: hwloc-1.5-3.el6_5.x86_64
Marking hwloc-1.5-3.el6_5.x86_64.rpm to be installed
Examining jemalloc-3.6.0-1.el6.x86_64.rpm: jemalloc-3.6.0-1.el6.x86_64
Marking jemalloc-3.6.0-1.el6.x86_64.rpm to be installed
Examining perl-XML-Simple-2.18-6.el6.noarch.rpm: perl-XML-Simple-2.18-6.el6.noarch
Marking perl-XML-Simple-2.18-6.el6.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package gridengine.x86_64 0:8.1.8-1.el6 will be installed
---> Package gridengine-execd.x86_64 0:8.1.8-1.el6 will be installed
--> Processing Dependency: xterm for package: gridengine-execd-8.1.8-1.el6.x86_64
---> Package hwloc.x86_64 0:1.5-3.el6_5 will be installed
---> Package jemalloc.x86_64 0:3.6.0-1.el6 will be installed
---> Package perl-XML-Simple.noarch 0:2.18-6.el6 will be installed
--> Running transaction check
---> Package xterm.x86_64 0:253-1.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

========================================================================================================================
 Package                    Arch             Version               Repository                                      Size
========================================================================================================================
Installing:
 gridengine                 x86_64           8.1.8-1.el6           /gridengine-8.1.8-1.el6.x86_64                  40 M
 gridengine-execd           x86_64           8.1.8-1.el6           /gridengine-execd-8.1.8-1.el6.x86_64           3.8 M
 hwloc                      x86_64           1.5-3.el6_5           /hwloc-1.5-3.el6_5.x86_64                      1.9 M
 jemalloc                   x86_64           3.6.0-1.el6           /jemalloc-3.6.0-1.el6.x86_64                   315 k
 perl-XML-Simple            noarch           2.18-6.el6            /perl-XML-Simple-2.18-6.el6.noarch             155 k
Installing for dependencies:
 xterm                      x86_64           253-1.el6             rhel-x86_64-server-6                           357 k

Transaction Summary
========================================================================================================================
Install       6 Package(s)

Total size: 47 M
Total download size: 357 k
Installed size: 46 M
Is this ok [y/N]: y
Downloading Packages:
xterm-253-1.el6.x86_64.rpm                                                                       | 357 kB     00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : hwloc-1.5-3.el6_5.x86_64                                                                             1/6
  Installing : jemalloc-3.6.0-1.el6.x86_64                                                                          2/6
  Installing : perl-XML-Simple-2.18-6.el6.noarch                                                                    3/6
  Installing : gridengine-8.1.8-1.el6.x86_64                                                                        4/6
  Installing : xterm-253-1.el6.x86_64                                                                               5/6
  Installing : gridengine-execd-8.1.8-1.el6.x86_64                                                                  6/6
  Verifying  : gridengine-8.1.8-1.el6.x86_64                                                                        1/6
  Verifying  : gridengine-execd-8.1.8-1.el6.x86_64                                                                  2/6
  Verifying  : jemalloc-3.6.0-1.el6.x86_64                                                                          3/6
  Verifying  : xterm-253-1.el6.x86_64                                                                               4/6
  Verifying  : perl-XML-Simple-2.18-6.el6.noarch                                                                    5/6
  Verifying  : hwloc-1.5-3.el6_5.x86_64                                                                             6/6

Installed:
  gridengine.x86_64 0:8.1.8-1.el6        gridengine-execd.x86_64 0:8.1.8-1.el6        hwloc.x86_64 0:1.5-3.el6_5
  jemalloc.x86_64 0:3.6.0-1.el6          perl-XML-Simple.noarch 0:2.18-6.el6

Dependency Installed:
  xterm.x86_64 0:253-1.el6

Complete!

Correct a bug with sgeadmin account creation

Now you need to correct a bug with sgeadmin account creation. The RPM creates the account with a generic useradd command and does not synchronize its UID and GID with the master host. This is a nasty bug: if the number of accounts and groups on the execution host differs from that on the master host (and typically it does), the UID and GID will be different. As a result your execution daemon will not be able to communicate with the master host. The installer does not detect this error. See how b2 looks in the qhost output below:

z99: # qhost
HOSTNAME          ARCH         NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
----------------------------------------------------------------------------------------
global                  -               -    -    -    -     -       -       -       -       -
z99               lx-amd64       40    2   20   40     -   62.9G       -   15.6G       -
b1                lx-amd64       20    2   20   20  0.00   62.9G  945.9M     0.0     0.0
b2                -               -    -    -    -     -       -       -       -       -

For example, you can have on the execution host

sgeadmin:x:496:492:Grid Engine admin:/:/sbin/nologin

and on the master host

sgeadmin:x:495:490:Grid Engine admin:/:/sbin/nologin

To correct this situation, execute the following commands on the execution host:

groupmod -g 490 sgeadmin
usermod -u 495 sgeadmin
chown sgeadmin:sgeadmin /opt/sge/default
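
The required groupmod and usermod commands can be derived from the two passwd entries rather than worked out by hand. The sketch below only prints the commands and changes nothing. The function name sync_ids_cmds is hypothetical, and it assumes the group has the same name as the account (sgeadmin:sgeadmin, as in the chown command above).

```shell
#!/bin/sh
# sync_ids_cmds (hypothetical helper): given the passwd(5) entry from the
# master host ($1) and the one from the execution host ($2), print the
# groupmod/usermod commands that would align the execution host with the
# master. Assumes the group is named like the account.
sync_ids_cmds() (
    master_line=$1
    exec_line=$2
    # passwd fields: name:password:UID:GID:comment:home:shell
    name=$(printf '%s\n' "$master_line" | cut -d: -f1)
    m_uid=$(printf '%s\n' "$master_line" | cut -d: -f3)
    m_gid=$(printf '%s\n' "$master_line" | cut -d: -f4)
    e_uid=$(printf '%s\n' "$exec_line" | cut -d: -f3)
    e_gid=$(printf '%s\n' "$exec_line" | cut -d: -f4)
    [ "$m_gid" = "$e_gid" ] || printf 'groupmod -g %s %s\n' "$m_gid" "$name"
    [ "$m_uid" = "$e_uid" ] || printf 'usermod -u %s %s\n' "$m_uid" "$name"
)

# Example on the execution host (not run here; "qmaster" is the master's name):
#   sync_ids_cmds "$(ssh qmaster getent passwd sgeadmin)" "$(getent passwd sgeadmin)"
```

Review the printed commands before running them as root, and remember that files owned by the old UID or GID still need a chown afterwards.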

In other words, having the RPM create the sgeadmin account is not such a good idea. It would be better to do it in the installer, since the installer runs as root and can read any files on the shared part of the master tree, if there is one.
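
Since the installer does not report the problem, the qhost output is the place to look for it. The following sketch assumes that a host which has not reported to the master shows "-" in the ARCH column, as b2 does in the listing above; the function name unreachable_hosts is hypothetical.

```shell
#!/bin/sh
# unreachable_hosts (hypothetical helper): read `qhost` output on stdin and
# print the names of hosts whose ARCH column is "-". The first two lines
# (header and separator) and the "global" pseudo-host are skipped.
unreachable_hosts() {
    awk 'NR > 2 && $1 != "global" && $2 == "-" { print $1 }'
}

# Example on the master host (not run here): qhost | unreachable_hosts
```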

Mount NFS partition

Generally, for small installations (say, fewer than 32 nodes and 640 cores) that do not carry a heavy load of short jobs (several minutes long), there is not much difference whether you share $SGE_ROOT or $SGE_ROOT/default. In this case sharing less does not really improve efficiency on modern servers. See Usage of NFS in Grid Engine.

For example:

export SGE_ROOT=/opt/sge   # installation directory; must be identical to the one on the master host
export SGE_CELL=default    # must be the same as on the master host
SGE_MASTER=qmaster         # hostname of your SGE master host; should be resolvable, e.g. via /etc/hosts
SGE_NFS_SHARE=$SGE_ROOT/$SGE_CELL   # specify how much you share
mkdir -p "$SGE_NFS_SHARE"  # the mount point must exist before mounting
# NFS entries take 0 0 in the last two fields (no dump, no fsck at boot)
echo "$SGE_MASTER:$SGE_NFS_SHARE $SGE_NFS_SHARE  nfs  vers=3,rw,hard,intr,tcp,rsize=32768,wsize=32768 0 0" >> /etc/fstab
mount "$SGE_NFS_SHARE"
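
Appending to /etc/fstab with echo adds a duplicate line every time the commands are repeated. A guarded variant might look like the sketch below. The function name add_fstab_entry is hypothetical. The mount options are copied from the example above, and the last two fields are written as 0 0 on the assumption that an NFS mount should be neither dumped nor checked by fsck.

```shell
#!/bin/sh
# add_fstab_entry (hypothetical helper): append an NFS entry to an
# fstab-style file unless an entry for the same mount point already exists.
# Arguments: <fstab file> <NFS server> <shared directory>
add_fstab_entry() (
    fstab=$1
    server=$2
    share=$3
    # Field 2 of an fstab line is the mount point; comment lines are ignored.
    if awk -v mp="$share" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"
    then
        echo "entry for $share already present in $fstab"
    else
        printf '%s:%s %s nfs vers=3,rw,hard,intr,tcp,rsize=32768,wsize=32768 0 0\n' \
            "$server" "$share" "$share" >> "$fstab"
    fi
)

# Example (not run here): add_fstab_entry /etc/fstab qmaster /opt/sge/default
```

Trying the function on a copy of /etc/fstab first is a cheap way to confirm the line it would add.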

Copy missing files in the /opt/sge/utilbin/lx-amd64/ directory from the master host

Copy the spool* files (spooldefaults and spoolinit) from /opt/sge/utilbin/lx-amd64/ on the master host to the same directory on the execution host. From that directory on the master host, run:

scp spool* b2:/opt/sge/utilbin/lx-amd64/
spooldefaults                                                                         100%  297KB 297.3KB/s   00:00
spoolinit                                                                             100% 1414KB   1.4MB/s   00:00

Then, on the execution host, change the current directory to $SGE_ROOT and run

./install_execd -nobincheck


Run the installer

During installation, the execution host on which you are installing the sge_execd daemon must be an administrative host. The installer asks you to check that the host has been added with the command qconf -ah <hostname> (run on the master host). Ensure this is the case (you can remove it as an admin host after the install if you wish), then press Enter to continue.

If you share less than $SGE_ROOT, you need to use the -nobincheck option with the installer:

./install_execd -nobincheck

See step by step instructions in Installation of the Grid Engine Execution Host
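
Whether the host is already an administrative host can be checked before the installer asks. A minimal sketch, assuming that `qconf -sh` prints one administrative host name per line and that the name you pass is written the same way as in that list (short name versus fully qualified name); the function name is_admin_host is hypothetical.

```shell
#!/bin/sh
# is_admin_host (hypothetical helper): succeed if the host name given as the
# argument appears in the list of administrative hosts read from stdin.
# Host names are compared case-insensitively and as whole lines.
is_admin_host() {
    grep -qixF -- "$1"
}

# Example on the master host (not run here):
#   qconf -sh | is_admin_host b2 || qconf -ah b2
```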

Grid Engine messages

During startup, Grid Engine messages can be found in syslog. After startup the daemons log their messages in their spool directories.

Qmaster:

$SGE_ROOT/$SGE_CELL/spool/qmaster/messages

Exec daemon:

$SGE_ROOT/$SGE_CELL/spool/$EXEC_HOSTNAME/messages

Here $EXEC_HOSTNAME is the hostname of the execution host whose messages we want to see.
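
The two paths differ only in the last directory component, so a small helper can build either one. This sketch assumes the default spool layout under $SGE_ROOT/$SGE_CELL/spool; if the execution daemon was installed with a local spool directory, the path will be different. The function name sge_messages_file is hypothetical.

```shell
#!/bin/sh
# sge_messages_file (hypothetical helper): print the path of the messages
# file for "qmaster" or for an execution host name, assuming the default
# spool layout. SGE_ROOT and SGE_CELL fall back to /opt/sge and default.
sge_messages_file() (
    root=${SGE_ROOT:-/opt/sge}
    cell=${SGE_CELL:-default}
    printf '%s/%s/spool/%s/messages\n' "$root" "$cell" "$1"
)

# Example (not run here): tail -n 20 "$(sge_messages_file b2)"
```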



Old News ;-)

[Mar 02, 2016] Son of Grid Engine version 8.1.9 is available

Mar 02, 2016 | liv.ac.uk

README

This is Son of Grid Engine version v8.1.9.

See <http://arc.liv.ac.uk/repos/darcs/sge-release/NEWS> for information on recent changes. See <https://arc.liv.ac.uk/trac/SGE> for more information.

The .deb and .rpm packages and the source tarball are signed with PGP key B5AEEEA9.

* sge-8.1.9.tar.gz, sge-8.1.9.tar.gz.sig:  Source tarball and PGP signature

* RPMs for Red Hat-ish systems, installing into /opt/sge with GUI
  installer and Hadoop support:

  * gridengine-8.1.9-1.el5.src.rpm:  Source RPM for RHEL, Fedora

  * gridengine-*8.1.9-1.el6.x86_64.rpm:  RPMs for RHEL 6 (and
    CentOS, SL)

  See <https://copr.fedorainfracloud.org/coprs/loveshack/SGE/> for
  hwloc 1.6 RPMs if you need them for building/installing RHEL5 RPMs.

* Debian packages, installing into /opt/sge, not providing the GUI
  installer or Hadoop support:

  * sge_8.1.9.dsc, sge_8.1.9.tar.gz:  Source packaging.  See
    <http://wiki.debian.org/BuildingAPackage>, and see
    <http://arc.liv.ac.uk/downloads/SGE/support/> if you need (a more
    recent) hwloc.

  * sge-common_8.1.9_all.deb, sge-doc_8.1.9_all.deb,
    sge_8.1.9_amd64.deb, sge-dbg_8.1.9_amd64.deb: Binary packages
    built on Debian Jessie.

* debian-8.1.9.tar.gz:  Alternative Debian packaging, for installing
  into /usr.

* arco-8.1.6.tar.gz:  ARCo source (unchanged from previous version)

* dbwriter-8.1.6.tar.gz:  compiled dbwriter component of ARCo
  (unchanged from previous version)

More RPMs (unsigned, unfortunately) are available at <http://copr.fedoraproject.org/coprs/loveshack/SGE/>.

Installation instructions for Son of Grid Engine 8.1.6

Useful installation instructions for Son of Grid Engine 8.1.6.
fsl.fmrib.ox.ac.uk

This is a quick walk through to get Grid Engine going on Linux for those who would like to use it for something like FSL. This documentation is a little old, being written when the Grid Engine software was owned by Sun and often referred to as SGE (Sun Grid Engine). However, this covers the basic requirements. A quick start guide for Ubuntu/Debian is available here, but more detailed setup can be found on this page.

Since the demise of the open source (Sun) Grid Engine, various ports have sprung up. Ubuntu/Debian package the last publicly available release (6.2u5), but users of Red Hat variants (CentOS, Scientific Linux) or Debian/Ubuntu users wishing to use a more modern release should look to installing Son of Grid Engine which makes available RPM and DEB packages and is still actively maintained (last update November 2013).

Grid Engine generally consists of one master (qmaster) and a number of execute (exec) hosts. Note that the qmaster machine can also be an exec host, which is fine for small deployments, but large clusters should look to keeping these functions separate.

This documentation was originally produced by A. Janke (a.janke@gmail.com) and is now maintained by the FSL team.

NFS

Although Grid Engine can be configured such that all machines are self contained, the instructions here assume that at least some of the Grid Engine folders are shared amongst the controller (qmaster) and clients (exec hosts). To achieve this you will typically need to setup one or more NFS shares, typically at least the configuration files (see http://arc.liv.ac.uk/SGE/howto/nfsreduce.html). Further, the FSL binaries and datasets to be operated on should be made available to all exec hosts in the same filesystem location. In the case of the FSL software, you could install this to the same location on all execution hosts or install to one location and NFS mount this to the same location on all hosts. In the case of datasets, the instructions here assume you are using NFS mounts, but through prolog and epilog scripts it is possible to setup Grid Engine to copy data to/from exec hosts.

Setting up NFS shares is beyond the scope of this document.

Name services

Grid Engine needs to be able to locate exec hosts/qmasters based on host name. Assuming all of your hosts are known to your DNS service, you will have to do no work to set this up. If you don't have a DNS zone then you may need to configure the local /etc/hosts file to resolve hostnames or look into the host aliases (man host_aliases) configuration.

User accounts

Grid Engine runs the scheduled job as the user who submitted it, using the textual name form (not numeric ID). Consequently, all exec hosts need to know about all users who are going to submit jobs. In a very small scale setup you may wish to add the required users directly to each exec host, but this quickly becomes unmanageable, so we would recommend setting up some kind of centralised user database, e.g. LDAP, Active Directory.

Setting up shared user accounts is beyond the scope of this document.

Admin account

The Grid Engine software has to run as a privileged user in order to be able to run jobs as the submitting user. However, as this is a potential security issue, the grid software that communicates with the network can be run under an admin account that doesn't have root access. This account needs to be available on all cluster hosts, so either set this up locally, or add it to your central LDAP/user account system.

If you decide to have a locally defined daemon account then set this up as follows (run as the root user) (this is Red Hat dialect, for Ubuntu/Debian use the interactive adduser command).

useradd --home /opt/sge --system sgeadmin

which will add a system account (e.g. no home folder creation, no ageing of the account etc). This should be run on the qmaster and all exec hosts.

Service ports

Grid Engine communicates over two statically configured ports. These ports have to be the same on all computers, and can be configured in the file /etc/services or by changing the Grid Engine configuration setup files that all users need to source to be able to use the software. The latter option is best where you need to have more than one cluster in a location, as each qmaster/exec host has to communicate with the different clusters on different ports. Modern Linux distributions are already setup with entries for Grid Engine (use grep sge_qmaster /etc/services to confirm). If your distribution does not include entries, then you need to add the following to this file:

sge_qmaster     6444/tcp                # Grid Engine Qmaster Service
sge_qmaster     6444/udp                # Grid Engine Qmaster Service
sge_execd       6445/tcp                # Grid Engine Execution Service
sge_execd       6445/udp                # Grid Engine Execution Service

commenting out any prior definitions for the ports 6444 and 6445.
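
The grep check mentioned above confirms only one of the four entries. The sketch below checks all four and prints the ones that are absent. The function name missing_sge_services is hypothetical, and the entries it looks for are exactly the four lines listed above.

```shell
#!/bin/sh
# missing_sge_services (hypothetical helper): print the Grid Engine port
# entries that are absent from a services(5)-style file
# (default: /etc/services). Commented-out lines do not count as present.
missing_sge_services() (
    file=${1:-/etc/services}
    for entry in sge_qmaster:6444/tcp sge_qmaster:6444/udp \
                 sge_execd:6445/tcp sge_execd:6445/udp; do
        name=${entry%%:*}
        port=${entry#*:}
        awk -v n="$name" -v p="$port" \
            '$1 == n && $2 == p { found = 1 } END { exit !found }' "$file" ||
            printf '%s %s\n' "$name" "$port"
    done
)

# Example (not run here): missing_sge_services /etc/services
```

No output means all four entries are already defined with the expected port numbers.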

... ... ...

Installation

Where we refer to $SGE_ROOT, when using the Son Of Grid Engine packages, this will be /opt/sge.

QMaster

Red Hat Enterprise etc

Installation of the RPMs should be carried out using YUM as any additional software dependencies will be automatically resolved. A Grid master can be installed using:

yum install gridengine-8.1.6-1.el6.x86_64.rpm gridengine-qmaster-8.1.6-1.el6.x86_64.rpm gridengine-execd-8.1.6-1.el6.x86_64.rpm gridengine-qmon-8.1.6-1.el6.x86_64.rpm gridengine-guiinst-8.1.6-1.el6.noarch.rpm

Set an environment variable and then install the qmaster as such:

export SGE_ROOT=/opt/sge
cd $SGE_ROOT
./install_qmaster

Now go through the interactive install process:

Now that we are back to a shell (finally) we need to add a few things to our root .bashrc so that we can access the SGE binaries. Add the following lines to /root/.bashrc

   # SGE settings
   export SGE_ROOT=/opt/sge   # must match the install location used above
   export SGE_CELL=default
   if [ -e "$SGE_ROOT/$SGE_CELL" ]
   then
      . "$SGE_ROOT/$SGE_CELL/common/settings.sh"
   fi

And then be sure to re-source your .bashrc

. /root/.bashrc

Now we can add our own username as an admin so that we can manage the system without becoming root.

qconf -am <myusername>

e.g. qconf -am jbloggs if your username is jbloggs.

Exec Host

The process for installing exec hosts is as follows

  1. Add the exec host to the master host as an admin host. If your exec host is called client.foo.com then run this on your master host:
    • qconf -ah client.foo.com
  2. On the client (client.foo.com)
    1. Add the sgeadmin username as per above
    2. Add the lines to /etc/services if required
    3. Add the SGE bits to /root/.bashrc and re-source it (. /root/.bashrc)
    4. Ensure the binaries have been installed
  3. Set an environment variable and then install the exec host (this might be the same machine as the queue master, for example if you only have one computer)
    • export SGE_ROOT=/opt/sge
      cd $SGE_ROOT
      ./install_execd
  4. Now go through the interactive install process:
  5. The installer will ask that you check that this host has been added as an administrative host with the qconf -ah <hostname> command. Ensure this is the case (you can remove it as an admin host after the install if you wish), then press enter to continue
  6. Make sure the Grid Engine root matches that configured on the Qmaster (/opt/sge)
  7. Ensure the cell name matches that configured on the master (default is usually fine "default")
  8. Accept the sge_execd port setting
  9. Accept the message about the host being known as an admin host
  10. Make a decision about the spool directory. For medium to large clusters local spool directories are the best option; for small (this should be an NFS mount) or stand-alone installs the default is fine. An appropriate local spool folder name might be /var/spool/sge. If you choose to have a local spool folder you will now receive a warning that the change of 'execd_spool_dir' will not be effective until execd has been restarted - you will have to stop/start the execd after completing the install for this to take effect.
  11. press "y" to install the startup scripts
  12. confirm you have read the following messages
  13. When asked about adding a default queue instance for this host answer "n" - FSL requires specific queues, so it is better to define these rather than the default queue.
  14. press enter to accept the next message and "n" to refuse to see the previous screen again and then finally enter to exit the installer

Repeat this installation procedure on all of the execution hosts...

Installing and Setting Up Sun Grid Engine on a Single Multi-Core PC

To embark on the installation of the gridengine packages, run the following command on your terminal:

sudo apt-get install \
  gridengine-master gridengine-exec gridengine-common gridengine-qmon gridengine-client

Alternatively, you can run the shorter, and perhaps more error-prone, command

sudo apt-get install gridengine-*

A pop-up window will appear within the terminal during installation, with title “Configuring gridengine-common”. A series of questions show up sequentially in this window:

  1. Question: “Configure SGE automatically?” Answer: highlight “<Yes>” and press “Enter”.
  2. Question: “SGE cell name:” Answer: type “default”, then press “Tab” to highlight “<Ok>” and press “Enter”.
    Note here that you are free to choose any name you want for your SGE cell instead of “default”, such as “sge_cell” for example. If you alter the SGE cell name, you will have to subsequently set the SGE_CELL variable in your ~/.bashrc file accordingly (assuming that bash is your default shell). For instance, if you set the SGE cell name to be sge_cell, you will add the following line in your ~/.bashrc:

     

    export SGE_CELL="sge_cell"

    Furthermore, you will need to add the above line of code in your /root/.bashrc file so that the SGE cell is also known to the root. It is advised that you leave the SGE cell name as it is, holding the “default” value.

  3. Question: “SGE master hostname:” Answer: type “localhost”, then press “Tab” to highlight “<Ok>” and press “Enter”.
    Instead of “localhost”, you can choose the hostname of your computer, which can be found by running the “hostname” command from the terminal:

     

    hostname

After answering these three questions, the pop-up window closes and the installation continues on the terminal. If for any reason you need to reconfigure the gridengine-master package, you can do so by invoking the following command:

sudo dpkg-reconfigure gridengine-master

The installation of gridengine is now complete, yet this does not mean that you are necessarily ready to use SGE. First of all, check whether sge_qmaster and sge_execd are running by using the command

ps aux | grep "sge"

The output I got verified that sge_qmaster and sge_execd are running:

sgeadmin  1310  0.0  0.1 135968  5376 ?        Sl   13:41   0:00 /usr/lib/gridengine/sge_qmaster
sgeadmin  1336  0.0  0.0  54760  1544 ?        Sl   13:41   0:00 /usr/lib/gridengine/sge_execd
1000      3171  0.0  0.0   7780   860 pts/0    S+   13:54   0:00 grep --colour=auto sge

If this is not the case for you, then start up sge_qmaster and sge_execd by executing the following three commands:

sudo su
sge_qmaster
sge_execd

Once you ensure that sge_qmaster and sge_execd are running, try to start qmon, the graphical user interface (GUI) for the administration of SGE:

sudo qmon

It is likely that the qmon window will not load, but instead you will get an error message. This is what I got:

Warning: Cannot convert string "-adobe-courier-medium-r-*--14-*-*-*-m-*-*-*" to type FontStruct
Warning: Cannot convert string "-adobe-courier-bold-r-*--14-*-*-*-m-*-*-*" to type FontStruct
Warning: Cannot convert string "-adobe-courier-medium-r-*--12-*-*-*-m-*-*-*" to type FontStruct
X Error of failed request:  BadName (named color or font does not exist)
  Major opcode of failed request:  45 (X_OpenFont)
  Serial number of failed request:  643
  Current serial number in output stream:  654

The error message indicates that some fonts are missing. The package which contains the necessary fonts is called xfonts-75dpi. In my case, xfonts-75dpi was installed automatically alongside the installation of the gridengine packages. Nevertheless, I got the error message because the fonts were not loaded after their installation. So, I merely restarted my computer. After rebooting, the “sudo qmon” command loaded the qmon window. If xfonts-75dpi is not installed on your system, then install it using the following command and then reboot:

sudo apt-get install xfonts-75dpi

After having resolved any possible font-related issues “sudo qmon” should load the SGE admin window. If you let the window remain idle or if you try to press any of its buttons, such as “Job Control”, the most likely event will be the appearance of a message pop-up window with the text “cannot reach qmaster”. Click on the “Abort” button of the pop-up window to terminate qmon. Try also the qstat command, which in my case gave the following error message:

error: commlib error: access denied (client IP resolved to host name "localhost". This is not identical to clients host name "russell")
error: unable to contact qmaster using port 6444 on host "russell"

It is useful to read the error message alongside the /etc/hosts file of my system:

127.0.0.1   localhost
127.0.1.1   russell

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

The hostname of my computer is “russell”. According to the error message, SGE set the client hostname to “russell”, which /etc/hosts maps to the loopback address 127.0.1.1, while it resolved the client IP to 127.0.0.1, the loopback address assigned to the hostname “localhost”. To resolve this ambiguity, I changed the first two lines of my /etc/hosts so that both hostnames “localhost” and “russell” share the same IP address (as a word of warning, make a backup of your /etc/hosts file before making any changes to it). To be more specific, I deleted the second line and appended the “russell” hostname to the end of the first line. My /etc/hosts file thus became:

127.0.0.1   localhost russell

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Moreover, it is possible that your /etc/hosts file contains by default the string “localhost.localdomain” in the first line, for example as in

127.0.0.1   localhost localhost.localdomain russell

If that’s the case, make sure you remove “localhost.localdomain” so that only “localhost” and your machine’s hostname (“russell” is my hostname) are tied to the IP address 127.0.0.1:

127.0.0.1   localhost russell

You may restart sge_qmaster and sge_execd, although it is not advised given that you made a fundamental change to your system’s state by reconfiguring the association between IPs and hostnames in the /etc/hosts file. Instead, you are advised to restart your computer before you proceed any further. After rebooting, “qstat” and “sudo qmon” should run without returning any error messages.
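
The situation behind the "access denied" error, a hostname mapped to a loopback address other than 127.0.0.1, can be detected before touching the file. The following is a sketch only: the function name loopback_conflict is hypothetical, and it inspects a hosts(5)-style file without modifying it.

```shell
#!/bin/sh
# loopback_conflict (hypothetical helper): print the address that a
# hosts(5)-style file (default: /etc/hosts) assigns to the given host name
# when that address is in 127.0.0.0/8 but is not 127.0.0.1.
loopback_conflict() (
    name=$1
    file=${2:-/etc/hosts}
    awk -v h="$name" '
        $1 ~ /^#/ { next }                 # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break       # rest of the line is a comment
                if ($i == h && $1 ~ /^127\./ && $1 != "127.0.0.1") print $1
            }
        }' "$file"
)

# Example (not run here): loopback_conflict "$(hostname)"
```

With the original /etc/hosts shown earlier this prints 127.0.1.1 for “russell”; after the edit it prints nothing.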

Installing Debian Linux, DRBL and the Sun Grid Engine (SGE) - Woojay Jeon

sites.google.com/site/woojay
V. Installing the Sun Grid Engine

I basically followed the installation instructions on the Grid Engine website to install qmaster via "./inst_sge -m". I used Padraig's specifications, which I am going to quote here:

I will just add that if the installer asks if you want to enable a JMX MBean server, you can answer no.
 

After installation, I ran:

   source /opt/oge/default/common/settings.sh
to configure various environment variables. I also added this command to my .bashrc file.

Recommended Links


Installation of the Grid Engine Execution Host

Installation instructions for Son of Grid Engine 8.1.6 (fsl.fmrib.ox.ac.uk)



Etc

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusively for research and educational purposes. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.

ABUSE: IPs or network segments from which we detect a stream of probes might be blocked for no less than 90 days. Multiple types of probes increase this period.


Copyright © 1996-2016 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.

The site uses AdSense, so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable JavaScript for this site. This site is perfectly usable without JavaScript.

Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...

You can use PayPal to make a contribution to support development of this site and speed up access. In case softpanorama.org is down, you can use softpanorama.info.

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: May 08, 2017