Softpanorama

May the source be with you, but remember the KISS principle ;-)

qdel


From Usage of the Sun Grid Engine scheduler NIIF Institute

Sometimes a job has to be deleted before it runs or before it finishes. For this you can use the

      qdel job_id

command.

      qdel 903

This example deletes job number 903.

      qdel -f 903

The -f (force) option can delete running jobs immediately. To suspend a job and continue it later, it is better to use qmod with -s and -us:

      qmod -s 903
      qmod -us 903

The first command suspends job number 903 (SIGSTOP), while the second one lets it continue (SIGCONT).

If the resource requirements of a job in the waiting list need to be changed, this can be done with the qalter command:

      qalter -l h_cpu=0:12:0 903

The previous command alters the hard CPU time limit (h_cpu) of job number 903 and changes it to 12 minutes. The switches of the qalter command largely overlap with those of the qsub command.
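
The h_cpu value above is written as hours:minutes:seconds. As a small sketch of that arithmetic, the helper to_hms below converts a number of seconds into this form. The helper name is hypothetical and not part of SGE; its output could be pasted into a qalter -l h_cpu=... call.

```shell
#!/bin/sh
# to_hms SECONDS
# Hypothetical helper (not an SGE tool): prints SECONDS as
# hours:minutes:seconds, the form used in "h_cpu=0:12:0" above.
to_hms() {
    total=$1
    printf '%d:%d:%d\n' $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

to_hms 720    # 12 minutes, prints 0:12:0
```
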

In a special case, we have to execute the same task on different data. Such tasks are called array jobs. With SGE we can submit many such tasks to the queue at once. For example, the pi task shown in the previous chapter can be submitted multiple times with different parameters using the following script, array.sh:

      #!/bin/sh
      #$ -N PI_ARRAY_TEST
      ./pi_gcc `expr $SGE_TASK_ID \* 100000`

SGE_TASK_ID is an integer variable that SGE sets to a different value for each task of the array job. The range of values is given when submitting the job:

      qsub -t 1-7 array.sh

meaning that array.sh will run in seven instances, and SGE_TASK_ID will have the value 1, 2, ..., 7 in the respective instance. The qstat -f command shows how the tasks of the array job are distributed:

     ---------------------------------------------------------------------------------
      parallel.q@cn30                       BIP   0/0/24         0    linux-x64   
     ---------------------------------------------------------------------------------
      test.q@cn32                        BIP   0/7/24       7.15     linux-x64
         907 1.00000 PI_ARRAY_T stefan       r     06/04/2011 10:34:14     1 1
         907 0.50000 PI_ARRAY_T stefan       t     06/04/2011 10:34:14     1 2
         907 0.33333 PI_ARRAY_T stefan       t     06/04/2011 10:34:14     1 3
         907 0.25000 PI_ARRAY_T stefan       t     06/04/2011 10:34:14     1 4
         907 0.20000 PI_ARRAY_T stefan       t     06/04/2011 10:34:14     1 5
         907 0.16667 PI_ARRAY_T stefan       t     06/04/2011 10:34:14     1 6
         907 0.14286 PI_ARRAY_T stefan       t     06/04/2011 10:34:14     1 7    
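
To see which argument each of the seven tasks passes to pi_gcc, the arithmetic from array.sh can be replayed locally without a cluster. This is only a sketch: the helper task_arg is hypothetical, and the loop merely imitates the values SGE would assign to SGE_TASK_ID for qsub -t 1-7.

```shell
#!/bin/sh
# task_arg TASK_ID
# Hypothetical helper: same arithmetic as `expr $SGE_TASK_ID \* 100000`
# in array.sh.
task_arg() {
    echo $(( $1 * 100000 ))
}

# Imitate "qsub -t 1-7": SGE would set SGE_TASK_ID to 1..7, one value per task.
SGE_TASK_ID=1
while [ "$SGE_TASK_ID" -le 7 ]; do
    echo "task $SGE_TASK_ID -> ./pi_gcc $(task_arg "$SGE_TASK_ID")"
    SGE_TASK_ID=$(( SGE_TASK_ID + 1 ))
done
```
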

Each task line ends with its array index, which can be used to refer to individual tasks of the job. For example, it is possible to delete only some tasks of an array job. If we want to delete subtasks 5-7 of the previous job, the command

      qdel -f 907.5-7 

will delete the chosen tasks but leave tasks 907.1-4 intact.
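
When parts of an array job are deleted from a script, it may help to build the job_id.range target in one place and preview the command before running it. The sketch below assumes a hypothetical helper delete_tasks and a hypothetical DRY_RUN variable; neither is part of SGE. Unlike the example above, it leaves out -f so that the normal deletion path is tried.

```shell
#!/bin/sh
# delete_tasks JOB_ID FIRST LAST
# Hypothetical helper: deletes tasks FIRST..LAST of array job JOB_ID.
# With DRY_RUN=1 (an assumed convention, not an SGE feature) it only
# prints the command instead of running it.
delete_tasks() {
    job_id=$1 first=$2 last=$3
    if [ "$first" -gt "$last" ]; then
        echo "delete_tasks: empty range $first-$last" >&2
        return 1
    fi
    target="$job_id.$first-$last"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "qdel $target"
    else
        qdel "$target"
    fi
}

# Preview the deletion of tasks 5-7 of job 907 without touching the queue.
DRY_RUN=1 delete_tasks 907 5 7
```
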

The run produces seven individual output files, one for each task, with seven different results.

It can happen that a job placed in the queue won't start. In this case the

      qstat -j job_id

command shows detailed scheduling information, including which requirements of the job cannot be satisfied.

The priority of a job only determines its position in the list of pending jobs. The scheduler analyzes the jobs in this order. Because resources also have to be reserved, it is not guaranteed that the jobs will start in exactly the same order.

If we wonder why a certain job won't start, here's how to get more information:

      qalter -w v job_id

One possible outcome:

      Job 53505 cannot run in queue "parallel.q" because it is not contained in its hard queue list (-q)
      Job 53505 (-l NONE) cannot run in queue "cn30.budapest.hpc.niif.hu" because exclusive resource (exclusive) is already in use
      Job 53505 (-l NONE) cannot run in queue "cn31.budapest.hpc.niif.hu" because exclusive resource (exclusive) is already in use
      Job 53505 cannot run in PE "mpi" because it only offers 0 slots
      verification: no suitable queues
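
With many queues this output can get long. As a rough sketch, the reasons can be grouped and counted with awk. The helper summarize_reasons is hypothetical, and the line format it expects is assumed from the sample above rather than taken from the SGE documentation.

```shell
#!/bin/sh
# summarize_reasons: reads "qalter -w v" style output on stdin and prints
# each distinct reason (the text after "because") with how often it occurs.
# Hypothetical helper; the line format is assumed from the sample above.
summarize_reasons() {
    awk '/cannot run in/ {
             sub(/^.* because /, "")   # keep only the reason
             count[$0]++
         }
         END { for (r in count) print count[r], r }' | sort -k1,1nr -k2
}

# Sample lines copied from the output above; on a cluster this could be
#   qalter -w v job_id | summarize_reasons
summarize_reasons <<'EOF'
Job 53505 cannot run in queue "parallel.q" because it is not contained in its hard queue list (-q)
Job 53505 (-l NONE) cannot run in queue "cn30.budapest.hpc.niif.hu" because exclusive resource (exclusive) is already in use
Job 53505 (-l NONE) cannot run in queue "cn31.budapest.hpc.niif.hu" because exclusive resource (exclusive) is already in use
Job 53505 cannot run in PE "mpi" because it only offers 0 slots
verification: no suitable queues
EOF
```

The most frequent reason is listed first, which here is the exclusive resource already being in use on two hosts.
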

You can check with this command where the jobs are running:

   qhost -j -q

HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
cn01                    linux-x64      24 24.43  62.9G    3.0G     0.0     0.0
   serial.q             BI    0/42/48       
    120087 0.15501 run.sh     roczei      r     09/23/2012 14:25:51 MASTER 22
    120087 0.15501 run.sh     roczei      r     09/23/2012 15:02:21 MASTER 78
    120087 0.15501 run.sh     roczei      r     10/01/2012 07:58:21 MASTER 143
    120087 0.15501 run.sh     roczei      r     10/01/2012 08:28:51 MASTER 144
    120087 0.15501 run.sh     roczei      r     10/04/2012 17:41:51 MASTER 158
    120340 0.13970 pwhg.sh    roczei   r     09/24/2012 23:24:51 MASTER 3
    120340 0.13970 pwhg.sh    roczei   r     09/24/2012 23:24:51 MASTER 5
    120340 0.13970 pwhg.sh    roczei   r     09/24/2012 23:24:51 MASTER 19
    120340 0.13970 pwhg.sh    roczei   r     09/24/2012 23:24:51 MASTER 23
    120340 0.13970 pwhg.sh    roczei   r     09/24/2012 23:24:51 MASTER 31
    120340 0.13970 pwhg.sh    roczei   r     09/24/2012 23:24:51 MASTER 33
    120340 0.13970 pwhg.sh    roczei   r     09/26/2012 13:42:51 MASTER 113
    120340 0.13970 pwhg.sh    roczei   r     10/01/2012 07:43:06 MASTER 186
    120340 0.13970 pwhg.sh    roczei   r     10/01/2012 07:58:36 MASTER 187
    ...
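
A listing like this quickly becomes hard to read. As a sketch, the job lines can be counted per job with awk. The helper tasks_per_job is hypothetical, and the column positions (job ID first, job name third, state fifth) are assumed from the listing above. It counts over the whole input, not per host.

```shell
#!/bin/sh
# tasks_per_job: reads "qhost -j -q" style output on stdin and prints
# "job_id job_name count" for job lines in state r.
# Hypothetical helper; column positions are assumed from the listing above.
tasks_per_job() {
    awk '$1 ~ /^[0-9]+$/ && $5 == "r" { n[$1 " " $3]++ }
         END { for (k in n) print k, n[k] }' | sort
}

# A few lines copied from the listing above; on a cluster this could be
#   qhost -j -q | tasks_per_job
tasks_per_job <<'EOF'
cn01                    linux-x64      24 24.43  62.9G    3.0G     0.0     0.0
   serial.q             BI    0/42/48
    120087 0.15501 run.sh     roczei      r     09/23/2012 14:25:51 MASTER 22
    120087 0.15501 run.sh     roczei      r     09/23/2012 15:02:21 MASTER 78
    120340 0.13970 pwhg.sh    roczei   r     09/24/2012 23:24:51 MASTER 3
EOF
```

For this shortened sample it prints "120087 run.sh 2" and "120340 pwhg.sh 1".
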

 

Getting information on the status of the queues:

qstat -g c

CLUSTER QUEUE                   CQLOAD   USED    RES  AVAIL  TOTAL aoACDS  cdsuE
--------------------------------------------------------------------------------
parallel.q                        0.52    368      0    280    648      0      0
serial.q                          0.05      5      0     91     96      0      0
test.q                            0.00      0      0     24     24      0      0
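
The USED and TOTAL columns can be turned into a utilization figure per cluster queue. The following is a sketch only: the helper queue_usage is hypothetical, and it assumes the column order shown above (queue name, CQLOAD, USED, RES, AVAIL, TOTAL).

```shell
#!/bin/sh
# queue_usage: reads "qstat -g c" style output on stdin and prints the
# share of used slots (USED / TOTAL) for each cluster queue.
# Hypothetical helper; the column order is assumed from the table above.
queue_usage() {
    awk '$2 ~ /^[0-9.]+$/ && $6 > 0 {
             printf "%s %.1f%%\n", $1, 100 * $3 / $6
         }'
}

# Data copied from the table above; on a cluster this could be
#   qstat -g c | queue_usage
queue_usage <<'EOF'
CLUSTER QUEUE                   CQLOAD   USED    RES  AVAIL  TOTAL aoACDS  cdsuE
--------------------------------------------------------------------------------
parallel.q                        0.52    368      0    280    648      0      0
serial.q                          0.05      5      0     91     96      0      0
test.q                            0.00      0      0     24     24      0      0
EOF
```

For the table above this gives 56.8% for parallel.q, 5.2% for serial.q and 0.0% for test.q.
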

 



Old News ;-)

SGE Configuration - Wiki

Deleting jobs from hung hosts

This is not applicable if SGE is configured with Berkeley DB Spooling --Eddale 10:40, 23 July 2009 (EDT)

When a job is running on a host and the host dies, SGE can get completely wedged and never let it be deleted (sticks in dr state). To fix this (as with a sledgehammer) someone suggests the following:

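# Log in to the host that runs the qmaster
ssh bass-files
# Stop the qmaster before touching its spool directory
sudo /etc/init.d/sgemaster stop
cd /var/spool/gridengine/bass/qmaster/jobs
# WARNING: this removes the spool files of every job, not only the stuck one
sudo \rm -rf *
sudo /etc/init.d/sgemaster start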


FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusively for research and educational purposes. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.


Copyright © 1996-2016 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.


Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.


This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...


Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: September 18, 2014