Grid Engine is a full-function, general-purpose Distributed Resource Management (DRM) tool. The scheduler
component in Grid Engine supports a wide range of compute farm scenarios. To get the maximum
performance from your compute environment, it can be worthwhile to review which features are enabled
and which are really needed to solve your load management problem. Enabling or disabling these features
can improve the throughput of your cluster. Each feature notes in parentheses the release in which
it was introduced. Unless stated otherwise, it is available in later releases as well.
Experience has shown that using NFS or similar shared file systems to
distribute the files required by Grid Engine can account for a critical share of both
overall network load and file server load. Keeping such files local is therefore
always beneficial for overall cluster throughput. See
Optimizing usage of NFS in Grid Engine.
Scheduler monitoring
See also sge_execd - Sun Grid Engine job execution agent
Scheduler monitoring can help to find out why certain jobs are not dispatched.
However, providing this information for all jobs at all times can be resource consuming (memory and CPU time)
and is usually not needed. To disable scheduler monitoring, set schedd_job_info to false
in the scheduler configuration. See sched_conf(5).
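As a minimal sketch of the change described above, the helper below rewrites one parameter in a configuration dump. It assumes the "name value" line layout that `qconf -ssconf` prints; the function name `set_param` and the sample text are hypothetical and not part of Grid Engine.

```python
def set_param(config_text: str, name: str, value: str) -> str:
    """Return config_text with parameter `name` set to `value`.

    Assumes one "name value" pair per line, as in SGE configuration dumps.
    If the parameter is not present, a new line is appended.
    """
    out = []
    found = False
    for line in config_text.splitlines():
        fields = line.split(None, 1)
        if fields and fields[0] == name:
            out.append(f"{name} {value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{name} {value}")
    return "\n".join(out) + "\n"


# Hypothetical excerpt of a scheduler configuration.
sample = "algorithm default\nschedule_interval 0:0:15\nschedd_job_info true\n"
updated = set_param(sample, "schedd_job_info", "false")
print(updated)
```

One plausible workflow is to save the output of `qconf -ssconf` to a file, rewrite it with a helper like this, and load it back with `qconf -Msconf <file>`. Editing interactively with `qconf -msconf` achieves the same result.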
With array jobs, the finished-jobs list in qmaster
can become quite large. Switching it off saves memory and speeds up qstat commands, because
qstat also fetches the finished-jobs list. Set finished_jobs to 0 in the global
configuration. See sge_conf(5).
Forcing validation at job submission time can be a valuable
tool to prevent non-dispatchable jobs from remaining in the pending state forever. However,
validating jobs can be time consuming, especially in heterogeneous environments with a variety
of execution nodes and consumable resources, where every user has their own job profile.
In homogeneous environments with only a few different kinds of jobs, general job validation
can usually be omitted. Job verification is disabled by default and should be used (qsub(1):
-w [v|e|w]) only when needed. [It is enabled by default with DRMAA.]
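To illustrate opting in to verification only for selected submissions, here is a small sketch that builds a qsub argument list. The helper `qsub_command` and the script name `job.sh` are hypothetical. The accepted levels are limited to the three named in the text above.

```python
# Verification levels mentioned in the text: v (verify only), e (error), w (warning).
VALID_LEVELS = {"v", "e", "w"}


def qsub_command(script, verify=None):
    """Build a qsub argument list; add -w only when a level is requested."""
    cmd = ["qsub"]
    if verify is not None:
        if verify not in VALID_LEVELS:
            raise ValueError(f"unknown verification level: {verify!r}")
        cmd += ["-w", verify]
    cmd.append(script)
    return cmd


print(qsub_command("job.sh"))              # default: no verification requested
print(qsub_command("job.sh", verify="e"))  # reject jobs that cannot be dispatched
```

The returned list could be passed to `subprocess.run` on a submit host. Leaving `verify` unset keeps the cheap default path for routine jobs.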
Load thresholds are needed if you deliberately oversubscribe your machines and need a mechanism to prevent excessive system load. Suspend thresholds are also used for this. The other case in which load thresholds are needed is when the execution node is open to interactive load that is not under the control of Grid Engine, and you want to prevent the node from being overloaded. If a compute farm is more single-purpose, e.g., each CPU at a compute node is represented by only one queue slot and no interactive load is expected at these nodes, then load_thresholds can be omitted. To disable both thresholds, set load_thresholds to none and suspend_thresholds to none. See queue_conf(5).
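The following sketch checks a queue configuration dump for thresholds that are still active. It assumes the "name value" layout printed by `qconf -sq <queue>`; the function `active_thresholds` and the sample queue text are hypothetical.

```python
THRESHOLD_KEYS = ("load_thresholds", "suspend_thresholds")


def active_thresholds(queue_conf_text):
    """Return the threshold settings whose value is not 'none' (any case)."""
    active = {}
    for line in queue_conf_text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0] in THRESHOLD_KEYS:
            value = fields[1].strip()
            if value.lower() != "none":
                active[fields[0]] = value
    return active


# Hypothetical excerpt of a queue configuration.
sample = (
    "qname all.q\n"
    "load_thresholds np_load_avg=1.75\n"
    "suspend_thresholds NONE\n"
    "slots 4\n"
)
print(active_thresholds(sample))
```

An empty result would indicate that both thresholds are already disabled for that queue.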
Load adjustments are used to virtually increase the measured load after a job has been dispatched. This mechanism is helpful on oversubscribed machines in order to align with load thresholds. Load adjustments should be switched off if they are not needed, because they impose additional work on the scheduler in connection with sorting hosts and verifying load thresholds. To disable load adjustments, set job_load_adjustments to none and load_adjustment_decay_time to 0 in the scheduler configuration. See sched_conf(5).
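A rough model of the mechanism may help here. The sketch assumes the adjustment is added in full when the job starts and then decays linearly to zero over load_adjustment_decay_time, which is how sched_conf(5) is generally understood to describe it. The function `adjusted_load` and the numbers used are illustrative only.

```python
def adjusted_load(measured, adjustment, elapsed, decay_time):
    """Load value as seen by the scheduler for one recently started job.

    Assumption: the adjustment decays linearly from 100% at job start
    to 0% after decay_time seconds. decay_time <= 0 means adjustments are off.
    """
    if decay_time <= 0 or elapsed >= decay_time:
        return measured
    remaining = 1.0 - max(elapsed, 0.0) / decay_time
    return measured + adjustment * remaining


# Illustrative values: measured load 1.0, adjustment 0.5, decay over 450 s.
for t in (0, 225, 450):
    print(t, adjusted_load(1.0, 0.5, t, 450))
```

With a decay time of 0 the function returns the measured load unchanged, which mirrors the effect of switching the feature off.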
The default for Grid Engine is to start scheduling runs at a fixed scheduling interval (see schedule_interval in sched_conf(5)). The advantage of fixed intervals is that they limit the CPU time consumed by qmaster/scheduler. The drawback is that they throttle the scheduler artificially, which limits throughput. In many compute farms there are machines dedicated to qmaster/scheduler, and in such setups there is no reason to throttle the scheduler. How many seconds one should use for flush times is difficult to say: it depends on the time the scheduler needs for a single run and on the number of jobs in the system. A couple of test runs with scheduler profiling (add profile=1 to params in sched_conf(5)) should give enough data to select a good value.
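One possible way to turn profiling data into a candidate value is sketched below. The heuristic (worst observed run time multiplied by a safety factor) is an assumption made for illustration, not a rule from the Grid Engine documentation. The function `suggest_interval` and the sample timings are hypothetical.

```python
import math


def suggest_interval(run_times_sec, safety_factor=2.0, floor_sec=1):
    """Suggest a value in whole seconds from measured scheduler run times.

    Heuristic (assumption): leave room for the slowest observed run,
    scaled by safety_factor, and never go below floor_sec.
    """
    if not run_times_sec:
        raise ValueError("need at least one measured run time")
    worst = max(run_times_sec)
    return max(floor_sec, math.ceil(worst * safety_factor))


# Hypothetical run times in seconds, e.g. collected with profile=1 enabled.
measured = [0.8, 1.2, 2.5]
print(suggest_interval(measured))
```

The result is only a starting point. It should be checked against qmaster CPU usage after a few more test runs.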
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusively for research and educational purposes. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.
Copyright © 1996-2016 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.
Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: November, 04, 2016