Grid Engine is a full-function, general-purpose Distributed Resource Management (DRM) tool. The scheduler
component in Grid Engine supports a wide range of compute farm scenarios. To get the maximum
performance from your compute environment, it is worth reviewing which features are enabled
and which are really needed to solve your load management problem. Disabling or enabling these features
can improve the throughput of your cluster. Each feature lists in parentheses the version in which
it was introduced. Unless otherwise stated, it is available in later versions as well.
Experience has shown that using NFS or similar shared file systems to
distribute files required by Grid Engine can account for a critical share of both
overall network load and file server load. Keeping such files local is therefore
always beneficial for overall cluster throughput. See
Optimizing usage of NFS in Grid Engine.
Scheduler monitoring
See also sge_execd - Sun Grid Engine job execution agent
Scheduler monitoring can help to find out why certain jobs are not dispatched
(displayed via qstat -j). However, providing this information for all jobs at all times
can be resource consuming (memory and CPU time) and is usually not needed. To disable
scheduler monitoring, set schedd_job_info to false in the scheduler configuration.
See sched_conf(5).
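The change can also be made without opening an editor. The following is a minimal sketch, assuming a Grid Engine release whose qconf supports -ssconf (show scheduler configuration) and -Msconf (load scheduler configuration from a file); the function name disable_sched_job_info is illustrative and not part of Grid Engine.

```shell
# Sketch: set schedd_job_info to false non-interactively.
# Assumes qconf is on PATH and the caller has Grid Engine manager rights.
disable_sched_job_info() {
    tmp=$(mktemp) || return 1
    # Dump the current scheduler configuration and rewrite one parameter.
    qconf -ssconf |
        sed -E 's/^(schedd_job_info[[:space:]]+).*/\1false/' > "$tmp"
    # Load the edited configuration back into qmaster.
    qconf -Msconf "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}
```

All other scheduler parameters are passed through unchanged, so the function should be safe to run repeatedly; running qconf -ssconf afterwards shows whether the new value took effect.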
In the case of array jobs, the finished job list in qmaster
can become quite large. Switching it off saves memory and speeds up qstat commands, because
qstat also fetches the finished jobs list. Set finished_jobs to 0 in the global
configuration. See sge_conf(5).
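Before changing the setting it helps to see what the cluster currently uses. A small sketch, assuming qconf -sconf global prints the global configuration as name/value pairs; the function name show_finished_jobs is illustrative.

```shell
# Sketch: print the current finished_jobs value of the global configuration.
# Assumes qconf is on PATH; prints nothing if the parameter is not listed.
show_finished_jobs() {
    qconf -sconf global | awk '$1 == "finished_jobs" { print $2 }'
}

# To change the value, open the global configuration in an editor with
#   qconf -mconf global
# and set the line
#   finished_jobs 0
```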
Forcing validation at job submission time can be a valuable
tool to prevent non-dispatchable jobs from remaining in the pending state forever. However, validating
jobs can be time consuming, especially in heterogeneous environments with a variety
of different execution nodes and consumable resources, where every user has their own job profile.
In homogeneous environments with only a couple of different job types, general job validation can usually
be omitted. Job verification is disabled by default and should only be used (qsub(1):
-w [v|e|w]) when needed. [It is enabled by default with DRMAA.]
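For occasional checks, validation can be requested per job instead of being forced for everyone. A minimal sketch using the qsub -w option mentioned above; the wrapper name verify_job and the script name job.sh are placeholders.

```shell
# Sketch: check whether a job could be dispatched, without submitting it.
# "-w v" asks qsub to verify the request only; the job is not queued.
verify_job() {
    qsub -w v "$@"
}

# Hypothetical usage (job.sh is a placeholder script):
#   verify_job job.sh
# To reject non-dispatchable jobs at submission time instead:
#   qsub -w e job.sh
```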
Load thresholds are needed if you deliberately oversubscribe your machines and need a mechanism to prevent excessive system load. Suspend thresholds are also used for this. The other case in which load thresholds are needed is when the execution node is open to interactive load that is not under the control of Grid Engine, and you want to prevent the node from being overloaded. If a compute farm is more single-purpose, e.g., each CPU at a compute node is represented by only one queue slot and no interactive load is expected at these nodes, then load thresholds can be omitted. To disable both thresholds, set load_thresholds to none and suspend_thresholds to none. See queue_conf(5).
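Both settings are per-queue attributes, so they have to be changed for every queue concerned. The sketch below assumes that qconf -rattr (replace an attribute of an object) accepts none for these two list attributes, as described in qconf(1); the function name disable_thresholds is illustrative and all.q is only the customary default queue name.

```shell
# Sketch: switch off load and suspend thresholds for one cluster queue.
# $1 is the queue name; requires Grid Engine manager rights.
disable_thresholds() {
    queue=$1
    qconf -rattr queue load_thresholds none "$queue" &&
    qconf -rattr queue suspend_thresholds none "$queue"
}

# Hypothetical usage:
#   disable_thresholds all.q
#   qconf -sq all.q | grep thresholds   # check the result
```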
Load adjustments are used to virtually increase the measured load after a job has been dispatched. This mechanism is helpful on oversubscribed machines in order to align with load thresholds. Load adjustments should be switched off if they are not needed, because they impose additional work on the scheduler in connection with sorting hosts and verifying load thresholds. To disable load adjustments, set job_load_adjustments to none and load_adjustment_decay_time to 0 in the scheduler configuration. See sched_conf(5).
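Whether load adjustments are still active can be read from the scheduler configuration. The following is a rough sketch, assuming qconf -ssconf prints name/value pairs and that a disabled decay time is shown as 0 or as an all-zero time such as 0:0:0; the function name load_adjustments_state is illustrative.

```shell
# Sketch: report whether load adjustments are switched off.
# Prints "off" only if job_load_adjustments is none and the decay time is zero.
load_adjustments_state() {
    qconf -ssconf | awk '
        $1 == "job_load_adjustments"       { adj = tolower($2) }
        $1 == "load_adjustment_decay_time" { decay = $2 }
        END {
            # A decay time made up only of zeros and colons counts as zero.
            if (adj == "none" && decay ~ /^[0:]+$/) print "off"
            else print "on"
        }'
}

# To switch them off, run "qconf -msconf" and set in the editor:
#   job_load_adjustments        none
#   load_adjustment_decay_time  0
```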
The default for Grid Engine is to start scheduling runs at a fixed scheduling interval (see schedule_interval in sched_conf(5)). The advantage of fixed intervals is that they limit the CPU time consumption of the qmaster/scheduler. The disadvantage is that they throttle the scheduler artificially, resulting in limited throughput. In many compute farms there are machines specifically dedicated to qmaster/scheduler, and in such setups there is no reason to throttle the scheduler; scheduling runs can instead be triggered shortly after job submission or job completion by setting flush times (flush_submit_sec and flush_finish_sec in sched_conf(5)). How many seconds one should use for the flush times is difficult to say. It depends on the time the scheduler needs for a single run and the number of jobs in the system. A couple of test runs with scheduler profiling (add profile=1 to params in sched_conf(5)) should give enough data to select a good value.
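The flush times discussed above correspond to the flush_submit_sec and flush_finish_sec parameters of sched_conf(5). Trying different values during test runs is easier with a small filter than with repeated editor sessions. This is a sketch only: the function name with_flush_times and the file path are illustrative, and the values in the usage comment are placeholders, not recommendations.

```shell
# Sketch: rewrite a scheduler configuration dump with new flush times.
# Reads the dump on stdin; $1 is flush_submit_sec, $2 is flush_finish_sec.
with_flush_times() {
    awk -v submit="$1" -v finish="$2" '
        $1 == "flush_submit_sec" { $2 = submit }
        $1 == "flush_finish_sec" { $2 = finish }
        { print }'
}

# Hypothetical usage (requires Grid Engine manager rights):
#   qconf -ssconf | with_flush_times 1 1 > /tmp/sched.conf
#   qconf -Msconf /tmp/sched.conf
```

Lines other than the two flush parameters pass through untouched, so the same dump can be reloaded after each test run with different values.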
Grid Engine Configuration Recipes by Dave Love
Reducing and Eliminating NFS usage by Grid Engine
Sun Grid Engine Tuning guide -- short and outdated notes
Grid Engine Tuning guide
Grid Engine Profiling HOWTO
Monitoring SGE Performance with DTrace
Grid Engine, Infiniband and general tuning tips - OpenEye HiveMind
Discussion list for users of Grid Engine - Gmane (comments.gmane.org/gmane.comp.clustering.gridengine.users/22457)
Ubuntu Manpage: sge_conf - Sun Grid Engine configuration files (manpages.ubuntu.com/manpages/natty/man5/sge_conf.5.html)
Discussion list for users of Grid Engine - Gmane (comments.gmane.org/gmane.comp.clustering.gridengine.users/20753)
Install and Configure Sun Grid Engine (SGE) Job Scheduler (rgrid.blog.com/...software-tools/...sun-grid-engine-sge-job-scheduler)
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: March, 12, 2019