The best way to manage licenses in SGE is to use consumable resources (CR). Floating licenses can easily be managed with a global CR. The classic example of a built-in consumable resource in SGE is slots.
The SGE batch scheduling system allows arbitrary "consumable resources" to be created that users can then make requests against. They can therefore be used to limit access to software licenses based on the availability of license tokens. When a job that uses a particular software package starts, it requests one (or more) licenses from SGE, and the consumable resource bookkeeping decrements the counter for that license pool. If no more resources are available (i.e. the internal counter is at 0), then the job is delayed until a currently-used resource is freed up.
The consumable parameter can have three values: NO, YES and JOB.
It can be set to YES or JOB only for numeric attributes (INT, DOUBLE, MEMORY, TIME - see the type field). If set to YES or JOB, consumption of the corresponding resource is managed by Sun Grid Engine's internal bookkeeping. In this case Sun Grid Engine accounts for the consumption of this resource for all running jobs and ensures that jobs are only dispatched if the internal bookkeeping indicates enough available consumable resources. Consumables are an efficient means to manage limited resources such as available memory, free space on a file system, network bandwidth or floating software licenses.
There are two types of consumables: per slot (consumable set to YES) and per job (consumable set to JOB).
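To illustrate the difference, here is a minimal sketch (the complex name lic is hypothetical): with a per-slot consumable a parallel job is charged its request once per granted slot, whereas a per-job consumable is charged only once regardless of the slot count.

% qsub -pe mpi 4 -l lic=1 myjob.sh
#   if lic has consumable=YES  -> 4 units of lic are charged (1 per slot)
#   if lic has consumable=JOB  -> 1 unit of lic is charged (once per job)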
Consumables can be combined with default or user defined load parameters (see sge_conf(5) and host_conf(5)), i.e. load values can be reported for consumable attributes or the consumable flag can be set for load attributes.
In this case the Sun Grid Engine consumable resource management takes both the load (measuring availability of the resource) and the internal bookkeeping into account, and ensures that neither of the two exceeds a given limit.
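As a rough sketch (the host name and memory size below are hypothetical), a load value such as mem_free can be flagged consumable so that the scheduler honours whichever of the measured load and the bookkeeping value is more restrictive:

% qconf -mc
#name      shortcut  type    relop  requestable  consumable  default  urgency
mem_free   mf        MEMORY  <=     YES          YES         0        0

% qconf -me node01
complex_values mem_free=16G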
To enable consumable resource management, the basic availability of a resource has to be defined. This can be done on a cluster-global, per-host or per-queue basis, and these categories may supersede each other in the given order (i.e. a host can restrict availability of a cluster resource, and a queue can restrict host and cluster resources).
The definition of resource availability is performed with the complex_values entry in host_conf(5) and queue_conf(5).
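A minimal sketch of this layering (host, queue and resource names are hypothetical): the global pool is defined on the pseudo host "global", and a host or queue can then advertise a smaller amount, which caps what jobs on that host or in that queue instance can consume.

% qconf -me global
complex_values lic=20        # 20 licenses cluster-wide

% qconf -me node01
complex_values lic=4         # node01 never runs jobs holding more than 4

% qconf -mq short.q
complex_values lic=2         # each instance of short.q is limited to 2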
Basically, a complex is a resource or value that can be requested by a job with the -l switch to qsub. Setting a complex to be consumable means that when a job requests that complex, the number available is decreased.
The complex_values definition of the "global" host specifies cluster global consumable settings. To each consumable complex attribute in a complex_values list a value is assigned which denotes the maximum available amount for that resource. The internal bookkeeping will subtract from this total the assumed resource consumption by all running jobs as expressed through the jobs' resource requests.
Notes:
See the Sun Grid Engine Installation and Administration Guide for examples on the usage of the consumable resources facility.
Here is how to achieve "license token" consumption in SGE (aka license token management):

% qconf -mc
#name    shortcut  type  relop  requestable  consumable  default  urgency
accel    accel     INT   <=     YES          JOB         0        0

% qconf -me global
complex_values accel=19

% qsub -l accel=10 -pe mpi 8 <myjob.sh>
The "per job" setting ensure that the requested tokens are *not* multiplied with the number of requested slots.
Another example from Setting Up A Global Consumable Resource in Grid Engine
Step 1: Configure the "global" complex
First create/modify a complex called "global" (the name is reserved, just as the complexes managing resources on a per-host/per-queue basis are called "host" and "queue"). This can be found by clicking the "Complexes Configuration" button in qmon.
Enter the following values for the complex (verilog is used in this example):
#name     shortcut  type  value  relop  requestable  consumable  default
#-----------------------------------------------------------------------
verilog   vl        INT   0      <=     YES          YES         0

The above says: there is a complex attribute called "verilog" with the shortcut name "vl" and it is of type integer. The "value" field has no meaning for consumable resources (therefore it is 0). This resource is requestable (YES), and it is consumable (YES). The "default" field should be set to 0 (it is a default value for users who don't request anything, but it is not useful for a global value here).
When using qmon, do not forget to press the "Add" button to add the new complex definition to the table below before applying with the "Ok" button.
After the complex is configured, it can be viewed by running the following command at the prompt:
% qconf -sc global

Step 2: Configure the "global" host
Since a global consumable resource is being created (all hosts have access to this resource), the pseudo host "global" must be configured.
Using qmon:
qmon -> Host Configuration -> Execution host
Select the "global" host and click on "Modify". Select the tab titled "Consumable/Fixed Attributes". It is correct that the "global" complex does not show in the window (the global host has it by default, just as a host has the "host" complex by default).
Now click on the "Name/Value" title bar on the right (above the trash bin icon). A window pops up and there will be the resource "verilog". Select OK and verilog will be added to the first column of the table. Now enter the number of licenses of verilog in the second column.
Press "Ok" and the new resource and number in the will appear in the "Consumables/Fixed Attributes" window. Click the "Done" button to close this window.
Step 3: View the consumable attribute
To view the attribute, type the following:
% qstat -F verilog
queuename      qtype  used/tot.  load_avg  arch       states
---------------------------------------------------------------------------
balrog.q       BIC    0/4        0.45      solaris64
	gc:verilog=10.000000
---------------------------------------------------------------------------
bilbur.q       BIC    0/4        0.46      solaris
	gc:verilog=10.000000
---------------------------------------------------------------------------
dwain.q        BIC    0/4        0.82      irix6
	gc:verilog=10.000000

See qstat(1) for the various meanings of "gc", etc. (Try "qstat -F" to see the full list of attributes associated with each queue.)
"gc" means it is a (g)lobal (c)onsumable resource
Since it is global, all queues have inherited this value.
Step 4: Use the consumable attribute
The following submits a job, and requests the verilog resource:
% qsub -l vl=1 myjob.sh

When the job is running, the effect can be seen by running qstat:

% qstat -F vl
queuename      qtype  used/tot.  load_avg  arch       states
----------------------------------------------------------------------------
balrog.q       BIC    0/4        0.40      solaris64
	gc:verilog=9.000000
----------------------------------------------------------------------------
gloin.q2       BIC    0/4        0.02      osf4
	gc:verilog=9.000000
----------------------------------------------------------------------------
lis.q          BIC    0/4        0.35      glinux
	gc:verilog=9.000000
----------------------------------------------------------------------------
ori.q          BIC    1/4        0.15      glinux
	gc:verilog=9.000000
	3026 0 sleeper.sh  andy  t  11/02/1999 15:55:25  MASTER

To see which running job requested which resources:
% qstat -F vl -r -s r
queuename      qtype  used/tot.  load_avg  arch       states
----------------------------------------------------------------------------
[...]
----------------------------------------------------------------------------
ori.q          BIC    1/4        0.12      glinux
	gc:verilog=9.000000
	3026 0 sleeper.sh  andy  r  11/02/1999 15:55:25  MASTER
	      Full jobname:   sleeper.sh
	      Hard Resources: verilog=1
	                      h_fsize=0 (default)
Server Fault
We're using SGE (Sun Grid Engine). We have some limitations on the total number of concurrent jobs from all users. I would like to know if it's possible to set a temporary, voluntary limit on the number of concurrent running jobs for a specific user.

For example, user dave is about to submit 500 jobs, but he would like no more than 100 to run concurrently, e.g. because he knows the jobs do lots of I/O which bogs down the filesystem (true story, unfortunately). Is that possible?
asked Sep 24 '10 at 0:25

Answer by Kamil Kisiel:
You can define a complex with qconf -mc. Call it something like high_io or whatever you'd like, and set the consumable field to YES. Then either in the global configuration with qconf -me global, or in a particular queue with qconf -mq <queue name>, set high_io=500 in the complex values. Now tell your users to specify -l high_io=1, or however many "tokens" you'd like them to use. This will limit the number of concurrent jobs to whatever you set the complex value to.

The other way to do this is with quotas. Add a quota with qconf -arqs that looks something like:

{
   name         dave_max_slots
   description  "Limit dave to 500 slots"
   enabled      true
   limit        users {dave} to slots=500
}

Thanks Kamil and sorry for the late reply. A couple of follow-ups, since I'm quite new to qconf. Regarding your first suggestion, could you be a bit more explicit? What is "consumable"? After configuring as mentioned, do I simply tell the user to qsub with -l high_io=1? – David B Sep 28 '10 at 9:39

Basically, a complex is a resource or value that can be requested by a job with the -l switch to qsub. By setting a complex to be consumable, it means that when a job requests that complex the number available is decreased. So if a queue has 500 of the high_io complex, and a job requests 20, there will be 480 available for other jobs. You'd request the complex just as in your example. – Kamil Kisiel Sep 28 '10 at 22:42

Thank you Kamil. Sorry I can't vote up (not enough reputation yet). – David B Oct 1 '10 at 9:08
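Putting the first suggestion together as one command-line sequence might look roughly like this (the complex name, shortcut and the value of 100 are illustrative; qconf -mc and -me open an editor in which the shown lines are added):

% qconf -mc
#name      shortcut  type  relop  requestable  consumable  default  urgency
high_io    hio       INT   <=     YES          YES         0        0

% qconf -me global
complex_values high_io=100

# dave submits his jobs with one token each; at most 100 run concurrently
% qsub -l high_io=1 myjob.sh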
SGE - MerlinWiki
- matyldaX
- scratchX
- ram_free, mem_free
- disk_free, tmp_free
- gpu
We have found that for some tasks it is advantageous to tell SGE about the required resources. This makes sense when heavy use of RAM or network storage is expected. Limits can be soft or hard (parameters -soft, -hard), and the limits themselves take the form:
-l resource=value

For example, in case a job needs at least 400MB RAM:

qsub -l ram_free=400M my_script.sh

Another often requested resource is space in /tmp:
qsub -l tmp_free=10G my_script.sh
Or both:
qsub -l ram_free=400M,tmp_free=10G my_script.sh

Of course, it is possible (and preferable if the number does not change) to put the directive #$ -l ram_free=400M directly in the script. The current status of a given resource on all nodes can be obtained with qstat -F ram_free, or for several resources with qstat -F ram_free,tmp_free.
Details on other standard resources are in /usr/local/share/SGE/doc/load_parameters.asc. If you do not specify a value for a given resource, a default value is used (1GB for space in /tmp, 100MB for RAM).
WARNING: You need to distinguish whether you are requesting resources that must be available at the time of submission (so-called non-consumable resources), or whether you need to reserve a given resource for the whole runtime of your computation. For example, your program may need 400MB of memory but allocate only 100MB during its first 10 minutes. If you use the standard resource mem_free and other jobs are submitted to the same node during those first 10 minutes, SGE interprets the situation as follows: you asked for 400MB but are currently using only 100MB, so the remaining 300MB can be given to someone else (i.e. it will schedule other jobs requesting that memory).
For these purposes it is better to use consumable resources, which are accounted independently of the current status of the task - ram_free for memory, tmp_free for disk. For example, the resource ram_free does not look at the actual free RAM; it tracks RAM occupation based solely on the requests of the individual jobs. It starts with the RAM size of the given machine and subtracts the amount requested by each job scheduled to that machine. If a job does not specify ram_free, the default value of ram_free=100M is used.
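For completeness, a consumable like ram_free is presumably defined once as a consumable complex and then initialized per execution host with the machine's RAM size (the host name and size below are hypothetical):

% qconf -mc
#name      shortcut  type    relop  requestable  consumable  default  urgency
ram_free   ram_free  MEMORY  <=     YES          YES         100M     0

% qconf -me node01
complex_values ram_free=32G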
For disk space in /tmp (tmp_free) the situation is trickier: if a job does not clean up its mess after it finishes, the disk can actually have less free space than the resource bookkeeping claims. Unfortunately, nothing can be done about this.
Known problems with SGE
- Use of paths - for the home directory it is necessary to use the official path, i.e. /homes/kazi/... or /homes/eva (or simply the variable $HOME). If the internal mount point of the automounter is used, i.e. /var/mnt/..., an error will occur. (This is not an SGE error; the internal path is not fully functional for access.)
- Availability of nodes - because some nodes have limited access (employees' PCs), it is necessary to specify a list of nodes on which your job can run. This is done with the parameter -q. The machines that are generally available are the IBM Blade nodes and some computer labs, provided the machines are left on over night. The list of queues for -q must be on a single line, even if it is very long. To address whole groups of nodes, the parameter -q can be used as follows:
#$ -q all.q@@blade,all.q@@PCNxxx,all.q@@servers

The main groups of computers are: @blade, @servers, @speech, @PCNxxx, @PCN2xxx - the full and current list can be obtained with qconf -shgrpl.
- The syntax for addressing is QUEUE@OBJECT, i.e. all.q@OBJECT. The object is either a single computer, for example all.q@svatava, or a group of computers (whose name itself begins with @, e.g. @blade), i.e. all.q@@blade.
- The computers in the labs are sometimes restarted by students during a computation - we can't do much about this. If you really need the computation to finish (i.e. it is not easy to re-run a job if it is brutally killed), use the newly defined groups of computers:
@stable - @blade, @servers - servers that run all the time without restarting
@PCOxxx, @PCNxxx - computer labs; any node might be restarted at any time, or a student may shut a machine down by error or "by error". It is more or less certain that these machines will run smoothly over night and during weekends. There is also a group for each individual lab, e.g. @PCN103.
- Running scripts other than bash - it is necessary to specify the interpreter on the first line of your script (it is probably already there), for example #!/usr/bin/perl, etc.
- Does your script generate heavy traffic on the matyldas? Then set -l matyldaX=10 (for example 10, i.e. at most 100/10 = 10 concurrent jobs on the given matyldaX), where X is the number of the matylda used (if you use several matyldas, specify -l matyldaX=Y several times). We have created an SGE resource for each matylda (each matylda has 100 points in total), and jobs using -l matyldaX=Y are started only while the given matylda has free points. This can be used to balance the load of a given storage server from the user side; see the sketch after this list. The same holds for the scratch0X servers.
- Be careful with the parameter -cwd; it is not guaranteed to work every time. It is better to use cd /where/do/i/want at the beginning of your script.
- If a node is restarted, a job will still be shown in SGE even though it is no longer running. This is because SGE waits until the node confirms termination of the computation (i.e. until it boots Linux again and starts the SGE client). If you use qdel to delete such a job, it will only be marked with the flag d. Jobs marked with this flag are automatically deleted by the server every hour.
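As a sketch of the matylda convention described in the list above (the matylda numbers and point counts are only examples):

# reserve 10 of matylda5's 100 points -> at most 100/10 = 10 such jobs run at once
qsub -l matylda5=10 my_script.sh

# a job hitting two storage servers reserves points on both
qsub -l matylda3=5 -l matylda5=5 my_script.sh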
Parallel jobs - OpenMP
For parallel tasks using threads, it is enough to use the parallel environment smp and to set the number of threads:
#!/bin/sh
#
#$ -N OpenMPjob
#$ -o $JOB_NAME.$JOB_ID.out
#$ -e $JOB_NAME.$JOB_ID.err
#
# PE_name  CPU_Numbers_requested
#$ -pe smp 4
#
cd SOME_DIR_WITH_YOUR_PROGRAM
export OMP_NUM_THREADS=$NSLOTS
./your_openmp_program [options]

Parallel jobs - OpenMPI
- Open MPI is now fully supported, and it is the default parallel environment (mpirun is by default Open MPI)
- The SGE parallel environment is openmpi
- The allocation rule is $fill_up, which means that the preferred allocation is on the same machine (slots on one host are filled before moving to the next).
- Open MPI is compiled with tight SGE integration:
- mpirun will automatically submit to machines reserved by SGE
- qdel will automatically clean all MPI stubs
- For parallel tasks, do not forget (preferably directly in the script) to use the parameter -R y. This turns on slot reservation, i.e. your job won't be starved by jobs requesting fewer slots.
- If a parallel task is launched using qlogin, there is no variable containing information on which slots were reserved. A useful tool is qstat -u `whoami` -g t | grep QLOGIN, which shows where the parallel jobs are running.
Listing follows:
#!/bin/bash
# ---------------------------
# our name
#$ -N MPI_Job
#
# use reservation to stop starvation
#$ -R y
#
# pe request
#$ -pe openmpi 2-4
#
# ---------------------------
#
# $NSLOTS - the number of tasks to be used
echo "Got $NSLOTS slots."
mpirun -n $NSLOTS /full/path/to/your/executable
Jul 16, 2008
The SGE batch scheduling system allows for arbitrary "consumable resources" to be created that users can then make requests against. In general, this is used to limit access to a pool of software licenses or make sure that memory usage is planned for properly. E.g. when a user wants to use a special software package, they request 1 license from SGE and it will decrement its internal counter for that license pool. If no more resources are available (i.e. the internal counter is at 0), then the job will be delayed until a currently-used resource is freed up.
We can also create arbitrary consumable resources to help users self-limit their usage of the DSCR. We can set up a resource, or counter, that will be decremented every time you submit a job. This way, you can submit 1000's of jobs to SGE, but you won't be swamping the machines or otherwise impeding other users.
If a user is given their own job-control resource, say 'cpus_user001', they should then submit jobs with an extra resource request using the '-l' option:
% qsub -l cpus_user001=1 myjob.q

Before running the job, SGE will make sure that there are sufficient resources. Thus, if there are 100 resources set aside for 'cpus_user001', then the 101st simultaneous job request will have to wait for one of the previous jobs to complete even if there are empty machines in the cluster.

Alternately, you can embed this within your SGE submission script. At the top of the file ("myjob.q" in the above example), you can insert:
#$ -l cpus_user001=1
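A minimal submission script using such a directive might look like this (the resource name cpus_user001 comes from the example above; the job name and program are illustrative):

#!/bin/bash
#$ -N my_limited_job
#$ -cwd
#$ -l cpus_user001=1

./my_program --input data.txt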
Server Fault
I am using a tool called starcluster (http://star.mit.edu/cluster) to boot up an SGE-configured cluster in the Amazon cloud. The problem is that it doesn't seem to be configured with any pre-set consumable resources, except for SLOTS, which I don't seem to be able to request directly with qsub -l slots=X. Each time I boot up a cluster, I may ask for a different type of EC2 node, so the fact that this slot resource is preconfigured is really nice. I can request a certain number of slots using a pre-configured parallel environment, but the problem is that it was set up for MPI, so requesting slots using that parallel environment sometimes grants the job slots spread out across several compute nodes.

Is there a way to either 1) make a parallel environment that takes advantage of the existing pre-configured HOST=X slots settings that starcluster sets up, where you request slots on a single node, or 2) use some kind of resource that SGE is automatically aware of? Running qhost makes me think that even though NCPU and MEMTOT are not defined anywhere I can see, SGE is somehow aware of those resources. Are there settings where I can make those resources requestable without explicitly defining how much of each is available?

Thanks for your time!
qhost
output:

HOSTNAME   ARCH       NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global     -          -     -     -       -       -       -
master     linux-x64  2     0.01  7.3G    167.4M  0.0     0.0
node001    linux-x64  2     0.01  7.3G    139.6M  0.0     0.0
qconf -mc
output:

#name               shortcut  type      relop  requestable  consumable  default  urgency
#----------------------------------------------------------------------------------------
arch                a         RESTRING  ==     YES          NO          NONE     0
calendar            c         RESTRING  ==     YES          NO          NONE     0
cpu                 cpu       DOUBLE    >=     YES          NO          0        0
display_win_gui     dwg       BOOL      ==     YES          NO          0        0
h_core              h_core    MEMORY    <=     YES          NO          0        0
h_cpu               h_cpu     TIME      <=     YES          NO          0:0:0    0
h_data              h_data    MEMORY    <=     YES          NO          0        0
h_fsize             h_fsize   MEMORY    <=     YES          NO          0        0
h_rss               h_rss     MEMORY    <=     YES          NO          0        0
h_rt                h_rt      TIME      <=     YES          NO          0:0:0    0
h_stack             h_stack   MEMORY    <=     YES          NO          0        0
h_vmem              h_vmem    MEMORY    <=     YES          NO          0        0
hostname            h         HOST      ==     YES          NO          NONE     0
load_avg            la        DOUBLE    >=     NO           NO          0        0
load_long           ll        DOUBLE    >=     NO           NO          0        0
load_medium         lm        DOUBLE    >=     NO           NO          0        0
load_short          ls        DOUBLE    >=     NO           NO          0        0
m_core              core      INT       <=     YES          NO          0        0
m_socket            socket    INT       <=     YES          NO          0        0
m_topology          topo      RESTRING  ==     YES          NO          NONE     0
m_topology_inuse    utopo     RESTRING  ==     YES          NO          NONE     0
mem_free            mf        MEMORY    <=     YES          NO          0        0
mem_total           mt        MEMORY    <=     YES          NO          0        0
mem_used            mu        MEMORY    >=     YES          NO          0        0
min_cpu_interval    mci       TIME      <=     NO           NO          0:0:0    0
np_load_avg         nla       DOUBLE    >=     NO           NO          0        0
np_load_long        nll       DOUBLE    >=     NO           NO          0        0
np_load_medium      nlm       DOUBLE    >=     NO           NO          0        0
np_load_short       nls       DOUBLE    >=     NO           NO          0        0
num_proc            p         INT       ==     YES          NO          0        0
qname               q         RESTRING  ==     YES          NO          NONE     0
rerun               re        BOOL      ==     NO           NO          0        0
s_core              s_core    MEMORY    <=     YES          NO          0        0
s_cpu               s_cpu     TIME      <=     YES          NO          0:0:0    0
s_data              s_data    MEMORY    <=     YES          NO          0        0
s_fsize             s_fsize   MEMORY    <=     YES          NO          0        0
s_rss               s_rss     MEMORY    <=     YES          NO          0        0
s_rt                s_rt      TIME      <=     YES          NO          0:0:0    0
s_stack             s_stack   MEMORY    <=     YES          NO          0        0
s_vmem              s_vmem    MEMORY    <=     YES          NO          0        0
seq_no              seq       INT       ==     NO           NO          0        0
slots               s         INT       <=     YES          YES         1        1000
swap_free           sf        MEMORY    <=     YES          NO          0        0
swap_rate           sr        MEMORY    >=     YES          NO          0        0
swap_rsvd           srsv      MEMORY    >=     YES          NO          0        0
qconf -me master
output (one of the nodes as an example):

hostname              master
load_scaling          NONE
complex_values        NONE
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         NONE
report_variables      NONE
qconf -msconf
output:

algorithm                          default
schedule_interval                  0:0:15
maxujobs                           0
queue_sort_method                  load
job_load_adjustments               np_load_avg=0.50
load_adjustment_decay_time         0:7:30
load_formula                       np_load_avg
schedd_job_info                    false
flush_submit_sec                   0
flush_finish_sec                   0
params                             none
reprioritize_interval              0:0:0
halftime                           168
usage_weight_list                  cpu=1.000000,mem=0.000000,io=0.000000
compensation_factor                5.000000
weight_user                        0.250000
weight_project                     0.250000
weight_department                  0.250000
weight_job                         0.250000
weight_tickets_functional          0
weight_tickets_share               0
share_override_tickets             TRUE
share_functional_shares            TRUE
max_functional_jobs_to_schedule    200
report_pjob_tickets                TRUE
max_pending_tasks_per_job          50
halflife_decay_list                none
policy_hierarchy                   OFS
weight_ticket                      0.010000
weight_waiting_time                0.000000
weight_deadline                    3600000.000000
weight_urgency                     0.100000
weight_priority                    1.000000
max_reservation                    0
default_duration                   INFINITY
qconf -mq all.q
output:

qname                 all.q
hostlist              @allhosts
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make orte
rerun                 FALSE
slots                 1,[master=2],[node001=2]
tmpdir                /tmp
shell                 /bin/bash
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
John St. John
The solution I found is to make a new parallel environment that has the $pe_slots allocation rule (see man sge_pe). I set the number of slots available to that parallel environment to the maximum, since $pe_slots limits the slot usage to per-node. Since starcluster sets up the slots at cluster boot time, this seems to do the trick nicely. You also need to add the new parallel environment to the queue. So, just to make this dead simple:

qconf -ap by_node
and here are the contents after I edited the file:
pe_name            by_node
slots              9999999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
Also modify the queue (called all.q by starcluster) to add this new parallel environment to the list:

qconf -mq all.q
and change this line:
pe_list make orte
to this:
pe_list make orte by_node
I was concerned that jobs spawned from a given job would be limited to a single node, but this doesn't seem to be the case. I have a cluster with two nodes, and two slots each.
I made a test file that looks like this:
#!/bin/bash
qsub -b y -pe by_node 2 -cwd sleep 100
sleep 100
and executed it like this:
qsub -V -pe by_node 2 test.sh
After a little while, qstat shows both jobs running on different nodes:

job-ID  prior    name       user  state  submit/start at      queue          slots  ja-task-ID
-----------------------------------------------------------------------------------------------
    25  0.55500  test       root  r      10/17/2012 21:42:57  all.q@master       2
    26  0.55500  sleep      root  r      10/17/2012 21:43:12  all.q@node001      2
I also tested submitting 3 jobs at once requesting the same number of slots on a single node, and only two run at a time, one per node. So this seems to be properly set up!
Stack Overflow
We have a cluster of machines, each with 4 GPUs. Each job should be able to ask for 1-4 GPUs. Here's the catch: I would like SGE to tell each job which GPU(s) it should take. Unlike the CPU, a GPU works best if only one process accesses it at a time. So I would like:

Job #1: GPU 0, 1, 3
Job #2: GPU 2
Job #4: wait until 1-4 GPUs are available
The problem I've run into, is that the SGE will let me create a GPU resource with 4 units on each node, but it won't explicitly tell a job which GPU to use (only that it gets 1, or 3, or whatever).
I thought of creating 4 resources (gpu0, gpu1, gpu2, gpu3), but am not sure if the -l flag will take a glob pattern, and can't figure out how SGE would tell the job which GPU resources it received. Any ideas?
Daniel Blezek

When you have multiple GPUs and you want your jobs to request a GPU, but the Grid Engine scheduler should handle and select the free GPUs, you can configure an RSMAP (resource map) complex (instead of an INT). This allows you to specify the amount as well as the names of the GPUs on a specific host in the host configuration. You can also set it up as a HOST consumable, so that independent of the slots you request, the number of GPU devices requested with -l gpu=2 is 2 per host (even if the parallel job got, say, 8 slots spread over different hosts).

qconf -mc
#name   shortcut  type   relop  requestable  consumable  default  urgency
#------------------------------------------------------------------------
gpu     gpu       RSMAP  <=     YES          HOST        0        0
In the execution host configuration you can initialize your resources with ids/names (here simply GPU1 and GPU2).
qconf -me yourhost

hostname         yourhost
load_scaling     NONE
complex_values   gpu=2(GPU1 GPU2)
Then, when requesting -l gpu=1, the Univa Grid Engine scheduler will select GPU2 if GPU1 is already in use by a different job. You can see the actual selection in the qstat -j output. The job gets the selected GPU by reading the $SGE_HGR_gpu environment variable, which in this case contains the chosen id/name "GPU2". This can be used to access the right GPU without collisions.
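A rough sketch of how a job script might consume that variable (mapping the granted names GPU1/GPU2 to CUDA device indices is an assumption of this example, not something Grid Engine does for you):

#!/bin/bash
#$ -l gpu=1

# $SGE_HGR_gpu holds the granted resource name, e.g. "GPU2"
case "$SGE_HGR_gpu" in
  GPU1) export CUDA_VISIBLE_DEVICES=0 ;;
  GPU2) export CUDA_VISIBLE_DEVICES=1 ;;
esac

./my_cuda_program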
If you have a multi-socket host you can even attach a GPU directly to some CPU cores near the GPU (near the PCIe bus) in order to speed up communication between GPU and CPUs. This is possible by attaching a topology mask in the execution host configuration.
qconf -me yourhost

hostname         yourhost
load_scaling     NONE
complex_values   gpu=2(GPU1:SCCCCScccc GPU2:SccccSCCCC)
Now when the UGE scheduler selects GPU2 it automatically binds the job to all 4 cores (C) of the second socket (S) so that the job is not allowed to run on the first socket. This does not even require the -binding qsub param.
More configuration examples can be found on www.gridengine.eu.
Note that all these features are only available in Univa Grid Engine (8.1.0/8.1.3 and higher), not in SGE 6.2u5 or other Grid Engine versions (like OGE, Son of Grid Engine, etc.). You can try it out by downloading the 48-core limited free version from univa.com.