Slurm » History » Version 58

Martin Kuemmel, 07/26/2016 12:22 PM

{{toc}}

h1. Hardware overview

You access the Euclid cluster through alexandria@usm.uni-muenchen.de

* alexandria is the file server and should not be used for computing
* There are 12 compute nodes named euclides1--euclides12
* euclides8 hosts a virtual machine and is not available for computing
* euclides12 is only available for debugging, see below
* euclides11 is currently used to test a different OS
* each node has 32 logical CPUs and 64GB of RAM

h1. How to run jobs on the euclides nodes (using Slurm)

Use slurm to submit jobs or log in to the euclides nodes (euclides1-12).

*Please read through this entire wiki page so everyone can make efficient use of this cluster*

h2. alexandria

*Please do not use alexandria as a compute node* - its hardware is different from the nodes. It hosts our file server and other services that are important to us.

You should use alexandria to

* transfer files
* compile your code
* submit jobs to the nodes

If you need to debug and would like to log in to a node, please start an interactive job on one of the nodes using slurm. For instructions see below.

h2. euclides nodes

Job submission to the euclides nodes is handled by the slurm job manager (see http://slurm.schedmd.com and https://computing.llnl.gov/linux/slurm/).
*Important: In order to run jobs, you need to be added to the slurm accounting system - please contact the admin*

All slurm commands listed below have very helpful man pages (e.g. man slurm, man squeue, ...).

If you are already familiar with another job manager, the following information may be helpful to you: http://slurm.schedmd.com/rosetta.pdf

h3. Scheduling of Jobs

At this point there are two queues, called partitions in slurm:
* *normal* is the default partition your jobs will be sent to if you do not specify otherwise. At this point there is a time limit of two days. Jobs can currently only run on 1 node.
* *debug* is meant for debugging. You can only run one job at a time; other jobs you submit will remain in the queue. The time limit is 12 hours.

The default memory per core is 2GB. If you need more or less, please specify it with the --mem or --mem-per-cpu option.
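
As a rough illustration of the memory options, the following batch script sketch asks for 4 cores with 4000 MB each. The numbers are example values, and total_mem_mb is a hypothetical helper that only shows the resulting total; it is not part of slurm.

```shell
#!/bin/bash
#SBATCH -n 4                  # 4 cores (example value)
#SBATCH --mem-per-cpu=4000    # 4000 MB per core instead of the 2GB default

# Hypothetical helper: total memory in MB for a given
# number of cores ($1) and memory per core in MB ($2).
total_mem_mb() {
    echo $(( $1 * $2 ))
}

# Outside of slurm, SLURM_NTASKS is unset and the fallback of 4 is used.
echo "Requested memory: $(total_mem_mb "${SLURM_NTASKS:-4}" 4000) MB"
```

Use --mem instead if you would rather state the total memory per node than the memory per core.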

We have also set up a scheduler that goes beyond first come, first served: some jobs will be favoured over others depending on how much you or your group have been using euclides in the past 2 weeks, how long the job has been queued and how many resources it will consume.

This serves as a starting point; we may have to adjust parameters once the slurm job manager is in use. Job scheduling is a complex issue and we still need to build expertise and gain experience with what the user needs in our groups are. Please feel free to speak out if there is something that can be improved without creating an unfair disadvantage for other users.

You can run interactive jobs on both partitions.

h3. Running an interactive job with slurm (a.k.a. logging in)

To run an interactive job with slurm in the default partition, use

<pre>
srun -u --pty bash
</pre>

If you want to use tcsh, use

<pre>
srun -u --pty tcsh
</pre>

If you need more memory per core (here 8000 MB), use

<pre>
srun -u --mem-per-cpu=8000 --pty tcsh
</pre>

In case you want to open x11 applications, use the --x11=first option, e.g.
<pre>
srun --x11=first -u --pty bash
</pre>

If the 'normal' partition is overcrowded, you can use the 'debug' partition instead:
<pre>
srun --account cosmo_debug -p debug -u --pty bash   # if you are part of the Cosmology group
srun --account euclid_debug -p debug -u --pty bash  # if you are part of the EuclidDM group
</pre>

As soon as a slot is open, slurm will log you in to an interactive session on one of the nodes.

h3. limited ssh access

If you have an active job (batch or interactive), you can log in to the node the job is running on. Your ssh session will be killed when the job terminates. Your ssh session will be restricted to the same resources as your job (so you cannot accidentally bypass the job scheduler and harm other users' jobs).
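
One possible way to find the node to connect to is sketched below. It assumes squeue is on your PATH (as it should be on alexandria); list_my_job_nodes is a hypothetical helper name, and euclides3 in the final comment is only an example node.

```shell
#!/bin/bash
# Hypothetical helper: print "<jobid> <node>" for each of your own jobs.
# -h drops the header line; %i is the job id and %N the node list.
list_my_job_nodes() {
    if command -v squeue >/dev/null 2>&1; then
        squeue -u "$(id -un)" -h -o "%i %N"
    else
        echo "squeue not found; run this on alexandria" >&2
        return 1
    fi
}

list_my_job_nodes || true

# Then connect to the node printed above, for example:
#   ssh euclides3
```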

h3. Running a simple one core batch job with slurm using the default partition

* To see what queues are available to you (called partitions in slurm), run:
<pre>
sinfo
</pre>

* To run a job, create a file myjob.slurm containing the following information:
<pre>
#!/bin/bash
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --mail-user <put your email address here>
#SBATCH --mail-type=BEGIN
#SBATCH -p normal

/bin/hostname
</pre>

* To submit a batch job use:
<pre>
sbatch myjob.slurm
</pre>

* To see the status of your job, use
<pre>
squeue
</pre>

* To kill a job use:
<pre>
scancel <jobid>
</pre>
You can get the jobid from squeue.

* For some more information on your job use
<pre>
scontrol show job <jobid>
</pre>
You can get the jobid from squeue.

h3. Running a simple one core batch job with slurm using the debug partition

Change the partition to debug and add the appropriate account, depending on whether you are part of the euclid or cosmology group. Replace [cosmo_debug/euclid_debug] below with the one account that applies to you.

<pre>
#!/bin/bash
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --mail-user <put your email address here>
#SBATCH --mail-type=BEGIN
#SBATCH --account [cosmo_debug/euclid_debug]
#SBATCH -p debug

/bin/hostname
</pre>

h3. Accessing a node where a job is running or starting additional processes on a node

You can attach an srun command to an already existing job (batch or interactive). This means you can start an interactive session on a node where a job of yours is running or start an additional process.

First determine the jobid of the desired job using squeue, then use

<pre>
srun --jobid <jobid> [options] <executable>
</pre>
Or, more concretely:
<pre>
srun --jobid <jobid> -u --pty bash  # to start an interactive session
srun --jobid <jobid> ps -eaFAl      # to get detailed process information
</pre>

The processes will only run on cores that have been allocated to you. This works for batch as well as interactive jobs.
*Important: If the original job that was submitted is finished, any process attached in this fashion will be killed.*

h3. Batch script for running a multi-core job

mpi is installed on alexandria.

To run a 4 core job for an executable compiled with mpi you can use
<pre>
#!/bin/bash
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --mail-user <put your email address here>
#SBATCH --mail-type=BEGIN
#SBATCH -n 4

mpirun <programname>
</pre>
and it will automatically start on the number of cores specified.

To ensure that the job is being executed on only one node, add
<pre>
#SBATCH -N 1
</pre>
to the job script.

If you would like to run a program that itself starts processes, you can use the environment variable $SLURM_NPROCS, which is automatically defined for slurm jobs, to explicitly pass the number of cores the program can run on.
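
A minimal sketch of passing $SLURM_NPROCS on to such a program is shown below. Here run_workers is a hypothetical stand-in for your own program, and the fallback to 1 core outside of slurm is an assumption made so the script can also be tried without submitting it.

```shell
#!/bin/bash
#SBATCH -n 4    # number of cores (example value)
#SBATCH -N 1    # keep all cores on one node

# Number of cores granted by slurm; falls back to 1 when run outside a job.
ncores="${SLURM_NPROCS:-1}"

# Hypothetical stand-in for a program that starts its own worker processes.
# It starts as many background workers as the number passed in $1.
run_workers() {
    local n="$1"
    local i
    for i in $(seq 1 "$n"); do
        echo "worker $i of $n" &
    done
    wait    # do not return before all workers have finished
}

run_workers "$ncores"
```

In a real job you would replace the call to run_workers with your own executable and hand it the value of $ncores through whatever option it uses to set its process or thread count.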

To check if your job is actually running on the specified number of cores, you can check the PSR column of
<pre>
ps -eaFAl
# or ps -eaFAl | egrep "<yourusername>|UID" if you just want to see your jobs
</pre>

h3. environment for jobs

By default, slurm does not initialize the environment (using .bashrc, .profile, .tcshrc, ...)

To use your usual system environment, add the following line to the submission script:
<pre>
#SBATCH --get-user-env
</pre>

h2. desdb node

Some specific jobs in cosmodb, such as the "catalog ingest", need to be performed on the machines desdb1/2. For those jobs there is the slurm account "euclid_cat_ing" with the partition "cat_ing". Only selected persons from the Euclid group have access to this node. Please specify "-p cat_ing" and "--account euclid_cat_ing" on the command line or in the slurm script.
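
A batch script for this partition might look like the sketch below. Only the partition and account names come from the text above; the output file names and the payload command are placeholders.

```shell
#!/bin/bash
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH -p cat_ing
#SBATCH --account euclid_cat_ing

# Placeholder payload: print the name of the machine the job runs on.
uname -n
```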

h2. Software specific setup

h3. Python environment

You can use the python 2.7.3 installed on the euclides cluster by using

<pre>
source /data2/users/ccsoft/etc/setup_all
source /data2/users/ccsoft/etc/setup_python2.7.3
</pre>

h2. Notes For Euclid users

For those submitting jobs to the euclides* nodes through the Cosmo DM pipeline, here are some things that need to be specified for customized job submissions, since a different interface to slurm is used.

* To use more memory per block, specify max_memory = 6000 (for 6G) and so on inside the block definition, or in the submit file (in case you want to use it for all blocks).

* If you want to run on multiple nodes/cores, then use nodes='<number of nodes>:ppn=<number of cores>' inside the block definition of a particular block, or in the submit file in case you want to use it for all blocks.

* If you want to use a longer wall time, then specify wall_mod=<wall time in minutes> inside the module definition.

* Note that queue=serial does not work on alexandria (we usually use it for c2pap).

h1. Admin

There is a user "slurm", which however is not really necessary for the administration work. The slurm administrator needs sudo access. Some scripts for adding a user and similar things are in "/data1/users/slurm". With sudo access the admin can execute those scripts. In the mysql database there is the username "slurmdb" with password.

h2. Overview of users, accounts, etc.

No sudo access needed:
<pre>
/usr/local/bin/sacctmgr show account withassoc
</pre>

h2. Adding a new user

As root on @alexandria@,

<pre>
cd /data1/users/slurm/
./add_user.sh <UserName> <account>   # <account> is cosmo or euclid
/usr/local/bin/.scontrol reconfigure
</pre>

h2. To increase memory, cores etc for a user

The add_user.sh script contains various commands for changing user settings, where $1 is the first argument passed to the script, e.g.

<pre>
/usr/local/bin/sacctmgr -i modify user name=$1 set GrpCPUs=32
/usr/local/bin/sacctmgr -i modify user name=$1 set GrpMem=128000
</pre>

h2. Node state "drain"

When a node is shown in "drain" state by @sinfo@, run
<pre>
/usr/local/bin/scontrol update nodename=NODE_NAME state=resume
</pre>
to put it back into operation.
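
With several nodes affected, a small helper along the following lines could list the resume commands to run. This is only a sketch: is_drained and suggest_resume are hypothetical names, it assumes sinfo is on the PATH, and it prints the commands instead of executing them so that the admin can review them first.

```shell
#!/bin/bash
# Return success if a node state string as printed by sinfo denotes a
# drain state (e.g. "drained" or "draining", possibly with a trailing "*").
is_drained() {
    case "$1" in
        drain*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Print (not run) the resume command for every drained node.
# Prints nothing if sinfo is not available.
suggest_resume() {
    command -v sinfo >/dev/null 2>&1 || return 0
    # -h: no header, -N: one line per node, %N: node name, %T: node state
    sinfo -h -N -o "%N %T" | sort -u | while read -r node state; do
        if is_drained "$state"; then
            echo "/usr/local/bin/scontrol update nodename=$node state=resume"
        fi
    done
}

suggest_resume
```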

h2. Nodes down

Sometimes nodes are reported as "down". This seems to happen as a result of network problems. Here is some "troubleshooting":https://computing.llnl.gov/linux/slurm/troubleshoot.html#nodes for this situation. Also, after a reboot of alexandria some manual work on slurm might be necessary to get going again.