Using SPIDER on Parallel Clustered Systems

This page illustrates the usage of some SPIDER operations that are helpful in creating and controlling the execution of multiple SPIDER jobs running in parallel on a loosely coupled, clustered parallel system.

Example: Alignment of Single Particles

b_master.pam

Master task. Started on one node only. Coordinates and synchronizes all tasks.

; ArDean Leith Nov 2000

; INPUT:
; x41 (Starting micrograph number)
; x42 (Ending micrograph number)

; OUTPUT:
; none

X41 = 16 ; starting micrograph number
X42 = 40 ; ending micrograph number

MD
TR OFF ; decrease output to results file
MD
VB OFF ; decrease output to results file
MD
SET MP ; use SMP on 2 processors per node
2

VM ; dir: out{..} NEEDED
mkdir out

x11=1
; create slave task for each micrograph
DO LB1 X77=X41,X42
; Create document file with register settings
; The doc. file can pass info to the slave tasks.
SD 41,x41
jnkdoc{***x77}.${DATEXT}
SD 42,x42
jnkdoc{***x77}.${DATEXT}
SD E
jnkdoc{***x77}.${DATEXT}

VM
spider pam/pre @b_align {***X77} X77={***X77}
LB1

VM
echo "b_master waiting for all alignments"

MY FL ; flush results

; wait for alignments to finish
@b_wait[x41,x42] ; pass starting and ending micrograph numbers

; Can carry out further consolidation of alignments here
EN
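
The {***x77} substitution in the loop above expands register x77 into a zero-padded three-digit number, so every group gets its own hand-off file (jnkdoc016 through jnkdoc040 for the register values shown). As a rough illustration outside SPIDER, the Python sketch below mimics that fan-out step. The function names, the "dat" extension, and the plain "key value" file layout are assumptions made for this example; they are not SPIDER's doc. file format, and the sketch does not launch any SPIDER tasks.

```python
import tempfile
from pathlib import Path
from typing import List


def handoff_name(group: int, ext: str = "dat") -> str:
    """Mimic jnkdoc{***x77}.${DATEXT}: group number zero-padded to 3 digits."""
    return f"jnkdoc{group:03d}.{ext}"


def write_handoffs(workdir: Path, first: int, last: int, ext: str = "dat") -> List[Path]:
    """Write one parameter file per group, as the DO LB1 loop does."""
    paths = []
    for group in range(first, last + 1):  # the DO loop includes both ends
        path = workdir / handoff_name(group, ext)
        # Hypothetical "key value" layout standing in for registers 41 and 42.
        path.write_text(f"41 {first}\n42 {last}\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = write_handoffs(Path(tmp), 16, 40)
        print(len(files), files[0].name, files[-1].name)
```

Writing a separate file per group keeps the tasks independent: no two tasks ever read or delete the same hand-off file.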

b_align.pam

Runs on each node except the master node. This procedure calls a SPIDER procedure, b_doalign.pam (not shown here), which carries out the actual alignment for this group. When the alignment is finished, this procedure creates a new doc. file, jnkdocparamout{***x77}, which signals b_master that it can continue.

; ArDean Leith Nov 2000

; INPUT:
; reg: 77 (group, on command line)
; jnkdoc{***grp} (doc file created by b_master)

; OUTPUT:
; jnkdocparamout{***x77} (signal file contains x11 & x47)

MD
TR OFF ; decrease output to results file
MD
VB OFF ; decrease output to results file
MD
SET MP ; use SMP on 2 processors per node
2

; Started by b_master
; retrieve registers stored in doc file: jnkdoc{***x77}
UD IC,41,X41
jnkdoc{***x77}
UD IC,42,X42
jnkdoc{***x77}

UD ICE
jnkdoc{***x77}

VM ; remove this sync. doc file
\rm -f jnkdoc{***x77}*

VM
date
VM
echo "starting group: {**x77}"

X11
MY FL ; flush results file
@p_doalign[X41,X42,X77] ; runs alignment for this group.

; Signal b_master to re-awaken now
; (b_master wakes when it sees jnkdocparamout{***x77})
SD 11,X11 ; set sync file output
jnkdocparamout{***x77}

SD E
jnkdocparamout{***x77}

VM
echo "ending group: {**x77}"
LB1

EN
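
The life cycle of one b_align task (read the hand-off file, delete it, run the alignment, then write the signal file) might be sketched in Python as follows. This is only an analogue under stated assumptions: run_group, the align callback, the "dat" extension, and the "key value" file layout are hypothetical. Writing to a temporary name and then renaming is an addition of this sketch, not something the SPIDER procedure above is known to do.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict


def run_group(workdir: Path, group: int,
              align: Callable[[float, float, int], None]) -> Path:
    """Process one group and leave a signal file for the master task."""
    params_path = workdir / f"jnkdoc{group:03d}.dat"

    # Read the registers stored by the master (hypothetical "key value" lines).
    params: Dict[int, float] = {}
    for line in params_path.read_text().splitlines():
        key, value = line.split()
        params[int(key)] = float(value)

    params_path.unlink()  # remove the hand-off file, like "rm -f jnkdoc..."

    align(params[41], params[42], group)  # stands in for the alignment call

    # Create the signal file only after the work is done. The rename keeps a
    # polling master from ever seeing a half-written file.
    signal = workdir / f"jnkdocparamout{group:03d}.dat"
    partial = signal.with_suffix(".tmp")
    partial.write_text("11 1\n")  # placeholder content for register 11
    partial.replace(signal)
    return signal


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        (work / "jnkdoc016.dat").write_text("41 16\n42 40\n")
        done = run_group(work, 16, lambda first, last, grp: None)
        print(done.name)
```

The ordering matters: the signal file must be the last thing the task produces, since its existence is the only message the master receives.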

b_wait.pam

b_master, running on the master node, calls this procedure after it has started the b_align tasks that carry out the alignment. When an alignment is finished, its b_align task creates a new doc. file: jnkdocparamout{***x77}. This procedure makes b_master wait until that file has been created by each of the b_align tasks.

[x12,x13]
; ArDean Leith Nov 2000

; Used in b_master. Waits for slaves to finish.

; INPUT:
; reg: 12 (starting group)
; reg: 13 (ending group)
; doc file: jnkdocparamout{***grp}*

; OUTPUT:
; none

; wait for all micrograph groups -------------
DO LB3 x77=x12,x13
MY FL ; flush results file
IQ SYNC
jnkdocparamout{***x77}
(10 36000)

VM
date

VM
echo "synced group: {**x77} "

DE
jnkdocparamout{***x77}

MY FL ; flush results file
LB3 ; end wait loop over groups -------

RE
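
The wait loop above amounts to polling for one signal file per group. A minimal Python analogue is sketched below, reading the "(10 36000)" input as a 10 second polling interval and a 36000 second maximum wait. That reading, the function names, the "dat" extension, and the choice to raise TimeoutError when the limit is reached are all assumptions of this sketch; what 'IQ SYNC' itself does on a timeout is not covered here.

```python
import tempfile
import time
from pathlib import Path
from typing import List


def wait_for_file(path: Path, interval: float = 10.0,
                  max_wait: float = 36000.0) -> bool:
    """Poll until path exists. Return False if max_wait seconds pass first."""
    deadline = time.monotonic() + max_wait
    while not path.exists():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True


def wait_for_groups(workdir: Path, first: int, last: int,
                    interval: float = 10.0, max_wait: float = 36000.0) -> List[int]:
    """Wait for each group's signal file in turn, then delete it."""
    synced = []
    for group in range(first, last + 1):  # both ends included, like DO LB3
        signal = workdir / f"jnkdocparamout{group:03d}.dat"
        if not wait_for_file(signal, interval, max_wait):
            raise TimeoutError(f"group {group:03d} did not finish in time")
        signal.unlink()  # mirrors the DE operation in the loop
        synced.append(group)
    return synced


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        for grp in (16, 17):
            (work / f"jnkdocparamout{grp:03d}.dat").write_text("11 1\n")
        print(wait_for_groups(work, 16, 17, interval=0.01, max_wait=1.0))
```

Waiting for the groups in numeric order costs nothing even when they finish out of order: a signal file that already exists is detected at once, so the total wait is set by the slowest task.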


Source: techs/parallel/parallel.html     Last update: 20 March 2001     ArDean Leith