
Schedulers

Evolution:


### redo the slide with slurm

Structure:  

Job submission:
qsub -> pbs_server -> pbs_sched -> pbs_mom -> code runs -> acks sent in reverse

Options inside job file:
    #PBS -flag value
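
As an illustration of the `#PBS -flag value` form, here is a minimal sketch of a job file. The job name, resource amounts, and the queue name `batch` are assumptions chosen for the example, not values from this course's cluster. Because `#PBS` lines are ordinary shell comments, the script also runs outside the scheduler, where it falls back to defaults for the PBS environment variables.

```shell
#!/bin/bash
# Job name shown by qstat (hypothetical name).
#PBS -N hello_job
# Resource request: 1 node, 2 processors per node (assumed values).
#PBS -l nodes=1:ppn=2
# Wall-clock limit of 10 minutes (assumed value).
#PBS -l walltime=00:10:00
# Destination queue; "batch" is an assumption, use a queue defined at your site.
#PBS -q batch
# Merge stderr into the stdout file.
#PBS -j oe

# PBS_O_WORKDIR is the directory qsub was invoked from; fall back to the
# current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# PBS_JOBID is set by the scheduler; "unset" marks a run outside PBS.
msg="Job ${PBS_JOBID:-unset} running on $(uname -n)"
echo "$msg"
```

Such a file would be submitted with `qsub hello.sh` (a hypothetical file name). Options given on the `qsub` command line take precedence over the matching `#PBS` lines.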

Server config: /var/spool/torque

Node config: /var/spool/torque/mom_priv/config
    $pbsserver server
    $status_update_time 30
    $logevent 255

Utilities:
momctl     manage/diagnose MOM (node execution) daemon
pbsdsh     launch tasks within a parallel job
pbsnodes   view/modify batch status of compute nodes
qalter     modify queued batch jobs
qdel       delete/cancel batch jobs
qhold      hold batch jobs
qmgr       manage policies and other batch configuration
qrerun     rerun a batch job
qrls       release batch job holds
qrun       start a batch job
qsig       send a signal to a batch job
qstat      view queues and jobs
qsub       submit jobs
qterm      shutdown pbs server daemon
tracejob   trace job actions and states recorded in TORQUE logs
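
A few of these utilities can be strung together to walk one job through its lifecycle. The following is only a sketch: `job.sh` is a hypothetical job file, `run_demo` is a helper name made up for this example, and the script skips everything if the TORQUE client commands are not installed.

```shell
#!/bin/bash
# Sketch: submit a job, inspect it, hold it, release it, then delete it.
# run_demo and job.sh are hypothetical names used only for illustration.

run_demo() {
    script="$1"
    # qsub prints the ID of the newly queued job on stdout.
    jobid=$(qsub "$script") || return 1
    qstat -f "$jobid"    # full status listing for this job
    qhold "$jobid"       # place a hold so the job will not be started
    qrls "$jobid"        # release the hold again
    qdel "$jobid"        # cancel the job
}

if command -v qsub >/dev/null 2>&1; then
    if run_demo "${1:-job.sh}"; then
        status="done"
    else
        status="failed"
    fi
else
    # No TORQUE client tools on this machine: nothing is submitted.
    status="skipped"
fi
echo "demo status: $status"
```

On a node without TORQUE this prints `demo status: skipped`. Afterwards, `tracejob` with the job ID can be used on the server to follow what the logs recorded for the job.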

Monitoring:
xpbs       GUI front end to the PBS user commands
xpbsmon    GUI for monitoring the status of nodes under PBS


Cloud Computing

Containerized workloads and services: use OS-level virtualization
Clustering and job scheduling:



Other Clustering Techniques
Hadoop:

