01. Slurm: Cluster Management and Job Scheduling System

Introduction

https://slurm.schedmd.com/overview.html

Overview
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
As a cluster workload manager, Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.

Architecture


[Figure: Slurm architecture diagram]

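Slurm uses a primary/worker architecture: one primary daemon (slurmctld) handles job management, while multiple nodes (each running slurmd) execute the compute tasks. The primary has an optional backup.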

Tutorial

https://slurm.schedmd.com/tutorials.html

Go straight to this document: https://www.open-mpi.org/video/slurm/Slurm_EMC_Dec2012.pdf

Concepts:

SLURM Entities

  • Jobs: Resource allocation requests
  • Job steps: Set of (typically parallel) tasks
  • Partitions: Job queues with limits and access controls
  • Nodes
    • NUMA boards
      • Sockets
        • Cores
          • Hyperthreads
      • Memory
      • Generic Resources (e.g. GPUs)
  • Users submit jobs to a partition (queue)
  • Jobs are allocated resources
  • Jobs spawn steps, which are allocated resources from
    within the job's allocation
  • Job States


    [Figure: job state diagram]
  • Linux Job Launch Sequence


    [Figure: Linux job launch sequence diagram]
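To make the entity hierarchy concrete, here is a minimal Python sketch of two ideas from the list above: a node's CPU count follows from its socket/core/hyperthread layout, and job steps draw resources from within the job's allocation. The `Node` and `Job` classes, their fields, and the node names are hypothetical illustrations for these notes, not Slurm's own data structures, and real allocations also track memory and generic resources.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One compute node, described by its socket/core/hyperthread layout."""
    name: str
    sockets: int
    cores_per_socket: int
    threads_per_core: int

    @property
    def cpus(self) -> int:
        # Logical CPUs = sockets x cores per socket x hyperthreads per core.
        return self.sockets * self.cores_per_socket * self.threads_per_core


@dataclass
class Job:
    """A resource allocation; steps take CPUs from it, never from outside it."""
    job_id: int
    partition: str
    nodes: List[Node]
    steps: Dict[int, int] = field(default_factory=dict)  # step id -> CPUs held
    next_step_id: int = 0

    @property
    def allocated_cpus(self) -> int:
        return sum(node.cpus for node in self.nodes)

    @property
    def free_cpus(self) -> int:
        return self.allocated_cpus - sum(self.steps.values())

    def launch_step(self, cpus: int) -> int:
        """Reserve CPUs for a new step and return its step id."""
        if cpus > self.free_cpus:
            raise ValueError(f"step needs {cpus} CPUs, only {self.free_cpus} free")
        step_id = self.next_step_id
        self.next_step_id += 1
        self.steps[step_id] = cpus
        return step_id

    def finish_step(self, step_id: int) -> None:
        """Release the CPUs held by a finished step."""
        del self.steps[step_id]


# Two hypothetical nodes: 2 sockets x 8 cores x 2 hyperthreads = 32 CPUs each.
job = Job(job_id=1, partition="debug",
          nodes=[Node("n1", 2, 8, 2), Node("n2", 2, 8, 2)])
job.launch_step(48)
print(job.allocated_cpus, job.free_cpus)  # 64 16
```

A second step asking for more than the 16 remaining CPUs raises `ValueError` here; that is just this sketch's way of showing that a step cannot exceed its job's allocation.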
Operations
Ways to run jobs
  • srun
    Create a job allocation (if needed) and launch
    a job step (typically an MPI job)
  • salloc
    Create job allocation and start a shell to use it
    (interactive mode)
  • sbatch
    Submit script for later execution (batch
    mode)
  • sattach
    Connect stdin/out/err for an existing job or
    job step
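As a rough illustration of batch mode, the Python sketch below assembles the text of a script that could be handed to sbatch. The `#SBATCH` options used (`--job-name`, `--partition`, `--nodes`, `--ntasks`, `--time`) are standard sbatch options; the helper name `build_batch_script`, the partition name `debug`, and the program `./hello_mpi` are made up for this example.

```python
import shlex
from typing import List


def build_batch_script(job_name: str, partition: str, nodes: int,
                       ntasks: int, time_limit: str, command: List[str]) -> str:
    """Return the text of a batch script suitable for `sbatch script.sh`."""
    lines = [
        "#!/bin/bash",
        # sbatch only reads #SBATCH directives placed before the first command.
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
        "",
        # Inside the allocation, srun launches the job step.
        "srun " + " ".join(shlex.quote(arg) for arg in command),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: 4 tasks on 2 nodes, 10-minute limit.
print(build_batch_script("demo", "debug", 2, 4, "00:10:00", ["./hello_mpi"]))
```

The same resource options can be passed on the command line instead, for example `salloc --nodes=2 --ntasks=4` for an interactive allocation, or `srun --nodes=2 --ntasks=4 ./hello_mpi` to allocate and launch in one go.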
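Other commands
  • sinfo: report the state of partitions and nodes
  • squeue: report the state of jobs and job steps
  • smap: show job, partition, and node state in a graphical (curses) view
  • sbcast: copy a file to the nodes allocated to a job
  • scancel: cancel or signal a pending or running job or job step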
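For scripting against the queue, squeue can print one job per line in a custom format, which is easy to parse. The sketch below assumes the command `squeue --noheader --format="%i|%P|%j|%u|%T|%D"` (job ID, partition, name, user, state, node count) and parses a made-up sample of its output instead of calling Slurm. The helper names and the sample rows are hypothetical, and the parser assumes job names contain no `|` character.

```python
from collections import Counter
from typing import Dict, List

# Format string for `squeue --noheader --format=...`:
# job ID | partition | job name | user | state | node count
SQUEUE_FORMAT = "%i|%P|%j|%u|%T|%D"
FIELDS = ["job_id", "partition", "name", "user", "state", "nodes"]


def parse_squeue(output: str) -> List[Dict[str, str]]:
    """Turn pipe-separated squeue output into a list of per-job dicts."""
    jobs = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        values = line.split("|")
        if len(values) != len(FIELDS):
            raise ValueError(f"unexpected squeue line: {line!r}")
        jobs.append(dict(zip(FIELDS, values)))
    return jobs


def count_by_state(jobs: List[Dict[str, str]]) -> Dict[str, int]:
    """Count jobs per state, e.g. RUNNING or PENDING."""
    return dict(Counter(job["state"] for job in jobs))


# Made-up sample output, not taken from a real cluster.
sample = """\
101|debug|hello|alice|RUNNING|2
102|debug|sim|bob|PENDING|4
103|batch|sim2|bob|PENDING|1
"""
print(count_by_state(parse_squeue(sample)))
```

On a real cluster, the sample string would be replaced by the captured standard output of the squeue command above.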
MPI support
  • Many different MPI implementations are supported:
    • MPICH1, MPICH2, MVAPICH, OpenMPI, etc.
  • Many use srun to launch the tasks directly
  • Some use “mpirun” or another tool within an existing SLURM allocation (they reference SLURM environment variables to determine what resources are allocated to the job)
  • Details are online:
    http://www.schedmd.com/slurmdocs/mpi_guide.html
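The point about environment variables can be sketched in Python: a launcher running inside an allocation can find out what it was given by reading variables that Slurm exports, such as `SLURM_JOB_ID`, `SLURM_JOB_NODELIST`, `SLURM_JOB_NUM_NODES`, `SLURM_NTASKS`, and `SLURM_PROCID`. The function name `slurm_context`, the fallback defaults, and the fake environment in the demo are assumptions of this sketch; real MPI launchers consult more variables than these.

```python
import os
from typing import Dict, Mapping, Optional, Union


def slurm_context(env: Mapping[str, str] = os.environ
                  ) -> Optional[Dict[str, Union[int, str]]]:
    """Return allocation details from Slurm's environment, or None outside a job."""
    job_id = env.get("SLURM_JOB_ID")
    if job_id is None:
        return None  # not running under Slurm
    return {
        "job_id": int(job_id),
        # Compact host list, e.g. "n[1-2]"; left unexpanded in this sketch.
        "nodelist": env.get("SLURM_JOB_NODELIST", ""),
        "num_nodes": int(env.get("SLURM_JOB_NUM_NODES", "1")),
        # The defaults below are this sketch's fallbacks, not Slurm behaviour.
        "ntasks": int(env.get("SLURM_NTASKS", "1")),
        "rank": int(env.get("SLURM_PROCID", "0")),
    }


# Fake environment for illustration; inside a real job, call slurm_context().
fake_env = {
    "SLURM_JOB_ID": "42",
    "SLURM_JOB_NODELIST": "n[1-2]",
    "SLURM_JOB_NUM_NODES": "2",
    "SLURM_NTASKS": "8",
    "SLURM_PROCID": "3",
}
print(slurm_context(fake_env))
```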
A release cadence worth borrowing

Continuous integration, with usable features released on a regular schedule.

  • New minor release about every 9 months
    • 2.4.x June 2012
    • 2.5.x December 2012
  • Micro releases with bug fixes about once each month
Build and installation

Slurm ships with its own test suite, which can be used for regression verification once installation is complete.

2019.12.14: finished reading the tutorial.
