Partitions

A partition is a logical collection of nodes with specific hardware resources and limits, designed to accommodate the wide variety of jobs scheduled on the cluster. Occasionally, the Research Computing team may need to update the partitions, based on monitoring of job submissions, to help reduce job wait times. As our cluster grows, changes to the partitions also help ensure the fair, efficient distribution of resources for all jobs submitted to the cluster.

On Explorer, there are several partitions available to users.

  • General Access: debug, express, short, gpu-short, gpu, gpu-interactive
  • Application Only: long, large, multigpu
  • Courses: Dedicated to the use of students and instructors for the duration of a course.
  • PI Owned: Private partitions that include hardware that a PI owns; can only be accessed by members of the PI’s group.

General and Application Only Partition Information

The general access and application only partitions span the hardware on the cluster, with gpu-short, gpu, and multigpu spanning the GPUs on the cluster and the other partitions spanning the CPUs. For example, if you use the debug partition you are using the same hardware as short, but with different time, job, and core limits. Refer to the tables below for detailed information on the current partitions.

In the following tables, the Running Jobs, Submitted Jobs, Core, and RAM limits are set per user, across all running jobs (not pending jobs). Please keep in mind that the number of running jobs is also limited by the resources available on the cluster at the time of submission, so you may see fewer jobs running than the limits stated below.

General Access Partitions:

| Name | Requires Approval? | Time Limit (Default/Max) | Running Jobs | Submitted Jobs | Core Limit | RAM Limit | Use Case |
| --- | --- | --- | --- | --- | --- | --- | --- |
| debug | No | 20 minutes/20 minutes | 10 | 1000 | 128 | 256GB | Serial and parallel jobs that can run under 20 minutes. Good for testing code. |
| express | No | 30 minutes/60 minutes | 50 | 1000 | 2048 | 25TB | Serial and parallel jobs that can run under 60 minutes. |
| short | No | 4 hours/48 hours | 50 | 1000 | 1024 | 25TB | Serial or small parallel jobs (--nodes=2 max) that need to run for up to 48 hours. |
| long | Yes | 1 day/5 days | 25 | 1000 | 1024 | 25TB | Serial or parallel jobs that need to run for more than 24 hours. You need to prove that your code cannot checkpoint to use this partition. |
| large | Yes | 6 hours/6 hours | 100 | 1000 | N/A | N/A | Parallel jobs that can efficiently use more than 2 nodes. You need to demonstrate that your code is optimized for running on more than 2 nodes. |
| sharing | No | 30 minutes/60 minutes | 2 | 4 | N/A | N/A | Ideal for short running jobs that do not require more than 2 nodes. |
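
For example, a quick interactive test on the debug partition could be started with srun as in the sketch below; the resource and time requests are illustrative and should be adjusted to your code:

```bash
# Open a 15-minute interactive shell on the debug partition to test code
# (stays within the partition's 20-minute limit).
srun --partition=debug --nodes=1 --ntasks=1 --time=00:15:00 --pty /bin/bash
```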

GPU Partitions:

| Name | Requires Approval? | Time Limit (Default/Max) | Running Jobs | Submitted Jobs | GPU Limit | Use Case |
| --- | --- | --- | --- | --- | --- | --- |
| gpu-short | No | 1 hour/2 hours | 2 | 4 | 1 | Ideal for quick-running batch jobs. The priority is higher than the gpu partition, reducing the wait time for users. |
| gpu-interactive | No | 1 hour/2 hours | 2 | 4 | 1 | Open OnDemand and interactive sessions. You can reserve and run GPU jobs (e.g., debugging GPU applications/routines) on the gpu-interactive partition via srun sessions through the terminal directly. |
| gpu | No | 4 hours/8 hours | 4 | 8 | 1 | Jobs that can run on a single GPU. |
| multigpu | Yes | 12 hours/24 hours | 4 | 8 | 8 | Jobs that require more than one GPU and take up to 24 hours to run. |
| sharing | No | 30 minutes/60 minutes | 2 | 4 | 1 | Ideal for quick-running batch jobs. |
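
As an illustration, an interactive GPU session on the gpu-interactive partition can be started with srun; the generic GPU request (--gres=gpu:1) is shown here as an assumption, and your cluster may expect a specific GPU type:

```bash
# Request one GPU for one hour and open an interactive shell.
# --gres=gpu:1 is the generic Slurm GPU request; check the RC documentation
# if a specific GPU type must be named.
srun --partition=gpu-interactive --nodes=1 --ntasks=1 \
     --gres=gpu:1 --time=01:00:00 --pty /bin/bash
```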

Please Note: You might see the message job violates accounting/QOS policy in the output of squeue -u $USER even if you have submitted fewer jobs than the Submitted Jobs limit listed in the tables above. This means that Slurm has reached its hard-coded limit of 10,000 total jobs per account at that time, where the account is the Slurm account that you, as a cluster user, are associated with. You will continue to receive this message until some of your jobs start running or complete, at which point you can submit more jobs.
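
To see the reason Slurm reports for your pending jobs, you can ask squeue to print the reason column; this uses standard squeue output formatting, and the column widths below are arbitrary:

```bash
# List your pending jobs along with the reason Slurm gives for holding them,
# e.g., an accounting/QOS limit.
squeue -u $USER --states=PENDING --format="%.12i %.10P %.20j %.8T %r"
```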

Viewing Partition Information

Slurm commands allow you to view information about the partitions. Three commands that can show you partition information are sinfo, sacct, and scontrol. The following are common options to use with these commands:

| Command | Explanation |
| --- | --- |
| `sinfo -p <partition name>` | Displays the state of the nodes on a specific partition. |
| `sinfo -p <partition name> --Format=time,nodes,cpus,socketcorethread,memory,nodeai,features` | Displays more detailed information using the --Format option, including features such as the type of processors. |
| `sacct --partition=<partition name>` | Displays the jobs that have been run on this partition. |
| `scontrol show partition <partition name>` | Displays the Slurm configuration of the partition. |
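
For example, to inspect the short partition (any of the partition names listed above can be substituted):

```bash
# Show the state of the nodes in the short partition
sinfo -p short

# Show more detail, including node features such as processor type
sinfo -p short --Format=time,nodes,cpus,socketcorethread,memory,nodeai,features

# Show jobs that have run on the short partition (sacct defaults to jobs
# since midnight; add --starttime to look further back)
sacct --partition=short

# Show the Slurm configuration of the short partition
scontrol show partition short
```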

Allocating Partitions in your Jobs

To specify a partition when running jobs, use the option --partition=<partition name> with either srun or sbatch. When using a partition with your job and specifying the options of --nodes= and --ntasks= and/or --cpus-per-task=, make sure that you are requesting options that best fit your job.
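
For example, a minimal sketch of a batch script that selects the short partition is shown below; the job name, time request, and program name are placeholders to adapt to your own work:

```bash
#!/bin/bash
#SBATCH --partition=short      # partition to run the job in
#SBATCH --nodes=1              # a single node
#SBATCH --ntasks=1             # a single task
#SBATCH --time=01:00:00        # requested walltime, within the partition limit
#SBATCH --job-name=example     # placeholder job name

# Replace with your own program; ./my_program is a placeholder.
./my_program
```

Submit the script with sbatch (for example, sbatch myjob.sh), or pass the same options directly to srun for an interactive run.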

Please Note: Requesting the maximum number of nodes or tasks will not make your job run faster or give it higher priority in the job queue. It can actually have the opposite effect for jobs better suited to smaller requests, because the job must wait for extra resources that it will not use.

You should always craft job requests that allocate the resources best suited to the job you want to run. For example, if you are running a job that is not parallelized, you only need to request one node (--nodes=1). For some parallel jobs, such as a small MPI job, you can also use one node (--nodes=1) with the --ntasks= option set to the number of MPI ranks (tasks) in your code. For example, for a job that has 12 MPI ranks, request 1 node and 12 tasks within that node (--nodes=1 --ntasks=12). If you instead request 12 nodes, Slurm will spread the tasks across those nodes, which can slow your job down significantly if it is not optimized to run across nodes.
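
A sketch of the small MPI case described above, assuming a module-based MPI installation; the module name and executable are placeholders:

```bash
#!/bin/bash
#SBATCH --partition=short
#SBATCH --nodes=1              # keep all ranks on one node
#SBATCH --ntasks=12            # one task per MPI rank

# "openmpi" and ./my_mpi_app are placeholders for your MPI stack and program.
module load openmpi
srun ./my_mpi_app              # srun starts one process per requested task
```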

If your code is optimized to run on more than two nodes and needs less than one hour to run, you can use the express partition. If your code needs to run on more than 2 nodes for more than one hour, you should apply to use the large partition. See the section Partition Access Request below for more information.

Partition Access Request

If you need access to the large, long, or multigpu partition, you need to submit a Partition Access Request; please note that access is not automatically granted. You will need to provide details and test results that demonstrate your need for access to these partitions.

If you need temporary access to multigpu to perform testing before applying for permanent access, you should also submit the Partition Access Request. All requests are evaluated by members of the RC team. Once your request is reviewed and approved, you will be granted access to that partition.