MSI Queues
Publicly available queues at MSI
System | Queue | Nodes | Cores/node | Mem/node (GB) | Walltime (hours) | Running jobs |
---|---|---|---|---|---|---|
lab | lab (default) | 73 | 8-32 | 15-128 | 72 | 6 |
lab | lab-long | ? | 8 | 15 | 150 | 6 |
lab | lab-600 | ? | 8 | 128 | 600 | 1 |
lab | oc | 4 | 12 | 23 | 72 | 3 |
Itasca | batch (default) | 1086 | 8 | 22 | 24 | 2 |
Itasca | devel | 32 | 8 | 22 | 2 | ? |
Itasca | long | 28 | 8 | 22 | 48 | ? |
Itasca | jay | 1 | 8 | 22 | 24 | ? |
Itasca | sb | 35 | 16 | 64 | 48 | 2 |
Itasca | sb128 | 8 | 16 | 128 | 96 | 2 |
Itasca | sb256 | 8 | 16 | 256 | 96 | 2 |
Mesabi | small (default) | 256 | 24 | 64 | 96 | ? |
Mesabi | large (default) | 360 | 24 | 64 | 24 | ? |
Mesabi | ram256g | 32 | 24 | 256 | 96 | ? |
Mesabi | ram1t | 16 | 32 | 1000 | 96 | ? |
Mesabi | k40 | 40 | 24 | 128 | 24 | ? |
Notes:
- Jobs cannot request more than 1 node on the lab queue
- 8 nodes in the lab (default) queue have 128G of memory and 16-32 cores per node; the remaining 65 nodes have 15G of memory and 8 cores per node
- Mesabi’s default queue automatically routes jobs requesting 10 or more nodes to the “large” queue, and smaller jobs to the “small” queue.
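The routing rule in the last note can be sketched in a few lines of Python. This is only an illustration of the node-count threshold described above, not MSI's scheduler configuration; the function name `route_mesabi_default` is hypothetical.

```python
def route_mesabi_default(nodes: int) -> str:
    """Return the queue a job sent to Mesabi's default queue would land in.

    Assumption: the node count is the only routing criterion, as stated
    in the note above. The function name is made up for illustration.
    """
    if nodes < 1:
        raise ValueError("a job must request at least one node")
    # Jobs requesting 10 or more nodes go to "large", the rest to "small".
    return "large" if nodes >= 10 else "small"


for n in (1, 9, 10, 64):
    print(n, "->", route_mesabi_default(n))
```

The threshold sits exactly at 10 nodes, so a 9-node job and a 10-node job end up in different queues with different walltime limits (96 and 24 hours in the table).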
Lab Cluster Queues
The Lab Cluster is the oldest, slowest general-purpose cluster at MSI and is not considered a high-performance system; your jobs will run much faster on Itasca or Mesabi. All queues on the Lab cluster allow only single-node jobs, but each user can run up to six jobs simultaneously (fewer on the lab-600 and oc queues; see the table above). The “isub” command launches interactive jobs on this cluster, which is useful for general-purpose command-line work.
- lab (default)
- This is the main queue on the Lab cluster. Many nodes are available, but most are small. Jobs requesting 15G of memory or less and 8 cores or fewer will have much shorter wait times in the queue than jobs requesting more than 15G of memory or more than 8 cores
- lab-long
- Useful if your job needs more than 72 hours of walltime
- lab-600
- Useful if your job needs more than 150 hours of walltime
- oc
- A queue for four overclocked nodes, particularly useful for serial (single-core) jobs. Also good for general use, since these nodes are much newer than the other Lab nodes
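The walltime guidance for the lab, lab-long, and lab-600 queues can be written as a small lookup. The sketch below assumes walltime is the only criterion and uses the limits from the table at the top of this page; the helper name `pick_lab_queue` is hypothetical, and the oc queue is left out because it is chosen for its hardware rather than its walltime.

```python
# Walltime limits in hours, copied from the table at the top of this page.
LAB_WALLTIME_LIMITS = [("lab", 72), ("lab-long", 150), ("lab-600", 600)]


def pick_lab_queue(walltime_hours: float) -> str:
    """Return the first Lab queue whose walltime limit covers the request.

    Assumption: walltime is the only criterion. Memory, core counts, and
    the oc queue are deliberately ignored in this sketch.
    """
    if walltime_hours <= 0:
        raise ValueError("walltime must be positive")
    for queue, limit in LAB_WALLTIME_LIMITS:
        if walltime_hours <= limit:
            return queue
    raise ValueError(f"no Lab queue allows {walltime_hours} hours of walltime")


for hours in (24, 100, 300):
    print(hours, "hours ->", pick_lab_queue(hours))
```

Checking the shortest limit first keeps jobs on the main queue whenever they fit, which matches the advice to use the longer queues only when the walltime is actually needed.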
Itasca Queues
Itasca is the second-tier general-purpose cluster at MSI, slower than Mesabi but faster than the Lab Cluster.
- batch (default)
- This is the main queue on Itasca. A huge number of nodes are available, but each node is not particularly powerful. Great for jobs that can make use of many nodes, and for general use
- devel
- This queue is for testing your PBS scripts. It works just like the batch queue, but you are limited to 2 hours of walltime and 32 nodes. The advantage is that jobs on this queue have high priority, so they should start very quickly
- long
- Useful if your job needs more than 24 hours of walltime
- jay
- A queue for a special high-performance node with a high-speed internet connection
- sb
- Sandybridge queue (64G memory). The Sandybridge nodes are much more powerful than the standard Itasca nodes in the batch queue, but there aren’t very many of them. Great for single-node and smaller multi-node jobs. Large multi-node jobs (more than 6 nodes) tend to have long wait times in the queue. This queue has roughly four times as many nodes as the sb128 and sb256 queues, so use it unless you need more than 64G of memory
- sb128
- Sandybridge queue (128G memory). Refer to sb queue
- sb256
- Sandybridge queue (256G memory). Refer to sb queue
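As a rough illustration of how the Itasca limits translate into a job script, the sketch below checks a request against the table values and prints matching `#PBS` lines. It assumes Torque-style directives (`-q` for the queue, `-l nodes=...:ppn=...,walltime=...`), that full nodes are requested, and that the Nodes column can be treated as an upper bound on a single request; `ITASCA_LIMITS` and `pbs_header` are hypothetical names, and the exact syntax MSI expects should be checked before use.

```python
# Per-queue limits for Itasca, taken from the table at the top of this page:
# queue -> (nodes in the queue, cores per node, max walltime in hours)
ITASCA_LIMITS = {
    "batch": (1086, 8, 24),
    "devel": (32, 8, 2),
    "long": (28, 8, 48),
    "jay": (1, 8, 24),
    "sb": (35, 16, 48),
    "sb128": (8, 16, 96),
    "sb256": (8, 16, 96),
}


def pbs_header(queue: str, nodes: int, walltime_hours: int) -> str:
    """Build Torque-style #PBS directives after checking the queue limits.

    Assumptions: full nodes are requested (ppn = cores per node), the node
    count of the queue is used as an upper bound, and the directive syntax
    follows common Torque/PBS usage rather than a confirmed MSI template.
    """
    max_nodes, cores, max_hours = ITASCA_LIMITS[queue]
    if not 1 <= nodes <= max_nodes:
        raise ValueError(f"{queue} allows 1 to {max_nodes} nodes")
    if not 1 <= walltime_hours <= max_hours:
        raise ValueError(f"{queue} allows 1 to {max_hours} hours of walltime")
    return "\n".join([
        f"#PBS -q {queue}",
        f"#PBS -l nodes={nodes}:ppn={cores},walltime={walltime_hours:02d}:00:00",
    ])


# A short test run on the devel queue: 2 nodes for 1 hour.
print(pbs_header("devel", 2, 1))
```

Validating before submission catches requests such as a 3-hour job on devel, which exceeds that queue's 2-hour limit.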
Mesabi Queues
Mesabi is the newest and fastest general-purpose cluster at MSI, and it also contains some specialized hardware. MSI has not determined how queues will be set up on the new Mesabi system. However, it is likely that the queue structure will be derived from the hardware structure:
- small (default)
- This is the main queue on Mesabi for jobs requesting fewer than 10 nodes. Jobs may request partial nodes (like on the lab queue). A large number of powerful nodes are available. Great for smaller multi-node jobs or for large numbers of small jobs (including single-node, single-core jobs)
- large (default)
- This is the main queue on Mesabi for jobs requesting 10 or more nodes. A large number of powerful nodes are available. Great for jobs that can make use of many nodes.
- ram256g (mid-mem)
- Queue for general-purpose 256G memory nodes
- ram1t (high-mem)
- Queue for general-purpose 1T memory nodes. These nodes have 32 cores per node, so they are also good for jobs that scale well across multiple cores, but can’t make use of multiple nodes.
- k40 (gpu)
- Queue for nodes with NVIDIA Tesla GPUs, useful for jobs running software capable of using GPU accelerators
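Since the Mesabi queues follow the hardware, choosing one mostly comes down to memory per node, GPU need, and node count. The sketch below encodes that reading of the list above. It assumes the Mem/node values in the table are hard upper bounds and that GPU jobs always go to k40; the function name `pick_mesabi_queue` is hypothetical, and the final queue setup may differ.

```python
def pick_mesabi_queue(nodes: int, mem_gb_per_node: int, needs_gpu: bool = False) -> str:
    """Suggest a Mesabi queue from node count, memory per node, and GPU need.

    Assumptions: the Mem/node values in the table (64, 256, and 1000 GB,
    plus 128 GB on k40 nodes) are hard upper bounds, and every GPU job
    goes to k40. This is an illustration, not MSI's actual policy.
    """
    if nodes < 1 or mem_gb_per_node < 1:
        raise ValueError("nodes and memory must be positive")
    if needs_gpu:
        if mem_gb_per_node > 128:
            raise ValueError("k40 nodes have 128 GB of memory")
        return "k40"
    if mem_gb_per_node > 1000:
        raise ValueError("no Mesabi node has more than 1000 GB of memory")
    if mem_gb_per_node > 256:
        return "ram1t"
    if mem_gb_per_node > 64:
        return "ram256g"
    # Standard-memory jobs are split by node count between small and large.
    return "large" if nodes >= 10 else "small"


print(pick_mesabi_queue(nodes=2, mem_gb_per_node=200))
```

Memory is checked before node count because only the standard 64 GB nodes are split into small and large queues.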