Serial Codes Allocating Multiple CPU-Cores
This alert identifies users that are apparently running serial codes while allocating multiple CPU-cores. A serial code can only use 1 CPU-core so the additional allocated cores are wasted.
Configuration File
Below is an example entry for config.yaml
:
serial-allocating-multiple-1:
cluster: della
partitions:
- cpu
min_run_time: 61 # minutes
cpu_hours_threshold: 100 # cpu-hours
lower_ratio: 0.85 # [0.0, 1.0]
num_top_users: 5 # count
email_file: "serial_allocating_multiple.txt"
admin_emails:
- admin@institution.edu
The parameters are explained below:
-
cluster
: Specify the cluster name as it appears in the Slurm database. -
partitions
: Specify one or more Slurm partitions. -
cpu_hours_threshold
: A user must have apparently wasted this number of CPU-hours to be considered. -
lower_ratio
: This alert works by comparing the CPU efficiency of the job to 100% divided by the number of allocated CPU-cores. If the ratio of these two numbers of greater than or equal tolower_ratio
then the job is assumed to be a serial code wasting CPU-cores. For instance, if the CPU efficieny of a job that allocates 8 CPU-cores is 12.4% andlower_ratio
is 0.85 then this job would be included since 12.4% / (100% / 8) is greater than 0.85. Default: 0.85 -
email_file
: The text file to be used for the email message to users. -
min_run_time
: (Optional) Minimum run time of a job in units of minutes. Ifmin_run_time: 61
then jobs that ran for an hour or less are ignored. Default: 0 -
cores_per_node
: (Optional) Number of CPU-cores per node. If this is defined then the<NUM-NODES>
placeholder will be available. -
num_top_users
: (Optional) Only consider up to this number of users after sorting by "wasted CPU-hours". -
include_running_jobs
: (Optional) IfTrue
then jobs in a state ofRUNNING
will be included in the calculation. The Prometheus server must be queried for each running job, which can be an expensive operation. Default: False -
nodelist
: (Optional) Only apply this alert to jobs that ran on the specified nodes. See example. -
excluded_users
: (Optional) List of users to exclude from receiving emails. -
admin_emails
: (Optional) List of administrator email addresses that should receive copies of the emails that are sent to users. -
email_subject
: (Optional) Subject of the email message to users. -
report_title
: (Optional) Title of the report to system administrators.
Report for System Administrators
Below is an example report:
$ python job_defense_shield.py --serial-allocating-multiple
Serial Jobs Allocating Multiple CPU-Cores
------------------------------------------------------------------------
User CPU-Hours-Wasted AvgCores Jobs JobID Emails
------------------------------------------------------------------------
1 u74805 10248 8 96 62829596,62834509+ 2 (3)
2 u31448 906 30 2 62887370,62887407 0
3 u93676 783 3 20 62904738,62904739+ 1 (10)
4 u16959 325 15 6 62896573,62912383+ 4 (17)
5 u36725 164 12 11 62814801,62814802+ 1 (10)
------------------------------------------------------------------------
Cluster: della
Partitions: cpu
Start: Sun Mar 09, 2025 at 10:21 PM
End: Sun Mar 16, 2025 at 10:21 PM
Email Message to Users
Below is an example email (see email/serial_allocating_multiple.txt
):
Hello Alan (u12345),
Below are your jobs that ran on della (cpu) in the past 7 days:
JobID Partition CPU-Cores CPU-Util 100%/CPU-cores Hours
62599800 cpu 4 24.8% 25.0% 48
62599800 cpu 4 24.9% 25.0% 48
62695534 cpu 12 8.3% 8.3% 24
62719003 cpu 12 8.3% 8.3% 24
The CPU utilization (CPU-Util) of each job above is approximately equal to
100% divided by the number of allocated CPU-cores (100%/CPU-cores). This
suggests that you may be running a code that can only use 1 CPU-core. If this is
true then allocating more than 1 CPU-core is wasteful. A good target value for
CPU utilization is 90% and above.
If the code cannot run in parallel then please use the following Slurm
directives:
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
Replying to this automated email will open a support ticket with Research
Computing.
Placeholders
The following placeholders can be used in the email file:
<GREETING>
: The greeting generated bygreeting-method
.<CLUSTER>
: The cluster specified for the alert.<PARTITIONS>
: A comma-separated list of the partitions used by the user.<CASE>
: The rank of the user by CPU-hours.<DAYS>
: Number of days in the time window (default is 7).<CPU-HOURS>
: Total number of CPU-hours wasted if the jobs were in fact running serial codes. This quantity is the elapsed time of the job multiplied by the number of allocated core minus one (summed over all jobs).<NUM-JOBS>
: Total number of jobs with at least one idle GPU.<TABLE>
: Table of job data.<JOBSTATS>
: Thejobstats
command for the first job of the user.
If cores_per_node
is defined in the alert then one additional placeholder is available:
<NUM-NODES>
: Number of nodes wasted. This is calculated as the wasted CPU-hours divided by the product of the number of cores per node and the number of hours in the time window. One strategy is to setcpu_hours_threshold
large enough so that<NUM-NODES>
is always greater than or equal to 1. Note that theround
function is applied when computing this quantity.
Usage
Generate a report for system administrators:
Send emails to the offending users:
See which users have received emails and when:
cron
Below is an example crontab
entry: