CPU Job Statistics
Slurm has to be configured to track job accounting data via the cgroup plug-in. This requires the following line in slurm.conf:
The above is in addition to the other usual cgroup-related plug-ins/settings:
Slurm will then create two top-level cgroup directories for each job, one for CPU utilization and one for CPU memory. Within each directory there will be subdirectories: step_extern, step_batch, step_0, step_1, and so on. Within these directories one finds task_0, task_1, and so on. These cgroups are scraped by a cgroup exporter. The table below lists all of the collected fields:
Name | Description | Type |
---|---|---|
cgroup_cpu_system_seconds | Cumulative CPU system seconds for jobid | gauge |
cgroup_cpu_total_seconds | Cumulative CPU total seconds for jobid | gauge |
cgroup_cpu_user_seconds | Cumulative CPU user seconds for jobid | gauge |
cgroup_cpus | Number of CPUs in the jobid | gauge |
cgroup_memory_cache_bytes | Memory cache used in bytes | gauge |
cgroup_memory_fail_count | Memory fail count | gauge |
cgroup_memory_rss_bytes | Memory RSS used in bytes | gauge |
cgroup_memory_total_bytes | Memory total given to jobid in bytes | gauge |
cgroup_memory_used_bytes | Memory used in bytes | gauge |
cgroup_memsw_fail_count | Swap fail count | gauge |
cgroup_memsw_total_bytes | Swap total given to jobid in bytes | gauge |
cgroup_memsw_used_bytes | Swap used in bytes | gauge |
cgroup_uid | UID number of user running this job | gauge |
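As an illustration of how these fields can be combined, the following minimal sketch computes a CPU efficiency from cgroup_cpu_total_seconds and cgroup_cpus. The elapsed run time is not one of the collected fields; it is assumed here to come from elsewhere (for example, Slurm accounting), and the function name is hypothetical.

```python
def cpu_efficiency(cpu_total_seconds, cpus, elapsed_seconds):
    """Return the fraction of allocated CPU time that the job actually used.

    cpu_total_seconds: value of cgroup_cpu_total_seconds for the job
    cpus:              value of cgroup_cpus for the job
    elapsed_seconds:   wall-clock run time (assumed to be known from another source)
    """
    if cpus <= 0 or elapsed_seconds <= 0:
        raise ValueError("cpus and elapsed_seconds must be positive")
    # Allocated CPU time is the number of CPUs multiplied by the elapsed time.
    return cpu_total_seconds / (cpus * elapsed_seconds)


# A job on 4 CPUs that ran for one hour and accumulated 7200 CPU seconds.
print(cpu_efficiency(7200.0, 4, 3600.0))
```

A value well below 1.0 suggests that the job did not keep its allocated CPUs busy.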
The cgroup exporter used here is based on the exporter by Trey Dock [1] with additional parsing of the jobid, steps, tasks and UID number. This produces an output that resembles (e.g., for system seconds):
Note that the UID of the owning user is stored as a gauge in cgroup_uid:
This is because accounting is job-oriented, and having the UID of the user as a label would needlessly increase the cardinality of the data in Prometheus. All other fields likewise carry jobid, step and task labels.
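To show how the jobid, step and task labels might be consumed, here is a minimal sketch that parses one sample line. It assumes the exporter emits the standard Prometheus text exposition format without timestamps; the sample line in the usage example is hypothetical and is not actual exporter output.

```python
import re

# metric_name{label="value",...} value   (timestamps are not handled in this sketch)
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one exposition line into (metric name, label dict, float value)."""
    match = _SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    # A metric without a label set yields an empty dict.
    labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


# Hypothetical sample line, for illustration only.
name, labels, value = parse_sample(
    'cgroup_cpu_system_seconds{jobid="247463",step="extern",task="0"} 1.5'
)
print(name, labels["jobid"], labels["step"], labels["task"], value)
```

For anything beyond a quick check, a maintained parser such as the one in the official Prometheus client library is likely a better choice than a hand-written regular expression.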
The totals for a job have an empty step and task, for example:
This is due to the organization of the cgroup hierarchy. Consider the directory:
Within this directory, one finds the following files at the job, step and task levels:
job_247463/cpuacct.usage_user
job_247463/step_extern/cpuacct.usage_user
job_247463/step_extern/task_0/cpuacct.usage_user
This is the data most often retrieved and parsed for overall job efficiency, which is why by default the cgroup_exporter does not parse step or task data. To collect all of it, add the --collect.fullslurm option. We run the cgroup_exporter with these options:
The --config.paths /slurm option has to match the path used by Slurm under the top cgroup directory. This is usually something like /sys/fs/cgroup/memory/slurm.
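The relationship between the configured path and the jobid, step and task labels can be sketched as follows. This is not the exporter's implementation: the function name, the default controller root and the uid_1000 directory in the usage example are assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

# Assumed layout under the configured path: .../job_<id>[/step_<name>[/task_<n>]]
_JOB_RE = re.compile(
    r"job_(?P<jobid>\d+)(?:/step_(?P<step>[^/]+)(?:/task_(?P<task>\d+))?)?$"
)


def labels_for(cgroup_dir, controller_root="/sys/fs/cgroup/memory", config_path="/slurm"):
    """Return jobid/step/task labels for a cgroup directory.

    Returns None when the directory is outside controller_root + config_path
    or does not end in a job, step or task directory.
    """
    base = PurePosixPath(controller_root + config_path)
    try:
        relative = PurePosixPath(cgroup_dir).relative_to(base)
    except ValueError:
        return None  # directory lies outside the configured path
    match = _JOB_RE.search(str(relative))
    if match is None:
        return None
    # Levels that are absent become empty strings, as in the job totals.
    return {name: value or "" for name, value in match.groupdict().items()}


# Hypothetical directory names, for illustration only.
print(labels_for("/sys/fs/cgroup/memory/slurm/uid_1000/job_247463/step_extern/task_0"))
print(labels_for("/sys/fs/cgroup/memory/slurm/uid_1000/job_247463"))
```

A directory under a different path than the configured one yields None, which is why the configured path has to agree with the one Slurm uses.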