Metrics

The following metrics are under development (or being planned).

Each metric can be ascribed to a high-level family, shown in the table below as the “Family” column. We will likely tweak and improve upon these categories.

Implemented Metrics

sys-hwloc

Hwloc, or “portable hardware locality,” can be used to inspect the hardware of your system. There is a nice tutorial here for the default command that is run, “lstopo”, which does exactly that: it lists your hardware topology! Specifically, we output a png image and a machine spec for the default command, and this can be changed with the command option. This man page is recommended to see the different commands and options.

| Name | Description | Type | Default |
| --- | --- | --- | --- |
| command | Change the default command to something else. | string | lstopo architecture.png && hwloc-ls machine.xml |

The above saves a png image, and the machine data to xml. Note that if you need to copy the data post-run, you likely want to set interactive: true to keep it running.
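Once the machine XML has been copied out, it can be summarized with a few lines of Python. The following is a minimal sketch, assuming the export uses hwloc's nested `object` elements with a `type` attribute; the inline `SAMPLE` document is a simplified, hypothetical stand-in for a real machine.xml, and `count_objects` is a name made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, hypothetical stand-in for the machine.xml written by the metric
SAMPLE = """<topology>
  <object type="Machine">
    <object type="Package">
      <object type="Core"><object type="PU"/><object type="PU"/></object>
      <object type="Core"><object type="PU"/><object type="PU"/></object>
    </object>
  </object>
</topology>"""


def count_objects(xml_text):
    """Count topology objects by their type attribute."""
    root = ET.fromstring(xml_text)
    return Counter(obj.get("type") for obj in root.iter("object"))


counts = count_objects(SAMPLE)
print(counts["Core"], counts["PU"])  # prints: 2 4
```

For a real run you would read the copied file instead, for example `count_objects(open("machine.xml").read())`.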

perf-sysstat

This metric provides the “pidstat” executable of the sysstat library. The following options are available:

| Name | Description | Type | Default |
| --- | --- | --- | --- |
| color | Set to turn on color parsing | Anything set | unset |
| pids | For debugging, show consistent output of ps aux | Anything set | unset |
| threads | Add -t to each pidstat command to indicate wanting thread-level output | | unset |
| completions | Number of times to run metric | int32 | unset (runs for lifetime of application or indefinitely) |
| rate | Seconds to pause between measurements | int32 | 10 |
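To make the interplay of completions and rate concrete, here is a minimal sketch of a measurement loop. It is an illustration only, not the operator's implementation: `run_metric`, `measure`, and the injectable `sleep` argument are hypothetical names, and the choice to pause only between measurements is an assumption.

```python
import itertools
import time


def run_metric(measure, rate=10, completions=None, sleep=time.sleep):
    """Yield measurements, pausing `rate` seconds between them.

    When completions is None (unset) the loop never ends on its own,
    mirroring "runs for lifetime of application or indefinitely".
    """
    runs = itertools.count() if completions is None else range(completions)
    for i in runs:
        if i > 0:
            sleep(rate)  # pause between measurements, not before the first
        yield measure()


# Record the pauses instead of really sleeping, so the example runs instantly
pauses = []
samples = list(run_metric(lambda: "sample", rate=2, completions=3, sleep=pauses.append))
print(samples)  # prints: ['sample', 'sample', 'sample']
print(pauses)   # prints: [2, 2]
```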

By default, color and pids are off, anticipating log parsing. We also provide a “commands” map option so the metric can monitor specific commands based on the job index. As an example, here is how we would ask to monitor two different commands: one for the launcher node (index 0) and one for the rest (workers).

- name: perf-sysstat
  # Custom options (a YAML mapping can only have one "options" key)
  options:
    pids: "true"
    rate: 2

  # Look for pids based on commands matched to index
  mapOptions:
    commands:
      # First set all to use the worker command, but give the lead broker a special command
      "all": /usr/libexec/flux/cmd/flux-broker --config /etc/flux/config -Scron.directory=/etc/flux/system/cron.d -Stbon.fanout
      "0": /usr/bin/python3.8 /usr/libexec/flux/cmd/flux-submit.py -n 2 --quiet --watch lmp -v x 2 -v y 2 -v z 2 -in in.reaxc.hns -nocite

In the map above, order matters: the command for all indices is first set to the flux-broker one, and then index 0 is given a custom command. See pidstat for more information on this command, and this file for how we use them. If there is an option or command that is not exposed that you would like, please open an issue.
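The ordering rule for the commands map can be sketched in a few lines of Python. This is a hedged illustration of the behavior described here, not the operator's code: `resolve_commands` is a hypothetical helper, and the shortened command strings are placeholders.

```python
def resolve_commands(commands, num_pods):
    """Apply map entries in order; later entries overwrite earlier ones."""
    resolved = {}
    for key, command in commands.items():  # dicts keep insertion order
        if key == "all":
            for index in range(num_pods):
                resolved[index] = command
        else:
            resolved[int(key)] = command
    return resolved


# "all" first, then a special command for the launcher at index 0
print(resolve_commands({"all": "flux-broker", "0": "flux-submit.py"}, 3))
# prints: {0: 'flux-submit.py', 1: 'flux-broker', 2: 'flux-broker'}

# Reversed order: "all" overwrites the custom command for index 0
print(resolve_commands({"0": "flux-submit.py", "all": "flux-broker"}, 3))
# prints: {0: 'flux-broker', 1: 'flux-broker', 2: 'flux-broker'}
```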

io-fio

This is a nice tool that you can simply point at a path, and it measures IO stats by way of writing a file there! Options you can set include:

| Name | Description | Type | Default |
| --- | --- | --- | --- |
| testname | Name for the test | string | test |
| blocksize | Size of block to write. It defaults to 4k, but can be set from 256 to 8k. | string | 4k |
| iodepth | Number of I/O units to keep in flight against the file. | int | 64 |
| size | Total size of file to write | string | 4G |
| directory | Directory (usually mounted) to test. | string | /tmp |
| pre | Custom logic / command to run before Fio | string | unset |
| post | Custom logic / command to run after Fio (e.g., cleanup) | string | unset |
| prefix | Prefix to add to running fio commands (like a wrapper) | string | unset |

For the “directory” we use this location to write a temporary file, which will be cleaned up. This allows for testing storage mounted from multiple metric pods without worrying about a name conflict.
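As a rough illustration of how these options could map onto a fio invocation, here is a minimal sketch. The flags used (`--name`, `--bs`, `--iodepth`, `--size`, `--directory`) are standard fio options, but the exact command the operator generates is not shown in this document, so `build_fio_command` and the flag selection are assumptions; a real run would likely need more flags, such as a read/write mode.

```python
import shlex

# Defaults taken from the options table for io-fio
DEFAULTS = {
    "testname": "test",
    "blocksize": "4k",
    "iodepth": 64,
    "size": "4G",
    "directory": "/tmp",
}


def build_fio_command(options=None, prefix=""):
    """Merge user options over the defaults and build a fio command line."""
    opts = {**DEFAULTS, **(options or {})}
    args = [
        "fio",
        f"--name={opts['testname']}",
        f"--bs={opts['blocksize']}",
        f"--iodepth={opts['iodepth']}",
        f"--size={opts['size']}",
        f"--directory={opts['directory']}",
    ]
    if prefix:
        # The prefix wraps the whole command, for example a timing wrapper
        args = shlex.split(prefix) + args
    return shlex.join(args)


print(build_fio_command())
# prints: fio --name=test --bs=4k --iodepth=64 --size=4G --directory=/tmp
```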

io-ior


IOR is a really nice I/O tool that now combines its previous self with the mdtest tool. We expose just two options, the working directory and the command that you want to run, and the rest is up to you!

| Name | Description | Type | Default |
| --- | --- | --- | --- |
| command | The default ior command | string | ior -w -r -o testfile |
| workdir | The working directory for the command | string | /opt/ior |

The getting started tutorial is great for seeing how basic commands are done. Note that the container does have mpirun if you want to use it. We don’t have support for this across nodes, but this could be added. Let us know if this would be interesting to you.

io-sysstat

This is the “iostat” executable of the sysstat library.

| Name | Description | Type | Default |
| --- | --- | --- | --- |
| human | Show tabular, human-readable output inside of json | string "true" or "false" | "false" |
| completions | Number of times to run metric | int32 | unset (runs for lifetime of application or indefinitely) |
| rate | Seconds to pause between measurements | int32 | 10 |
| pre | One or more commands to run before iostat | string | unset |
| post | One or more commands to run after iostat | string | unset |

This is good for mounted storage that can be seen by the operating system, but may not work for something like NFS.

dlio

This is a simple performance tool that is not coded into the Metrics Operator: it is installed on the fly to your container with pip, and you minimally require hwloc. It generates pretty cool data that can be visualized with perfetto!

You can see the full example above. It just installs a library with pip, and then ensures the tool's LD_PRELOAD is set as the prefix. I added sleep infinity to the end so that output data can be copied over after the run.

network-netmark

This is a networking metric. It is currently a private container/software, but we have support for it when it’s ready to be made public. Variables to customize include:

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| tasks | Total number of tasks across pods | options->tasks | string | nproc * pods |
| warmups | Number of warmups | options->warmups | int32 | 10 |
| trials | Number of trials | options->trials | int32 | 20 |
| sendReceiveCycles | Number of send-receive cycles | options->sendReceiveCycles | int32 | 20 |
| messageSize | Message size in bytes | options->messageSize | int32 | 0 |
| storeEachTrial | Flag to indicate storing each trial data | options->storeEachTrial | string (true/false) | "true" |
| soleTenancy | Turn off sole tenancy (one pod/node) | options->soleTenancy | string ("false" or "no") | "true" |

network-osu-benchmark

Point-to-point benchmarks for MPI (networking). If listOptions->commands is not set, a default subset of commands is used (listed below). Variables to customize include:

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| commands | Custom list of osu-benchmark one-sided commands to run | listOptions->commands | array | unset uses default set |
| soleTenancy | Turn off sole tenancy (one pod/node) | | string ("false" or "no") | "true" |
| all | Run ALL benchmarks with defaults | | string ("true" or "yes") | "false" |
| flags | Overwrite defaults flags (experts only!) | | string | Defaults to an ideal set per metric (see osu-benchmark.go) |
| timed | String "true" or "yes" to add time prefix to mpirun (for debugging, etc) | | string | "false" |
| sleep | Number of seconds to sleep to wait for network to be ready | | int32 | 60 |

By default, we run a subset of commands:

  • osu_get_acc_latency

  • osu_acc_latency

  • osu_fop_latency

  • osu_get_latency

  • osu_put_latency

  • osu_allreduce

  • osu_latency

  • osu_bibw

  • osu_bw

However, all of the following are available for MPI:

Commands available for OSU Benchmarks
.
|-- collective
|   |-- osu_allgather
|   |-- osu_allgatherv
|   |-- osu_allreduce
|   |-- osu_alltoall
|   |-- osu_alltoallv
|   |-- osu_barrier
|   |-- osu_bcast
|   |-- osu_gather
|   |-- osu_gatherv
|   |-- osu_iallgather
|   |-- osu_iallgatherv
|   |-- osu_iallreduce
|   |-- osu_ialltoall
|   |-- osu_ialltoallv
|   |-- osu_ialltoallw
|   |-- osu_ibarrier
|   |-- osu_ibcast
|   |-- osu_igather
|   |-- osu_igatherv
|   |-- osu_ireduce
|   |-- osu_iscatter
|   |-- osu_iscatterv
|   |-- osu_reduce
|   |-- osu_reduce_scatter
|   |-- osu_scatter
|   `-- osu_scatterv
|-- one-sided
|   |-- osu_acc_latency
|   |-- osu_cas_latency
|   |-- osu_fop_latency
|   |-- osu_get_acc_latency
|   |-- osu_get_bw
|   |-- osu_get_latency
|   |-- osu_put_bibw
|   |-- osu_put_bw
|   `-- osu_put_latency
|-- pt2pt
|   |-- osu_bibw
|   |-- osu_bw
|   |-- osu_latency
|   |-- osu_latency_mp
|   |-- osu_latency_mt
|   |-- osu_mbw_mr
|   `-- osu_multi_lat
`-- startup
    |-- osu_hello
    `-- osu_init

Note that not all of these have been tested on our setups, so if you have any questions please let us know.

app-lammps

Since we were using LAMMPS so often as a benchmark (and testing timing of a network) it made sense to add it here as a standalone metric! Although we are doing MPI with communication via SSH, this can still serve as a means to assess performance.

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The full mpirun and lammps command | options->command | string | (see below) |
| workdir | The working directory for the command | options->workdir | string | /opt/lammps/examples/reaxff/HNS# |
| soleTenancy | Require each pod to have sole tenancy | command->soleTenancy | string | "false" |

For inspection, you can see all the examples provided in the LAMMPS GitHub repository. The default command (if you don’t change it) intended as an example is:

mpirun --hostfile ./hostlist.txt -np 2 --map-by socket lmp -v x 2 -v y 2 -v z 2 -in in.reaxc.hns -nocite

This command runs in the working directory /opt/lammps/examples/reaxff/HNS#. You should be calling mpirun and expecting a ./hostlist.txt in the present working directory (the “workdir” you chose above). You should also provide the correct number of processes (np) and problem size for LAMMPS (lmp). We left this open and flexible, anticipating that you as a user would want total control.

app-amg

AMG means “algebraic multi-grid,” and it’s easy to confuse with the company AMD (“Advanced Micro Devices”)! From the guide:

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided for this benchmark builds linear systems for a 3D problem with a 27-point stencil and generates two different problems that are described in section D of the AMG.readme file in the docs directory.

Here are examples of small and medium problem sizes provided in that same guide. Each of these would be given to MPI (mpirun), but srun is provided as an example instead.

# Small size problems
srun -N 32 -n 512 amg -problem 1 -n 96 96 96 -P 8 8 8
srun -N 32 -n 512 amg -problem 2 -n 40 40 40 -P 8 8 8
srun -N 64 -n 1024 amg -problem 1 -n 96 96 96 -P 16 8 8
srun -N 64 -n 1024 amg -problem 2 -n 40 40 40 -P 16 8 8

# Medium size problems
srun -N 512 -n 8192 amg -problem 1 -n 96 96 96 -P 32 16 16
srun -N 512 -n 8192 amg -problem 2 -n 40 40 40 -P 32 16 16
srun -N 1024 -n 16384 amg -problem 1 -n 96 96 96 -P 32 32 16
srun -N 1024 -n 16384 amg -problem 2 -n 40 40 40 -P 32 32 16

By default, akin to LAMMPS we expose the entire mpirun command along with the working directory for you to adjust.

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The amg command (without mpirun) | options->command | string | (see below) |
| prefix | The prefix (mpirun command and arguments) | options->mpirun | string | (see below) |
| workdir | The working directory for the command | options->workdir | string | /opt/AMG |

By default, when not set, you will just run the amg binary to get a test case run:

# mpirun
mpirun --hostfile ./hostlist.txt

# command
amg

# Assembled into
mpirun --hostfile ./hostlist.txt ./problem.sh
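The assembly step shown above (the command goes into problem.sh, and the prefix launches that script) can be sketched as follows. This is an illustration under stated assumptions, not the operator's code: the `assemble` helper, the `#!/bin/bash` shebang, and marking the script executable are all hypothetical details.

```python
import os
import stat
import tempfile


def assemble(prefix, command, workdir):
    """Write the command to problem.sh and return the launch line."""
    script = os.path.join(workdir, "problem.sh")
    with open(script, "w") as fh:
        fh.write("#!/bin/bash\n" + command + "\n")
    # Make the script executable so the prefix can launch it directly
    os.chmod(script, os.stat(script).st_mode | stat.S_IXUSR)
    # A blank prefix leaves only the script itself
    return f"{prefix} ./problem.sh".strip()


with tempfile.TemporaryDirectory() as workdir:
    print(assemble("mpirun --hostfile ./hostlist.txt", "amg", workdir))
    # prints: mpirun --hostfile ./hostlist.txt ./problem.sh
```

Keeping the prefix and the command separate means the launcher arguments can be changed without touching the application command.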

More likely you want an actual problem size on a specific number of nodes and tasks, and you’ll want to test this. The two problems are:

  • problem 1 (default) will use conjugate gradient preconditioned with AMG to solve a linear system with a 3D 27-point stencil of size nx*ny*nz*Px*Py*Pz.

  • problem 2 simulates a time-dependent problem of size nx*ny*nz*Px*Py*Pz with AMG-GMRES. The linear system is also a 3D 27-point stencil. The system is sized to be 5-10% of the large problem.
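As a quick sanity check on a chosen problem, the global size is the product nx*ny*nz*Px*Py*Pz, and in the srun examples above the product Px*Py*Pz matches the task count given to -n. A minimal sketch of that arithmetic, where `amg_problem_size` is a hypothetical helper name:

```python
from math import prod


def amg_problem_size(n, P):
    """Return (tasks, global size) for local grid n and process grid P."""
    tasks = prod(P)           # Px * Py * Pz
    return tasks, prod(n) * tasks  # nx * ny * nz * Px * Py * Pz


# First small example: srun -N 32 -n 512 amg -problem 1 -n 96 96 96 -P 8 8 8
print(amg_problem_size((96, 96, 96), (8, 8, 8)))
# prints: (512, 452984832)
```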

NOTE that the Python parser was written for the test case, and likely we need to extend it to problem 2 or larger sized problems. If you run a larger problem and the parser does not work as expected, please send us the output and we will provide an updated parser. See this guide for more detail.

app-cabanaPIC

This is a particle in cell simulation that is experimental because it does not seem to support multiple nodes yet (but should).

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The full command to run | options->command | string | cbnpic |
| workdir | The working directory for the command | options->workdir | string | /opt/cabanaPIC/build |

app-quicksilver

Quicksilver is a proxy app for Monte Carlo simulation code. You can learn more about it on the GitHub repository. By default, akin to other apps we expose the entire mpirun command along with the working directory for you to adjust.

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The qs command (without mpirun) | options->command | string | (see below) |
| prefix | The prefix (mpirun command and arguments) | options->mpirun | string | (see below) |
| workdir | The working directory for the command | options->workdir | string | /opt/AMG |

By default, when not set, you will just run the qs (quicksilver) binary on a sample problem, represented by an input text file:

# mpirun
mpirun --hostfile ./hostlist.txt

# command
qs /opt/quicksilver/Examples/CORAL2_Benchmark/Problem1/Coral2_P1.inp

# Assembled into problem.sh as follows:
mpirun --hostfile ./hostlist.txt ./problem.sh

Many problems come in the container, and here are their full paths:

# Example command
qs /opt/quicksilver/Examples/CORAL2_Benchmark/Problem1/Coral2_P1.inp

# All examples:
/opt/quicksilver/Examples/AllScattering/scatteringOnly.inp
/opt/quicksilver/Examples/NoCollisions/no.collisions.inp
/opt/quicksilver/Examples/NonFlatXC/NonFlatXC.inp
/opt/quicksilver/Examples/CORAL2_Benchmark/Problem2/Coral2_P2_4096.inp
/opt/quicksilver/Examples/CORAL2_Benchmark/Problem2/Coral2_P2.inp
/opt/quicksilver/Examples/CORAL2_Benchmark/Problem2/Coral2_P2_1.inp
/opt/quicksilver/Examples/CORAL2_Benchmark/Problem1/Coral2_P1.inp
/opt/quicksilver/Examples/CORAL2_Benchmark/Problem1/Coral2_P1_1.inp
/opt/quicksilver/Examples/CORAL2_Benchmark/Problem1/Coral2_P1_4096.inp
/opt/quicksilver/Examples/CTS2_Benchmark/CTS2.inp
/opt/quicksilver/Examples/CTS2_Benchmark/CTS2_36.inp
/opt/quicksilver/Examples/CTS2_Benchmark/CTS2_1.inp
/opt/quicksilver/Examples/AllAbsorb/allAbsorb.inp
/opt/quicksilver/Examples/Homogeneous/homogeneousProblem_v4_ts.inp
/opt/quicksilver/Examples/Homogeneous/homogeneousProblem_v5_ts.inp
/opt/quicksilver/Examples/Homogeneous/homogeneousProblem.inp
/opt/quicksilver/Examples/Homogeneous/homogeneousProblem_v3_wq.inp
/opt/quicksilver/Examples/Homogeneous/homogeneousProblem_v7_ts.inp
/opt/quicksilver/Examples/Homogeneous/homogeneousProblem_v4_tm.inp
/opt/quicksilver/Examples/Homogeneous/homogeneousProblem_v3.inp
/opt/quicksilver/Examples/AllEscape/allEscape.inp
/opt/quicksilver/Examples/NoFission/noFission.inp

You can also look more closely in the GitHub repository.

app-pennant

Pennant is an unstructured mesh hydrodynamics mini-app for advanced architectures. The documentation is sparse, but you can find the source code on GitHub. By default, akin to other apps, we expose the entire mpirun prefix and command along with the working directory for you to adjust.

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The pennant command (without mpirun) | options->command | string | (see below) |
| prefix | The prefix (mpirun command and arguments) | options->mpirun | string | (see below) |
| workdir | The working directory for the command | options->workdir | string | /opt/AMG |

By default, when not set, you will just run pennant on a test problem, represented by an input text file:

# mpirun
mpirun --hostfile ./hostlist.txt

# command
pennant /opt/pennant/test/sedovsmall/sedovsmall.pnt

# Assembled into problem.sh as follows:
mpirun --hostfile ./hostlist.txt ./problem.sh

Many input files come in the container, and here are their full paths in /opt/pennant/test:

Input files available to pennant
|-- leblanc
|   |-- leblanc.pnt
|   |-- leblanc.xy.std
|   `-- leblanc.xy.std4
|-- leblancbig
|   `-- leblancbig.pnt
|-- leblancx16
|   `-- leblancx16.pnt
|-- leblancx4
|   `-- leblancx4.pnt
|-- leblancx48
|   `-- leblancx48.pnt
|-- leblancx64
|   `-- leblancx64.pnt
|-- noh
|   |-- noh.pnt
|   |-- noh.xy.std
|   `-- noh.xy.std4
|-- nohpoly
|   `-- nohpoly.pnt
|-- nohsmall
|   |-- nohsmall.pnt
|   |-- nohsmall.xy.std
|   `-- nohsmall.xy.std4
|-- nohsquare
|   `-- nohsquare.pnt
|-- sample_outputs
|   |-- edison
|   |   |-- leblancbig.thr1.out
|   |   |-- leblancx16.thr1024.out
|   |   |-- leblancx4.thr16.out
|   |   |-- leblancx64.mpi2048.out
|   |   `-- nohpoly.thr1.out
|   `-- vulcan
|       |-- leblancx16.out
|       |-- leblancx48.out
|       |-- sedovflat.out
|       |-- sedovflatx16.out
|       |-- sedovflatx4.out
|       `-- sedovflatx40.out
|-- sedov
|   |-- sedov.pnt
|   |-- sedov.xy.std
|   `-- sedov.xy.std4
|-- sedovbig
|   `-- sedovbig.pnt
|-- sedovflat
|   `-- sedovflat.pnt
|-- sedovflatx120
|   `-- sedovflatx120.pnt
|-- sedovflatx16
|   `-- sedovflatx16.pnt
|-- sedovflatx4
|   `-- sedovflatx4.pnt
|-- sedovflatx40
|   `-- sedovflatx40.pnt
`-- sedovsmall
    |-- sedovsmall.pnt
    |-- sedovsmall.xy
    |-- sedovsmall.xy.std
    `-- sedovsmall.xy.std4

You will likely also need to adjust the mpirun parameters.

app-kripke

Kripke is (from the README):

Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms and architectures effect the implementation and performance of Sn transport. A main goal of Kripke is investigating how different data-layouts affect instruction, thread and task level parallelism, and what the implications are on overall solver performance.

Akin to AMG, we allow you to modify each of the mpirun and kripke commands via:

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The kripke command (without mpirun) | options->command | string | (see below) |
| prefix | The prefix (mpirun command and arguments) | options->mpirun | string | (see below) |
| workdir | The working directory for the command | options->workdir | string | /opt/AMG |

By default, when not set, you will just run the kripke binary to get a test case run, so mpirun is set to be blank.

# mpirun is blank
""
# But could be an actual mpirun command
mpirun --hostfile ./hostlist.txt

# command written to problem.sh
kripke

# Assembled into
mpirun --hostfile ./hostlist.txt ./problem.sh

There is a nice guide here that can help you to decide on your specific command or problem size. Also note that we expose the following executables built with it:

ex1_vector-addition            ex4_atomic-histogram                ex7_nested-loop-reorder
ex1_vector-addition_solution   ex4_atomic-histogram_solution       ex7_nested-loop-reorder_solution
ex2_approx-pi                  ex5_line-of-sight                   ex8_tiled-matrix-transpose
ex2_approx-pi_solution         ex5_line-of-sight_solution          ex8_tiled-matrix-transpose_solution
ex3_colored-indexset           ex6_stencil-offset-layout           ex9_matrix-transpose-local-array
ex3_colored-indexset_solution  ex6_stencil-offset-layout_solution  ex9_matrix-transpose-local-array_solution

These executables are on the PATH, in /opt/Kripke/build/bin in the container. For apps / metrics to be added, please see this issue.

app-ldms

LDMS is “a low-overhead, low-latency framework for collecting, transferring, and storing metric data on a large distributed computer system” and is packaged alongside ovis-hpc. While there are complex aggregator setups we could run, for this simple metric we just run the default command shown below on each separate pod/node. The following variables are supported:

| Name | Description | Type | Default |
| --- | --- | --- | --- |
| command | The command to issue to ldms_ls (or that) | string | (see below) |
| workdir | The working directory for the command | string | /opt |
| completions | Number of times to run metric | int32 | unset (runs for lifetime of application or indefinitely) |
| rate | Seconds to pause between measurements | int32 | 10 |

The following is the default command:

ldms_ls -h localhost -x sock -p 10444 -l -v

app-nekbone

Nekbone comes with a set of examples that primarily depend on you choosing the correct working directory and the command to run from it. You can do this via these two options:

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The full mpirun and nekbone command | options->command | string | (see below) |
| workdir | The working directory for the command | options->workdir | string | /root/nekbone-3.0/test |

And the following combinations are supported. Note that example1 did not build, and example2 is the default (if you don’t set these variables).

| Command | Workdir |
| --- | --- |
| mpiexec --hostfile ./hostlist.txt -np 2 ./nekbone | /root/nekbone-3.0/test/example2 |
| mpiexec --hostfile ./hostlist.txt -np 2 ./nekbone | /root/nekbone-3.0/test/example3 |
| mpiexec --hostfile ./hostlist.txt -np 2 ./nekbone | /root/nekbone-3.0/test/nek_comm |
| mpiexec --hostfile ./hostlist.txt -np 2 ./nekbone | /root/nekbone-3.0/test/nek_mgrid |
| mpiexec --hostfile ./hostlist.txt -np 2 ./nekbone | /root/nekbone-3.0/test/nek_delay |

You can see the archived repository here. If there are interesting metrics in this project it would be worth bringing it back to life I think.

app-laghos

From the Laghos README:

Laghos (LAGrangian High-Order Solver) is a miniapp that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping.

Akin to other apps, you can customize the command and workdir. Note that the laghos executable is at /workflow/laghos and not on the path, so the default references it as ./laghos.

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The full mpirun and laghos command | options->command | string | (see below) |
| workdir | The working directory for the command | options->workdir | string | /workdir/laghos |

app-bdas

BDAS stands for “Big Data Analysis Suite” and you can read more about it here. The container has machine learning analyses (provided in R) that work with MPI (openmpi). The benchmarks are in /opt/bdas/benchmarks/r in the container, and we provide an example for princomp (see default command below):

| Name | Description | Option Key | Type | Default |
| --- | --- | --- | --- | --- |
| command | The full mpirun and Rscript command | options->command | string | (see below) |
| workdir | The working directory for the command | options->workdir | string | /opt/bdas/benchmarks/r |

# This is the default command. You must target the --hostfile and use the allow as root flag!
mpirun --allow-run-as-root -np 4 --hostfile ./hostlist.txt Rscript /opt/bdas/benchmarks/r/princomp.r 250 50

Try setting the logging->interactive: true option in the spec to keep the container running and explore other benchmarks. These are the ones I’ve tried:

# Other benchmarks. As with the default, you must target the --hostfile and use the allow as root flag!
mpirun --allow-run-as-root -np 4 --hostfile ./hostlist.txt Rscript /opt/bdas/benchmarks/r/kmeans.r 250 50
mpirun --allow-run-as-root -np 4 --hostfile ./hostlist.txt Rscript /opt/bdas/benchmarks/r/svm.r 250 50

Containers

To see all associated app containers, look at the converged-computing/metrics-container repository (with Dockerfiles and automation) and associated packages.


Last update: Nov 27, 2023