Difference between revisions of "SMT profiling with pmcstat and perf"

From RCS Wiki

Revision as of 05:40, 15 July 2025

This article discusses profiling simultaneous multithreading (SMT) on the POWER9 architecture. It uses both big-endian FreeBSD with pmcstat and Debian with perf.

The material presented here is drawn from the sources listed in the Additional Resources section.


Simultaneous multithreading (SMT)

SMT principles

SMT is "multi-thread" not "multi-core"

This is an important distinction. SMT increases instruction throughput by letting several hardware threads share a core's otherwise under-used components. SMT4 supports four threads per core and SMT8 supports eight, but this is not equivalent to three or seven additional cores, respectively. There are trade-offs as well as benefits: per-thread performance declines as more of a core's SMT threads are in use, while overall performance and power efficiency increase. Note that IBM did not market SMT as "multi-core"; several media sites nonetheless conflated SMT with increased core count.
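The thread-versus-core distinction is visible on the Debian side through sysfs, where each logical CPU reports which hardware threads share its core. The following is a minimal sketch, not part of the original benchmark code: it assumes a Linux system that exposes thread_siblings_list under /sys/devices/system/cpu, and the function names are illustrative only.

```python
import glob


def parse_cpu_list(text):
    """Expand a Linux CPU list such as '0-3' or '0,4,8,12' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def count_cores(sibling_lists):
    """Return (physical_cores, logical_cpus).

    Every logical CPU on the same core reports the same sibling set,
    so the number of distinct sets is the number of cores.
    """
    cores = {tuple(parse_cpu_list(s)) for s in sibling_lists}
    logical = sum(len(core) for core in cores)
    return len(cores), logical


def read_sibling_lists(
    pattern="/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list",
):
    """Read the sibling list of every online logical CPU (Linux sysfs assumed)."""
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists
```

Calling count_cores(read_sibling_lists()) on a machine running in SMT4 mode should report four times as many logical CPUs as cores. The operating system schedules onto the logical CPUs, but the count of cores is unchanged.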

Comparison to RISC-V HARTs

Benchmark code

pmcstat

perf

Additional Resources

POWER9 User Manual v21

POWER9 Performance Monitoring Unit User Guide v12

https://www.ibm.com/docs/en/linux-on-systems?topic=linuxonibm/performance/tuneforsybase/smtsettings.htm

George Neville-Neil's brief tutorial on pmcstat: https://freebsdfoundation.org/wp-content/uploads/2014/03/Understanding-Application-and-System-Performance-with-HWPMC4.pdf