Performance Metric 00001
From Siwiki
00001: CPU Utilization by System
Nothing to see here (yet) - this is a draft page for a draft concept. See the scratchpad.
Metric: 00001
Version: 20070509
Description
- A system-wide CPU utilization percentage, measured over a time interval.
Units
- percent per interval
Minimum
- 0
Maximum
- 100
Category
- secondary
Importance
- high
Rules
- none
Suggested Interpretation
- 0 is idle
- 80 to 100 is busy
- 100 (fully busy) does not imply saturation
- a busy system often has degraded performance, but not always
- unusually high utilization may indicate a software error, such as a runaway process
Caveats
- a long interval may hide short bursts of CPU utilization
- a runaway process or thread usually pegs a single CPU at 100%. On a multi-CPU server, however, this is divided by the CPU count, which makes the error state difficult to identify (eg, spotting a 3% rise on a 32 CPU server).
- some servers run fine at 100% utilization, without a noticeable performance issue
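The first caveat can be illustrated with a little arithmetic. The sketch below is not part of the metric definition: it assumes hypothetical per-second utilization samples and a helper named interval_averages (both invented here) to show how the same 3 second burst looks at two reporting intervals.

```python
def interval_averages(samples, interval):
    """Average per-second utilization samples over fixed-length intervals."""
    averages = []
    for start in range(0, len(samples), interval):
        chunk = samples[start:start + interval]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Hypothetical data: one idle minute containing a 3 second burst at 100%.
samples = [0] * 60
samples[10:13] = [100, 100, 100]

five_second = interval_averages(samples, 5)   # twelve 5 second averages
one_minute = interval_averages(samples, 60)   # one 60 second average

print(max(five_second))  # 60.0
print(one_minute[0])     # 5.0
```

With a 5 second interval the burst is still visible as a 60% reading, while the 60 second average flattens it to 5%.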
Expected Degradation Profile
- Gradual {cool picture here}
Suggested Further Investigation
- CPU utilization by CPU
- CPU utilization by process/thread
- CPU saturation by *
Suggested Action to Improve Performance
- If your system has high CPU utilization which you suspect is degrading performance, the following may help:
- Identify and eliminate unnecessary CPU work
- Tune or reprogram applications to use more CPUs
- Install faster CPUs
- Install more CPUs
Notes
- busy CPUs may have a negligible performance effect on a system bounded by a slower resource, such as disk I/O
- some idle time may be good, as it leaves headroom for bursts of activity
- plenty of idle time may improve application performance slightly, as it may leave hardware caches "warm".
Technical Details
- For a single CPU or virtual CPU (eg, hardware thread): this metric is the percentage of time during the interval that the CPU did not spend in an idle state, such as running the system idle thread; ie, the percentage of time that the CPU ran user and kernel code. This time includes CPU cycles stalled waiting for memory bus requests to main memory.
- For multiple CPUs: the sum of the single CPU percentages, divided by the number of CPUs. A maximum of 100 corresponds to all CPUs being 100 percent utilized for that interval.
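The two definitions above can be written out as a minimal sketch. The function names cpu_utilization and system_utilization are illustrative only, and the inputs (idle time in seconds, a list of per-CPU percentages) are assumptions about how the raw data might be presented.

```python
def cpu_utilization(idle_seconds, interval_seconds):
    """Percent of the interval a single CPU spent outside the idle state."""
    busy_seconds = interval_seconds - idle_seconds
    return 100.0 * busy_seconds / interval_seconds


def system_utilization(per_cpu_percentages):
    """Sum of the per-CPU percentages divided by the number of CPUs."""
    return sum(per_cpu_percentages) / len(per_cpu_percentages)


# One CPU that was idle for 2 of 5 seconds.
print(cpu_utilization(2.0, 5.0))                  # 60.0

# 32 CPUs, one pegged at 100% and the rest idle.
print(system_utilization([100.0] + [0.0] * 31))   # 3.125
```

The second call reproduces the roughly 3% rise that a single pegged CPU contributes on a 32 CPU server.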
Solaris 10
Method #1 - vmstat [interval]
Type
- approximation
Difficulty
- easy
How
- Run vmstat with an interval, eg, "vmstat 5" (seconds). The first line of output is the summary since boot.
- CPU utilization by system is either,
- cpu/us + cpu/sy
- or,
- 100 - cpu/id
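Both calculations can be sketched in a few lines. This is a minimal example, assuming the last three columns of a vmstat data line are cpu/us, cpu/sy and cpu/id as in the Solaris 10 layout; the function name utilization_from_vmstat_line is invented for illustration.

```python
def utilization_from_vmstat_line(line):
    """Return (us + sy, 100 - id) from one vmstat data line.

    Assumes the last three columns are cpu/us, cpu/sy and cpu/id.
    """
    fields = line.split()
    user, system, idle = (int(value) for value in fields[-3:])
    return user + system, 100 - idle


# A data line in the Solaris 10 vmstat layout (values from Example #1).
line = "0 0 0 6005640 6500772 1 5 0 0 0 0 0 8 0 0 0 1700 24460 2128 50 11 39"
print(utilization_from_vmstat_line(line))  # (61, 61)
```

Header lines would need to be skipped before calling this. The two results may differ slightly on other samples, since vmstat prints each column as a rounded integer.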
Example #1
# vmstat 5
 kthr      memory              page             disk         faults        cpu
 r b w    swap    free re  mf pi po fr de sr cd cd cd cd   in    sy   cs us sy id
 0 0 0 6354972 6875536 25 383  7  0  0  0  6 10  0  0  0 1705  2898 1461  3  2 95
 0 0 0 6005640 6500772  1   5  0  0  0  0  0  8  0  0  0 1700 24460 2128 50 11 39
 0 0 0 6005628 6500784  0   1  0  0  0  0  0  0  0  0  0 1582 24582 2078 50 11 39
 0 0 0 6005616 6500780  0   0  0  0  0  0  0  8  0  0  0 1649 24353 2072 50 11 39
^C
- the first line shows the average since boot: 3% user + 2% system = 5% average CPU utilization by system
- the second line onwards are five second averages, which show 50% user + 11% system = 61% average CPU utilization by system
Features
- can be run as any user
- other useful metrics are displayed in the output
- lightweight
- based on high resolution microstates
Caveats
- this CPU utilization metric includes main memory bus wait cycles, so a system with heavy memory bus activity will appear to have heavy CPU utilization.
Technical Details
- This value is the average of the per-CPU utilizations provided by kstat, which reads them from CPU microstates.
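As a rough illustration of how a utilization figure can be derived from cumulative counters, the sketch below takes two snapshots and averages the per-CPU results. It is not claimed to be how vmstat is implemented. The counter names cpu_nsec_idle, cpu_nsec_user and cpu_nsec_kernel are assumed to correspond to the per-CPU "sys" kstat statistics, and the snapshot dictionaries and the function name utilization_between are hypothetical stand-ins for real kstat reads.

```python
def utilization_between(before, after):
    """System-wide CPU utilization percent between two counter snapshots.

    Each snapshot maps a CPU id to its cumulative nanosecond counters.
    """
    per_cpu = []
    for cpu_id, start in before.items():
        end = after[cpu_id]
        idle = end["cpu_nsec_idle"] - start["cpu_nsec_idle"]
        busy = ((end["cpu_nsec_user"] - start["cpu_nsec_user"])
                + (end["cpu_nsec_kernel"] - start["cpu_nsec_kernel"]))
        total = idle + busy
        # A CPU with no elapsed time recorded is counted as 0% busy.
        per_cpu.append(100.0 * busy / total if total else 0.0)
    return sum(per_cpu) / len(per_cpu)


# Hypothetical snapshots taken 5 seconds apart on a 2 CPU system.
zero = {"cpu_nsec_idle": 0, "cpu_nsec_user": 0, "cpu_nsec_kernel": 0}
before = {0: dict(zero), 1: dict(zero)}
after = {
    0: {"cpu_nsec_idle": 2_000_000_000, "cpu_nsec_user": 2_000_000_000,
        "cpu_nsec_kernel": 1_000_000_000},
    1: {"cpu_nsec_idle": 5_000_000_000, "cpu_nsec_user": 0,
        "cpu_nsec_kernel": 0},
}
print(utilization_between(before, after))  # 30.0
```

CPU 0 was 60% busy and CPU 1 was idle, giving 30% for the system as a whole.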
Generating Test Load
- A one-liner to generate high CPU load is,
$ perl -e 'while (--$ARGV[0] and fork) {}; while () {}' 4
- Change "4" to equal your CPU count. Hit Ctrl-C to end the load.
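For a load that stops by itself, the same idea can be sketched in Python. This is an illustrative variant rather than a direct translation: the names burn and generate_load are invented here, it assumes a Unix-like system where os.fork is available, and it adds a time limit that the Perl one-liner does not have.

```python
import os
import time


def burn(seconds):
    """Spin on the CPU until `seconds` have elapsed; return loop iterations."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


def generate_load(cpu_count, seconds):
    """Run `cpu_count` busy loops in parallel (parent plus forked children)."""
    children = []
    for _ in range(cpu_count - 1):
        pid = os.fork()
        if pid == 0:           # child: spin, then exit without returning
            burn(seconds)
            os._exit(0)
        children.append(pid)
    burn(seconds)              # the parent is the final busy process
    for pid in children:
        os.waitpid(pid, 0)     # reap children so no zombies are left behind
```

Calling generate_load(4, 30) would keep four processes busy for about 30 seconds. As with the Perl version, set the count to the number of CPUs.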
