CPU/Processor

The central processing units (CPUs) are key resources in any computer system. Many modern systems from Sun provide numerous CPUs or virtual CPUs (which may be cores or hardware threads). The CPUs are shared by the applications on the system according to a policy prescribed by the operating system and its scheduler (see Chapter 3 in Solaris Internals).

If the system becomes CPU resource limited, application or kernel threads must wait on a queue to be scheduled on a processor, potentially degrading performance. The time spent waiting on these queues, the length of the queues, and the utilization of the system's processors are important metrics for quantifying CPU-related performance bottlenecks. In addition, we can directly measure CPU utilization and wait states in various forms by using DTrace.



Which Tools should I use for CPU Analysis?

A number of different tools analyze CPU activity. The following summarizes both these tools and the topics covered in this section.

  • Utilization - Overall CPU utilization can be determined from the idle (id) field of vmstat; the user (us) and system (sy) fields indicate the type of activity. Heavy CPU saturation is more likely to degrade performance than high CPU utilization alone.
  • Saturation - The run queue length from vmstat (kthr:r) can be used as a measure of CPU saturation, as can CPU latency time from prstat -m.
  • Load averages - These numbers, available from both the uptime and prstat commands, provide 1-, 5-, and 15-minute averages that combine both utilization and saturation measurements. This value can be compared to other servers if divided by the CPU count.
  • History - sar can be activated to record historical CPU activity. This data can identify long-term patterns; it also provides a reference for what CPU activity is normal.
  • Per-CPU utilization - mpstat lists statistics by CPU, to help identify application scaling issues should CPU utilization be unbalanced.
  • CPU by process - Commands such as ps and prstat can be used to identify CPU consumption by process.
  • Microstate accounting - High-resolution time counters track several states for user threads; prstat -m reports the results.
  • DTrace analysis - DTrace can analyze CPU consumption in depth and can measure events in minute detail.


Tools for CPU Analysis

  Tool      Uses           Description
  vmstat    Kstat          For an initial view of overall CPU behavior
  psrinfo   Kstat          For physical CPU properties
  uptime    getloadavg()   For the load averages, to gauge recent CPU activity
  sar       Kstat, sadc    For overall CPU behavior and dispatcher queue statistics; sar also allows historical data collection
  mpstat    Kstat          For per-CPU statistics
  prstat    procfs         To identify process CPU consumption
  dtrace    DTrace         For detailed analysis of CPU activity, including scheduling events and dispatcher analysis

How do I measure CPU Utilization and Load?

vmstat

The vmstat tool summarizes the system's behavior on a single line and is often the first command you run to familiarize yourself with a system. It is useful here because it reports both CPU utilization and saturation at a glance.

$ vmstat 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr dd f0 s1 --   in   sy   cs us sy id
 0 0 0 1324808 319448 1   2  2  0  0  0  0  0  0  0  0  403   21   54  0  1 99
 2 0 0 1318528 302696 480 6 371 0  0  0  0 73  0  0  0  550 5971  190 84 16  0
 3 0 0 1318504 299824 597 0 371 0  0  0  0 48  0  0  0  498 8529  163 81 19  0
 2 0 0 1316624 297904 3   0 597 0  0  0  0 91  0  0  0  584 2009  242 84 16  0
 2 0 0 1311008 292288 2   0 485 0  0  0  0 83  0  0  0  569 2357  252 77 23  0
 2 0 0 1308240 289520 2   0 749 0  0  0  0 107 0  0  0  615 2246  290 82 18  0
 2 0 0 1307496 288768 5   0 201 0  0  0  0 58  0  0  0  518 2231  210 79 21  0
...
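
The kthr:r column reports threads waiting on the run queues, so saturation can be spotted by scanning for intervals where r exceeds the CPU count. The following one-liner is a minimal sketch, assuming the column layout shown above (r is the first field); it skips the two header lines and the since-boot summary:

$ vmstat 5 | awk -v ncpu="$(psrinfo | wc -l)" '
      NR > 3 && $1 > ncpu { print "run queue", $1, "exceeds", ncpu, "CPUs" }'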


psrinfo

To determine the number of processors in the system and their speed, use the psrinfo -v command. In Solaris 10, -vp prints additional information.

$ psrinfo -vp
The physical processor has 1 virtual processor (0)
  UltraSPARC-III  (portid 0 impl 0x15 ver 0x23 clock 900 MHz)
The physical processor has 1 virtual processor (1)
  UltraSPARC-III  (portid 1 impl 0x15 ver 0x23 clock 900 MHz)
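
For a quick count, the psrinfo output can be combined with standard tools; Solaris 10 also provides -p to print the number of physical processors. A minimal sketch, with output shown for the two-processor system above:

$ psrinfo | wc -l        # virtual processors
       2
$ psrinfo -p             # physical processors (Solaris 10)
2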


uptime

The uptime command is a quick way to print the CPU load averages.

$ uptime
 12:29am  up 274 day(s), 6 hr(s),  7 users,  load average: 2.00, 1.07, 0.46

The numbers are the 1-, 5-, and 15-minute load averages. They represent both utilization and saturation of the CPUs. Put simply, a value equal to your CPU count usually means 100% utilization; less than your CPU count is proportionally less than 100% utilization; and greater than your CPU count is a measure of saturation. To compare a load average between servers, divide the load average by the CPU count for a consistent metric.

Because 1-, 5-, and 15-minute averages are provided, a recently increasing or decreasing CPU load can be identified. The previous uptime example demonstrates an increasing profile (2.00, 1.07, 0.46).
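
The per-CPU comparison described above is easily scripted. The following is a minimal sketch in Bourne shell, assuming the uptime output format shown above; the result uses the 1-minute load average (2.00) and the two-CPU system from the psrinfo example:

$ ncpu=$(psrinfo | wc -l)
$ load1=$(uptime | awk -F'load average: ' '{ print $2 }' | cut -d, -f1)
$ echo "scale=2; $load1 / $ncpu" | bc     # load per CPU; > 1.00 suggests saturation
1.00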


sar

The system activity reporter (sar) can provide live statistics or can be activated to record historical CPU statistics. This can be of tremendous value because you may identify long-term patterns that you might have missed when taking a quick look at the system. Also, historical data provides a reference for what is "normal" for your system.

The following example shows the default output of sar, which is also the -u option to sar. An interval of 1 second and a count of 5 were specified.

$ sar 1 5
SunOS titan 5.11 snv_16 sun4u    02/27/2006
03:20:42    %usr    %sys    %wio   %idle
03:20:43      82      17       0       1
03:20:44      92       8       0       0
03:20:45      91       9       0       0
03:20:46      94       6       0       0
03:20:47      93       7       0       0
Average       91       9       0       0

sar has printed the user (%usr), system (%sys), wait I/O (%wio), and idle (%idle) times. User, system, and idle are also printed by the vmstat command and are defined in Table 2-2.
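
To record historical data, the sadc collector is typically enabled by uncommenting the sys crontab entries that run /usr/lib/sa/sa1 and /usr/lib/sa/sa2; the daily binary files can then be read back with the -f option. A minimal sketch, assuming the conventional /var/adm/sa location:

$ sar -u -f /var/adm/sa/sa`date +%d`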

mpstat

The mpstat command summarizes the utilization statistics for each CPU. The following output is an example from a system with 32 CPUs.

$ mpstat 1
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0  279   267  112  106    7    7   85    0   219   85   2   0  13
  1    1   0  102    99    0  177    9   15  119    2   381   72   3   0  26
  2    2   0   75   130    0  238   19   11   98    5   226   70   3   0  28
  3    1   0   94    32    0   39    8    6   95    2   380   81   2   0  17
  4    1   0   70    75    0  128   11    9   96    1   303   66   2   0  32
  5    1   0   88    62    0   99    7   11   89    1   370   74   2   0  24
  6    4   0   78    47    0   85   24    6   67    8   260   86   2   0  12
  7    2   0   73    29    0   45   21    5   57    7   241   85   1   0  14
  8    2   0   94    21    0   17   10    3   73    0   392   86   2   0  12
  9    3   0   64   106    0  198   23   12   85    7   209   66   2   0  32
 10    2   0   23   164    0  331   20   13   85    7    89   17   2   0  81
 11    3   0   31   105    0  200   14   11   63    4   144   33   2   0  66
 12    0   0   47   129    0  260    3    8   86    0   248   33   2   0  65
 13    4   0   76    48    0   90   25    5   77    8   255   75   2   0  24
 14    5   0   39   130    0  275   17    9   83   10   158   36   2   0  62
 15    2   0   67    99    0  183   18    5  101    4   207   72   2   0  25
 16    3   0   79    61    0  103   20    2   89    3   252   83   2   0  15
 17    2   0   17    64    0  123    8    7   38    2    65   18   1   0  81
 18    4   0   86    22    0   21   21    0   98    4   283   98   2   0   0
 19    1   0   10    75    0  148    9    5   38    1    47    9   3   0  88
 20    1   0   49   108    1  208    4    6   85    0   252   29   2   0  69
 21    5   0   90    46    0   77   29    1   75    8   277   92   2   0   6
 22    2   0   56    98    0  186   15    5   71    3   176   59   2   0  39
 23    5   0   37   156    0  309   19    6   75    4   136   39   2   0  59
 24    0   0   32    51    0   97    2    3   32    1   198   15   1   0  83
 25    8   0   82    56    0  142   13    8   87   13   294   82   2   0  16
 26    2   0   66    40    0   75   12    3   66    2   237   73   2   0  25
 27    6   0   80    33    0   57   21    5   89    3   272   86   2   0  13
 28    1   0   97    35    0   56    7    3   94    2   369   76   2   0  22
 29    4   0   83    44    0   79   27    3   69    7   286   89   2   0   9
 30    1   0   95    41    1   69    8    4  105    1   382   80   2   0  18
 31    5   0   16    31    2   99    5    9   20   15    97    9   1   0  90
...

For each CPU, a set of event counts and utilization statistics is reported. The first output printed is the summary since boot. After vmstat has been checked, the mpstat processor utilization metrics are often the next port of call to ascertain how busy the system CPUs are. Processor utilization is reported by the percent user (usr), system (sys), wait I/O (wt), and idle (idl) times, which have the same meanings as the equivalent columns from vmstat [2.2] and sar [2.7].
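
Unbalanced utilization can be spotted by scanning the per-CPU idle times. The following one-liner is a minimal sketch, assuming the column layout shown above (idl is the last field); it filters the repeated header lines, and note that the first batch of output is the since-boot summary:

$ mpstat 5 | awk '$1 != "CPU" && $NF < 20 { print "CPU", $1, "is only", $NF "% idle" }'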

How do I determine how the CPU resources are being consumed?

Using prstat to Determine Who/What Is Using the CPU

The prstat command was introduced in Solaris 8 to provide real-time process status in a meaningful way (it resembles top, the original freeware tool written by William LeFebvre). prstat uses procfs, the /proc file system, to fetch process details (see proc(4)), and the getloadavg() syscall to get load averages.

$ prstat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 25639 rmc      1613M   42M cpu22    0   10   0:33:10 3.1% filebench/2
 25655 rmc      1613M   42M cpu23    0   10   0:33:10 3.1% filebench/2
 25659 rmc      1613M   42M cpu30    0   10   0:33:11 3.1% filebench/2
 25644 rmc      1613M   42M cpu14    0   10   0:33:10 3.1% filebench/2
 25658 rmc      1613M   42M cpu16    0   10   0:33:10 3.1% filebench/2
 25636 rmc      1613M   42M cpu21    0   10   0:33:10 3.1% filebench/2
 25646 rmc      1613M   42M cpu15    0   10   0:33:10 3.1% filebench/2
 25661 rmc      1613M   42M cpu8     0   10   0:33:11 3.1% filebench/2
 25652 rmc      1613M   42M cpu20    0   10   0:33:09 3.1% filebench/2
 25647 rmc      1613M   42M cpu0     0   10   0:33:10 3.1% filebench/2
 25641 rmc      1613M   42M cpu27    0   10   0:33:10 3.1% filebench/2
 25656 rmc      1613M   42M cpu7     0   10   0:33:10 3.1% filebench/2
 25634 rmc      1613M   42M cpu11    0   10   0:33:11 3.1% filebench/2
 25637 rmc      1613M   42M cpu17    0   10   0:33:10 3.1% filebench/2
 25643 rmc      1613M   42M cpu12    0   10   0:33:10 3.1% filebench/2
 25648 rmc      1613M   42M cpu1     0   10   0:33:10 3.1% filebench/2
 25640 rmc      1613M   42M cpu26    0   10   0:33:10 3.1% filebench/2
 25651 rmc      1613M   42M cpu31    0   10   0:33:10 3.1% filebench/2
 25654 rmc      1613M   42M cpu29    0   10   0:33:10 3.1% filebench/2
 25650 rmc      1613M   42M cpu5     0   10   0:33:10 3.1% filebench/2
 25653 rmc      1613M   42M cpu10    0   10   0:33:10 3.1% filebench/2
 25638 rmc      1613M   42M cpu18    0   10   0:33:10 3.1% filebench/2
Total: 91 processes, 521 lwps, load averages: 29.06, 28.84, 26.68
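
prstat sorts by CPU usage by default, and its output can be limited to the busiest processes. For example, the following minimal sketch (using standard Solaris 10 options) reports the top five CPU consumers every 5 seconds:

$ prstat -s cpu -n 5 5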

CPU Run Queue Latency

The microstate accounting system maintains accurate time counters for threads as well as CPUs. Thread-based microstate accounting tracks several meaningful states per thread in addition to user and system time: trap time, lock time, sleep time, and latency time. The process statistics tool, prstat, reports the per-thread microstates for user processes.

$ prstat -mL
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 25644 rmc       98 1.5 0.0 0.0 0.0 0.0 0.0 0.1   0  36 693   0 filebench/2
 25660 rmc       98 1.7 0.1 0.0 0.0 0.0 0.0 0.1   2  44 693   0 filebench/2
 25650 rmc       98 1.4 0.1 0.0 0.0 0.0 0.0 0.1   0  45 699   0 filebench/2
 25655 rmc       98 1.4 0.1 0.0 0.0 0.0 0.0 0.2   0  46 693   0 filebench/2
 25636 rmc       98 1.6 0.1 0.0 0.0 0.0 0.0 0.2   1  50 693   0 filebench/2
 25651 rmc       98 1.6 0.1 0.0 0.0 0.0 0.0 0.2   0  54 693   0 filebench/2
 25656 rmc       98 1.5 0.1 0.0 0.0 0.0 0.0 0.2   0  60 693   0 filebench/2
 25639 rmc       98 1.5 0.1 0.0 0.0 0.0 0.0 0.2   1  61 693   0 filebench/2
 25634 rmc       98 1.3 0.1 0.0 0.0 0.0 0.0 0.4   0  63 693   0 filebench/2
 25654 rmc       98 1.3 0.1 0.0 0.0 0.0 0.0 0.4   0  67 693   0 filebench/2
 25659 rmc       98 1.7 0.1 0.0 0.0 0.0 0.0 0.4   1  68 693   0 filebench/2
 25647 rmc       98 1.5 0.1 0.0 0.0 0.0 0.0 0.4   0  73 693   0 filebench/2
 25648 rmc       98 1.6 0.1 0.0 0.0 0.0 0.3 0.2   2  48 693   0 filebench/2
 25643 rmc       98 1.6 0.1 0.0 0.0 0.0 0.0 0.5   0  75 693   0 filebench/2
 25642 rmc       98 1.4 0.1 0.0 0.0 0.0 0.0 0.5   0  80 693   0 filebench/2
 25638 rmc       98 1.4 0.1 0.0 0.0 0.0 0.0 0.6   0  76 693   0 filebench/2
 25657 rmc       97 1.8 0.1 0.0 0.0 0.0 0.4 0.3   6  64 693   0 filebench/2
 25646 rmc       97 1.7 0.1 0.0 0.0 0.0 0.0 0.6   6  83 660   0 filebench/2
 25645 rmc       97 1.6 0.1 0.0 0.0 0.0 0.0 0.9   0  55 693   0 filebench/2
 25652 rmc       97 1.7 0.2 0.0 0.0 0.0 0.0 0.9   2 106 693   0 filebench/2
 25658 rmc       97 1.5 0.1 0.0 0.0 0.0 0.0 1.0   0  72 693   0 filebench/2
 25637 rmc       97 1.7 0.1 0.0 0.0 0.0 0.3 0.6   4  95 693   0 filebench/2
Total: 91 processes, 510 lwps, load averages: 28.94, 28.66, 24.39

Another example, shown below, captures what we would observe when the system is CPU-resource constrained: on average, each thread is waiting for CPU resources about 80% of the time.

By specifying the -m (show microstates) and -L (show per-thread) options, we can observe the per-thread microstates. These microstates represent a time-based summary broken into percentages for each thread. The columns USR through LAT sum to 100% of the time spent for each thread during the prstat sample. The important microstates for CPU utilization are USR, SYS, and LAT. The USR and SYS columns are the user and system time that this thread spent running on the CPU. The LAT (latency) column is the amount of time spent waiting for CPU. A non-zero number means there was some queuing for CPU resources.


$ prstat -mL
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 25765 rmc       22 0.3 0.1 0.0 0.0 0.0 0.0  77   0  42 165   0 filebench/2
 25833 rmc       22 0.3 0.3 0.0 0.0 0.0 0.0  77   0 208 165   0 filebench/2
 25712 rmc       20 0.2 0.1 0.0 0.0 0.0 0.0  80   0  53 132   0 filebench/2
 25758 rmc       20 0.3 0.1 0.0 0.0 0.0 0.0  80   0  84 148   0 filebench/2
 25715 rmc       20 0.3 0.1 0.0 0.0 0.0 0.0  80   0  56 132   0 filebench/2
 25812 rmc       19 0.2 0.1 0.0 0.0 0.0 0.0  81   0  50 132   0 filebench/2
 25874 rmc       19 0.2 0.0 0.0 0.0 0.0 0.0  81   0  22 132   0 filebench/2
 25842 rmc       19 0.2 0.2 0.0 0.0 0.0 0.0  81   1  92 132   0 filebench/2
 25732 rmc       19 0.2 0.1 0.0 0.0 0.0 0.0  81   0  54  99   0 filebench/2
 25714 rmc       18 0.3 0.1 0.0 0.0 0.0 0.0  81   0  84 165   0 filebench/2
 25793 rmc       18 0.3 0.1 0.0 0.0 0.0 0.0  81   0  30 132   0 filebench/2
 25739 rmc       18 0.3 0.3 0.0 0.0 0.0 0.0  81   0 150 115   0 filebench/2
 25849 rmc       18 0.3 0.0 0.0 0.0 0.0 0.0  81   1  19 132   0 filebench/2
 25788 rmc       18 0.2 0.1 0.0 0.0 0.0 0.0  81   0  77  99   0 filebench/2
 25760 rmc       18 0.2 0.0 0.0 0.0 0.0 0.0  82   0  26 132   0 filebench/2
 25748 rmc       18 0.3 0.1 0.0 0.0 0.0 0.0  82   0  58 132   0 filebench/2
 25835 rmc       18 0.3 0.1 0.0 0.0 0.0 0.0  82   0  65 132   0 filebench/2
 25851 rmc       18 0.2 0.1 0.0 0.0 0.0 0.0  82   0  87  99   0 filebench/2
 25811 rmc       18 0.3 0.2 0.0 0.0 0.0 0.0  82   0 129 132   0 filebench/2
 25767 rmc       18 0.2 0.1 0.0 0.0 0.0 0.0  82   1  25 132   0 filebench/2
 25740 rmc       18 0.3 0.2 0.0 0.0 0.0 0.0  82   0 118 132   0 filebench/2
 25770 rmc       18 0.2 0.1 0.0 0.0 0.0 0.0  82   0  68 132   0 filebench/2
Total: 263 processes, 842 lwps, load averages: 201.45, 192.26, 136.16
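
The LAT microstate can be explored further with DTrace. The following script is a minimal sketch using the sched provider's enqueue and dequeue probes to aggregate dispatcher queue wait time, as a distribution in nanoseconds, by process name:

# dtrace -n '
  sched:::enqueue
  {
          /* timestamp the thread as it is placed on a dispatcher queue */
          start[args[0]->pr_lwpid, args[1]->pr_pid] = timestamp;
  }
  sched:::dequeue
  /start[args[0]->pr_lwpid, args[1]->pr_pid]/
  {
          /* queue latency is the time from enqueue to dequeue */
          @[args[1]->pr_fname] = quantize(timestamp -
              start[args[0]->pr_lwpid, args[1]->pr_pid]);
          start[args[0]->pr_lwpid, args[1]->pr_pid] = 0;
  }'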

Using DTrace to Explain Events From Performance Tools

DTrace can be used to attribute the source of events observed in higher-level tools such as mpstat(1M). For example, if we see a significant amount of system time (%sys) and a high system call rate (syscl), then we might want to know who or what is causing those system calls.

# mpstat 2
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0  117   0 1583   883  111 1487  593  150 6104   64 11108    7  92   0   1
  1  106   0  557   842    0 1804  694  150 6553   84 10684    6  93   0   1
  2  112   0  664   901    0 1998  795  143 6622   64 11227    6  93   0   1
  3   95   0  770  1035    0 2232  978  131 6549   59 11769    7  92   0   1
^C
# dtrace -n 'syscall:::entry { @[execname] = count(); }'
dtrace: description 'syscall:::entry ' matched 229 probes
^C
  inetd                                                             1
  svc.configd                                                       1
  fmd                                                               2
  snmpdx                                                            2
  utmpd                                                             2
  svc.startd                                                       13
  sendmail                                                         30
  snmpd                                                            36
  nscd                                                            105
  dtrace                                                         1311
  filebench                                                   3739725

Using the DTrace syscall provider, we can quickly identify which process is causing the most system calls. This dtrace one-liner measures system calls by process name. In this example, processes with the name filebench caused 3,739,725 system calls during the time the dtrace command was running. We can then drill deeper by matching the syscall probe only when the execname matches our investigation target, filebench, and counting the syscall name.

# dtrace -n 'syscall:::entry /execname == "filebench"/ { @[probefunc] = count(); }'
dtrace: description 'syscall:::entry ' matched 229 probes
^C
  lwp_continue                                                      4
  lwp_create                                                        4
  mmap                                                              4
  schedctl                                                          4
  setcontext                                                        4
  lwp_sigmask                                                       8
  nanosleep                                                        24
  yield                                                           554
  brk                                                            1590
  pwrite                                                        80795
  lwp_park                                                     161019
  read                                                         324159
  pread                                                        898401
  semsys                                                      1791717

Ah, so we can see that the semsys syscall is hot in this case. Let's look at what is calling semsys by using the ustack() DTrace action.

# dtrace -n 'syscall::semsys:entry /execname == "filebench"/ { @[ustack()] = count();}'
dtrace: description 'syscall::semsys:entry ' matched 1 probe
^C            
              libc.so.1`_syscall6+0x1c
              filebench`flowop_start+0x408
              libc.so.1`_lwp_start
            10793

              libc.so.1`_syscall6+0x1c
              filebench`flowop_start+0x408
              libc.so.1`_lwp_start
            10942

              libc.so.1`_syscall6+0x1c
              filebench`flowop_start+0x408
              libc.so.1`_lwp_start
            11084

We have now identified the hot system call and obtained the hottest user stack traces leading to it. We conclude by observing that the filebench flowop_start function performs the majority of semsys system calls on the system.
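
System calls are only one source of CPU consumption. As a complementary technique, on-CPU user time can be sampled with the DTrace profile provider; this one-liner is a minimal sketch that samples the user stacks of filebench at 997 Hz (arg1 is non-zero when the CPU is in user mode):

# dtrace -n 'profile-997 /arg1 && execname == "filebench"/ { @[ustack()] = count(); }'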

What about CMT Processors?

CMT processors complicate utilization metrics because they present multiple Solaris-visible CPUs that share a single core. This is discussed further in CMT Utilization.
