DTrace Topics Locks
From Siwiki
Contents |
[edit] DTrace Topics: Locks
This article is about lock analysis using DTrace, and is part of the DTrace Topics collection. A general understanding of DTrace is assumed knowledge, which can be studied from the DTrace Topics: Intro section.
DTrace is a dynamic troubleshooting and analysis tool first introduced in the Solaris 10 and OpenSolaris operating systems. Locks are a thread synchronisation mechanism.
Completion
Difficulty
Audience Developers, senior sysadmins
[edit] Lock Analysis
Locks are a mechanism to synchronise threads, and are used extensively throughout the kernel and in multi-threaded applications. The main types are,
- Mutex locks
- Spin locks
- Reader/Writer locks
- Conditional variables
The way locks are used by an application, the workload applied, and the architecture of the system, can all lead to lock contention issues which can degrade performance.
[edit] Strategy
For getting started with lock analysis,
- study lock types
- The approach for analysing lock behaviour greatly depends on the lock type.
- Sections that follow will introduce different lock types.
- try the commands
- Solaris now has lockstat and plockstat for lock analysis.
- write custom scripts
- DTrace provides probes to trace lock activity, allowing custom scripts for lock analysis to be written.
[edit] Using DTrace
DTrace is very suitable for lock analysis, due to it's ability to dynamically instrument lock events as required. The following commands and providers can be used for lock analysis, and were first shipped with the Solaris 10.
[edit] Commands
The following commands were first shipped with Solaris 10, and use the DTrace framework to perform lock analysis,
| Command | Description |
| /usr/sbin/plockstat | perform user lock analysis |
| /usr/sbin/lockstat | perform kernel lock analysis |
| /usr/sbin/dtrace | perform custom lock analysis using DTrace providers |
[edit] Providers
The following DTrace providers can be used to examine lock events in custom detail,
| Provider | Description |
| plockstat | trace user lock events |
| lockstat | trace kernel lock events |
| pid | trace raw user library events |
| fbt | trace raw kernel events |
These providers can trace fundamental lock activity including,
- when locks are requested
- when threads block on locks
- when locks are acquired
All with nanosecond timestamps, and with additional details such as stack traces.
The DTrace Guide documents these providers in seperate chapters.
[edit] User
These are locks from user-land applications, such as web servers, databases and application servers.
Before you analyse your applications, it is best to study something simple and become comfortable with that first. For this reason, simple example programs have been provided below which exibit lock issues, for you to practise analysing. These are good examples of issues, not good examples of programming (to conserve space no error checking is performed).
To ensure that the following examples cause lock contention, make sure to run them on a multi-CPU server,
$ psrinfo 0 on-line since 04/05/2007 13:49:52 1 on-line since 04/06/2007 11:29:59
[edit] Mutex Locks
[edit] Example #1
The following simple program creates four threads which run the run_amock() function. This function has a loop which increments a global integer g_count, and prints a message whenever g_count is equal to 123456789. A mutex lock g_lock is used to ensure that only one thread is writing to this variable at a time.
mutex.c,
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define THREAD_NUM 4
pthread_t g_thread[THREAD_NUM];
pthread_mutex_t g_mutex;
uint64_t g_count;
void *
run_amuck(void *arg)
{
int i, j;
printf("Thread %d started.\n", (int)arg);
for (i = 0; i < 10000; i++) {
pthread_mutex_lock(&g_mutex);
for (j = 0; j < 100000; j++) {
if (g_count++ == 123456789)
printf("Thread %d wins!\n", (int)arg);
}
pthread_mutex_unlock(&g_mutex);
}
return (NULL);
}
int
main(int argc, char *argv[])
{
int i, threads = THREAD_NUM;
printf("Creating %d threads...\n", threads);
pthread_mutex_init(&g_mutex, NULL);
for (i = 0; i < threads; i++)
pthread_create(&g_thread[i], NULL, run_amuck, (void *) i);
for (i = 0; i < threads; i++)
pthread_join(g_thread[i], NULL);
printf("Done.\n");
return (0);
}
[edit] Output
Compiling and running the program is straightforward,
$ cc -o mutex mutex.c $ ./mutex Creating 4 threads... Thread 0 started. Thread 3 started. Thread 2 started. Thread 1 started. Thread 2 wins! Done.
[edit] Block Time
The plockstat command can be used to measure mutex block time, which is the duration threads are waiting for a lock. These are contention events, which plockstat measures by default (or specifically with -C). We also run the program through ptime, to get a rough idea of overall execution time (or we could truncate traced time using plockstat -e),
# ptime plockstat ./mutex Creating 4 threads... Thread 0 started. Thread 3 started. Thread 2 started. Thread 1 started. Thread 2 wins! Done. plockstat: pid 1678 has exited Mutex block Count nsec Lock Caller ------------------------------------------------------------------------------- 310 317871374 mutex`g_mutex mutex`run_amuck+0x36 real 33.846 user 0.373 sys 0.200
The columns above are,
| Count | number of times a mutex blocked |
| nsec | average duration of the mutex block |
| Lock | the lock name from the source code |
| Caller | the function in the source where this mutex was blocked |
The values show that our g_mutex example mutex blocked 310 times, with an average block time of 317871374 nanoseconds each. This multiplies to give a total block time of 98.54 seconds. Does that make sense?
The mutex.c program has 4 threads fighting over 1 mutex lock, so only 1 thread will run while 3 are blocked. Our rough execution time from ptime was 33.846 seconds, and since 3 threads are blocked at a time, our expected block time is (33.846 seconds) x (3 threads), which is 101.538 seconds. The measured value of 98.54 seconds is consistant, and slightly lower due to time lost from other overheads during program execution (execution of main() and also DTrace initializing).
Block time means a thread is waiting to do work, but can't, which usually increases experienced latency of the application. plockstat lets you quickly measure the magnitude of the issue, and identify both the lock name and the function that it was called from.
Since this was run on a 2 x CPU server, while 1 thread runs on a CPU, another CPU is idle and 3 threads want to run. If the program could be rewritten to avoid this lock contention, the speed could potentially double on this 2 x CPU server, as 2 threads could be running on CPU.
[edit] Block Time with Stacks
A great feature of plockstat is the ability to measure stack traces for lock events. Here we measure up to 8 stack frames (lines),
# plockstat -s8 ./mutex
Creating 4 threads...
Thread 0 started.
Thread 3 started.
Thread 2 started.
Thread 1 started.
Thread 3 wins!
^C
Mutex block
-------------------------------------------------------------------------------
Count nsec Lock Caller
174 541065216 mutex`g_mutex mutex`run_amuck+0x36
nsec ---- Time Distribution --- count Stack
4194304 | | 2 libc.so.1`slow_lock+0x40
8388608 | | 2 libc.so.1`mutex_lock_impl+0x14b
16777216 | | 0 libc.so.1`mutex_lock+0x1d
33554432 | | 3 mutex`run_amuck+0x36
67108864 | | 7 libc.so.1`_thr_setup+0x52
134217728 |@@@ | 27 libc.so.1`_lwp_start
268435456 |@@@@ | 35
536870912 |@@@@@@@@ | 62
1073741824 |@@@ | 28
2147483648 |@ | 8
A distribution plot has been printed to show the profile of blocked time; in the output abover the most frequent block time was in the 0.536 to 1.074 second bucket, for which there were 62 events.
Next to the distribution plot there is a stack trace, which shows (from bottom to top) the functions that led to the run_amuck() function being called, then run_amuck(), then the library functions that run_amuck() called to perform the mutex lock.
[edit] Held Time
The held time is the duration that a thread holds a lock for, which is the time other threads are prevented from fetching the lock. We'd like the hold time to be small, so that the lock is usually free when threads want it. It can be measured using plockstat and -H,
# ptime plockstat -H ./mutex
Creating 4 threads...
Thread 0 started.
Thread 3 started.
Thread 2 started.
Thread 1 started.
Thread 0 wins!
Done.
plockstat: pid 1840 has exited
Mutex hold
Count nsec Lock Caller
-------------------------------------------------------------------------------
38524 831188 mutex`g_mutex mutex`run_amuck+0xe7
1476 830804 mutex`g_mutex mutex`run_amuck+0xe7
1 122174 libc.so.1`_xftab+0x44 mutex`run_amuck+0xc5
4 27370 libc.so.1`_xftab+0x44 mutex`run_amuck+0x14
3 35298 libc.so.1`_uberdata+0xe40 libc.so.1`find_stack+0x1bf
1 50637 libc.so.1`_uberdata+0xfc0 0
4 9958 libc.so.1`_uberdata libc.so.1`_thr_exit_common+0xb4
4 7194 libc.so.1`_uberdata+0x40 libc.so.1`pthread_create+0x22a
4 7152 libc.so.1`_uberdata mutex`main+0x96
1 27840 libc.so.1`_uberdata+0x40 libc.so.1`libc_init+0x2de
1 24559 libc.so.1`_xftab+0x44 mutex`main+0xb6
4 5824 libc.so.1`_uberdata mutex`main+0x5b
1 20192 libc.so.1`update_lock libc.so.1`libc_init+0x455
1 10902 libc.so.1`_first_link_lock libc.so.1`_exithandle+0x53
4 2262 libc.so.1`_uberdata libc.so.1`_thr_exit_common+0xb4
6 1462 libc.so.1`_uberdata+0xd40 libc.so.1`exit+0x12
1 8553 0xfee5c080 libc.so.1`_thr_exit_common+0xb4
1 7729 0xfee5c0c0 libc.so.1`_thr_exit_common+0xb4
4 1702 libc.so.1`_uberdata+0x1000 libc.so.1`_t_cancel+0x10e
1 6327 0xfee5c080 libc.so.1`pthread_create+0x22a
1 5566 libc.so.1`_uberdata+0x1000 LM1`ld.so.1`rt_thr_init+0x35
1 5271 libc.so.1`_uberdata+0xd40 libc.so.1`libc_init+0x2de
1 4476 0xfee5c080 mutex`main+0x96
3 1433 libc.so.1`_uberdata libc.so.1`_thrp_create+0xc5
3 1363 libc.so.1`_uberdata libc.so.1`pthread_create+0x120
1 3853 0xfee5c140 libc.so.1`_thr_exit_common+0xb4
2 1691 libc.so.1`_uberdata+0xe00 libc.so.1`_t_cancel+0x10e
1 3314 0xfee5c100 libc.so.1`_thr_exit_common+0xb4
1 2502 0xfee5c0c0 libc.so.1`pthread_create+0x22a
1 2417 0xfee5c140 libc.so.1`pthread_create+0x22a
1 2395 0xfee5c100 libc.so.1`pthread_create+0x22a
1 2342 0xfee5c080 libc.so.1`pthread_create+0x120
1 2231 libc.so.1`_uberdata+0xa00 libc.so.1`pthread_create+0x120
1 1912 libc.so.1`_uberdata+0xd40 0x8050815
1 1831 0xfee5c0c0 libc.so.1`pthread_create+0x120
1 1773 0xfee5c100 mutex`main+0x96
1 1762 0xfee5c100 libc.so.1`pthread_create+0x120
1 1684 libc.so.1`_uberdata+0xfc0 0x8050815
1 1643 0xfee5c140 libc.so.1`pthread_create+0x120
1 1625 0xfee5c0c0 mutex`main+0x96
1 1599 libc.so.1`_xftab+0x8 libc.so.1`_exithandle+0x53
1 1572 0xfee5c140 mutex`main+0x96
1 1517 libc.so.1`_xftab+0x44 libc.so.1`_exithandle+0x53
1 1501 libc.so.1`_uberdata mutex`main+0x5b
1 1495 libc.so.1`_uberdata+0xe40 libc.so.1`_thrp_create+0xc5
1 1473 libc.so.1`_uberdata+0xd40 libc.so.1`libc_init+0x45a
1 1470 libc.so.1`_uberdata+0xfc0 libc.so.1`pthread_create+0x120
1 1438 libc.so.1`_uberdata+0xd40 libc.so.1`_thrp_create+0x54
1 1405 libc.so.1`_uberdata+0xd40 0x805081f
1 1378 libc.so.1`_uberdata libc.so.1`pthread_create+0x120
1 1356 libc.so.1`_uberdata+0xfc0 libc.so.1`pthread_create+0x120
1 1332 libc.so.1`_uberdata+0xd40 libc.so.1`_thrp_create+0x54
1 1328 libc.so.1`_uberdata+0xd40 libc.so.1`_thrp_create+0x54
1 1318 libc.so.1`_uberdata+0xfc0 libc.so.1`pthread_create+0x120
1 1262 libc.so.1`_uberdata+0xfc0 0x805081f
real 33.990
user 0.368
sys 0.132
The output fields are,
| Count | number of times a mutex was held |
| nsec | average duration the mutex was held |
| Lock | the lock name from the source code |
| Caller | the function in the source where this mutex was held |
The top two hold times are our run_amuck() function holding the g_mutex lock, with counts of 38524 and 1476. These add to 40000, which is the number of times the mutex.c program holds that lock. Multiplying with the average hold times, gives the total hold time for g_mutex to be 33.247 seconds. This is consistant with the rough execution time that ptime printed of 33.990 seconds, as due to the mutex lock only 1 thread will be running at a time during program execution.
[edit] Spin Locks
[edit] RW Locks
[edit] Conditional Variables
[edit] Kernel
These are locks used within the kernel and its drivers.
