DTrace Topics Databases
From Siwiki
Contents |
[edit] DTrace Topics: Databases
This article is about using DTrace to analyze Databases, and is part of the DTrace Topics collection. A general understanding of DTrace is assumed knowledge, which can be studied from the DTrace Topics: Intro section.
DTrace is a dynamic troubleshooting and analysis tool first introduced in the Solaris 10 and OpenSolaris operating systems.
Completion
Difficulty
Audience DBAs, developers, senior sysadmins
[edit] Database Analysis
DTrace can assist database analysis in two key ways:
- DTrace can measure behaviour across the entire software stack
- This may reveal issues with the disk devices, kernel drivers, user-level locks and any other system component.
- DTrace can provide a deep view of internal database operation on demand
- Going far beyond shipped metrics and standard analyzers.
[edit] Getting Started
A recommended approach for getting started with DTrace database analysis is:
- Use the shipped database statistics and analyzers that are already available.
- Most databases can provide exhaustive statistics, if they are enabled, and also analyzers to observe operation. Understanding your available database statistics is a good introduction to which type of metrics may be important, and should lead to ideas where DTrace can extend observability (rather than re-invent observability).
- Search for a matching DTrace database provider.
- This will be a high level provider which will allow consise scripts to be written to observe fundamental database behaviour, for example - the postgresql provider. This page will list these providers for each database in the sections that follow.
- DTrace system behaviour from other providers.
- DTrace is about observing everything, not just the database. Some issues may be easier to see, and much easier to prove, if observed from other layers of the software stack. For example, observing disk events from the io provider (which resembles physical calls to disks), rather than from database I/O.
- Consider using the pid provider to examine raw database operation.
- This can be a difficult approach, as you are observing the raw unvarnished code of the database in flight. A specific database provider is preferable, but one may not have been written for your database yet (see the sections that follow to see if one has). Also to consider - does your database vendor allow you to examine the operation of their code (conditions of use)?
[edit] Privileges
By default, only the root user can use DTrace. Others, such as database administrators that don't have the root password, need to have certain DTrace privileges assigned to their user account to be able to use DTrace. These privileges were introduced in Solaris 10 as part of the "Least Privilege" feature. See DTrace Topics Intro: Consumer for an introduction to DTrace privileges.
[edit] Perspectives
There are two perspectives to consider for DTrace database analysis:
- server - observing the operation of the server and its resources.
- Here we can watch all client requests, and monitor crucial resources such as CPUs, memory, disks and network interfaces. Systemic bottlenecks can be identified and eliminated, and the database can be tuned to optimally process workloads.
- client - observing the client as it communicaties with the database server.
- Here we can measure the actual latency suffered by the client, and examine the operation of the database interface library.
[edit] Providers
DTrace providers for database analysis include:
| Target | DTrace Provider | Description |
| server | pid | trace raw database operation as it parses, plans and executes queries. |
| server | io | trace disk events, measure volume and latency |
| server | fbt | trace raw kernel operation |
| client | pid | trace client requests, library behaviour, client latency |
| client | syscall | trace database I/O for basic analysis |
The pid provider is most useful with open source software, as the code that it instruments can be examined and understood. For example, MySQL is an open source database.
[edit] Database Providers
The database server and client interface may provider their own intergrated and stable providers, in addition to those listed in Providers. The table below is a summary (but might be slightly out of date).
Intergrated Database DTrace providers:
| Database | DTrace Provider | Description |
| Oracle | ? | status not known |
| MySQL | ? | status not known |
| Postgres | postgresql | PostgreSQL Provider |
[edit] Generic
Apart from analysing the database software itself, DTrace allows us to examine the the rest of the system and the behaviour of its resources. This may be used to:
- pinpoint system bottlenecks
- confirm optimal operation of resources
- provide measurements for capacity planning
Performance analysis from the database itself may not provide sufficient answers to these.
This system analysis applies to all database products.
[edit] Server
[edit] Client
[edit] MySQL
MySQL is a popular open source database. It is shipped with the Solaris 10 operating system.
[edit] Server
Currently there is no intergrated mysqld provider available.
[edit] pid Provider
Since MySQL is open source, the code can be inspected for suitable instrumentation using the dynamic pid provider.
In Brendan's opinion: the server code itself is quite easy code to follow, considering this is a reasonably complex application. Function and variable names are meaningful, there are some comments (there could be more), but best of all is that the code is very straight forward - few tricks. It might take you a day to learn the basics, and a week to learn many of the ins and outs - assuming you can program in C++.
The following examples of pid provider tracing were for mysql-5.1.17-beta. There is no guarentee that these will work on other versions of MySQL - such is the nature of dynamic pid tracing (and the need for stable intergrated providers).
[edit] Query Snoop
The server processes queries with the function mysql_parse,
- mysql_parse(THD *thd, char *inBuf, uint length)
The second argument, inBuf, sounds interesting. The code shows that this is the query string that is about to be parsed. The following one-liner traces mysql_parse, and prints inBuf along with a timestamp,
# dtrace -qn 'pid$target::*mysql_parse*:entry { printf("%Y %s\n", walltimestamp, copyinstr(arg1)); }'
-p `pgrep -x mysqld`
2007 Jun 3 23:28:39 select * from months where num > 6
2007 Jun 3 23:28:45 select * from months
2007 Jun 3 23:29:14 select * from words where name > 'fish'
^C
Great - this allows easy tracing of which queries are being requested. These may actually be invalid queries may be rejected before execution. For example:
# dtrace -qn 'pid$target::*mysql_parse*:entry { printf("%Y %s\n", walltimestamp, copyinstr(arg1)); }'
-p `pgrep -x mysqld`
2007 Jun 3 23:39:44 select blah blah I am an invalid query
^C
The above query returned an error. DTrace can also be used to trace the return value from functions, and so can be used to show which queries succeeded and which failed (although that would become a small script, rather than a one-liner).
Note that the above one-liners matched the process-ID using `pgrep -x mysqld` - this is fine if you only have one mysqld running, but if your server runs multiple instances of mysqld, you'll need to provide the PID instead.
[edit] Query Counting
Tracing queries on busy servers may produce rapid screenfulls of output. DTrace aggregations can help us here, by performing frequency counts on queries:
# dtrace -n 'pid$target::*mysql_parse*:entry { @[copyinstr(arg1)] = count(); }' -p `pgrep -x mysqld`
dtrace: description 'pid$target::*mysql_parse*:entry ' matched 1 probe
^C
select * from months where num < 4 1
select * from words where word > 'zoo' 1
select @@version_comment limit 1 1
select * from months 2
select * from months where num < 3 2
select * from words where name > 'zoo' 5
The above output shows queries and the number of times they were requested while DTrace was tracing.
Now for something a bit more interesting - here I aggregate queries on a MediaWiki server which is using MySQL:
# dtrace -n 'pid$target::*mysql_parse*:entry { @[copyinstr(arg1)] = count(); }' -p `pgrep -x mysqld`
dtrace: description 'pid$target::*mysql_parse*:entry ' matched 1 probe
^C
[...many lines truncated...]
/* User::checkNewtalk 192.168.1.198 */ SELECT user_ip FROM `user_newtalk` WHERE user_ip = '1
92.168.1.198' LIMIT 1 4
SELECT /* MediaWikiBagOStuff::_doquery 192.168.1.198 */ value,exptime FROM `objectcache` WHERE
keyname='wikidb:messages' 7
SELECT /* MediaWikiBagOStuff::_doquery 192.168.1.198 */ value,exptime FROM `objectcache` WHERE
keyname='wikidb:messages-hash' 7
SET /* Database::open 192.168.1.198 */ sql_mode = 10
/* Job::pop 192.168.1.198 */ SELECT * FROM `job` WHERE job_id >= 0 ORDER BY job_id LIMIT 1
11
COMMIT 17
BEGIN 19
The queries contain embedded functions, and are rather long - wordwrapping in some awkward places.
The following is the same output after some cleanup (removing embedded functions and fixing alignment):
[...many lines truncated...] SELECT user_ip FROM `user_newtalk` WHERE user_ip = '192.168.1.198' LIMIT 1 4 SELECT value,exptime FROM `objectcache` WHERE keyname='wikidb:messages' 7 SELECT value,exptime FROM `objectcache` WHERE keyname='wikidb:messages-hash' 7 SET sql_mode = ' ' 10 SELECT * FROM `job` WHERE job_id >= 0 ORDER BY job_id LIMIT 1 11 COMMIT 17 BEGIN 19
Identifying the most frequent queries may be a useful step in database tuning - to decide where potential optimizations can be made.
[edit] Execution Time
The final phase of processing a query (after parsing, planning and checking the query cache), is the execution. This is performed by mysql_execute_command, and the DTrace pid provider allows us to trace the entry and return of this function - the time for query execution. By tracing this in conjunction with mysql_parse, execution time by query can be performed.
The following script (mysqld_pid_etime.d) measures execution time by query, printing nanosecond values as distribution plots:
#!/usr/sbin/dtrace -s
/*
* mysqld_pid_etime.d - measure mysqld query execution latency.
* Written for Solaris 10 (needs DTrace).
*
* 01-Jun-2007, ver 0.50
*
* USAGE: ./mysqld_pid_etime.d -p `pgrep -x mysqld`
*
* This prints distribution plots of the elapsed time during the execution
* of MySQL statements, with a plot for each query string traced. This
* measure the execution stage only, not the parse or plan stages.
*
* This is written using the DTrace pid provider, which means it uses an
* unstable interface and is likely to stop working for future versions of
* mysql (this was tested on mysql-5.1.17-beta).
*
* 01-Jun-2007 Brendan Gregg Created this.
*/
#pragma D option quiet
dtrace:::BEGIN
{
printf("Tracing... Hit Ctrl-C to end.\n");
}
pid$target::*mysql_parse*:entry
{
self->query = copyinstr(arg1);
}
pid$target::*mysql_execute_command*:entry
{
self->start = timestamp;
}
pid$target::*mysql_execute_command*:return
/self->start/
{
this->elapsed = timestamp - self->start;
@time[self->query] = quantize(this->elapsed);
self->query = 0;
self->start = 0;
}
dtrace:::END
{
printf("MySQL Query execution latency (ns):\n");
printa(@time);
}
This script is run with some simple queries on the following tables:
- months - months of the year (12 rows)
- words - a dictionary of words (25143 rows)
# ./mysqld_pid_etime.d -p `pgrep -x mysqld`
Tracing... Hit Ctrl-C to end.
^C
MySQL Query execution latency (ns):
show databases
value ------------- Distribution ------------- count
131072 | 0
262144 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 5
524288 | 0
select * from words
value ------------- Distribution ------------- count
33554432 | 0
67108864 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
134217728 | 0
select * from words where name < 'bryan'
value ------------- Distribution ------------- count
8388608 | 0
16777216 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 4
33554432 | 0
67108864 | 0
134217728 |@@@@@@@@ 1
268435456 | 0
select * from words where name > 'a priori' and name < 'in situ'
value ------------- Distribution ------------- count
67108864 | 0
134217728 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2
268435456 | 0
The values are nanoseconds. The slowest queries fell in the range of 134 to 268 milliseconds.
Note that the "select * from words where name < 'bryan'" query had one slow response, and four much faster responses - this may be evidence of caching (we can use DTrace to trace the query cache to confirm if it was due to MySQL caching).
[edit] Client
[edit] PostgreSQL
- Version 1.0, Robert Lor (robert.lor@sun.com)
[edit] Overview
PostgreSQL 8.2 now has a number of DTrace probes embedded in the source code. These probes will enable developers and system administrators to easily observe the behavior of PostgreSQL with simple scripting programs called D scripts. The current set of probes is fairly small, but more can easily be added, and the community is encouraged to contribute more. The probes were added using the Generic Monitoring Framework. See this proposal for more details [1].
This document describes how to:
- Compile PostgreSQL with DTrace
- Use the existing DTrace probes
- Add new DTrace probes
[edit] Compile PostgreSQL with DTrace
By default DTrace probes are disabled, and the user needs to explicitly tell the configure script to make the probes available in PostgreSQL. Certainly, enabling DTrace only makes sense on Operating Systems with DTrace facility. Currently DTrace is available on Solaris 10+ and soon on FreeBSD and Mac OS X. To include DTrace probes in a 32 bit binary, specify --enable-dtrace to configure. For example:
$ configure --enable-dtrace ...
To include DTrace probes in a 64 bit binary, specify --enable-dtrace and DTRACEFLAGS="-64" to configure. For example:
$ configure CC='gcc -m64' --enable-dtrace DTRACEFLAGS='-64' ...
Notes:
- To successfully compile PostgreSQL 8.2 with --enable-dtrace, you need to run Solaris Express available at [2]. The DTrace version in Solaris 10 (up until 6/06) does not allow probes to be added to static functions. This limitation will be fixed in future updates of Solaris 10.
- When using DTRACEFLAGS='-64', you also have to specify -m64 (for 64-bit binary in gcc) to configure; otherwise, you will get compilation errors.
[edit] Use Existing DTrace Probes
Using the probes in PostgreSQL is similar to using probes in other DTrace providers. Below is an example of a simple D script using the transaction-start, transaction-commit, and transaction-abort probes. The script prints out the total number of started, committed, and aborted transactions.
#!/usr/sbin/dtrace -qs
postgresql$1:::transaction-start
{
@start["Start"] = count();
self->ts = timestamp;
}
postgresql$1:::transaction-abort
{
@abort["Abort"] = count();
}
postgresql$1:::transaction-commit
/self->ts/
{
@commit["Commit"] = count();
@time["Total time (ns)"] = sum(timestamp - self->ts);
self->ts=0;
}
Executing the above script produces the following output.
# ./txn_count.d `pgrep -n postgres` ^C Start 71 Commit 70 Total time (ns) 2312105013
A number of sample D scripts are available at [3] To learn more about DTrace, refer to the following HowTo and DTrace Guides. [4] [5]
[edit] Add New DTrace Probes
New DTrace probes can easily be added to PostgreSQL. For example, to add a probe called transaction-start, follow these simple steps:
Add the probe definitions to src/backend/utils/probes.d
provider postgresql {
...
probe transaction__start(int);
...
};
When a dash (-) is used in the probe name, it needs to be converted to double underscores (__) in the probe definition file. So, the above probe is called transaction-start in the D script.
Add "PG_TRACE1 (transaction__start, s->transactionId);" to backend/access/transam/xact.c
static void
StartTransaction(void)
{
...
/*
* generate a new transaction id
*/
s->transactionId = GetNewTransactionId(false);
XactLockTableInsert(s->transactionId);
PG_TRACE1 (transaction__start, s->transactionId);
...
}
Notes:
- PG_TRACE1 is mapped to DTRACE_PROBE1. See src/include/pg_trace.h.
- The provider name for all probes in PostgreSQL is called postgresql per the decision by the developers, so it's not specified in PG_TRACE. See src/include/pg_trace.h.
- Make sure the data types in the probe definition match the argument passed to the probe. In this case s->transactionId has to be an integer (int). When you have probes that might be useful to the community at large, send a proposal/patch to pgsql-hackers@postgresql.org to get feedback from the developers.
After recompiling, run the new binary, and as root, execute the following DTrace command to check that your newly added probe is available.
# dtrace -l -n transaction-start
