DTrace Topics Style
DTrace Topics: Style
This article is about DTrace programming style, and is part of the DTrace Topics collection. A general understanding of DTrace is assumed; it can be gained from the DTrace Topics: Intro section.
DTrace is a dynamic troubleshooting and analysis tool first introduced in the Solaris 10 and OpenSolaris operating systems. DTrace processes a programming language called 'D'.
The following are D programming language style suggestions written by Brendan Gregg, creator of the DTraceToolkit and co-author of Solaris Performance and Tools.
Audience: DTrace users who are publishing scripts
Intro
Many languages have well-defined styles to follow. They help the programmer write in a neat, consistent way and, in the long run, make fewer mistakes. They also help other programmers rapidly understand what has been written. These styles are documented in style guides, or "programming best practices".
Styles are born from experience - finding out, often the hard way, what does and doesn't work. Since early 2004 I've written over a thousand DTrace scripts, around a hundred of which are in the DTraceToolkit. I've learnt a lot about what does and doesn't work, and have adapted my style numerous times. My last few style changes were costly, as they meant updating and retesting dozens of scripts from the DTraceToolkit. Hopefully I can save you similar pain by documenting what I have learnt.
Some of this style is derived from cstyle, the tool used to check the style of the Solaris and OpenSolaris source to ensure it matches "Bill Joy Normal Form". I often run cstyle itself on DTrace scripts, as most of the warnings are still appropriate (thanks to the similarity of D to C programming).
This style will help if you wish to publish your scripts on the Internet, and is crucial for scripts in the DTraceToolkit. It is similar to, but not the same as, the style used in the DTrace Guide (which was written by the authors of DTrace itself). For that reason this is known as the "DTraceToolkit Style", not the official "DTrace Style" as used in the DTrace Guide.
When using this guide, you may find that you don't agree with every point made. That is fine - these are suggestions. I don't agree with all the details of other programming style guides either; however, I usually follow them, as obeying a standard can provide greater value than using my own peculiarities.
Generic Coding
The following are the same as for C programming in the style enforced by cstyle.
Line width of 80 chars
Each line must not exceed 80 characters in width (with a tabstop of 8). Every line of the Solaris kernel source is <= 80 chars, so your small DTrace script should have no problem. A soft limit of 79 characters is encouraged for a few reasons, including avoiding a problem on some terminals which auto-join copy-and-pasted lines that reach 80 chars.
The following is BAD,
printf("%-20s %7s %7s %7s %7s %7s %7s %7s\n", "Time", "scall/s", "sread/s", "swrit/s", "fork/s", "exec/s", "open/s", "stat/s");
The following is GOOD,
printf("%-20s %7s %7s %7s %7s %7s %7s %7s\n", "Time", "scall/s", "sread/s",
    "swrit/s", "fork/s", "exec/s", "open/s", "stat/s");
The following is also GOOD,
printf("%-20s %7s %7s %7s %7s %7s %7s %7s\n", "Time",
    "scall/s", "sread/s", "swrit/s", "fork/s", "exec/s", "open/s", "stat/s");
It may or may not make logical sense to break the line at an earlier point than the last opportunity before the 80 char limit. Either is fine, so long as 80 characters isn't exceeded.
Line continuation of 4 chars
If a line exceeds the line width (80 chars), the remainder can be placed on the following line with an indentation of 4 characters.
The following is BAD (the continuation line is not indented by 4 characters),
printf("%-20s %7s %7s %7s %7s\n",
"Time", "scall/s", "sread/s", "swrit/s", "fork/s");
The following is GOOD,
printf("%-20s %7s %7s %7s %7s\n",
    "Time", "scall/s", "sread/s", "swrit/s", "fork/s");
Term separator
Terms separated by a comma must have a space after the comma.
The following is BAD,
printf("%6s %-16s %1s %s\n","PID","CMD","D","BYTES");
The following is GOOD,
printf("%6s %-16s %1s %s\n", "PID", "CMD", "D", "BYTES");
Comments
Comments are either a line comment or a block comment, of a very particular style (cstyle).
The following is BAD,
/******************
* Process io start
******************/
io:::start
{
	/* fetch
	details */
	this->size = args[0]->b_bcount; /*** b_bcount is bytes ***/
}
The following is GOOD,
/*
 * Process io start
 */
io:::start
{
	/* fetch details */
	this->size = args[0]->b_bcount;	/* b_bcount is bytes */
}
The cstyle tool is very strict with comment formatting. For example, the first line in the above GOOD example must not have a trailing space.
cstyle
The remaining rules for coding can be learned by using the cstyle tool.
The following is BAD,
# cstyle cputypes.d
cputypes.d: 42: missing space around assignment operator
cputypes.d: 47: comma or semicolon followed by non-blank
cputypes.d: 48: missing space around assignment operator
cputypes.d: 55: continuation line not indented by 4 spaces
cputypes.d: 59: line > 80 characters
cputypes.d: 62: improper first line of block comment
cputypes.d: 62: missing blank after open comment
cputypes.d: 66: indent by spaces instead of tabs
cputypes.d: 68: last line in file is blank
The following is GOOD,
# cstyle cputypes.d
cputypes.d: 42: missing space around assignment operator
We allow the warning for line 42 as it is a DTrace directive that cstyle does not understand,
# sed '42!d' cputypes.d
#pragma D option bufsize=64k
DTrace Specific
Fully qualified probe names
When specifying probes, you must use all four fields (if available): provider:module:function:name. The shortcuts that DTrace allows are suitable when hacking at the command line; for scripting, however, it is both clearer and safer to specify the full name.
The following is BAD,
fork1:entry
The following is GOOD,
syscall::fork1:entry
In fact, the BAD example above is especially bad as it matches the fork1 probe in both the fbt and the syscall providers - producing duplicated results. If you write such a shortcut that only matches the desired probes now, in a future version of Solaris more probes may be added such that it becomes incorrect. To be safe, always fully qualify.
For consistency, fully qualify the BEGIN probe as well,
dtrace:::BEGIN
This is the greatest deviation from the DTrace Guide style, which often uses just "BEGIN" to specify this probe.
BEGIN with a printf
When you run scripts that use the quiet pragma, the BEGIN clause must print something to let the user know when DTrace has begun tracing. This may be a header, or a message to say that tracing has begun.
The following is BAD,
# ./awkward_silence.d
The following is GOOD,
# ./bitesize.d
Tracing... Hit Ctrl-C to end.
The following is also GOOD,
# ./dnlcsnoop.d
   PID CMD           TIME HIT PATH
And the following is FINE,
# ./readbytes.d
dtrace: script './readbytes.d' matched 4 probes
which is the default behaviour of DTrace, and does indeed note when tracing has begun.
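As a minimal sketch of the header approach (this is an illustration, not a DTraceToolkit script; the choice of the proc:::exec-success probe and the column widths are assumptions), the BEGIN clause can print the column header so that it doubles as the "tracing has begun" notice:

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	/* the header tells the user that tracing has begun */
	printf("%6s %-16s\n", "PID", "CMD");
}

proc:::exec-success
{
	/* one line per successful exec, aligned under the header */
	printf("%6d %-16s\n", pid, execname);
}
```

Without the BEGIN clause, this script would sit silently until the first process was executed.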
Sampling/Tracing...
Scripts that collect data and then print a report when Ctrl-C is hit must print a BEGIN message. That BEGIN message should convey the behaviour of your script.
The following is BAD,
# ./mystery.d
Somehow gathering data... Exit the usual way.
If your script traces events (e.g., io:::start), then the following is GOOD,
# ./bitesize.d
Tracing... Hit Ctrl-C to end.
If your script samples data (e.g., profile:::profile-1000hz), then the following is GOOD,
# ./pridist.d
Sampling... Hit Ctrl-C to end.
Whenever the user sees "Sampling", it informs them that the script may be subject to sampling errors and that the rate may need to be customised.
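The following is a rough sketch of a sampling script that announces itself this way. It is not pridist.d; the aggregation name @samples and the report layout are made up for illustration.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	printf("Sampling... Hit Ctrl-C to end.\n");
}

/* sample at 1000 Hertz; customise the rate here if needed */
profile:::profile-1000hz
{
	@samples[execname] = count();
}

dtrace:::END
{
	printf("%-16s %8s\n", "CMD", "SAMPLES");
	printa("%-16s %@8d\n", @samples);
}
```

Activity that occurs entirely between two samples is never counted, which is the kind of sampling error the "Sampling" message warns about.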
Units
Scripts that output numbers are encouraged to provide units in the output if space permits. The following points explain usage.
- Preferred usage is of the form "Kbytes/sec", however this length may be more suited to documentation.
- A shorter version of "Kbytes/sec" is "KB/s", which may be more suited for command outputs.
- Kilobits can be written as "Kb", or better "Kbits" to avoid confusion.
- For column headers, more caps are allowed for the longer forms: eg, "KBYTES/s" and "KBITS/s".
- 1 Kbyte = 1024 bytes; and 1 Kbit = 1000 bits.
- No IEC binary prefixes yet (KiB/kibibyte), but this may change in the future. For now it shouldn't be a problem - DTrace scripts are short enough that people can read them to see what was used.
- For rate data, it is best to present the output in per second units.
- If per interval units are used, write them as "Kbytes/int" or "KB/i".
The following is BAD,
# ./measure.d
Tracing... Hit Ctrl-C to end.
^C
Average: 152
The following is GOOD,
# ./measure.d
Tracing... Hit Ctrl-C to end.
^C
Average: 152 Kbytes/sec
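One way to produce per second output with units is sketched below. It is not measure.d (whose source isn't shown here); the aggregation name @bytes and the "Disk I/O" label are assumptions. It sums io:::start sizes, converts to Kbytes using 1 Kbyte = 1024 bytes, and prints once a second so that the total is already a per second rate.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

io:::start
{
	/* b_bcount is the I/O size in bytes */
	@bytes = sum(args[0]->b_bcount);
}

profile:::tick-1sec
{
	/* 1 Kbyte = 1024 bytes; the interval is one second */
	normalize(@bytes, 1024);
	printa("Disk I/O: %@d Kbytes/sec\n", @bytes);
	clear(@bytes);
}
```

If the interval were changed to, say, tick-5sec without dividing by 5, the unit printed should become "Kbytes/int" instead.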
Truncating
If your script truncates output, you should report this as part of the output.
The following is BAD,
# ./agg.d
Tracing... Hit Ctrl-C to end.
^C
Top syscalls,
  writev                288
  write                 406
  pollsys              1278
  read                 1349
  ioctl                1529
The following is GOOD,
# ./agg.d
Tracing... Hit Ctrl-C to end.
^C
Top 5 syscalls,
  writev                288
  write                 406
  pollsys              1278
  read                 1349
  ioctl                1529
The description is now "Top 5".
An exception to this may be prstat or top style scripts, which refresh the screen. Truncation from this style of tool is expected. It should, however, still be clearly documented.
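A script along the following lines could produce the GOOD output above. This is a sketch rather than the actual agg.d, and the aggregation name @num is an assumption. The point is that the number passed to trunc() and the number in the title are kept in step.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

syscall:::entry
{
	@num[probefunc] = count();
}

dtrace:::END
{
	/* keep only the 5 largest counts, and say so in the title */
	trunc(@num, 5);
	printf("Top 5 syscalls,\n");
	printa("  %-16s %@8d\n", @num);
}
```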
Output width of 80 chars
Your script output must, under normal circumstances, fit within an 80 character width.
The following is BAD,
# ./syscalls.d
Tracing... Hit Ctrl-C to end.
^C
Top 5 syscalls,
EXEC                             SYSCALL                                         COUNT
dtrace                           ioctl                                             147
xmms                             pollsys                                           165
xmms                             ioctl                                             262
Xorg                             pollsys                                           285
Xorg                             read                                              332
The following is GOOD,
# ./syscalls.d
Tracing... Hit Ctrl-C to end.
^C
Top 5 Syscalls,
EXEC             SYSCALL             COUNT
dtrace           ioctl                 147
xmms             pollsys               165
xmms             ioctl                 262
Xorg             pollsys               285
Xorg             read                  332
The following is FINE,
# ./open.d
Tracing... Hit Ctrl-C to end.
^C
Top 5 Pathnames Opened,
COUNT PATHNAME
1 /var/sadm/pkg/SUNWstaroffice-gnome-integration/save/pspool/SUNWstaroffice-gnome-integration/install
1 /var/sadm/pkg/SUNWstaroffice-gnome-integration/save/pspool/SUNWstaroffice-gnome-integration/pkginfo
1 /var/sadm/pkg/SUNWstaroffice-gnome-integration/save/pspool/SUNWstaroffice-gnome-integration/pkgmap
2 /etc/resolv.conf
3 /etc/svc/volatile/repository_door
Truncating pathnames may be a bigger crime than exceeding 80 chars, and ls -l and find have set a precedent for this anyway. In this case, the style is to place the pathname field (the most varying field) as the rightmost field.
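A sketch of how the pathname can be kept as the rightmost field follows. It is not the source of open.d; it assumes a Solaris release that has the syscall::open:entry probe, and the aggregation name @opens is made up. In a printa() format string the aggregation value (%@) may be placed before the key, so the fixed-width count comes first and the pathname is free to run past 80 characters.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

syscall::open:entry
{
	/* arg0 is a user-land pointer to the pathname */
	@opens[copyinstr(arg0)] = count();
}

dtrace:::END
{
	trunc(@opens, 5);
	printf("Top 5 Pathnames Opened,\n");
	/* fixed-width count first, variable-width pathname last */
	printf("%8s %s\n", "COUNT", "PATHNAME");
	printa("%@8d %s\n", @opens);
}
```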
Variable types
- Temporary calculations within a clause should use this-> variables.
- Global variables should be avoided if possible (they can hurt performance).
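For example, a temporary calculation can be held in a this-> (clause-local) variable, as in the following sketch. The variable name this->kb and the report title are assumptions for illustration.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

io:::start
{
	/* clause-local: only needed while this clause runs */
	this->kb = args[0]->b_bcount / 1024;
	@size[execname] = quantize(this->kb);
}

dtrace:::END
{
	printf("I/O size distribution by process (Kbytes),\n");
	printa(@size);
}
```

Had a global variable (plain "kb") been used instead, it would be shared by every CPU firing this probe, which is one reason to avoid globals for temporary values.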
Memory cleanup
Variables must be set to zero after final use, especially global hashes (associative arrays) and thread-local variables (self->). Otherwise, memory is leaked and you may encounter dynamic variable drops.
The following is BAD,
syscall::read:return
/self->start/
{
	@latency = quantize(timestamp - self->start);
}
The following is GOOD,
syscall::read:return
/self->start/
{
	@latency = quantize(timestamp - self->start);
	self->start = 0;
}
This assumes that self->start isn't needed after that clause.
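The same rule applies to global hashes. The following sketch (the array name start and the report title are assumptions) times disk I/O using a global associative array keyed on arg0, and zeroes each element once it has been used:

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

io:::start
{
	/* global hash, keyed on the buf pointer for this I/O */
	start[arg0] = timestamp;
}

io:::done
/start[arg0]/
{
	@latency = quantize(timestamp - start[arg0]);
	/* final use: assigning zero frees this element */
	start[arg0] = 0;
}

dtrace:::END
{
	printf("Disk I/O latency (ns),\n");
	printa(@latency);
}
```

Without the assignment to zero, an element would remain allocated for every I/O ever traced.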
