DTrace Topics Exercises

From Siwiki

DTrace Topics: Exercises

This page collects DTrace exercises to attempt while learning DTrace, and is part of the DTrace Topics collection. A general understanding of DTrace is assumed; that background is covered in the DTrace Topics: Intro section.

DTrace is a dynamic troubleshooting and analysis tool first introduced in the Solaris 10 and OpenSolaris operating systems.

Completion: (red traffic light icon)
Difficulty, Basic: 3 coffee mugs
Difficulty, Intermediate: 4 coffee mugs
Difficulty, Advanced: 5 coffee mugs
Audience: All DTrace users

Exercises

The temptation (or pressure) is to start learning DTrace by analysing your target application, an application which may be hideously, hideously complex. This is like learning to ski on a double black diamond run - you may be lucky, survive, and learn some skills between moments of life-threatening terror; or you may break both your legs on the first turn.

Start by DTracing something simple: for example, write a short C program with a known fault. DTrace it. Become comfortable DTracing it, then move on to something a bit harder.

At each step of the way, verify that your analysis is correct. Can you show that your measurements are accurate? This usually means more DTracing, the use of other tools, research into the subject area (especially man pages), and much thought. And all of this is valuable DTrace practice.

This page provides various sample programs (mostly C) for you to compile and practice DTracing. Questions are also provided for you to answer with DTrace. Solutions are not provided, as this is as much about verifying solutions as it is about writing them.

Good luck! :-)

Basic

Hello World

Program

The following hello_loop.c program isn't so clever,

#include <unistd.h>

int
main()
{
        while (1) {
                write(4, "Hello World!\n", 13);     
        }

        return (0);
}

It attempts to write() "Hello World!\n" to file descriptor 4, which is not a valid (open) descriptor at that point in the code, so the call fails. The return code from write() is ignored, and the program continues in a loop.

Setup

Compile using,

Sun's Compiler: cc -o hello_loop hello_loop.c
GNU Compiler: /usr/sfw/bin/gcc -o hello_loop hello_loop.c

When run, hello_loop produces no output but does consume plenty of CPU,

$ prstat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP       
 27473 brendan  1176K  744K run     10    0   0:00:43  84% hello_loop/1
 26660 brendan    12M 7400K sleep   49    0   0:37:01 2.3% x64/1
   665 brendan   182M  171M run     59    0   0:49:06 2.0% Xorg/1
 17505 brendan   193M  182M sleep   49    0   1:31:56 1.4% opera/3
   726 brendan   113M   69M run     59    0   0:09:23 0.4% gnome-terminal/2
[...]

Exercise

Write either DTrace one-liners or scripts to answer the following:

  1. Show what system call is most frequently called by the program, by tracing for several seconds and summarising.
    Yes, we expect it to be write(). Prove it.
  2. Show what file descriptor the write() system call is given.
    Again, trace for several seconds and produce a summary.
  3. Show the return code from the write() system call.
    Does that make sense? (see the write(2) man page, "man -s 2 write").
  4. Show the write() return code with the value of errno.
    Does that make sense? (see "/usr/include/sys/errno.h").
  5. Show what text message the write() system call is attempting to send.
  6. Show roughly why the program is on-CPU by sampling frequent user stack traces.
    Only sample when hello_loop is on-CPU. The output should make sense.
  7. Show where the kernel spent on-CPU time by sampling frequent kernel stack traces.
    Only sample when hello_loop is on-CPU. The output probably doesn't make much sense for now.
  8. Is the program spending most of its time in the write() system call? This time, trace it rather than sampling. This involves:
    1. Measure the elapsed time while DTrace was tracing, and print it.
    2. Measure the total elapsed time for the hello_loop write() syscalls.
    3. Print these values as nanoseconds, and as milliseconds.
    If time in write() is close to the total tracing time, can we answer the question this exercise opened with?

Intermediate

Advanced

Solutions

Solutions are not provided. Reasons are:

  • Example solutions with DTrace are in the Examples page. Other DTrace Topics pages provide Tips and Style suggestions.
  • Your real world problems don't come with a solutions page! You are going to need to start solving things yourself sooner or later.
  • DTrace is as much about verifying your solutions as it is about finding them. How do you show that what you did provides the right answer? See Verification for suggestions.