3d7f6d6bbe
Sponsored by: iXsystems
357 lines
16 KiB
XML
357 lines
16 KiB
XML
<?xml version="1.0" encoding="iso-8859-1"?>
|
|
<!--
|
|
Recently I suggested to myself that this should become a profiling
|
|
and debugging chapter, which covers things like ktrace(1) and
|
|
using other debugging (like -x in shell scripts). But then I
|
|
realized that, over time and while DTrace becomes better supported,
|
|
that might make this chapter too large.
|
|
-->
|
|
<!--
|
|
The FreeBSD Documentation Project
|
|
$FreeBSD$
|
|
-->
|
|
<!-- XXXTR: Should probably put links and resources here. I'm
|
|
nervous about this chapter as it may require a partial
|
|
re-write and large modification once DTrace is complete, but
|
|
at least we can get everyone started ... -->
|
|
<chapter xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0" xml:id="dtrace">
|
|
<info>
|
|
<title>&dtrace;</title>
|
|
|
|
<authorgroup>
|
|
<author><personname><firstname>Tom</firstname><surname>Rhodes</surname></personname><contrib>Written
|
|
by </contrib></author>
|
|
</authorgroup>
|
|
</info>
|
|
|
|
<sect1 xml:id="dtrace-synopsis">
|
|
<title>Synopsis</title>
|
|
|
|
<indexterm><primary>&dtrace;</primary></indexterm>
|
|
<indexterm>
|
|
<primary>&dtrace; support</primary>
|
|
<see>&dtrace;</see>
|
|
</indexterm>
|
|
|
|
<para>&dtrace;, also known as Dynamic Tracing, was developed by
|
|
&sun; as a tool for locating performance bottlenecks in
|
|
production and pre-production systems. In addition to
|
|
diagnosing performance problems, &dtrace; can be used to help
|
|
investigate and debug unexpected behavior in both the &os;
|
|
kernel and in userland programs.</para>
|
|
|
|
<para>&dtrace; is a remarkable profiling tool, with an impressive
|
|
array of features for diagnosing system issues. It may also be
|
|
used to run pre-written scripts to take advantage of its
|
|
capabilities. Users can author their own utilities using the
|
|
&dtrace; D Language, allowing them to customize their profiling
|
|
based on specific needs.</para>
|
|
|
|
<para>The &os; implementation provides full support for kernel
|
|
&dtrace; and experimental support for userland &dtrace;.
|
|
Userland &dtrace; allows users to perform function boundary
|
|
tracing for userland programs using the <literal>pid</literal>
|
|
provider, and to insert static probes into userland programs for
|
|
later tracing. Some ports, such as
|
|
<package>databases/postgres-server</package> and
|
|
<package>lang/php5</package> have a &dtrace; option to enable
|
|
static probes. &os; 10.0-RELEASE has reasonably good userland
|
|
&dtrace; support, but it is not considered production ready. In
|
|
particular, it is possible to crash traced programs.</para>
|
|
|
|
<para>After reading this chapter, you will know:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>What &dtrace; is and what features it provides.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Differences between the &solaris; &dtrace;
|
|
implementation and the one provided by &os;.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>How to enable and use &dtrace; on &os;.</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>Before reading this chapter, you should:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>Understand &unix; and &os; basics
|
|
(<xref linkend="basics"/>).</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Have some familiarity with security and how it pertains
|
|
to &os; (<xref linkend="security"/>).</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</sect1>
|
|
|
|
<sect1 xml:id="dtrace-implementation">
|
|
<title>Implementation Differences</title>
|
|
|
|
<para>While the &dtrace; in &os; is similar to that found in
|
|
&solaris;, differences do exist. The primary difference is that
|
|
in &os;, &dtrace; is implemented as a set of kernel modules and
|
|
&dtrace; can not be used until the modules are loaded. To load
|
|
all of the necessary modules:</para>
|
|
|
|
<screen>&prompt.root; <userinput>kldload dtraceall</userinput></screen>
|
|
|
|
<para>Beginning with &os; 10.0-RELEASE, the modules are
|
|
automatically loaded when <command>dtrace</command> is
|
|
run.</para>
|
|
|
|
<para>&os; uses the <literal>DDB_CTF</literal> kernel option to
|
|
enable support for loading <acronym>CTF</acronym> data from
|
|
kernel modules and the kernel itself. <acronym>CTF</acronym> is
|
|
the &solaris; Compact C Type Format which encapsulates a reduced
|
|
form of debugging information similar to
|
|
<acronym>DWARF</acronym> and the venerable stabs.
|
|
<acronym>CTF</acronym> data is added to binaries by the
|
|
<command>ctfconvert</command> and <command>ctfmerge</command>
|
|
build tools. The <command>ctfconvert</command> utility parses
|
|
<acronym>DWARF</acronym> <acronym>ELF</acronym> debug sections
|
|
created by the compiler and <command>ctfmerge</command> merges
|
|
<acronym>CTF</acronym> <acronym>ELF</acronym> sections from
|
|
objects into either executables or shared libraries.</para>
|
|
|
|
<para>Some different providers exist for &os; than for &solaris;.
|
|
Most notable is the <literal>dtmalloc</literal> provider, which
|
|
allows tracing <function>malloc()</function> by type in the &os;
|
|
kernel. Some of the providers found in &solaris;, such as
|
|
<literal>cpc</literal> and <literal>mib</literal>, are not
|
|
present in &os;. These may appear in future versions of &os;.
|
|
Moreover, some of the providers available in both operating
|
|
systems are not compatible, in the sense that their probes have
|
|
different argument types. Thus, <acronym>D</acronym> scripts
|
|
written on &solaris; may or may not work unmodified on &os;, and
|
|
vice versa.</para>
|
|
|
|
<para>Due to security differences, only <systemitem
|
|
class="username">root</systemitem> may use &dtrace; on &os;.
|
|
&solaris; has a few low level security checks which do not yet
|
|
exist in &os;. As such, the
|
|
<filename>/dev/dtrace/dtrace</filename> is strictly limited to
|
|
<systemitem class="username">root</systemitem>.</para>
|
|
|
|
<para>&dtrace; falls under the Common Development and Distribution
|
|
License (<acronym>CDDL</acronym>) license. To view this license
|
|
on &os;, see
|
|
<filename>/usr/src/cddl/contrib/opensolaris/OPENSOLARIS.LICENSE</filename>
|
|
or view it online at <uri
|
|
xlink:href="http://www.opensolaris.org/os/licensing">http://www.opensolaris.org/os/licensing</uri>.
|
|
While a &os; kernel with &dtrace; support is
|
|
<acronym>BSD</acronym> licensed, the <acronym>CDDL</acronym> is
|
|
used when the modules are distributed in binary form or the
|
|
binaries are loaded.</para>
|
|
</sect1>
|
|
|
|
<sect1 xml:id="dtrace-enable">
|
|
<title>Enabling &dtrace; Support</title>
|
|
|
|
<para>In &os; 9.2 and 10.0, &dtrace; support is built into the
|
|
<filename>GENERIC</filename> kernel. Users of earlier versions
|
|
of &os; or who prefer to statically compile in &dtrace; support
|
|
should add the following lines to a custom kernel configuration
|
|
file and recompile the kernel using the instructions in <xref
|
|
linkend="kernelconfig"/>:</para>
|
|
|
|
<programlisting>options KDTRACE_HOOKS
|
|
options DDB_CTF
|
|
options DEBUG=-g</programlisting>
|
|
|
|
<para>Users of the AMD64 architecture should also add this
|
|
line:</para>
|
|
|
|
<programlisting>options KDTRACE_FRAME</programlisting>
|
|
|
|
<para>This option provides support for <acronym>FBT</acronym>.
|
|
While &dtrace; will work without this option, there will be
|
|
limited support for function boundary tracing.</para>
|
|
|
|
<para>Once the &os; system has rebooted into the new kernel, or
|
|
the &dtrace; kernel modules have been loaded using
|
|
<command>kldload dtraceall</command>, the system will need
|
|
support for the Korn shell as the &dtrace;
|
|
Toolkit has several utilities written in <command>ksh</command>.
|
|
Make sure that the <package>shells/ksh93</package> package or
|
|
port is installed. It is also possible to run these tools under
|
|
<package>shells/pdksh</package> or
|
|
<package>shells/mksh</package>.</para>
|
|
|
|
<para>Finally, install the current &dtrace; Toolkit,
|
|
a collection of ready-made scripts
|
|
for collecting system information. There are scripts to check
|
|
open files, memory, <acronym>CPU</acronym> usage, and a lot
|
|
more. &os; 10
|
|
installs a few of these scripts into
|
|
<filename>/usr/share/dtrace</filename>. On other &os; versions,
|
|
or to install the full
|
|
&dtrace; Toolkit, use the
|
|
<package>sysutils/DTraceToolkit</package> package or
|
|
port.</para>
|
|
|
|
<note>
|
|
<para>The scripts found in
|
|
<filename>/usr/share/dtrace</filename> have been specifically
|
|
ported to &os;. Not all of the scripts found in the &dtrace;
|
|
Toolkit will work as-is on &os; and some scripts may require
|
|
some effort in order for them to work on &os;.</para>
|
|
</note>
|
|
|
|
<para>The &dtrace; Toolkit includes many scripts in the special
|
|
language of &dtrace;. This language is called the D language
|
|
and it is very similar to C++. An in depth discussion of the
|
|
language is beyond the scope of this document. It is
|
|
extensively discussed at <uri
|
|
xlink:href="http://wikis.oracle.com/display/DTrace/Documentation">http://wikis.oracle.com/display/DTrace/Documentation</uri>.</para>
|
|
</sect1>
|
|
|
|
<sect1 xml:id="dtrace-using">
|
|
<title>Using &dtrace;</title>
|
|
|
|
<para>&dtrace; scripts consist of a list of one or more
|
|
<firstterm>probes</firstterm>, or instrumentation points, where
|
|
each probe is associated with an action. Whenever the condition
|
|
for a probe is met, the associated action is executed. For
|
|
example, an action may occur when a file is opened, a process is
|
|
started, or a line of code is executed. The action might be to
|
|
log some information or to modify context variables. The
|
|
reading and writing of context variables allows probes to share
|
|
information and to cooperatively analyze the correlation of
|
|
different events.</para>
|
|
|
|
<para>To view all probes, the administrator can execute the
|
|
following command:</para>
|
|
|
|
<screen>&prompt.root; <userinput>dtrace -l | more</userinput></screen>
|
|
|
|
<para>Each probe has an <literal>ID</literal>, a
|
|
<literal>PROVIDER</literal> (dtrace or fbt), a
|
|
<literal>MODULE</literal>, and a
|
|
<literal>FUNCTION NAME</literal>. Refer to &man.dtrace.1; for
|
|
more information about this command.</para>
|
|
|
|
<para>The examples in this section provide an overview of how to
|
|
use two of the fully supported scripts from the
|
|
&dtrace; Toolkit: the
|
|
<filename>hotkernel</filename> and
|
|
<filename>procsystime</filename> scripts.</para>
|
|
|
|
<para>The <filename>hotkernel</filename> script is designed to
|
|
identify which function is using the most kernel time. It will
|
|
produce output similar to the following:</para>
|
|
|
|
<screen>&prompt.root; <userinput>cd /usr/share/dtrace/toolkit</userinput>
|
|
&prompt.root; <userinput>./hotkernel</userinput>
|
|
Sampling... Hit Ctrl-C to end.</screen>
|
|
|
|
<para>As instructed, use the
|
|
<keycombo action="simul"><keycap>Ctrl</keycap><keycap>C</keycap>
|
|
</keycombo> key combination to stop the process. Upon
|
|
termination, the script will display a list of kernel functions
|
|
and timing information, sorting the output in increasing order
|
|
of time:</para>
|
|
|
|
<screen>kernel`_thread_lock_flags 2 0.0%
|
|
0xc1097063 2 0.0%
|
|
kernel`sched_userret 2 0.0%
|
|
kernel`kern_select 2 0.0%
|
|
kernel`generic_copyin 3 0.0%
|
|
kernel`_mtx_assert 3 0.0%
|
|
kernel`vm_fault 3 0.0%
|
|
kernel`sopoll_generic 3 0.0%
|
|
kernel`fixup_filename 4 0.0%
|
|
kernel`_isitmyx 4 0.0%
|
|
kernel`find_instance 4 0.0%
|
|
kernel`_mtx_unlock_flags 5 0.0%
|
|
kernel`syscall 5 0.0%
|
|
kernel`DELAY 5 0.0%
|
|
0xc108a253 6 0.0%
|
|
kernel`witness_lock 7 0.0%
|
|
kernel`read_aux_data_no_wait 7 0.0%
|
|
kernel`Xint0x80_syscall 7 0.0%
|
|
kernel`witness_checkorder 7 0.0%
|
|
kernel`sse2_pagezero 8 0.0%
|
|
kernel`strncmp 9 0.0%
|
|
kernel`spinlock_exit 10 0.0%
|
|
kernel`_mtx_lock_flags 11 0.0%
|
|
kernel`witness_unlock 15 0.0%
|
|
kernel`sched_idletd 137 0.3%
|
|
0xc10981a5 42139 99.3%</screen>
|
|
|
|
<!-- XXXTR: I attempted to use objdump and nm on /boot/kernel/kernel
|
|
to find 0xc10981a5, but to no avail. It would be nice to know
|
|
how we should look that up. -->
|
|
|
|
<para>This script will also work with kernel modules. To use this
|
|
feature, run the script with <option>-m</option>:</para>
|
|
|
|
<screen>&prompt.root; <userinput>./hotkernel -m</userinput>
|
|
Sampling... Hit Ctrl-C to end.
|
|
^C
|
|
MODULE COUNT PCNT
|
|
0xc107882e 1 0.0%
|
|
0xc10e6aa4 1 0.0%
|
|
0xc1076983 1 0.0%
|
|
0xc109708a 1 0.0%
|
|
0xc1075a5d 1 0.0%
|
|
0xc1077325 1 0.0%
|
|
0xc108a245 1 0.0%
|
|
0xc107730d 1 0.0%
|
|
0xc1097063 2 0.0%
|
|
0xc108a253 73 0.0%
|
|
kernel 874 0.4%
|
|
0xc10981a5 213781 99.6%</screen>
|
|
|
|
<!-- XXXTR: I was unable to match these up with output from
|
|
kldstat and kldstat -v and grep. Maybe I'm missing something
|
|
seriously obvious. It is 5AM btw. -->
|
|
|
|
<para>The <filename>procsystime</filename> script captures and
|
|
prints the system call time usage for a given process
|
|
<acronym>ID</acronym> (<acronym>PID</acronym>) or process name.
|
|
In the following example, a new instance of
|
|
<filename>/bin/csh</filename> was spawned. Then,
|
|
<filename>procsystime</filename> was executed and remained
|
|
waiting while a few commands were typed on the other incarnation
|
|
of <command>csh</command>. These are the results of this
|
|
test:</para>
|
|
|
|
<screen>&prompt.root; <userinput>./procsystime -n csh</userinput>
|
|
Tracing... Hit Ctrl-C to end...
|
|
^C
|
|
|
|
Elapsed Times for processes csh,
|
|
|
|
SYSCALL TIME (ns)
|
|
getpid 6131
|
|
sigreturn 8121
|
|
close 19127
|
|
fcntl 19959
|
|
dup 26955
|
|
setpgid 28070
|
|
stat 31899
|
|
setitimer 40938
|
|
wait4 62717
|
|
sigaction 67372
|
|
sigprocmask 119091
|
|
gettimeofday 183710
|
|
write 263242
|
|
execve 492547
|
|
ioctl 770073
|
|
vfork 3258923
|
|
sigsuspend 6985124
|
|
read 3988049784</screen>
|
|
|
|
<para>As shown, the <function>read()</function> system call used
|
|
the most time in nanoseconds while the
|
|
<function>getpid()</function> system call used the least amount
|
|
of time.</para>
|
|
</sect1>
|
|
</chapter>
|