Various updates to the kernel debugging chapter.
- Document vmcore.last and describe it as the way to find the most recent dump rather than the highest numbered dump.
- Document crashinfo and that it automatically runs to generate a core.txt.N file if core dumps are enabled in rc.conf.
- Add a section on testing kernel dumps via the debug.kdb.panic sysctl. Remove a later note about debug.kdb.panic from the DDB section.
- Remove any mention of gdb -k (for pre-5.3 kernels) and just talk about kgdb.
- Remove the paragraph that talks about trying to find the kernel.debug file. Instead, recommend 'kgdb -n <N>', which does this lookup automatically, and specifically recommend 'kgdb -n last' to open the most recent crash dump. Mention the fallback of specifying the kernel and vmcore directly if needed.
- Remove the example dump from FreeBSD 2. It is generally no longer relevant. It used gdb -k, which uses a different stack trace format and includes a 'frame' command that doesn't exist in kgdb (kgdb instead lets you switch to different threads and processes).
- Remove mention of old boot blocks that don't load debug symbols. I think this was last relevant in FreeBSD 2.x or 3.x.
- Rework the description of 'boot -d' to assume the boot menu and explicitly mention 'boot -d' at the loader prompt.
- Document how to get stack traces of other threads in DDB.
- Fix a few references to gdb to reference kgdb instead.
- Replace 'call cpu_reset' with 'reset' for DDB.

Differential Revision: https://reviews.freebsd.org/D14711
parent ba78e991f8
commit d26b8ed87a
Notes:
svn2git
2020-12-08 03:00:23 +00:00
svn path=/head/; revision=51572
1 changed files with 66 additions and 218 deletions
|
|
@ -136,11 +136,19 @@
|
|||
<varname>dumpdir</varname> is set to), the kernel will
|
||||
increment the trailing number for every crash to avoid
|
||||
overwriting an existing <filename>vmcore</filename> (e.g.,
|
||||
<filename>vmcore.1</filename>). While debugging, it is
|
||||
highly likely that you will want to use the highest version
|
||||
<filename>vmcore</filename> in
|
||||
<filename>/var/crash</filename> when searching for the right
|
||||
<filename>vmcore</filename>.</para>
|
||||
<filename>vmcore.1</filename>). &man.savecore.8; will always
|
||||
create a symbolic link named <filename>vmcore.last</filename>
|
||||
in <filename>/var/crash</filename> after a dump is saved.
|
||||
This symbolic link can be used to locate the name of the most
|
||||
recent dump.</para>
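The lookup described in this paragraph can be sketched in shell. This is only an illustration: `latest_dump` is a hypothetical helper name, not part of savecore(8), and it assumes the link target is a plain file name such as `vmcore.3`.

```shell
#!/bin/sh
# latest_dump: hypothetical helper, not part of savecore(8).
# Prints the target of the vmcore.last symbolic link in a crash directory.
latest_dump() {
    dir="${1:-/var/crash}"          # default to the directory used in the text
    link="$dir/vmcore.last"
    if [ ! -L "$link" ]; then
        echo "no vmcore.last symlink in $dir" >&2
        return 1
    fi
    # readlink prints the link target, for example vmcore.3
    readlink "$link"
}
```

Calling `latest_dump /var/crash` would then print the name of the most recent dump, which avoids guessing from the highest trailing number.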
|
||||
|
||||
<para>The &man.crashinfo.8; utility generates a text file
|
||||
containing a summary of information from a full memory dump
|
||||
or minidump. If <varname>dumpdev</varname> has been set in
|
||||
&man.rc.conf.5;, &man.crashinfo.8; will be invoked
|
||||
automatically after &man.savecore.8;. The output is saved
|
||||
to a file in <varname>dumpdir</varname> named
|
||||
<filename>core.txt.<replaceable>N</replaceable></filename>.</para>
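A minimal rc.conf sketch of the configuration this paragraph relies on. The value `AUTO` and the explicit `dumpdir` line are assumptions added for illustration and are not taken from the text above; `/var/crash` is the directory the surrounding text uses.

```shell
# /etc/rc.conf fragment (illustrative sketch, values are assumptions)
dumpdev="AUTO"          # let the system pick a dump device; enables crash dumps
dumpdir="/var/crash"    # where the saved vmcore.N and core.txt.N files are written
```

With settings like these in place, the summary for a given dump is expected to appear next to the matching vmcore file in the same directory.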
|
||||
|
||||
<tip>
|
||||
<para>If you are testing a new kernel but need to boot a different one in
|
||||
|
|
@ -161,45 +169,61 @@
|
|||
device as it is likely different than
|
||||
<filename>/dev/ad0s1b</filename>!</para></tip>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Testing Kernel Dump Configuration</title>
|
||||
|
||||
<para>The kernel includes a &man.sysctl.8; node that requests a
|
||||
kernel panic. This can be used to verify that your system is
|
||||
properly configured to save kernel crash dumps. You may wish
|
||||
to remount existing file systems as read-only in single user
|
||||
mode before triggering the crash to avoid data loss.</para>
|
||||
|
||||
<screen>&prompt.root; <userinput>shutdown now</userinput>
|
||||
...
|
||||
Enter full pathname of shell or RETURN for /bin/sh:
|
||||
&prompt.root; <userinput>mount -a -u -r</userinput>
|
||||
&prompt.root; <userinput>sysctl debug.kdb.panic=1</userinput>
|
||||
debug.kdb.panic:panic: kdb_sysctl_panic
|
||||
...</screen>
|
||||
|
||||
<para>After rebooting, your system should save a dump in
|
||||
<filename>/var/crash</filename> along with a matching summary
|
||||
from &man.crashinfo.8;.</para>
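After the reboot, the result can be checked with a short shell sketch. `dump_saved` is a hypothetical helper introduced here for illustration. It only tests that a dump and its matching summary both exist, under the naming scheme described earlier (vmcore.N and core.txt.N).

```shell
#!/bin/sh
# dump_saved: hypothetical helper for illustration only.
# Succeeds when both vmcore.N and core.txt.N exist in the given directory.
dump_saved() {
    dir="$1"
    n="$2"
    [ -e "$dir/vmcore.$n" ] && [ -e "$dir/core.txt.$n" ]
}

# Example use (paths and dump number are assumptions):
#   dump_saved /var/crash 0 && echo "dump and summary present"
```

If the check fails for the dump number you expect, the dump device or dump directory settings in rc.conf are the first things to review.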
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
<sect1 xml:id="kerneldebug-gdb">
|
||||
<title>Debugging a Kernel Crash Dump with <command>kgdb</command></title>
|
||||
|
||||
<note>
|
||||
<para>This section covers &man.kgdb.1; as found in &os; 5.3
|
||||
and later. In previous versions, one must use
|
||||
<command>gdb -k</command> to read a core dump file.
|
||||
Since &os; 12 kgdb is acquired by installing
|
||||
<package>devel/gdb</package>.</para>
|
||||
<para>This section covers &man.kgdb.1;. The latest version is
|
||||
included in the <package>devel/gdb</package> package. An older version
|
||||
is also present in &os; 11 and earlier.</para>
|
||||
</note>
|
||||
|
||||
<para>Once a dump has been obtained, getting useful information
|
||||
out of the dump is relatively easy for simple problems. Before
|
||||
launching into the internals of &man.kgdb.1; to debug
|
||||
the crash dump, locate the debug version of your kernel
|
||||
(normally called <filename>kernel.debug</filename>) and the path
|
||||
to the source files used to build your kernel (normally
|
||||
<filename>/usr/obj/usr/src/sys/<replaceable>KERNCONF</replaceable></filename>
|
||||
or
|
||||
<filename>/usr/obj/usr/src/<replaceable>amd64.amd64</replaceable>/sys/<replaceable>KERNCONF</replaceable></filename>,
|
||||
where <filename><replaceable>amd64.amd64</replaceable></filename>
|
||||
is the architecture and
|
||||
<filename><replaceable>KERNCONF</replaceable></filename>
|
||||
is the <varname>ident</varname> specified in a kernel
|
||||
&man.config.5;). With those two pieces of info, let the
|
||||
debugging commence!</para>
|
||||
|
||||
<para>To enter into the debugger and begin getting information
|
||||
from the dump, the following steps are required at a minimum:</para>
|
||||
from the dump, start kgdb:</para>
|
||||
|
||||
<screen>&prompt.root; <userinput>cd /usr/obj/usr/src/sys/<replaceable>KERNCONF</replaceable></userinput>
|
||||
&prompt.root; <userinput>kgdb kernel.debug /var/crash/vmcore.0</userinput></screen>
|
||||
<screen>&prompt.root; <userinput>kgdb -n <replaceable>N</replaceable></userinput></screen>
|
||||
|
||||
<para>Where <replaceable>N</replaceable> is the suffix of the
|
||||
<filename>vmcore.<replaceable>N</replaceable></filename> to
|
||||
examine. To open the most recent dump use:</para>
|
||||
|
||||
<screen>&prompt.root; <userinput>kgdb -n last</userinput></screen>
|
||||
|
||||
<para>Normally, &man.kgdb.1; should be able to locate the kernel
|
||||
running at the time the dump was generated. If it is not able to
|
||||
locate the correct kernel, pass the pathname of the kernel and
|
||||
dump as two arguments to kgdb:</para>
|
||||
|
||||
<screen>&prompt.root; <userinput>kgdb /boot/kernel/kernel /var/crash/vmcore.0</userinput></screen>
|
||||
|
||||
<para>You can debug the crash dump using the kernel sources just like
|
||||
you can for any other program.</para>
|
||||
|
||||
<para>This first dump is from a 5.2-BETA kernel and the crash
|
||||
<para>This dump is from a 5.2-BETA kernel and the crash
|
||||
comes from deep within the kernel. The output below has been
|
||||
modified to include line numbers on the left. This first trace
|
||||
inspects the instruction pointer and obtains a back trace. The
|
||||
|
|
@ -301,173 +325,12 @@
|
|||
88:#20 0xc070ca4d in Xint0x80_syscall () at {standard input}:136
|
||||
89:---Can't read userspace from dump, or kernel process---
|
||||
90:<prompt>(kgdb)</prompt> <userinput>quit</userinput></screen>
|
||||
|
||||
|
||||
<para>This next trace is an older dump from the FreeBSD 2 time
|
||||
frame, but is more involved and demonstrates more of the
|
||||
features of <command>gdb</command>. Long lines have been folded
|
||||
to improve readability, and the lines are numbered for
|
||||
reference. Despite this, it is a real-world error trace taken
|
||||
during the development of the pcvt console driver.</para>
|
||||
|
||||
<screen> 1:Script started on Fri Dec 30 23:15:22 1994
|
||||
2:&prompt.root; <userinput>cd /sys/compile/URIAH</userinput>
|
||||
3:&prompt.root; <userinput>gdb -k kernel /var/crash/vmcore.1</userinput>
|
||||
4:Reading symbol data from /usr/src/sys/compile/URIAH/kernel
|
||||
...done.
|
||||
5:IdlePTD 1f3000
|
||||
6:panic: because you said to!
|
||||
7:current pcb at 1e3f70
|
||||
8:Reading in symbols for ../../i386/i386/machdep.c...done.
|
||||
9:<prompt>(kgdb)</prompt> <userinput>backtrace</userinput>
|
||||
10:#0 boot (arghowto=256) (../../i386/i386/machdep.c line 767)
|
||||
11:#1 0xf0115159 in panic ()
|
||||
12:#2 0xf01955bd in diediedie () (../../i386/i386/machdep.c line 698)
|
||||
13:#3 0xf010185e in db_fncall ()
|
||||
14:#4 0xf0101586 in db_command (-266509132, -266509516, -267381073)
|
||||
15:#5 0xf0101711 in db_command_loop ()
|
||||
16:#6 0xf01040a0 in db_trap ()
|
||||
17:#7 0xf0192976 in kdb_trap (12, 0, -272630436, -266743723)
|
||||
18:#8 0xf019d2eb in trap_fatal (...)
|
||||
19:#9 0xf019ce60 in trap_pfault (...)
|
||||
20:#10 0xf019cb2f in trap (...)
|
||||
21:#11 0xf01932a1 in exception:calltrap ()
|
||||
22:#12 0xf0191503 in cnopen (...)
|
||||
23:#13 0xf0132c34 in spec_open ()
|
||||
24:#14 0xf012d014 in vn_open ()
|
||||
25:#15 0xf012a183 in open ()
|
||||
26:#16 0xf019d4eb in syscall (...)
|
||||
27:<prompt>(kgdb)</prompt> <userinput>up 10</userinput>
|
||||
28:Reading in symbols for ../../i386/i386/trap.c...done.
|
||||
29:#10 0xf019cb2f in trap (frame={tf_es = -260440048, tf_ds = 16, tf_\
|
||||
30:edi = 3072, tf_esi = -266445372, tf_ebp = -272630356, tf_isp = -27\
|
||||
31:2630396, tf_ebx = -266427884, tf_edx = 12, tf_ecx = -266427884, tf\
|
||||
32:_eax = 64772224, tf_trapno = 12, tf_err = -272695296, tf_eip = -26\
|
||||
33:6672343, tf_cs = -266469368, tf_eflags = 66066, tf_esp = 3072, tf_\
|
||||
34:ss = -266427884}) (../../i386/i386/trap.c line 283)
|
||||
35:283 (void) trap_pfault(&frame, FALSE);
|
||||
36:<prompt>(kgdb)</prompt> <userinput>frame frame->tf_ebp frame->tf_eip</userinput>
|
||||
37:Reading in symbols for ../../i386/isa/pcvt/pcvt_drv.c...done.
|
||||
38:#0 0xf01ae729 in pcopen (dev=3072, flag=3, mode=8192, p=(struct p\
|
||||
39:roc *) 0xf07c0c00) (../../i386/isa/pcvt/pcvt_drv.c line 403)
|
||||
40:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
|
||||
41:<prompt>(kgdb)</prompt> <userinput>list</userinput>
|
||||
42:398
|
||||
43:399 tp->t_state |= TS_CARR_ON;
|
||||
44:400 tp->t_cflag |= CLOCAL; /* cannot be a modem (:-) */
|
||||
45:401
|
||||
46:402 #if PCVT_NETBSD || (PCVT_FREEBSD >= 200)
|
||||
47:403 return ((*linesw[tp->t_line].l_open)(dev, tp));
|
||||
48:404 #else
|
||||
49:405 return ((*linesw[tp->t_line].l_open)(dev, tp, flag));
|
||||
50:406 #endif /* PCVT_NETBSD || (PCVT_FREEBSD >= 200) */
|
||||
51:407 }
|
||||
52:<prompt>(kgdb)</prompt> <userinput>print tp</userinput>
|
||||
53:Reading in symbols for ../../i386/i386/cons.c...done.
|
||||
54:$1 = (struct tty *) 0x1bae
|
||||
55:<prompt>(kgdb)</prompt> <userinput>print tp->t_line</userinput>
|
||||
56:$2 = 1767990816
|
||||
57:<prompt>(kgdb)</prompt> <userinput>up</userinput>
|
||||
58:#1 0xf0191503 in cnopen (dev=0x00000000, flag=3, mode=8192, p=(st\
|
||||
59:ruct proc *) 0xf07c0c00) (../../i386/i386/cons.c line 126)
|
||||
60: return ((*cdevsw[major(dev)].d_open)(dev, flag, mode, p));
|
||||
61:<prompt>(kgdb)</prompt> <userinput>up</userinput>
|
||||
62:#2 0xf0132c34 in spec_open ()
|
||||
63:<prompt>(kgdb)</prompt> <userinput>up</userinput>
|
||||
64:#3 0xf012d014 in vn_open ()
|
||||
65:<prompt>(kgdb)</prompt> <userinput>up</userinput>
|
||||
66:#4 0xf012a183 in open ()
|
||||
67:<prompt>(kgdb)</prompt> <userinput>up</userinput>
|
||||
68:#5 0xf019d4eb in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi =\
|
||||
69: 2158592, tf_esi = 0, tf_ebp = -272638436, tf_isp = -272629788, tf\
|
||||
70:_ebx = 7086, tf_edx = 1, tf_ecx = 0, tf_eax = 5, tf_trapno = 582, \
|
||||
71:tf_err = 582, tf_eip = 75749, tf_cs = 31, tf_eflags = 582, tf_esp \
|
||||
72:= -272638456, tf_ss = 39}) (../../i386/i386/trap.c line 673)
|
||||
73:673 error = (*callp->sy_call)(p, args, rval);
|
||||
74:<prompt>(kgdb)</prompt> <userinput>up</userinput>
|
||||
75:Initial frame selected; you cannot go up.
|
||||
76:<prompt>(kgdb)</prompt> <userinput>quit</userinput></screen>
|
||||
<para>Comments to the above script:</para>
|
||||
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>line 6:</term>
|
||||
|
||||
<listitem>
|
||||
<para>This is a dump taken from within DDB (see below), hence the
|
||||
panic comment <quote>because you said to!</quote>, and a rather
|
||||
long stack trace; the initial reason for going into DDB has been a
|
||||
page fault trap though.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>line 20:</term>
|
||||
|
||||
<listitem>
|
||||
<para>This is the location of function <function>trap()</function>
|
||||
in the stack trace.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>line 36:</term>
|
||||
|
||||
<listitem>
|
||||
<para>Force usage of a new stack frame; this is no longer necessary.
|
||||
The stack frames are supposed to point to the right
|
||||
locations now, even in case of a trap.
|
||||
From looking at the code in source line 403, there is a
|
||||
high probability that either the pointer access for
|
||||
<quote>tp</quote> was messed up, or the array access was out of
|
||||
bounds.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>line 52:</term>
|
||||
|
||||
<listitem>
|
||||
<para>The pointer looks suspicious, but happens to be a valid
|
||||
address.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>line 56:</term>
|
||||
|
||||
<listitem>
|
||||
<para>However, it obviously points to garbage, so we have found our
|
||||
error! (For those unfamiliar with that particular piece of code:
|
||||
<literal>tp->t_line</literal> refers to the line discipline of
|
||||
the console device here, which must be a rather small integer
|
||||
number.)</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
|
||||
<tip><para>If your system is crashing regularly and you are running
|
||||
out of disk space, deleting old <filename>vmcore</filename>
|
||||
files in <filename>/var/crash</filename> could save a
|
||||
considerable amount of disk space!</para></tip>
|
||||
</sect1>
|
||||
|
||||
<sect1 xml:id="kerneldebug-ddd">
|
||||
<title>Debugging a Crash Dump with DDD</title>
|
||||
|
||||
<para>Examining a kernel crash dump with a graphical debugger like
|
||||
<command>ddd</command> is also possible (you will need to install
|
||||
the <package>devel/ddd</package> port in order to use the
|
||||
<command>ddd</command> debugger). Add the <option>-k</option>
|
||||
option to the <command>ddd</command> command line you would use
|
||||
normally. For example;</para>
|
||||
|
||||
<screen>&prompt.root; <userinput>ddd --debugger kgdb kernel.debug /var/crash/vmcore.0</userinput></screen>
|
||||
|
||||
<para>You should then be able to go about looking at the crash dump using
|
||||
<command>ddd</command>'s graphical interface.</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 xml:id="kerneldebug-online-ddb">
|
||||
<title>On-Line Kernel Debugging Using DDB</title>
|
||||
|
||||
|
|
@ -481,7 +344,7 @@
|
|||
breakpoints, single-stepping kernel functions, examining and changing
|
||||
kernel variables, etc. However, it cannot access kernel source files,
|
||||
and only has access to the global and static symbols, not to the full
|
||||
debug information like <command>gdb</command> does.</para>
|
||||
debug information like <command>kgdb</command> does.</para>
|
||||
|
||||
<para>To configure your kernel to include DDB, add the options
|
||||
|
||||
|
|
@ -491,19 +354,13 @@
|
|||
to your config file, and rebuild. (See <link xlink:href="&url.books.handbook;/index.html">The FreeBSD Handbook</link> for details on
|
||||
configuring the FreeBSD kernel).</para>
|
||||
|
||||
<note>
|
||||
<para>If you have an older version of the boot blocks, your
|
||||
debugger symbols might not be loaded at all. Update the boot blocks;
|
||||
the recent ones load the DDB symbols automatically.</para>
|
||||
</note>
|
||||
|
||||
<para>Once your DDB kernel is running, there are several ways to enter
|
||||
DDB. The first, and earliest way is to type the boot flag
|
||||
<option>-d</option> right at the boot prompt. The kernel will start up
|
||||
DDB. The first, and earliest way is to use the boot flag
|
||||
<option>-d</option>. The kernel will start up
|
||||
in debug mode and enter DDB prior to any device probing. Hence you can
|
||||
even debug the device probe/attach functions. Users of &os.current;
|
||||
will need to use the boot menu option, six, to escape to a command
|
||||
prompt.</para>
|
||||
even debug the device probe/attach functions. To use this, exit
|
||||
the loader's boot menu and enter <command>boot -d</command> at
|
||||
the loader prompt.</para>
|
||||
|
||||
<para>The second scenario is to drop to the debugger once the
|
||||
system has booted. There are two simple ways to accomplish
|
||||
|
|
@ -511,10 +368,6 @@
|
|||
command prompt, simply type the command:</para>
|
||||
|
||||
<screen>&prompt.root; <userinput>sysctl debug.kdb.enter=1</userinput></screen>
|
||||
<note>
|
||||
<para>To force a panic on the fly, issue the following command:</para>
|
||||
<screen>&prompt.root; <userinput>sysctl debug.kdb.panic=1</userinput></screen>
|
||||
</note>
|
||||
|
||||
<para>Alternatively, if you are at the system console, you may use
|
||||
a hot-key on the keyboard. The default break-to-debugger
|
||||
|
|
@ -556,15 +409,13 @@
|
|||
|
||||
<screen><userinput>continue</userinput></screen>
|
||||
|
||||
<para>To get a stack trace, use:</para>
|
||||
<para>To get a stack trace of the current thread, use:</para>
|
||||
|
||||
<screen><userinput>trace</userinput></screen>
|
||||
|
||||
<note>
|
||||
<para>Note that when entering DDB via a hot-key, the kernel is currently
|
||||
servicing an interrupt, so the stack trace might be not of much use
|
||||
to you.</para>
|
||||
</note>
|
||||
<para>To get a stack trace of an arbitrary thread, specify a
|
||||
process ID or thread ID as a second argument to
|
||||
<command>trace</command>.</para>
|
||||
|
||||
<para>If you want to remove a breakpoint, use</para>
|
||||
|
||||
|
|
@ -662,10 +513,7 @@
|
|||
<screen><userinput>panic</userinput></screen>
|
||||
|
||||
<para>This will cause your kernel to dump core and reboot, so you can
|
||||
later analyze the core on a higher level with <command>gdb</command>.
|
||||
This command
|
||||
usually must be followed by another <command>continue</command>
|
||||
statement.</para>
|
||||
later analyze the core on a higher level with &man.kgdb.1;.</para>
|
||||
|
||||
<screen><userinput>call boot(0)</userinput></screen>
|
||||
|
||||
|
|
@ -675,7 +523,7 @@
|
|||
the disk and filesystem interfaces of the kernel are not damaged, this
|
||||
could be a good way for an almost clean shutdown.</para>
|
||||
|
||||
<screen><userinput>call cpu_reset()</userinput></screen>
|
||||
<screen><userinput>reset</userinput></screen>
|
||||
|
||||
<para>This is the final way out of disaster and almost the same as hitting the
|
||||
Big Red Button.</para>
|
||||
|
|
|
|||