<?xml version="1.0" encoding="iso-8859-1"?>
<!--
     The FreeBSD Documentation Project

     $FreeBSD$
-->
<chapter xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0" xml:id="kerneldebug">
  <info><title>Kernel Debugging</title>
    <authorgroup>
      <author><personname><firstname>Paul</firstname><surname>Richards</surname></personname><contrib>Contributed by </contrib></author>
      <author><personname><firstname>Jörg</firstname><surname>Wunsch</surname></personname></author>
      <author><personname><firstname>Robert</firstname><surname>Watson</surname></personname></author>
    </authorgroup>
  </info>

  <sect1 xml:id="kerneldebug-obtain">
    <title>Obtaining a Kernel Crash Dump</title>

    <para>When running a development kernel (e.g., &os.current;),
      running a kernel under extreme conditions (e.g., very high load
      averages, tens of thousands of connections, an exceedingly high
      number of concurrent users, hundreds of &man.jail.8;s, etc.), or
      using a new feature or device driver on &os.stable; (e.g.,
      <acronym>PAE</acronym>), a kernel will sometimes panic.  This
      chapter demonstrates how to extract useful information from
      such a crash.</para>

    <para>A system reboot is inevitable once a kernel panics.  Once a
      system is rebooted, the contents of its physical memory
      (<acronym>RAM</acronym>) are lost, as is any data that was on
      the swap device before the panic.  To preserve the contents of
      physical memory, the kernel uses the swap device as a
      temporary place to store what was in <acronym>RAM</acronym>
      across the reboot after a crash.  When &os; then boots after a
      crash, a kernel image can be extracted from the swap device
      and debugging can take place.</para>

    <note><para>A swap device that has been configured as a dump
      device still acts as a swap device.  Dumps to non-swap devices
      (such as tapes or CDRWs) are not supported at this time.  A
      <quote>swap device</quote> is synonymous with a <quote>swap
      partition.</quote></para></note>

    <para>Several types of kernel crash dumps are available:</para>
      <variablelist>
	<varlistentry>
	  <term>Full memory dumps</term>

	  <listitem>
	    <para>Hold the complete contents of physical
	      memory.</para>
	  </listitem>
	</varlistentry>

	<varlistentry>
	  <term>Minidumps</term>

	  <listitem>
	    <para>Hold only memory pages in use by the kernel
	      (&os; 6.2 and higher).</para>
	  </listitem>
	</varlistentry>

	<varlistentry>
	  <term>Textdumps</term>

	  <listitem>
	    <para>Hold captured, scripted, or interactive debugger
	      output (&os; 7.1 and higher).</para>
	  </listitem>
	</varlistentry>
      </variablelist>

      <para>Minidumps are the default dump type as of &os; 7.0,
	and in most cases will capture all necessary information
	present in a full memory dump, as most problems can be
	isolated using only kernel state.</para>

    <sect2 xml:id="config-dumpdev">
      <title>Configuring the Dump Device</title>

      <para>Before the kernel will dump the contents of its physical
	memory to a dump device, a dump device must be configured.  A
	dump device is specified by using the &man.dumpon.8; command
	to tell the kernel where to save kernel crash dumps.  The
	&man.dumpon.8; program must be called after the swap partition
	has been configured with &man.swapon.8;.  This is normally
	handled by setting the <varname>dumpdev</varname> variable in
	&man.rc.conf.5; to the path of the swap device (the
	recommended way to extract a kernel dump) or
	<literal>AUTO</literal> to use the first configured swap
	device.  The default for <varname>dumpdev</varname> is
	<literal>AUTO</literal> in HEAD, and changed to
	<literal>NO</literal> on RELENG_* branches (except for RELENG_7,
	which was left set to <literal>AUTO</literal>).
	On &os; 9.0-RELEASE and later versions,
	<application>bsdinstall</application> will ask whether crash dumps
	should be enabled on the target system during the install process.</para>
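
      <para>As a minimal sketch of the &man.rc.conf.5; approach
	described above, the following fragment enables crash dumps on
	the first configured swap device and names the directory used
	for extracted dumps.  The values are illustrative only:
	<filename>/var/crash</filename> is the usual default, and the
	path of a specific swap device may be given in place of
	<literal>AUTO</literal>.</para>

      <programlisting>dumpdev="AUTO"		# or the path of a swap device
dumpdir="/var/crash"	# directory where savecore(8) places extracted dumps</programlisting>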

      <tip><para>Check <filename>/etc/fstab</filename> or
	&man.swapinfo.8; for a list of swap devices.</para></tip>

      <important><para>Make sure the <varname>dumpdir</varname>
        specified in &man.rc.conf.5; exists before a kernel
        crash!</para>

        <screen>&prompt.root; <userinput>mkdir /var/crash</userinput>
&prompt.root; <userinput>chmod 700 /var/crash</userinput></screen>

        <para>Also, remember that the contents of
	  <filename>/var/crash</filename> are sensitive and very likely
	  contain confidential information such as passwords.</para>
      </important>
    </sect2>

    <sect2 xml:id="extract-dump">
      <title>Extracting a Kernel Dump</title>

        <para>Once a dump has been written to a dump device, the dump
	  must be extracted before the swap device is mounted.
	  To extract a dump
	  from a dump device, use the &man.savecore.8; program.  If
	  <varname>dumpdev</varname> has been set in &man.rc.conf.5;,
	  &man.savecore.8; will be called automatically on the first
	  multi-user boot after the crash and before the swap device
	  is mounted.  The extracted core is placed in the directory
	  named by the &man.rc.conf.5; variable
	  <varname>dumpdir</varname> (by default
	  <filename>/var/crash</filename>) and is named
	  <filename>vmcore.0</filename>.</para>

        <para>In the event that there is already a file called
          <filename>vmcore.0</filename> in
          <filename>/var/crash</filename> (or whatever
          <varname>dumpdir</varname> is set to), &man.savecore.8; will
          increment the trailing number for every crash to avoid
          overwriting an existing <filename>vmcore</filename> (e.g.,
          <filename>vmcore.1</filename>).  While debugging, you will
          most likely want the highest-numbered
          <filename>vmcore</filename> in
          <filename>/var/crash</filename>, since it corresponds to
          the most recent crash.</para>

    <tip>
      <para>If you are testing a new kernel but need to boot a different one in
      order to get your system up and running again, boot it only into
      single-user mode using the <option>-s</option> flag at the boot prompt, and
      then perform the following steps:</para>

    <screen>&prompt.root; <userinput>fsck -p</userinput>
&prompt.root; <userinput>mount -a -t ufs</userinput>       # make sure /var/crash is writable
&prompt.root; <userinput>savecore /var/crash /dev/ad0s1b</userinput>
&prompt.root; <userinput>exit</userinput>                  # exit to multi-user</screen>

    <para>This instructs &man.savecore.8; to extract a kernel dump
      from <filename>/dev/ad0s1b</filename> and place the contents in
      <filename>/var/crash</filename>.  Make sure the
      destination directory <filename>/var/crash</filename> has enough
      space for the dump.  Also, be sure to specify the correct path to your swap
      device, as it is likely different from
      <filename>/dev/ad0s1b</filename>!</para></tip>
    </sect2>
  </sect1>

  <sect1 xml:id="kerneldebug-gdb">
    <title>Debugging a Kernel Crash Dump with <command>kgdb</command></title>

    <note>
      <para>This section covers &man.kgdb.1; as found in &os; 5.3
	and later.  In previous versions, one must use
	<command>gdb -k</command> to read a core dump file.</para>
    </note>

    <para>Once a dump has been obtained, getting useful information
      out of the dump is relatively easy for simple problems.  Before
      launching into the internals of &man.kgdb.1; to debug
      the crash dump, locate the debug version of your kernel
      (normally called <filename>kernel.debug</filename>) and the path
      to the source files used to build your kernel (normally
      <filename>/usr/obj/usr/src/sys/<replaceable>KERNCONF</replaceable></filename>,
      where <filename><replaceable>KERNCONF</replaceable></filename>
      is the <varname>ident</varname> specified in a kernel
      &man.config.5;).  With those two pieces of information, let the
      debugging commence!</para>
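
    <para>A <filename>kernel.debug</filename> file is only produced
      when the kernel is built with debugging symbols.  As a sketch,
      assuming a custom kernel configuration file, this is typically
      requested with the line below; stock configurations such as
      <filename>GENERIC</filename> may already contain it, so check
      before adding it a second time.</para>

    <programlisting>makeoptions	DEBUG=-g		# Build kernel with gdb(1) debug symbols</programlisting>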

    <para>To enter the debugger and begin getting information
      from the dump, the following steps are required at a minimum:</para>

    <screen>&prompt.root; <userinput>cd /usr/obj/usr/src/sys/<replaceable>KERNCONF</replaceable></userinput>
&prompt.root; <userinput>kgdb kernel.debug /var/crash/vmcore.0</userinput></screen>

    <para>You can debug the crash dump using the kernel sources just like
      you can for any other program.</para>

    <para>This first dump is from a 5.2-BETA kernel and the crash
      comes from deep within the kernel.  The output below has been
      modified to include line numbers on the left.  This first trace
      inspects the instruction pointer and obtains a backtrace.  The
      address that is used on line 41 for the <command>list</command>
      command is the instruction pointer and can be found on line
      17.  Most developers will request having at least this
      information sent to them if you are unable to debug the problem
      yourself.  If, however, you do solve the problem, make sure that
      your patch winds its way into the source tree via a problem
      report, mailing lists, or by being able to commit it!</para>

      <screen> 1:&prompt.root; <userinput>cd /usr/obj/usr/src/sys/<replaceable>KERNCONF</replaceable></userinput>
 2:&prompt.root; <userinput>kgdb kernel.debug /var/crash/vmcore.0</userinput>
 3:GNU gdb 5.2.1 (FreeBSD)
 4:Copyright 2002 Free Software Foundation, Inc.
 5:GDB is free software, covered by the GNU General Public License, and you are
 6:welcome to change it and/or distribute copies of it under certain conditions.
 7:Type "show copying" to see the conditions.
 8:There is absolutely no warranty for GDB.  Type "show warranty" for details.
 9:This GDB was configured as "i386-undermydesk-freebsd"...
10:panic: page fault
11:panic messages:
12:---
13:Fatal trap 12: page fault while in kernel mode
14:cpuid = 0; apic id = 00
15:fault virtual address   = 0x300
16:fault code:             = supervisor read, page not present
17:instruction pointer     = 0x8:0xc0713860
18:stack pointer           = 0x10:0xdc1d0b70
19:frame pointer           = 0x10:0xdc1d0b7c
20:code segment            = base 0x0, limit 0xfffff, type 0x1b
21:                        = DPL 0, pres 1, def32 1, gran 1
22:processor eflags        = resume, IOPL = 0
23:current process         = 14394 (uname)
24:trap number             = 12
25:panic: page fault
26:cpuid = 0;
27:Stack backtrace:
28:
29:syncing disks, buffers remaining... 2199 2199 panic: mi_switch: switch in a critical section
30:cpuid = 0;
31:Uptime: 2h43m19s
32:Dumping 255 MB
33: 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
34:---
35:Reading symbols from /boot/kernel/snd_maestro3.ko...done.
36:Loaded symbols for /boot/kernel/snd_maestro3.ko
37:Reading symbols from /boot/kernel/snd_pcm.ko...done.
38:Loaded symbols for /boot/kernel/snd_pcm.ko
39:#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
40:240             dumping++;
41:<prompt>(kgdb)</prompt> <userinput>list *0xc0713860</userinput>
42:0xc0713860 is in lapic_ipi_wait (/usr/src/sys/i386/i386/local_apic.c:663).
43:658                     incr = 0;
44:659                     delay = 1;
45:660             } else
46:661                     incr = 1;
47:662             for (x = 0; x &lt; delay; x += incr) {
48:663                     if ((lapic->icr_lo &amp; APIC_DELSTAT_MASK) == APIC_DELSTAT_IDLE)
49:664                             return (1);
50:665                     ia32_pause();
51:666             }
52:667             return (0);
53:<prompt>(kgdb)</prompt> <userinput>backtrace</userinput>
54:#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
55:#1  0xc055fd9b in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372
56:#2  0xc056019d in panic () at /usr/src/sys/kern/kern_shutdown.c:550
57:#3  0xc0567ef5 in mi_switch () at /usr/src/sys/kern/kern_synch.c:470
58:#4  0xc055fa87 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:312
59:#5  0xc056019d in panic () at /usr/src/sys/kern/kern_shutdown.c:550
60:#6  0xc0720c66 in trap_fatal (frame=0xdc1d0b30, eva=0)
61:    at /usr/src/sys/i386/i386/trap.c:821
62:#7  0xc07202b3 in trap (frame=
63:      {tf_fs = -1065484264, tf_es = -1065484272, tf_ds = -1065484272, tf_edi = 1, tf_esi = 0, tf_ebp = -602076292, tf_isp = -602076324, tf_ebx = 0, tf_edx = 0, tf_ecx = 1000000, tf_eax = 243, tf_trapno = 12, tf_err = 0, tf_eip = -1066321824, tf_cs = 8, tf_eflags = 65671, tf_esp = 243, tf_ss = 0})
64:    at /usr/src/sys/i386/i386/trap.c:250
65:#8  0xc070c9f8 in calltrap () at {standard input}:94
66:#9  0xc07139f3 in lapic_ipi_vectored (vector=0, dest=0)
67:    at /usr/src/sys/i386/i386/local_apic.c:733
68:#10 0xc0718b23 in ipi_selected (cpus=1, ipi=1)
69:    at /usr/src/sys/i386/i386/mp_machdep.c:1115
70:#11 0xc057473e in kseq_notify (ke=0xcc05e360, cpu=0)
71:    at /usr/src/sys/kern/sched_ule.c:520
72:#12 0xc0575cad in sched_add (td=0xcbcf5c80)
73:    at /usr/src/sys/kern/sched_ule.c:1366
74:#13 0xc05666c6 in setrunqueue (td=0xcc05e360)
75:    at /usr/src/sys/kern/kern_switch.c:422
76:#14 0xc05752f4 in sched_wakeup (td=0xcbcf5c80)
77:    at /usr/src/sys/kern/sched_ule.c:999
78:#15 0xc056816c in setrunnable (td=0xcbcf5c80)
79:    at /usr/src/sys/kern/kern_synch.c:570
80:#16 0xc0567d53 in wakeup (ident=0xcbcf5c80)
81:    at /usr/src/sys/kern/kern_synch.c:411
82:#17 0xc05490a8 in exit1 (td=0xcbcf5b40, rv=0)
83:    at /usr/src/sys/kern/kern_exit.c:509
84:#18 0xc0548011 in sys_exit () at /usr/src/sys/kern/kern_exit.c:102
85:#19 0xc0720fd0 in syscall (frame=
86:      {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = -1, tf_ebp = -1077940712, tf_isp = -602075788, tf_ebx = 672411944, tf_edx = 10, tf_ecx = 672411600, tf_eax = 1, tf_trapno = 12, tf_err = 2, tf_eip = 671899563, tf_cs = 31, tf_eflags = 642, tf_esp = -1077940740, tf_ss = 47})
87:    at /usr/src/sys/i386/i386/trap.c:1010
88:#20 0xc070ca4d in Xint0x80_syscall () at {standard input}:136
89:---Can't read userspace from dump, or kernel process---
90:<prompt>(kgdb)</prompt> <userinput>quit</userinput></screen>

    <para>This next trace is an older dump from the FreeBSD 2 time
      frame, but is more involved and demonstrates more of the
      features of <command>gdb</command>.  Long lines have been folded
      to improve readability, and the lines are numbered for
      reference.  Despite this, it is a real-world error trace taken
      during the development of the pcvt console driver.</para>

<screen> 1:Script started on Fri Dec 30 23:15:22 1994
 2:&prompt.root; <userinput>cd /sys/compile/URIAH</userinput>
 3:&prompt.root; <userinput>gdb -k kernel /var/crash/vmcore.1</userinput>
 4:Reading symbol data from /usr/src/sys/compile/URIAH/kernel
...done.
 5:IdlePTD 1f3000
 6:panic: because you said to!
 7:current pcb at 1e3f70
 8:Reading in symbols for ../../i386/i386/machdep.c...done.
 9:<prompt>(kgdb)</prompt> <userinput>backtrace</userinput>
10:#0  boot (arghowto=256) (../../i386/i386/machdep.c line 767)
11:#1  0xf0115159 in panic ()
12:#2  0xf01955bd in diediedie () (../../i386/i386/machdep.c line 698)
13:#3  0xf010185e in db_fncall ()
14:#4  0xf0101586 in db_command (-266509132, -266509516, -267381073)
15:#5  0xf0101711 in db_command_loop ()
16:#6  0xf01040a0 in db_trap ()
17:#7  0xf0192976 in kdb_trap (12, 0, -272630436, -266743723)
18:#8  0xf019d2eb in trap_fatal (...)
19:#9  0xf019ce60 in trap_pfault (...)
20:#10 0xf019cb2f in trap (...)
21:#11 0xf01932a1 in exception:calltrap ()
22:#12 0xf0191503 in cnopen (...)
23:#13 0xf0132c34 in spec_open ()
24:#14 0xf012d014 in vn_open ()
25:#15 0xf012a183 in open ()
26:#16 0xf019d4eb in syscall (...)
27:<prompt>(kgdb)</prompt> <userinput>up 10</userinput>
28:Reading in symbols for ../../i386/i386/trap.c...done.
29:#10 0xf019cb2f in trap (frame={tf_es = -260440048, tf_ds = 16, tf_\
30:edi = 3072, tf_esi = -266445372, tf_ebp = -272630356, tf_isp = -27\
31:2630396, tf_ebx = -266427884, tf_edx = 12, tf_ecx = -266427884, tf\
32:_eax = 64772224, tf_trapno = 12, tf_err = -272695296, tf_eip = -26\
33:6672343, tf_cs = -266469368, tf_eflags = 66066, tf_esp = 3072, tf_\
34:ss = -266427884}) (../../i386/i386/trap.c line 283)
35:283                             (void) trap_pfault(&amp;frame, FALSE);
36:<prompt>(kgdb)</prompt> <userinput>frame frame->tf_ebp frame->tf_eip</userinput>
37:Reading in symbols for ../../i386/isa/pcvt/pcvt_drv.c...done.
38:#0  0xf01ae729 in pcopen (dev=3072, flag=3, mode=8192, p=(struct p\
39:roc *) 0xf07c0c00) (../../i386/isa/pcvt/pcvt_drv.c line 403)
40:403             return ((*linesw[tp->t_line].l_open)(dev, tp));
41:<prompt>(kgdb)</prompt> <userinput>list</userinput>
42:398
43:399             tp->t_state |= TS_CARR_ON;
44:400             tp->t_cflag |= CLOCAL;  /* cannot be a modem (:-) */
45:401
46:402     #if PCVT_NETBSD || (PCVT_FREEBSD >= 200)
47:403             return ((*linesw[tp->t_line].l_open)(dev, tp));
48:404     #else
49:405             return ((*linesw[tp->t_line].l_open)(dev, tp, flag));
50:406     #endif /* PCVT_NETBSD || (PCVT_FREEBSD >= 200) */
51:407     }
52:<prompt>(kgdb)</prompt> <userinput>print tp</userinput>
53:Reading in symbols for ../../i386/i386/cons.c...done.
54:$1 = (struct tty *) 0x1bae
55:<prompt>(kgdb)</prompt> <userinput>print tp->t_line</userinput>
56:$2 = 1767990816
57:<prompt>(kgdb)</prompt> <userinput>up</userinput>
58:#1  0xf0191503 in cnopen (dev=0x00000000, flag=3, mode=8192, p=(st\
59:ruct proc *) 0xf07c0c00) (../../i386/i386/cons.c line 126)
60:       return ((*cdevsw[major(dev)].d_open)(dev, flag, mode, p));
61:<prompt>(kgdb)</prompt> <userinput>up</userinput>
62:#2  0xf0132c34 in spec_open ()
63:<prompt>(kgdb)</prompt> <userinput>up</userinput>
64:#3  0xf012d014 in vn_open ()
65:<prompt>(kgdb)</prompt> <userinput>up</userinput>
66:#4  0xf012a183 in open ()
67:<prompt>(kgdb)</prompt> <userinput>up</userinput>
68:#5  0xf019d4eb in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi =\
69: 2158592, tf_esi = 0, tf_ebp = -272638436, tf_isp = -272629788, tf\
70:_ebx = 7086, tf_edx = 1, tf_ecx = 0, tf_eax = 5, tf_trapno = 582, \
71:tf_err = 582, tf_eip = 75749, tf_cs = 31, tf_eflags = 582, tf_esp \
72:= -272638456, tf_ss = 39}) (../../i386/i386/trap.c line 673)
73:673             error = (*callp->sy_call)(p, args, rval);
74:<prompt>(kgdb)</prompt> <userinput>up</userinput>
75:Initial frame selected; you cannot go up.
76:<prompt>(kgdb)</prompt> <userinput>quit</userinput></screen>
    <para>Comments on the above script:</para>

    <variablelist>
      <varlistentry>
	<term>line 6:</term>

	<listitem>
	  <para>This is a dump taken from within DDB (see below), hence the
	    panic comment <quote>because you said to!</quote> and the rather
	    long stack trace; the initial reason for entering DDB was a
	    page fault trap, though.</para>
	</listitem>
      </varlistentry>

      <varlistentry>
	<term>line 20:</term>

	<listitem>
	  <para>This is the location of function <function>trap()</function>
	    in the stack trace.</para>
	</listitem>
      </varlistentry>

      <varlistentry>
	<term>line 36:</term>

	<listitem>
	  <para>Force usage of a new stack frame; this is no longer necessary.
	    The stack frames are supposed to point to the right
	    locations now, even in case of a trap.
	    From looking at the code in source line 403, there is a
	    high probability that either the pointer access for
	    <quote>tp</quote> was messed up, or the array access was out of
	    bounds.</para>
	</listitem>
      </varlistentry>

      <varlistentry>
	<term>line 52:</term>

	<listitem>
	  <para>The pointer looks suspicious, but happens to be a valid
	    address.</para>
	</listitem>
      </varlistentry>

      <varlistentry>
	<term>line 56:</term>

	<listitem>
	  <para>However, it obviously points to garbage, so we have found our
	    error! (For those unfamiliar with that particular piece of code:
	    <literal>tp->t_line</literal> refers to the line discipline of
	    the console device here, which must be a rather small integer
	    number.)</para>
	</listitem>
      </varlistentry>
    </variablelist>

    <tip><para>If your system is crashing regularly and you are running
      out of disk space, deleting old <filename>vmcore</filename>
      files in <filename>/var/crash</filename> could save a
      considerable amount of disk space!</para></tip>
  </sect1>

  <sect1 xml:id="kerneldebug-ddd">
    <title>Debugging a Crash Dump with DDD</title>

    <para>Examining a kernel crash dump with a graphical debugger like
      <command>ddd</command> is also possible (you will need to install
      the <package>devel/ddd</package> port in order to use the
      <command>ddd</command> debugger).  Add
      <option>--debugger kgdb</option> to the <command>ddd</command>
      command line you would use normally, so that
      <command>ddd</command> drives <command>kgdb</command>.  For
      example:</para>

    <screen>&prompt.root; <userinput>ddd --debugger kgdb kernel.debug /var/crash/vmcore.0</userinput></screen>

    <para>You should then be able to go about looking at the crash dump using
      <command>ddd</command>'s graphical interface.</para>
  </sect1>

  <sect1 xml:id="kerneldebug-online-ddb">
    <title>On-Line Kernel Debugging Using DDB</title>

    <para>While <command>kgdb</command> as an off-line debugger provides a very
      high level of user interface, there are some things it cannot do.  The
      most important of these are setting breakpoints in and single-stepping
      kernel code.</para>

    <para>If you need to do low-level debugging on your kernel, there is an
      on-line debugger available called DDB.  It allows setting of
      breakpoints, single-stepping kernel functions, examining and changing
      kernel variables, etc.  However, it cannot access kernel source files,
      and only has access to the global and static symbols, not to the full
      debug information like <command>gdb</command> does.</para>

    <para>To configure your kernel to include DDB, add the options

      <programlisting>options KDB</programlisting>
      <programlisting>options DDB</programlisting>

      to your config file, and rebuild.  (See <link xlink:href="&url.books.handbook;/index.html">The FreeBSD Handbook</link> for details on
      configuring the FreeBSD kernel.)</para>

    <note>
      <para>If you have an older version of the boot blocks, your
	debugger symbols might not be loaded at all.  Update the boot blocks;
	the recent ones load the DDB symbols automatically.</para>
    </note>

    <para>Once your DDB kernel is running, there are several ways to enter
      DDB.  The first, and earliest, way is to type the boot flag
      <option>-d</option> right at the boot prompt.  The kernel will start up
      in debug mode and enter DDB prior to any device probing.  Hence you can
      even debug the device probe/attach functions.  Users of &os.current;
      will need to use the boot menu option, six, to escape to a command
      prompt.</para>

    <para>The second scenario is to drop to the debugger once the
      system has booted.  There are two simple ways to accomplish
      this.  If you would like to break to the debugger from the
      command prompt, simply type the command:</para>

    <screen>&prompt.root; <userinput>sysctl debug.kdb.enter=1</userinput></screen>
    <note>
      <para>To force a panic on the fly, issue the following command:</para>
      <screen>&prompt.root; <userinput>sysctl debug.kdb.panic=1</userinput></screen>
    </note>

    <para>Alternatively, if you are at the system console, you may use
      a hot-key on the keyboard.  The default break-to-debugger
      sequence is <keycombo action="simul"><keycap>Ctrl</keycap>
      <keycap>Alt</keycap><keycap>ESC</keycap></keycombo>.  For
      syscons, this sequence can be remapped and some of the
      distributed maps out there do this, so check to make sure you
      know the right sequence to use.  There is an option available
      for serial consoles that allows the use of a serial line BREAK on the
      console line to enter DDB (<literal>options BREAK_TO_DEBUGGER</literal>
      in the kernel config file).  It is not the default since there are a lot
      of serial adapters around that gratuitously generate a BREAK
      condition, for example when pulling the cable.</para>

    <para>The third way is that any panic condition will branch to DDB if the
      kernel is configured to use it.  For this reason, it is not wise to
      configure a kernel with DDB for a machine running unattended.</para>

    <para>To keep DDB in the kernel while still allowing unattended
      operation, add:</para>

    <programlisting>options	KDB_UNATTENDED</programlisting>

    <para>to the kernel configuration file and rebuild/reinstall.</para>
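
    <para>Whether a panic enters the debugger can usually also be
      selected at run time, without rebuilding, through the
      <varname>debug.debugger_on_panic</varname> sysctl.  As a sketch,
      assuming that this sysctl is present on the running kernel, the
      following line in <filename>/etc/sysctl.conf</filename> makes a
      panic skip the debugger:</para>

    <programlisting>debug.debugger_on_panic=0</programlisting>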
 | 
						|
 | 
						|
    <para>The DDB commands roughly resemble some <command>gdb</command>
 | 
						|
      commands.  The first thing you probably need to do is to set a
 | 
						|
      breakpoint:</para>
 | 
						|
 | 
						|
    <screen><userinput>break function-name address</userinput></screen>
 | 
						|
 | 
						|
    <para>Numbers are taken as hexadecimal by default.  To make them distinct
      from symbol names, hexadecimal numbers starting with the letters
      <literal>a-f</literal> need to be preceded with <literal>0x</literal>
      (this is optional for other numbers).  Simple expressions are allowed,
      for example: <literal>function-name + 0x103</literal>.</para>
 | 
						|
 | 
						|
    <para>To exit the debugger and continue execution,
 | 
						|
      type:</para>
 | 
						|
 | 
						|
    <screen><userinput>continue</userinput></screen>
 | 
						|
 | 
						|
    <para>To get a stack trace, use:</para>
 | 
						|
 | 
						|
    <screen><userinput>trace</userinput></screen>
 | 
						|
 | 
						|
    <note>
 | 
						|
      <para>When DDB is entered via a hot-key, the kernel is currently
	servicing an interrupt, so the stack trace might not be of much use
	to you.</para>
 | 
						|
    </note>
 | 
						|
 | 
						|
    <para>If you want to remove a breakpoint, use:</para>
 | 
						|
 | 
						|
    <screen><userinput>del</userinput>
 | 
						|
<userinput>del address-expression</userinput></screen>
 | 
						|
 | 
						|
    <para>The first form will be accepted immediately after a breakpoint hit,
 | 
						|
      and deletes the current breakpoint.  The second form can remove any
 | 
						|
      breakpoint, but you need to specify the exact address; this can be
 | 
						|
      obtained from:</para>
 | 
						|
 | 
						|
    <screen><userinput>show b</userinput></screen>
 | 
						|
 | 
						|
    <para>or:</para>
 | 
						|
 | 
						|
    <screen><userinput>show break</userinput></screen>
 | 
						|
 | 
						|
    <para>To single-step the kernel, try:</para>
 | 
						|
 | 
						|
    <screen><userinput>s</userinput></screen>
 | 
						|
 | 
						|
    <para>This will step into functions, but you can make DDB trace them until
 | 
						|
      the matching return statement is reached by:</para>
 | 
						|
 | 
						|
    <screen><userinput>n</userinput></screen>
 | 
						|
 | 
						|
    <note>
 | 
						|
      <para>This is different from <command>gdb</command>'s
 | 
						|
	<command>next</command> statement; it is like <command>gdb</command>'s
 | 
						|
	<command>finish</command>.  Pressing <keycap>n</keycap> more than once
 | 
						|
        will cause a continue.</para>
 | 
						|
    </note>
 | 
						|
 | 
						|
    <para>To examine data from memory, use (for example):
 | 
						|
 | 
						|
      <screen><userinput>x/wx 0xf0133fe0,40</userinput>
 | 
						|
<userinput>x/hd db_symtab_space</userinput>
 | 
						|
<userinput>x/bc termbuf,10</userinput>
 | 
						|
<userinput>x/s stringbuf</userinput></screen>
 | 
						|
 | 
						|
      for word/halfword/byte access, and hexadecimal/decimal/character/string
 | 
						|
      display.  The number after the comma is the object count.  To display
 | 
						|
      the next 0x10 items, simply use:</para>
 | 
						|
 | 
						|
    <screen><userinput>x ,10</userinput></screen>
 | 
						|
 | 
						|
    <para>Similarly, use
 | 
						|
 | 
						|
      <screen><userinput>x/ia foofunc,10</userinput></screen>
 | 
						|
 | 
						|
      to disassemble the first 0x10 instructions of
 | 
						|
      <function>foofunc</function>, and display them along with their offset
 | 
						|
      from the beginning of <function>foofunc</function>.</para>
 | 
						|
 | 
						|
    <para>To modify memory, use the write command:</para>
 | 
						|
 | 
						|
    <screen><userinput>w/b termbuf 0xa 0xb 0</userinput>
 | 
						|
<userinput>w/w 0xf0010030 0 0</userinput></screen>
 | 
						|
 | 
						|
    <para>The command modifier
 | 
						|
      (<literal>b</literal>/<literal>h</literal>/<literal>w</literal>)
 | 
						|
      specifies the size of the data to be written, the first following
 | 
						|
      expression is the address to write to and the remainder is interpreted
 | 
						|
      as data to write to successive memory locations.</para>
 | 
						|
 | 
						|
    <para>If you need to know the current registers, use:</para>
 | 
						|
 | 
						|
    <screen><userinput>show reg</userinput></screen>
 | 
						|
 | 
						|
    <para>Alternatively, you can display a single register value by e.g.
 | 
						|
 | 
						|
      <screen><userinput>p $eax</userinput></screen>
 | 
						|
 | 
						|
      and modify it by:</para>
 | 
						|
 | 
						|
    <screen><userinput>set $eax new-value</userinput></screen>
 | 
						|
 | 
						|
    <para>Should you need to call some kernel functions from DDB, simply
 | 
						|
      say:</para>
 | 
						|
 | 
						|
    <screen><userinput>call func(arg1, arg2, ...)</userinput></screen>
 | 
						|
 | 
						|
    <para>The return value will be printed.</para>
 | 
						|
 | 
						|
    <para>For a &man.ps.1; style summary of all running processes, use:</para>
 | 
						|
 | 
						|
    <screen><userinput>ps</userinput></screen>
 | 
						|
 | 
						|
    <para>Now you have examined why your kernel failed, and you wish to
 | 
						|
      reboot.  Remember that, depending on the severity of previous
 | 
						|
      malfunctioning, not all parts of the kernel might still be working as
 | 
						|
      expected.  Perform one of the following actions to shut down and reboot
 | 
						|
      your system:</para>
 | 
						|
 | 
						|
    <screen><userinput>panic</userinput></screen>
 | 
						|
 | 
						|
    <para>This will cause your kernel to dump core and reboot, so you can
 | 
						|
      later analyze the core on a higher level with <command>gdb</command>.
 | 
						|
      This command
 | 
						|
      usually must be followed by another <command>continue</command>
 | 
						|
      statement.</para>
 | 
						|
 | 
						|
    <screen><userinput>call boot(0)</userinput></screen>
 | 
						|
 | 
						|
    <para>This might be a good way to cleanly shut down the running system,
      <function>sync()</function> all disks, and finally, in some cases,
      reboot.  As long as the disk and filesystem interfaces of the kernel
      are not damaged, this should give an almost clean shutdown.</para>
 | 
						|
 | 
						|
    <screen><userinput>call cpu_reset()</userinput></screen>
 | 
						|
 | 
						|
    <para>This is the final way out of disaster and almost the same as hitting the
 | 
						|
      Big Red Button.</para>
 | 
						|
 | 
						|
    <para>If you need a short command summary, simply type:</para>
 | 
						|
 | 
						|
    <screen><userinput>help</userinput></screen>
 | 
						|
 | 
						|
    <para>It is highly recommended to have a printed copy of the
 | 
						|
	&man.ddb.4; manual page ready for a debugging
 | 
						|
      session.  Remember that it is hard to read the on-line manual while
 | 
						|
      single-stepping the kernel.</para>
 | 
						|
  </sect1>
 | 
						|
 | 
						|
  <sect1 xml:id="kerneldebug-online-gdb">
 | 
						|
    <title>On-Line Kernel Debugging Using Remote GDB</title>
 | 
						|
 | 
						|
    <para>This feature has been supported since FreeBSD 2.2, and it is
 | 
						|
      actually a very neat one.</para>
 | 
						|
 | 
						|
    <para>GDB has already supported <emphasis>remote debugging</emphasis> for
 | 
						|
      a long time.  This is done using a very simple protocol along a serial
 | 
						|
      line.  Unlike the other methods described above, you will need two
 | 
						|
      machines for doing this.  One is the host providing the debugging
 | 
						|
      environment, including all the sources, and a copy of the kernel binary
 | 
						|
      with all the symbols in it, and the other one is the target machine that
 | 
						|
      simply runs a similar copy of the very same kernel (but stripped of the
 | 
						|
      debugging information).</para>
 | 
						|
 | 
						|
    <para>You should configure the kernel in question with <command>config
 | 
						|
	-g</command> if building the <quote>traditional</quote> way.  If
 | 
						|
      building the <quote>new</quote> way, make sure that
 | 
						|
      <literal>makeoptions DEBUG=-g</literal> is in the configuration.
 | 
						|
      In both cases, include <option>DDB</option> in the configuration, and
 | 
						|
      compile it as usual.  This gives a large binary, due to the
 | 
						|
      debugging information.  Copy this kernel to the target machine, strip
 | 
						|
      the debugging symbols off with <command>strip -x</command>, and boot it
 | 
						|
      using the <option>-d</option> boot option.  Connect the serial line
 | 
						|
      of the target machine that has <literal>flags 0x80</literal> set on its
      uart device to any serial line of the debugging host.  See
      &man.uart.4; for information on how to set the flags on a uart device.
 | 
						|
      Now, on the debugging machine, go to the compile directory of the target
 | 
						|
      kernel, and start <command>gdb</command>:</para>
 | 
						|
 | 
						|
    <screen>&prompt.user; <userinput>kgdb kernel</userinput>
 | 
						|
GDB is free software and you are welcome to distribute copies of it
 | 
						|
 under certain conditions; type "show copying" to see the conditions.
 | 
						|
There is absolutely no warranty for GDB; type "show warranty" for details.
 | 
						|
GDB 4.16 (i386-unknown-freebsd),
 | 
						|
Copyright 1996 Free Software Foundation, Inc...
 | 
						|
<prompt>(kgdb)</prompt> </screen>
 | 
						|
 | 
						|
    <para>Initialize the remote debugging session (assuming the first serial
 | 
						|
      port is being used) by:</para>
 | 
						|
 | 
						|
    <screen><prompt>(kgdb)</prompt> <userinput>target remote /dev/cuau0</userinput></screen>
 | 
						|
 | 
						|
    <para>Now, on the target host (the one that entered DDB right before even
 | 
						|
      starting the device probe), type:</para>
 | 
						|
 | 
						|
    <screen>Debugger("Boot flags requested debugger")
 | 
						|
Stopped at Debugger+0x35: movb	$0, edata+0x51bc
 | 
						|
<prompt>db></prompt> <userinput>gdb</userinput></screen>
 | 
						|
 | 
						|
    <para>DDB will respond with:</para>
 | 
						|
 | 
						|
    <screen>Next trap will enter GDB remote protocol mode</screen>
 | 
						|
 | 
						|
    <para>Every time you type <command>gdb</command>, the mode will be toggled
 | 
						|
      between remote GDB and local DDB.  In order to force a next trap
 | 
						|
      immediately, simply type <command>s</command> (step).  Your hosting GDB
 | 
						|
      will now gain control over the target kernel:</para>
 | 
						|
 | 
						|
    <screen>Remote debugging using /dev/cuau0
 | 
						|
Debugger (msg=0xf01b0383 "Boot flags requested debugger")
 | 
						|
    at ../../i386/i386/db_interface.c:257
 | 
						|
<prompt>(kgdb)</prompt></screen>
 | 
						|
 | 
						|
    <para>You can use this session almost as any other GDB session, including
 | 
						|
      full access to the source, running it in gud-mode inside an Emacs window
 | 
						|
      (which gives you an automatic source code display in another Emacs
 | 
						|
      window), etc.</para>
 | 
						|
  </sect1>
 | 
						|
 | 
						|
  <sect1 xml:id="kerneldebug-console">
 | 
						|
    <title>Debugging a Console Driver</title>
 | 
						|
 | 
						|
    <para>Since you need a console driver to run DDB on, things are more
      complicated if the console driver itself is failing.  In that case,
      use a serial console (either with modified boot blocks, or by
      specifying <option>-h</option> at the <prompt>Boot:</prompt> prompt),
      and hook up a standard terminal to your first serial port.  DDB works
      on any configured console driver, including a serial
      console.</para>
 | 
						|
  </sect1>
 | 
						|
 | 
						|
  <sect1 xml:id="kerneldebug-deadlocks">
 | 
						|
    <title>Debugging Deadlocks</title>
 | 
						|
 | 
						|
    <para>You may experience so-called deadlocks, a situation where
 | 
						|
      a system stops doing useful work.  To provide a helpful bug
 | 
						|
      report in this situation, use &man.ddb.4; as described in the
 | 
						|
      previous section.  Include the output of <command>ps</command>
 | 
						|
      and <command>trace</command> for suspected processes in the
 | 
						|
      report.</para>
 | 
						|
 | 
						|
    <para>If possible, consider doing further investigation.  The
 | 
						|
      recipe below is especially useful if you suspect that a deadlock
 | 
						|
      occurs in the VFS layer.  Add these options to the kernel
 | 
						|
      configuration file.</para>
 | 
						|
 | 
						|
    <programlisting>makeoptions 	DEBUG=-g
 | 
						|
options 	INVARIANTS
 | 
						|
options 	INVARIANT_SUPPORT
 | 
						|
options 	WITNESS
 | 
						|
options 	WITNESS_SKIPSPIN
 | 
						|
options 	DEBUG_LOCKS
 | 
						|
options 	DEBUG_VFS_LOCKS
 | 
						|
options 	DIAGNOSTIC</programlisting>
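
    <para>A typical rebuild is sketched below.  It assumes that the
      sources live in <filename>/usr/src</filename> and that the
      configuration file is named <filename>MYKERNEL</filename>, which
      is only a placeholder name:</para>

    <screen>&prompt.root; <userinput>cd /usr/src</userinput>
&prompt.root; <userinput>make buildkernel KERNCONF=<replaceable>MYKERNEL</replaceable></userinput>
&prompt.root; <userinput>make installkernel KERNCONF=<replaceable>MYKERNEL</replaceable></userinput></screen>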
 | 
						|
 | 
						|
    <para>When a deadlock occurs, in addition to the output of the
 | 
						|
      <command>ps</command> command, provide information from the
 | 
						|
      <command>show pcpu</command>, <command>show allpcpu</command>,
 | 
						|
      <command>show locks</command>, <command>show alllocks</command>,
 | 
						|
      <command>show lockedvnods</command> and
 | 
						|
      <command>alltrace</command>.</para>
 | 
						|
 | 
						|
    <para>To obtain meaningful backtraces for threaded processes, use
 | 
						|
      <command>thread thread-id</command> to switch to the thread
 | 
						|
      stack, and do a backtrace with <command>where</command>.</para>
 | 
						|
  </sect1>
 | 
						|
 | 
						|
  <sect1 xml:id="kerneldebug-dcons">
 | 
						|
    <title>Kernel Debugging with Dcons</title>
 | 
						|
 | 
						|
    <para>&man.dcons.4; is a very simple console driver that is
 | 
						|
      not directly connected with any physical devices.  It just reads
 | 
						|
      and writes characters from and to a buffer in a kernel or
 | 
						|
      loader.  Due to its simple nature, it is very useful for kernel
 | 
						|
      debugging, especially with a &firewire; device.  Currently, &os;
 | 
						|
      provides two ways to interact with the buffer from outside of
 | 
						|
      the kernel using &man.dconschat.8;.</para>
 | 
						|
 | 
						|
    <sect2>
 | 
						|
      <title>Dcons over &firewire;</title>
 | 
						|
 | 
						|
      <para>Most &firewire; (IEEE1394) host controllers are
 | 
						|
	based on the <acronym>OHCI</acronym> specification that
 | 
						|
	supports physical access to the host memory.  This means that
 | 
						|
	once the host controller is initialized, we can access the
 | 
						|
	host memory without the help of software (kernel).   We can
 | 
						|
	exploit this facility for interaction with &man.dcons.4;.
 | 
						|
	&man.dcons.4; provides functionality similar to a serial
	console.  It emulates two serial ports, one for the console
 | 
						|
	and <acronym>DDB</acronym>, the other for
 | 
						|
	<acronym>GDB</acronym>.  Because remote memory access is fully
 | 
						|
	handled by the hardware, the &man.dcons.4; buffer is
 | 
						|
	accessible even when the system crashes.</para>
 | 
						|
 | 
						|
      <para>&firewire; devices are not limited to those
 | 
						|
	integrated into motherboards.  <acronym>PCI</acronym> cards
 | 
						|
	exist for desktops, and a cardbus interface can be purchased
 | 
						|
	for laptops.</para>
 | 
						|
 | 
						|
      <sect3>
 | 
						|
	<title>Enabling &firewire; and Dcons support on the target
 | 
						|
	  machine</title>
 | 
						|
 | 
						|
	<para>To enable &firewire; and Dcons support in the kernel of
 | 
						|
	  the <emphasis>target machine</emphasis>:</para>
 | 
						|
 | 
						|
	<itemizedlist>
 | 
						|
	  <listitem>
 | 
						|
	    <para>Make sure your kernel supports
 | 
						|
	      <literal>dcons</literal>, <literal>dcons_crom</literal>
 | 
						|
	      and <literal>firewire</literal>.
 | 
						|
	      <literal>Dcons</literal> should be statically linked
 | 
						|
	      with the kernel.  For <literal>dcons_crom</literal> and
 | 
						|
	      <literal>firewire</literal>, modules should be
 | 
						|
	      OK.</para>
 | 
						|
	  </listitem>
 | 
						|
	  <listitem>
 | 
						|
	    <para>Make sure physical <acronym>DMA</acronym> is enabled.
 | 
						|
	      You may need to add
 | 
						|
	      <literal>hw.firewire.phydma_enable=1</literal> to
 | 
						|
	      <filename>/boot/loader.conf</filename>.</para>
 | 
						|
	  </listitem>
 | 
						|
	  <listitem>
 | 
						|
	    <para>Add options for debugging.</para>
 | 
						|
	  </listitem>
 | 
						|
	  <listitem>
 | 
						|
	    <para>Add <literal>dcons_gdb=1</literal> in
 | 
						|
	      <filename>/boot/loader.conf</filename> if you use GDB
 | 
						|
	      over &firewire;.</para>
 | 
						|
	  </listitem>
 | 
						|
	  <listitem>
 | 
						|
	    <para>Enable <literal>dcons</literal> in
 | 
						|
	      <filename>/etc/ttys</filename>.</para>
 | 
						|
	  </listitem>
 | 
						|
	  <listitem>
 | 
						|
	    <para>Optionally, to force <literal>dcons</literal> to
 | 
						|
	      be the high-level console, add 
 | 
						|
	      <literal>hw.firewire.dcons_crom.force_console=1</literal> 
 | 
						|
	      to <filename>loader.conf</filename>.</para>
 | 
						|
	  </listitem>
 | 
						|
        </itemizedlist>
 | 
						|
 | 
						|
        <para>To enable &firewire; and Dcons support in &man.loader.8;
 | 
						|
	  on i386 or amd64:</para>
 | 
						|
	    
 | 
						|
        <para>Add
 | 
						|
	  <literal>LOADER_FIREWIRE_SUPPORT=YES</literal> in
 | 
						|
	  <filename>/etc/make.conf</filename> and rebuild
 | 
						|
	  &man.loader.8;:</para>
 | 
						|
 | 
						|
        <screen>&prompt.root; <userinput>cd /sys/boot/i386 &amp;&amp; make clean &amp;&amp; make &amp;&amp; make install</userinput></screen>
 | 
						|
 | 
						|
        <para>To enable &man.dcons.4; as an active low-level
 | 
						|
	  console, add <literal>boot_multicons="YES"</literal> to 
 | 
						|
	  <filename>/boot/loader.conf</filename>.</para>
 | 
						|
	  
 | 
						|
	<para>Here are a few configuration examples.  A sample kernel
 | 
						|
	  configuration file would contain:</para>
 | 
						|
 | 
						|
	<screen>device dcons
 | 
						|
device dcons_crom
 | 
						|
options KDB
 | 
						|
options DDB
 | 
						|
options GDB
 | 
						|
options ALT_BREAK_TO_DEBUGGER</screen>
 | 
						|
 | 
						|
	<para>And a sample <filename>/boot/loader.conf</filename>
 | 
						|
	  would contain:</para>
 | 
						|
 | 
						|
	<screen>dcons_crom_load="YES"
 | 
						|
dcons_gdb=1
 | 
						|
boot_multicons="YES"
 | 
						|
hw.firewire.phydma_enable=1
 | 
						|
hw.firewire.dcons_crom.force_console=1</screen>
 | 
						|
 | 
						|
      </sect3>
 | 
						|
 | 
						|
      <sect3>
 | 
						|
	<title>Enabling &firewire; and Dcons support on the host
 | 
						|
	  machine</title>
 | 
						|
 | 
						|
	<para>To enable &firewire; support in the kernel on the
 | 
						|
	  <emphasis>host machine</emphasis>:</para>
 | 
						|
 | 
						|
	<screen>&prompt.root; <userinput>kldload firewire</userinput></screen>
 | 
						|
 | 
						|
	<para>Find out the <acronym>EUI64</acronym> (the unique 64-bit
	  identifier) of the &firewire; host controller, and
 | 
						|
	  use &man.fwcontrol.8; or <command>dmesg</command> to
 | 
						|
	  find the <acronym>EUI64</acronym> of the target machine.</para>
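
	<para>As a sketch, assuming the &firewire; driver logged the
	  identifier when it attached (the exact message format may
	  differ between releases), the boot messages can be searched
	  for it:</para>

	<screen>&prompt.root; <userinput>dmesg | grep -i eui64</userinput></screen>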
 | 
						|
 | 
						|
	<para>Run &man.dconschat.8;, with:</para>
 | 
						|
 | 
						|
	<screen>&prompt.root; <userinput>dconschat -e \# -br -G 12345 -t <replaceable>00-11-22-33-44-55-66-77</replaceable></userinput></screen>
 | 
						|
 | 
						|
	<para>The following key combinations can be used once
 | 
						|
	  &man.dconschat.8; is running:</para>
 | 
						|
 | 
						|
	<informaltable pgwide="1">
 | 
						|
	  <tgroup cols="2">
 | 
						|
	    <tbody>
 | 
						|
	      <row>
 | 
						|
		<entry>
 | 
						|
		  <keycombo action="seq">
 | 
						|
		    <keycap>~</keycap>
 | 
						|
		    <keycap>.</keycap>
 | 
						|
		  </keycombo>
 | 
						|
		</entry>
 | 
						|
		<entry>Disconnect</entry>
 | 
						|
	      </row>
 | 
						|
	      <row>
 | 
						|
		<entry>
 | 
						|
		  <keycombo action="seq">
 | 
						|
		    <keycap>~</keycap>
 | 
						|
		    <keycombo action="simul">
 | 
						|
		      <keycap>Ctrl</keycap>
 | 
						|
		      <keycap>B</keycap>
 | 
						|
		    </keycombo>
 | 
						|
		  </keycombo>
 | 
						|
		</entry>	  
 | 
						|
		<entry>ALT BREAK</entry>
 | 
						|
	      </row>
 | 
						|
	      <row>
 | 
						|
		<entry>
 | 
						|
		  <keycombo action="seq">
 | 
						|
		    <keycap>~</keycap>
 | 
						|
		    <keycombo action="simul">
 | 
						|
		      <keycap>Ctrl</keycap>
 | 
						|
		      <keycap>R</keycap>
 | 
						|
		    </keycombo>
 | 
						|
		  </keycombo>
 | 
						|
		</entry>
 | 
						|
		<entry>RESET target</entry>
 | 
						|
	      </row>
 | 
						|
	      <row>
 | 
						|
		<entry>
 | 
						|
		  <keycombo action="seq">
 | 
						|
		    <keycap>~</keycap>
 | 
						|
		    <keycombo action="simul">
 | 
						|
		      <keycap>Ctrl</keycap>
 | 
						|
		      <keycap>Z</keycap>
 | 
						|
		    </keycombo>
 | 
						|
		  </keycombo>
 | 
						|
		</entry>	
 | 
						|
		<entry>Suspend dconschat</entry>
 | 
						|
	      </row>
 | 
						|
	    </tbody>
 | 
						|
	  </tgroup>
 | 
						|
	</informaltable>
 | 
						|
 | 
						|
	<para>Attach remote <acronym>GDB</acronym> by starting
 | 
						|
	  &man.kgdb.1; with a remote debugging session:</para>
 | 
						|
 | 
						|
	<screen><userinput>kgdb -r :12345 kernel</userinput></screen>
 | 
						|
 | 
						|
      </sect3>
 | 
						|
      <sect3>
 | 
						|
	<title>Some general tips</title>
 | 
						|
 | 
						|
	<para>Here are some general tips:</para>
 | 
						|
 | 
						|
	<para>To take full advantage of the speed of &firewire;,
 | 
						|
	  disable other slow console drivers:</para>
 | 
						|
 | 
						|
	<screen>&prompt.root; conscontrol delete ttyd0	     # serial console
 | 
						|
&prompt.root; conscontrol delete consolectl	# video/keyboard</screen>
 | 
						|
 | 
						|
	<para>There exists a <acronym>GDB</acronym> mode for
 | 
						|
	  &man.emacs.1;; this is what you will need to add to your
 | 
						|
	  <filename>.emacs</filename>:</para>
 | 
						|
 | 
						|
	<screen><userinput>(setq gud-gdba-command-name "kgdb -a -a -a -r :12345")
 | 
						|
(setq gdb-many-windows t)
 | 
						|
(xterm-mouse-mode 1)
 | 
						|
M-x gdba</userinput></screen>
 | 
						|
 | 
						|
	<para>And for <acronym>DDD</acronym> (<filename>devel/ddd</filename>):</para>
 | 
						|
 | 
						|
	<screen># remote serial protocol
 | 
						|
LANG=C ddd --debugger kgdb -r :12345 kernel
 | 
						|
# live core debug
 | 
						|
LANG=C ddd --debugger kgdb kernel /dev/fwmem0.2</screen>
 | 
						|
      </sect3>
 | 
						|
    </sect2>
 | 
						|
 | 
						|
    <sect2>
 | 
						|
      <title>Dcons with KVM</title>
 | 
						|
 | 
						|
      <para>We can directly read the &man.dcons.4; buffer via
 | 
						|
	<filename>/dev/mem</filename> for live systems, and in the
 | 
						|
	core dump for crashed systems.  These give you similar output
 | 
						|
	to <command>dmesg -a</command>, but the &man.dcons.4; buffer
 | 
						|
	includes more information.</para> 
 | 
						|
 | 
						|
      <sect3>
 | 
						|
	<title>Using Dcons with KVM</title>
 | 
						|
 | 
						|
	<para>To use &man.dcons.4; with <acronym>KVM</acronym>:</para>
 | 
						|
 | 
						|
	<para>Dump a &man.dcons.4; buffer of a live system:</para>
 | 
						|
 | 
						|
	<screen>&prompt.root; <userinput>dconschat -1</userinput></screen>
 | 
						|
 | 
						|
	<para>Dump a &man.dcons.4; buffer of a crash dump:</para>
 | 
						|
 | 
						|
	<screen>&prompt.root; <userinput>dconschat -1 -M vmcore.XX</userinput></screen>
 | 
						|
 | 
						|
	<para>Live core debugging can be done via:</para>
 | 
						|
 | 
						|
	<screen>&prompt.root; <userinput>fwcontrol -m target_eui64</userinput>
 | 
						|
&prompt.root; <userinput>kgdb kernel /dev/fwmem0.2</userinput></screen>
 | 
						|
      </sect3>
 | 
						|
    </sect2>
 | 
						|
  </sect1>
 | 
						|
 | 
						|
  <sect1 xml:id="kerneldebug-options">
 | 
						|
    <title>Glossary of Kernel Options for Debugging</title>
 | 
						|
 | 
						|
    <para>This section provides a brief glossary of compile-time kernel
 | 
						|
      options used for debugging:</para>
 | 
						|
 | 
						|
    <itemizedlist>
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options KDB</literal>: compiles in the kernel
 | 
						|
	  debugger framework.  Required for <literal>options DDB</literal>
 | 
						|
	  and <literal>options GDB</literal>.  Little or no performance
 | 
						|
	  overhead.  By default, the debugger will be entered on panic
 | 
						|
	  instead of an automatic reboot.</para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options KDB_UNATTENDED</literal>: change the default
 | 
						|
	  value of the <literal>debug.debugger_on_panic</literal> sysctl to
 | 
						|
	  0, which controls whether the debugger is entered on panic.  When
 | 
						|
	  <literal>options KDB</literal> is not compiled into the kernel, the
 | 
						|
	  behavior is to automatically reboot on panic; when it is compiled
 | 
						|
	  into the kernel, the default behavior is to drop into the debugger
 | 
						|
	  unless <literal>options KDB_UNATTENDED</literal> is compiled in.
 | 
						|
	  If you want to leave the kernel debugger compiled into the kernel
 | 
						|
	  but want the system to come back up unless you are on-hand to use
 | 
						|
	  the debugger for diagnostics, use this option.</para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options KDB_TRACE</literal>: change the default value
 | 
						|
	  of the <literal>debug.trace_on_panic</literal> sysctl to 1, which
 | 
						|
	  controls whether the debugger automatically prints a stack trace
 | 
						|
	  on panic.  Especially if running with <literal>options
 | 
						|
	  KDB_UNATTENDED</literal>, this can be helpful to gather basic
 | 
						|
	  debugging information on the serial or firewire console while
 | 
						|
	  still rebooting to recover.</para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options DDB</literal>: compile in support for the
 | 
						|
	  console debugger, DDB.  This interactive debugger runs on whatever
 | 
						|
	  the active low-level console of the system is, which includes the
 | 
						|
	  video console, serial console, or firewire console.  It provides
 | 
						|
	  basic integrated debugging facilities, such as stack tracing,
 | 
						|
	  process and thread listing, dumping of lock state, VM state, file
 | 
						|
	  system state, and kernel memory management.  DDB does not require
 | 
						|
	  software running on a second machine or being able to generate a
 | 
						|
	  core dump or full debugging kernel symbols, and provides detailed
 | 
						|
	  diagnostics of the kernel at run-time.  Many bugs can be fully
 | 
						|
	  diagnosed using only DDB output.  This option depends on
 | 
						|
	  <literal>options KDB</literal>.</para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options GDB</literal>: compile in support for the
 | 
						|
	  remote debugger, GDB, which can operate over serial cable or
 | 
						|
	  firewire.  When the debugger is entered, GDB may be attached to
 | 
						|
	  inspect structure contents, generate stack traces, etc.  Some
 | 
						|
	  kernel state is more awkward to access than in DDB, which is able
 | 
						|
	  to generate useful summaries of kernel state automatically, such
 | 
						|
	  as automatically walking lock debugging or kernel memory
 | 
						|
	  management structures, and a second machine running the debugger
 | 
						|
	  is required.  On the other hand, GDB combines information from
 | 
						|
	  the kernel source and full debugging symbols, is aware of full
	  data structure definitions and local variables, and is scriptable.
 | 
						|
	  This option is not required to run GDB on a kernel core dump.
 | 
						|
	  This option depends on <literal>options KDB</literal>.
 | 
						|
	  </para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options BREAK_TO_DEBUGGER</literal>, <literal>options
 | 
						|
	  ALT_BREAK_TO_DEBUGGER</literal>: allow a break signal or
 | 
						|
	  alternative signal on the console to enter the debugger.  If the
 | 
						|
	  system hangs without a panic, this is a useful way to reach the
 | 
						|
	  debugger.  Due to the current kernel locking, a break signal
 | 
						|
	  generated on a serial console is significantly more reliable at
 | 
						|
	  getting into the debugger, and is generally recommended.  This
 | 
						|
	  option has little or no performance impact.</para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
        <para><literal>options INVARIANTS</literal>: compile into the kernel
 | 
						|
	  a large number of run-time assertion checks and tests, which
 | 
						|
	  constantly test the integrity of kernel data structures and the
 | 
						|
	  invariants of kernel algorithms.  These tests can be expensive, so
 | 
						|
	  are not compiled in by default, but help provide useful "fail stop"
 | 
						|
	  behavior, in which certain classes of undesired behavior enter the
 | 
						|
	  debugger before kernel data corruption occurs, making them easier
 | 
						|
	  to debug.  Tests include memory scrubbing and use-after-free
 | 
						|
	  testing, which is one of the more significant sources of overhead.
 | 
						|
	  This option depends on <literal>options INVARIANT_SUPPORT</literal>.
 | 
						|
	  </para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options INVARIANT_SUPPORT</literal>: many of the tests
 | 
						|
	  present in <literal>options INVARIANTS</literal> require modified
 | 
						|
	  data structures or additional kernel symbols to be defined.</para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options WITNESS</literal>: this option enables run-time
 | 
						|
	  lock order tracking and verification, and is an invaluable tool for
 | 
						|
	  deadlock diagnosis.  WITNESS maintains a graph of acquired lock
 | 
						|
	  orders by lock type, and checks the graph at each acquire for
 | 
						|
	  cycles (implicit or explicit).  If a cycle is detected, a warning
 | 
						|
	  and stack trace are generated to the console, indicating that a
 | 
						|
	  potential deadlock might have occurred.  WITNESS is required in
 | 
						|
	  order to use the <command>show locks</command>, <command>show
 | 
						|
	  witness</command> and <command>show alllocks</command> DDB
 | 
						|
	  commands.  This debug option has significant performance overhead,
 | 
						|
	  which may be somewhat mitigated through the use of <literal>options
 | 
						|
	  WITNESS_SKIPSPIN</literal>.  Detailed documentation may be found in
 | 
						|
	  &man.witness.4;.</para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
 | 
						|
	<para><literal>options WITNESS_SKIPSPIN</literal>: disable run-time
	  checking of spin lock order with WITNESS.  As spin locks are
 | 
						|
	  acquired most frequently in the scheduler, and scheduler events
 | 
						|
	  occur often, this option can significantly speed up systems
 | 
						|
	  running with WITNESS.  This option depends on <literal>options
 | 
						|
	  WITNESS</literal>.</para>
 | 
						|
      </listitem>
 | 
						|
 | 
						|
      <listitem>
	<para><literal>options WITNESS_KDB</literal>: change the default
	  value of the <literal>debug.witness.kdb</literal> sysctl to 1,
	  which causes WITNESS to enter the debugger when a lock order
	  violation is detected, rather than simply printing a warning.  This
	  option depends on <literal>options WITNESS</literal>.</para>
      </listitem>

      <listitem>
	<para><literal>options SOCKBUF_DEBUG</literal>: perform extensive
	  run-time consistency checking on socket buffers, which can be
	  useful for debugging both socket bugs and race conditions in
	  protocols and device drivers that interact with sockets.  This
	  option significantly impacts network performance, and may change
	  the timing in device driver races.</para>
      </listitem>

      <listitem>
	<para><literal>options DEBUG_VFS_LOCKS</literal>: track lock
	  acquisition points for lockmgr/vnode locks, expanding the amount
	  of information displayed by <command>show lockedvnods</command>
	  in DDB.  This option has a measurable performance impact.</para>
      </listitem>

      <listitem>
	<para><literal>options DEBUG_MEMGUARD</literal>: a replacement for
	  the &man.malloc.9; kernel memory allocator that uses the VM system
	  to detect reads from or writes to allocated memory after it has
	  been freed.  Details may be found in &man.memguard.9;.  This
	  option has a significant performance impact, but can be very
	  helpful in debugging kernel memory corruption bugs.</para>
      </listitem>

      <listitem>
	<para><literal>options DIAGNOSTIC</literal>: enable additional, more
	  expensive diagnostic tests along the lines of <literal>options
	  INVARIANTS</literal>.</para>
      </listitem>

    </itemizedlist>
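
    <para>As a rough illustration of how the options above combine, the
      following is a minimal sketch of a debugging kernel configuration
      file.  The file name and the <literal>ident</literal> value
      <literal>DEBUGKERNEL</literal> are hypothetical, and the selection
      of options is only one plausible combination, not a
      recommendation.  Options that the included configuration already
      sets do not need to be repeated.</para>

    <programlisting># Hypothetical file: /usr/src/sys/amd64/conf/DEBUGKERNEL
# Start from the stock configuration and add debugging options.
include		GENERIC
ident		DEBUGKERNEL

# In-kernel debugger; assumed here so that failed checks can enter DDB.
options 	KDB
options 	DDB

# Run-time assertions; INVARIANTS depends on INVARIANT_SUPPORT.
options 	INVARIANT_SUPPORT
options 	INVARIANTS

# Lock order verification; both sub-options depend on WITNESS.
options 	WITNESS
options 	WITNESS_SKIPSPIN	# skip spin locks to reduce overhead
options 	WITNESS_KDB		# default debug.witness.kdb to 1</programlisting>

    <para>On a kernel built with <literal>options WITNESS</literal> and
      a debugger back end, the behavior that
      <literal>options WITNESS_KDB</literal> selects by default can
      most likely also be switched on at run time through the same
      sysctl:</para>

    <screen>&prompt.root; <userinput>sysctl debug.witness.kdb=1</userinput></screen>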
  </sect1>

</chapter>