Update Question 18.14, a major rework:

- Dissolve the original e-mail style description
- Add a procedure on how to make (install|build)kernel
- Recommend kgdb(1) instead of gdb(1) (based on the Developer's Handbook)
- Improve markup (suggested by gabor)
- Update path names
- Turn "FreeBSD" into &os;
- Merge wpaul's original comment into the answer

Reviewed by: trhodes
Approved by: gabor
svn path=/head/; revision=32488
2008-07-05 03:22:25 +00:00 · 2008-07-05 03:22:25 +00:00 · b909f782f9 · 2020-12-08 03:00:23 +00:00
commit b909f782f9
parent 0411abd4e6
1 changed files with 107 additions and 112 deletions
--- a/en_US.ISO8859-1/books/faq/book.sgml
+++ b/en_US.ISO8859-1/books/faq/book.sgml
@@ -10688,42 +10688,28 @@ hint.sio.7.irq="12"</programlisting>
        </question>

        <answer>
-          <para><emphasis>[This section was extracted from a mail
-            written by &a.wpaul; on the freebsd-current
-            <link linkend="mailing">mailing list</link> by &a.des;, who
-            fixed a few typos and added the bracketed comments]
-            </emphasis></para>
+          <para>Here is a typical kernel panic:</para>

-          <programlisting>From: Bill Paul &lt;wpaul@skynet.ctr.columbia.edu&gt;
-Subject: Re: the fs fun never stops
-To: Ben Rosengart
-Date: Sun, 20 Sep 1998 15:22:50 -0400 (EDT)
-Cc: current@FreeBSD.org</programlisting>
+          <programlisting>Fatal trap 12: page fault while in kernel mode
+fault virtual address   = 0x40
+fault code              = supervisor read, page not present
+instruction pointer     = 0x8:0xf014a7e5
+stack pointer           = 0x10:0xf4ed6f24
+frame pointer           = 0x10:0xf4ed6f28
+code segment            = base 0x0, limit 0xfffff, type 0x1b
+                        = DPL 0, pres 1, def32 1, gran 1
+processor eflags        = interrupt enabled, resume, IOPL = 0
+current process         = 80 (mount)
+interrupt mask          =
+trap number             = 12
+panic: page fault</programlisting>

-          <para><emphasis>Ben Rosengart posted the following
-            panic message]</emphasis></para>
-
-          <programlisting>&gt; Fatal trap 12: page fault while in kernel mode
-&gt; fault virtual address   = 0x40
-&gt; fault code              = supervisor read, page not present
-&gt; instruction pointer     = 0x8:0xf014a7e5
-                                ^^^^^^^^^^
-&gt; stack pointer           = 0x10:0xf4ed6f24
-&gt; frame pointer           = 0x10:0xf4ed6f28
-&gt; code segment            = base 0x0, limit 0xfffff, type 0x1b
-&gt;                         = DPL 0, pres 1, def32 1, gran 1
-&gt; processor eflags        = interrupt enabled, resume, IOPL = 0
-&gt; current process         = 80 (mount)
-&gt; interrupt mask          =
-&gt; trap number             = 12
-&gt; panic: page fault</programlisting>
-
-          <para>[When] you see a message like this, it is not enough to just
-            reproduce it and send it in. The instruction pointer value that
+          <para>When you see a message like this, it is not enough to just
+            reproduce it and send it in.  The instruction pointer value that
            I highlighted up there is important; unfortunately, it is also
-            configuration dependent. In other words, the value varies
-            depending on the exact kernel image that you are using. If
-            you are using a GENERIC kernel image from one of the snapshots,
+            configuration dependent.  In other words, the value varies
+            depending on the exact kernel image that you are using.  If
+            you are using a <filename>GENERIC</filename> kernel image from one of the snapshots,
            then it is possible for somebody else to track down the
            offending function, but if you are running a custom kernel then
            only <emphasis>you</emphasis> can tell us where the fault
@@ -10733,93 +10719,98 @@ Cc: current@FreeBSD.org</programlisting>

            <procedure>
              <step>
-                <para>Write down the instruction pointer value. Note that
+                <para>Write down the instruction pointer value.  Note that
                  the <literal>0x8:</literal> part at the beginning is not
                  significant in this case: it is the
                  <literal>0xf0xxxxxx</literal> part that we want.</para>
              </step>

              <step>
-                <para>When the system reboots, do the following:
+                <para>When the system reboots, do the following:</para>

-                  <screen>&prompt.user; <userinput>nm -n /kernel.that.caused.the.panic | grep f0xxxxxx</userinput></screen>
+                <screen>&prompt.user; <userinput><command>nm</command> <option>-n</option> <replaceable>kernel.that.caused.the.panic</replaceable> | <command>grep</command> f0xxxxxx</userinput></screen>

-                  where <literal>f0xxxxxx</literal> is the instruction
-                  pointer value. The odds are you will not get an exact
+                <para>where <literal>f0xxxxxx</literal> is the instruction
+                  pointer value.  The odds are you will not get an exact
                  match since the symbols in the kernel symbol table are
                  for the entry points of functions and the instruction
                  pointer address will be somewhere inside a function, not
                  at the start. If you do not get an exact match, omit the
                  last digit from the instruction pointer value and try
-                  again, i.e.:
+                  again, i.e.:</para>

-                  <screen>&prompt.user; <userinput>nm -n /kernel.that.caused.the.panic | grep f0xxxxx</userinput></screen>
+                <screen>&prompt.user; <userinput><command>nm</command> <option>-n</option> <replaceable>kernel.that.caused.the.panic</replaceable> | <command>grep</command> f0xxxxx</userinput></screen>

-                   If that does not yield any results, chop off another
-                   digit. Repeat until you get some sort of output. The
+                <para>If that does not yield any results, chop off another
+                   digit.  Repeat until you get some sort of output.  The
                   result will be a possible list of functions which caused
-                   the panic. This is a less than exact mechanism for
+                   the panic.  This is a less than exact mechanism for
                   tracking down the point of failure, but it is better than
                   nothing.</para>
              </step>
            </procedure>

-          <para>I see people constantly show panic messages like this
-            but rarely do I see someone take the time to match up the
-            instruction pointer with a function in the kernel symbol
-            table.</para>
-
-          <para>The best way to track down the cause of a panic is by
-            capturing a crash dump, then using &man.gdb.1; to generate
+          <para>However, the best way to track down the cause of a panic is by
+            capturing a crash dump, then using &man.kgdb.1; to generate
            a stack trace on the crash dump.</para>

-          <para>In any case, the method I normally use is this:</para>
+          <para>In any case, the method is this:</para>

            <procedure>
-              <step>
-                <para>Set up a kernel config file, optionally adding
-                  <literal>options DDB</literal> if you think you need
-                  the kernel debugger for something. (I use this mainly
-                  for setting breakpoints if I suspect an infinite loop
-                  condition of some kind.)</para>
-              </step>
+	      <step>
+		<para>Make sure that the following line is included in
+		  your kernel configuration file
+		  (/usr/src/sys/<replaceable>arch</replaceable>/conf/<replaceable>MYKERNEL</replaceable>):</para>

-              <step>
-                <para>Use <command>config -g
-                  <replaceable>KERNELCONFIG</replaceable></command> to set
-                  up the build directory.</para>
-              </step>
+		<programlisting>makeoptions     DEBUG=-g          # Build kernel with gdb(1) debug symbols</programlisting>
+	      </step>

-              <step>
-                <para><command>cd /sys/compile/<replaceable>KERNELCONFIG</replaceable>; make</command></para>
-              </step>
+	      <step>
+		<para>Change to the <filename
+		    role="directory">/usr/src</filename> directory:</para>

-              <step>
-                <para>Wait for kernel to finish compiling.</para>
-              </step>
+		<screen>&prompt.root; <command>cd</command> <filename role="directory">/usr/src</filename></screen>
+	      </step>

-              <step>
-                <para><command>make install</command></para>
-              </step>
+	      <step>
+		<para>Compile the kernel:</para>

-              <step>
-                <para>reboot</para>
-              </step>
+		<screen>&prompt.root; <command>make</command> <maketarget>buildkernel</maketarget> <makevar>KERNCONFIG</makevar>=<replaceable>MYKERNEL</replaceable></screen>
+	      </step>
+
+	      <step>
+		<para>Wait for &man.make.1; to finish compiling.</para>
+	      </step>
+
+	      <step>
+		<screen>&prompt.root; <command>make</command> <maketarget>installkernel</maketarget> <makevar>KERNCONFIG</makevar>=<replaceable>MYKERNEL</replaceable></screen>
+	      </step>
+
+	      <step>
+		<para>Reboot.</para>
+	      </step>
            </procedure>

-          <para>The &man.make.1; process will have built two kernels.
-            <filename>kernel</filename> and
-            <filename>kernel.debug</filename>.
-            <filename>kernel</filename> was installed as
-            <filename>/kernel</filename>, while
-            <filename>kernel.debug</filename> can be used as the
-            source of debugging symbols for &man.gdb.1;.</para>
+	    <note>
+	      <para>If you do not use the <makevar>KERNCONFIG</makevar>
+		make variable a <filename>GENERIC</filename> kernel will
+		be built and installed.</para>
+	    </note>

-          <para>To make sure you capture a crash dump, you need edit
+	  <para>The &man.make.1; process will have built two kernels.
+	    <filename>/usr/obj/usr/src/sys/<replaceable>MYKERNEL</replaceable>/kernel</filename>
+	    and
+	    <filename>/usr/obj/usr/src/sys/<replaceable>MYKERNEL</replaceable>/kernel.debug</filename>.
+	    <filename>kernel</filename> was installed as
+	    <filename>/boot/kernel/kernel</filename>, while
+	    <filename>kernel.debug</filename> can be used as the source
+	    of debugging symbols for &man.kgdb.1;.</para>
+
+         <para>To make sure you capture a crash dump, you need to edit
            <filename>/etc/rc.conf</filename> and set
            <literal>dumpdev</literal> to point to your swap
-            partition. This will cause the &man.rc.8; scripts to use
-            the &man.dumpon.8; command to enable crash dumps. You can
+            partition (or <literal>AUTO</literal>).  This will cause the &man.rc.8; scripts to use
+            the &man.dumpon.8; command to enable crash dumps.  You can
            also run &man.dumpon.8; manually.  After a panic, the
            crash dump can be recovered using &man.savecore.8;; if
            <literal>dumpdev</literal> is set in
@@ -10828,27 +10819,28 @@ Cc: current@FreeBSD.org</programlisting>
            dump in <filename>/var/crash</filename>.</para>

            <note>
-              <para>FreeBSD crash dumps are usually the same size as the
-                physical RAM size of your machine. That is, if you have
-                64MB of RAM, you will get a 64MB crash dump. Therefore you
+              <para>&os; crash dumps are usually the same size as the
+                physical RAM size of your machine.  That is, if you have
+                512&nbsp;MB of RAM, you will get a 512&nbsp;MB crash dump. Therefore you
                must make sure there is enough space in
                <filename>/var/crash</filename> to hold the dump.
                Alternatively, you run &man.savecore.8;
                manually and have it recover the crash dump to another
-                directory where you have more room. It is possible to limit
+                directory where you have more room.  It is possible to limit
                the size of the crash dump by using <literal>options
-                MAXMEM=(foo)</literal> to set the amount of memory the
-                kernel will use to something a little more sensible. For
-                example, if you have 128MB of RAM, you can limit the
-                kernel's memory usage to 16MB so that your crash dump size
-                will be 16MB instead of 128MB.</para>
+                MAXMEM=<replaceable>N</replaceable></literal> where
+                <replaceable>N</replaceable> is the size of kernel's memory
+                usage in KBs.
+                For example, if you have 1&nbsp;GB of RAM, you can limit the
+                kernel's memory usage to 128&nbsp;MB this way, so that your crash dump size
+                will be 128&nbsp;MB instead of 1&nbsp;GB.</para>
            </note>

          <para>Once you have recovered the crash dump, you can get a
-            stack trace with &man.gdb.1; as follows:</para>
+            stack trace with &man.kgdb.1; as follows:</para>

-          <screen>&prompt.user; <userinput>gdb -k /sys/compile/KERNELCONFIG/kernel.debug /var/crash/vmcore.0</userinput>
-<prompt>(gdb)</prompt> <userinput>where</userinput></screen>
+          <screen>&prompt.user; <userinput><command>kgdb</command> <filename>/usr/obj/usr/src/sys/<replaceable>MYKERNEL</replaceable>/kernel.debug</filename> <filename>/var/crash/<replaceable>vmcore.0</replaceable></filename></userinput>
+<prompt>(kgdb)</prompt> <userinput>backtrace</userinput></screen>

          <para>Note that there may be several screens worth of
            information; ideally you should use
@@ -10857,25 +10849,28 @@ Cc: current@FreeBSD.org</programlisting>
            the exact line of kernel source code where the panic occurred.
            Usually you have to read the stack trace from the bottom up in
            order to trace the exact sequence of events that lead to the
-            crash. You can also use &man.gdb.1; to print out
+            crash. You can also use &man.kgdb.1; to print out
            the contents of various variables or structures in order to
            examine the system state at the time of the crash.</para>

-          <para>Now, if you are really insane and have a second computer,
-            you can also configure &man.gdb.1; to do remote
-            debugging such that you can use &man.gdb.1; on
-            one system to debug the kernel on another system, including
-            setting breakpoints, single-stepping through the kernel code,
-            just like you can do with a normal user-mode program. I have not
-            played with this yet as I do not often have the chance to set up
-            two machines side by side for debugging purposes.</para>
+	  <tip>
+	    <para>Now, if you are really insane and have a second
+	      computer, you can also configure &man.kgdb.1; to do remote
+	      debugging such that you can use &man.kgdb.1; on one system
+	      to debug the kernel on another system, including setting
+	      breakpoints, single-stepping through the kernel code, just
+	      like you can do with a normal user-mode program.</para>
+	  </tip>

-          <para><emphasis>[Bill adds: "I forgot to mention one thing: if
-            you have DDB enabled and the kernel drops into the debugger,
-            you can force a panic (and a crash dump) just by typing 'panic'
-            at the ddb prompt. It may stop in the debugger again during the
-            panic phase. If it does, type 'continue' and it will finish the
-            crash dump." -ed]</emphasis></para>
+	  <note>
+	    <para>If you have <literal>DDB</literal> enabled and the
+	      kernel drops into the debugger, you can force a panic (and a
+	      crash dump) just by typing <literal>panic</literal> at the
+	      <literal>ddb</literal> prompt.  It may stop in the
+	      debugger again during the panic phase.  If it does, type
+	      <literal>continue</literal> and it will finish the crash
+	      dump.</para>
+	  </note>
        </answer>
      </qandaentry>
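
Note on the nm/grep step in the answer text: chopping digits off the instruction pointer and grepping is a rough prefix search. The same lookup can be sketched more directly by taking the sorted output of `nm -n` and picking the last symbol whose address is at or below the instruction pointer. The following is only a minimal illustration under stated assumptions, not part of this commit: the symbol names and addresses in `SAMPLE` are hypothetical, and the parser assumes the common three-column `address type name` layout of nm output.

```python
import bisect


def parse_nm_output(text):
    """Parse `nm -n` style output into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        # Defined symbols have three fields: address, type, name.
        # Undefined symbols (no address) are skipped.
        if len(parts) != 3:
            continue
        addr, _sym_type, name = parts
        try:
            symbols.append((int(addr, 16), name))
        except ValueError:
            continue
    symbols.sort()
    return symbols


def find_enclosing_symbol(symbols, ip):
    """Return (name, offset) for the last symbol at or below ip, else None."""
    addrs = [addr for addr, _name in symbols]
    idx = bisect.bisect_right(addrs, ip) - 1
    if idx < 0:
        return None
    addr, name = symbols[idx]
    return name, ip - addr


# Hypothetical symbol table; real input would come from
# `nm -n kernel.that.caused.the.panic`.
SAMPLE = """\
f014a6c0 T hypothetical_func_a
f014a7a0 T hypothetical_func_b
f014a900 T hypothetical_func_c
"""

if __name__ == "__main__":
    syms = parse_nm_output(SAMPLE)
    # 0xf014a7e5 is the instruction pointer from the sample panic message.
    print(find_enclosing_symbol(syms, 0xF014A7E5))
```

Like the grep approach, this only names a candidate function: it assumes the faulting address lies inside the nearest preceding symbol, which a kgdb backtrace on a crash dump would confirm far more reliably.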