Update Question 18.14, a major rework:
- Dissolve the original e-mail style description
- Add a procedure on how to make (install|build)kernel
- Recommend kgdb(1) instead of gdb(1) (based on the Developer's Handbook)
- Improve markup (suggested by gabor)
- Update path names
- Turn "FreeBSD" into &os;
- Merge wpaul's original comment into the answer

Reviewed by: trhodes
Approved by: gabor
parent 0411abd4e6
commit b909f782f9
Notes:
svn2git
2020-12-08 03:00:23 +00:00
svn path=/head/; revision=32488
1 changed files with 107 additions and 112 deletions
|
|
@ -10688,42 +10688,28 @@ hint.sio.7.irq="12"</programlisting>
|
||||||
</question>
|
</question>
|
||||||
|
|
||||||
<answer>
|
<answer>
|
||||||
<para><emphasis>[This section was extracted from a mail
|
<para>Here is typical kernel panic:</para>
|
||||||
written by &a.wpaul; on the freebsd-current
|
|
||||||
<link linkend="mailing">mailing list</link> by &a.des;, who
|
|
||||||
fixed a few typos and added the bracketed comments]
|
|
||||||
</emphasis></para>
|
|
||||||
|
|
||||||
<programlisting>From: Bill Paul <wpaul@skynet.ctr.columbia.edu>
|
<programlisting>Fatal trap 12: page fault while in kernel mode
|
||||||
Subject: Re: the fs fun never stops
|
fault virtual address = 0x40
|
||||||
To: Ben Rosengart
|
fault code = supervisor read, page not present
|
||||||
Date: Sun, 20 Sep 1998 15:22:50 -0400 (EDT)
|
instruction pointer = 0x8:0xf014a7e5
|
||||||
Cc: current@FreeBSD.org</programlisting>
|
stack pointer = 0x10:0xf4ed6f24
|
||||||
|
frame pointer = 0x10:0xf4ed6f28
|
||||||
|
code segment = base 0x0, limit 0xfffff, type 0x1b
|
||||||
|
= DPL 0, pres 1, def32 1, gran 1
|
||||||
|
processor eflags = interrupt enabled, resume, IOPL = 0
|
||||||
|
current process = 80 (mount)
|
||||||
|
interrupt mask =
|
||||||
|
trap number = 12
|
||||||
|
panic: page fault</programlisting>
|
||||||
|
|
||||||
<para><emphasis>Ben Rosengart posted the following
|
<para>When you see a message like this, it is not enough to just
|
||||||
panic message]</emphasis></para>
|
reproduce it and send it in. The instruction pointer value that
|
||||||
|
|
||||||
<programlisting>> Fatal trap 12: page fault while in kernel mode
|
|
||||||
> fault virtual address = 0x40
|
|
||||||
> fault code = supervisor read, page not present
|
|
||||||
> instruction pointer = 0x8:0xf014a7e5
|
|
||||||
^^^^^^^^^^
|
|
||||||
> stack pointer = 0x10:0xf4ed6f24
|
|
||||||
> frame pointer = 0x10:0xf4ed6f28
|
|
||||||
> code segment = base 0x0, limit 0xfffff, type 0x1b
|
|
||||||
> = DPL 0, pres 1, def32 1, gran 1
|
|
||||||
> processor eflags = interrupt enabled, resume, IOPL = 0
|
|
||||||
> current process = 80 (mount)
|
|
||||||
> interrupt mask =
|
|
||||||
> trap number = 12
|
|
||||||
> panic: page fault</programlisting>
|
|
||||||
|
|
||||||
<para>[When] you see a message like this, it is not enough to just
|
|
||||||
reproduce it and send it in. The instruction pointer value that
|
|
||||||
I highlighted up there is important; unfortunately, it is also
|
I highlighted up there is important; unfortunately, it is also
|
||||||
configuration dependent. In other words, the value varies
|
configuration dependent. In other words, the value varies
|
||||||
depending on the exact kernel image that you are using. If
|
depending on the exact kernel image that you are using. If
|
||||||
you are using a GENERIC kernel image from one of the snapshots,
|
you are using a <filename>GENERIC</filename> kernel image from one of the snapshots,
|
||||||
then it is possible for somebody else to track down the
|
then it is possible for somebody else to track down the
|
||||||
offending function, but if you are running a custom kernel then
|
offending function, but if you are running a custom kernel then
|
||||||
only <emphasis>you</emphasis> can tell us where the fault
|
only <emphasis>you</emphasis> can tell us where the fault
|
||||||
|
|
@ -10733,93 +10719,98 @@ Cc: current@FreeBSD.org</programlisting>
|
||||||
|
|
||||||
<procedure>
|
<procedure>
|
||||||
<step>
|
<step>
|
||||||
<para>Write down the instruction pointer value. Note that
|
<para>Write down the instruction pointer value. Note that
|
||||||
the <literal>0x8:</literal> part at the beginning is not
|
the <literal>0x8:</literal> part at the beginning is not
|
||||||
significant in this case: it is the
|
significant in this case: it is the
|
||||||
<literal>0xf0xxxxxx</literal> part that we want.</para>
|
<literal>0xf0xxxxxx</literal> part that we want.</para>
|
||||||
</step>
|
</step>
|
||||||
|
|
||||||
<step>
|
<step>
|
||||||
<para>When the system reboots, do the following:
|
<para>When the system reboots, do the following:</para>
|
||||||
|
|
||||||
<screen>&prompt.user; <userinput>nm -n /kernel.that.caused.the.panic | grep f0xxxxxx</userinput></screen>
|
<screen>&prompt.user; <userinput><command>nm</command> <option>-n</option> <replaceable>kernel.that.caused.the.panic</replaceable> | <command>grep</command> f0xxxxxx</userinput></screen>
|
||||||
|
|
||||||
where <literal>f0xxxxxx</literal> is the instruction
|
<para>where <literal>f0xxxxxx</literal> is the instruction
|
||||||
pointer value. The odds are you will not get an exact
|
pointer value. The odds are you will not get an exact
|
||||||
match since the symbols in the kernel symbol table are
|
match since the symbols in the kernel symbol table are
|
||||||
for the entry points of functions and the instruction
|
for the entry points of functions and the instruction
|
||||||
pointer address will be somewhere inside a function, not
|
pointer address will be somewhere inside a function, not
|
||||||
at the start. If you do not get an exact match, omit the
|
at the start. If you do not get an exact match, omit the
|
||||||
last digit from the instruction pointer value and try
|
last digit from the instruction pointer value and try
|
||||||
again, i.e.:
|
again, i.e.:</para>
|
||||||
|
|
||||||
<screen>&prompt.user; <userinput>nm -n /kernel.that.caused.the.panic | grep f0xxxxx</userinput></screen>
|
<screen>&prompt.user; <userinput><command>nm</command> <option>-n</option> <replaceable>kernel.that.caused.the.panic</replaceable> | <command>grep</command> f0xxxxx</userinput></screen>
|
||||||
|
|
||||||
If that does not yield any results, chop off another
|
<para>If that does not yield any results, chop off another
|
||||||
digit. Repeat until you get some sort of output. The
|
digit. Repeat until you get some sort of output. The
|
||||||
result will be a possible list of functions which caused
|
result will be a possible list of functions which caused
|
||||||
the panic. This is a less than exact mechanism for
|
the panic. This is a less than exact mechanism for
|
||||||
tracking down the point of failure, but it is better than
|
tracking down the point of failure, but it is better than
|
||||||
nothing.</para>
|
nothing.</para>
|
||||||
</step>
|
</step>
|
||||||
</procedure>
|
</procedure>
|
||||||
|
|
||||||
<para>I see people constantly show panic messages like this
|
<para>However, the best way to track down the cause of a panic is by
|
||||||
but rarely do I see someone take the time to match up the
|
capturing a crash dump, then using &man.kgdb.1; to generate
|
||||||
instruction pointer with a function in the kernel symbol
|
|
||||||
table.</para>
|
|
||||||
|
|
||||||
<para>The best way to track down the cause of a panic is by
|
|
||||||
capturing a crash dump, then using &man.gdb.1; to generate
|
|
||||||
a stack trace on the crash dump.</para>
|
a stack trace on the crash dump.</para>
|
||||||
|
|
||||||
<para>In any case, the method I normally use is this:</para>
|
<para>In any case, the method is this:</para>
|
||||||
|
|
||||||
<procedure>
|
<procedure>
|
||||||
<step>
|
<step>
|
||||||
<para>Set up a kernel config file, optionally adding
|
<para>Make sure that the following line is included in
|
||||||
<literal>options DDB</literal> if you think you need
|
your kernel configuration file
|
||||||
the kernel debugger for something. (I use this mainly
|
(/usr/src/sys/<replaceable>arch</replaceable>/conf/<replaceable>MYKERNEL</replaceable>):</para>
|
||||||
for setting breakpoints if I suspect an infinite loop
|
|
||||||
condition of some kind.)</para>
|
|
||||||
</step>
|
|
||||||
|
|
||||||
<step>
|
<programlisting>makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols</programlisting>
|
||||||
<para>Use <command>config -g
|
</step>
|
||||||
<replaceable>KERNELCONFIG</replaceable></command> to set
|
|
||||||
up the build directory.</para>
|
|
||||||
</step>
|
|
||||||
|
|
||||||
<step>
|
<step>
|
||||||
<para><command>cd /sys/compile/<replaceable>KERNELCONFIG</replaceable>; make</command></para>
|
<para>Change to the <filename
|
||||||
</step>
|
role="directory">/usr/src</filename> directory:</para>
|
||||||
|
|
||||||
<step>
|
<screen>&prompt.root; <command>cd</command> <filename role="directory">/usr/src</filename></screen>
|
||||||
<para>Wait for kernel to finish compiling.</para>
|
</step>
|
||||||
</step>
|
|
||||||
|
|
||||||
<step>
|
<step>
|
||||||
<para><command>make install</command></para>
|
<para>Compile the kernel:</para>
|
||||||
</step>
|
|
||||||
|
|
||||||
<step>
|
<screen>&prompt.root; <command>make</command> <maketarget>buildkernel</maketarget> <makevar>KERNCONFIG</makevar>=<replaceable>MYKERNEL</replaceable></screen>
|
||||||
<para>reboot</para>
|
</step>
|
||||||
</step>
|
|
||||||
|
<step>
|
||||||
|
<para>Wait for &man.make.1; to finish compiling.</para>
|
||||||
|
</step>
|
||||||
|
|
||||||
|
<step>
|
||||||
|
<screen>&prompt.root; <command>make</command> <maketarget>installkernel</maketarget> <makevar>KERNCONFIG</makevar>=<replaceable>MYKERNEL</replaceable></screen>
|
||||||
|
</step>
|
||||||
|
|
||||||
|
<step>
|
||||||
|
<para>Reboot.</para>
|
||||||
|
</step>
|
||||||
</procedure>
|
</procedure>
|
||||||
|
|
||||||
<para>The &man.make.1; process will have built two kernels.
|
<note>
|
||||||
<filename>kernel</filename> and
|
<para>If you do not use the <makevar>KERNCONFIG</makevar>
|
||||||
<filename>kernel.debug</filename>.
|
make variable a <filename>GENERIC</filename> kernel will
|
||||||
<filename>kernel</filename> was installed as
|
be built and installed.</para>
|
||||||
<filename>/kernel</filename>, while
|
</note>
|
||||||
<filename>kernel.debug</filename> can be used as the
|
|
||||||
source of debugging symbols for &man.gdb.1;.</para>
|
|
||||||
|
|
||||||
<para>To make sure you capture a crash dump, you need edit
|
<para>The &man.make.1; process will have built two kernels.
|
||||||
|
<filename>/usr/obj/usr/src/sys/<replaceable>MYKERNEL</replaceable>/kernel</filename>
|
||||||
|
and
|
||||||
|
<filename>/usr/obj/usr/src/sys/<replaceable>MYKERNEL</replaceable>/kernel.debug</filename>.
|
||||||
|
<filename>kernel</filename> was installed as
|
||||||
|
<filename>/boot/kernel/kernel</filename>, while
|
||||||
|
<filename>kernel.debug</filename> can be used as the source
|
||||||
|
of debugging symbols for &man.kgdb.1;.</para>
|
||||||
|
|
||||||
|
<para>To make sure you capture a crash dump, you need edit
|
||||||
<filename>/etc/rc.conf</filename> and set
|
<filename>/etc/rc.conf</filename> and set
|
||||||
<literal>dumpdev</literal> to point to your swap
|
<literal>dumpdev</literal> to point to your swap
|
||||||
partition. This will cause the &man.rc.8; scripts to use
|
partition (or <literal>AUTO</literal>). This will cause the &man.rc.8; scripts to use
|
||||||
the &man.dumpon.8; command to enable crash dumps. You can
|
the &man.dumpon.8; command to enable crash dumps. You can
|
||||||
also run &man.dumpon.8; manually. After a panic, the
|
also run &man.dumpon.8; manually. After a panic, the
|
||||||
crash dump can be recovered using &man.savecore.8;; if
|
crash dump can be recovered using &man.savecore.8;; if
|
||||||
<literal>dumpdev</literal> is set in
|
<literal>dumpdev</literal> is set in
|
||||||
|
|
@ -10828,27 +10819,28 @@ Cc: current@FreeBSD.org</programlisting>
|
||||||
dump in <filename>/var/crash</filename>.</para>
|
dump in <filename>/var/crash</filename>.</para>
|
||||||
|
|
||||||
<note>
|
<note>
|
||||||
<para>FreeBSD crash dumps are usually the same size as the
|
<para>&os; crash dumps are usually the same size as the
|
||||||
physical RAM size of your machine. That is, if you have
|
physical RAM size of your machine. That is, if you have
|
||||||
64MB of RAM, you will get a 64MB crash dump. Therefore you
|
512 MB of RAM, you will get a 512 MB crash dump. Therefore you
|
||||||
must make sure there is enough space in
|
must make sure there is enough space in
|
||||||
<filename>/var/crash</filename> to hold the dump.
|
<filename>/var/crash</filename> to hold the dump.
|
||||||
Alternatively, you run &man.savecore.8;
|
Alternatively, you run &man.savecore.8;
|
||||||
manually and have it recover the crash dump to another
|
manually and have it recover the crash dump to another
|
||||||
directory where you have more room. It is possible to limit
|
directory where you have more room. It is possible to limit
|
||||||
the size of the crash dump by using <literal>options
|
the size of the crash dump by using <literal>options
|
||||||
MAXMEM=(foo)</literal> to set the amount of memory the
|
MAXMEM=<replaceable>N</replaceable></literal> where
|
||||||
kernel will use to something a little more sensible. For
|
<replaceable>N</replaceable> is the size of kernel's memory
|
||||||
example, if you have 128MB of RAM, you can limit the
|
usage in KBs.
|
||||||
kernel's memory usage to 16MB so that your crash dump size
|
For example, if you have 1 GB of RAM, you can limit the
|
||||||
will be 16MB instead of 128MB.</para>
|
kernel's memory usage to 128 MB by this way, so that your crash dump size
|
||||||
|
will be 128 MB instead of 1 GB.</para>
|
||||||
</note>
|
</note>
|
||||||
|
|
||||||
<para>Once you have recovered the crash dump, you can get a
|
<para>Once you have recovered the crash dump, you can get a
|
||||||
stack trace with &man.gdb.1; as follows:</para>
|
stack trace with &man.kgdb.1; as follows:</para>
|
||||||
|
|
||||||
<screen>&prompt.user; <userinput>gdb -k /sys/compile/KERNELCONFIG/kernel.debug /var/crash/vmcore.0</userinput>
|
<screen>&prompt.user; <userinput><command>kgdb</command> <filename>/usr/obj/usr/src/sys/<replaceable>MYKERNEL</replaceable>/kernel.debug</filename> <filename>/var/crash/<replaceable>vmcore.0</replaceable></filename></userinput>
|
||||||
<prompt>(gdb)</prompt> <userinput>where</userinput></screen>
|
<prompt>(kgdb)</prompt> <userinput>backtrace</userinput></screen>
|
||||||
|
|
||||||
<para>Note that there may be several screens worth of
|
<para>Note that there may be several screens worth of
|
||||||
information; ideally you should use
|
information; ideally you should use
|
||||||
|
|
@ -10857,25 +10849,28 @@ Cc: current@FreeBSD.org</programlisting>
|
||||||
the exact line of kernel source code where the panic occurred.
|
the exact line of kernel source code where the panic occurred.
|
||||||
Usually you have to read the stack trace from the bottom up in
|
Usually you have to read the stack trace from the bottom up in
|
||||||
order to trace the exact sequence of events that lead to the
|
order to trace the exact sequence of events that lead to the
|
||||||
crash. You can also use &man.gdb.1; to print out
|
crash. You can also use &man.kgdb.1; to print out
|
||||||
the contents of various variables or structures in order to
|
the contents of various variables or structures in order to
|
||||||
examine the system state at the time of the crash.</para>
|
examine the system state at the time of the crash.</para>
|
||||||
|
|
||||||
<para>Now, if you are really insane and have a second computer,
|
<tip>
|
||||||
you can also configure &man.gdb.1; to do remote
|
<para>Now, if you are really insane and have a second
|
||||||
debugging such that you can use &man.gdb.1; on
|
computer, you can also configure &man.kgdb.1; to do remote
|
||||||
one system to debug the kernel on another system, including
|
debugging such that you can use &man.kgdb.1; on one system
|
||||||
setting breakpoints, single-stepping through the kernel code,
|
to debug the kernel on another system, including setting
|
||||||
just like you can do with a normal user-mode program. I have not
|
breakpoints, single-stepping through the kernel code, just
|
||||||
played with this yet as I do not often have the chance to set up
|
like you can do with a normal user-mode program.</para>
|
||||||
two machines side by side for debugging purposes.</para>
|
</tip>
|
||||||
|
|
||||||
<para><emphasis>[Bill adds: "I forgot to mention one thing: if
|
<note>
|
||||||
you have DDB enabled and the kernel drops into the debugger,
|
<para>If you have <literal>DDB</literal> enabled and the
|
||||||
you can force a panic (and a crash dump) just by typing 'panic'
|
kernel drops into the debugger, you can force a panic (and a
|
||||||
at the ddb prompt. It may stop in the debugger again during the
|
crash dump) just by typing <literal>panic</literal> at the
|
||||||
panic phase. If it does, type 'continue' and it will finish the
|
<literal>ddb</literal> prompt. It may stop in the
|
||||||
crash dump." -ed]</emphasis></para>
|
debugger again during the panic phase. If it does, type
|
||||||
|
<literal>continue</literal> and it will finish the crash
|
||||||
|
dump.</para>
|
||||||
|
</note>
|
||||||
</answer>
|
</answer>
|
||||||
</qandaentry>
|
</qandaentry>
|
||||||
|
|
||||||
|
|
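The nm(1) lookup in the procedure above (drop a digit, grep again) can be approximated in a single pass by picking the last symbol whose address is at or below the instruction pointer. The following is a minimal sketch and not part of the commit: `nearest_symbol` is a hypothetical helper name, and it assumes `nm -n` style input with one "address type name" entry per line, sorted by address.

```shell
#!/bin/sh
# nearest_symbol ADDR
# Reads "address type name" lines (as printed by nm -n) on stdin and prints
# the last entry whose address is less than or equal to ADDR.
nearest_symbol() {
    target=${1#0x}    # accept the address with or without a leading 0x
    awk -v target="$target" '
        # Left-pad hex strings with zeros so that plain string comparison
        # orders them the same way as their numeric values.
        function pad(h) { while (length(h) < 16) h = "0" h; return h }
        BEGIN { want = pad(tolower(target)) }
        # Undefined symbols have no address column, so skip short lines.
        NF == 3 {
            addr = pad(tolower($1))
            if (addr <= want) best = $0
        }
        END { if (best != "") print best }
    '
}
```

It would be used as `nm -n kernel.that.caused.the.panic | nearest_symbol 0xf014a7e5`. Like the grep approach, this only names the function that contains the address; a kgdb(1) backtrace remains the more precise tool.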
|
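For the crash-dump setup that the new text describes, the relevant /etc/rc.conf lines could look like the fragment below. It is illustrative only and not taken from the commit; `dumpdir` is shown with its usual default and matters only if the dump should be saved somewhere other than /var/crash.

```shell
# /etc/rc.conf (fragment)
dumpdev="AUTO"          # or the path of an explicit swap partition
dumpdir="/var/crash"    # directory where savecore(8) stores the recovered dump
```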
||||||
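Because the reworked note states that MAXMEM takes its value in KB, the 128 MB figure in its example has to be converted before it goes into the kernel configuration file. A small sketch of that arithmetic follows; `mb_to_maxmem_kb` is a hypothetical helper, not something the commit or &os; provides.

```shell
#!/bin/sh
# mb_to_maxmem_kb MB
# Converts a size in megabytes to the kilobyte figure that MAXMEM expects.
mb_to_maxmem_kb() {
    echo $(( $1 * 1024 ))
}
```

Running `mb_to_maxmem_kb 128` yields 131072, so the configuration line for the note's example would read `options MAXMEM=131072`.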