Re-wrote the SIG 11 section of the FAQ

Submitted by:	A cast of thousands
Mark Ovens 2000-09-22 18:35:37 +00:00
parent d693bb4ad1
commit eee08fd198
Notes: svn2git 2020-12-08 03:00:23 +00:00
svn path=/head/; revision=7998
2 changed files with 202 additions and 30 deletions

@@ -15,7 +15,7 @@
</author>
</authorgroup>
<pubdate>$FreeBSD: doc/en_US.ISO_8859-1/books/faq/book.sgml,v 1.91 2000/09/16 18:13:48 jim Exp $</pubdate>
<pubdate>$FreeBSD: doc/en_US.ISO_8859-1/books/faq/book.sgml,v 1.92 2000/09/18 19:50:21 jim Exp $</pubdate>
<abstract>
<para>This is the FAQ for FreeBSD versions 2.X, 3.X, and 4.X. All entries
@@ -3216,22 +3216,108 @@ understood) timing problem.</para>
<qandaentry><question>
<para>My programs occasionally die with <literal>Signal 11</literal> errors.</para></question><answer>
<para>This can be caused by bad hardware (memory, motherboard, etc.).
Try running a memory-testing program on your PC. Note that even
if every memory-testing program you try reports your memory as
fine, slightly marginal memory can pass all memory tests yet
fail under operating conditions (such as during bus-mastering DMA
from a SCSI controller like the Adaptec 1542, when you're stressing
memory by compiling a kernel, or just when the system is running
particularly hot).</para>
<para>Signal 11 (<literal>SIGSEGV</literal>) errors occur when a process
attempts to access memory that the operating system has not granted it
access to. If this happens at seemingly random intervals, you need to
investigate carefully.</para>
<para>The SIG11 FAQ (listed below) identifies slow memory as the
most common problem. Increase the number of wait states in your
BIOS setup, or get faster memory.</para>
<para>These problems can usually be attributed to either:</para>
<para>For me the guilty party has been bad cache RAM or a bad on-board
cache controller. Try disabling the on-board (secondary) cache in
the BIOS setup and see if that solves the problem.</para>
<orderedlist>
<listitem>
<para>If the problem occurs only in a specific application
that you are developing yourself, it is probably a bug in your code.
</para>
</listitem>
<listitem>
<para>If it's a problem with part of the base FreeBSD system, it
may also be buggy code, but more often than not these problems are
found and fixed long before general FAQ readers like us get to use
these bits of code (that's what -current is for).</para>
</listitem>
</orderedlist>
<para>In particular, a dead giveaway that this is <emphasis>not</emphasis>
a FreeBSD bug is that the problem appears while you're compiling a program,
but at a different point in the compilation each time.
</para>
<para>For example, suppose you're running <command>make buildworld</command>,
and the compile fails while trying to compile <filename>ls.c</filename> into
<filename>ls.o</filename>. If you run <command>make buildworld</command>
again and the compile fails in the same place, then this is a broken build;
update your sources and try again. If the compile fails elsewhere, then this
is almost certainly a hardware problem.
</para>
<para>What you should do:</para>
<para>In the first case, you can use a debugger such as
<command>gdb</command> to find the point in the program that is attempting
to access a bogus address, and then fix it.
</para>
<para>In the second case, you need to verify that your hardware is not
at fault.</para>
<para>Common causes of this include:</para>
<orderedlist>
<listitem><para>Your hard disks might be overheating: Check that the fans in
your case are still working, as your disks (and perhaps other hardware)
might be overheating.</para>
</listitem>
<listitem>
<para>The processor might be overheating: This might be because the
processor has been overclocked, or because the fan on the processor has
died. In either case, ensure that your hardware is running at the speed it
is specified to run at, at least while trying to solve this problem; that
is, clock it back to the default settings.</para>
<para>If you are overclocking, note that a slow system is far cheaper
than a fried system that needs replacing! Also, the wider community is
rarely sympathetic to problems on overclocked systems, whether you
believe overclocking is safe or not.</para>
</listitem>
<listitem>
<para>Dodgy memory: If you have multiple memory SIMMs or DIMMs installed,
pull them all out and try running the machine with each SIMM or DIMM
individually, to narrow the problem down to either a single problematic
module or perhaps a combination of modules.</para>
</listitem>
<listitem>
<para>Over-optimistic motherboard settings: Your BIOS setup, and
some motherboard jumpers, offer options for various timings. The
defaults are usually sufficient, but setting the RAM wait states
too low, or enabling the "RAM Speed: Turbo" option or something
similar in the BIOS, can cause strange behaviour. One approach is to
reset the BIOS to its defaults, but note down your current settings
first!</para>
</listitem>
<listitem>
<para>Unclean or insufficient power to the motherboard: If you
have any unused I/O boards, hard disks, or CDROMs in your system,
try temporarily removing them or disconnecting the power cable
from them, to see if your power supply can manage a smaller load.
Or try another power supply, preferably one with a little more
power (for instance, if your current power supply is rated at 250
Watts try one rated at 300 Watts).</para>
</listitem>
</orderedlist>
<para>You should also read the SIG11 FAQ (listed below), which has
excellent explanations of all these problems, albeit from a Linux
viewpoint. It also discusses how faulty memory can still pass testing
by memory-testing software or hardware.</para>
<para>Finally, if none of this has helped, it is possible that you've
just found a bug in FreeBSD, and you should follow the instructions to
send a problem report.</para>
<para>There's an extensive FAQ on this at
<ulink URL="http://www.bitwizard.nl/sig11/">the SIG11 problem FAQ</ulink></para>
