Expand the description of a LOR and of what witness(4) does

While here, try to explain why a deadlock could happen when
witness(4) reports a 'lock order reversal', and why 'reversal' is
an important part of the deadlock-related warning.

Approved by:    gjb (mentor)
Reviewed by:   eadler
Giorgos Keramidas 2013-02-24 22:19:53 +00:00
parent c77b86c7a2
commit b02e847a22
Notes: svn2git 2020-12-08 03:00:23 +00:00
svn path=/head/; revision=41038

@@ -2494,18 +2494,37 @@ kern.timecounter.hardware: TSC -&gt; i8254</screen>
 <answer>
 <para>The &os; kernel uses a number of resource locks to
-arbitrate contention for certain resources. A run-time
-lock diagnostic system found in &os.current; kernels
-(but removed for releases), called &man.witness.4;,
-detects the potential for deadlocks due to locking errors.
-(It is possible to get false positives, as &man.witness.4;
-is slightly conservative.) A true positive report
-indicates that "if you were unlucky, a deadlock would have
-happened here">.</para>
-<para>Problematic <acronym>LOR</acronym>s tend to get fixed
-quickly, so check &a.current.url; before posting to the
-mailing lists.</para>
+arbitrate contention for certain resources. When multiple
+kernel threads try to obtain multiple resource locks,
+there's always the potential for a deadlock,
+where two threads have each obtained one of the locks and
+blocks forever waiting for the other thread to release one
+of the other locks. This sort of locking problem can be
+avoided if all threads obtain the locks in the same
+order.</para>
+<para>A run-time lock diagnostic system called &man.witness.4;,
+enabled in &os.current; and disabled by default for stable
+branches and releases, detects the potential for deadlocks due to
+locking errors, including errors caused by obtaining multiple
+resource locks with a different order from different parts of the
+kernel. The &man.witness.4; framework tries to detect this
+problem as it happens, and reports it by printing a message to the
+system console about a <errorname>lock order reversal</errorname>
+(often referred to also as <acronym>LOR</acronym>).</para>
+<para>It is possible to get false positives, as &man.witness.4;
+is conservative. A true positive report <emphasis>does
+not</emphasis> mean that a system is dead-locked; instead
+it should be understood as a warning of the form <quote>if
+you were unlucky, a deadlock would have happened
+here</quote>.</para>
+<note>
+<para>Problematic <acronym>LOR</acronym>s tend to get fixed
+quickly, so check &a.current.url; before posting to the
+mailing lists.</para>
+</note>
 </answer>
 </qandaentry>
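
As an illustration of the 'reversal' idea described in the new text, here is a
minimal Python sketch of a lock-order checker. It is a toy model written for
this note, not the witness(4) implementation: the names OrderChecker, acquire,
release and demo are hypothetical, locks are plain strings, and only direct
pairs of locks are compared (no transitive ordering). It shows why a report
can appear even though nothing deadlocked: the two threads took the locks in
opposite orders, but at different times.

```python
class OrderChecker:
    """Toy lock-order checker (illustrative only, not witness(4))."""

    def __init__(self):
        # (first, second): 'second' was acquired while 'first' was held.
        self.seen = set()
        # thread name -> list of lock names currently held, in order taken.
        self.held = {}
        # Human-readable warnings collected so far.
        self.reports = []

    def acquire(self, thread, lock):
        held = self.held.setdefault(thread, [])
        for earlier in held:
            # If the opposite order was observed before, this is a reversal.
            if (lock, earlier) in self.seen:
                self.reports.append(
                    "lock order reversal: %s taken after %s by %s, "
                    "but %s was previously taken after %s"
                    % (lock, earlier, thread, earlier, lock)
                )
            self.seen.add((earlier, lock))
        held.append(lock)

    def release(self, thread, lock):
        self.held[thread].remove(lock)


def demo():
    checker = OrderChecker()

    # Thread t1 takes A, then B, and releases both.
    checker.acquire("t1", "A")
    checker.acquire("t1", "B")
    checker.release("t1", "B")
    checker.release("t1", "A")

    # Thread t2 takes B, then A: the reverse order. No deadlock occurs
    # here because t1 is already done, but with unlucky timing each
    # thread could have held one lock while waiting for the other.
    checker.acquire("t2", "B")
    checker.acquire("t2", "A")
    checker.release("t2", "A")
    checker.release("t2", "B")

    return checker.reports


if __name__ == "__main__":
    for line in demo():
        print(line)
```

If both threads took A before B, the checker would stay silent, which matches
the statement that the problem is avoided when all threads obtain the locks in
the same order.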