Update PCID report based on some feedback from wblock

Differential Revision:	https://reviews.freebsd.org/D3127
Ed Maste 2015-07-19 19:59:00 +00:00
parent aa81505573
commit 5bf6e8e1bd
Notes: svn2git 2020-12-08 03:00:23 +00:00
svn path=/head/; revision=47019

@@ -1091,19 +1091,20 @@
</links>
<body>
-<p>Process-Context Identifiers (PCIDs) is a feature of the
+<p>A Process-Context Identifier (PCID) is a performance enhancing
+feature of the
Translation Lookaside Buffer (TLB) on Intel processors,
introduced with the Sandy Bridge micro-architecture. It
allows the TLB to
simultaneously cache translation information for several
address spaces, and gives an opportunity for the operating
-system context switch code to avoid flushing the TLB on the
+system context switch code to avoid flushing the TLB upon
process switch. Each cached translation is tagged with some
context identifier, and at context switch time, the operating
system instructs the processor which context is becoming
active. The feature slightly reduces context switch time by
-avoiding flush, and more importantly, it reduces the warm-up
-period for the thread after a context switch.</p>
+avoiding TLB flush, and more importantly, reduces the warm-up
+period for a thread after context switch.</p>
<p>&os; already used PCID, but the existing implementation
had several shortcomings. The <tt>amd64</tt> pmap (the
@@ -1113,9 +1114,9 @@
on the context switch. The bitmap was used to direct
Inter-Processor Interrupts to the marked CPU when the
operating system needed to perform TLB invalidation. The most
-important deficiency of the implementation is the increase of
-TLB invalidation IPIs since the bitmap could only grow until
-full TLB shootdown is performed. It increases the TLB rate,
+important deficiency of the implementation was the increase of
+TLB invalidation IPIs, since the bitmap could only grow until
+full TLB shootdown was performed. It increased the TLB rate,
which negated the positive effects of avoiding TLB flushes on
large machines. Secondarily, the bitmap maintenance in both
the pmap and the context code was quite complicated, leading
@@ -1125,13 +1126,14 @@
<p>The new PCID implementation uses an algorithm described in
the U. Vahalia book "UNIX Internals: The New Frontiers". The
algorithm is already used, for example, by the MIPS pmap for
-assigning the ASIDs to software-managed TLB entries. The pmap
+assigning Address Space Identifiers (ASIDs) to software-managed
+TLB entries. The pmap
maintains a per-CPU generation count, which is assigned to the
next unused PCID when the context is activated on CPU. TLB
invalidation includes resetting the generation count, which
-causes reallocation of PCID when a context switch is
+causes reallocation of the PCID when a context switch is
performed. As result, the new implementation issues exactly
-the same amount of shootdown IPIs as pmap which does not
+the same amount of shootdown IPIs as a pmap which does not
utilize PCID.</p>
<p>Another change included with the PCID rewrite is a move of
@@ -1139,9 +1141,8 @@
making the algorithm easier to understand and validate.</p>
<p>Measurements done with <tt>hwpmc(4)</tt> on a Haswell machine
indicated that the new implementation reduced the amount of
-data TLB misses up to 10 times, without an impact on the IPI
-counters.</p>
+indicated that the new implementation reduced the TLB miss rate by
+up to 10 times, without an increase in TLB shootdown IPIs.</p>
<p>The rewrite was committed to HEAD at r282684.</p>
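
The generation-count scheme described in the report can be illustrated with a small model. This is only a sketch in Python, not the FreeBSD pmap code: the names `Cpu`, `Pmap`, `activate`, and `invalidate_inactive` are hypothetical, reserving PCID 0 for the kernel is an assumption, and treating generation 0 as "never current" is one plausible way to express the generation reset that the report mentions. The 12-bit PCID width (4096 values) is the architectural limit on x86-64.

```python
from dataclasses import dataclass, field

PCID_KERN = 0      # assumption: PCID 0 is reserved for the kernel
PCID_LIMIT = 4096  # PCIDs are 12 bits wide on x86-64


@dataclass
class Cpu:
    """Per-CPU allocator state (hypothetical names)."""
    next_pcid: int = PCID_KERN + 1
    gen: int = 1  # generation 0 is never current, so it marks a stale slot


@dataclass
class Pmap:
    """One address space; keeps a (pcid, generation) slot per CPU id."""
    slots: dict = field(default_factory=dict)


def activate(cpu_id, cpu, pmap):
    """Pick the PCID to load when pmap is switched in on this CPU.

    Returns (pcid, must_flush). must_flush is False only when the cached
    PCID is still from the CPU's current generation.
    """
    pcid, gen = pmap.slots.get(cpu_id, (None, 0))
    if gen == cpu.gen:
        return pcid, False  # cached translations can be reused
    if cpu.next_pcid == PCID_LIMIT:
        # PCID space exhausted: start a new generation, which makes
        # every previously handed-out PCID on this CPU stale.
        cpu.next_pcid = PCID_KERN + 1
        cpu.gen += 1
    pcid = cpu.next_pcid
    cpu.next_pcid += 1
    pmap.slots[cpu_id] = (pcid, cpu.gen)
    return pcid, True  # a fresh PCID may carry stale entries: flush it


def invalidate_inactive(cpu_id, pmap):
    """Invalidate pmap on a CPU where it is not currently active.

    No IPI is modeled here: resetting the generation forces activate()
    to allocate a new PCID the next time the pmap runs on that CPU.
    """
    pcid, _ = pmap.slots.get(cpu_id, (None, 0))
    pmap.slots[cpu_id] = (pcid, 0)
```

In this model, switching back to an address space whose slot is still current returns `must_flush == False`, which corresponds to the avoided TLB flush. Calling `invalidate_inactive` makes the next `activate` return a new PCID with `must_flush == True`.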