Update PCID report based on some feedback from wblock
Differential Revision: https://reviews.freebsd.org/D3127
parent aa81505573
commit 5bf6e8e1bd
Notes: svn2git, 2020-12-08 03:00:23 +00:00
svn path=/head/; revision=47019
1 changed files with 14 additions and 13 deletions
@@ -1091,19 +1091,20 @@
 </links>
 
 <body>
-<p>Process-Context Identifiers (PCIDs) is a feature of the
+<p>A Process-Context Identifier (PCID) is a performance enhancing
+feature of the
 Translation Lookaside Buffer (TLB) on Intel processors,
 introduced with the Sandy Bridge micro-architecture. It
 allows the TLB to
 simultaneously cache translation information for several
 address spaces, and gives an opportunity for the operating
-system context switch code to avoid flushing the TLB on the
+system context switch code to avoid flushing the TLB upon
 process switch. Each cached translation is tagged with some
 context identifier, and at context switch time, the operating
 system instructs the processor which context is becoming
 active. The feature slightly reduces context switch time by
-avoiding flush, and more importantly, it reduces the warm-up
-period for the thread after a context switch.</p>
+avoiding TLB flush, and more importantly, reduces the warm-up
+period for a thread after context switch.</p>
 
 <p>&os; already used PCID, but the existing implementation
 had several shortcomings. The <tt>amd64</tt> pmap (the
@@ -1113,9 +1114,9 @@
 on the context switch. The bitmap was used to direct
 Inter-Processor Interrupts to the marked CPU when the
 operating system needed to perform TLB invalidation. The most
-important deficiency of the implementation is the increase of
-TLB invalidation IPIs since the bitmap could only grow until
-full TLB shootdown is performed. It increases the TLB rate,
+important deficiency of the implementation was the increase of
+TLB invalidation IPIs, since the bitmap could only grow until
+full TLB shootdown was performed. It increased the TLB rate,
 which negated the positive effects of avoiding TLB flushes on
 large machines. Secondarily, the bitmap maintenance in both
 the pmap and the context code was quite complicated, leading
@@ -1125,13 +1126,14 @@
 <p>The new PCID implementation uses an algorithm described in
 the U. Vahalia book "UNIX Internals: The New Frontiers". The
 algorithm is already used, for example, by the MIPS pmap for
-assigning the ASIDs to software-managed TLB entries. The pmap
+assigning Address Space Identifiers (ASIDs) to software-managed
+TLB entries. The pmap
 maintains a per-CPU generation count, which is assigned to the
 next unused PCID when the context is activated on CPU. TLB
 invalidation includes resetting the generation count, which
-causes reallocation of PCID when a context switch is
+causes reallocation of the PCID when a context switch is
 performed. As result, the new implementation issues exactly
-the same amount of shootdown IPIs as pmap which does not
+the same amount of shootdown IPIs as a pmap which does not
 utilize PCID.</p>
 
 <p>Another change included with the PCID rewrite is a move of
@@ -1139,9 +1141,8 @@
 making the algorithm easier to understand and validate.</p>
 
 <p>Measurements done with <tt>hwpmc(4)</tt> on a Haswell machine
-indicated that the new implementation reduced the amount of
-data TLB misses up to 10 times, without an impact on the IPI
-counters.</p>
+indicated that the new implementation reduced the TLB miss rate by
+up to 10 times, without an increase in TLB shootdown IPIs.</p>
 
 <p>The rewrite was committed to HEAD at r282684.</p>
 
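
For readers unfamiliar with the generation-count scheme the report text describes, the following is a minimal sketch in C of how such an allocator could work. It is not the FreeBSD pmap code: every name here (cpu_pcid_state, ctx_pcid, address_space, pcid_activate, pcid_invalidate, NCPU, PCID_MIN) is hypothetical, and reserving PCID 0 and treating generation 0 as "invalid" are assumptions of this sketch. The only hardware fact relied on is that x86-64 PCIDs are 12 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

#define NCPU      4      /* hypothetical CPU count for this sketch */
#define PCID_MIN  1u     /* assumption: PCID 0 is kept out of the pool */
#define PCID_MAX  4095u  /* PCIDs are 12 bits wide on x86-64 */

/* Per-CPU allocator state (hypothetical layout). */
struct cpu_pcid_state {
    uint32_t next_pcid;  /* next unused PCID on this CPU */
    uint32_t gen;        /* current generation, never 0 */
};

/* What one address space remembers about one CPU. */
struct ctx_pcid {
    uint32_t pcid;       /* PCID last assigned on that CPU */
    uint32_t gen;        /* generation it was assigned in; 0 = invalid */
};

struct address_space {
    struct ctx_pcid per_cpu[NCPU];
};

static struct cpu_pcid_state cpu_state[NCPU];

static void pcid_init(int cpu)
{
    cpu_state[cpu].next_pcid = PCID_MIN;
    cpu_state[cpu].gen = 1;
}

/*
 * Called at context switch. Stores the PCID to load in *pcid_out and
 * returns true when stale TLB entries tagged with that PCID must be
 * flushed, false when the cached translations can be reused.
 */
static bool pcid_activate(struct address_space *as, int cpu,
    uint32_t *pcid_out)
{
    struct cpu_pcid_state *cs = &cpu_state[cpu];
    struct ctx_pcid *cp = &as->per_cpu[cpu];

    if (cp->gen == cs->gen) {
        /* Still valid in the current generation: reuse, no flush. */
        *pcid_out = cp->pcid;
        return false;
    }
    if (cs->next_pcid > PCID_MAX) {
        /* Pool exhausted: start a new generation, skipping 0. */
        cs->next_pcid = PCID_MIN;
        if (++cs->gen == 0)
            cs->gen = 1;
    }
    cp->pcid = cs->next_pcid++;
    cp->gen = cs->gen;
    *pcid_out = cp->pcid;
    return true;
}

/*
 * Invalidation for a CPU where the address space is not running:
 * resetting the stored generation forces a fresh PCID (and a flush)
 * at the next activation there, so no IPI is needed for this case.
 */
static void pcid_invalidate(struct address_space *as, int cpu)
{
    as->per_cpu[cpu].gen = 0;
}
```

In this sketch the cost of an invalidation on an idle context is a single store, paid back later as one flush when the context is next activated on that CPU, which is the trade-off the report text attributes to the new implementation.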