
Make the output a little bit more readable by

- adding some lists (ordered list, itemized list)
- adding some additional paragraphs
- rearranging the Q/A section so that something useful is displayed in the list of Q/A.

Suggested by:   Benjamin Lukas (qavvap att googlemail dott com)
Johann Kois 2010-06-17 11:24:25 +00:00
parent 77f5bcecff
commit f74d50dc72
Notes: svn2git 2020-12-08 03:00:23 +00:00
svn path=/head/; revision=35897


@ -96,22 +96,33 @@
environmental factors. In any direct comparison between platforms,
these issues become most apparent when system resources begin to get
stressed. As I describe &os;'s VM/Swap subsystem the reader should
always keep two points in mind:</para>
<orderedlist>
<listitem>
<para>The most important aspect of performance design is what is
known as <quote>Optimizing the Critical Path</quote>. It is often
the case that performance optimizations add a little bloat to the
code in order to make the critical path perform better.</para>
</listitem>
<listitem>
<para>A solid, generalized design outperforms a heavily-optimized
design over the long run. While a generalized design may end up
being slower than a heavily-optimized design when they are
first implemented, the generalized design tends to be easier to
adapt to changing conditions and the heavily-optimized design
winds up having to be thrown away.</para>
</listitem>
</orderedlist>
<para>Any codebase that will survive and be maintainable for
years must therefore be designed properly from the beginning even if it
costs some performance. Twenty years ago people were still arguing that
programming in assembly was better than programming in a high-level
language because it produced code that was ten times as fast. Today,
the fallibility of that argument is obvious &nbsp;&mdash;&nbsp;as are
the parallels to algorithmic design and code generalization.</para>
</sect1>
<sect1 id="vm-objects">
@ -318,40 +329,85 @@
memory that does not otherwise have it. &os; allocates the swap
management structure for a VM Object only when it is actually needed.
However, the swap management structure has had problems
historically:</para>
<itemizedlist>
<listitem>
<para>Under &os; 3.X the swap management structure preallocates an
array that encompasses the entire object requiring swap backing
store&mdash;even if only a few pages of that object are
swap-backed. This creates a kernel memory fragmentation problem
when large objects are mapped, or processes with large runsizes
(RSS) fork.</para>
</listitem>
<listitem>
<para>In order to keep track of swap space, a <quote>list of
holes</quote> is kept in kernel memory, and this tends to get
severely fragmented as well. Since the <quote>list of
holes</quote> is a linear list, swap allocation and freeing
performance is a non-optimal O(n) per page.</para>
</listitem>
<listitem>
<para>The design requires kernel memory allocations to take place
during the swap freeing process, which creates low memory
deadlock problems.</para>
</listitem>
<listitem>
<para>The problem is further exacerbated by holes created due to
the interleaving algorithm.</para>
</listitem>
<listitem>
<para>The swap block map can become fragmented fairly easily,
resulting in non-contiguous allocations.</para>
</listitem>
<listitem>
<para>Kernel memory must also be allocated on the fly for additional
swap management structures when a swapout occurs.</para>
</listitem>
</itemizedlist>
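The O(n) behaviour of the hole list is easiest to see in code. The following is an illustrative toy in C, not the actual &os; 3.X source; the names (<literal>struct hole</literal>, <literal>swap_alloc</literal>, <literal>swap_free</literal>) are invented for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Toy sketch of a "list of holes" swap allocator (NOT FreeBSD code).
 * Free swap space is tracked as a linked list of [start, start+len)
 * extents, so every allocation is an O(n) walk of the list.
 */
struct hole {
    size_t start;          /* first free swap block in this hole */
    size_t len;            /* number of free blocks */
    struct hole *next;
};

static struct hole *holes; /* head of the hole list */

/* Seed the allocator with one big hole covering nblocks blocks. */
void swap_init(size_t nblocks)
{
    holes = malloc(sizeof(*holes));
    holes->start = 0;
    holes->len = nblocks;
    holes->next = NULL;
}

/* O(n) first-fit: walk the hole list until one is large enough. */
long swap_alloc(size_t nblocks)
{
    for (struct hole **hp = &holes; *hp != NULL; hp = &(*hp)->next) {
        struct hole *h = *hp;
        if (h->len >= nblocks) {
            size_t start = h->start;
            h->start += nblocks;
            h->len -= nblocks;
            if (h->len == 0) {     /* hole consumed: unlink it */
                *hp = h->next;
                free(h);
            }
            return (long)start;
        }
    }
    return -1;                     /* no hole large enough */
}

/*
 * Freeing must allocate a new hole node, i.e. it needs memory on the
 * swap-out path -- the low memory deadlock risk described above.
 * (Coalescing with neighbouring holes is omitted for brevity.)
 */
void swap_free(size_t start, size_t nblocks)
{
    struct hole *h = malloc(sizeof(*h));
    h->start = start;
    h->len = nblocks;
    h->next = holes;
    holes = h;
}
```

Note that the list grows one node per freed extent, so a fragmented swap area both slows every allocation down and consumes kernel memory at the worst possible time.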
<para>It is evident from that list that there was plenty of room for
improvement. For &os; 4.X, I completely rewrote the swap
subsystem:</para>
<itemizedlist>
<listitem>
<para>Swap management structures are allocated through a hash
table rather than a linear array giving them a fixed allocation
size and much finer granularity.</para>
</listitem>
<listitem>
<para>Rather than using a linearly linked list to keep track of
swap space reservations, it now uses a bitmap of swap blocks
arranged in a radix tree structure with free-space hinting in
the radix node structures. This effectively makes swap
allocation and freeing an O(1) operation.</para>
</listitem>
<listitem>
<para>The entire radix tree bitmap is also preallocated in
order to avoid having to allocate kernel memory during critical
low memory swapping operations. After all, the system tends to
swap when it is low on memory so we should avoid allocating
kernel memory at such times in order to avoid potential
deadlocks.</para>
</listitem>
<listitem>
<para>To reduce fragmentation the radix tree is capable
of allocating large contiguous chunks at once, skipping over
smaller fragmented chunks.</para>
</listitem>
</itemizedlist>
<para>I did not take the final step of having an
<quote>allocating hint pointer</quote> that would trundle
through a portion of swap as allocations were made in order to further
guarantee contiguous allocations or at least locality of reference, but
I ensured that such an addition could be made.</para>
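The bitmap-with-hinting idea can be sketched as follows. This is a deliberately simplified illustration, not the &os; 4.X implementation: a fixed two-level bitmap stands in for the full radix tree, the hint is a single summary word, and all names (<literal>radix_init</literal>, <literal>radix_alloc</literal>, <literal>radix_free</literal>) are invented. A set bit means <quote>free</quote>:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <strings.h>   /* ffs() */

/*
 * Toy two-level bitmap of swap blocks with free-space hinting
 * (NOT the FreeBSD 4.X code).  The summary word records which leaf
 * words still contain free blocks, so single-block allocation is
 * two ffs() calls: O(1), independent of how full swap is.
 */
#define LEAVES 32                      /* 32 leaf words x 32 bits = 1024 blocks */

static uint32_t leaf[LEAVES];          /* 1 bit per swap block, 1 = free */
static uint32_t summary;               /* 1 bit per leaf with free space */

/*
 * All bitmap memory exists up front, mirroring the preallocation
 * point above: nothing needs to be allocated later, when the system
 * is already short on memory.
 */
void radix_init(void)
{
    memset(leaf, 0xff, sizeof(leaf));  /* every block starts free */
    summary = 0xffffffffu;
}

/* Allocate one block: hint word -> leaf word -> bit.  O(1). */
int radix_alloc(void)
{
    if (summary == 0)
        return -1;                     /* swap exhausted */
    int l = ffs(summary) - 1;          /* a leaf known to have space */
    int b = ffs(leaf[l]) - 1;          /* a free bit within that leaf */
    leaf[l] &= ~(1u << b);
    if (leaf[l] == 0)                  /* leaf now full: clear its hint */
        summary &= ~(1u << l);
    return l * 32 + b;
}

/*
 * Free one block: set its bit and restore the hint.  Also O(1), and
 * the free path allocates no memory at all.
 */
void radix_free(int blk)
{
    int l = blk / 32, b = blk % 32;
    leaf[l] |= 1u << b;
    summary |= 1u << l;
}
```

In the real subsystem the radix node structures additionally record the size of the largest free run beneath them, which is what allows large contiguous allocations to skip fragmented regions in the same O(1) fashion.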
@ -431,7 +487,9 @@
systems with very low cache queue counts and high active queue counts
when doing a <command>systat -vm</command> command. As the VM system
becomes more stressed, it makes a greater effort to maintain the various
page queues at the levels determined to be the most effective.</para>
<para>An urban
myth has circulated for years that Linux did a better job avoiding
swapouts than &os;, but this in fact is not true. What was actually
occurring was that &os; was proactively paging out unused pages in
@ -623,6 +681,12 @@
<qandaentry>
<question>
<para>How is the separation of clean and dirty (inactive) pages
related to the situation where you see low cache queue counts and
high active queue counts in <command>systat -vm</command>? Do the
systat stats roll the active and dirty pages together for the
active queue count?</para>
<para>I do not get the following:</para>
<blockquote>
@ -635,12 +699,6 @@
cache queue counts and high active queue counts when doing a
<command>systat -vm</command> command.</para>
</blockquote>
</question>
<answer>