- Many minor fixes in the CAM report
Submitted by: bjk, wblock
parent
31078a2f3f
commit
faccc55b81
Notes:
svn2git
2020-12-08 03:00:23 +00:00
svn path=/head/; revision=42635
1 changed files with 29 additions and 28 deletions
|
@@ -135,51 +135,52 @@
|
||||||
</links>
|
</links>
|
||||||
|
|
||||||
<body>
|
<body>
|
||||||
<p>Last year's high-performance storage vendors reported
|
<p>Last year's high-performance storage vendors reported a
|
||||||
performance bottleneck in &os; block storage subsystem, limiting
|
performance bottleneck in the &os; block storage subsystem,
|
||||||
peak performance around 300-500K IOPS. While that is still more
|
limiting peak performance around 300-500K IOPS. While that is
|
||||||
then enough for average systems, detailed investigation has
|
still more than enough for average systems, detailed
|
||||||
shown number of places that require radical improvement.
|
investigation has shown a number of places that require radical
|
||||||
Unmapped I/O support implemented early this year already
|
improvement. Unmapped I/O support implemented early this year
|
||||||
improved I/O performance by about 30% and moved more accents
|
already improved I/O performance by about 30% and moved more
|
||||||
toward GEOM and CAM subsystems scalability. Fixing these issues
|
accents toward GEOM and CAM subsystems scalability. Fixing
|
||||||
was the goal of this project.</p>
|
these issues was the goal of this project.</p>
|
||||||
|
|
||||||
<p>The existing GEOM design assumed the most of I/O handling to be
|
<p>The existing GEOM design assumed most I/O handling to be done
|
||||||
done by only two kernel threads (<tt>g_up()</tt> and
|
by only two kernel threads (<tt>g_up()</tt> and
|
||||||
<tt>g_down()</tt>). That simplified locking in some cases, but
|
<tt>g_down()</tt>). That simplified locking in some cases, but
|
||||||
limited potential SMP scalability and created additional
|
limited potential SMP scalability and created additional
|
||||||
scheduler overhead. This project introduces concept of direct
|
scheduler overhead. This project introduces the concept of
|
||||||
I/O dispatch into GEOM for cases where it is know to be safe and
|
direct I/O dispatch into GEOM for cases where it is known to be
|
||||||
efficient. That implies marking some of GEOM consumers and
|
safe and efficient. That implies marking some GEOM consumers
|
||||||
providers with one or two new flags, declaring situations when
|
and providers with one or two new flags, declaring situations
|
||||||
direct function call can be used instead of normal request
|
when a direct function call can be used instead of normal request
|
||||||
queuing. That allows to avoid any context switches inside GEOM
|
queuing. That allows avoiding any context switches inside GEOM
|
||||||
for the most widely used topologies, simultaneously processing
|
for the most widely used topologies, simultaneously processing
|
||||||
multiple I/Os from multiple calling threads.</p>
|
multiple I/Os from multiple calling threads.</p>
|
||||||
|
|
||||||
<p>Having GEOM passing through multiple concurrent calls down to
|
<p>Having GEOM passing through multiple concurrent calls down to
|
||||||
the underlying layers exposed major lock congestion in CAM. In
|
the underlying layers exposed major lock congestion in CAM. In
|
||||||
existing CAM design all devices connected to the same ATA/SCSI
|
the existing CAM design all devices connected to the same ATA/SCSI
|
||||||
controller are sharing single lock, which can be quite busy due
|
controller are sharing a single lock, which can be quite busy due
|
||||||
to multiple controller hardware accesses and/or code logic.
|
to multiple controller hardware accesses and/or code logic.
|
||||||
Experiments have shown that applying only above GEOM direct
|
Experiments have shown that applying only the above GEOM direct
|
||||||
dispatch changes burns up to 60% of system CPU time or even more
|
dispatch changes burns up to 60% of system CPU time or even more
|
||||||
in attempts to obtain these locks by multiple callers, killing
|
in attempts to obtain these locks by multiple callers, killing
|
||||||
any benefits of GEOM direct dispatch. To overcome that new
|
any benefits of GEOM direct dispatch. To overcome that, new
|
||||||
fine-grained CAM locking design was implemented. It implies
|
fine-grained CAM locking design was implemented. It implies
|
||||||
splitting big per-SIM locks into several smaller ones: per-LUN
|
splitting big per-SIM locks into several smaller ones: per-LUN
|
||||||
locks, per-bus locks, queue locks, etc. After these changes
|
locks, per-bus locks, queue locks, etc. After these changes,
|
||||||
remaining per-SIM lock protects only controller driver
|
the remaining per-SIM lock protects only the controller driver
|
||||||
internals, reducing lock congestion down to acceptable level and
|
internals, reducing lock congestion down to an acceptable level
|
||||||
allowing to keep compatibility with existing drivers.</p>
|
and keeping compatibility with existing drivers.</p>
|
||||||
|
|
||||||
<p>Together GEOM and CAM changes twice increase peak I/O rate,
|
<p>Together, GEOM and CAM changes double the peak I/O rate,
|
||||||
reaching up to 1,000,000 IOPS on contemporary hardware.</p>
|
reaching up to 1,000,000 IOPS on contemporary hardware.</p>
|
||||||
|
|
||||||
<p>The changes were tested by number of people and are going to be
|
<p>The changes were tested by a number of people and will be
|
||||||
committed into &os; <tt>head</tt> and merged to
|
committed into &os; <tt>head</tt> and merged to
|
||||||
<tt>stable/10</tt> after the end of &os; 10.0 release cycle.</p>
|
<tt>stable/10</tt> after the end of the &os; 10.0 release
|
||||||
|
cycle.</p>
|
||||||
|
|
||||||
<p>The project is sponsored by iXsystems, Inc.</p>
|
<p>The project is sponsored by iXsystems, Inc.</p>
|
||||||
</body>
|
</body>
|
||||||
|
|