- Many minor fixes in the CAM report

Submitted by:	bjk, wblock
Gabor Pali 2013-09-09 21:08:57 +00:00
parent 31078a2f3f
commit faccc55b81
Notes: svn2git 2020-12-08 03:00:23 +00:00
svn path=/head/; revision=42635

@@ -135,51 +135,52 @@
 </links>
 <body>
-<p>Last year's high-performance storage vendors reported
-performance bottleneck in &os; block storage subsystem, limiting
-peak performance around 300-500K IOPS. While that is still more
-then enough for average systems, detailed investigation has
-shown number of places that require radical improvement.
-Unmapped I/O support implemented early this year already
-improved I/O performance by about 30% and moved more accents
-toward GEOM and CAM subsystems scalability. Fixing these issues
-was the goal of this project.</p>
-<p>The existing GEOM design assumed the most of I/O handling to be
-done by only two kernel threads (<tt>g_up()</tt> and
+<p>Last year's high-performance storage vendors reported a
+performance bottleneck in the &os; block storage subsystem,
+limiting peak performance around 300-500K IOPS. While that is
+still more than enough for average systems, detailed
+investigation has shown a number of places that require radical
+improvement. Unmapped I/O support implemented early this year
+already improved I/O performance by about 30% and moved more
+accents toward GEOM and CAM subsystems scalability. Fixing
+these issues was the goal of this project.</p>
+<p>The existing GEOM design assumed most I/O handling to be done
+by only two kernel threads (<tt>g_up()</tt> and
 <tt>g_down()</tt>). That simplified locking in some cases, but
 limited potential SMP scalability and created additional
-scheduler overhead. This project introduces concept of direct
-I/O dispatch into GEOM for cases where it is know to be safe and
-efficient. That implies marking some of GEOM consumers and
-providers with one or two new flags, declaring situations when
-direct function call can be used instead of normal request
-queuing. That allows to avoid any context switches inside GEOM
+scheduler overhead. This project introduces the concept of
+direct I/O dispatch into GEOM for cases where it is known to be
+safe and efficient. That implies marking some GEOM consumers
+and providers with one or two new flags, declaring situations
+when a direct function call can be used instead of normal request
+queuing. That allows avoiding any context switches inside GEOM
 for the most widely used topologies, simultaneously processing
 multiple I/Os from multiple calling threads.</p>
 <p>Having GEOM passing through multiple concurrent calls down to
 the underlying layers exposed major lock congestion in CAM. In
-existing CAM design all devices connected to the same ATA/SCSI
-controller are sharing single lock, which can be quite busy due
+the existing CAM design all devices connected to the same ATA/SCSI
+controller are sharing a single lock, which can be quite busy due
 to multiple controller hardware accesses and/or code logic.
-Experiments have shown that applying only above GEOM direct
+Experiments have shown that applying only the above GEOM direct
 dispatch changes burns up to 60% of system CPU time or even more
 in attempts to obtain these locks by multiple callers, killing
-any benefits of GEOM direct dispatch. To overcome that new
+any benefits of GEOM direct dispatch. To overcome that, new
 fine-grained CAM locking design was implemented. It implies
 splitting big per-SIM locks into several smaller ones: per-LUN
-locks, per-bus locks, queue locks, etc. After these changes
-remaining per-SIM lock protects only controller driver
-internals, reducing lock congestion down to acceptable level and
-allowing to keep compatibility with existing drivers.</p>
-<p>Together GEOM and CAM changes twice increase peak I/O rate,
+locks, per-bus locks, queue locks, etc. After these changes,
+the remaining per-SIM lock protects only the controller driver
+internals, reducing lock congestion down to an acceptable level
+and keeping keep compatibility with existing drivers.</p>
+<p>Together, GEOM and CAM changes double the peak I/O rate,
 reaching up to 1,000,000 IOPS on contemporary hardware.</p>
-<p>The changes were tested by number of people and are going to be
+<p>The changes were tested by a number of people and will be
 committed into &os; <tt>head</tt> and merged to
-<tt>stable/10</tt> after the end of &os; 10.0 release cycle.</p>
+<tt>stable/10</tt> after the end of the &os; 10.0 release
+cycle.</p>
 <p>The project is sponsored by iXsystems, Inc.</p>
 </body>