Update status of various netperf-related activities. Some changes were
merged to RELENG_5, prototypes enhanced, patches posted for review, etc.
Robert Watson 2004-11-24 17:40:03 +00:00
parent 78f2e528d0
commit 4133897c7d
Notes: svn2git 2020-12-08 03:00:23 +00:00
svn path=/www/; revision=23015

@@ -1,6 +1,6 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" [
<!ENTITY base CDATA "../..">
-<!ENTITY date "$FreeBSD: www/en/projects/netperf/index.sgml,v 1.7 2004/11/13 13:54:58 rwatson Exp $">
+<!ENTITY date "$FreeBSD: www/en/projects/netperf/index.sgml,v 1.8 2004/11/15 10:22:39 rwatson Exp $">
<!ENTITY title "FreeBSD Network Performance Project (netperf)">
<!ENTITY email 'mux'>
<!ENTITY % includes SYSTEM "../../includes.sgml"> %includes;
@@ -144,7 +144,7 @@
<td> Prefer file descriptor reference counts to socket reference
counts for system calls. </td>
<td> &a.rwatson; </td>
-<td> 20041024 </td>
+<td> 20041124 </td>
<td> &status.done; </td>
<td> Sockets and file descriptors both have reference counts in order
to prevent these objects from being free'd while in use. However,
@@ -155,14 +155,14 @@
thus avoiding the synchronized operations necessary to modify the
socket reference count, an approach also taken in the VFS code.
This change has been made for most socket system calls, and has
-been committed to HEAD (6.x). It will be merged to 5.x in the
-near future. </td>
+been committed to HEAD (6.x). It has also been merged to RELENG_5
+for inclusion in 5.4. </td>
</tr>
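The reference-counting change in the row above can be illustrated with a toy userspace model. All names here (`soref`, `sorele`, `fhold`, `fdrop`) mirror kernel spellings but with simplified, hypothetical signatures, and the `atomic_ops` counter merely stands in for the synchronized operations the change avoids: holding only the file reference halves the count manipulations per call.

```c
#include <assert.h>

/* Counts simulated synchronized refcount operations, purely for
 * illustration of what the change saves per system call. */
static int atomic_ops;

struct socket { int so_count; };
struct file   { int f_count; struct socket *f_data; };

static void soref(struct socket *so)  { atomic_ops++; so->so_count++; }
static void sorele(struct socket *so) { atomic_ops++; so->so_count--; }
static void fhold(struct file *fp)    { atomic_ops++; fp->f_count++; }
static void fdrop(struct file *fp)    { atomic_ops++; fp->f_count--; }

/* Old style: acquire both the file and the socket reference. */
static void syscall_old(struct file *fp) {
    fhold(fp);
    soref(fp->f_data);
    /* ... operate on the socket ... */
    sorele(fp->f_data);
    fdrop(fp);
}

/* New style: the held file reference alone keeps the socket from
 * being freed, so the socket count is never touched. */
static void syscall_new(struct file *fp) {
    fhold(fp);
    /* ... operate on fp->f_data directly ... */
    fdrop(fp);
}
```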
<tr>
<td> Mbuf queue library </td>
<td> &a.rwatson; </td>
-<td> 20041106 </td>
+<td> 20041124 </td>
<td> &status.prototyped; </td>
<td> In order to facilitate passing off queues of packets between
network stack components, create an mbuf queue primitive, struct
@@ -170,7 +170,9 @@
primitive is now being applied in several sample cases to determine
whether it offers the desired semantics and benefits. The
implementation can be found in the rwatson_dispatch Perforce
-branch.</td>
+branch. Additional work must also be done to explore the
+performance impact of "queues" vs arrays of mbuf pointers, which
+are likely to behave better from a caching perspective. </td>
</tr>
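A minimal sketch of such a packet-queue primitive follows. The actual struct name falls at a hunk boundary above, so `struct mbqueue` and its operations here are hypothetical stand-ins, and `struct mbuf` is pared down to the one linkage field the queue needs: head and tail pointers give O(1) enqueue and dequeue of packet chains.

```c
#include <assert.h>
#include <stddef.h>

/* Pared-down stand-in for the kernel mbuf: only the packet-chain
 * linkage matters for the queue primitive sketched here. */
struct mbuf {
    struct mbuf *m_nextpkt;
};

/* Hypothetical mbuf packet queue: FIFO with O(1) operations. */
struct mbqueue {
    struct mbuf *mq_head;
    struct mbuf *mq_tail;
    int          mq_len;
};

static void
mbq_init(struct mbqueue *q)
{
    q->mq_head = q->mq_tail = NULL;
    q->mq_len = 0;
}

static void
mbq_enqueue(struct mbqueue *q, struct mbuf *m)
{
    m->m_nextpkt = NULL;
    if (q->mq_tail == NULL)
        q->mq_head = m;            /* queue was empty */
    else
        q->mq_tail->m_nextpkt = m;
    q->mq_tail = m;
    q->mq_len++;
}

static struct mbuf *
mbq_dequeue(struct mbqueue *q)
{
    struct mbuf *m = q->mq_head;

    if (m != NULL) {
        q->mq_head = m->m_nextpkt;
        if (q->mq_head == NULL)
            q->mq_tail = NULL;     /* queue is now empty */
        q->mq_len--;
    }
    return (m);
}
```

A caller can hand the whole queue between stack layers by value of the head/tail pair, which is the property the row above is after.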
<tr>
@@ -185,7 +187,7 @@
required. This has not yet been benchmarked. A subset change to
dispatch a single mbuf to a driver has also been prototyped, and
benchmarked at a several percentage point improvement in packet send
-rates from user space.</td>
+rates from user space. </td>
</tr>
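The difference between per-packet and queued dispatch into a driver can be shown schematically. The names below (`if_start_single`, `if_start_queue`, `driver_entries`) are illustrative stand-ins, not the prototyped kernel interface: the point is only that a chain of packets crosses the driver boundary in one call instead of one call per packet.

```c
#include <assert.h>
#include <stddef.h>

/* Counts crossings of the (simulated) driver entry point. */
static int driver_entries;

struct mbuf { struct mbuf *m_nextpkt; };

/* Classic style: one driver entry per packet sent. */
static void
if_start_single(struct mbuf *m)
{
    (void)m;
    driver_entries++;
    /* a real driver would program the hardware here */
}

/* Queued style: the whole packet chain crosses the boundary once. */
static void
if_start_queue(struct mbuf *head)
{
    struct mbuf *m;

    driver_entries++;
    for (m = head; m != NULL; m = m->m_nextpkt) {
        /* a real driver would hand each packet to hardware here */
    }
}
```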
<tr>
@@ -201,19 +203,20 @@
<tr>
<td> Employ queued dispatch across netisr dispatch API </td>
<td> &a.rwatson; </td>
-<td> 20041113 </td>
-<td> &status.new; </td>
-<td> Similar to if_start_mbufqueue(), allow dispatch of queues of
-mbufs into the netisr interface, avoiding multiple wakeups when a
-netisr thread is already in execution. Wakeups are expensive
-operations even when there are no threads waiting. </td>
+<td> 20041124 </td>
+<td> &status.prototyped; </td>
+<td> Pull all of the mbufs in the netisr ifqueue out of the ifqueue
+into a thread-local mbuf queue to avoid repeated lock operations
+to access the queue. Also use lock-free operations to test for
+queue contents being present. This has been prototyped in the
+rwatson_netperf branch. </td>
</tr>
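The thread-local drain described above can be modeled in userspace. This is a single-threaded sketch with hypothetical names, where a counter stands in for lock acquisition cost: draining N packets one at a time takes N+1 lock round trips, while moving the whole chain to a local list takes one, and an unlocked emptiness check avoids the lock entirely when there is no work.

```c
#include <assert.h>
#include <stddef.h>

struct pkt { struct pkt *next; };

/* Shared queue guarded by a simulated mutex; lock_ops counts how
 * many times the lock must be taken. */
struct shared_q {
    struct pkt *head, *tail;
    int lock_ops;
};

static void q_lock(struct shared_q *q)   { q->lock_ops++; }
static void q_unlock(struct shared_q *q) { (void)q; }

static void
q_enqueue(struct shared_q *q, struct pkt *p)
{
    p->next = NULL;
    if (q->tail == NULL) q->head = p; else q->tail->next = p;
    q->tail = p;
}

/* Per-packet drain: one lock round trip per packet, plus one more
 * to observe that the queue is empty. */
static int
drain_per_packet(struct shared_q *q)
{
    int n = 0;
    for (;;) {
        struct pkt *p;
        q_lock(q);
        p = q->head;
        if (p != NULL) {
            q->head = p->next;
            if (q->head == NULL) q->tail = NULL;
        }
        q_unlock(q);
        if (p == NULL)
            break;
        n++;                /* process the packet without the lock */
    }
    return n;
}

/* Queued drain: an unlocked emptiness test, then one lock round
 * trip that moves the entire chain to a thread-local list, which
 * is processed lock-free. */
static int
drain_queued(struct shared_q *q)
{
    struct pkt *local, *p;
    int n = 0;

    if (q->head == NULL)    /* lock-free check: a racily missed
                             * packet is seen on the next pass */
        return 0;
    q_lock(q);
    local = q->head;
    q->head = q->tail = NULL;
    q_unlock(q);
    for (p = local; p != NULL; p = p->next)
        n++;
    return n;
}
```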
<tr>
<td> Modify UMA allocator to use critical sections not mutexes for
per-CPU caches. </td>
<td> &a.rwatson; </td>
-<td> 20041111 </td>
+<td> 20041124 </td>
<td> &status.prototyped; </td>
<td> The mutexes protecting per-CPU caches require atomic operations
on SMP systems; as they are per-CPU objects, the cost of
@@ -222,13 +225,14 @@
has been implemented in the rwatson_percpu branch, but is waiting
on critical section performance optimizations that will prevent
this change from negatively impacting uniprocessor performance.
-</td>
+The critical section operations from John Baldwin have been posted
+for public review. </td>
</tr>
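The idea of protecting a per-CPU cache with a critical section rather than a mutex can be sketched in userspace. Here a nesting counter stands in for `critical_enter()`/`critical_exit()` (which on a real kernel pin the thread to its CPU), and the cache structure and size are hypothetical: because only the local CPU can touch its cache while preemption and migration are disabled, no atomic operations are needed.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated critical section: a bare nesting counter stands in for
 * disabling preemption and migration on the current CPU. */
static int crit_nesting;
static void crit_enter(void) { crit_nesting++; }
static void crit_exit(void)  { crit_nesting--; }

/* Hypothetical per-CPU cache of free items, LIFO for locality. */
#define CACHE_SIZE 4
struct pcpu_cache {
    void *items[CACHE_SIZE];
    int   count;
};

static void *
cache_alloc(struct pcpu_cache *c)
{
    void *p = NULL;

    crit_enter();            /* pinned to this CPU: no atomics */
    if (c->count > 0)
        p = c->items[--c->count];
    crit_exit();
    return p;                /* NULL falls back to the slow path */
}

static int
cache_free(struct pcpu_cache *c, void *p)
{
    int cached = 0;

    crit_enter();
    if (c->count < CACHE_SIZE) {
        c->items[c->count++] = p;
        cached = 1;
    }
    crit_exit();
    return cached;           /* 0 means hand back to the zone */
}
```

The trade-off the row describes follows directly: this only wins if `crit_enter()`/`crit_exit()` are cheaper than a mutex acquire/release, which is what the critical-section optimization work below addresses.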
<tr>
<td> Optimize critical section performance </td>
<td> &a.jhb; </td>
-<td> 20041111 </td>
+<td> 20041124 </td>
<td> &status.prototyped; </td>
<td> Critical sections prevent preemption of a thread on a CPU, as
well as preventing migration of that thread to another CPU, and
@@ -245,7 +249,8 @@
cost as a mutex, meaning that optimizations on SMP to use critical
sections instead of mutexes will not harm UP performance. A
prototype of this change is present in the jhb_lock Perforce
-branch. </td>
+branch, and patches have been posted to per-architecture mailing
+lists for review. </td>
</tr>
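The shape of the optimized fast path can be sketched as follows. The field name `td_critnest` mirrors the kernel's spelling, but this is a userspace illustration under the assumption described in the row, not the real implementation: entering and leaving a critical section become a per-thread counter increment and decrement, with the expensive interrupt-state manipulation deferred until it is actually needed.

```c
#include <assert.h>

struct thread {
    int td_critnest;        /* critical section nesting depth */
};

/* Stands in for hardware interrupt-state manipulation; the point
 * of the optimization is that the fast path never touches it. */
static int hard_intr_ops;

static void
critical_enter(struct thread *td)
{
    td->td_critnest++;      /* cheap: no interrupt disable */
}

static void
critical_exit(struct thread *td)
{
    td->td_critnest--;
    if (td->td_critnest == 0) {
        /* Only on the outermost exit would deferred work, such as
         * a pending preemption, be processed. */
    }
}
```

With enter/exit this cheap, replacing per-CPU mutexes with critical sections (as in the UMA row above) no longer penalizes uniprocessor kernels.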
</table>