Update status of various netperf-related activities. Some changes were
merged to RELENG_5, prototypes enhanced, patches posted for review, etc.
Robert Watson 2004-11-24 17:40:03 +00:00
parent 78f2e528d0
commit 4133897c7d
Notes: svn2git 2020-12-08 03:00:23 +00:00
svn path=/www/; revision=23015

@@ -1,6 +1,6 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" [
<!ENTITY base CDATA "../..">
-<!ENTITY date "$FreeBSD: www/en/projects/netperf/index.sgml,v 1.7 2004/11/13 13:54:58 rwatson Exp $">
+<!ENTITY date "$FreeBSD: www/en/projects/netperf/index.sgml,v 1.8 2004/11/15 10:22:39 rwatson Exp $">
<!ENTITY title "FreeBSD Network Performance Project (netperf)">
<!ENTITY email 'mux'>
<!ENTITY % includes SYSTEM "../../includes.sgml"> %includes;
@@ -144,7 +144,7 @@
<td> Prefer file descriptor reference counts to socket reference
counts for system calls. </td>
<td> &a.rwatson; </td>
-<td> 20041024 </td>
+<td> 20041124 </td>
<td> &status.done; </td>
<td> Sockets and file descriptors both have reference counts in order
to prevent these objects from being free'd while in use. However,
@@ -155,14 +155,14 @@
thus avoiding the synchronized operations necessary to modify the
socket reference count, an approach also taken in the VFS code.
This change has been made for most socket system calls, and has
-been committed to HEAD (6.x). It will be merged to 5.x in the
-near future. </td>
+been committed to HEAD (6.x). It has also been merged to RELENG_5
+for inclusion in 5.4. </td>
</tr>
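The reference-counting change in the row above can be illustrated with a toy userspace model. All names here (`soref`, `sorele`, `fhold`, `fdrop`) mirror kernel spellings but with simplified, hypothetical signatures, and the `atomic_ops` counter merely stands in for the synchronized operations the change avoids: holding only the file reference halves the count manipulations per call.

```c
#include <assert.h>

/* Counts simulated synchronized refcount operations, purely for
 * illustration of what the change saves per system call. */
static int atomic_ops;

struct socket { int so_count; };
struct file   { int f_count; struct socket *f_data; };

static void soref(struct socket *so)  { atomic_ops++; so->so_count++; }
static void sorele(struct socket *so) { atomic_ops++; so->so_count--; }
static void fhold(struct file *fp)    { atomic_ops++; fp->f_count++; }
static void fdrop(struct file *fp)    { atomic_ops++; fp->f_count--; }

/* Old style: acquire both the file and the socket reference. */
static void syscall_old(struct file *fp) {
    fhold(fp);
    soref(fp->f_data);
    /* ... operate on the socket ... */
    sorele(fp->f_data);
    fdrop(fp);
}

/* New style: the held file reference alone keeps the socket from
 * being freed, so the socket count is never touched. */
static void syscall_new(struct file *fp) {
    fhold(fp);
    /* ... operate on fp->f_data directly ... */
    fdrop(fp);
}
```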
<tr>
<td> Mbuf queue library </td>
<td> &a.rwatson; </td>
-<td> 20041106 </td>
+<td> 20041124 </td>
<td> &status.prototyped; </td>
<td> In order to facilitate passing off queues of packets between
network stack components, create an mbuf queue primitive, struct
@@ -170,7 +170,9 @@
primitive is now being applied in several sample cases to determine
whether it offers the desired semantics and benefits. The
implementation can be found in the rwatson_dispatch Perforce
-branch.</td>
+branch. Additional work must also be done to explore the
+performance impact of "queues" vs arrays of mbuf pointers, which
+are likely to behave better from a caching perspective. </td>
</tr>
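A minimal sketch of such a packet-queue primitive follows. The actual struct name falls at a hunk boundary above, so `struct mbqueue` and its operations here are hypothetical stand-ins, and `struct mbuf` is pared down to the one linkage field the queue needs: head and tail pointers give O(1) enqueue and dequeue of packet chains.

```c
#include <assert.h>
#include <stddef.h>

/* Pared-down stand-in for the kernel mbuf: only the packet-chain
 * linkage matters for the queue primitive sketched here. */
struct mbuf {
    struct mbuf *m_nextpkt;
};

/* Hypothetical mbuf packet queue: FIFO with O(1) operations. */
struct mbqueue {
    struct mbuf *mq_head;
    struct mbuf *mq_tail;
    int          mq_len;
};

static void
mbq_init(struct mbqueue *q)
{
    q->mq_head = q->mq_tail = NULL;
    q->mq_len = 0;
}

static void
mbq_enqueue(struct mbqueue *q, struct mbuf *m)
{
    m->m_nextpkt = NULL;
    if (q->mq_tail == NULL)
        q->mq_head = m;            /* queue was empty */
    else
        q->mq_tail->m_nextpkt = m;
    q->mq_tail = m;
    q->mq_len++;
}

static struct mbuf *
mbq_dequeue(struct mbqueue *q)
{
    struct mbuf *m = q->mq_head;

    if (m != NULL) {
        q->mq_head = m->m_nextpkt;
        if (q->mq_head == NULL)
            q->mq_tail = NULL;     /* queue is now empty */
        q->mq_len--;
    }
    return (m);
}
```

A caller can hand the whole queue between stack layers by value of the head/tail pair, which is the property the row above is after.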
<tr>
@@ -185,7 +187,7 @@
required. This has not yet been benchmarked. A subset change to
dispatch a single mbuf to a driver has also been prototyped, and
benchmarked at a several percentage point improvement in packet send
-rates from user space.</td>
+rates from user space. </td>
</tr>
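The difference between per-packet and queued dispatch into a driver can be shown schematically. The names below (`if_start_single`, `if_start_queue`, `driver_entries`) are illustrative stand-ins, not the prototyped kernel interface: the point is only that a chain of packets crosses the driver boundary in one call instead of one call per packet.

```c
#include <assert.h>
#include <stddef.h>

/* Counts crossings of the (simulated) driver entry point. */
static int driver_entries;

struct mbuf { struct mbuf *m_nextpkt; };

/* Classic style: one driver entry per packet sent. */
static void
if_start_single(struct mbuf *m)
{
    (void)m;
    driver_entries++;
    /* a real driver would program the hardware here */
}

/* Queued style: the whole packet chain crosses the boundary once. */
static void
if_start_queue(struct mbuf *head)
{
    struct mbuf *m;

    driver_entries++;
    for (m = head; m != NULL; m = m->m_nextpkt) {
        /* a real driver would hand each packet to hardware here */
    }
}
```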
<tr>
@@ -201,19 +203,20 @@
<tr>
<td> Employ queued dispatch across netisr dispatch API </td>
<td> &a.rwatson; </td>
-<td> 20041113 </td>
-<td> &status.new; </td>
-<td> Similar to if_start_mbufqueue(), allow dispatch of queues of
-mbufs into the netisr interface, avoiding multiple wakeups when a
-netisr thread is already in execution. Wakeups are expensive
-operations even when there are no threads waiting. </td>
+<td> 20041124 </td>
+<td> &status.prototyped; </td>
+<td> Pull all of the mbufs in the netisr ifqueue out of the ifqueue
+into a thread-local mbuf queue to avoid repeated lock operations
+to access the queue. Also use lock-free operations to test for
+queue contents being present. This has been prototyped in the
+rwatson_netperf branch. </td>
</tr>
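The thread-local drain described above can be modeled in userspace. This is a single-threaded sketch with hypothetical names, where a counter stands in for lock acquisition cost: draining N packets one at a time takes N+1 lock round trips, while moving the whole chain to a local list takes one, and an unlocked emptiness check avoids the lock entirely when there is no work.

```c
#include <assert.h>
#include <stddef.h>

struct pkt { struct pkt *next; };

/* Shared queue guarded by a simulated mutex; lock_ops counts how
 * many times the lock must be taken. */
struct shared_q {
    struct pkt *head, *tail;
    int lock_ops;
};

static void q_lock(struct shared_q *q)   { q->lock_ops++; }
static void q_unlock(struct shared_q *q) { (void)q; }

static void
q_enqueue(struct shared_q *q, struct pkt *p)
{
    p->next = NULL;
    if (q->tail == NULL) q->head = p; else q->tail->next = p;
    q->tail = p;
}

/* Per-packet drain: one lock round trip per packet, plus one more
 * to observe that the queue is empty. */
static int
drain_per_packet(struct shared_q *q)
{
    int n = 0;
    for (;;) {
        struct pkt *p;
        q_lock(q);
        p = q->head;
        if (p != NULL) {
            q->head = p->next;
            if (q->head == NULL) q->tail = NULL;
        }
        q_unlock(q);
        if (p == NULL)
            break;
        n++;                /* process the packet without the lock */
    }
    return n;
}

/* Queued drain: an unlocked emptiness test, then one lock round
 * trip that moves the entire chain to a thread-local list, which
 * is processed lock-free. */
static int
drain_queued(struct shared_q *q)
{
    struct pkt *local, *p;
    int n = 0;

    if (q->head == NULL)    /* lock-free check: a racily missed
                             * packet is seen on the next pass */
        return 0;
    q_lock(q);
    local = q->head;
    q->head = q->tail = NULL;
    q_unlock(q);
    for (p = local; p != NULL; p = p->next)
        n++;
    return n;
}
```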
<tr>
<td> Modify UMA allocator to use critical sections not mutexes for
per-CPU caches. </td>
<td> &a.rwatson; </td>
-<td> 20041111 </td>
+<td> 20041124 </td>
<td> &status.prototyped; </td>
<td> The mutexes protecting per-CPU caches require atomic operations
on SMP systems; as they are per-CPU objects, the cost of
@@ -222,13 +225,14 @@
has been implemented in the rwatson_percpu branch, but is waiting
on critical section performance optimizations that will prevent
this change from negatively impacting uniprocessor performance.
-</td>
+The critical section operations from John Baldwin have been posted
+for public review. </td>
</tr>
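The idea of protecting a per-CPU cache with a critical section rather than a mutex can be sketched in userspace. Here a nesting counter stands in for `critical_enter()`/`critical_exit()` (which on a real kernel pin the thread to its CPU), and the cache structure and size are hypothetical: because only the local CPU can touch its cache while preemption and migration are disabled, no atomic operations are needed.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated critical section: a bare nesting counter stands in for
 * disabling preemption and migration on the current CPU. */
static int crit_nesting;
static void crit_enter(void) { crit_nesting++; }
static void crit_exit(void)  { crit_nesting--; }

/* Hypothetical per-CPU cache of free items, LIFO for locality. */
#define CACHE_SIZE 4
struct pcpu_cache {
    void *items[CACHE_SIZE];
    int   count;
};

static void *
cache_alloc(struct pcpu_cache *c)
{
    void *p = NULL;

    crit_enter();            /* pinned to this CPU: no atomics */
    if (c->count > 0)
        p = c->items[--c->count];
    crit_exit();
    return p;                /* NULL falls back to the slow path */
}

static int
cache_free(struct pcpu_cache *c, void *p)
{
    int cached = 0;

    crit_enter();
    if (c->count < CACHE_SIZE) {
        c->items[c->count++] = p;
        cached = 1;
    }
    crit_exit();
    return cached;           /* 0 means hand back to the zone */
}
```

The trade-off the row describes follows directly: this only wins if `crit_enter()`/`crit_exit()` are cheaper than a mutex acquire/release, which is what the critical-section optimization work below addresses.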
<tr>
<td> Optimize critical section performance </td>
<td> &a.jhb; </td>
-<td> 20041111 </td>
+<td> 20041124 </td>
<td> &status.prototyped; </td>
<td> Critical sections prevent preemption of a thread on a CPU, as
well as preventing migration of that thread to another CPU, and
@@ -245,7 +249,8 @@
cost as a mutex, meaning that optimizations on SMP to use critical
sections instead of mutexes will not harm UP performance. A
prototype of this change is present in the jhb_lock Perforce
-branch. </td>
+branch, and patches have been posted to per-architecture mailing
+lists for review. </td>
</tr>
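The shape of the optimized fast path can be sketched as follows. The field name `td_critnest` mirrors the kernel's spelling, but this is a userspace illustration under the assumption described in the row, not the real implementation: entering and leaving a critical section become a per-thread counter increment and decrement, with the expensive interrupt-state manipulation deferred until it is actually needed.

```c
#include <assert.h>

struct thread {
    int td_critnest;        /* critical section nesting depth */
};

/* Stands in for hardware interrupt-state manipulation; the point
 * of the optimization is that the fast path never touches it. */
static int hard_intr_ops;

static void
critical_enter(struct thread *td)
{
    td->td_critnest++;      /* cheap: no interrupt disable */
}

static void
critical_exit(struct thread *td)
{
    td->td_critnest--;
    if (td->td_critnest == 0) {
        /* Only on the outermost exit would deferred work, such as
         * a pending preemption, be processed. */
    }
}
```

With enter/exit this cheap, replacing per-CPU mutexes with critical sections (as in the UMA row above) no longer penalizes uniprocessor kernels.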
</table>