Update the 5-stable roadmap. Remove items that are done or no longer
relevant, add some new items, re-arrange the sections, and add a new schedule.
parent
16d9a01672
commit
e6909ee045
Notes:
svn2git
2020-12-08 03:00:23 +00:00
svn path=/head/; revision=18126
1 changed files with 223 additions and 348 deletions
|
@ -22,13 +22,14 @@
|
|||
<!ENTITY t.releng.5 "<literal>RELENG_5</literal>">
|
||||
<!ENTITY t.releng.5.1 "<literal>RELENG_5_1</literal>">
|
||||
<!ENTITY t.releng.5.2 "<literal>RELENG_5_2</literal>">
|
||||
<!ENTITY t.releng.5.3 "<literal>RELENG_5_3</literal>">
|
||||
<!ENTITY t.releng.head "<literal>HEAD</literal>">
|
||||
|
||||
]>
|
||||
|
||||
<article>
|
||||
<articleinfo>
|
||||
<title>The Roadmap for 5-STABLE</title>
|
||||
<title>The Road Map for 5-STABLE</title>
|
||||
|
||||
<authorgroup>
|
||||
<corpauthor>The &os; Release Engineering Team</corpauthor>
|
||||
|
@ -60,13 +61,12 @@
|
|||
of 2003. Features like the GEOM block layer, Mandatory Access Controls,
|
||||
ACPI, &sparc64; and ia64 platform support, and UFS snapshots, background
|
||||
filesystem checks, and 64-bit inode sizes make it an exciting operating
|
||||
system for both desktop and production users. However, some important
|
||||
system for both desktop and enterprise users. However, some important
|
||||
features are not complete. The foundations for fine-grained locking
|
||||
and preemption in the kernel exist, but much more work is left to be
|
||||
done. Work on Kernel Schedulable Entities (KSE), similar to Scheduler
|
||||
Activations, has been ongoing but needs a push to realize its benefit.
|
||||
Performance compared to &os; 4.<replaceable>X</replaceable> has
|
||||
declined and must be restored and surpassed.</para>
|
||||
done. Performance and stability compared to &os;
|
||||
4.<replaceable>X</replaceable> have declined and must be restored and
|
||||
surpassed.</para>
|
||||
|
||||
<para>This is somewhat similar to the situation that &os; faced in the
|
||||
3.<replaceable>X</replaceable> series. Work on 3-CURRENT trudged along
|
||||
|
@ -97,6 +97,15 @@
|
|||
<sect1 id="major-issues">
|
||||
<title>Major issues</title>
|
||||
|
||||
<para>The success of the 5.<replaceable>X</replaceable> series hinges on
|
||||
the ability to deliver fine-grained threading and re-entrancy in the
|
||||
kernel (also known as SMPng) and kernel-supported POSIX threads in
|
||||
userland, while not sacrificing overall system stability or
|
||||
performance.</para>
|
||||
|
||||
<sect2 id="SMPng">
|
||||
<title>SMPng</title>
|
||||
|
||||
<para>The state of SMPng and kernel lockdown is the biggest concern for
|
||||
5.<replaceable>X</replaceable>. To date, few major systems have come
|
||||
out from under the kernel-wide mutex known as <quote>Giant</quote>.
|
||||
|
@ -109,30 +118,35 @@
|
|||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>VM: the kmem_malloc(M_NOWAIT) path no longer needs Giant held.
|
||||
The kmem_malloc(M_WAITOK) path is in progress and is expected to be
|
||||
finished in the coming weeks. Other facets of the VM system, like
|
||||
the VFS interface, buffer/cache, etc, are largely untouched.</para>
|
||||
<para>VM: Kernel malloc is locked and free of Giant. The UMA zone
|
||||
allocator is also free of Giant. vm_object locking is in progress
|
||||
and is an important step to making the buffer/cache free of
|
||||
Giant. Pmap locking remains to be started.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>GEOM: The GEOM block layer was designed to run free of Giant,
|
||||
but at this time no block drivers can run without Giant.
|
||||
Additionally, it has the potential to suffer performance loss due
|
||||
to its upcall/downcall data paths happening in kernel threads.
|
||||
Lightweight context switches might help this.</para>
|
||||
<para>GEOM: The GEOM block layer was designed to run free of Giant
|
||||
and allow GEOM modules and underlying block drivers to run free
|
||||
of Giant. Currently, only the &man.ata.4; and &man.aac.4; drivers
|
||||
are locked and run without Giant. Work on other block drivers is
|
||||
in progress. Locking the CAM subsystem is required for nearly all
|
||||
SCSI drivers to run without Giant; this work has not started
|
||||
yet.</para>
|
||||
<para>Additionally, GEOM has the potential to suffer performance loss
|
||||
due to its upcall and downcall data paths happening in kernel threads.
|
||||
Improved lightweight context switches might help this.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Network: Locking of the TCP and UDP portions of the stack is
|
||||
complete. Work is in progress to lock up the IP stack, including
|
||||
the routing tree, ARP code, raw IP, and ifaddr and inet data
|
||||
structures. IPv6 has been lightly touched during the inp locking
|
||||
but is hindered by the KAME code being significantly out of date.
|
||||
Work has not started on any of the other protocols such as
|
||||
AppleTalk, XNS, or IPX. Locking of the socket layer is in progress
|
||||
but has been largely untested. None of the hardware drivers or
|
||||
Ethernet layers have been locked.</para>
|
||||
<para>Network: Work has restarted on locking the network stack.
|
||||
Routing tables, ARP, bridge, IPFW, Fast-Forward, TCP, UDP, IP,
|
||||
Fast IPSEC, and interface layers are being targeted initially, along
|
||||
with several Ethernet device drivers. The socket layer, IPv6, and
|
||||
other protocol layers will be targeted later. The primary goal
|
||||
of this work is to regain the performance found in
|
||||
&os; 4.<replaceable>X</replaceable>. The cost of context switching
|
||||
to the device driver ithreads and the netisr is still hampering
|
||||
performance.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
|
@ -140,12 +154,12 @@
|
|||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>buffer/cache: Initial work complete.</para>
|
||||
<para>buffer/cache: Initial work complete on locking the buffer.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Proc: Work on locking the proc structure was ongoing for a
|
||||
while but seems to have stalled.</para>
|
||||
<para>Proc: Initial proc locking is in place; further progress is
|
||||
expected for &os; 5.2.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
|
@ -159,8 +173,7 @@
|
|||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Pipes: complete with the exception of VM-related
|
||||
optimizations.</para>
|
||||
<para>Pipes: complete</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
|
@ -181,12 +194,13 @@
|
|||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>kernel encryption: crypto drivers and core &man.crypto.4; framework are
|
||||
Giant-free. KAME IPsec and FAST IPSec have not been locked.</para>
|
||||
<para>kernel encryption: crypto drivers and core &man.crypto.4;
|
||||
framework are Giant-free. KAME IPsec has not been locked.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Sound subsystem: complete</para>
|
||||
<para>Sound subsystem: complete, but lock order reversal problems seem
|
||||
to persist.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
|
@ -199,158 +213,128 @@
|
|||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>Another issue with SMPng is interrupt latency. The overhead of
|
||||
doing a complete context switch to a kernel interrupt thread is high
|
||||
and shows noticeable latency. Work is ongoing to implement lazy
|
||||
context switching on all platforms. Fine-grained locking of drivers
|
||||
will also help this, as will converting drivers to be as efficient as
|
||||
possible in their interrupt routines.</para>
|
||||
</sect2>
|
||||
|
||||
<para>Next, the state of KSE must be resolved for &t.releng.5;. Work on it has
|
||||
slowed noticeably in the past 6 months but appears to be picking up
|
||||
again. There are a number of issues that must be addressed:</para>
|
||||
<sect2 id="interrupts">
|
||||
<title>Interrupt latency and servicing</title>
|
||||
|
||||
<para>SMPng introduced the concept of dedicating kernel threads, known as
|
||||
ithreads, to servicing interrupts. With this, driver interrupt
|
||||
service routines are allowed to block for mutexes, memory allocations,
|
||||
etc. While this makes writing drivers easier, it introduces considerable
|
||||
latency into the system because a complete process context switch must
|
||||
be performed in order to service the ithread. This is aggravated by the
|
||||
extensive coverage over the kernel by the Giant mutex, and often results
|
||||
in multiple sleeps and context switches in order to service an interrupt.
|
||||
Drivers that register their interrupt as INTR_MPSAFE are less likely to
|
||||
feel these aggravating effects, but the overhead of doing a context
|
||||
switch remains. Interrupt service routines that are registered as
|
||||
INTR_FAST are run directly from the interrupt context and do not suffer
|
||||
these problems at all. However, the INTR_FAST property forces the
|
||||
interrupt line to be exclusive; no sharing can occur on it. The
|
||||
proliferation of shared interrupts on PC systems makes this
|
||||
undesirable.</para>
|
||||
|
||||
<para>Several ideas have been proposed to help combat this problem:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>The userland threading library, currently called libkse, is
|
||||
immature and has not been used for any significant threaded
|
||||
application.</para>
|
||||
<para>Special casing ithreads to be lightweight is a possibility. This
|
||||
might involve reducing the amount of saved context for the ithread,
|
||||
stack-borrowing from another kthread, and/or creating a new fast-path
|
||||
to avoid the mi_switch() routine.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>KSE has the potential to uncover latent race conditions and
|
||||
create new ones. An audit needs to be performed to ensure that no
|
||||
obvious problems exist.</para>
|
||||
<para>A new interrupt model can be introduced to allow drivers to
|
||||
register an 'interrupt filter' along with a normal service routine.
|
||||
This would be similar to the Mac OS X model in use today. Interrupt
|
||||
filter routines would allow the driver to determine if it is
|
||||
interested in servicing the interrupt, allow it to squelch the
|
||||
interrupt source, and possibly determine and schedule service
|
||||
actions. It would run in the same context as the low-level interrupt
|
||||
service routine, so sleeping would be strictly forbidden. If actions
|
||||
that result in sleeping or blocking for long periods are required,
|
||||
the filter would signal to the caller that its normal ithread routine
|
||||
should be scheduled.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
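The interrupt filter idea described above can be sketched roughly as follows. This is only an illustrative user-space model, not code from the &os; tree: the verdict names (FILTER_NOT_MINE, FILTER_DONE, FILTER_WAKE_ITHREAD), the fake_dev structure, and the dispatch helpers are all hypothetical and exist only to show how a filter might claim, squelch, and defer an interrupt on a shared line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filter verdicts; illustrative names, not a real kernel API. */
enum filter_result {
    FILTER_NOT_MINE,     /* this device did not raise the interrupt */
    FILTER_DONE,         /* fully serviced in interrupt context */
    FILTER_WAKE_ITHREAD  /* remaining work may sleep; schedule the ithread */
};

/* Stand-in for a device; the fields model hardware state for the sketch. */
struct fake_dev {
    bool irq_pending;    /* models an interrupt status register bit */
    bool irq_masked;     /* models the interrupt source being squelched */
    int  work_items;     /* completions waiting to be processed */
    int  ithread_runs;   /* how many times the ithread routine ran */
};

typedef enum filter_result (*intr_filter_t)(void *arg);
typedef void (*intr_thread_t)(void *arg);

/* One registration on a (possibly shared) interrupt line. */
struct intr_handler {
    intr_filter_t filter;   /* runs in interrupt context, must not sleep */
    intr_thread_t thread;   /* normal service routine, may block */
    void *arg;
    bool thread_scheduled;
};

/* Filter: decide ownership, squelch the source, and request the ithread. */
enum filter_result fake_filter(void *arg)
{
    struct fake_dev *dev = arg;

    if (!dev->irq_pending)
        return FILTER_NOT_MINE;
    dev->irq_pending = false;        /* acknowledge the interrupt */
    if (dev->work_items == 0)
        return FILTER_DONE;
    dev->irq_masked = true;          /* squelch until the ithread has run */
    return FILTER_WAKE_ITHREAD;
}

/* Ithread routine: does the work that might block, then unmasks. */
void fake_ithread(void *arg)
{
    struct fake_dev *dev = arg;

    dev->work_items = 0;
    dev->irq_masked = false;
    dev->ithread_runs++;
}

/* Low-level dispatch: call every filter on the line; return claim count. */
int dispatch_shared_irq(struct intr_handler *h, size_t n)
{
    int claimed = 0;

    for (size_t i = 0; i < n; i++) {
        assert(h[i].filter != NULL);
        switch (h[i].filter(h[i].arg)) {
        case FILTER_NOT_MINE:
            break;
        case FILTER_DONE:
            claimed++;
            break;
        case FILTER_WAKE_ITHREAD:
            h[i].thread_scheduled = true;
            claimed++;
            break;
        }
    }
    return claimed;
}

/* Later, in thread context: run only the ithreads that were requested. */
void run_scheduled_ithreads(struct intr_handler *h, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (h[i].thread_scheduled) {
            h[i].thread_scheduled = false;
            h[i].thread(h[i].arg);
        }
    }
}
```

In this model a shared line costs a context switch only for the handlers whose filter asks for one; devices that did not interrupt, or that can be serviced entirely in the filter, never wake a thread.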
|
||||
|
||||
<para>According to the release schedule below, KSE kernel and userland
|
||||
components must be functionality complete by June 2003 in order to be
|
||||
included in the &t.releng.5; branch. For security and stability reasons,
|
||||
if KSE cannot be finished in time then, by default, all KSE-specific
|
||||
syscalls should be modified to return ENOSYS and all other KSE-specific
|
||||
interfaces disabled. Deprecating KSE from &t.releng.5; but keeping it in
|
||||
the &t.releng.head; branch will pose problems in porting bugfixes and features
|
||||
between the two branches, so every effort should be made to finish it
|
||||
on time.</para>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="KSE">
|
||||
<title>Kernel-supported application threads</title>
|
||||
|
||||
<para>The FreeBSD 5.1 development cycle saw the KSE package jump into a
|
||||
highly usable state. THR, an alternate threading package based on some
|
||||
of the KSE kernel primitives but implementing purely 1:1 scheduling
|
||||
semantics, also appeared and is in a similarly experimental but usable
|
||||
state. Users may interchange these two libraries along with the legacy
|
||||
libc_r library via relinking their apps or by using the new libmap
|
||||
feature of the runtime linker. This excellent progress must be driven
|
||||
to completion before the &t.releng.5; branch point so that the libc_r
|
||||
package can be deprecated.</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>The kernel and userland components for KSE and THR must be
|
||||
completed for all Tier-1 platforms. The decision on which thread
|
||||
package to sanction as the default will likely be made on a
|
||||
per-platform basis depending on the stability and completeness of
|
||||
each package.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>KSE must pass the ACE test suite on all Tier-1 platforms.
|
||||
Additional real-world testing must also be performed to ensure
|
||||
that the libraries are indeed useful. At a minimum, the following
|
||||
packages should be tested:</para>
|
||||
<itemizedlist>
|
||||
<listitem><para>OpenOffice</para></listitem>
|
||||
<listitem><para>KDE Desktop</para></listitem>
|
||||
<listitem><para>Apache 2.x</para></listitem>
|
||||
<listitem><para>BIND 9.2.x</para></listitem>
|
||||
<listitem><para>MySQL</para></listitem>
|
||||
<listitem><para>&java; 1.4.x</para></listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
|
||||
</itemizedlist>
|
||||
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="goals">
|
||||
<title>Goals for 5-STABLE</title>
|
||||
<title>Requirements for 5-STABLE</title>
|
||||
|
||||
<para>The goals for the &t.releng.5; branch point are:</para>
|
||||
<para>The &t.releng.5; branch must offer users the same stability and
|
||||
performance that is currently enjoyed in the &t.releng.4; branch.
|
||||
While the goal of SMPng is to allow performance to far exceed what
|
||||
is found in &t.releng.4; and its sibling BSDs, regaining performance
|
||||
to the basic level is of the utmost importance. The branch must also
|
||||
be mature enough to avoid ABI and API changes while still allowing
|
||||
potential problems to be resolved.</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>All subsystems and interfaces must be mature enough to be
|
||||
maintainable for improvements and bug fixes.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Equal or better stability from &os; 4.8.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>No functional regressions from 4.8. It is important to make
|
||||
sure that users do not avoid upgrading to 5.x because of lost
|
||||
functionality.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Performance on par with &os; 4.8 for most common operations.
|
||||
Both UP and SMP configurations should be evaluated. SMP has the
|
||||
potential to perform much better than
|
||||
4.<replaceable>X</replaceable>, though for the purposes of creating
|
||||
the &t.releng.5; branch, comparable performance between the two should
|
||||
be acceptable.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>It is unrealistic to expect that the SMPng project will be fully
|
||||
complete by &t.releng.5;, or that performance will be significantly better
|
||||
than 4.<replaceable>X</replaceable>. However, focusing on a subset of
|
||||
the outstanding tasks will give enough benefit for the branch to be
|
||||
viable and maintainable. To break it down:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>ABI/API/Infrastructure stability - Enough infrastructure must
|
||||
be in place and stable to allow fixes from &t.releng.head; to easily and
|
||||
safely be merged into &t.releng.5;. Also, we must draw a line as to
|
||||
what subsystems are to be locked down when we go into
|
||||
5-STABLE.</para>
|
||||
<sect2 id="API">
|
||||
<title>ABI/API/Infrastructure stability</title>
|
||||
<para>Enough infrastructure must be in place and stable to allow
|
||||
fixes from &t.releng.head; to easily and safely be merged into
|
||||
&t.releng.5;. Also, we must draw a line as to what subsystems are
|
||||
to be locked down when we go into 5-STABLE.</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>SMPng</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>VM: Most codepaths, other than the ones that interact with
|
||||
VFS, should be Giant-free for &t.releng.5;.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Network: Taking the network stack out from under Giant poses
|
||||
the risk of uncovering latent bugs and races. Locking it down
|
||||
but not removing Giant imposes further performance penalties. A
|
||||
decision on which parts of the network stack should be locked and
|
||||
taken out from under Giant for &t.releng.5; should be made no later
|
||||
than March 15. Work on the IP, TCP, UDP, raw IP, routing sockets,
|
||||
and &unix; domain sockets stands a good chance of being complete in
|
||||
time for &t.releng.5;.</para>
|
||||
|
||||
<para>If the decision is made to not lift Giant from the stack,
|
||||
then the locks in these layers could be optimized out with a
|
||||
kernel config option. Having a Giant-free path from the
|
||||
hardware layer to the IP queues should be investigated as it
|
||||
could allow significant performance gains in the network
|
||||
benchmarks. If this can be achieved then the hardware interface
|
||||
layer needs to allow for drivers to incrementally become free of
|
||||
Giant. Locking down at least two Ethernet drivers would be
|
||||
highly desirable. If the semantics are too complex to have the
|
||||
stack free of Giant but not the hardware drivers, investigation
|
||||
should be done into making it configurable.</para>
|
||||
|
||||
<para>Lesser-used network stacks like netatalk, netipx, etc, should
|
||||
not break while this work is going on. However, locking them is
|
||||
not a high priority. Special kernel config options might be
|
||||
needed in order for these layers to operate with the rest of the
|
||||
stack being locked and Giant free.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>GEOM: At least 2 block drivers should be locked in order to
|
||||
demonstrate that others can also be locked without changing the
|
||||
interface to GEOM. The ATA driver is a good candidate for this,
|
||||
though caution should be taken as it is also extremely
|
||||
high-profile and any problems with it will affect nearly all
|
||||
users of &os;.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Lazy context switching: sparc64 is the only platform that
|
||||
performs lazy context switching when entering the kernel. The
|
||||
performance gains promised by this are significant enough to
|
||||
require that it be implemented for all other Tier-1
|
||||
platforms.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>KSE: The kernel side of KSE must be functionally complete and
|
||||
have undergone a security audit. libkse must be complete enough to
|
||||
demonstrate a real-world application running correctly on it using
|
||||
the standard &posix; Threads API. Examples would be apache 2.0,
|
||||
&java;, and/or mozilla. A functional regression test suite is also a
|
||||
requirement for &t.releng.5; and should test signal delivery,
|
||||
scheduling, performance, and process security/credentials for both
|
||||
KSE and non-KSE processes. KSE kernel and userland components must
|
||||
also reach the same level of functionality for all Tier-1 platforms
|
||||
<para>KSE: Both kernel and userland components must
|
||||
reach the same level of functionality for all Tier-1 platforms
|
||||
in both UP and SMP configurations. The definition of <quote>Tier-1
|
||||
platforms</quote> can be found in
|
||||
<ulink url="http://www.FreeBSD.org/doc/en_US.ISO8859-1/articles/committers-guide/archs.html"></ulink>.</para>
|
||||
<ulink url="http://www.FreeBSD.org/doc/en_US.ISO8859-1/articles/committers-guide/archs.html"></ulink>. Continued testing against the ACE test
|
||||
suite must be made as the &t.releng.5; branch draws near. KSE
|
||||
must pose no functional regressions for the ongoing &java;
|
||||
certification program. Common desktop and server applications
|
||||
must run seamlessly under KSE. A policy must be decided on as
|
||||
to which platforms will enable KSE as the default threading
|
||||
package, how to allow the user to switch threading packages, and
|
||||
how third-party packages will be made aware of these choices.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
|
@ -364,11 +348,9 @@
|
|||
tracks the
|
||||
progress of this and should be used to determine which drivers
|
||||
must be converted for &t.releng.5; and which can be left behind.
|
||||
Also, there has been talk by several developers and the original
|
||||
author to give the busdma interface a minor overhaul. If this is
|
||||
to happen, it needs to happen before &t.releng.5;. Otherwise,
|
||||
differences between the old and new API will make driver
|
||||
maintenance difficult.</para>
|
||||
No new storage or network drivers shall be allowed into the
|
||||
&os; source tree. Exceptions for other classes of drivers must
|
||||
be justified in public discussion.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
|
@ -377,73 +359,57 @@
|
|||
leaving this task solely to the OS. &os; must gain the ability to
|
||||
manage and allocate PCI memory resources on its own. Implementing
|
||||
this should take into account cardbus, PCI-HotPlug, and laptop
|
||||
dockstation requirements. This feature will become increasingly
|
||||
dock station requirements. This feature will become increasingly
|
||||
critical through the lifetime of &t.releng.5;, and therefore is a
|
||||
requirement for the &t.releng.5; branch.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
</sect2>
|
||||
|
||||
<listitem>
|
||||
<para>Performance: most performance gains hinge on the progress of
|
||||
SMPng. Areas that should be concentrated on are:</para>
|
||||
<sect2 id="performance">
|
||||
<title>Performance</title>
|
||||
<para>Performance hinges on the progress of SMPng infrastructure and
|
||||
the following areas:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Storage I/O: I/O performance suffers from two problems, too
|
||||
many expensive context switches, and too much work being done
|
||||
in interrupt threads. Specifically, it takes 3 context
|
||||
switches for most drivers to get from the hardware completion
|
||||
interrupt to unblocking the user process: one for the
|
||||
interrupt thread, one for the GEOM g_up thread, and one to get
|
||||
back to the user thread. Drivers that attempt to be efficient
|
||||
and quick in their interrupt handlers (as all should be)
|
||||
usually also schedule a taskqueue, which adds a context switch
|
||||
in between the interrupt thread and the g_up thread and brings
|
||||
the total up to 4. Two things need to be done to attack
|
||||
this:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Make all drivers defer most of their processing out of
|
||||
their interrupt thread. Significant performance gains have
|
||||
been shown recently in the &man.aac.4; driver by making its
|
||||
interrupt handler be <literal>INTR_MPSAFE</literal> and moving
|
||||
all processing to a taskqueue.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>investigate eliminating the taskqueue context switch by
|
||||
adding a callback to the g_up thread that allows a driver to
|
||||
do its interrupt processing there instead of in the
|
||||
taskqueue.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>Storage: The GEOM block layer allows storage drivers to
|
||||
run without Giant. All drivers that interface directly with
|
||||
GEOM (as opposed to sitting underneath CAM or another middleware)
|
||||
must be locked and free of Giant in both their strategy and
|
||||
completion paths. Their interrupt handlers must also run free
|
||||
of Giant.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Network: Network drivers suffer from the interrupt latency
|
||||
previously mentioned as well as from the network stack being
|
||||
partially locked down but not free from Giant. Possible
|
||||
strategies for addressing this are described in the previous
|
||||
section.</para>
|
||||
<para>Network: The layers in the IPv4 path below the socket layer
|
||||
must be locked and free of Giant. This includes the protocol,
|
||||
routing, bridging, filtering, and hardware layers. Allowances must
|
||||
be made for protocols that are not locked, especially IPv6.
|
||||
Testing must also be performed to ensure stability, correctness,
|
||||
and performance.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Other locking - XXX?</para>
|
||||
<para>Interrupt and context switching: As discussed above, interrupt
|
||||
latency and context switching have a severe impact on performance.
|
||||
Context switching for ithreads and kthreads must be improved on all
|
||||
platforms. New interrupt handling models that allow for faster,
|
||||
more flexible handling of both traditional and MSI interrupts must
|
||||
be investigated and implemented.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
</sect2>
|
||||
|
||||
<listitem>
|
||||
<para>Benchmarks and performance testing: Having a source of reliable
|
||||
and useful benchmarks is essential to identifying performance
|
||||
problems and guarding against performance regressions. A
|
||||
<quote>performance team</quote> that is made up of people and
|
||||
resources for formulating, developing, and executing benchmark
|
||||
tests should be put into place soon. Comparisons should be made
|
||||
against both &os; 4.<replaceable>X</replaceable> and Linux 2.4.x.
|
||||
Tests to consider are:</para>
|
||||
<sect2 id="benchmarks">
|
||||
<title>Benchmarks and performance testing</title>
|
||||
<para>Having a source of reliable and useful benchmarks is essential
|
||||
to identifying performance problems and guarding against performance
|
||||
regressions. A <quote>performance team</quote> that is made up of
|
||||
people and resources for formulating, developing, and executing
|
||||
benchmark tests should be put into place soon. Comparisons should
|
||||
be made against both &os; 4.<replaceable>X</replaceable> and Linux
|
||||
2.4/2.6. Tests to consider are:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
|
@ -471,35 +437,16 @@
|
|||
Note: does not compile with gcc 3.x yet.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
</sect2>
|
||||
|
||||
<listitem>
|
||||
<para>Features:</para>
|
||||
<sect2 id="features">
|
||||
<title>Features</title>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>ACPI: Intel's ACPI power management and device configuration
|
||||
subsystem has become an integral part of &os;'s x86 and ia64
|
||||
device configuration model. However, many bugs exist in Intel's
|
||||
vendor code, our OS-specific code, and motherboard BIOSes, causing
|
||||
many ACPI-enabled systems to fail to boot, misdetect drivers,
|
||||
and/or have many other problems. Fixing these problems seems to
|
||||
be an uphill battle and is often times causing a poor
|
||||
first-impression of &os; 5.0. Most x86 systems can function with
|
||||
ACPI disabled, and logic should be added to the boot loader and
|
||||
sysinstall to allow users to easily and intuitively turn it off.
|
||||
Turning off ACPI by default is prone to problems also as many
|
||||
newer systems rely on it to provide correct interrupt routing
|
||||
information. Also, a centralized resource should be created to
|
||||
track ACPI problems and solutions. Linux uses the same Intel
|
||||
vendor sources as &os;, so we should investigate how they have
|
||||
handled some of the known problems.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>NEWCARD/OLDCARD: The NEWCARD subsystem was made the default
|
||||
for &os; 5.0. Unfortunately, it contains no support for
|
||||
non-Cardbus bridges and falls victim to interrupt routine
|
||||
non-Cardbus bridges and falls victim to interrupt routing
|
||||
problems on some laptops. The classic 16-bit bridge support,
|
||||
OLDCARD, still exists and can be compiled in, but this is highly
|
||||
inconvenient for users of older laptops. If OLDCARD cannot be
|
||||
|
@ -517,7 +464,7 @@
|
|||
|
||||
<listitem>
|
||||
<para>New scheduler framework: The new scheduler framework is in
|
||||
place, and users can select between the classic 44bsd scheduler
|
||||
place, and users can select between the classic 44BSD scheduler
|
||||
and the new ULE scheduler. A scheduler that demonstrates
|
||||
processor affinity, HyperThreading and KSE awareness, and no
|
||||
regressions in performance or interactivity characteristics must
|
||||
|
@ -525,138 +472,66 @@
|
|||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>sparc64 local console: neither syscons nor vt work on
|
||||
sparc64, leaving it with only serial and <quote>fake</quote> OFW
|
||||
console support. This is a major support hole for what is a
|
||||
Tier-1 platform. Whether syscons can be shoe-horned in or
|
||||
wscons be adopted from NetBSD is up for debate. However,
|
||||
sparc64 must have local console support for &t.releng.5;. Having
|
||||
this will also enable the XFree86 server to run, which is also a
|
||||
requirement for &t.releng.5;.</para>
|
||||
<para>GDB: GDB in the base system must work for sparc64, and
|
||||
must also understand KSE thread semantics. GDB 5.3 is available
|
||||
and is reported to address the sparc64 issues.</para>
|
||||
</listitem>
|
||||
|
||||
</itemizedlist>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="documentation">
|
||||
<title>Documentation</title>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>The manual pages, Handbook, and FAQ should be free from
|
||||
content specific to &os; 4.<replaceable>X</replaceable>, i.e. all
|
||||
text should be equally applicable to &os;
|
||||
5.<replaceable>X</replaceable>. The installation section of the
|
||||
handbook needs the most work in this area.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>gcc/toolchain: gcc 3.3 might be available in time for
|
||||
&t.releng.5; and might offer some attractive benefits, but also
|
||||
likely to introduce ABI incompatibility with prior gcc versions.
|
||||
ABI compatibility should be locked down for the &t.releng.5;
|
||||
branch.</para>
|
||||
|
||||
<para>There has also been a request to move /usr/include/g++ to
|
||||
/usr/include/g++-v3 to be more compliant with the stock behavior
|
||||
of gcc. This should also be investigated for &t.releng.5;.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>gdb: gdb from the base system should work for sparc64. It
|
||||
should also understand KSE thread semantics, assuming that KSE
|
||||
is included in the &t.releng.5; branch. gdb 5.3 is available and
|
||||
there are reports that it should address the sparc64 issue.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>&man.disklabel.8; regressions: The biggest casualty of the
|
||||
introduction of GEOM appears to be the disklabel utility. The
|
||||
<option>-r</option> option gives unpredictable results in most
|
||||
cases now and should be removed or fixed. Work is planned for a
|
||||
new unified interface for modifying labels and slices, however
|
||||
this should not preclude disklabel from being fixed.</para>
|
||||
<para>The release documentation needs to be complete and accurate
|
||||
for all Tier-1 architectures. The hardware notes and
|
||||
installation guides need specific attention.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>Documentation:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>The manual pages, Handbook, and FAQ should be free from
|
||||
content specific to &os; 4.<replaceable>X</replaceable>, i.e. all
|
||||
text should be equally applicable to &os;
|
||||
5.<replaceable>X</replaceable>. The installation section of the
|
||||
Handbook needs the most work in this area.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>The release documentation needs to be complete and accurate
|
||||
for all Tier-1 architectures. The hardware notes and
|
||||
installation guides need specific attention.</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>If &os; 5.1 is not the branch point for &t.releng.5;, then the
|
||||
Early Adopters Guide needs to be updated. This document should
|
||||
then be removed just before the release closest to the &t.releng.5;
|
||||
branch point.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="schedule">
|
||||
<title>Schedule</title>
|
||||
|
||||
<para>If branching &t.releng.5; at the 5.1 release is paramount, 5.1 will
|
||||
probably need to be delayed by at least 3 months. The schedule would
|
||||
be:</para>
|
||||
<para>The original schedule of releasing &os; 5.2 and branching
|
||||
&t.releng.5; in September 2003 is being pushed back due to the
|
||||
complexity of the remaining tasks. The new schedule follows:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Jun 30, 2003: KSE and SMPng feature freeze</para>
|
||||
<para>Nov 5, 2003: 5.2-BETA, general code freeze</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Aug 4, 2003: 5.1-BETA, general code freeze</para>
|
||||
<para>Nov 19, 2003: 5.2-RC1, &t.releng.5.2; branched</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Aug 18, 2003: 5.1-RC1, &t.releng.5; and &t.releng.5.1; branched</para>
|
||||
<para>Nov 27, 2003: 5.2-RC2</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Aug 25, 2003: 5.1-RC2</para>
|
||||
<para>Dec 2, 2003: 5.2-RELEASE</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Sept 1, 2003: 5.1-RELEASE</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>Taking an incremental approach might be more beneficial. Releasing
|
||||
5.1 in time for USENIX ATC 2003 will provide a wide audience for
|
||||
productive feedback and will keep &os; visible. In this scenario, 5.1
|
||||
should offer a significant improvement over 5.0 in terms of bug fixes
|
||||
and performance. Lockdowns and improvements to the storage subsystem
|
||||
and scheduler should be expected, the NEWCARD/OLDCARD issues should be
|
||||
addressed, and all known bugs and regressions from the 5.0 errata list
|
||||
should be fixed. KSE and other SMPng tasks that cannot finish in time
|
||||
for 5.1 should also not reduce the stability of the release. The
|
||||
schedule for this would be:</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>May 5, 2003: 5.1-BETA, general code freeze</para>
|
||||
<para>Mar 1, 2004: 5.3-BETA, general code freeze</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>May 19, 2003: 5.1-RC1, &t.releng.5.1; branched</para>
|
||||
<para>Mar 15, 2004: 5.3-RC1, &t.releng.5; and &t.releng.5.3; branched</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>May 27, 2003: 5.1-RC2</para>
|
||||
<para>Mar 22, 2004: 5.3-RC2</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Jun 2, 2003: 5.1-RELEASE</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Jun 30, 2003: KSE and SMPng feature freeze</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Sept 1, 2003: 5.2-BETA, general code freeze</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Sept 15, 2003: 5.2-RC1, &t.releng.5; and &t.releng.5.2; branched</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Sept 22, 2003: 5.2-RC2</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Sept 29, 2003: 5.2-RELEASE</para>
|
||||
<para>Mar 29, 2004: 5.3-RELEASE</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
|