diff --git a/en_US.ISO8859-1/articles/5-roadmap/article.sgml b/en_US.ISO8859-1/articles/5-roadmap/article.sgml index c0e7c1f503..c3277d6d44 100644 --- a/en_US.ISO8859-1/articles/5-roadmap/article.sgml +++ b/en_US.ISO8859-1/articles/5-roadmap/article.sgml @@ -22,13 +22,14 @@ RELENG_5"> RELENG_5_1"> RELENG_5_2"> +RELENG_5_3"> HEAD"> ]>
- The Roadmap for 5-STABLE + The Road Map for 5-STABLE The &os; Release Engineering Team @@ -60,13 +61,12 @@ of 2003. Features like the GEOM block layer, Mandatory Access Controls, ACPI, &sparc64; and ia64 platform support, and UFS snapshots, background filesystem checks, and 64-bit inode sizes make it an exciting operating - system for both desktop and production users. However, some important + system for both desktop and enterprise users. However, some important features are not complete. The foundations for fine-grained locking and preemption in the kernel exist, but much more work is left to be - done. Work on Kernel Schedulable Entities (KSE), similar to Scheduler - Activations, has been ongoing but needs a push to realize its benefit. - Performance compared to &os; 4.X has - declined and must be restored and surpassed. + done. Performance and stability compared to &os; + 4.X have declined and must be restored and + surpassed. This is somewhat similar to the situation that &os; faced in the 3.X series. Work on 3-CURRENT trudged along @@ -97,6 +97,15 @@ Major issues + The success of the 5.X series hinges on + the ability to deliver fine-grained threading and re-entrancy in the + kernel (also known as SMPng) and kernel-supported POSIX threads in + userland, while not sacrificing overall system stability or + performance. + + + SMPng + The state of SMPng and kernel lockdown is the biggest concern for 5.X. To date, few major systems have come out from under the kernel-wide mutex known as Giant. @@ -109,30 +118,35 @@ - VM: the kmem_malloc(M_NOWAIT) path no longer needs Giant held. - The kmem_malloc(M_WAITOK) path is in progress and is expected to be - finished in the coming weeks. Other facets of the VM system, like - the VFS interface, buffer/cache, etc, are largely untouched. + VM: Kernel malloc is locked and free of Giant. The UMA zone + allocator is also free of Giant. 
vm_object locking is in progress + and is an important step to making the buffer/cache free of + Giant. Pmap locking remains to be started. - GEOM: The GEOM block layer was designed to run free of Giant, - but at this time no block drivers can run without Giant. - Additionally, it has the potential to suffer performance loss due - to its upcall/downcall data paths happening in kernel threads. - Lightweight context switches might help this. + GEOM: The GEOM block layer was designed to run free of Giant + and allow GEOM modules and underlying block drivers to run free + of Giant. Currently, only the &man.ata.4; and &man.aac.4; drivers + are locked and run without Giant. Work on other block drivers is + in progress. Locking the CAM subsystem is required for nearly all + SCSI drivers to run without Giant; this work has not started + yet. + Additionally, GEOM has the potential to suffer performance loss + due to its upcall and downcall data paths happening in kernel threads. + Improved lightweight context switches might help this. - Network: Locking of the TCP and UDP portions of the stack is - complete. Work is in progress to lock up the IP stack, including - the routing tree, ARP code, raw IP, and ifaddr and inet data - structures. IPv6 has been lightly touched during the inp locking - but is hindered by the KAME code being significantly out of date. - Work has not started on any of the other protocols such as - AppleTalk, XNS, or IPX. Locking of the socket layer is in progress - but has been largely untested. None of the hardware drivers or - Ethernet layers have been locked. + Network: Work has restarted on locking the network stack. + Routing tables, ARP, bridge, IPFW, Fast-Forward, TCP, UDP, IP, + Fast IPSEC, and interface layers are being targeted initially, along + with several Ethernet device drivers. The socket layer, IPv6, and + other protocol layers will be targeted later. The primary goal + of this work is to regain the performance found in + &os; 4.X. 
The cost of context switching + to the device driver ithreads and the netisr is still hampering + performance. @@ -140,12 +154,12 @@ - buffer/cache: Initial work complete. + buffer/cache: Initial work complete on locking the buffer. - Proc: Work on locking the proc structure was ongoing for a - while but seems to have stalled. + Proc: Initial proc locking is in place; further progress is + expected for &os; 5.2. @@ -159,8 +173,7 @@ - Pipes: complete with the exception of VM-related - optimizations. + Pipes: complete @@ -181,12 +194,13 @@ - kernel encryption: crypto drivers and core &man.crypto.4; framework are - Giant-free. KAME IPsec and FAST IPSec have not been locked. + kernel encryption: crypto drivers and core &man.crypto.4; + framework are Giant-free. KAME IPsec has not been locked. - Sound subsystem: complete + Sound subsystem: complete, but lock order reversal problems seem + to persist. @@ -199,158 +213,128 @@ - Another issue with SMPng is interrupt latency. The overhead of - doing a complete context switch to a kernel interrupt thread is high - and shows noticeable latency. Work is ongoing to implement lazy - context switching on all platforms. Fine grained locking of drivers - will also help this, as will converting drivers to be as efficient as - possible in their interrupt routines. + - Next, the state of KSE must resolved for &t.releng.5;. Work on it has - slowed noticeably in the past 6 months but appears to be picking up - again. There are a number of issues that must be addressed: + + Interrupt latency and servicing + SMPng introduced the concept of dedicating kernel threads, known as + ithreads, to servicing interrupts. With this, driver interrupt + service routines are allowed to block for mutexes, memory allocations, + etc. While this makes writing drivers easier, it introduces considerable + latency into the system due to the complete process context switch that must + be performed in order to service the ithread. 
This is aggravated by the + extensive coverage of the kernel by the Giant mutex, and often results + in multiple sleeps and context switches in order to service an interrupt. + Drivers that register their interrupt as INTR_MPSAFE are less likely to + feel these aggravating effects, but the overhead of doing a context + switch remains. Interrupt service routines that are registered as + INTR_FAST are run directly from the interrupt context and do not suffer + these problems at all. However, the INTR_FAST property forces the + interrupt line to be exclusive; no sharing can occur on it. The + proliferation of shared interrupts on PC systems makes this + undesirable. + + Several ideas have been proposed to help combat this problem: - The userland threading library, currently called libkse, is - immature and has not been used for any significant threaded - application. + Special casing ithreads to be lightweight is a possibility. This + might involve reducing the amount of saved context for the ithread, + stack-borrowing from another kthread, and/or creating a new fast-path + to avoid the mi_switch() routine. - KSE has the potential to uncover latent race conditions and - create new ones. An audit needs to be performed to ensure that no - obvious problems exist. + A new interrupt model can be introduced to allow drivers to + register an 'interrupt filter' along with a normal service routine. + This would be similar to the Mac OS X model in use today. Interrupt + filter routines would allow the driver to determine if it is + interested in servicing the interrupt, allow it to squelch the + interrupt source, and possibly determine and schedule service + actions. A filter would run in the same context as the low-level interrupt + service routine, so sleeping would be strictly forbidden. If actions + that result in sleeping or blocking for long periods are required, + the filter would signal to the caller that its normal ithread routine + should be scheduled. 
- According to the release schedule below, KSE kernel and userland - components must be functionality complete by June 2003 in order to be - included in the &t.releng.5; branch. For security and stability reasons, - if KSE cannot be finished in time then, by default, all KSE-specific - syscalls should be modified to return ENOSYS and all other KSE-specific - interfaces disabled. Deprecating KSE from &t.releng.5; but keeping it in - the &t.releng.head; branch will pose problems in porting bugfixes and features - between the two branches, so every effort should be made to finish it - on time. + + + + Kernel-supported application threads + + The FreeBSD 5.1 development cycle saw the KSE package jump into a + highly usable state. THR, an alternate threading package based on some + of the KSE kernel primitives but implementing purely 1:1 scheduling + semantics, also appeared and is in a similarly experimental but usable + state. Users may interchange these two libraries along with the legacy + libc_r library by relinking their applications or by using the new libmap + feature of the runtime linker. This excellent progress must be driven + to completion before the &t.releng.5; branch point so that the libc_r + package can be deprecated. + + + + The kernel and userland components for KSE and THR must be + completed for all Tier-1 platforms. The decision on which thread + package to sanction as the default will likely be made on a + per-platform basis depending on the stability and completeness of + each package. + + + + KSE must pass the ACE test suite on all Tier-1 platforms. + Additional real-world testing must also be performed to ensure + that the libraries are indeed useful. 
At a minimum, the following + packages should be tested: + + OpenOffice + KDE Desktop + Apache 2.x + BIND 9.2.x + MySQL + &java; 1.4.x + + + + + + - Goals for 5-STABLE + Requirements for 5-STABLE - The goals for the &t.releng.5; branch point are: + The &t.releng.5; branch must offer users the same stability and + performance that are currently enjoyed in the &t.releng.4; branch. + While the goal of SMPng is to allow performance to far exceed what + is found in &t.releng.4; and its sibling BSDs, regaining performance + to the basic level is of the utmost importance. The branch must also + be mature enough to avoid ABI and API changes while still allowing + potential problems to be resolved. - - - All subsystems and interfaces must be mature enough to be - maintainable for improvements and bug fixes. - - - - Equal or better stability from &os; 4.8. - - - - No functional regressions from 4.8. It is important to make - sure that users do not avoid upgrading to 5.x because of lost - functionality. - - - - Performance on par with &os; 4.8 for most common operations. - Both UP and SMP configurations should be evaluated. SMP has the - potential to perform much better than - 4.X, though for the purposes of creating - the &t.releng.5; branch, comparable performance between the two should - be acceptable. - - - - It is unrealistic to expect that the SMPng project will be fully - complete by &t.releng.5;, or that performance will be significantly better - than 4.X. However, focusing on a subset of - the outstanding tasks will give enough benefit for the branch to be - viable and maintainable. To break it down: - - - - ABI/API/Infrastructure stability - Enough infrastructure must - be in place and stable to allow fixes from &t.releng.head; to easily and - safely be merged into &t.releng.5;. Also, we must draw a line as to - what subsystems are to be locked down when we go into - 5-STABLE. 
+ + ABI/API/Infrastructure stability + Enough infrastructure must be in place and stable to allow + fixes from &t.releng.head; to easily and safely be merged into + &t.releng.5;. Also, we must draw a line as to what subsystems are + to be locked down when we go into 5-STABLE. - SMPng - - - - VM: Most codepaths, others than the ones that interact with - VFS, should be Giant-free for &t.releng.5;. - - - - Network: Taking the network stack out from under Giant poses - the risk of uncovering latent bugs and races. Locking it down - but not removing Giant imposes further performance penalties. A - decision on which parts of the network stack should be locked and - taken out from under Giant for &t.releng.5; should be made no later - than March 15. Work on the IP, TCP, UDP,raw IP, routing sockets, - and &unix; domain sockets stands a good chance of being complete in - time for &t.releng.5;. - - If the decision is made to not lift Giant from the stack, - then the locks in these layers could be optimized out with a - kernel config option. Having a Giant-free path from the the - hardware layer to the IP queues should be investigated as it - could allow significant performance gains in the network - benchmarks. If this can be achieved then the hardware interface - layer needs to allow for drivers to incrementally become free of - Giant. Locking down at least two Ethernet drivers would be - highly desirable. If the semantics are too complex to have the - stack free of Giant but not the hardware drivers, investigation - should be done into making it configurable. - - Lesser-used network stacks like netatalk, netipx, etc, should - not break while this work is going on. However, locking them is - not a high priority. Special kernel config options might be - needed in order for these layers to operate with the rest of the - stack being locked and Giant free. 
- - - - GEOM: At least 2 block drivers should be locked in order to - demonstrate that others can also be locked without changing the - interface to GEOM. The ATA driver is a good candidate for this, - though caution should be taken as it is also extremely - high-profile and any problems with it will affect nearly all - users of &os;. - - - - Lazy context switching: sparc64 is the only platform that - performs lazy context switching when entering the kernel. The - performance gains promised by this are significant enough to - require that it be implemented for all other Tier-1 - platforms. - - - - - - KSE: The kernel side of KSE must be functionally complete and - have undergone a security audit. libkse must be complete enough to - demonstrate a real-world application running correctly on it using - the standard &posix; Threads API. Examples would be apache 2.0, - &java;, and/or mozilla. A functional regression test suite is also a - requirement for &t.releng.5; and should test signal delivery, - scheduling, performance, and process security/credentials for both - KSE and non-KSE processes. KSE kernel and userland components must - also reach the same level of functionality for all Tier-1 platforms + KSE: Both kernel and userland components must + reach the same level of functionality for all Tier-1 platforms in both UP and SMP configurations. The definition of Tier-1 platforms can be found in - . + . Continued testing against the ACE test + suite must be performed as the &t.releng.5; branch draws near. KSE + must pose no functional regressions for the ongoing &java; + certification program. Common desktop and server applications + must run seamlessly under KSE. A policy must be decided on as + to which platforms will enable KSE as the default threading + package, how to allow the user to switch threading packages, and + how third-party packages will be made aware of these choices. 
@@ -364,11 +348,9 @@ tracks the progress of this and should be used to determine which drivers must be converted for &t.releng.5; and which can be left behind. - Also, there has been talk by several developers and the original - author to give the busdma interface a minor overhaul. If this is - to happen, it needs to happen before &t.releng.5;. Otherwise, - differences between the old and new API will make driver - maintenance difficult. + No new storage or network drivers shall be allowed into the + &os; source tree. Exceptions for other classes of drivers must + be justified in public discussion. @@ -377,73 +359,57 @@ leaving this task solely to the OS. &os; must gain the ability to manage and allocate PCI memory resources on its own. Implementing this should take into account cardbus, PCI-HotPlug, and laptop - dockstation requirements. This feature will become increasingly + dock station requirements. This feature will become increasingly critical through the lifetime of &t.releng.5;, and therefore is a requirement for the &t.releng.5; branch. - + - - Performance: most performance gains hinge on the progress of - SMPng Areas that should be concentrated on are: + + Performance + Performance hinges on the progress of SMPng infrastructure and + the following areas: - Storage I/O: I/O performance suffers from two problems, too - many expensive context switches, and too much work being done - in interrupt threads. Specifically, it takes 3 context - switches for most drivers to get from the hardware completion - interrupt to unblocking the user process: one for the - interrupt thread, one for the GEOM g_up thread, and one to get - back to the user thread. Drivers that attempt to be efficient - and quick in their interrupt handlers (as all should be) - usually also schedule a taskqueue, which adds a context switch - in between the interrupt thread and the g_up thread and brings - the total up to 4. 
Two things need to be done to attack - this: - - - - Make all drivers defer most of their processing out of - their interrupt thread. Significant performance gains have - been shown recently in the &man.aac.4; driver by making its - interrupt handler be INTR_MPSAFE and moving - all processing to a taskqueue. - - - - investigate eliminating the taskqueue context switch by - adding a callback to the g_up thread that allows a driver to - do its interrupt processing there instead of in the - taskqueue. - - + Storage: The GEOM block layer allows storage drivers to + run without Giant. All drivers that interface directly with + GEOM (as opposed to sitting underneath CAM or another middleware) + must be locked and free of Giant in both their strategy and + completion paths. Their interrupt handlers must also run free + of Giant. - Network: Network drivers suffer from the interrupt latency - previously mentioned as well as from the network stack being - partially locked down but not free from Giant. Possible - strategies for addressing this are described in the previous - section. + Network: The layers in the IPv4 path below the socket layer + must be locked and free of Giant. This includes the protocol, + routing, bridging, filtering, and hardware layers. Allowances must + be made for protocols that are not locked, especially IPv6. + Testing must also be performed to ensure stability, correctness, + and performance. - Other locking - XXX? + Interrupt and context switching: As discussed above, interrupt + latency and context switching have a severe impact on performance. + Context switching for ithreads and kthreads must be improved on + all platforms. New interrupt handling models that allow for faster, + more flexible handling of both traditional and MSI interrupts must + be investigated and implemented. 
- + - - Benchmarks and performance testing: Having a source of reliable - and useful benchmarks is essential to identifying performance - problems and guarding against performance regressions. A - performance team that is made up of people and - resources for formulating, developing, and executing benchmark - tests should be put into place soon. Comparisons should be made - against both &os; 4.X and Linux 2.4.x. - Tests to consider are: + + Benchmarks and performance testing + Having a source of reliable and useful benchmarks is essential + to identifying performance problems and guarding against performance + regressions. A performance team that is made up of + people and resources for formulating, developing, and executing + benchmark tests should be put into place soon. Comparisons should + be made against both &os; 4.X and Linux + 2.4/2.6. Tests to consider are: @@ -471,35 +437,16 @@ Note: does not compile with gcc 3.x yet. - + - - Features: + + Features: - - ACPI: Intel's ACPI power management and device configuration - subsystem has become an integral part of &os;'s x86 and ia64 - device configuration model. However, many bugs exist in Intel's - vendor code, our OS-specific code, and motherboard BIOSes, causing - many ACPI-enabled systems to fail to boot, misdetect drivers, - and/or have many other problems. Fixing these problems seems to - be an uphill battle and is often times causing a poor - first-impression of &os; 5.0. Most x86 systems can function with - ACPI disabled, and logic should be added to the boot loader and - sysinstall to allow users to easily and intuitively turn it off. - Turning off ACPI by default is prone to problems also as many - newer systems rely on it to provide correct interrupt routing - information. Also, a centralized resource should be created to - track ACPI problems and solutions. Linux uses the same Intel - vendor sources as &os;, so we should investigate how they have - handled some of the known problems. 
- - NEWCARD/OLDCARD: The NEWCARD subsystem was made the default for &os; 5.0. Unfortunately, it contains no support for - non-Cardbus bridges and falls victim to interrupt routine + non-Cardbus bridges and falls victim to interrupt routing problems on some laptops. The classic 16-bit bridge support, OLDCARD, still exists and can be compiled in, but this is highly inconvenient for users of older laptops. If OLDCARD cannot be @@ -517,7 +464,7 @@ New scheduler framework: The new scheduler framework is in - place, and users can select between the classic 44bsd scheduler + place, and users can select between the classic 44BSD scheduler and the new ULE scheduler. A scheduler that demonstrates processor affinity, HyperThreading and KSE awareness, and no regressions in performance or interactivity characteristics must @@ -525,138 +472,66 @@ - sparc64 local console: neither syscons nor vt work on - sparc64, leaving it with only serial and fake OFW - console support. This is a major support hole for what is a - Tier-1 platform. Whether syscons can be shoe-horned in or - wscons be adopted from NetBSD is up for debate. However, - sparc64 must have local console support for &t.releng.5;. Having - this will also enable the XFree86 server to run, which is also a - requirement for &t.releng.5;. + GDB: GDB in the base system must work for sparc64, and + must also understand KSE thread semantics. GDB 5.3 is available + and is reported to address the sparc64 issues. + + + + + + + Documentation: + + + + The manual pages, Handbook, and FAQ should be free from + content specific to &os; 4.X, i.e. all + text should be equally applicable to &os; + 5.X. The installation section of the + handbook needs the most work in this area. - gcc/toolchain: gcc 3.3 might be available in time for - &t.releng.5; and might offer some attractive benefits, but also - likely to introduce ABI incompatibility with prior gcc versions. - ABI compatibility should be locked down for the &t.releng.5; - branch. 
- - There has also been a request to move /usr/include/g++ to - /usr/include/g++-v3 to be more compliant with the stock behavior - of gcc. This should also be investigated for &t.releng.5;. - - - - gdb: gdb from the base system should work for sparc64. It - should also understand KSE thread semantics, assuming that KSE - is included in the &t.releng.5; branch. gdb 5.3 is available and - there are reports that it should address the sparc64 issue. - - - - &man.disklabel.8; regressions: The biggest casualty of the - introduction of GEOM appears to be the disklabel utility. The - option gives unpredictable results in most - cases now and should be removed or fixed. Work is planned for a - new unified interface for modifying labels and slices, however - this should not preclude disklabel from being fixed. + The release documentation needs to be complete and accurate + for all Tier-1 architectures. The hardware notes and + installation guides need specific attention. - - - - Documentation: - - - - The manual pages, Handbook, and FAQ should be free from - content specific to &os; 4.X, i.e. all - text should be equally applicable to &os; - 5.X. The installation section of the - handbook needs the most work in this area. - - - - The release documentation needs to be complete and accurate - for all Tier-1 architectures. The hardware notes and - installation guides need specific attention. - - - - If &os; 5.1 is not the branch point for &t.releng.5; then the - Early Adopters Guide needs to be updated. This document should - then be removed just before the release closest to the &t.releng.5; - branch point. - - - - + Schedule - If branching &t.releng.5; at the 5.1 release is paramount, 5.1 will - probably need to move out by at least 3 months. The schedule would - be: + The original schedule of releasing &os; 5.2 and branching + &t.releng.5; in September 2003 is being pushed back due to the + complexity of the remaining tasks. 
The new schedule follows: - Jun 30, 2003: KSE and SMPng feature freeze + Nov 5, 2003: 5.2-BETA, general code freeze - Aug 4, 2003: 5.1-BETA, general code freeze + Nov 19, 2003: 5.2-RC1, &t.releng.5.2; branched - Aug 18, 2003: 5.1-RC1, &t.releng.5; and &t.releng.5.1; branched + Nov 27, 2003: 5.2-RC2 - Aug 25, 2003: 5.1-RC2 + Dec 2, 2003: 5.2-RELEASE - Sept 1, 2003: 5.1-RELEASE - - - - Taking an incremental approach might be more beneficial. Releasing - 5.1 in time for USENIX ATC 2003 will provide a wide audience for - productive feedback and will keep &os; visible. In this scenario, 5.1 - should offer a significant improvement over 5.0 in terms of bug fixes - and performance. Lockdowns and improvements to the storage subsystem - and scheduler should be expected, the NEWCARD/OLDCARD issues should be - addressed, and all known bugs and regressions from the 5.0 errata list - should be fixed. KSE and other SMPng tasks that cannot finish in time - for 5.1 should also not reduce the stability of the release. The - schedule for this would be: - - - - May 5, 2003: 5.1-BETA, general code freeze + Mar 1, 2004: 5.3-BETA, general code freeze - May 19, 2003: 5.1-RC1, &t.releng.5.1; branched + Mar 15, 2004: 5.3-RC1, &t.releng.5; and &t.releng.5.3; branched - May 27, 2003: 5.1-RC2 + Mar 22, 2004: 5.3-RC2 - Jun 2, 2003: 5.1-RELEASE - - - Jun 30, 2003: KSE and SMPng feature freeze - - - Sept 1, 2003: 5.2-BETA, general code freeze - - - Sept 15, 2003: 5.2-RC1, &t.releng.5; and &t.releng.5.2; branched - - - Sept 22, 2003: 5.2-RC2 - - - Sept 29, 2003: 5.2-RELEASE + Mar 29, 2004: 5.3-RELEASE