<?xml version="1.0" encoding="ISO8859-1" standalone="no"?>
<!--
     The FreeBSD Documentation Project

     $FreeBSD$
-->

<chapter id="config-tuning">
  <chapterinfo>
    <authorgroup>
      <author>
	<firstname>Chern</firstname>
	<surname>Lee</surname>
	<contrib>Written by </contrib>
      </author>
    </authorgroup>
    <authorgroup>
      <author>
	<firstname>Mike</firstname>
	<surname>Smith</surname>
	<contrib>Based on a tutorial written by </contrib>
      </author>
    </authorgroup>
    <authorgroup>
      <author>
	<firstname>Matt</firstname>
	<surname>Dillon</surname>
	<contrib>Also based on tuning(7) written by </contrib>
      </author>
    </authorgroup>
  </chapterinfo>

  <title>Configuration and Tuning</title>

  <sect1 id="config-synopsis">
    <title>Synopsis</title>

    <indexterm><primary>system configuration</primary></indexterm>
    <indexterm><primary>system optimization</primary></indexterm>

    <para>One of the important aspects of &os; is system
      configuration.  Correct system configuration will help prevent
      headaches during future upgrades.  This chapter will explain
      much of the &os; configuration process, including some of the
      parameters which can be set to tune a &os; system.</para>

    <para>After reading this chapter, you will know:</para>

    <itemizedlist>
      <listitem>
	<para>How to efficiently work with
	  file systems and swap partitions.</para>
      </listitem>

      <listitem>
	<para>The basics of <filename>rc.conf</filename> configuration
	  and <filename
	    class="directory">/usr/local/etc/rc.d</filename> startup
	  scripts.</para>
      </listitem>

      <listitem>
	<para>How to configure and test a network card.</para>
      </listitem>

      <listitem>
	<para>How to configure virtual hosts on your network
	  devices.</para>
      </listitem>

      <listitem>
	<para>How to use the various configuration files in
	  <filename class="directory">/etc</filename>.</para>
      </listitem>

      <listitem>
	<para>How to tune &os; using <command>sysctl</command>
	  variables.</para>
      </listitem>

      <listitem>
	<para>How to tune disk performance and modify kernel
	  limitations.</para>
      </listitem>
    </itemizedlist>

    <para>Before reading this chapter, you should:</para>

    <itemizedlist>
      <listitem>
	<para>Understand &unix; and &os; basics (<xref
	    linkend="basics"/>).</para>
      </listitem>

      <listitem>
	<para>Be familiar with the basics of kernel
	  configuration/compilation
	  (<xref linkend="kernelconfig"/>).</para>
      </listitem>
    </itemizedlist>
  </sect1>

  <sect1 id="configtuning-initial">
    <title>Initial Configuration</title>

    <sect2>
      <title>Partition Layout</title>

      <indexterm><primary>partition layout</primary></indexterm>
      <indexterm>
	<primary><filename class="directory">/etc</filename></primary>
      </indexterm>
      <indexterm>
	<primary><filename class="directory">/var</filename></primary>
      </indexterm>
      <indexterm>
	<primary><filename class="directory">/usr</filename></primary>
      </indexterm>

      <sect3>
	<title>Base Partitions</title>

	<para>When laying out file systems with &man.bsdlabel.8; or
	  &man.sysinstall.8;, remember that hard drives transfer data
	  faster from the outer tracks than from the inner ones.  Thus,
	  smaller and more heavily accessed file systems should be
	  closer to the outside of the drive, while larger partitions
	  like <filename class="directory">/usr</filename> should be
	  placed toward the inner parts of the disk.  It is a good idea
	  to create partitions in an order similar to: root, swap,
	  <filename class="directory">/var</filename>,
	  <filename class="directory">/usr</filename>.</para>

	<para>The size of the
	  <filename class="directory">/var</filename> partition
	  reflects the intended machine usage.  The
	  <filename class="directory">/var</filename> file system is
	  used to hold mailboxes, log files, and printer spools.
	  Mailboxes and log files can grow to unexpected sizes
	  depending on how many users exist and how long log files are
	  kept.  Most users will rarely need more than about a
	  gigabyte of free disk space in
	  <filename class="directory">/var</filename>.</para>

	<note>
	  <para>Occasionally, a lot of disk space is required in
	    <filename class="directory">/var/tmp</filename>.  When new
	    software is installed with &man.pkg.add.1;, the packaging
	    tools extract a temporary copy of the packages under
	    <filename class="directory">/var/tmp</filename>.  Large
	    software packages, like
	    <application>Firefox</application>,
	    <application>OpenOffice</application>, or
	    <application>LibreOffice</application>, may be tricky to
	    install if there is not enough disk space under
	    <filename class="directory">/var/tmp</filename>.</para>
	</note>

	<para>The <filename class="directory">/usr</filename>
	  partition holds many of the files required to support the
	  system, including the &man.ports.7; collection (recommended)
	  and the source code (optional).  Both the ports and the
	  sources of the base system are optional at install time, but
	  we recommend at least 2 gigabytes for this partition.</para>

	<para>When selecting partition sizes, keep the space
	  requirements in mind.  Running out of space in
	  one partition while barely using another can be a
	  hassle.</para>
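
	<para>As a brief illustration, the <command>df</command>
	  utility reports how full each mounted file system is, which
	  makes an imbalance between partitions easy to spot:</para>

	<screen>&prompt.user; <userinput>df -h</userinput></screen>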

	<note>
	  <para>Some users have found that &man.sysinstall.8;'s
	    <literal>Auto-defaults</literal> partition sizer will
	    sometimes select smaller than adequate
	    <filename class="directory">/var</filename> and
	    <filename class="directory">/</filename> partitions.
	    Partition wisely and generously.</para>
	</note>
      </sect3>

      <sect3 id="swap-design">
	<title>Swap Partition</title>

	<indexterm><primary>swap sizing</primary></indexterm>
	<indexterm><primary>swap partition</primary></indexterm>

	<para>As a rule of thumb, the swap partition should be about
	  double the size of system memory (RAM).  For example, if the
	  machine has 128&nbsp;megabytes of memory, the swap partition
	  should be 256&nbsp;megabytes.  Systems with less memory may
	  perform better with more swap.  Less than 256&nbsp;megabytes
	  of swap is not recommended and memory expansion should be
	  considered.  The kernel's VM paging algorithms are tuned to
	  perform best when the swap partition is at least two times
	  the size of main memory.  Configuring too little swap can
	  lead to inefficiencies in the VM page scanning code and
	  might create issues later if more memory is added.</para>

	<para>On larger systems with multiple SCSI disks (or multiple
	  IDE disks operating on different controllers), it is
	  recommended that swap be configured on each drive (up to
	  four drives).  The swap partitions should be approximately
	  the same size.  The kernel can handle arbitrary sizes but
	  internal data structures scale to 4 times the largest swap
	  partition.  Keeping the swap partitions near the same size
	  will allow the kernel to optimally stripe swap space across
	  disks.  Large swap sizes are fine, even if swap is not used
	  much: the extra space can make it easier to recover from a
	  runaway program before being forced to reboot.</para>
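
	<para>As a sketch of such a layout, the following
	  <filename>/etc/fstab</filename> entries configure one swap
	  partition on each of two disks.  The device names
	  <filename>/dev/ad0s1b</filename> and
	  <filename>/dev/ad1s1b</filename> are assumptions chosen for
	  illustration and will differ from system to system:</para>

	<programlisting># Device	Mountpoint	FStype	Options	Dump	Pass#
/dev/ad0s1b	none		swap	sw	0	0
/dev/ad1s1b	none		swap	sw	0	0</programlisting>

	<para>Once the swap areas are active, running
	  <command>swapinfo</command> lists each swap device together
	  with its current usage.</para>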
      </sect3>

      <sect3>
	<title>Why Partition?</title>

	<para>Some users think a single large partition will be
	  fine, but there are several reasons why this is a bad idea.
	  First, each partition has different operational
	  characteristics and separating them allows the file system
	  to tune accordingly.  For example, the root and
	  <filename class="directory">/usr</filename> partitions are
	  read-mostly, without much writing, while a lot of reading
	  and writing could occur in
	  <filename class="directory">/var</filename> and
	  <filename class="directory">/var/tmp</filename>.</para>

	<para>By properly partitioning a system, fragmentation
	  introduced in the smaller, write-heavy partitions will not
	  bleed over into the mostly-read partitions.  Keeping the
	  write-loaded partitions closer to the disk's edge will
	  increase I/O performance in the partitions where it occurs
	  the most.  While I/O performance in the larger
	  partitions may also be needed, shifting them more toward the
	  edge of the disk will not lead to a significant performance
	  improvement over moving
	  <filename class="directory">/var</filename> to the edge.
	  Finally, there are safety concerns.  A smaller, neater root
	  partition which is mostly read-only has a greater chance of
	  surviving a bad crash.</para>
      </sect3>
    </sect2>
  </sect1>

  <sect1 id="configtuning-core-configuration">
    <title>Core Configuration</title>

    <indexterm>
      <primary>rc files</primary>
      <secondary><filename>rc.conf</filename></secondary>
    </indexterm>

    <para>The principal location for system configuration information
      is within <filename>/etc/rc.conf</filename>.  This file contains
      a wide range of configuration information, principally used at
      system startup to configure the system.  Its name directly
      implies this; it is configuration information for the
      <filename>rc*</filename> files.</para>

    <para>An administrator should make entries in the
      <filename>rc.conf</filename> file to override the default
      settings from <filename>/etc/defaults/rc.conf</filename>.  The
      defaults file should not be copied verbatim to
      <filename class="directory">/etc</filename>; it contains
      default values, not examples.  All system-specific changes
      should be made in the <filename>rc.conf</filename> file
      itself.</para>

    <para>A number of strategies may be applied in clustered
      applications to separate site-wide configuration from
      system-specific configuration in order to keep administration
      overhead down.  The recommended approach is to place
      system-specific configuration into the
      <filename>/etc/rc.conf.local</filename> file.  For
      example:</para>

    <itemizedlist>
      <listitem>
	<para><filename>/etc/rc.conf</filename>:</para>

	<programlisting>sshd_enable="YES"
keyrate="fast"
defaultrouter="10.1.1.254"</programlisting>

      </listitem>

      <listitem>
	<para><filename>/etc/rc.conf.local</filename>:</para>

	<programlisting>hostname="node1.example.org"
ifconfig_fxp0="inet 10.1.1.1/8"</programlisting>

      </listitem>
    </itemizedlist>

    <para>The <filename>rc.conf</filename> file can then be
      distributed to every system using <command>rsync</command> or a
      similar program, while the <filename>rc.conf.local</filename>
      file remains unique.</para>

    <para>Upgrading the system using &man.sysinstall.8; or
      <command>make world</command> will not overwrite the
      <filename>rc.conf</filename> file, so system configuration
      information will not be lost.</para>

    <tip>
      <para>The <filename>/etc/rc.conf</filename> configuration file
	is parsed by &man.sh.1;.  This allows system operators to
	add a certain amount of logic to this file, which may help to
	create very complex configuration scenarios.  Please see
	&man.rc.conf.5; for further information on this topic.</para>
    </tip>
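
    <para>As a minimal sketch of such logic, the following fragment
      of <filename>/etc/rc.conf</filename> reads an additional
      settings file only when that file exists.  The file name
      <filename>/etc/rc.conf.site</filename> is a hypothetical
      example, not a file that the base system provides:</para>

    <programlisting># Hypothetical example: pull in extra settings if the file is readable.
if [ -r /etc/rc.conf.site ]; then
	. /etc/rc.conf.site
fi</programlisting>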
  </sect1>

  <sect1 id="configtuning-appconfig">
    <title>Application Configuration</title>

    <para>Typically, installed applications have their own
      configuration files, with their own syntax, etc.  It is
      important that these files be kept separate from the base
      system, so that they may be easily located and managed by the
      package management tools.</para>

    <indexterm><primary>/usr/local/etc</primary></indexterm>

    <para>Typically, these files are installed in
      <filename class="directory">/usr/local/etc</filename>.  In the
      case where an application has a large number of configuration
      files, a subdirectory will be created to hold them.</para>

    <para>Normally, when a port or package is installed, sample
      configuration files are also installed.  These are usually
      identified with a <filename>.default</filename> suffix.  If
      there are no existing configuration files for the application,
      they will be created by copying the
      <filename>.default</filename> files.</para>

    <para>For example, consider the contents of the directory
      <filename
	class="directory">/usr/local/etc/apache</filename>:</para>

    <literallayout class="monospaced">-rw-r--r--  1 root  wheel   2184 May 20  1998 access.conf
-rw-r--r--  1 root  wheel   2184 May 20  1998 access.conf.default
-rw-r--r--  1 root  wheel   9555 May 20  1998 httpd.conf
-rw-r--r--  1 root  wheel   9555 May 20  1998 httpd.conf.default
-rw-r--r--  1 root  wheel  12205 May 20  1998 magic
-rw-r--r--  1 root  wheel  12205 May 20  1998 magic.default
-rw-r--r--  1 root  wheel   2700 May 20  1998 mime.types
-rw-r--r--  1 root  wheel   2700 May 20  1998 mime.types.default
-rw-r--r--  1 root  wheel   7980 May 20  1998 srm.conf
-rw-r--r--  1 root  wheel   7933 May 20  1998 srm.conf.default</literallayout>

    <para>The file sizes show that only the
      <filename>srm.conf</filename> file has been changed.  A later
      update of the <application>Apache</application> port would not
      overwrite this changed file.</para>
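
    <para>One way to see exactly what was changed, shown here only
      as an illustration, is to compare the installed file with its
      <filename>.default</filename> counterpart using
      <command>diff</command>:</para>

    <screen>&prompt.user; <userinput>diff -u /usr/local/etc/apache/srm.conf.default /usr/local/etc/apache/srm.conf</userinput></screen>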
  </sect1>

  <sect1 id="configtuning-starting-services">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Tom</firstname>
	  <surname>Rhodes</surname>
	  <contrib>Contributed by </contrib>
	</author>
      </authorgroup>
    </sect1info>

    <title>Starting Services</title>

    <indexterm><primary>services</primary></indexterm>

    <para>Many users choose to install third party software on &os;
      from the Ports Collection.  In many of these situations it may
      be necessary to configure the software in a manner which will
      allow it to be started upon system initialization.  Services
      such as <filename role="package">mail/postfix</filename> or
      <filename role="package">www/apache22</filename> are just two of
      the many software packages which may be started during system
      initialization.  This section explains the procedures available
      for starting third party software.</para>

    <para>In &os;, most included services, such as &man.cron.8;, are
      started through the system start up scripts.  These scripts may
      differ depending on &os; or vendor version; however, the most
      important aspect to consider is that their start up
      configuration can be handled through simple startup
      scripts.</para>

    <sect2>
      <title>Extended Application Configuration</title>

      <para>Now that &os; includes <filename>rc.d</filename>,
	configuration of application startup has become easier and
	more featureful.  Using the key words discussed in the
	<link linkend="configtuning-rcd">rc.d</link> section,
	applications may now be set to start after certain other
	services, for example <acronym>DNS</acronym>, and extra
	flags may be passed through <filename>rc.conf</filename> in
	place of hard coded flags in the start up script.  A
	basic script may look similar to the following:</para>

      <programlisting>#!/bin/sh
#
# PROVIDE: utility
# REQUIRE: DAEMON
# KEYWORD: shutdown

. /etc/rc.subr

name=utility
rcvar=utility_enable

command="/usr/local/sbin/utility"

load_rc_config $name

#
# DO NOT CHANGE THESE DEFAULT VALUES HERE
# SET THEM IN THE /etc/rc.conf FILE
#
utility_enable=${utility_enable-"NO"}
pidfile=${utility_pidfile-"/var/run/utility.pid"}

run_rc_command "$1"</programlisting>

      <para>This script will ensure that the provided
	<application>utility</application> will be started after the
	<literal>DAEMON</literal> pseudo-service.  It also provides a
	method for setting and tracking the process
	<acronym>ID</acronym> (<acronym>PID</acronym>) file.</para>

      <para>This application could then have the following line placed
	in <filename>/etc/rc.conf</filename>:</para>

      <programlisting>utility_enable="YES"</programlisting>

      <para>This method also allows for easier manipulation of the
	command line arguments, inclusion of the default functions
	provided in <filename>/etc/rc.subr</filename>, compatibility
	with the &man.rcorder.8; utility and provides for easier
	configuration via the <filename>rc.conf</filename>
	file.</para>
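
      <para>For example, extra command line arguments for the
	hypothetical <application>utility</application> above could
	be supplied from <filename>/etc/rc.conf</filename> through
	the <varname>utility_flags</varname> variable, which
	&man.rc.subr.8; passes to the command.  The
	<option>-v</option> flag is only an assumed example; the
	flags a real program accepts will differ:</para>

      <programlisting>utility_enable="YES"
utility_flags="-v"</programlisting>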
    </sect2>

    <sect2>
      <title>Using Services to Start Services</title>

      <para>Other services, such as <acronym>POP</acronym>3 server
	daemons, <acronym>IMAP</acronym>, etc. could be started using
	&man.inetd.8;.  This involves installing the service utility
	from the Ports Collection with a configuration line added to
	the <filename>/etc/inetd.conf</filename> file, or by
	uncommenting one of the current configuration lines.  Working
	with <application>inetd</application> and its configuration is
	described in depth in the
	<link linkend="network-inetd">inetd</link> section.</para>
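
      <para>As an illustration, an entry for a
	<acronym>POP</acronym>3 daemon in
	<filename>/etc/inetd.conf</filename> might look like the
	following.  The server path and program name are assumptions
	and depend on which port is installed:</para>

      <programlisting>pop3	stream	tcp	nowait	root	/usr/local/libexec/popper	popper</programlisting>

      <para>The fields are, in order: the service name, the socket
	type, the protocol, the wait behavior, the user to run as,
	the server program, and its arguments.</para>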

      <para>In some cases it may make more sense to use the
	&man.cron.8; daemon to start system services.  This approach
	has a number of advantages because <command>cron</command>
	runs these processes as the <filename>crontab</filename>'s
	file owner.  This allows regular users to start and maintain
	some applications.</para>

      <para>The <command>cron</command> utility provides a unique
	feature, <literal>@reboot</literal>, which may be used in
	place of the time specification.  This will cause the job to
	be run when &man.cron.8; is started, normally during system
	initialization.</para>
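
      <para>A minimal sketch of such an entry in a user
	<filename>crontab</filename> follows.  The script path is
	hypothetical:</para>

      <programlisting># Run once each time cron starts, normally at boot.
@reboot	/home/user/bin/start-service.sh</programlisting>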
    </sect2>
  </sect1>

  <sect1 id="configtuning-cron">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Tom</firstname>
	  <surname>Rhodes</surname>
	  <contrib>Contributed by </contrib>
	  <!-- 20 May 2003 -->
	</author>
      </authorgroup>
    </sect1info>
    <title>Configuring the <command>cron</command> Utility</title>

    <indexterm><primary>cron</primary>
      <secondary>configuration</secondary></indexterm>

    <para>One of the most useful utilities in &os; is &man.cron.8;.
      The <command>cron</command> utility runs in the background and
      constantly checks the <filename>/etc/crontab</filename> file.
      The <command>cron</command> utility also checks the
      <filename class="directory">/var/cron/tabs</filename> directory,
      in search of new <filename>crontab</filename> files.  These
      <filename>crontab</filename> files store information about
      specific functions which <command>cron</command> is supposed to
      perform at certain times.</para>

    <para>The <command>cron</command> utility uses two different types
      of configuration files, the system crontab and user crontabs.
      These formats only differ in the sixth field and later.  In the
      system crontab, <command>cron</command> will run the command as
      the user specified in the sixth field.  In a user crontab, all
      commands run as the user who created the crontab, so the sixth
      field is the last field; this is an important security feature.
      The final field is always the command to run.</para>

    <note>
      <para>User crontabs allow individual users to schedule tasks
	without the need for <username>root</username> privileges.
	Commands in a user's crontab run with the permissions of the
	user who owns the crontab.</para>

      <para>The <username>root</username> user can have a user crontab
	just like any other user.  The <username>root</username> user
	crontab is separate from <filename>/etc/crontab</filename>
	(the system crontab).  Because the system crontab effectively
	invokes the specified commands as <username>root</username>,
	there is usually no need to create a user crontab for
	<username>root</username>.</para>
    </note>

    <para>Let us take a look at the <filename>/etc/crontab</filename>
      file (the system crontab):</para>

    <programlisting># /etc/crontab - root's crontab for &os;
#
# &dollar;&os;: src/etc/crontab,v 1.32 2002/11/22 16:13:39 tom Exp &dollar;
# <co id="co-comments"/>
#
SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin <co id="co-env"/>
HOME=/var/log
#
#
#minute	hour	mday	month	wday	who	command <co id="co-field-descr"/>
#
#
*/5	*	*	*	*	root	/usr/libexec/atrun <co id="co-main"/></programlisting>

    <calloutlist>
      <callout arearefs="co-comments">
	<para>Like most &os; configuration files, the
	  <literal>#</literal> character represents a comment.  A
	  comment can be placed in the file as a reminder of what and
	  why a desired action is performed. Comments cannot be on the
	  same line as a command or else they will be interpreted as
	  part of the command; they must be on a new line.  Blank
	  lines are ignored.</para>
      </callout>

      <callout arearefs="co-env">
	<para>First, the environment must be defined.  The equals
	  (<literal>=</literal>) character is used to define any
	  environment settings, as with this example where it is used
	  for the <envar>SHELL</envar>, <envar>PATH</envar>, and
	  <envar>HOME</envar> options.  If the shell line is omitted,
	  <command>cron</command> will use the default, which is
	  <command>sh</command>.  If the <envar>PATH</envar> variable
	  is omitted, no default will be used and file locations will
	  need to be absolute.  If <envar>HOME</envar> is omitted,
	  <command>cron</command> will use the invoking user's home
	  directory.</para>
      </callout>

      <callout arearefs="co-field-descr">
	<para>This line defines a total of seven fields.  Listed here
	  are the values <literal>minute</literal>,
	  <literal>hour</literal>, <literal>mday</literal>,
	  <literal>month</literal>, <literal>wday</literal>,
	  <literal>who</literal>, and <literal>command</literal>.
	  These are almost all self-explanatory.
	  <literal>minute</literal> is the time in minutes the command
	  will be run.  <literal>hour</literal> is similar to the
	  <literal>minute</literal> option, just in hours.
	  <literal>mday</literal> stands for day of the month.
	  <literal>month</literal> is similar to
	  <literal>hour</literal> and <literal>minute</literal>, as it
	  designates the month.  The <literal>wday</literal> option
	  stands for day of the week.  All of these fields must be
	  numeric values, and hours follow the twenty-four hour clock.  The
	  <literal>who</literal> field is special, and only exists in
	  the <filename>/etc/crontab</filename> file.  This field
	  specifies which user the command should be run as.  The last
	  field is the command to be executed.</para>
      </callout>

      <callout arearefs="co-main">
	<para>This last line will define the values discussed above.
	  Notice here we have a <literal>*/5</literal> listing,
	  followed by several more <literal>*</literal> characters.
	  These <literal>*</literal> characters mean
	  <quote>first-last</quote>, and can be interpreted as
	  <emphasis>every</emphasis> time.  So, judging by this line,
	  it is apparent that the <command>atrun</command> command is
	  to be invoked by <username>root</username> every five
	  minutes regardless of what day or month it is.  For more
	  information on the <command>atrun</command> command, see the
	  &man.atrun.8; manual page.</para>

	<para>Commands can have any number of flags passed to them;
	  however, commands which extend to multiple lines need to be
	  broken with the backslash <quote>\</quote> continuation
	  character.</para>
      </callout>
    </calloutlist>

    <para>This is the basic setup for every
      <filename>crontab</filename> file, although there is one thing
      different about this one.  Field number six, where we specified
      the username, only exists in the system
      <filename>/etc/crontab</filename> file.  This field should be
      omitted for individual user <filename>crontab</filename>
      files.</para>
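
    <para>For comparison, a user <filename>crontab</filename> entry
      might look like the following sketch, which has no
      <literal>who</literal> field.  The script path is a
      hypothetical example:</para>

    <programlisting>#minute	hour	mday	month	wday	command
#
# Run a backup script at 2:30 every morning.
30	2	*	*	*	/home/user/bin/backup.sh</programlisting>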

    <sect2 id="configtuning-installcrontab">
      <title>Installing a Crontab</title>

      <important>
	<para>Do not use the procedure described here to edit and
	  install the system crontab,
	  <filename>/etc/crontab</filename>.  Just use your favorite
	  editor: the <command>cron</command> utility will notice that
	  the file has changed and immediately begin using the updated
	  version.  See <ulink
	    url="&url.books.faq;/admin.html#ROOT-NOT-FOUND-CRON-ERRORS">
	    this FAQ entry</ulink> for more information.</para>
      </important>

      <para>To install a freshly written user
	<filename>crontab</filename>, first use your favorite editor
	to create a file in the proper format, and then use the
	<command>crontab</command> utility.  The most common usage
	is:</para>

      <screen>&prompt.user; <userinput>crontab crontab-file</userinput></screen>

      <para>In this example, <filename>crontab-file</filename> is the
	filename of a <filename>crontab</filename> that was previously
	created.</para>

      <para>There is also an option to list installed
	<filename>crontab</filename> files: just pass the
	<option>-l</option> option to <command>crontab</command> and
	look over the output.</para>

      <para>For users who wish to begin their own crontab file from
	scratch, without the use of a template, the
	<command>crontab -e</command> option is available.  This will
	invoke the selected editor with an empty file.  When the file
	is saved, it will be automatically installed by the
	<command>crontab</command> command.</para>

      <para>In order to remove a user <filename>crontab</filename>
	completely, use <command>crontab</command> with the
	<option>-r</option> option.</para>
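
      <para>Taken together, a session for managing a user
	<filename>crontab</filename> could look like the following
	sketch, which installs, lists, edits, and finally removes the
	<filename>crontab</filename>.  Here
	<filename>crontab-file</filename> is the same placeholder
	name used earlier in this section:</para>

      <screen>&prompt.user; <userinput>crontab crontab-file</userinput>
&prompt.user; <userinput>crontab -l</userinput>
&prompt.user; <userinput>crontab -e</userinput>
&prompt.user; <userinput>crontab -r</userinput></screen>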
    </sect2>
  </sect1>

  <sect1 id="configtuning-rcd">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Tom</firstname>
	  <surname>Rhodes</surname>
	  <contrib>Contributed by </contrib>
	  <!-- 16 May 2003 -->
	</author>
      </authorgroup>
    </sect1info>

    <title>Using &man.rc.8; Under &os;</title>

    <para>In 2002, &os; integrated the NetBSD <filename>rc.d</filename>
      system for system initialization.  Users should notice the files
      listed in the <filename class="directory">/etc/rc.d</filename>
      directory.  Many of these files are for basic services which can
      be controlled with the <option>start</option>,
      <option>stop</option>, and <option>restart</option> options.
      For instance, &man.sshd.8; can be restarted with the following
      command:</para>

    <screen>&prompt.root; <userinput>/etc/rc.d/sshd restart</userinput></screen>

    <para>This procedure is similar for other services.  Of course,
      services are usually started automatically at boot time as
      specified in &man.rc.conf.5;.  For example, enabling the Network
      Address Translation daemon at startup is as simple as adding the
      following line to <filename>/etc/rc.conf</filename>:</para>

    <programlisting>natd_enable="YES"</programlisting>

    <para>If a <option>natd_enable="NO"</option> line is already
      present, then simply change the <option>NO</option> to
      <option>YES</option>.  The rc scripts will automatically load
      any other dependent services during the next reboot, as
      described below.</para>

    <para>Since the <filename>rc.d</filename> system is primarily
      intended to start/stop services at system startup/shutdown time,
      the standard <option>start</option>, <option>stop</option> and
      <option>restart</option> options will only perform their action
      if the appropriate <filename>/etc/rc.conf</filename> variables
      are set.  For instance, the above <command>sshd restart</command>
      command will only work if <varname>sshd_enable</varname> is set
      to <option>YES</option> in <filename>/etc/rc.conf</filename>.
      To <option>start</option>, <option>stop</option> or
      <option>restart</option> a service regardless of the settings in
      <filename>/etc/rc.conf</filename>, the commands should be
      prefixed with <quote>one</quote>.  For instance, to restart
      <command>sshd</command> regardless of the current
      <filename>/etc/rc.conf</filename> setting, execute the following
      command:</para>

    <screen>&prompt.root; <userinput>/etc/rc.d/sshd onerestart</userinput></screen>

    <para>It is easy to check if a service is enabled in
      <filename>/etc/rc.conf</filename> by running the appropriate
      <filename>rc.d</filename> script with the option
      <option>rcvar</option>.  Thus, an administrator can check that
      <command>sshd</command> is in fact enabled in
      <filename>/etc/rc.conf</filename> by running:</para>

    <screen>&prompt.root; <userinput>/etc/rc.d/sshd rcvar</userinput>
# sshd
$sshd_enable=YES</screen>

    <note>
      <para>The second line (<literal># sshd</literal>) is part of
	the output from the <command>sshd</command> script, not a
	<username>root</username> prompt.</para>
    </note>

    <para>To determine if a service is running, a
      <option>status</option> option is available.  For instance, to
      verify that <command>sshd</command> is actually started:</para>

    <screen>&prompt.root; <userinput>/etc/rc.d/sshd status</userinput>
sshd is running as pid 433.</screen>

    <para>In some cases it is also possible to <option>reload</option>
      a service.  This will attempt to send a signal to an individual
      service, forcing the service to reload its configuration files.
      In most cases this means sending the service a
      <literal>SIGHUP</literal> signal.  Support for this feature is
      not included for every service.</para>
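
    <para>For example, after its configuration file has been edited,
      <command>sshd</command> can be asked to reload it with:</para>

    <screen>&prompt.root; <userinput>/etc/rc.d/sshd reload</userinput></screen>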

    <para>The <filename>rc.d</filename> system is not only used for
      network services; it also contributes to most of the system
      initialization.  For instance, consider the
      <filename>bgfsck</filename> file.  When this script is executed,
      it will print out the following message:</para>

    <screen>Starting background file system checks in 60 seconds.</screen>

    <para>Therefore this file is used for background file system
      checks, which are done only during system initialization.</para>

    <para>Many system services depend on other services to function
      properly.  For example, NIS and other RPC-based services may
      fail to start until after the <command>rpcbind</command>
      (portmapper) service has started.  To resolve this issue,
      information about dependencies and other meta-data is included
      in the comments at the top of each startup script.  The
      &man.rcorder.8; program is then used to parse these comments
      during system initialization to determine the order in which
      system services should be invoked to satisfy the
      dependencies.</para>

    <para>The following keyword must be included in all startup
      scripts (it is required by &man.rc.subr.8; to
      <quote>enable</quote> the startup script):</para>

    <itemizedlist>
      <listitem>
	<para><literal>PROVIDE</literal>: Specifies the services this
	  file provides.</para>
      </listitem>
    </itemizedlist>

    <para>The following words may be included at the top of each
      startup file.  They are not strictly necessary, but they are
      useful as hints to &man.rcorder.8;:</para>

    <itemizedlist>
      <listitem>
	<para><literal>REQUIRE</literal>: Lists services which are
	  required for this service.  This file will run
	  <emphasis>after</emphasis> the specified services.</para>
      </listitem>

      <listitem>
	<para><literal>BEFORE</literal>: Lists services which depend
	  on this service.  This file will run
	  <emphasis>before</emphasis> the specified services.</para>
      </listitem>
    </itemizedlist>
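
    <para>As an illustration, the comment block at the top of a
      startup script for a hypothetical service named
      <literal>utility</literal> could combine these keywords as
      follows.  <literal>NETWORKING</literal> and
      <literal>LOGIN</literal> refer to placeholder scripts in
      <filename class="directory">/etc/rc.d</filename>; which
      dependencies a real service needs is an assumption to verify
      for each case:</para>

    <programlisting># PROVIDE: utility
# REQUIRE: NETWORKING
# BEFORE: LOGIN</programlisting>

    <para>The order that results from these hints can be inspected by
      running &man.rcorder.8; by hand, which prints the scripts in
      dependency order:</para>

    <screen>&prompt.root; <userinput>rcorder /etc/rc.d/*</userinput></screen>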

    <para>By carefully setting these keywords for each startup script,
      an administrator has a very fine-grained level of control of the
      startup order of the scripts, without the hassle of
      <quote>runlevels</quote> like some other &unix; operating
      systems.</para>

    <para>Additional information about the <filename>rc.d</filename>
      system can be found in the &man.rc.8; and &man.rc.subr.8; manual
      pages.  If you are interested in writing your own
      <filename>rc.d</filename> scripts or improving the existing
      ones, you may also find <ulink url="&url.articles.rc-scripting">this
	article</ulink> useful.</para>
  </sect1>

  <sect1 id="config-network-setup">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Marc</firstname>
	  <surname>Fonvieille</surname>
	  <contrib>Contributed by </contrib>
	  <!-- 6 October 2002 -->
	</author>
      </authorgroup>
    </sect1info>

    <title>Setting Up Network Interface Cards</title>

    <indexterm>
      <primary>network cards</primary>
      <secondary>configuration</secondary>
    </indexterm>

    <para>A network connection is essential on most modern
      computers.  Adding and configuring a network card is a common
      task for any &os; administrator.</para>

    <sect2>
      <title>Locating the Correct Driver</title>

      <indexterm>
	<primary>network cards</primary>
	<secondary>driver</secondary>
      </indexterm>

      <para>Before you begin, you should know the model of the card
	you have, the chip it uses, and whether it is a PCI or ISA
	card.  &os; supports a wide variety of both PCI and ISA cards.
	Check the Hardware Compatibility List for your release to see
	if your card is supported.</para>

      <para>Once you are sure your card is supported, you need to
	determine the proper driver for the card.
	<filename>/usr/src/sys/conf/NOTES</filename> and
	<filename>/usr/src/sys/<replaceable>arch</replaceable>/conf/NOTES</filename>
	will give you the list of network interface drivers with some
	information about the supported chipsets/cards.  If you have
	doubts about which driver is the correct one, read the manual
	page of the driver.  The manual page will give you more
	information about the supported hardware and even the possible
	problems that could occur.</para>

      <para>If you own a common card, most of the time you will not
	have to look very hard for a driver.  Drivers for common
	network cards are present in the <filename>GENERIC</filename>
	kernel, so your card should show up during boot, like
	so:</para>

      <screen>dc0: &lt;82c169 PNIC 10/100BaseTX&gt; port 0xa000-0xa0ff mem 0xd3800000-0xd38
000ff irq 15 at device 11.0 on pci0
miibus0: &lt;MII bus&gt; on dc0
bmtphy0: &lt;BCM5201 10/100baseTX PHY&gt; PHY 1 on miibus0
bmtphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc0: Ethernet address: 00:a0:cc:da:da:da
dc0: [ITHREAD]
dc1: &lt;82c169 PNIC 10/100BaseTX&gt; port 0x9800-0x98ff mem 0xd3000000-0xd30
000ff irq 11 at device 12.0 on pci0
miibus1: &lt;MII bus&gt; on dc1
bmtphy1: &lt;BCM5201 10/100baseTX PHY&gt; PHY 1 on miibus1
bmtphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc1: Ethernet address: 00:a0:cc:da:da:db
dc1: [ITHREAD]</screen>

      <para>In this example, we see that two cards using the
	&man.dc.4; driver are present on the system.</para>

      <para>If the driver for your NIC is not present in
	<filename>GENERIC</filename>, you will need to load the proper
	driver to use your NIC.  This may be accomplished in one of
	two ways:</para>

      <itemizedlist>
	<listitem>
	  <para>The easiest way is to simply load a kernel module for
	    your network card with &man.kldload.8;, or automatically
	    at boot time by adding the appropriate line to the file
	    <filename>/boot/loader.conf</filename>.  Not all NIC
	    drivers are available as modules; notable examples of
	    devices for which modules do not exist are ISA
	    cards.</para>
	</listitem>

	<listitem>
	  <para>Alternatively, you may statically compile the support
	    for your card into your kernel.  Check
	    <filename>/usr/src/sys/conf/NOTES</filename>,
	    <filename>/usr/src/sys/<replaceable>arch</replaceable>/conf/NOTES</filename>
	    and the manual page of the driver to know what to add in
	    your kernel configuration file.  For more information
	    about recompiling your kernel, please see
	    <xref linkend="kernelconfig"/>.  If your card was detected
	    at boot by your kernel (<filename>GENERIC</filename>) you
	    do not have to build a new kernel.</para>
	</listitem>
      </itemizedlist>
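
      <para>As a brief illustration of both methods, the following
	sketch uses the &man.sis.4; driver purely as an example;
	substitute the driver that matches your card.  The module can
	be loaded on a running system with:</para>

      <screen>&prompt.root; <userinput>kldload if_sis</userinput></screen>

      <para>To load it automatically at boot time, a line such as
	this one would go into
	<filename>/boot/loader.conf</filename>:</para>

      <programlisting>if_sis_load="YES"</programlisting>

      <para>To compile the same driver statically into a custom
	kernel instead, the kernel configuration file would contain
	lines like these:</para>

      <programlisting>device		miibus
device		sis</programlisting>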

      <sect3 id="config-network-ndis">
	<title>Using &windows; NDIS Drivers</title>

	<indexterm><primary>NDIS</primary></indexterm>
	<indexterm><primary>NDISulator</primary></indexterm>
	<indexterm><primary>&windows; drivers</primary></indexterm>
	<indexterm><primary>Microsoft Windows</primary></indexterm>
	<indexterm>
	  <primary>Microsoft Windows</primary>
	  <secondary>device drivers</secondary>
	</indexterm>
	<indexterm>
	  <primary>KLD (kernel loadable object)</primary>
	</indexterm>
<!-- We should probably omit the expanded name, and add a <see> entry
for it.  Whatever is done must also be done to the same indexterm in
linuxemu/chapter.sgml -->

	<para>Unfortunately, there are still many vendors that do not
	  provide schematics for their drivers to the open source
	  community because they regard such information as trade
	  secrets.  Consequently, the developers of &os; and other
	  operating systems are left with two choices: develop the
	  drivers by a long and painstaking process of reverse
	  engineering, or use the existing driver binaries available
	  for the &microsoft.windows; platforms.  Most developers,
	  including those involved with &os;, have taken the latter
	  approach.</para>

	<para>Thanks to the contributions of Bill Paul (wpaul) there
	  is <quote>native</quote> support for the Network Driver
	  Interface Specification (NDIS).  The &os; NDISulator
	  (otherwise known as Project Evil) takes a &windows; driver
	  binary and basically tricks it into thinking it is running
	  on &windows;.  Because the &man.ndis.4; driver is using a
	  &windows; binary, it only runs on &i386; and amd64 systems.
	  PCI, CardBus, PCMCIA (PC-Card), and USB devices are
	  supported.</para>

	<para>To use the NDISulator, three things are needed:</para>

	<orderedlist>
	  <listitem>
	    <para>Kernel sources</para>
	  </listitem>

	  <listitem>
	    <para>&windowsxp; driver binary
	      (<filename>.SYS</filename> extension)</para>
	  </listitem>

	  <listitem>
	    <para>&windowsxp; driver configuration file
	      (<filename>.INF</filename> extension)</para>
	  </listitem>
	</orderedlist>

	<para>Locate the files for your specific card.  Generally,
	  they can be found on the included CDs or at the vendor's
	  website.  In the following examples, we will use
	  <filename>W32DRIVER.SYS</filename> and
	  <filename>W32DRIVER.INF</filename>.</para>

	<para>The driver bit width must match the version of &os;.
	  For &os;/i386, use a &windows; 32-bit driver.  For
	  &os;/amd64, a &windows; 64-bit driver is needed.</para>

	<para>The next step is to compile the driver binary into a
	  loadable kernel module.  As <username>root</username>, use
	  &man.ndisgen.8;:</para>

	<screen>&prompt.root; <userinput>ndisgen <replaceable>/path/to/W32DRIVER.INF</replaceable> <replaceable>/path/to/W32DRIVER.SYS</replaceable></userinput></screen>

	<para>&man.ndisgen.8; is interactive and prompts for any extra
	  information it requires.  A new kernel module is written in
	  the current directory.  Use &man.kldload.8; to load the new
	  module:</para>

	<screen>&prompt.root; <userinput>kldload <replaceable>./W32DRIVER_SYS.ko</replaceable></userinput></screen>

	<para>In addition to the generated kernel module, you must
	  load the <filename>ndis.ko</filename> and
	  <filename>if_ndis.ko</filename> modules.  This should be
	  automatically done when you load any module that depends on
	  &man.ndis.4;.  If you want to load them manually, use the
	  following commands:</para>

	<screen>&prompt.root; <userinput>kldload ndis</userinput>
&prompt.root; <userinput>kldload if_ndis</userinput></screen>

	<para>The first command loads the NDIS miniport driver
	  wrapper; the second loads the actual network
	  interface.</para>

	<para>Now, check &man.dmesg.8; to see if there were any errors
	  loading.  If all went well, you should get output resembling
	  the following:</para>

	<screen>ndis0: &lt;Wireless-G PCI Adapter&gt; mem 0xf4100000-0xf4101fff irq 3 at device 8.0 on pci1
ndis0: NDIS API version: 5.0
ndis0: Ethernet address: 0a:b1:2c:d3:4e:f5
ndis0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
ndis0: 11g rates: 6Mbps 9Mbps 12Mbps 18Mbps 36Mbps 48Mbps 54Mbps</screen>

	<para>From here you can treat the
	  <devicename>ndis0</devicename> device like any other network
	  interface (e.g., <devicename>dc0</devicename>).</para>

	<para>You can configure the system to load the NDIS modules at
	  boot time in the same way as with any other module.  First,
	  copy the generated module,
	  <filename>W32DRIVER_SYS.ko</filename>, to the <filename
	  class="directory">/boot/modules</filename> directory.  Then,
	  add the following line to
	  <filename>/boot/loader.conf</filename>:</para>

	<programlisting>W32DRIVER_SYS_load="YES"</programlisting>
      </sect3>
    </sect2>

    <sect2>
      <title>Configuring the Network Card</title>

      <indexterm>
	<primary>network cards</primary>
	<secondary>configuration</secondary>
      </indexterm>

      <para>Once the right driver is loaded for the network card, the
	card needs to be configured.  As with many other things, the
	network card may have been configured at installation time by
	<application>sysinstall</application>.</para>

      <para>To display the configuration for the network interfaces on
	your system, enter the following command:</para>

      <screen>&prompt.user; <userinput>ifconfig</userinput>
dc0: flags=8843&lt;UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST&gt; metric 0 mtu 1500
        options=80008&lt;VLAN_MTU,LINKSTATE&gt;
        ether 00:a0:cc:da:da:da
        inet 192.168.1.3 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (100baseTX &lt;full-duplex&gt;)
        status: active
dc1: flags=8843&lt;UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST&gt; metric 0 mtu 1500
        options=80008&lt;VLAN_MTU,LINKSTATE&gt;
        ether 00:a0:cc:da:da:db
        inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255
        media: Ethernet 10baseT/UTP
        status: no carrier
plip0: flags=8810&lt;POINTOPOINT,SIMPLEX,MULTICAST&gt; metric 0 mtu 1500
lo0: flags=8049&lt;UP,LOOPBACK,RUNNING,MULTICAST&gt; metric 0 mtu 16384
        options=3&lt;RXCSUM,TXCSUM&gt;
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=3&lt;PERFORMNUD,ACCEPT_RTADV&gt;</screen>

      <para>In this example, the following devices were
	displayed:</para>

      <itemizedlist>
	<listitem>
	  <para><devicename>dc0</devicename>: The first Ethernet
	    interface</para>
	</listitem>

	<listitem>
	  <para><devicename>dc1</devicename>: The second Ethernet
	    interface</para>
	</listitem>

	<listitem>
	  <para><devicename>plip0</devicename>: The parallel port
	    interface (if a parallel port is present on the
	    machine)</para>
	</listitem>

	<listitem>
	  <para><devicename>lo0</devicename>: The loopback
	    device</para>
	</listitem>
      </itemizedlist>

      <para>&os; names a network card using the driver name followed
	by a number reflecting the order in which the card is detected
	at boot.  For example, <devicename>sis2</devicename> would
	be the third network card on the system using the &man.sis.4;
	driver.</para>

      <para>In this example, the <devicename>dc0</devicename> device
	is up and running.  The key indicators are:</para>

      <orderedlist>
	<listitem>
	  <para><literal>UP</literal> means that the card is
	    configured and ready.</para>
	</listitem>

	<listitem>
	  <para>The card has an Internet (<literal>inet</literal>)
	    address (in this case
	    <hostid role="ipaddr">192.168.1.3</hostid>).</para>
	</listitem>

	<listitem>
	  <para>It has a valid subnet mask
	    (<literal>netmask</literal>;
	    <hostid role="netmask">0xffffff00</hostid> is the same as
	    <hostid role="netmask">255.255.255.0</hostid>).</para>
	</listitem>

	<listitem>
	  <para>It has a valid broadcast address (in this case,
	    <hostid role="ipaddr">192.168.1.255</hostid>).</para>
	</listitem>

	<listitem>
	  <para>The MAC address of the card (<literal>ether</literal>)
	    is <hostid role="mac">00:a0:cc:da:da:da</hostid>.</para>
	</listitem>

	<listitem>
	  <para>The physical media selection is in autoselection mode
	    (<literal>media: Ethernet autoselect (100baseTX
	      &lt;full-duplex&gt;)</literal>).  We see that
	    <devicename>dc1</devicename> was configured to run with
	    <literal>10baseT/UTP</literal> media.  For more
	    information on available media types for a driver, please
	    refer to its manual page.</para>
	</listitem>

	<listitem>
	  <para>The status of the link (<literal>status</literal>) is
	    <literal>active</literal>, i.e., the carrier is detected.
	    For <devicename>dc1</devicename>, we see
	    <literal>status: no carrier</literal>.  This is normal
	    when an Ethernet cable is not plugged into the
	    card.</para>
	</listitem>
      </orderedlist>
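
      <para>The hexadecimal netmask shown by &man.ifconfig.8; maps to
	dotted-decimal notation one octet at a time: each pair of
	hexadecimal digits is one octet, so <literal>ff</literal> is
	255 and <literal>00</literal> is 0.  As a small sketch, the
	conversion can be checked from the shell with
	&man.printf.1;:</para>

      <screen>&prompt.user; <userinput>printf '%d.%d.%d.%d\n' 0xff 0xff 0xff 0x00</userinput>
255.255.255.0</screen>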

      <para>If the &man.ifconfig.8; output had shown something similar
	to:</para>

      <screen>dc0: flags=8802&lt;BROADCAST,SIMPLEX,MULTICAST&gt; metric 0 mtu 1500
        options=80008&lt;VLAN_MTU,LINKSTATE&gt;
        ether 00:a0:cc:da:da:da
        media: Ethernet autoselect (100baseTX &lt;full-duplex&gt;)
        status: active</screen>

      <para>it would indicate the card has not been configured.</para>

      <para>To configure your card, you need <username>root</username>
	privileges.  The network card can be configured from the
	command line with &man.ifconfig.8;, but such settings are lost
	when the system reboots.  To make the configuration
	persistent, add it to
	<filename>/etc/rc.conf</filename>.</para>
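
      <para>For reference, a temporary configuration from the command
	line, which does not survive a reboot, might look like the
	following.  The interface name and addresses are simply those
	of the example output above:</para>

      <screen>&prompt.root; <userinput>ifconfig dc0 inet 192.168.1.3 netmask 255.255.255.0</userinput></screen>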

      <para>Open <filename>/etc/rc.conf</filename> in your favorite
	editor and add a line for each network card present on the
	system.  In this example, the following lines were
	added:</para>

      <programlisting>ifconfig_dc0="inet 192.168.1.3 netmask 255.255.255.0"
ifconfig_dc1="inet 10.0.0.1 netmask 255.255.255.0 media 10baseT/UTP"</programlisting>

      <para>Replace <devicename>dc0</devicename>,
	<devicename>dc1</devicename>, and so on, with the correct
	devices for your cards, and the addresses with the proper
	ones.  Read the manual pages for the card driver and
	&man.ifconfig.8; for more details about the allowed options,
	and &man.rc.conf.5; for more information on the syntax of
	<filename>/etc/rc.conf</filename>.</para>

      <para>If you configured the network during installation, some
	lines about the network card(s) may be already present.
	Double check <filename>/etc/rc.conf</filename> before adding
	any lines.</para>

      <para>You will also have to edit the file
	<filename>/etc/hosts</filename> to add the names and the IP
	addresses of various machines of the LAN, if they are not
	already there.  For more information please refer to
	&man.hosts.5; and to
	<filename>/usr/share/examples/etc/hosts</filename>.</para>
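
      <para>As a minimal sketch, entries for the two machines used in
	this section could look like the following.  The host and
	domain names are hypothetical and chosen only for
	illustration; each line gives an IP address followed by the
	fully qualified name and any short aliases:</para>

      <programlisting>127.0.0.1	localhost localhost.example.org
192.168.1.3	myhost.example.org myhost
192.168.1.2	otherhost.example.org otherhost</programlisting>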

      <note>
	<para>If access to the Internet is planned with the machine,
	  you also have to manually set up the default gateway and the
	  nameserver:</para>

	<screen>&prompt.root; <userinput>echo 'defaultrouter="<replaceable>your_default_router</replaceable>"' &gt;&gt; /etc/rc.conf</userinput>
&prompt.root; <userinput>echo 'nameserver <replaceable>your_DNS_server</replaceable>' &gt;&gt; /etc/resolv.conf</userinput></screen>
      </note>
    </sect2>

    <sect2>
      <title>Testing and Troubleshooting</title>

      <para>Once you have made the necessary changes in
	<filename>/etc/rc.conf</filename>, you should reboot your
	system.  This will allow the change(s) to the interface(s) to
	be applied, and verify that the system restarts without any
	configuration errors.  Alternatively you can just relaunch the
	networking system:</para>

      <screen>&prompt.root; <userinput>/etc/rc.d/netif restart</userinput></screen>

      <note>
	<para>If a default gateway has been set in
	  <filename>/etc/rc.conf</filename>, also run this
	  command:</para>

	<screen>&prompt.root; <userinput>/etc/rc.d/routing restart</userinput></screen>
      </note>

      <para>Once the networking system has been relaunched, you should
	test the network interfaces.</para>

      <sect3>
	<title>Testing the Ethernet Card</title>

	<indexterm>
	  <primary>network cards</primary>
	  <secondary>testing</secondary>
	</indexterm>

	<para>To verify that an Ethernet card is configured correctly,
	  you have to try two things.  First, ping the interface
	  itself, and then ping another machine on the LAN.</para>

	<para>First test the local interface:</para>

	<screen>&prompt.user; <userinput>ping -c5 192.168.1.3</userinput>
PING 192.168.1.3 (192.168.1.3): 56 data bytes
64 bytes from 192.168.1.3: icmp_seq=0 ttl=64 time=0.082 ms
64 bytes from 192.168.1.3: icmp_seq=1 ttl=64 time=0.074 ms
64 bytes from 192.168.1.3: icmp_seq=2 ttl=64 time=0.076 ms
64 bytes from 192.168.1.3: icmp_seq=3 ttl=64 time=0.108 ms
64 bytes from 192.168.1.3: icmp_seq=4 ttl=64 time=0.076 ms

--- 192.168.1.3 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.074/0.083/0.108/0.013 ms</screen>

	<para>Now we have to ping another machine on the LAN:</para>

	<screen>&prompt.user; <userinput>ping -c5 192.168.1.2</userinput>
PING 192.168.1.2 (192.168.1.2): 56 data bytes
64 bytes from 192.168.1.2: icmp_seq=0 ttl=64 time=0.726 ms
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.766 ms
64 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.700 ms
64 bytes from 192.168.1.2: icmp_seq=3 ttl=64 time=0.747 ms
64 bytes from 192.168.1.2: icmp_seq=4 ttl=64 time=0.704 ms

--- 192.168.1.2 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.700/0.729/0.766/0.025 ms</screen>

	<para>You could also use the machine name instead of
	  <hostid role="ipaddr">192.168.1.2</hostid> if you have set
	  up the <filename>/etc/hosts</filename> file.</para>
      </sect3>

      <sect3>
	<title>Troubleshooting</title>

	<indexterm>
	  <primary>network cards</primary>
	  <secondary>troubleshooting</secondary>
	</indexterm>

	<para>Troubleshooting hardware and software configurations is
	  always a pain, but one which can be alleviated by checking
	  the simple things first.  Is your network cable plugged in?
	  Have you properly configured the network services?  Did you
	  configure the firewall correctly?  Is the card you are using
	  supported by &os;?  Always check the hardware notes before
	  sending off a bug report.  Update your version of &os; to
	  the latest STABLE version.  Check the mailing list archives,
	  or perhaps search the Internet.</para>

	<para>If the card works, yet performance is poor, it would be
	  worthwhile to read over the &man.tuning.7; manual page.  You
	  can also check the network configuration as incorrect
	  network settings can cause slow connections.</para>

	<para>Some users experience one or two
	  <errorname>device timeout</errorname> messages, which is
	  normal for some cards.  If they continue, or are bothersome,
	  you may wish to be sure the device is not conflicting with
	  another device.  Double check the cable connections.
	  Perhaps you may just need to get another card.</para>

	<para>At times, users see a few
	  <errorname>watchdog timeout</errorname> errors.  The first
	  thing to do here is to check your network cable.  Many cards
	  require a PCI slot which supports Bus Mastering.  On some
	  old motherboards, only one PCI slot allows it (usually slot
	  0).  Check the network card and the motherboard
	  documentation to determine if that may be the
	  problem.</para>

	<para><errorname>No route to host</errorname> messages occur
	  if the system is unable to route a packet to the destination
	  host.  This can happen if no default route is specified, or
	  if a cable is unplugged.  Check the output of
	  <command>netstat -rn</command> and make sure there is a
	  valid route to the host you are trying to reach.  If there
	  is not, read on to
	  <xref linkend="advanced-networking"/>.</para>
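
	<para>As a hedged illustration, assuming the gateway for the
	  example network is
	  <hostid role="ipaddr">192.168.1.1</hostid> (an address
	  chosen only for this sketch), a missing default route could
	  be added for the current session and then verified as
	  follows.  Setting <literal>defaultrouter</literal> in
	  <filename>/etc/rc.conf</filename> makes the route
	  persistent:</para>

	<screen>&prompt.root; <userinput>route add default <replaceable>192.168.1.1</replaceable></userinput>
&prompt.user; <userinput>netstat -rn</userinput></screen>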

	<para><errorname>ping: sendto: Permission denied</errorname>
	  error messages are often caused by a misconfigured firewall.
	  If <command>ipfw</command> is enabled in the kernel but no
	  rules have been defined, then the default policy is to deny
	  all traffic, even ping requests!  Read on to
	  <xref linkend="firewalls"/> for more information.</para>
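
	<para>To see whether this is the cause, the active rule set
	  can be listed.  If the only rule shown is the final default
	  rule denying all traffic, no other rules have been
	  loaded:</para>

	<screen>&prompt.root; <userinput>ipfw list</userinput></screen>

	<para>One possible way to rule the firewall out while testing,
	  sketched here on the assumption that the machine sits on a
	  trusted network, is to load the permissive
	  <literal>open</literal> rule set from
	  <filename>/etc/rc.conf</filename>.  This is not a
	  recommended production setting:</para>

	<programlisting>firewall_enable="YES"
firewall_type="open"</programlisting>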

	<para>Sometimes performance of the card is poor, or below
	  average.  In these cases it is best to set the media
	  selection mode from <literal>autoselect</literal> to the
	  correct media selection.  While this usually works for most
	  hardware, it may not resolve this issue for everyone.
	  Again, check all the network settings, and read over the
	  &man.tuning.7; manual page.</para>
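
	<para>A sketch of what this might look like for the
	  <devicename>dc0</devicename> interface used earlier follows.
	  The media type is an assumption; the first command lists the
	  types the card actually supports, and the second selects one
	  of them:</para>

	<screen>&prompt.root; <userinput>ifconfig -m dc0</userinput>
&prompt.root; <userinput>ifconfig dc0 media 100baseTX mediaopt full-duplex</userinput></screen>

	<para>To keep the setting across reboots, the same options can
	  be appended to the interface line in
	  <filename>/etc/rc.conf</filename>:</para>

	<programlisting>ifconfig_dc0="inet 192.168.1.3 netmask 255.255.255.0 media 100baseTX mediaopt full-duplex"</programlisting>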
      </sect3>
    </sect2>
  </sect1>

  <sect1 id="configtuning-virtual-hosts">
    <title>Virtual Hosts</title>

    <indexterm><primary>virtual hosts</primary></indexterm>
    <indexterm><primary>IP aliases</primary></indexterm>

    <para>A very common use of &os; is virtual site hosting, where one
      server appears to the network as many servers.  This is achieved
      by assigning multiple network addresses to a single
      interface.</para>

    <para>A given network interface has one <quote>real</quote>
      address, and may have any number of <quote>alias</quote>
      addresses.  These aliases are normally added by placing alias
      entries in <filename>/etc/rc.conf</filename>.</para>

    <para>An alias entry for the interface
      <devicename>fxp0</devicename> looks like:</para>

    <programlisting>ifconfig_fxp0_alias0="inet xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx"</programlisting>

    <para>Note that alias entries must start with
      <literal>_alias0</literal> and proceed upwards in order (for
      example, <literal>_alias1</literal>, <literal>_alias2</literal>,
      and so on).  The configuration process will stop at the first
      missing number.</para>

    <para>The calculation of alias netmasks is important, but
      fortunately quite simple.  For a given interface, there must be
      one address which correctly represents the network's netmask.
      Any other addresses which fall within this network must have a
      netmask of all <literal>1</literal>s (expressed as either
      <hostid role="netmask">255.255.255.255</hostid> or
      <hostid role="netmask">0xffffffff</hostid>).</para>

    <para>For example, consider the case where the
      <devicename>fxp0</devicename> interface is connected to two
      networks, the <hostid role="ipaddr">10.1.1.0</hostid> network
      with a netmask of <hostid role="netmask">255.255.255.0</hostid>
      and the <hostid role="ipaddr">202.0.75.16</hostid> network with
      a netmask of <hostid role="netmask">255.255.255.240</hostid>.
      We want the system to appear at
      <hostid role="ipaddr">10.1.1.1</hostid> through
      <hostid role="ipaddr">10.1.1.5</hostid> and at
      <hostid role="ipaddr">202.0.75.17</hostid> through
      <hostid role="ipaddr">202.0.75.20</hostid>.  As noted above,
      only the first address in a given network range (in this case,
      <hostid role="ipaddr">10.1.1.1</hostid> and
      <hostid role="ipaddr">202.0.75.17</hostid>) should have a real
      netmask; all the rest (<hostid role="ipaddr">10.1.1.2</hostid>
      through <hostid role="ipaddr">10.1.1.5</hostid> and
      <hostid role="ipaddr">202.0.75.18</hostid> through
      <hostid role="ipaddr">202.0.75.20</hostid>) must be configured
      with a netmask of
      <hostid role="netmask">255.255.255.255</hostid>.</para>

    <para>The following <filename>/etc/rc.conf</filename> entries
      configure the adapter correctly for this arrangement:</para>

    <programlisting>ifconfig_fxp0="inet 10.1.1.1 netmask 255.255.255.0"
ifconfig_fxp0_alias0="inet 10.1.1.2 netmask 255.255.255.255"
ifconfig_fxp0_alias1="inet 10.1.1.3 netmask 255.255.255.255"
ifconfig_fxp0_alias2="inet 10.1.1.4 netmask 255.255.255.255"
ifconfig_fxp0_alias3="inet 10.1.1.5 netmask 255.255.255.255"
ifconfig_fxp0_alias4="inet 202.0.75.17 netmask 255.255.255.240"
ifconfig_fxp0_alias5="inet 202.0.75.18 netmask 255.255.255.255"
ifconfig_fxp0_alias6="inet 202.0.75.19 netmask 255.255.255.255"
ifconfig_fxp0_alias7="inet 202.0.75.20 netmask 255.255.255.255"</programlisting>
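
    <para>An alias can also be added to, or removed from, a running
      system without editing <filename>/etc/rc.conf</filename>.  The
      following sketch reuses one of the addresses from the example
      above; a change made this way does not persist across a
      reboot:</para>

    <screen>&prompt.root; <userinput>ifconfig fxp0 inet 10.1.1.2 netmask 255.255.255.255 alias</userinput>
&prompt.root; <userinput>ifconfig fxp0 inet 10.1.1.2 -alias</userinput></screen>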

  </sect1>

  <sect1 id="configtuning-syslog">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Niclas</firstname>
	  <surname>Zeising</surname>
	  <contrib>Contributed by </contrib>
	</author>
      </authorgroup>
    </sect1info>

    <title>Configuring the system logger
      <application>syslogd</application></title>

    <indexterm><primary>system logging</primary></indexterm>
    <indexterm><primary>syslog</primary></indexterm>
    <indexterm><primary>syslogd</primary></indexterm>

    <para>System logging is an important aspect of system
      administration.  It is used to detect both hardware and software
      issues and errors in the system.  It also plays a very
      important role in security auditing and incident response.
      System daemons without a controlling terminal also usually log
      information to a system logging facility or other log
      file.</para>

    <para>This section will describe how to configure and use the &os;
      system logger, &man.syslogd.8;, as well as discuss log rotation
      and log management using &man.newsyslog.8;.  Focus
      will be on setting up and using <command>syslogd</command> on
      a local machine.  For more advanced setups using a separate
      loghost, see <xref linkend="network-syslogd"/>.</para>

    <sect2>
      <title>Using <application>syslogd</application></title>

      <para>In the default &os; configuration &man.syslogd.8; is
	started at boot.  This is controlled by the variable
	<literal>syslogd_enable</literal> in
	<filename>/etc/rc.conf</filename>.  There are numerous
	application arguments that affect the behavior of
	&man.syslogd.8;.  To change them, use
	<literal>syslogd_flags</literal> in
	<filename>/etc/rc.conf</filename>.  Refer to &man.syslogd.8;
	for more information on the arguments, and &man.rc.conf.5;,
	<xref linkend="configtuning-core-configuration"/> and <xref
	linkend="configtuning-rcd"/> for more information about
	<filename>/etc/rc.conf</filename> and the &man.rc.8;
	subsystem.</para>
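
      <para>As a minimal example, the following
	<filename>/etc/rc.conf</filename> lines enable
	&man.syslogd.8; and run it in secure mode
	(<option>-s</option>), in which messages from remote machines
	are not logged.  Specifying the flag twice
	(<option>-ss</option>) prevents a network socket from being
	opened at all:</para>

      <programlisting>syslogd_enable="YES"
syslogd_flags="-s"</programlisting>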
    </sect2>

    <sect2>
      <title>Configuring <application>syslogd</application></title>

      <indexterm><primary>syslog.conf</primary></indexterm>

      <para>The configuration file, by default
	<filename>/etc/syslog.conf</filename>, controls what
	&man.syslogd.8; does with the log entries once they are
	received.  There are several parameters to control the
	handling of incoming events, of which the most basic are
	<firstterm>facility</firstterm> and
	<firstterm>level</firstterm>.  The facility describes
	which subsystem generated the message, such as the kernel or a
	daemon, and the level describes the severity of the event that
	occurred.  This makes it possible to log the message to
	different log files, or discard it, depending on the facility
	and level.  It is also possible to take action depending on
	the application that sent the message, and in the case of
	remote logging, also the hostname of the machine generating
	the logging event.</para>

      <para>Configuring &man.syslogd.8; is quite straightforward.
	The configuration file contains one line per action,
	and the syntax for each line is a selector field followed by
	an action field.  The syntax of the selector field is
	<replaceable>facility.level</replaceable> which will match
	log messages from <replaceable>facility</replaceable> at level
	<replaceable>level</replaceable> or higher.  It is also
	possible to add an optional comparison flag before the level
	to specify more precisely what is logged. Multiple
	selector fields can be used for the same action, and are
	separated with a semicolon (<literal>;</literal>).  Using
	<literal>*</literal> will match everything.
	The action field denotes where to send the log message,
	such as a file or a remote log host.  As an example, here is
	the default <filename>syslog.conf</filename> from &os;:</para>

      <programlisting># &dollar;&os;&dollar;
#
#       Spaces ARE valid field separators in this file. However,
#       other *nix-like systems still insist on using tabs as field
#       separators. If you are sharing this file between systems, you
#       may want to use only tabs as field separators here.
#       Consult the &man.syslog.conf.5; manpage.
*.err;kern.warning;auth.notice;mail.crit                /dev/console <co id="co-syslog-many-match"/>
*.notice;authpriv.none;kern.debug;lpr.info;mail.crit;news.err   /var/log/messages
security.*                                      /var/log/security
auth.info;authpriv.info                         /var/log/auth.log
mail.info                                       /var/log/maillog <co id="co-syslog-one-match"/>
lpr.info                                        /var/log/lpd-errs
ftp.info                                        /var/log/xferlog
cron.*                                          /var/log/cron
*.=debug                                        /var/log/debug.log <co id="co-syslog-comparison"/>
*.emerg                                         *
# uncomment this to log all writes to /dev/console to /var/log/console.log
#console.info                                   /var/log/console.log
# uncomment this to enable logging of all log messages to /var/log/all.log
# touch /var/log/all.log and chmod it to mode 600 before it will work
#*.*                                            /var/log/all.log
# uncomment this to enable logging to a remote loghost named loghost
#*.*                                            @loghost
# uncomment these if you're running inn
# news.crit                                     /var/log/news/news.crit
# news.err                                      /var/log/news/news.err
# news.notice                                   /var/log/news/news.notice
!ppp <co id="co-syslog-prog-spec"/>
*.*                                             /var/log/ppp.log
!*</programlisting>

      <calloutlist>
	<callout arearefs="co-syslog-many-match">
	  <para>Match all messages with a level of
	    <literal>err</literal> or higher, as well as
	    <literal>kern.warning</literal>,
	    <literal>auth.notice</literal> and
	    <literal>mail.crit</literal>, and send these log messages
	    to the console (<filename>/dev/console</filename>).</para>
	</callout>

	<callout arearefs="co-syslog-one-match">
	  <para>Match all messages from the <literal>mail</literal>
	    facility at level <literal>info</literal> or above, and
	    log the messages to
	    <filename>/var/log/maillog</filename>.</para>
	</callout>

	<callout arearefs="co-syslog-comparison">
	  <para>This line uses a comparison flag, <literal>=</literal>
	    to only match messages at level <literal>debug</literal>,
	    and log them in
	    <filename>/var/log/debug.log</filename>.</para>
	</callout>

	<callout arearefs="co-syslog-prog-spec">
	  <para>Here is an example usage of a
	    <emphasis>program specification</emphasis>.  It makes the
	    rules that follow apply only to the named program.  In
	    this case, this line and the one after it make all
	    messages from <command>ppp</command>, but from no other
	    program, end up in
	    <filename>/var/log/ppp.log</filename>.</para>
	</callout>
      </calloutlist>

      <para>This example shows that there are plenty of levels and
	subsystems.  The levels are, in order from most to least
	critical: <literal>emerg</literal>, <literal>alert</literal>,
	<literal>crit</literal>, <literal>err</literal>,
	<literal>warning</literal>, <literal>notice</literal>,
	<literal>info</literal> and <literal>debug</literal>.</para>

      <para>The facilities are, in no particular order:
	<literal>auth</literal>, <literal>authpriv</literal>,
	<literal>console</literal>, <literal>cron</literal>,
	<literal>daemon</literal>, <literal>ftp</literal>,
	<literal>kern</literal>, <literal>lpr</literal>,
	<literal>mail</literal>, <literal>mark</literal>,
	<literal>news</literal>, <literal>security</literal>,
	<literal>syslog</literal>, <literal>user</literal>,
	<literal>uucp</literal> and <literal>local0</literal> through
	<literal>local7</literal>.  Be aware that other operating
	systems might have different facilities.</para>

      <para>With this knowledge it is easy to add a new line to
	<filename>/etc/syslog.conf</filename> to log everything from
	the different daemons on level <literal>notice</literal> and
	higher to <filename>/var/log/daemon.log</filename>.  Just add
	the following:</para>

      <programlisting>daemon.notice                                        /var/log/daemon.log</programlisting>
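
      <para>As a minimal sketch, assuming the stock &os; startup
	scripts, the log file can be created and
	<command>syslogd</command> told to re-read its configuration
	as follows.  The sample configuration above notes that a log
	file must exist before it is used; the file mode shown here
	is only an illustration:</para>

      <screen>&prompt.root; <userinput>touch /var/log/daemon.log</userinput>
&prompt.root; <userinput>chmod 600 /var/log/daemon.log</userinput>
&prompt.root; <userinput>/etc/rc.d/syslogd reload</userinput></screen>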

      <para>For more information about the different levels and
	facilities, refer to &man.syslog.3; and &man.syslogd.8;.
	For more information about <filename>syslog.conf</filename>,
	its syntax, and more advanced usage examples, see
	&man.syslog.conf.5; and <xref
	linkend="network-syslogd"/>.</para>
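
      <para>The comparison flag and program specification described
	above can be combined in local rules.  The following fragment
	is only a sketch: <literal>myapp</literal> and both log file
	names are hypothetical, and the files are assumed to exist
	already:</para>

      <programlisting># Hypothetical example: only info-level mail messages, nothing higher
mail.=info                                      /var/log/mail-info.log
# Hypothetical example: everything logged by a program named myapp
!myapp
*.*                                             /var/log/myapp.log
# Reset the program specification so later rules apply to all programs
!*</programlisting>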
    </sect2>

    <sect2>
      <title>Log management and rotation with
	<application>newsyslog</application></title>

      <indexterm><primary>newsyslog</primary></indexterm>
      <indexterm><primary>newsyslog.conf</primary></indexterm>
      <indexterm><primary>log rotation</primary></indexterm>
      <indexterm><primary>log management</primary></indexterm>

      <para>Log files tend to grow quickly and accumulate steadily.
	This leads to the files being full of less immediately useful
	information, as well as filling up the hard drive.  To
	mitigate this, log management comes into play.  In &os;,
	&man.newsyslog.8; is the tool used to manage log files.  This
	program is used to periodically rotate and compress log files,
	as well as optionally create missing log files and signal
	programs when log files are moved.  The log files do not
	necessarily have to come from syslog; &man.newsyslog.8; works
	with any logs written from any program.  It is important to
	note that <command>newsyslog</command> is normally run from
	&man.cron.8; and is not a system daemon.  In the default
	configuration it is run every hour.</para>

      <sect3>
	<title>Configuring
	  <application>newsyslog</application></title>

	<para>To know what actions to take, &man.newsyslog.8; reads
	  its configuration file, by default
	  <filename>/etc/newsyslog.conf</filename>.  This
	  configuration file contains one line for each file that
	  &man.newsyslog.8; manages.  Each line states the file
	  owner, permissions, and when to rotate that file, as well as
	  optional flags that affect the log rotation (such as
	  compression) and programs to signal when the log is
	  rotated.  As an example, here is the default configuration
	  in &os;:</para>

        <programlisting># configuration file for newsyslog
# &dollar;&os;&dollar;
#
# Entries which do not specify the '/pid_file' field will cause the
# syslogd process to be signalled when that log file is rotated.  This
# action is only appropriate for log files which are written to by the
# syslogd process (ie, files listed in /etc/syslog.conf).  If there
# is no process which needs to be signalled when a given log file is
# rotated, then the entry for that file should include the 'N' flag.
#
# The 'flags' field is one or more of the letters: BCDGJNUXZ or a '-'.
#
# Note: some sites will want to select more restrictive protections than the
# defaults.  In particular, it may be desirable to switch many of the 644
# entries to 640 or 600.  For example, some sites will consider the
# contents of maillog, messages, and lpd-errs to be confidential.  In the
# future, these defaults may change to more conservative ones.
#
# logfilename          [owner:group]    mode count size when  flags [/pid_file] [sig_num]
/var/log/all.log                        600  7     *    @T00  J
/var/log/amd.log                        644  7     100  *     J
/var/log/auth.log                       600  7     100  @0101T JC
/var/log/console.log                    600  5     100  *     J
/var/log/cron                           600  3     100  *     JC
/var/log/daily.log                      640  7     *    @T00  JN
/var/log/debug.log                      600  7     100  *     JC
/var/log/init.log                       644  3     100  *     J
/var/log/kerberos.log                   600  7     100  *     J
/var/log/lpd-errs                       644  7     100  *     JC
/var/log/maillog                        640  7     *    @T00  JC
/var/log/messages                       644  5     100  @0101T JC
/var/log/monthly.log                    640  12    *    $M1D0 JN
/var/log/pflog                          600  3     100  *     JB    /var/run/pflogd.pid
/var/log/ppp.log        root:network    640  3     100  *     JC
/var/log/security                       600  10    100  *     JC
/var/log/sendmail.st                    640  10    *    168   B
/var/log/utx.log                        644  3     *    @01T05 B
/var/log/weekly.log                     640  5     1    $W6D0 JN
/var/log/xferlog                        600  7     100  *     JC</programlisting>

	<para>Each line starts with the name of the file to be
	  rotated, optionally followed by an owner
	  and group for both rotated and newly created files.
	  The next field, <literal>mode</literal>, is the mode of the
	  files and <literal>count</literal> denotes how many rotated
	  log files should be kept.  The <literal>size</literal> and
	  <literal>when</literal> fields tell
	  <command>newsyslog</command> when to rotate the file.
	  A log file is rotated when either its size is larger than
	  the <literal>size</literal> field, or when the time in the
	  <literal>when</literal> field has passed.
	  <literal>*</literal> means that this field is ignored.  The
	  <literal>flags</literal> field gives
	  &man.newsyslog.8; further instructions, such as
	  how to compress the rotated file, or to create the log file
	  if it is missing.  The last two fields are optional, and
	  specify the <acronym
	  role="Process Identifier">PID</acronym>-file of a
	  process and a signal number to send to that process
	  when the file is rotated.  For more information on all
	  fields, valid flags and how to specify the rotation time,
	  refer to &man.newsyslog.conf.5;.  Remember that
	  <command>newsyslog</command> is run from
	  <command>cron</command> and cannot rotate files more
	  often than it is run from &man.cron.8;.</para>
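
	<para>As an illustration of the fields described above, a
	  site-specific entry might look like the following sketch.
	  The log file, PID file, and chosen values are hypothetical.
	  Per &man.newsyslog.conf.5;, the size is given in kilobytes
	  and, when no signal number is listed,
	  <literal>SIGHUP</literal> is sent:</para>

	<programlisting># logfilename          [owner:group]    mode count size when  flags [/pid_file] [sig_num]
# Hypothetical: keep 7 bzip2-compressed copies, rotate at 1000 KB or
# every night at midnight, create the file if missing, and signal the
# process whose PID is stored in /var/run/myapp.pid.
/var/log/myapp.log      root:wheel      640  7     1000 @T00  JC    /var/run/myapp.pid</programlisting>

	<para>Running <command>newsyslog -nv</command> prints what
	  would be done without actually rotating anything, which is
	  a convenient way to check a new entry.</para>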
      </sect3>
    </sect2>
  </sect1>

  <sect1 id="configtuning-configfiles">
    <title>Configuration Files</title>

    <sect2>
      <title><filename class="directory">/etc</filename>
	Layout</title>

      <para>There are a number of directories in which configuration
	information is kept.  These include:</para>

      <informaltable frame="none" pgwide="1">
	<tgroup cols="2">
	  <colspec colwidth="1*"/>
	  <colspec colwidth="2*"/>

	  <tbody>
	    <row>
	      <entry><filename
		  class="directory">/etc</filename></entry>
	      <entry>Generic system configuration information; data
		here is system-specific.</entry>
	    </row>

	    <row>
	      <entry><filename
		  class="directory">/etc/defaults</filename></entry>
	      <entry>Default versions of system configuration
		files.</entry>
	    </row>

	    <row>
	      <entry><filename
		  class="directory">/etc/mail</filename></entry>
	      <entry>Extra &man.sendmail.8; configuration, other
		MTA configuration files.</entry>
	    </row>

	    <row>
	      <entry><filename
		  class="directory">/etc/ppp</filename></entry>
	      <entry>Configuration for both user- and kernel-ppp
		programs.</entry>
	    </row>

	    <row>
	      <entry><filename
		  class="directory">/etc/namedb</filename></entry>
	      <entry>Default location for &man.named.8; data.
		Normally <filename>named.conf</filename> and zone
		files are stored here.</entry>
	    </row>

	    <row>
	      <entry><filename
		  class="directory">/usr/local/etc</filename></entry>
	      <entry>Configuration files for installed applications.
		May contain per-application subdirectories.</entry>
	    </row>

	    <row>
	      <entry><filename
		  class="directory">/usr/local/etc/rc.d</filename></entry>
	      <entry>Start/stop scripts for installed
		applications.</entry>
	    </row>

	    <row>
	      <entry><filename
		  class="directory">/var/db</filename></entry>
	      <entry>Automatically generated system-specific database
		files, such as the package database, the locate
		database, and so on.</entry>
	    </row>
	  </tbody>
	</tgroup>
      </informaltable>
    </sect2>

    <sect2>
      <title>Hostnames</title>

      <indexterm><primary>hostname</primary></indexterm>
      <indexterm><primary>DNS</primary></indexterm>

      <sect3>
	<title><filename>/etc/resolv.conf</filename></title>

	<indexterm>
	  <primary><filename>resolv.conf</filename></primary>
	</indexterm>

	<para><filename>/etc/resolv.conf</filename> dictates how
	  &os;'s resolver accesses the Internet Domain Name System
	  (DNS).</para>

	<para>The most common entries to
	  <filename>resolv.conf</filename> are:</para>

	<informaltable frame="none" pgwide="1">
	  <tgroup cols="2">
	    <colspec colwidth="1*"/>
	    <colspec colwidth="2*"/>

	    <tbody>
	      <row>
		<entry><literal>nameserver</literal></entry>
		<entry>The IP address of a name server the resolver
		  should query.  The servers are queried in the order
		  listed, and at most three are used.</entry>
	      </row>

	      <row>
		<entry><literal>search</literal></entry>
		<entry>Search list for hostname lookup.  This is
		  normally determined by the domain of the local
		  hostname.</entry>
	      </row>

	      <row>
		<entry><literal>domain</literal></entry>
		<entry>The local domain name.</entry>
	      </row>
	    </tbody>
	  </tgroup>
	</informaltable>

	<para>A typical <filename>resolv.conf</filename>:</para>

	<programlisting>search example.com
nameserver 147.11.1.11
nameserver 147.11.100.30</programlisting>

	<note>
	  <para>Only one of the <literal>search</literal> and
	    <literal>domain</literal> options should be used.  If
	    both are present, the last one listed takes effect.</para>
	</note>

	<para>If you are using DHCP, &man.dhclient.8; usually rewrites
	  <filename>resolv.conf</filename> with information received
	  from the DHCP server.</para>
      </sect3>

      <sect3>
	<title><filename>/etc/hosts</filename></title>

	<indexterm><primary>hosts</primary></indexterm>

	<para><filename>/etc/hosts</filename> is a simple text
	  database reminiscent of the old Internet.  It works in
	  conjunction with DNS and NIS providing name to IP address
	  mappings.  Local computers connected via a LAN can be placed
	  in here for simple naming purposes instead of setting up
	  a &man.named.8; server.  Additionally,
	  <filename>/etc/hosts</filename> can be used to provide a
	  local record of Internet names, reducing the need to query
	  externally for commonly accessed names.</para>

	<programlisting># &dollar;&os;&dollar;
#
#
# Host Database
#
# This file should contain the addresses and aliases for local hosts that
# share this file.  Replace 'my.domain' below with the domainname of your
# machine.
#
# In the presence of the domain name service or NIS, this file may
# not be consulted at all; see /etc/nsswitch.conf for the resolution order.
#
#
::1			localhost localhost.my.domain
127.0.0.1		localhost localhost.my.domain
#
# Imaginary network.
#10.0.0.2		myname.my.domain myname
#10.0.0.3		myfriend.my.domain myfriend
#
# According to RFC 1918, you can use the following IP networks for
# private nets which will never be connected to the Internet:
#
#	10.0.0.0	-   10.255.255.255
#	172.16.0.0	-   172.31.255.255
#	192.168.0.0	-   192.168.255.255
#
# In case you want to be able to connect to the Internet, you need
# real official assigned numbers.  Do not try to invent your own network
# numbers but instead get one from your network provider (if any) or
# from your regional registry (ARIN, APNIC, LACNIC, RIPE NCC, or AfriNIC.)
#</programlisting>

	<para><filename>/etc/hosts</filename> takes on the simple
	  format of:</para>

	<programlisting>[Internet address] [official hostname] [alias1] [alias2] ...</programlisting>

	<para>For example:</para>

	<programlisting>10.0.0.1 myRealHostname.example.com myRealHostname foobar1 foobar2</programlisting>

	<para>Consult &man.hosts.5; for more information.</para>
      </sect3>
    </sect2>

    <sect2 id="configtuning-sysctlconf">
      <title><filename>sysctl.conf</filename></title>

      <indexterm><primary>sysctl.conf</primary></indexterm>
      <indexterm><primary>sysctl</primary></indexterm>

      <para><filename>sysctl.conf</filename> looks much like
	<filename>rc.conf</filename>.  Values are set in a
	<literal>variable=value</literal> form.  The specified values
	are set after the system goes into multi-user mode.  Not all
	variables are settable in this mode.</para>

      <para>To turn off logging of fatal signal exits and prevent
	users from seeing processes started by other users, the
	following tunables can be set in
	<filename>sysctl.conf</filename>:</para>

      <programlisting># Do not log fatal signal exits (e.g., sig 11)
kern.logsigexit=0

# Prevent users from seeing information about processes that
# are being run under another UID.
security.bsd.see_other_uids=0</programlisting>

    </sect2>
  </sect1>

  <sect1 id="configtuning-sysctl">
    <title>Tuning with &man.sysctl.8;</title>

    <indexterm><primary>sysctl</primary></indexterm>
    <indexterm>
      <primary>tuning</primary>
      <secondary>with sysctl</secondary>
    </indexterm>

    <para>&man.sysctl.8; is an interface that allows you to make
      changes to a running &os; system.  This includes many advanced
      options of the TCP/IP stack and virtual memory system that can
      dramatically improve performance for an experienced system
      administrator.  Over five hundred system variables can be read
      and set using &man.sysctl.8;.</para>

    <para>At its core, &man.sysctl.8; serves two functions: to read
      and to modify system settings.</para>

    <para>To view all readable variables:</para>

    <screen>&prompt.user; <userinput>sysctl -a</userinput></screen>

    <para>To read a particular variable, for example,
      <varname>kern.maxproc</varname>:</para>

    <screen>&prompt.user; <userinput>sysctl kern.maxproc</userinput>
kern.maxproc: 1044</screen>

    <para>To set a particular variable, use the intuitive
      <replaceable>variable</replaceable>=<replaceable>value</replaceable>
      syntax:</para>

    <screen>&prompt.root; <userinput>sysctl kern.maxfiles=5000</userinput>
kern.maxfiles: 2088 -&gt; 5000</screen>

    <para>Settings of sysctl variables are usually either strings,
      numbers, or booleans (a boolean being <literal>1</literal> for
      yes or a <literal>0</literal> for no).</para>

    <para>To set variables automatically each time the machine
      boots, add them to <filename>/etc/sysctl.conf</filename>.
      For more information, see &man.sysctl.conf.5; and
      <xref linkend="configtuning-sysctlconf"/>.</para>
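
    <para>For example, to make the <varname>kern.maxfiles</varname>
      change shown above persistent, a line such as the following
      could be added to <filename>/etc/sysctl.conf</filename>.  The
      value is illustrative only:</para>

    <programlisting># Illustrative value; choose one appropriate for the system
kern.maxfiles=5000</programlisting>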

    <sect2 id="sysctl-readonly">
      <sect2info>
	<authorgroup>
	  <author>
	    <firstname>Tom</firstname>
	    <surname>Rhodes</surname>
	    <contrib>Contributed by </contrib>
	    <!-- 31 January 2003 -->
	  </author>
	</authorgroup>
      </sect2info>
      <title>&man.sysctl.8; Read-only</title>

      <para>In some cases it may be desirable to modify read-only
	&man.sysctl.8; values.  While this is sometimes unavoidable,
	it can only be done on (re)boot.</para>

      <para>For instance, on some laptop models the &man.cardbus.4;
	device will not probe memory ranges and will fail with errors
	similar to:</para>

      <screen>cbb0: Could not map register memory
device_probe_and_attach: cbb0 attach returned 12</screen>

      <para>Cases like the one above usually require the modification
	of some default &man.sysctl.8; settings which are read-only.
	To overcome these situations a user can put
	&man.sysctl.8; <quote>OIDs</quote> in their local
	<filename>/boot/loader.conf</filename>.  Default settings are
	located in the <filename>/boot/defaults/loader.conf</filename>
	file.</para>

      <para>Fixing the problem mentioned above would require a user to
	set <option>hw.pci.allow_unsupported_io_range=1</option> in
	<filename>/boot/loader.conf</filename> and reboot.
	&man.cardbus.4; should then work properly.</para>
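
      <para>The corresponding entry in
	<filename>/boot/loader.conf</filename> would look like the
	following sketch; values in this file are conventionally
	quoted:</para>

      <programlisting># Read-only at run time, so it is set from the boot loader
hw.pci.allow_unsupported_io_range="1"</programlisting>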
    </sect2>
  </sect1>

  <sect1 id="configtuning-disk">
    <title>Tuning Disks</title>

    <sect2>
      <title>Sysctl Variables</title>

      <sect3>
	<title><varname>vfs.vmiodirenable</varname></title>

	<indexterm>
	  <primary><varname>vfs.vmiodirenable</varname></primary>
	</indexterm>

	<para>The <varname>vfs.vmiodirenable</varname> sysctl variable
	  may be set to either 0 (off) or 1 (on); it is 1 by default.
	  This variable controls how directories are cached by the
	  system.  Most directories are small, using just a single
	  fragment (typically 1&nbsp;K) in the file system and less
	  (typically 512&nbsp;bytes) in the buffer cache.  With this
	  variable turned off (to 0), the buffer cache will only cache
	  a fixed number of directories even if you have a huge amount
	  of memory.  When turned on (to 1), this sysctl allows the
	  buffer cache to use the VM Page Cache to cache the
	  directories, making all the memory available for caching
	  directories.  However, the minimum in-core memory used to
	  cache a directory is the physical page size (typically
	  4&nbsp;K) rather than 512&nbsp;bytes.  We recommend keeping
	  this option on if you are running any services which
	  manipulate large numbers of files.  Such services can
	  include web caches, large mail systems, and news systems.
	  Keeping this option on will generally not reduce performance
	  even with the wasted memory but you should experiment to
	  find out.</para>
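
	<para>The current setting can be inspected with
	  &man.sysctl.8; before experimenting.  The output shown
	  here assumes the default value described above:</para>

	<screen>&prompt.user; <userinput>sysctl vfs.vmiodirenable</userinput>
vfs.vmiodirenable: 1</screen>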
      </sect3>

      <sect3>
	<title><varname>vfs.write_behind</varname></title>

	<indexterm>
	  <primary><varname>vfs.write_behind</varname></primary>
	</indexterm>

	<para>The <varname>vfs.write_behind</varname> sysctl variable
	  defaults to <literal>1</literal> (on).  This tells the file
	  system to issue media writes as full clusters are collected,
	  which typically occurs when writing large sequential files.
	  The idea is to avoid saturating the buffer cache with dirty
	  buffers when it would not benefit I/O performance.  However,
	  this may stall processes and under certain circumstances you
	  may wish to turn it off.</para>
      </sect3>

      <sect3>
	<title><varname>vfs.hirunningspace</varname></title>

	<indexterm>
	  <primary><varname>vfs.hirunningspace</varname></primary>
	</indexterm>

	<para>The <varname>vfs.hirunningspace</varname> sysctl
	  variable determines how much outstanding write I/O may be
	  queued to disk controllers system-wide at any given
	  time.  The default is usually sufficient but on machines
	  with lots of disks you may want to bump it up to four or
	  five <emphasis>megabytes</emphasis>.  Note that setting too
	  high a value (exceeding the buffer cache's write threshold)
	  can lead to extremely bad clustering performance.  Do not
	  set this value arbitrarily high!  Higher write values may
	  add latency to reads occurring at the same time.</para>
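
	<para>As a sketch, raising the limit to four megabytes could
	  look like this.  The value is assumed to be expressed in
	  bytes, and the figure is an example rather than a
	  recommendation:</para>

	<screen>&prompt.root; <userinput>sysctl vfs.hirunningspace=4194304</userinput></screen>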

	<para>There are various other buffer-cache and VM page cache
	  related sysctls.  We do not recommend modifying these
	  values; the VM system does an extremely good job of
	  automatically tuning itself.</para>
      </sect3>

      <sect3>
	<title><varname>vm.swap_idle_enabled</varname></title>

	<indexterm>
	  <primary><varname>vm.swap_idle_enabled</varname></primary>
	</indexterm>

	<para>The <varname>vm.swap_idle_enabled</varname> sysctl
	  variable is useful in large multi-user systems where you
	  have lots of users entering and leaving the system and lots
	  of idle processes.  Such systems tend to generate a great
	  deal of continuous pressure on free memory reserves.
	  Turning this feature on and tweaking the swapout hysteresis
	  (in idle seconds) via
	  <varname>vm.swap_idle_threshold1</varname> and
	  <varname>vm.swap_idle_threshold2</varname> allows you to
	  depress the priority of memory pages associated with idle
	  processes more quickly than the normal pageout algorithm.
	  This gives a helping hand to the pageout daemon.  Do not
	  turn this option on unless you need it, because the tradeoff
	  you are making is essentially to pre-page memory sooner
	  rather than later, thus eating more swap and disk
	  bandwidth.  In a
	  small system this option will have a determinable effect but
	  in a large system that is already doing moderate paging this
	  option allows the VM system to stage whole processes into
	  and out of memory easily.</para>
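
	<para>A minimal <filename>/etc/sysctl.conf</filename> sketch
	  for enabling this feature follows.  The threshold values
	  are placeholders chosen for illustration, not tuned
	  recommendations:</para>

	<programlisting># Enable idle-process swapout (only if the system needs it)
vm.swap_idle_enabled=1
# Hysteresis, in idle seconds; illustrative values
vm.swap_idle_threshold1=2
vm.swap_idle_threshold2=10</programlisting>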
      </sect3>

      <sect3>
	<title><varname>hw.ata.wc</varname></title>

	<indexterm>
	  <primary><varname>hw.ata.wc</varname></primary>
	</indexterm>

	<para>&os;&nbsp;4.3 flirted with turning off IDE write
	  caching.  This reduced write bandwidth to IDE disks but was
	  considered necessary due to serious data consistency issues
	  introduced by hard drive vendors.  The problem is that IDE
	  drives lie about when a write completes.  With IDE write
	  caching turned on, IDE hard drives not only write data to
	  disk out of order, but will sometimes delay writing some
	  blocks indefinitely when under heavy disk loads.  A crash or
	  power failure may cause serious file system corruption.
	  &os;'s default was changed to be safe.  Unfortunately, the
	  result was such a huge performance loss that we changed
	  write caching back to on by default after the release.  You
	  should check the default on your system by observing the
	  <varname>hw.ata.wc</varname> sysctl variable.  If IDE write
	  caching is turned off, you can turn it back on by setting
	  the kernel variable back to 1.  This must be done from the
	  boot loader at boot time.  Attempting to do it after the
	  kernel boots will have no effect.</para>
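
	<para>Since the variable must be set from the boot loader, one
	  way to do so, sketched here, is an entry in
	  <filename>/boot/loader.conf</filename>:</para>

	<programlisting># Turn IDE write caching on at boot
hw.ata.wc="1"</programlisting>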

	<para>For more information, please see &man.ata.4;.</para>
      </sect3>

      <sect3>
	<title><literal>SCSI_DELAY</literal>
	  (<varname>kern.cam.scsi_delay</varname>)</title>

	<indexterm>
	  <primary><varname>kern.cam.scsi_delay</varname></primary>
	</indexterm>

	<indexterm>
	  <primary>kernel options</primary>
	  <secondary><literal>SCSI_DELAY</literal></secondary>
	</indexterm>

	<para>The <literal>SCSI_DELAY</literal> kernel configuration
	  option may be used to reduce system boot times.  The
	  defaults are fairly high and can be responsible for
	  <literal>15</literal> seconds of delay in the boot process.
	  Reducing it to <literal>5</literal> seconds usually works
	  (especially with modern drives).  The
	  <varname>kern.cam.scsi_delay</varname> boot time tunable
	  should be used.  Both the tunable and the kernel
	  configuration option accept values in
	  <emphasis>milliseconds</emphasis>,
	  <emphasis>not</emphasis>
	  <emphasis>seconds</emphasis>.</para>
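
	<para>For example, a five second delay could be requested
	  either in <filename>/boot/loader.conf</filename> or in the
	  kernel configuration file.  Both fragments are sketches
	  using the value discussed above:</para>

	<programlisting># /boot/loader.conf: 5000 milliseconds = 5 seconds
kern.cam.scsi_delay="5000"</programlisting>

	<programlisting># Kernel configuration file: 5000 milliseconds = 5 seconds
options 	SCSI_DELAY=5000</programlisting>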
      </sect3>
    </sect2>

    <sect2 id="soft-updates">
      <title>Soft Updates</title>

      <indexterm><primary>Soft Updates</primary></indexterm>
      <indexterm><primary>tunefs</primary></indexterm>

      <para>The &man.tunefs.8; program can be used to fine-tune a
	file system.  This program has many different options, but for
	now we are only concerned with toggling Soft Updates on and
	off, which is done by:</para>

      <screen>&prompt.root; <userinput>tunefs -n enable /filesystem</userinput>
&prompt.root; <userinput>tunefs -n disable /filesystem</userinput></screen>

      <para>A filesystem cannot be modified with &man.tunefs.8; while
	it is mounted.  A good time to enable Soft Updates is before
	any partitions have been mounted, in single-user mode.</para>
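
      <para>A possible sequence in single-user mode is sketched
	below.  <filename class="directory">/usr</filename> is only
	an example file system, assumed to be listed in
	<filename>/etc/fstab</filename> and not yet mounted;
	<command>tunefs -p</command> prints the resulting
	settings:</para>

      <screen>&prompt.root; <userinput>tunefs -n enable /usr</userinput>
&prompt.root; <userinput>tunefs -p /usr</userinput></screen>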

      <para>Soft Updates drastically improves meta-data performance,
	mainly file creation and deletion, through the use of a memory
	cache.  We recommend using Soft Updates on all of your file
	systems.  There are two downsides to Soft Updates that you
	should be aware of: First, Soft Updates guarantees filesystem
	consistency in the case of a crash but could very easily be
	several seconds (even a minute!) behind updating the physical
	disk.  If your system crashes you may lose more work than
	otherwise.  Secondly, Soft Updates delays the freeing of
	filesystem blocks.  If you have a filesystem (such as the root
	filesystem) which is almost full, performing a major update,
	such as <command>make installworld</command>, can cause the
	filesystem to run out of space and the update to fail.</para>

      <sect3>
	<title>More Details About Soft Updates</title>

	<indexterm>
	  <primary>Soft Updates</primary>
	  <secondary>details</secondary>
	</indexterm>

	<para>There are two traditional approaches to writing a file
	  system's meta-data back to disk.  (Meta-data updates are
	  updates to non-content data like inodes or
	  directories.)</para>

	<para>Historically, the default behavior was to write out
	  meta-data updates synchronously.  If a directory had been
	  changed, the system waited until the change was actually
	  written to disk.  The file data buffers (file contents) were
	  passed through the buffer cache and backed up to disk later
	  on asynchronously.  The advantage of this implementation is
	  that it operates safely.  If there is a failure during an
	  update, the meta-data are always in a consistent state.  A
	  file is either created completely or not at all.  If the
	  data blocks of a file did not find their way out of the
	  buffer cache onto the disk by the time of the crash,
	  &man.fsck.8; is able to recognize this and repair the
	  filesystem by setting the file length to 0.  Additionally,
	  the implementation is clear and simple.  The disadvantage is
	  that meta-data changes are slow.  An
	  <command>rm -r</command>, for instance, touches all the
	  files in a directory sequentially, but each directory change
	  (deletion of a file) will be written synchronously to the
	  disk.  This includes updates to the directory itself, to the
	  inode table, and possibly to indirect blocks allocated by
	  the file.  Similar considerations apply for unrolling large
	  hierarchies (<command>tar -x</command>).</para>

	<para>The second case is asynchronous meta-data updates.  This
	  is the default for Linux/ext2fs and
	  <command>mount -o async</command> for *BSD ufs.  All
	  meta-data updates are simply being passed through the buffer
	  cache too, that is, they will be intermixed with the updates
	  of the file content data.  The advantage of this
	  implementation is there is no need to wait until each
	  meta-data update has been written to disk, so all operations
	  which cause huge amounts of meta-data updates work much
	  faster than in the synchronous case.  Also, the
	  implementation is still clear and simple, so there is a low
	  risk for bugs creeping into the code.  The disadvantage is
	  that there is no guarantee at all for a consistent state of
	  the filesystem.  If there is a failure during an operation
	  that updated large amounts of meta-data (like a power
	  failure, or someone pressing the reset button), the
	  filesystem will be left in an unpredictable state.  There is
	  no opportunity to examine the state of the filesystem when
	  the system comes up again; the data blocks of a file could
	  already have been written to the disk while the updates of
	  the inode table or the associated directory were not.  It is
	  actually impossible to implement a <command>fsck</command>
	  which is able to clean up the resulting chaos (because the
	  necessary information is not available on the disk).  If the
	  filesystem has been damaged beyond repair, the only choice
	  is to use &man.newfs.8; on it and restore it from
	  backup.</para>

	<para>The usual solution for this problem was to implement
	  <emphasis>dirty region logging</emphasis>, which is also
	  referred to as <emphasis>journaling</emphasis>, although
	  that term is not used consistently and is occasionally
	  applied to other forms of transaction logging as well.
	  Meta-data updates are still written synchronously, but only
	  into a small region of the disk.  Later on they will be
	  moved to their proper location.  Because the logging area is
	  a small, contiguous region on the disk, there are no long
	  distances for the disk heads to move, even during heavy
	  operations, so these operations are quicker than synchronous
	  updates.  Additionally the complexity of the implementation
	  is fairly limited, so the risk of bugs being present is low.
	  A disadvantage is that all meta-data are written twice (once
	  into the logging region and once to the proper location) so
	  for normal work, a performance <quote>pessimization</quote>
	  might result.  On the other hand, in case of a crash, all
	  pending meta-data operations can be quickly either
	  rolled-back or completed from the logging area after the
	  system comes up again, resulting in a fast filesystem
	  startup.</para>

	<para>Kirk McKusick, the developer of Berkeley FFS, solved
	  this problem with Soft Updates: all pending meta-data
	  updates are kept in memory and written out to disk in a
	  sorted sequence (<quote>ordered meta-data updates</quote>).
	  This has the effect that, in case of heavy meta-data
	  operations, later updates to an item <quote>catch</quote>
	  the earlier ones if the earlier ones are still in memory and
	  have not already been written to disk.  So all operations
	  on, say, a directory are generally performed in memory
	  before the update is written to disk (the data blocks are
	  sorted according to their position so that they will not be
	  on the disk ahead of their meta-data).  If the system
	  crashes, this causes an implicit <quote>log rewind</quote>:
	  all operations which did not find their way to the disk
	  appear as if they had never happened.  A consistent
	  filesystem state is maintained that appears to be the one of
	  30 to 60 seconds earlier.  The algorithm used guarantees
	  that all resources in use are marked as such in their
	  appropriate bitmaps: blocks and inodes.  After a crash, the
	  only resource allocation error that occurs is that resources
	  are marked as <quote>used</quote> which are actually
	  <quote>free</quote>. &man.fsck.8; recognizes this situation,
	  and frees the resources that are no longer used.  It is safe
	  to ignore the dirty state of the filesystem after a crash by
	  forcibly mounting it with <command>mount -f</command>.  In
	  order to free resources that may be unused, &man.fsck.8;
	  needs to be run at a later time.  This is the idea behind
	  the <emphasis>background fsck</emphasis>: at system startup
	  time, only a <emphasis>snapshot</emphasis> of the filesystem
	  is recorded.  The <command>fsck</command> can be run later
	  on.  All file systems can then be mounted
	  <quote>dirty</quote>, so the system startup proceeds in
	  multiuser mode.  Then, background <command>fsck</command>s
	  will be scheduled for all file systems where this is
	  required, to free resources that may be unused.  (File
	  systems that do not use Soft Updates still need the usual
	  foreground <command>fsck</command> though.)</para>

	<para>The advantage is that meta-data operations are nearly
	  as fast as asynchronous updates (i.e., faster than with
	  <emphasis>logging</emphasis>, which has to write the
	  meta-data twice).  The disadvantages are the complexity of
	  the code (implying a higher risk for bugs in an area that is
	  highly sensitive regarding loss of user data), and a higher
	  memory consumption.  Additionally there are some
	  idiosyncrasies one has to get used to.  After a crash, the
	  state of the filesystem appears to be somewhat
	  <quote>older</quote>.  In situations where the standard
	  synchronous approach would have caused some zero-length
	  files to remain after the <command>fsck</command>, these
	  files do not exist at all with a Soft Updates filesystem
	  because neither the meta-data nor the file contents have
	  ever been written to disk.  Disk space is not released until
	  the updates have been written to disk, which may take place
	  some time after running <command>rm</command>.  This may
	  cause problems when installing large amounts of data on a
	  filesystem that does not have enough free space to hold all
	  the files twice.</para>
      </sect3>
    </sect2>
  </sect1>

  <sect1 id="configtuning-kernel-limits">
    <title>Tuning Kernel Limits</title>

    <indexterm>
      <primary>tuning</primary>
      <secondary>kernel limits</secondary>
    </indexterm>

    <sect2 id="file-process-limits">
      <title>File/Process Limits</title>

      <sect3 id="kern-maxfiles">
	<title><varname>kern.maxfiles</varname></title>

	<indexterm>
	  <primary><varname>kern.maxfiles</varname></primary>
	</indexterm>

	<para><varname>kern.maxfiles</varname> can be raised or
	  lowered based upon your system requirements.  This variable
	  indicates the maximum number of file descriptors on your
	  system.  When the file descriptor table is full,
	  <errorname>file: table is full</errorname> will show up
	  repeatedly in the system message buffer, which can be viewed
	  with the <command>dmesg</command> command.</para>

	<para>Each open file, socket, or fifo uses one file
	  descriptor.  A large-scale production server may easily
	  require many thousands of file descriptors, depending on the
	  kind and number of services running concurrently.</para>
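
	<para>As a minimal sketch, the current limit and the number of
	  descriptors in use can be compared with &man.sysctl.8;.  The
	  value <literal>65536</literal> below is only an illustrative
	  assumption, not a recommendation; choose a value that fits
	  the workload:</para>

	<screen>&prompt.root; <userinput>sysctl kern.maxfiles kern.openfiles</userinput>
&prompt.root; <userinput>sysctl kern.maxfiles=65536</userinput></screen>

	<para>To keep such a setting across reboots, the same
	  assignment can be placed in
	  <filename>/etc/sysctl.conf</filename>.</para>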

	<para>In older &os; releases, the default value of
	  <varname>kern.maxfiles</varname> is derived from the
	  <option>maxusers</option> option in your kernel
	  configuration file.  <varname>kern.maxfiles</varname> grows
	  proportionally to the value of <option>maxusers</option>.
	  When compiling a custom kernel, it is a good idea to set
	  this kernel configuration option according to the uses of
	  your system.  From this number, the kernel is given most of
	  its pre-defined limits.  Even though a production machine
	  may not actually have 256 users connected at once, the
	  resources needed may be similar to a high-scale web
	  server.</para>

	<para>The variable <varname>kern.maxusers</varname> is
	  automatically sized at boot based on the amount of memory
	  available in the system, and may be determined at run-time
	  by inspecting the value of the read-only
	  <varname>kern.maxusers</varname> sysctl.  Some sites will
	  require larger or smaller values of
	  <varname>kern.maxusers</varname> and may set it as a loader
	  tunable; values of 64, 128, and 256 are not uncommon.  We do
	  not recommend going above 256 unless you need a huge number
	  of file descriptors; many of the tunable values set to their
	  defaults by <varname>kern.maxusers</varname> may be
	  individually overridden at boot-time or run-time in
	  <filename>/boot/loader.conf</filename> (see the
	  &man.loader.conf.5; manual page or the
	  <filename>/boot/defaults/loader.conf</filename> file for
	  some hints) or as described elsewhere in this
	  document.</para>

	<para>In older releases, the system will auto-tune
	  <literal>maxusers</literal> for you if you explicitly set it
	  to <literal>0</literal>
	  <footnote><para>The auto-tuning algorithm sets
	      <literal>maxusers</literal> equal to the amount of
	      memory in the system, with a minimum of 32, and a
	      maximum of 384.</para></footnote>.  When setting this
	  option, you will want to set <literal>maxusers</literal> to
	  at least 4, especially if you are using the X Window System
	  or compiling software.  The reason is that the most
	  important table set by <literal>maxusers</literal> is the
	  maximum number of processes, which is set to
	  <literal>20 + 16 * maxusers</literal>, so if you set
	  <literal>maxusers</literal> to 1, then you can only have 36
	  simultaneous processes, including the 18 or so that the
	  system starts up at boot time and the 15 or so you will
	  probably create when you start the X Window System.  Even a
	  simple task like reading a manual page will start up nine
	  processes to filter, decompress, and view it.  Setting
	  <literal>maxusers</literal> to 64 will allow you to have up
	  to 1044 simultaneous processes, which should be enough for
	  nearly all uses.  If, however, you see the dreaded
	  <errortype>proc table full</errortype> error when trying to
	  start another program, or are running a server with a large
	  number of simultaneous users (like
	  <hostid role="fqdn">ftp.FreeBSD.org</hostid>), you can
	  always increase the number and rebuild.</para>
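
	<para>The arithmetic above can be checked directly.  The
	  following sketch simply evaluates
	  <literal>20 + 16 * maxusers</literal> in the shell for the
	  two values discussed:</para>

	<screen>&prompt.root; <userinput>echo $((20 + 16 * 1))</userinput>
36
&prompt.root; <userinput>echo $((20 + 16 * 64))</userinput>
1044</screen>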

	<note>
	  <para><literal>maxusers</literal> does
	    <emphasis>not</emphasis> limit the number of users which
	    can log into your machine.  It simply sets various table
	    sizes to reasonable values considering the maximum number
	    of users you will likely have on your system and how many
	    processes each of them will be running.</para>
	</note>
      </sect3>

      <sect3>
	<title><varname>kern.ipc.somaxconn</varname></title>

	<indexterm>
	  <primary><varname>kern.ipc.somaxconn</varname></primary>
	</indexterm>

	<para>The <varname>kern.ipc.somaxconn</varname> sysctl
	  variable limits the size of the listen queue for accepting
	  new TCP connections.  The default value of
	  <literal>128</literal> is typically too low for robust
	  handling of new connections in a heavily loaded web server
	  environment.  For such environments, it is recommended to
	  increase this value to <literal>1024</literal> or higher.
	  The service daemon may itself limit the listen queue size
	  (e.g., &man.sendmail.8;, or
	  <application>Apache</application>) but will often have a
	  directive in its configuration file to adjust the queue
	  size.  Large listen queues also do a better job of avoiding
	  Denial of Service (<abbrev>DoS</abbrev>) attacks.</para>
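
	<para>A minimal sketch of raising the limit at run-time, using
	  the <literal>1024</literal> figure mentioned above:</para>

	<screen>&prompt.root; <userinput>sysctl kern.ipc.somaxconn=1024</userinput></screen>

	<para>Adding the line
	  <literal>kern.ipc.somaxconn=1024</literal> to
	  <filename>/etc/sysctl.conf</filename> should apply the same
	  value at each boot.</para>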
      </sect3>
    </sect2>

    <sect2 id="nmbclusters">
      <title>Network Limits</title>

      <para>The <literal>NMBCLUSTERS</literal> kernel configuration
	option dictates the amount of network Mbufs available to the
	system.  On a heavily-trafficked server, too few Mbufs will
	hinder &os;'s ability to handle network traffic.  Each cluster represents
	approximately 2&nbsp;K of memory, so a value of 1024
	represents 2 megabytes of kernel memory reserved for network
	buffers.  A simple calculation can be done to figure out how
	many are needed.  If you have a web server which maxes out at
	1000 simultaneous connections, and each connection eats a
	16&nbsp;K receive and 16&nbsp;K send buffer, you need
	approximately 32&nbsp;MB worth of network buffers to cover the
	web server.  A good rule of thumb is to multiply by 2, so
	2x32&nbsp;MB&nbsp;/&nbsp;2&nbsp;KB&nbsp;=
	64&nbsp;MB&nbsp;/&nbsp;2&nbsp;KB&nbsp;= 32768.  We recommend
	values between 4096 and 32768 for machines with greater
	amounts of memory.  Under no circumstances should you specify
	an arbitrarily high value for this parameter as it could lead
	to a boot time crash.  The <option>-m</option> option to
	&man.netstat.1; may be used to observe network cluster
	use.</para>
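
      <para>The calculation above, written out step by step (the
	connection count and buffer sizes are the example figures from
	the text, not measurements):</para>

      <programlisting>1000 connections x (16 KB receive + 16 KB send) = approx. 32 MB
32 MB x 2 (rule of thumb)                       = 64 MB
64 MB / 2 KB per cluster                        = 32768 clusters</programlisting>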

      <para>The <varname>kern.ipc.nmbclusters</varname> loader tunable
	should be used to tune this at boot time.  Only older versions
	of &os; will require you to use the
	<literal>NMBCLUSTERS</literal> kernel &man.config.8;
	option.</para>
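
      <para>For example, a sketch of the corresponding entry in
	<filename>/boot/loader.conf</filename>, using the figure from
	the calculation above purely as an illustration:</para>

      <programlisting>kern.ipc.nmbclusters="32768"</programlisting>

      <para>After rebooting, <command>netstat -m</command> can be
	used to compare actual cluster usage against the new
	limit.</para>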

      <para>For busy servers that make extensive use of the
	&man.sendfile.2; system call, it may be necessary to increase
	the number of &man.sendfile.2; buffers via the
	<literal>NSFBUFS</literal> kernel configuration option or by
	setting its value in <filename>/boot/loader.conf</filename>
	(see &man.loader.8; for details).  A common indicator that
	this parameter needs to be adjusted is when processes are seen
	in the <literal>sfbufa</literal> state.  The sysctl variable
	<varname>kern.ipc.nsfbufs</varname> is a read-only glimpse at
	the kernel configured variable.  This parameter nominally
	scales with <varname>kern.maxusers</varname>, however it may
	be necessary to tune accordingly.</para>
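
      <para>As a sketch, the current value can be inspected with
	&man.sysctl.8; and a larger one requested in
	<filename>/boot/loader.conf</filename>.  The tunable name is
	assumed here to match the sysctl name, and
	<replaceable>6656</replaceable> is only a placeholder
	value:</para>

      <screen>&prompt.root; <userinput>sysctl kern.ipc.nsfbufs</userinput></screen>

      <programlisting>kern.ipc.nsfbufs="<replaceable>6656</replaceable>"</programlisting>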

      <important>
	<para>Even though a socket has been marked as non-blocking,
	  calling &man.sendfile.2; on the non-blocking socket may
	  result in the &man.sendfile.2; call blocking until enough
	  <literal>struct sf_buf</literal> buffers are made
	  available.</para>
      </important>

      <sect3>
	<title><varname>net.inet.ip.portrange.*</varname></title>

	<indexterm>
	  <primary>net.inet.ip.portrange.*</primary>
	</indexterm>

	<para>The <varname>net.inet.ip.portrange.*</varname> sysctl
	  variables control the port number ranges automatically bound
	  to TCP and UDP sockets.  There are three ranges: a low
	  range, a default range, and a high range.  Most network
	  programs use the default range, which is controlled by
	  <varname>net.inet.ip.portrange.first</varname> and
	  <varname>net.inet.ip.portrange.last</varname>; these default
	  to 1024 and 5000, respectively.  Bound port ranges are used
	  for outgoing connections, and it is possible to run the
	  system out of ports under certain circumstances.  This most
	  commonly occurs when you are running a heavily loaded web
	  proxy.  The port range is not an issue when running servers
	  which handle mainly incoming connections, such as a normal
	  web server, or which have a limited number of outgoing
	  connections, such as a mail relay.  For situations where you may run
	  yourself out of ports, it is recommended to increase
	  <varname>net.inet.ip.portrange.last</varname> modestly.  A
	  value of <literal>10000</literal>, <literal>20000</literal>
	  or <literal>30000</literal> may be reasonable.  You should
	  also consider firewall effects when changing the port range.
	  Some firewalls may block large ranges of ports (usually
	  low-numbered ports) and expect systems to use higher ranges
	  of ports for outgoing connections &mdash; for this reason it
	  is not recommended that
	  <varname>net.inet.ip.portrange.first</varname> be
	  lowered.</para>
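
	<para>A minimal sketch of inspecting the default range and
	  raising its upper bound; <literal>20000</literal> is one of
	  the example values above, not a tuned recommendation:</para>

	<screen>&prompt.root; <userinput>sysctl net.inet.ip.portrange.first net.inet.ip.portrange.last</userinput>
&prompt.root; <userinput>sysctl net.inet.ip.portrange.last=20000</userinput></screen>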
      </sect3>

      <sect3>
	<title>TCP Bandwidth Delay Product</title>

	<indexterm>
	  <primary>TCP Bandwidth Delay Product Limiting</primary>
	  <secondary><varname>net.inet.tcp.inflight.enable</varname></secondary>
	</indexterm>

	<para>The TCP Bandwidth Delay Product Limiting is similar to
	  TCP/Vegas in NetBSD.  It can be enabled by setting the
	  <varname>net.inet.tcp.inflight.enable</varname> sysctl
	  variable to <literal>1</literal>.  The system will attempt
	  to calculate the bandwidth delay product for each connection
	  and limit the amount of data queued to the network to just
	  the amount required to maintain optimum throughput.</para>

	<para>This feature is useful if you are serving data over
	  modems, Gigabit Ethernet, or even high speed WAN links (or
	  any other link with a high bandwidth delay product),
	  especially if you are also using window scaling or have
	  configured a large send window.  If you enable this option,
	  you should also be sure to set
	  <varname>net.inet.tcp.inflight.debug</varname> to
	  <literal>0</literal> (disable debugging), and for production
	  use setting <varname>net.inet.tcp.inflight.min</varname> to
	  at least <literal>6144</literal> may be beneficial.
	  However, note that setting high minimums may effectively
	  disable bandwidth limiting depending on the link.  The
	  limiting feature reduces the amount of data built up in
	  intermediate router and switch packet queues, as well as
	  the amount of data built up in the local host's
	  interface queue.  With fewer packets queued up, interactive
	  connections, especially over slow modems, will also be able
	  to operate with lower <emphasis>Round Trip Times</emphasis>.
	  However, note that this feature only affects data
	  transmission (uploading / server side).  It has no effect on
	  data reception (downloading).</para>
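
	<para>Taken together, the settings described above might be
	  collected in <filename>/etc/sysctl.conf</filename> as in the
	  following sketch.  The values are the ones named in the text
	  and may need adjusting for a particular link:</para>

	<programlisting># Enable TCP bandwidth delay product limiting
net.inet.tcp.inflight.enable=1
# Disable debugging output
net.inet.tcp.inflight.debug=0
# Suggested minimum for production use
net.inet.tcp.inflight.min=6144</programlisting>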

	<para>Adjusting <varname>net.inet.tcp.inflight.stab</varname>
	  is <emphasis>not</emphasis> recommended.  This parameter
	  defaults to 20, representing 2 maximal packets added to the
	  bandwidth delay product window calculation.  The additional
	  window is required to stabilize the algorithm and improve
	  responsiveness to changing conditions, but it can also
	  result in higher ping times over slow links (though still
	  much lower than you would get without the inflight
	  algorithm).  In such cases, you may wish to try reducing
	  this parameter to 15, 10, or 5; and may also have to reduce
	  <varname>net.inet.tcp.inflight.min</varname> (for example,
	  to 3500) to get the desired effect.  Reducing these
	  parameters should be done as a last resort only.</para>
      </sect3>
    </sect2>

    <sect2>
      <title>Virtual Memory</title>

      <sect3>
	<title><varname>kern.maxvnodes</varname></title>

	<para>A vnode is the internal representation of a file or
	  directory.  So increasing the number of vnodes available to
	  the operating system cuts down on disk I/O.  Normally this
	  is handled by the operating system and does not need to be
	  changed.  In some cases where disk I/O is a bottleneck and
	  the system is running out of vnodes, this setting will need
	  to be increased.  The amount of inactive and free RAM will
	  need to be taken into account.</para>

	<para>To see the current number of vnodes in use:</para>

	<screen>&prompt.root; <userinput>sysctl vfs.numvnodes</userinput>
vfs.numvnodes: 91349</screen>

	<para>To see the maximum vnodes:</para>

	<screen>&prompt.root; <userinput>sysctl kern.maxvnodes</userinput>
kern.maxvnodes: 100000</screen>

	<para>If the current vnode usage is near the maximum,
	  increasing <varname>kern.maxvnodes</varname> by a value of
	  1,000 is probably a good idea.  Keep an eye on the number of
	  <varname>vfs.numvnodes</varname>.  If it climbs up to the
	  maximum again, <varname>kern.maxvnodes</varname> will need
	  to be increased further.  A shift in your memory usage as
	  reported by &man.top.1; should be visible.  More memory
	  should be active.</para>
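
	<para>Continuing from the example output above, a sketch of
	  raising the limit by 1,000 at run-time:</para>

	<screen>&prompt.root; <userinput>sysctl kern.maxvnodes=101000</userinput></screen>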
      </sect3>
    </sect2>
  </sect1>

  <sect1 id="adding-swap-space">
    <title>Adding Swap Space</title>

    <para>No matter how well you plan, sometimes a system does not run
      as you expect.  If you find you need more swap space, it is
      simple enough to add.  You have three ways to increase swap
      space: adding a new hard drive, enabling swap over NFS, and
      creating a swap file on an existing partition.</para>

    <para>For information on how to encrypt swap space, what options
      for this task exist and why it should be done, please refer to
      <xref linkend="swap-encrypting"/> of the Handbook.</para>

    <sect2 id="new-drive-swap">
      <title>Swap on a New or Existing Hard Drive</title>

      <para>Adding a new hard drive for swap gives better performance
	than adding a partition on an existing drive.  Setting up
	partitions and hard drives is explained in
	<xref linkend="disks-adding"/>.
	<xref linkend="configtuning-initial"/> discusses partition
	layouts and swap partition size considerations.</para>

      <para>Use &man.swapon.8; to add a swap partition to the system.
	For example:</para>

      <screen>&prompt.root; <userinput>swapon <replaceable>/dev/ada1s1b</replaceable></userinput></screen>

      <warning>

	<para>It is possible to use any partition not currently
	  mounted, even if it already contains data.  Using
	  &man.swapon.8; on a partition that contains data will
	  overwrite and destroy that data.  Make sure that the
	  partition to be added as swap is really the intended
	  partition before running &man.swapon.8;.</para>
      </warning>

      <para>To automatically add this swap partition on boot, add an
	entry to <filename>/etc/fstab</filename> for the
	partition:</para>

      <programlisting><replaceable>/dev/ada1s1b</replaceable>	none	swap	sw	0	0</programlisting>

      <para>See &man.fstab.5; for an explanation of the entries in
	<filename>/etc/fstab</filename>.</para>
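
      <para>To check that the new swap device is in use, the
	<command>swapinfo</command> utility lists the active swap
	devices (output omitted here, as it depends on the
	system):</para>

      <screen>&prompt.root; <userinput>swapinfo</userinput></screen>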
    </sect2>

    <sect2 id="nfs-swap">
      <title>Swapping over NFS</title>

      <para>Swapping over NFS is only recommended if you do not have a
	local hard disk to swap to; NFS swapping will be limited by
	the available network bandwidth and puts an additional burden
	on the NFS server.</para>
    </sect2>

    <sect2 id="create-swapfile">
      <title>Swapfiles</title>

      <para>You can create a file of a specified size to use as a swap
	file.  In our example here we will use a 64&nbsp;MB file called
	<filename>/usr/swap0</filename>.  You can use any name you
	want, of course.</para>

      <example>
	<title>Creating a Swapfile on &os;</title>

	<orderedlist>
	  <listitem>

	    <para>The <filename>GENERIC</filename> kernel already
	      includes the memory disk driver (&man.md.4;) required
	      for this operation.  When building a custom kernel, make
	      sure to include the following line in your custom
	      configuration file:</para>

	    <programlisting>device   md</programlisting>

	    <para>For information on building your own kernel, please
	      refer to <xref linkend="kernelconfig"/>.</para>
	  </listitem>

	  <listitem>
	    <para>Create a swapfile
	      (<filename>/usr/swap0</filename>):</para>

	    <screen>&prompt.root; <userinput>dd if=/dev/zero of=/usr/swap0 bs=1024k count=64</userinput></screen>
	  </listitem>

	  <listitem>
	    <para>Set proper permissions on
	      <filename>/usr/swap0</filename>:</para>

	    <screen>&prompt.root; <userinput>chmod 0600 /usr/swap0</userinput></screen>
	  </listitem>

	  <listitem>
	    <para>Enable the swap file in
	      <filename>/etc/rc.conf</filename>:</para>

	    <programlisting>swapfile="/usr/swap0"   # Set to name of swapfile if aux swapfile desired.</programlisting>
	  </listitem>

	  <listitem>
	    <para>Reboot the machine or, to enable the swap file
	      immediately, type:</para>

	    <screen>&prompt.root; <userinput>mdconfig -a -t vnode -f /usr/swap0 -u 0 &amp;&amp; swapon /dev/md0</userinput></screen>
	  </listitem>
	</orderedlist>
      </example>
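
      <para>Should the swap file need to be taken out of service
	later, the steps can be reversed.  The following is a sketch
	that assumes the memory disk is still unit
	<literal>0</literal> as in the example above; the
	<literal>swapfile</literal> line would also need to be removed
	from <filename>/etc/rc.conf</filename>:</para>

      <screen>&prompt.root; <userinput>swapoff /dev/md0 &amp;&amp; mdconfig -d -u 0</userinput></screen>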
    </sect2>
  </sect1>

  <sect1 id="acpi-overview">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Hiten</firstname>
	  <surname>Pandya</surname>
	  <contrib>Written by </contrib>
	</author>
	<author>
	  <firstname>Tom</firstname>
	  <surname>Rhodes</surname>
	</author>
      </authorgroup>
    </sect1info>

    <title>Power and Resource Management</title>

    <para>It is important to utilize hardware resources in an
      efficient manner.  Before <acronym>ACPI</acronym> was
      introduced, it was difficult and inflexible for operating
      systems to manage the power usage and thermal properties of a
      system.  The hardware was managed by the <acronym>BIOS</acronym>
      and thus the user had less control and visibility into the power
      management settings.  Some limited configurability was available
      via <emphasis>Advanced Power Management (APM)</emphasis>.  Power
      and resource management is one of the key components of a modern
      operating system.  For example, you may want an operating system
      to monitor system limits (and possibly alert you) in case your
      system temperature increased unexpectedly.</para>

    <para>In this section of the &os; Handbook, we will provide
      comprehensive information about <acronym>ACPI</acronym>.
      References will be provided for further reading at the
      end.</para>

    <sect2 id="acpi-intro">
      <title>What Is ACPI?</title>

      <indexterm>
	<primary>ACPI</primary>
      </indexterm>

      <indexterm>
	<primary>APM</primary>
      </indexterm>

      <para>Advanced Configuration and Power Interface
	(<acronym>ACPI</acronym>) is a standard written by an alliance
	of vendors to provide a standard interface for hardware
	resources and power management (hence the name).  It is a key
	element in <emphasis>Operating System-directed configuration
	and Power Management</emphasis>; that is, it provides more
	control and flexibility to the operating system
	(<acronym>OS</acronym>).  Modern systems
	<quote>stretched</quote> the limits of the current Plug and
	Play interfaces prior to the introduction of
	<acronym>ACPI</acronym>.  <acronym>ACPI</acronym> is the
	direct successor to <acronym>APM</acronym> (Advanced Power
	Management).</para>
    </sect2>

    <sect2 id="acpi-old-spec">
      <title>Shortcomings of Advanced Power Management (APM)</title>

      <para>The <emphasis>Advanced Power Management (APM)</emphasis>
	facility controls the power usage of a system based on its
	activity.  The APM BIOS is supplied by the (system) vendor and
	it is specific to the hardware platform.  An APM driver in the
	OS mediates access to the
	<emphasis>APM Software Interface</emphasis>, which allows
	management of power levels.  APM should still be used for
	systems manufactured at or before the year 2000.</para>

      <para>There are four major problems in APM.  Firstly, power
	management is done by the (vendor-specific) BIOS, and the OS
	does not have any knowledge of it.  For example, when the
	user sets an idle-time value for a hard drive in the APM
	BIOS, the BIOS spins down the hard drive once that time is
	exceeded, without the consent of the OS.  Secondly, the APM
	logic is embedded in the BIOS, and it operates outside the
	scope of the OS.  This means users can only fix problems in
	their APM BIOS by flashing a new one into the ROM, which is a
	very dangerous procedure with the potential to leave the
	system in an unrecoverable state if it fails.  Thirdly, APM is
	a vendor-specific technology, which means that there is a lot
	of duplication of effort, and bugs found in one vendor's BIOS
	may not be solved in others.  Last but not least, the APM BIOS
	did not have enough room to implement a sophisticated power
	policy, or one that can adapt very well to the purpose of the
	machine.</para>

      <para><emphasis>Plug and Play BIOS (PNPBIOS)</emphasis> was
	unreliable in many situations.  PNPBIOS is 16-bit technology,
	so the OS has to use 16-bit emulation in order to
	<quote>interface</quote> with PNPBIOS methods.</para>

      <para>The &os; <acronym>APM</acronym> driver is documented in
	the &man.apm.4; manual page.</para>
    </sect2>

    <sect2 id="acpi-config">
      <title>Configuring <acronym>ACPI</acronym></title>

      <para>The <filename>acpi.ko</filename> driver is loaded by
	default at start up by the &man.loader.8; and should
	<emphasis>not</emphasis> be compiled into the kernel.  The
	reasoning behind this is that modules are easier to work with,
	say if switching to another <filename>acpi.ko</filename>
	without doing a kernel rebuild.  This has the advantage of
	making testing easier.  Another reason is that starting
	<acronym>ACPI</acronym> after a system has been brought up
	often does not work well.  If you are experiencing problems,
	you can disable <acronym>ACPI</acronym> altogether.  This
	driver should not and cannot be unloaded because the system
	bus uses it for various hardware interactions.
	<acronym>ACPI</acronym> can be disabled by setting
	<literal>hint.acpi.0.disabled="1"</literal> in
	<filename>/boot/loader.conf</filename> or at the
	&man.loader.8; prompt.</para>
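
      <para>As a sketch, the <filename>/boot/loader.conf</filename>
	entry is the single line:</para>

      <programlisting>hint.acpi.0.disabled="1"</programlisting>

      <para>For a one-time test, the same variable can instead be set
	at the &man.loader.8; prompt before booting:</para>

      <screen><userinput>set hint.acpi.0.disabled="1"</userinput>
<userinput>boot</userinput></screen>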

      <note>
	<para><acronym>ACPI</acronym> and <acronym>APM</acronym>
	  cannot coexist and should be used separately.  The last one
	  to load will terminate if the driver notices the other
	  running.</para>
      </note>

      <para><acronym>ACPI</acronym> can be used to put the system into
	a sleep mode with &man.acpiconf.8;, the <option>-s</option>
	flag, and a <literal>1-5</literal> option.  Most users will
	only need <literal>1</literal> or <literal>3</literal>
	(suspend to RAM).  Option <literal>5</literal> will do a
	soft-off which is the same action as:</para>

      <screen>&prompt.root; <userinput>halt -p</userinput></screen>

      <para>Other options are available via &man.sysctl.8;.  Check out
	the &man.acpi.4; and &man.acpiconf.8; manual pages for more
	information.</para>
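
      <para>For example, to request suspend to RAM (state
	<literal>3</literal>), assuming the hardware and drivers
	support it:</para>

      <screen>&prompt.root; <userinput>acpiconf -s 3</userinput></screen>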
    </sect2>
  </sect1>

  <sect1 id="ACPI-debug">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Nate</firstname>
	  <surname>Lawson</surname>
	  <contrib>Written by </contrib>
	</author>
      </authorgroup>
      <authorgroup>
	<author>
	  <firstname>Peter</firstname>
	  <surname>Schultz</surname>
	  <contrib>With contributions from </contrib>
	</author>
	<author>
	  <firstname>Tom</firstname>
	  <surname>Rhodes</surname>
	</author>
      </authorgroup>
    </sect1info>

    <title>Using and Debugging &os; <acronym>ACPI</acronym></title>

    <indexterm>
      <primary>ACPI</primary>
      <secondary>problems</secondary>
    </indexterm>

    <para><acronym>ACPI</acronym> is a fundamentally new way of
      discovering devices, managing power usage, and providing
      standardized access to various hardware previously managed by
      the <acronym>BIOS</acronym>.  Progress is being made toward
      <acronym>ACPI</acronym> working on all systems, but bugs in some
      motherboards' <firstterm><acronym>ACPI</acronym> Machine
      Language</firstterm> (<acronym>AML</acronym>) bytecode,
      incompleteness in &os;'s kernel subsystems, and bugs in the
      &intel; <acronym>ACPI-CA</acronym> interpreter continue to
      appear.</para>

    <para>This document is intended to help you assist the &os;
      <acronym>ACPI</acronym> maintainers in identifying the root
      cause of problems you observe and debugging and developing a
      solution.  Thanks for reading this and we hope we can solve your
      system's problems.</para>

    <sect2 id="ACPI-submitdebug">
      <title>Submitting Debugging Information</title>

      <note>
	<para>Before submitting a problem, be sure you are running the
	  latest <acronym>BIOS</acronym> version and, if available,
	  embedded controller firmware version.</para>
      </note>

      <para>For those of you that want to submit a problem right away,
	please send the following information to
	<ulink url="mailto:freebsd-acpi@FreeBSD.org">freebsd-acpi@FreeBSD.org</ulink>:</para>

      <itemizedlist>
	<listitem>
	  <para>Description of the buggy behavior, including system
	    type and model and anything that causes the bug to appear.
	    Also, please note as accurately as possible when the bug
	    began occurring if it is new for you.</para>
	</listitem>

	<listitem>
	  <para>The &man.dmesg.8; output after
	    <command>boot -v</command>, including any error messages
	    generated by you exercising the bug.</para>
	</listitem>

	<listitem>
	  <para>The &man.dmesg.8; output from
	    <command>boot -v</command> with <acronym>ACPI</acronym>
	    disabled, if disabling it helps fix the problem.</para>
	</listitem>

	<listitem>
	  <para>Output from <command>sysctl hw.acpi</command>.  This
	    is also a good way of figuring out what features your
	    system offers.</para>
	</listitem>

	<listitem>
	  <para><acronym>URL</acronym> where your
	    <firstterm><acronym>ACPI</acronym> Source
	      Language</firstterm> (<acronym>ASL</acronym>) can be
	    found.  Do <emphasis>not</emphasis> send the
	    <acronym>ASL</acronym> directly to the list as it can be
	    very large.  Generate a copy of your
	    <acronym>ASL</acronym> by running this command:</para>

	  <screen>&prompt.root; <userinput>acpidump -dt &gt; <replaceable>name</replaceable>-<replaceable>system</replaceable>.asl</userinput></screen>

	  <para>(Substitute your login name for
	    <replaceable>name</replaceable> and manufacturer/model for
	    <replaceable>system</replaceable>.  Example:
	    <filename>njl-FooCo6000.asl</filename>)</para>
	</listitem>
      </itemizedlist>
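
      <para>The items above can be gathered with a few commands.  In
	this sketch the <filename>.dmesg</filename> and
	<filename>.sysctl</filename> output file names are
	hypothetical and simply follow the naming pattern used for the
	<acronym>ASL</acronym>; run the commands after booting with
	<command>boot -v</command>:</para>

      <screen>&prompt.root; <userinput>dmesg &gt; <replaceable>name</replaceable>-<replaceable>system</replaceable>.dmesg</userinput>
&prompt.root; <userinput>sysctl hw.acpi &gt; <replaceable>name</replaceable>-<replaceable>system</replaceable>.sysctl</userinput>
&prompt.root; <userinput>acpidump -dt &gt; <replaceable>name</replaceable>-<replaceable>system</replaceable>.asl</userinput></screen>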

      <para>Most of the developers watch the &a.current;,
	but please submit problems to &a.acpi.name; to be sure they are
	seen.  Please be patient; all of us have full-time jobs
	elsewhere.  If your bug is not immediately apparent, we will
	probably ask you to submit a <acronym>PR</acronym> via
	&man.send-pr.1;.  When entering a <acronym>PR</acronym>,
	please include the same information as requested above.  This
	will help us track the problem and resolve it.  Do not send a
	<acronym>PR</acronym> without emailing &a.acpi.name; first as
	we use <acronym>PR</acronym>s as reminders of existing
	problems, not a reporting mechanism.  It is likely that your
	problem has been reported by someone before.</para>
    </sect2>

    <sect2 id="ACPI-background">
      <title>Background</title>

      <indexterm>
	<primary>ACPI</primary>
      </indexterm>

      <para><acronym>ACPI</acronym> is present in all modern computers
	that conform to the ia32 (x86), ia64 (Itanium), and amd64
	(AMD) architectures.  The full standard has many features
	including <acronym>CPU</acronym> performance management, power
	planes control, thermal zones, various battery systems,
	embedded controllers, and bus enumeration.  Most systems
	implement less than the full standard.  For instance, a
	desktop system usually only implements the bus enumeration
	parts while a laptop might have cooling and battery management
	support as well.  Laptops also have suspend and resume, with
	their own associated complexity.</para>

      <para>An <acronym>ACPI</acronym>-compliant system has various
	components.  The <acronym>BIOS</acronym> and chipset vendors
	provide various fixed tables (e.g., <acronym>FADT</acronym>)
	in memory that specify things like the <acronym>APIC</acronym>
	map (used for <acronym>SMP</acronym>), config registers, and
	simple configuration values.  Additionally, a table of
	bytecode (the <firstterm>Differentiated System Description
	  Table</firstterm>, <acronym>DSDT</acronym>) is provided that
	specifies a tree-like name space of devices and
	methods.</para>

      <para>The <acronym>ACPI</acronym> driver must parse the fixed
	tables, implement an interpreter for the bytecode, and modify
	device drivers and the kernel to accept information from the
	<acronym>ACPI</acronym> subsystem.  For &os;, &intel; has
	provided an interpreter (<acronym>ACPI-CA</acronym>) that is
	shared with Linux and NetBSD.  The path to the
	<acronym>ACPI-CA</acronym> source code is <filename
	  class="directory">src/sys/contrib/dev/acpica</filename>.
	The glue code that allows <acronym>ACPI-CA</acronym> to work
	on &os; is in
	<filename class="directory">src/sys/dev/acpica/Osd</filename>.
	Finally, drivers that implement various
	<acronym>ACPI</acronym> devices are found in <filename
	  class="directory">src/sys/dev/acpica</filename>.</para>
    </sect2>

    <sect2 id="ACPI-comprob">
      <title>Common Problems</title>

      <indexterm>
	<primary>ACPI</primary>
	<secondary>problems</secondary>
      </indexterm>

      <para>For <acronym>ACPI</acronym> to work correctly, all the
	parts have to work correctly.  Here are some common problems,
	in order of frequency of appearance, and some possible
	workarounds or fixes.</para>

      <sect3>
	<title>Mouse Issues</title>

	<para>In some cases, resuming from a suspend operation will
	  cause the mouse to fail.  A known workaround is to add
	  <literal>hint.psm.0.flags="0x3000"</literal> to the
	  <filename>/boot/loader.conf</filename> file.  If this does
	  not work then please consider sending a bug report as
	  described above.</para>
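
	<para>As a sketch, the resulting line in
	  <filename>/boot/loader.conf</filename> would read as follows;
	  it takes effect at the next boot:</para>

	<programlisting>hint.psm.0.flags="0x3000"</programlisting>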
      </sect3>

      <sect3>
	<title>Suspend/Resume</title>

	<para><acronym>ACPI</acronym> has three suspend to
	  <acronym>RAM</acronym> (<acronym>STR</acronym>) states,
	  <literal>S1</literal>-<literal>S3</literal>, and one suspend
	  to disk (<acronym>STD</acronym>) state, called
	  <literal>S4</literal>.  <literal>S5</literal> is
	  <quote>soft off</quote>, the normal state of a system that
	  is plugged in but not powered up.
	  <literal>S4</literal> can be implemented in two
	  separate ways.  <literal>S4</literal><acronym>BIOS</acronym>
	  is a <acronym>BIOS</acronym>-assisted suspend to disk.
	  <literal>S4</literal><acronym>OS</acronym> is implemented
	  entirely by the operating system.</para>

	<para>Start by checking <command>sysctl hw.acpi</command>
	  for the suspend-related items.  Here are the results for a
	  Thinkpad:</para>

	<screen>hw.acpi.supported_sleep_state: S3 S4 S5
hw.acpi.s4bios: 0</screen>

	<para>This means that we can use
	  <command>acpiconf -s</command> to test
	  <literal>S3</literal>,
	  <literal>S4</literal><acronym>OS</acronym>, and
	  <literal>S5</literal>.  If <option>s4bios</option> were one
	  (<literal>1</literal>), we would have
	  <literal>S4</literal><acronym>BIOS</acronym> support instead
	  of <literal>S4</literal><acronym>OS</acronym>.</para>
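
	<para>As a small sketch, the two values shown above can also be
	  queried by name instead of scanning the whole
	  <literal>hw.acpi</literal> tree.  The output will differ
	  from system to system:</para>

	<screen>&prompt.root; <userinput>sysctl hw.acpi.supported_sleep_state hw.acpi.s4bios</userinput></screen>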

	<para>When testing suspend/resume, start with
	  <literal>S1</literal>, if supported.  This state is most
	  likely to work since it does not require much driver
	  support.  No one has implemented <literal>S2</literal> but
	  if you have it, it is similar to <literal>S1</literal>.  The
	  next thing to try is <literal>S3</literal>.  This is the
	  deepest <acronym>STR</acronym> state and requires a lot of
	  driver support to properly reinitialize your hardware.  If
	  you have problems resuming, feel free to email the
	  &a.acpi.name; list but do not expect the problem to be
	  resolved since there are a lot of drivers/hardware that need
	  more testing and work.</para>

	<para>A common problem with suspend/resume is that many device
	  drivers do not save, restore, or reinitialize their
	  firmware, registers, or device memory properly.  As a first
	  attempt at debugging the problem, try:</para>

	<screen>&prompt.root; <userinput>sysctl debug.bootverbose=1</userinput>
&prompt.root; <userinput>sysctl debug.acpi.suspend_bounce=1</userinput>
&prompt.root; <userinput>acpiconf -s 3</userinput></screen>

	<para>This test emulates a suspend/resume cycle of all device
	  drivers without actually entering the <literal>S3</literal>
	  state.  In some cases, you can easily catch problems with
	  this method (e.g., losing firmware state, device watchdog
	  timeouts, and retrying forever).  Note that because the
	  system does not really enter the <literal>S3</literal>
	  state, devices may not lose power, and many will work fine
	  even if their suspend/resume methods are missing entirely,
	  unlike in a real <literal>S3</literal> state.</para>
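
	<para>The bounce setting stays in effect until it is changed
	  back.  Assuming the test above was run, a real suspend can
	  be restored afterwards with something like:</para>

	<screen>&prompt.root; <userinput>sysctl debug.acpi.suspend_bounce=0</userinput></screen>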

	<para>Harder cases require additional hardware, i.e., a serial
	  port and cable for a serial console or a FireWire port and
	  cable for &man.dcons.4;, and kernel debugging skills.</para>

	<para>To help isolate the problem, remove as many drivers from
	  your kernel as possible.  If it works, you can narrow down
	  which driver is the problem by loading drivers until it
	  fails again.  Typically, binary drivers like
	  <filename>nvidia.ko</filename>, X11 display drivers, and
	  <acronym>USB</acronym> will have the most problems, while
	  Ethernet interfaces usually work fine.  If you can properly
	  load and unload the drivers, you can automate this by
	  putting the appropriate commands in
	  <filename>/etc/rc.suspend</filename> and
	  <filename>/etc/rc.resume</filename>.  These files contain a
	  commented-out example for unloading and loading a driver.
	  Try setting <option>hw.acpi.reset_video</option> to zero
	  (<literal>0</literal>) if your display is garbled after
	  resume.  Try setting longer or shorter values for
	  <option>hw.acpi.sleep_delay</option> to see if that
	  helps.</para>
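
	<para>As a rough sketch of the steps above, the lines below
	  unload a driver before suspend and load it again on resume.
	  The module name <filename>snd_ich.ko</filename> is only a
	  placeholder for whichever driver misbehaves on your
	  system:</para>

	<programlisting># Added to /etc/rc.suspend: unload the driver before sleeping
kldunload snd_ich

# Added to /etc/rc.resume: load the driver again after waking
kldload snd_ich</programlisting>

	<para>The two <command>sysctl</command> settings can be tried
	  at run time, assuming both are writable on your release.
	  The delay of <literal>3</literal> seconds is an arbitrary
	  value chosen for illustration:</para>

	<screen>&prompt.root; <userinput>sysctl hw.acpi.reset_video=0</userinput>
&prompt.root; <userinput>sysctl hw.acpi.sleep_delay=3</userinput></screen>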

	<para>Another thing to try is to load a recent Linux
	  distribution with <acronym>ACPI</acronym> support and test
	  its suspend/resume support on the same hardware.  If it
	  works on Linux, it is likely a &os; driver problem, and
	  narrowing down which driver causes the problem will help us
	  fix it.  Note that the <acronym>ACPI</acronym> maintainers
	  do not usually maintain other drivers (e.g., sound or
	  <acronym>ATA</acronym>), so any work done on tracking
	  down a driver problem should eventually be posted
	  to the &a.current.name; list and mailed to the driver
	  maintainer.  If you are feeling adventurous, go ahead and
	  start putting some debugging &man.printf.3;s in a
	  problematic driver to track down where in its resume
	  function it hangs.</para>

	<para>Finally, try disabling <acronym>ACPI</acronym> and
	  enabling <acronym>APM</acronym> instead.  If suspend/resume
	  works with <acronym>APM</acronym>, you may be better off
	  sticking with <acronym>APM</acronym>, especially on older
	  hardware (pre-2000).  It took vendors a while to get
	  <acronym>ACPI</acronym> support correct and older hardware
	  is more likely to have <acronym>BIOS</acronym> problems with
	  <acronym>ACPI</acronym>.</para>
      </sect3>

      <sect3>
	<title>System Hangs (Temporary or Permanent)</title>

	<para>Most system hangs are a result of lost interrupts or an
	  interrupt storm.  Chipsets have a lot of problems based on
	  how the <acronym>BIOS</acronym> configures interrupts before
	  boot, correctness of the <acronym>APIC</acronym>
	  (<acronym>MADT</acronym>) table, and routing of the
	  <firstterm>System Control Interrupt</firstterm>
	  (<acronym>SCI</acronym>).</para>

	<indexterm>
	  <primary>interrupt storms</primary>
	</indexterm>

	<para>Interrupt storms can be distinguished from lost
	  interrupts by checking the output of
	  <command>vmstat -i</command> and looking at the line that
	  has <literal>acpi0</literal>.  If the counter is increasing
	  at more than a couple per second, you have an interrupt
	  storm.  If the system appears hung, try breaking to
	  <acronym>DDB</acronym> (<keycombo action="simul">
	    <keycap>CTRL</keycap>
	    <keycap>ALT</keycap>
	    <keycap>ESC</keycap>
	  </keycombo> on console) and type
	  <literal>show interrupts</literal>.</para>
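
	<para>One way to watch the counter, shown here only as a
	  sketch, is to sample the <literal>acpi0</literal> line twice
	  and compare the totals.  The ten-second interval is an
	  arbitrary choice:</para>

	<screen>&prompt.root; <userinput>vmstat -i | grep acpi0; sleep 10; vmstat -i | grep acpi0</userinput></screen>

	<para>If the total grows by more than a few dozen between the
	  two samples, the rate is above a couple per second.</para>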

	<indexterm>
	  <primary>APIC</primary>
	  <secondary>disabling</secondary>
	</indexterm>

	<para>Your best hope when dealing with interrupt problems is
	  to try disabling <acronym>APIC</acronym> support with
	  <literal>hint.apic.0.disabled="1"</literal> in
	  <filename>loader.conf</filename>.</para>
      </sect3>

      <sect3>
	<title>Panics</title>

	<para>Panics are relatively rare for <acronym>ACPI</acronym>
	  and are the top priority to be fixed.  The first step is to
	  isolate the steps to reproduce the panic (if possible) and
	  get a backtrace.  Follow the advice for enabling
	  <literal>options DDB</literal> and setting up a serial
	  console (see <xref linkend="serialconsole-ddb"/>) or setting
	  up a &man.dumpon.8; dump partition.  You can get a backtrace
	  in <acronym>DDB</acronym> with <literal>tr</literal>.  If
	  you have to write out the backtrace by hand, be sure to get
	  at least the lowest five (5) and top five (5) lines in the
	  trace.</para>

	<para>Then, try to isolate the problem by booting with
	  <acronym>ACPI</acronym> disabled.  If that works, you can
	  isolate the <acronym>ACPI</acronym> subsystem by using
	  various values of <option>debug.acpi.disabled</option>.  See
	  the &man.acpi.4; manual page for some examples.</para>
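
	<para>As a hedged illustration, the tunable is spelled
	  <literal>debug.acpi.disabled</literal> in &man.acpi.4; and
	  takes a space-separated list of sub-driver names.  The value
	  <literal>thermal</literal> below is assumed to be one of the
	  names listed there; check the manual page for your release
	  before relying on it:</para>

	<programlisting># /boot/loader.conf: skip one ACPI sub-driver while testing
debug.acpi.disabled="thermal"</programlisting>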
      </sect3>

      <sect3>
	<title>System Powers Up After Suspend or Shutdown</title>

	<para>First, try setting
	  <literal>hw.acpi.disable_on_poweroff="0"</literal>
	  in &man.loader.conf.5;.  This keeps <acronym>ACPI</acronym>
	  from disabling various events during the shutdown process.
	  Some systems need this value set to <literal>1</literal>
	  (the default) for the same reason.  This usually fixes the
	  problem of a system powering up spontaneously after a
	  suspend or poweroff.</para>
      </sect3>

      <sect3>
	<title>Other Problems</title>

	<para>If you have other problems with <acronym>ACPI</acronym>
	  (working with a docking station, devices not detected,
	  etc.), please email a description to the mailing list as
	  well.  However, some of these issues may be related to
	  unfinished parts of the <acronym>ACPI</acronym> subsystem,
	  so they might take a while to be implemented.  Please be
	  patient and prepared to test patches we may send you.</para>
      </sect3>
    </sect2>

    <sect2 id="ACPI-aslanddump">
      <title><acronym>ASL</acronym>, <command>acpidump</command>, and
	<acronym>IASL</acronym></title>

      <indexterm>
	<primary>ACPI</primary>
	<secondary>ASL</secondary>
      </indexterm>

      <para>The most common problem is <acronym>BIOS</acronym>
	vendors providing incorrect (or outright buggy!) bytecode.
	This usually shows up as kernel console messages like
	this:</para>

      <screen>ACPI-1287: *** Error: Method execution failed [\\_SB_.PCI0.LPC0.FIGD._STA] \\
(Node 0xc3f6d160), AE_NOT_FOUND</screen>

      <para>Often, you can resolve these problems by updating your
	<acronym>BIOS</acronym> to the latest revision.  Most console
	messages are harmless but if you have other problems like
	battery status not working, they are a good place to start
	looking for problems in the <acronym>AML</acronym>.  The
	bytecode, known as <acronym>AML</acronym>, is compiled from a
	source language called <acronym>ASL</acronym>.  The
	<acronym>AML</acronym> is found in the table known as the
	<acronym>DSDT</acronym>.  To get a copy of your
	<acronym>ASL</acronym>, use &man.acpidump.8;.  You should use
	both the <option>-t</option> (show contents of the fixed
	tables) and <option>-d</option> (disassemble
	<acronym>AML</acronym> to <acronym>ASL</acronym>) options.
	See the <link linkend="ACPI-submitdebug">Submitting Debugging
	  Information</link> section for example syntax.</para>
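
      <para>As a minimal sketch, both options can be combined and the
	output redirected to a file.  The name
	<filename>your.asl</filename> is arbitrary and was chosen only
	to match the examples that follow:</para>

      <screen>&prompt.root; <userinput>acpidump -td &gt; your.asl</userinput></screen>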

      <para>The simplest first check you can do is to recompile your
	<acronym>ASL</acronym> to check for errors.  Warnings can
	usually be ignored but errors are bugs that will usually
	prevent <acronym>ACPI</acronym> from working correctly.  To
	recompile your <acronym>ASL</acronym>, issue the following
	command:</para>

      <screen>&prompt.root; <userinput>iasl your.asl</userinput></screen>
    </sect2>

    <sect2 id="ACPI-fixasl">
      <title>Fixing Your <acronym>ASL</acronym></title>

      <indexterm>
	<primary>ACPI</primary>
	<secondary>ASL</secondary>
      </indexterm>

      <para>In the long run, our goal is for almost everyone to have
	<acronym>ACPI</acronym> work without any user intervention.
	At this point, however, we are still developing workarounds
	for common mistakes made by the <acronym>BIOS</acronym>
	vendors.  The &microsoft; interpreter
	(<filename>acpi.sys</filename> and
	<filename>acpiec.sys</filename>) does not strictly check for
	adherence to the standard, and thus many
	<acronym>BIOS</acronym> vendors who only test
	<acronym>ACPI</acronym> under &windows; never fix their
	<acronym>ASL</acronym>.  We hope to continue to identify and
	document exactly what non-standard behavior is allowed by
	&microsoft;'s interpreter and replicate it so &os; can work
	without forcing users to fix the <acronym>ASL</acronym>.  As a
	workaround and to help us identify behavior, you can fix the
	<acronym>ASL</acronym> manually.  If this works for you,
	please send a &man.diff.1; of the old and new
	<acronym>ASL</acronym> so we can possibly work around the
	buggy behavior in <acronym>ACPI-CA</acronym> and thus make
	your fix unnecessary.</para>

      <indexterm>
	<primary>ACPI</primary>
	<secondary>error messages</secondary>
      </indexterm>

      <para>Here is a list of common error messages, their cause, and
	how to fix them:</para>

      <sect3>
	<title>_OS Dependencies</title>

	<para>Some <acronym>AML</acronym> assumes the world consists
	  of various &windows; versions.  You can tell &os; to claim
	  it is any <acronym>OS</acronym> to see if this fixes
	  problems you may have.  An easy way to override this is to
	  set <literal>hw.acpi.osname="Windows 2001"</literal> in
	  <filename>/boot/loader.conf</filename> or other similar
	  strings you find in the <acronym>ASL</acronym>.</para>
      </sect3>

      <sect3>
	<title>Missing Return Statements</title>

	<para>Some methods do not explicitly return a value as the
	  standard requires.  While <acronym>ACPI-CA</acronym>
	  does not handle this, &os; has a workaround that allows it
	  to return the value implicitly.  You can also add explicit
	  Return statements where required if you know what value
	  should be returned.  To force <command>iasl</command> to
	  compile the <acronym>ASL</acronym>, use the
	  <option>-f</option> flag.</para>
      </sect3>

      <sect3>
	<title>Overriding the Default <acronym>AML</acronym></title>

	<para>After you customize <filename>your.asl</filename>,
	  compile it by running:</para>

	<screen>&prompt.root; <userinput>iasl your.asl</userinput></screen>

	<para>You can add the <option>-f</option> flag to force
	  creation of the <acronym>AML</acronym>, even if there are
	  errors during compilation.  Remember that some errors (e.g.,
	  missing Return statements) are automatically worked around
	  by the interpreter.</para>

	<para><filename>DSDT.aml</filename> is the default output
	  filename for <command>iasl</command>.  You can load this
	  instead of your <acronym>BIOS</acronym>'s buggy copy (which
	  is still present in flash memory) by editing
	  <filename>/boot/loader.conf</filename> as
	  follows:</para>

	<programlisting>acpi_dsdt_load="YES"
acpi_dsdt_name="/boot/DSDT.aml"</programlisting>

	<para>Be sure to copy your <filename>DSDT.aml</filename> to
	  the <filename class="directory">/boot</filename>
	  directory.</para>
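
	<para>Assuming <command>iasl</command> was run in the current
	  directory and produced <filename>DSDT.aml</filename> there,
	  the copy could look like this:</para>

	<screen>&prompt.root; <userinput>cp DSDT.aml /boot/DSDT.aml</userinput></screen>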
      </sect3>
    </sect2>

    <sect2 id="ACPI-debugoutput">
      <title>Getting Debugging Output from
	<acronym>ACPI</acronym></title>

      <indexterm>
	<primary>ACPI</primary>
	<secondary>problems</secondary>
      </indexterm>

      <indexterm>
	<primary>ACPI</primary>
	<secondary>debugging</secondary>
      </indexterm>

      <para>The <acronym>ACPI</acronym> driver has a very flexible
	debugging facility.  It allows you to specify a set of
	subsystems as well as the level of verbosity.  The subsystems
	you wish to debug are specified as <quote>layers</quote> and
	are broken down into <acronym>ACPI-CA</acronym> components
	(<literal>ACPI_ALL_COMPONENTS</literal>) and
	<acronym>ACPI</acronym> hardware support
	(<literal>ACPI_ALL_DRIVERS</literal>).  The verbosity of
	debugging output is specified as the <quote>level</quote> and
	ranges from <literal>ACPI_LV_ERROR</literal> (just report
	errors) to <literal>ACPI_LV_VERBOSE</literal> (everything).
	The <quote>level</quote> is a bitmask so multiple options can
	be set at once, separated by spaces.  In practice, you will
	want to use a serial console to log the output if it is long
	enough to overflow the console message buffer.  A full list of
	the individual layers and levels is found in the &man.acpi.4;
	manual page.</para>

      <para>Debugging output is not enabled by default.  To enable it,
	add <literal>options ACPI_DEBUG</literal> to your kernel
	configuration file if <acronym>ACPI</acronym> is compiled into
	the kernel.  You can add <literal>ACPI_DEBUG=1</literal> to
	your <filename>/etc/make.conf</filename> to enable it
	globally.  If it is a module, you can recompile just your
	<filename>acpi.ko</filename> module as follows:</para>

      <screen>&prompt.root; <userinput>cd /sys/modules/acpi/acpi &amp;&amp; make clean &amp;&amp; make ACPI_DEBUG=1</userinput></screen>

      <para>Install <filename>acpi.ko</filename> in
	<filename class="directory">/boot/kernel</filename> and add
	your desired level and layer to
	<filename>loader.conf</filename>.  This example enables debug
	messages for all <acronym>ACPI-CA</acronym> components and all
	<acronym>ACPI</acronym> hardware drivers
	(<acronym>CPU</acronym>, <acronym>LID</acronym>, etc.).  It
	will only output error messages, the least verbose
	level.</para>

      <programlisting>debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"
debug.acpi.level="ACPI_LV_ERROR"</programlisting>

      <para>If the information you want is triggered by a specific
	event (say, a suspend and then resume), you can leave out
	changes to <filename>loader.conf</filename> and instead use
	<command>sysctl</command> to specify the layer and level after
	booting and preparing your system for the specific event.  The
	<command>sysctl</command>s are named the same as the tunables
	in <filename>loader.conf</filename>.</para>
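
      <para>For instance, the same layer and level used in the
	<filename>loader.conf</filename> example could be set at run
	time, just before triggering the event, roughly as
	follows:</para>

      <screen>&prompt.root; <userinput>sysctl debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"</userinput>
&prompt.root; <userinput>sysctl debug.acpi.level="ACPI_LV_ERROR"</userinput></screen>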
    </sect2>

    <sect2 id="ACPI-References">
      <title>References</title>

      <para>More information about <acronym>ACPI</acronym> may be
	found in the following locations:</para>

      <itemizedlist>
	<listitem>
	  <para>The &a.acpi;</para>
	</listitem>

	<listitem>
	  <para>The <acronym>ACPI</acronym> Mailing List Archives
	    <ulink
	      url="http://lists.freebsd.org/pipermail/freebsd-acpi/"></ulink></para>
	</listitem>

	<listitem>
	  <para>The old <acronym>ACPI</acronym> Mailing List Archives
	    <ulink
	      url="http://home.jp.FreeBSD.org/mail-list/acpi-jp/"></ulink></para>
	</listitem>

	<listitem>
	  <para>The <acronym>ACPI</acronym> 2.0 Specification
	    <ulink url="http://acpi.info/spec.htm"></ulink></para>
	</listitem>

	<listitem>
	  <para>&os; Manual pages: &man.acpi.4;,
	    &man.acpi.thermal.4;, &man.acpidump.8;, &man.iasl.8;,
	    &man.acpidb.8;</para>
	</listitem>

	<listitem>
	  <para><ulink
	      url="http://www.cpqlinux.com/acpi-howto.html#fix_broken_dsdt">
	    <acronym>DSDT</acronym> debugging resource</ulink>.
	    (Uses Compaq as an example but generally useful.)</para>
	</listitem>
      </itemizedlist>
    </sect2>
  </sect1>
</chapter>