<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
<!ENTITY % articles.ent PUBLIC "-//FreeBSD//ENTITIES DocBook FreeBSD Articles Entity Set//EN">
%articles.ent;
]>

<article>
<articleinfo>
<title>Package Building Procedures</title>

<authorgroup>
<corpauthor>The &os; Ports Management Team</corpauthor>
</authorgroup>

<pubdate>$FreeBSD$</pubdate>

<copyright>
<year>2003</year>
<year>2004</year>
<year>2005</year>
<year>2006</year>
<year>2007</year>
<year>2008</year>
<holder role="mailto:portmgr@FreeBSD.org">The &os; Ports
Management Team</holder>
</copyright>

<legalnotice id="trademarks" role="trademarks">
&tm-attrib.freebsd;
&tm-attrib.intel;
&tm-attrib.sparc;
&tm-attrib.general;
</legalnotice>
</articleinfo>

<sect1 id="intro">
<title>Introduction and Conventions</title>

<para>In order to provide pre-compiled binaries of third-party
applications for &os;, the Ports Collection is regularly
built on one of the <quote>Package Building Clusters.</quote>
Currently, the main cluster in use is at
<ulink url="http://pointyhat.FreeBSD.org"></ulink>.</para>

<para>Most of the package building magic occurs under the
<filename>/var/portbuild</filename> directory. Unless
otherwise specified, all paths will be relative to
this location. <replaceable>${arch}</replaceable> will
be used to specify one of the package architectures
(amd64, &i386;, and &sparc64;), and
<replaceable>${branch}</replaceable> will be used
to specify the build branch (6, 6-exp, 7, 7-exp, 8, 8-exp).
</para>

<note>
<para>Packages are no longer built for Release 4 or 5, nor
for the alpha or ia64 architectures.</para>
</note>

<para>The scripts that control all of this live in
<filename>/var/portbuild/scripts/</filename>. These are the
checked-out copies from
<filename>/usr/ports/Tools/portbuild/scripts/</filename>.</para>

<para>Typically, builds are incremental and reuse previously built
packages as dependencies; this takes less time, and puts less
load on the mirrors. Full builds are usually only done:</para>

<itemizedlist>
<listitem><para>right after release time, for the
<literal>-STABLE</literal> branches</para></listitem>

<listitem><para>every month or so, for <literal>-CURRENT</literal>
</para></listitem>

<listitem><para>for experimental builds</para></listitem>
</itemizedlist>
</sect1>

<sect1 id="management">
<title>Build Client Management</title>

<para>The &i386; clients currently
netboot from <hostid>pointyhat</hostid>; the other clients
are self-hosted. In all cases they set themselves
up at boot-time to prepare to build packages.</para>

<para>In addition to connected nodes,
<replaceable>disconnected</replaceable> cluster nodes are also
supported. A disconnected node is
one that does not mount the cluster master via NFS. It could be
a remote node, for example. The cluster master uses rsync to copy the
interesting data (ports and src trees, bindist tarballs,
scripts, etc.) to disconnected nodes during the node-setup
phase. Then, the disconnected portbuild directory is
nullfs-mounted for chroot builds.</para>

<para>The
<username>ports-<replaceable>${arch}</replaceable></username>
user can &man.ssh.1; to the client nodes to monitor them.
Use <command>sudo</command> and check the
<hostid>portbuild.<replaceable>hostname</replaceable>.conf</hostid>
file for the user and access details.</para>

<para>The <command>scripts/allgohans</command> script can
be used to run a command on all of the
<replaceable>${arch}</replaceable> clients.</para>

<para>The <command>scripts/checkmachines</command> script
is used to monitor the load on all the nodes of the
build cluster, and schedule which nodes build which ports.
This script is not very robust, and has a tendency to die.
It is best to start up this script on the build master
(e.g. <hostid>pointyhat</hostid>)
after boot time using a &man.while.1; loop.
</para>
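
<para>A minimal sketch of such a loop, assuming a Bourne-compatible
shell on the build master. The 60-second pause is an arbitrary
choice, and any arguments that <command>checkmachines</command> may
need are omitted here:</para>

<programlisting>#!/bin/sh
# Keep checkmachines running: start it again whenever it exits.
cd /var/portbuild || exit 1
while true; do
    scripts/checkmachines
    sleep 60    # short pause so a crash loop does not spin the CPU
done</programlisting>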
</sect1>

<sect1 id="setup">
<title>Chroot Build Environment Setup</title>

<para>Package builds are performed in a
<literal>chroot</literal> populated by the
<filename>portbuild</filename> script using the
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/builds/<replaceable>${buildid}</replaceable>/bindist.tar</filename>
file.</para>

<para>The following command builds a world from the
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/src</filename>
tree and installs it into
<replaceable>${worlddir}</replaceable>. The tree will
be updated first unless <literal>-nocvs</literal> is
specified.</para>

<screen>/var/portbuild&prompt.root; <userinput>scripts/makeworld <replaceable>${arch}</replaceable> <replaceable>${branch}</replaceable> <replaceable>${buildid}</replaceable> [-nocvs]</userinput></screen>

<para>The <filename>bindist.tar</filename> tarball is created from the
previously installed world by the <command>mkbindist</command>
script. It should be run as <username>root</username> with the following
command:</para>

<screen>/var/portbuild&prompt.root; <userinput>scripts/mkbindist <replaceable>${arch}</replaceable> <replaceable>${branch}</replaceable> <replaceable>${buildid}</replaceable></userinput></screen>

<para>The per-machine tarballs are located in
<filename><replaceable>${arch}</replaceable>/clients</filename>.</para>

<para>The <filename>bindist.tar</filename> file is extracted
onto each client at client boot time, and at the start of
each pass of the <command>dopackages</command>
script.
</para>
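
<para>As an illustration, the <command>makeworld</command> and
<command>mkbindist</command> steps might be run back to back for one
build. The values <literal>i386</literal>, <literal>8</literal>, and
<literal>latest</literal> are examples only; substitute the
architecture, branch, and buildid you are working on:</para>

<screen>/var/portbuild&prompt.root; <userinput>scripts/makeworld i386 8 latest</userinput>
/var/portbuild&prompt.root; <userinput>scripts/mkbindist i386 8 latest</userinput></screen>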
</sect1>

<sect1 id="starting">
<title>Starting the Build</title>

<para>Several separate builds for each architecture-branch combination
are supported. All data private to a build (ports tree, src tree,
packages, distfiles, log files, bindist, Makefile, etc.) are located under
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/builds/<replaceable>${buildid}</replaceable></filename>.
The most recently created build can also be referenced by the buildid
<literal>latest</literal>; the one before it is called
<literal>previous</literal>.</para>

<para>New builds are cloned from <literal>latest</literal>, which is
fast because ZFS is used.</para>

<sect2 id="build-dopackages">
<title><command>dopackages</command> scripts</title>

<para>The <filename>scripts/dopackages*</filename> scripts
are used to perform the builds. Most useful are:</para>

<itemizedlist>
<listitem>
<para><command>dopackages.6</command> - Perform
a 6.X build
</para>
</listitem>

<listitem>
<para><command>dopackages.6-exp</command> - Perform
a 6.X build with experimental patches
(6-exp branch)
</para>
</listitem>

<listitem>
<para><command>dopackages.7</command> - Perform
a 7.X build
</para>
</listitem>

<listitem>
<para><command>dopackages.7-exp</command> - Perform
a 7.X build with experimental patches
(7-exp branch)
</para>
</listitem>

<listitem>
<para><command>dopackages.8</command> - Perform
an 8.X build
</para>
</listitem>

<listitem>
<para><command>dopackages.8-exp</command> - Perform
an 8.X build with experimental patches
(8-exp branch)
</para>
</listitem>
</itemizedlist>

<para>These are wrappers around <command>dopackages</command>,
and are all symlinked to <command>dopackages.wrapper</command>.
New branch wrapper scripts can be created by symlinking
<command>dopackages.${branch}</command> to
<command>dopackages.wrapper</command>. These scripts
take a number of arguments. For example:</para>

<screen><command>dopackages.6 <replaceable>${arch}</replaceable> <literal>[-options]</literal></command></screen>
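
<para>For instance, a wrapper for a new branch could be created as
follows. The branch name <literal>9</literal> is purely
hypothetical, and the sketch assumes the wrappers live in
<filename>/var/portbuild/scripts</filename> alongside
<command>dopackages.wrapper</command>:</para>

<screen>&prompt.user; <userinput>cd /var/portbuild/scripts</userinput>
&prompt.user; <userinput>ln -s dopackages.wrapper dopackages.9</userinput></screen>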

<para><literal>[-options]</literal> may be zero or more of the
following:</para>

<itemizedlist>
<listitem>
<para><literal>-keep</literal> - Do not delete this build in the
future, when it would normally be deleted as part of the
<literal>latest</literal>/<literal>previous</literal> cycle.
Do not forget to clean it up manually when you no longer need it.
</para>
</listitem>

<listitem>
<para><literal>-nofinish</literal> - Do not perform
post-processing once the build is complete. Useful
if you expect that the build will need to be restarted
once it finishes. If you use this option, do not forget to clean up
the clients when you no longer need the build.
</para>
</listitem>

<listitem>
<para><literal>-finish</literal> - Perform
post-processing only.
</para>
</listitem>

<listitem>
<para><literal>-nocleanup</literal> - By default, when the
<literal>-finish</literal> stage of the build is complete, the build
data will be deleted from the clients. This option will prevent
that.</para>
</listitem>

<listitem>
<para><literal>-restart</literal> - Restart an interrupted
(or non-<literal>finish</literal>ed) build from the
beginning. Ports that failed on the previous build will
be rebuilt.
</para>
</listitem>

<listitem>
<para><literal>-continue</literal> - Restart an interrupted
(or non-<literal>finish</literal>ed) build. Will not
rebuild ports that failed on the previous build.
</para>
</listitem>

<listitem>
<para><literal>-incremental</literal> - Compare the
interesting fields of the new
<literal>INDEX</literal> with the previous one,
remove packages and log files for the old ports that
have changed, and rebuild the rest. This
cuts down on build times substantially since
unchanged ports do not get rebuilt every time.
</para>
</listitem>

<listitem>
<para><literal>-cdrom</literal> - This package build is
intended to end up on a CD-ROM, so
<literal>NO_CDROM</literal> packages and distfiles
should be deleted in post-processing.
</para>
</listitem>

<listitem>
<para><literal>-nobuild</literal> - Perform all
the preprocessing steps, but do not actually do
the package build.
</para>
</listitem>

<listitem>
<para><literal>-noindex</literal> - Do not rebuild
<filename>INDEX</filename> during preprocessing.
</para>
</listitem>

<listitem>
<para><literal>-noduds</literal> - Do not rebuild the
<filename>duds</filename> file (ports that are never
built, e.g. those marked <literal>IGNORE</literal>,
<literal>NO_PACKAGE</literal>, etc.) during
preprocessing.
</para>
</listitem>

<listitem>
<para><literal>-trybroken</literal> - Try to build
<literal>BROKEN</literal> ports. This is off by default:
the amd64/&i386; clusters are now fast enough
that, in incremental builds, a disproportionate share of
the time went to rebuilding ports that were going to
fail anyway, while the other clusters
are slow enough that it would be a waste of time
to try to build <literal>BROKEN</literal> ports.
</para>
</listitem>

<listitem>
<para><literal>-nosrc</literal> - Do not update the
<literal>src</literal> tree from the ZFS snapshot; keep the tree from
the previous build instead.
</para>
</listitem>

<listitem>
<para><literal>-srccvs</literal> - Do not update the
<literal>src</literal> tree from the ZFS snapshot; update it with
<literal>cvs update</literal> instead.
</para>
</listitem>

<listitem>
<para><literal>-noports</literal> - Do not update the
<literal>ports</literal> tree from the ZFS snapshot; keep the tree from
the previous build instead.
</para>
</listitem>

<listitem>
<para><literal>-portscvs</literal> - Do not update the
<literal>ports</literal> tree from the ZFS snapshot; update it with
<literal>cvs update</literal> instead.
</para>
</listitem>

<listitem>
<para><literal>-norestr</literal> - Do not attempt to build
<literal>RESTRICTED</literal> ports.
</para>
</listitem>

<listitem>
<para><literal>-plistcheck</literal> - Make it fatal for
ports to leave behind files after deinstallation.
</para>
</listitem>

<listitem>
<para><literal>-nodistfiles</literal> - Do not collect distfiles
that pass <command>make checksum</command> for later
uploading to <hostid>ftp-master</hostid>.
</para>
</listitem>

<listitem>
<para><literal>-fetch-original</literal> - Fetch the
distfile from the original <literal>MASTER_SITES</literal>
rather than <hostid>ftp-master</hostid>.
</para>
</listitem>
</itemizedlist>

<para>If the last build finished cleanly you do not need to delete
anything. If it was interrupted, or you selected
<literal>-nocleanup</literal>, you need to clean up clients by running:
</para>

<para><command>build cleanup <replaceable>${arch}</replaceable> <replaceable>${branch}</replaceable> <replaceable>${buildid}</replaceable> -full</command></para>

<para><filename>errors/</filename>,
<filename>logs/</filename>, <filename>packages/</filename>, and so
forth, are cleaned by the scripts. If you are short of space,
you can also clean out <filename>ports/distfiles/</filename>.
Leave the <filename>latest/</filename> directory alone; it is
a symlink for the webserver.</para>

<note>
<para><literal>dosetupnodes</literal> is supposed to be run from
the <literal>dopackages</literal> script in the
<literal>-restart</literal> case, but it can be a good idea to
run it by hand and then verify that the clients all have the
expected job load. Sometimes,
<literal>dosetupnode</literal> cannot clean up a build and you
need to do it by hand. (This is a bug.)</para>
</note>

<para>Make sure the <replaceable>${arch}</replaceable> build
is run as the ports-<replaceable>${arch}</replaceable> user
or it will complain loudly.</para>

<note><para>The actual package build itself occurs in two
identical phases. The reason for this is that sometimes
transient problems (e.g. NFS failures, FTP sites being
unreachable, etc.) may halt a build. Doing things
in two phases is a workaround for these types of
problems.</para></note>

<para>Be careful that <filename>ports/Makefile</filename>
does not specify any empty subdirectories. This is especially
important if you are doing an -exp build. If the build
process encounters an empty subdirectory, both package build
phases will stop short, and an error similar to the following
will be written to
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/make.[0|1]</filename>:
</para>

<screen><literal>don't know how to make dns-all(continuing)</literal></screen>

<para>To correct this problem, simply comment out or remove
the <literal>SUBDIR</literal> entries that point to empty
subdirectories. After doing this, you can restart the build
by running the proper <command>dopackages</command> command
with the <literal>-restart</literal> option.
</para>

<note>
<para>This problem also appears if you create a new category
<filename>Makefile</filename> with no <makevar>SUBDIR</makevar>s
in it. This is probably a bug.</para>
</note>

<example>
<title>Update the i386-6 tree and do a complete build</title>

<para><command>dopackages.6 i386 -nosrc -norestr -nofinish</command></para>
</example>

<example>
<title>Restart an interrupted amd64-8 build without updating</title>

<para><command>dopackages.8 amd64 -nosrc -noports -norestr -continue -noindex -noduds -nofinish</command></para>
</example>

<example>
<title>Post-process a completed sparc64-7 tree</title>

<para><command>dopackages.7 sparc64 -finish</command></para>
</example>
</sect2>

<sect2 id="build-command">
<title><command>build</command> command</title>

<para>You may need to manipulate the build data before starting a build,
especially for experimental builds. This is done with the
<command>build</command> command.</para>

<itemizedlist>
<listitem>
<para><literal>build list <replaceable>arch</replaceable>
<replaceable>branch</replaceable></literal> - Shows the current set
of build ids.
</para>
</listitem>

<listitem>
<para><literal>build clone <replaceable>arch</replaceable>
<replaceable>branch</replaceable> <replaceable>oldid</replaceable>
[<replaceable>newid</replaceable>]</literal> - Clones
<replaceable>oldid</replaceable> to
<replaceable>newid</replaceable> (or a datestamp if not specified).
</para>
</listitem>

<listitem>
<para><literal>build srcupdate <replaceable>arch</replaceable>
<replaceable>branch</replaceable>
<replaceable>buildid</replaceable></literal> - Replaces the src
tree with a new ZFS snapshot. Do not forget to use the
<literal>-nosrc</literal> flag to <command>dopackages</command>
later!
</para>
</listitem>

<listitem>
<para><literal>build portsupdate <replaceable>arch</replaceable>
<replaceable>branch</replaceable></literal> - Replaces the ports
tree with a new ZFS snapshot. Do not forget to use the
<literal>-noports</literal> flag to <command>dopackages</command>
later!
</para>
</listitem>

</itemizedlist>
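
<para>A typical sequence for preparing an experimental build might
look like the sketch below. The architecture, branch, and the use of
<literal>latest</literal> as the buildid are illustrative, and the
commands are assumed to be run from
<filename>/var/portbuild</filename>:</para>

<screen>&prompt.user; <userinput>scripts/build list amd64 7-exp</userinput>
&prompt.user; <userinput>scripts/build srcupdate amd64 7-exp latest</userinput>
&prompt.user; <userinput>scripts/dopackages.7-exp amd64 -nosrc -nofinish</userinput></screen>

<para>Passing <literal>-nosrc</literal> keeps
<command>dopackages</command> from updating the src tree again after
<command>build srcupdate</command> has already replaced it.</para>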
</sect2>

<sect2 id="build-one">
<title>Building a single package</title>

<para>Sometimes there is a need to rebuild a single package from the
package set. This can be accomplished with the following
invocation:</para>

<para><command>/var/portbuild/evil/qmanager/packagebuild <replaceable>amd64</replaceable> <replaceable>7-exp</replaceable> <replaceable>20080904212103</replaceable> <replaceable>aclock-0.2.3_2</replaceable></command></para>
</sect2>
</sect1>

<sect1 id="anatomy">
<title>Anatomy of a Build</title>

<para>A full build without any <literal>-no</literal>
options performs the following operations in the
specified order:</para>

<orderedlist>
<listitem>
<para>An update of the current <literal>ports</literal>
tree from the ZFS snapshot [*]
</para>
</listitem>

<listitem>
<para>An update of the running branch's
<literal>src</literal> tree from the ZFS snapshot [*]
</para>
</listitem>

<listitem>
<para>Checks which ports do not have a
<literal>SUBDIR</literal> entry in their respective
category's <filename>Makefile</filename> [*]
</para>
</listitem>

<listitem>
<para>Creates the <filename>duds</filename> file, which
is a list of ports not to build [*] [+]
</para>
</listitem>

<listitem>
<para>Generates a fresh <filename>INDEX</filename>
file [*] [+]
</para>
</listitem>

<listitem>
<para>Sets up the nodes that will be used in the
build [*] [+]
</para>
</listitem>

<listitem>
<para>Builds a list of restricted ports [*] [+]</para>
</listitem>

<listitem>
<para>Builds packages (phase 1) [++]</para>
</listitem>

<listitem>
<para>Performs another node setup [+]</para>
</listitem>

<listitem>
<para>Builds packages (phase 2) [++]</para>
</listitem>
</orderedlist>

<para>[*] Status of these steps can be found in
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/build.log</filename>
as well as on stderr of the tty running the
<command>dopackages</command> command.</para>

<para>[+] If any of these steps fail, the build will stop
cold in its tracks.</para>

<para>[++] Status of these steps can be found in
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/make.[0|1]</filename>,
where <filename>make.0</filename> is the log file used by
phase 1 of the package build and <filename>make.1</filename>
is the log file used by phase 2. Individual ports will write
their build logs to
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/logs</filename>
and their error logs to
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/errors</filename>.
</para>
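
<para>For example, while phase 1 of an &i386; build on branch 7 is
running, its progress and the number of error logs written so far
could be checked from <filename>/var/portbuild</filename> like this
(the architecture and branch are illustrative):</para>

<screen>&prompt.user; <userinput>tail -f i386/7/make.0</userinput>
&prompt.user; <userinput>ls i386/7/errors | wc -l</userinput></screen>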

<para>Formerly, the docs tree was also checked out; however, this has
been found to be unnecessary.
</para>
</sect1>

<sect1 id="interrupting">
<title>Interrupting a Build</title>

<para>Interrupting a build is a bit messy. First you need to
identify the tty in which it is running (either record the output
of &man.tty.1; when you start the build, or use <command>ps x</command>
to identify it). You need to make sure that nothing else important
is running in this tty, e.g. with <command>ps -t p1</command>.
If there is not, you can just kill off the whole terminal with
<command>pkill -t p1</command>; otherwise issue a
<command>kill -HUP</command> in there by, for example,
<command>ps -t p1 -o pid= | xargs kill -HUP</command>. Replace
<replaceable>p1</replaceable> by whatever the tty is, of course.</para>

<para>The
package builds dispatched by <command>make</command> to
the client machines will clean themselves up after a
few minutes (check with <command>ps x</command> until they
all go away).</para>

<para>If you do not kill &man.make.1;, then it will spawn more jobs.
If you do not kill <command>dopackages</command>, then it will restart
the entire build. If you do not kill the <command>pdispatch</command>
processes, they will keep going (or respawn) until they have built their
package.</para>

<para>To free up resources, you will need to clean up client machines by
running the <command>build cleanup</command> command. For example:</para>

<screen>&prompt.user; <userinput>/var/portbuild/scripts/build cleanup i386 6-exp 20080714120411 -full</userinput></screen>

<para>If you forget to do this, then the old build
<literal>chroot</literal>s will not be cleaned up for 24 hours, and no
new jobs will be dispatched in their place since
<hostid>pointyhat</hostid> thinks the job slot is still occupied.</para>

<para>To check, <command>cat ~/loads/*</command> to display the
status of client machines; the first column is the number of jobs
each client is believed to be running, and this should be roughly
consistent with the load average. <literal>loads</literal> is refreshed
every 2 minutes. If <command>ps x | grep pdispatch</command>
shows fewer processes than the number of jobs that <literal>loads</literal>
thinks are in use, you are in trouble.</para>
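
<para>One way to make that comparison, sketched under the assumption
that each file in <filename>~/loads</filename> holds a single line
whose first field is the job count:</para>

<screen>&prompt.user; <userinput>cat ~/loads/* | awk '{ jobs += $1 } END { print jobs }'</userinput>
&prompt.user; <userinput>ps x | grep -c '[p]dispatch'</userinput></screen>

<para>The first command sums the job counts recorded for all clients;
the second counts the running <command>pdispatch</command> processes.
Writing the pattern as <literal>[p]dispatch</literal> keeps
&man.grep.1; from counting itself.</para>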

<para>You may have problems with the <command>umount</command>
commands hanging. If so, you are going to have to use the
<command>allgohans</command> script to run an &man.ssh.1;
command across all clients for that buildenv. For example:

<screen>ssh -l root gohan24 df</screen>

will show you the &man.df.1; output for one client, and

<screen>allgohans "umount -f pointyhat.freebsd.org:/var/portbuild/i386/6-exp/ports"
allgohans "umount -f pointyhat.freebsd.org:/var/portbuild/i386/6-exp/src"</screen>

are supposed to get rid of the hanging mounts. You will have to
keep doing them since there can be multiple mounts.</para>

<note>
<para>Ignore the following:

<screen>umount: pointyhat.freebsd.org:/var/portbuild/i386/6-exp/ports: statfs: No such file or directory
umount: pointyhat.freebsd.org:/var/portbuild/i386/6-exp/ports: unknown file system
umount: Cleanup of /x/tmp/6-exp/chroot/53837/compat/linux/proc failed!
/x/tmp/6-exp/chroot/53837/compat/linux/proc: not a file system root directory</screen>

The first two mean that the client did not have those file systems mounted;
the last two are a bug.</para>

<para>You may also see messages about <literal>procfs</literal>.</para>
</note>

<para>After you have done all the above, remove the
<filename><replaceable>${arch}</replaceable>/lock</filename>
file before trying to restart the build. If you do not,
<filename>dopackages</filename> will simply exit.
</para>
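
<para>Taken together, the steps in this section amount to something
like the following. The tty name, architecture, branch, and buildid
are the example values used above; substitute your own:</para>

<screen>&prompt.user; <userinput>ps -t p1</userinput>
&prompt.user; <userinput>pkill -t p1</userinput>
&prompt.user; <userinput>/var/portbuild/scripts/build cleanup i386 6-exp 20080714120411 -full</userinput>
&prompt.user; <userinput>cat ~/loads/*</userinput>
&prompt.user; <userinput>rm /var/portbuild/i386/lock</userinput></screen>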

<para>If you have to do a <command>cvs update</command> before
restarting, you may have to rebuild either <filename>duds</filename>,
<filename>INDEX</filename>, or both. If you are doing the latter
manually, you will also have to rebuild
<filename>packages/All/Makefile</filename> via the
<command>makeparallel</command> script.</para>
</sect1>

<sect1 id="monitoring">
<title>Monitoring the Build</title>

<para>You can use the <command>qclient</command> command to monitor the status
of build nodes, and to list the currently scheduled jobs:</para>

<para><command>python /var/portbuild/evil/qmanager/qclient jobs</command></para>
<para><command>python /var/portbuild/evil/qmanager/qclient status</command></para>

<para>The
<command>scripts/stats <replaceable>${branch}</replaceable></command>
command shows the number of packages already built.</para>

<para>Running <command>cat /var/portbuild/*/loads/*</command>
shows the client loads and number of concurrent builds in
progress. Files that have been updated recently belong to clients
that are online; the others belong to offline clients.</para>

<note>
<para>The <command>pdispatch</command> command does the dispatching
of work onto the client, and post-processing.
<command>ptimeout.host</command> is a watchdog that kills a build
after timeouts. So, having 50 <command>pdispatch</command>
processes but only 4 &man.ssh.1; processes means 46
<command>pdispatch</command>es are idle, waiting to get an
idle node.</para>
</note>

<para>Running <command>tail -f <replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/build.log</command>
shows the overall build progress.</para>

<para>If a port build is failing, and it is not immediately obvious
from the log as to why, you can preserve the
<literal>WRKDIR</literal> for further analysis. To do this,
touch a file called <filename>.keep</filename> in the port's
directory. The next time the cluster tries to build this port,
it will tar, compress, and copy the <literal>WRKDIR</literal>
to
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/wrkdirs</filename>.
</para>
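
<para>For example, assuming the port's directory is the one inside
the build's private ports tree (the <filename>ports</filename>
subdirectory name and the category and port placeholders are
assumptions, not taken from the scripts):</para>

<screen>&prompt.user; <userinput>cd <replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/builds/<replaceable>${buildid}</replaceable>/ports/<replaceable>category</replaceable>/<replaceable>portname</replaceable></userinput>
&prompt.user; <userinput>touch .keep</userinput></screen>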

<para>If you find that the system is looping trying to build the
same package over and over again, you may be able to fix the
problem by rebuilding the offending package by hand.</para>

<para>If all the builds start failing with complaints that they
cannot load the dependent packages, check to see that
<application>httpd</application> is still running, and restart
it if not.</para>

<para>Keep an eye on &man.df.1; output. If the
<filename>/var/portbuild</filename> file system becomes full
then <trademark>Bad Things</trademark> happen.
</para>

<para>The status of all current builds is generated twice an hour
and posted to
<ulink url="http://pointyhat.FreeBSD.org/errorlogs/packagestats.html"></ulink>.
For each <literal>buildenv</literal>, the following is displayed:</para>

<itemizedlist>
<listitem>
<para><literal>cvs date</literal> is the contents of
<filename>cvsdone</filename>. This is why we recommend that you
update <filename>cvsdone</filename> for <literal>-exp</literal>
runs (see below).</para>
</listitem>

<listitem>
<para>date of <literal>latest log</literal></para>
</listitem>

<listitem>
<para>number of lines in <literal>INDEX</literal></para>
</listitem>

<listitem>
<para>the number of current <literal>build logs</literal></para>
</listitem>

<listitem>
<para>the number of completed <literal>packages</literal></para>
</listitem>

<listitem>
<para>the number of <literal>errors</literal></para>
</listitem>

<listitem>
<para>the number of duds (shown as <literal>skipped</literal>)</para>
</listitem>

<listitem>
<para><literal>missing</literal> shows the difference between
<filename>INDEX</filename> and the other columns. If you have
restarted a run after a <command>cvs update</command>, there
will likely be duplicates in the packages and error columns,
and this column will be meaningless. (The script is naive.)</para>
</listitem>

<listitem>
<para><literal>running</literal> and <literal>completed</literal>
are guesses based on a &man.grep.1; of <filename>build.log</filename>.
</para>
</listitem>
</itemizedlist>
</sect1>

<sect1 id="errors">
<title>Dealing With Build Errors</title>

<para>The easiest way to track build failures is to receive
the emailed logs and sort them to a folder, so you can maintain a
running list of current failures and detect new ones easily.
To do this, add an email address to
<filename><replaceable>${branch}</replaceable>/portbuild.conf</filename>.
You can easily bounce the new ones to maintainers.</para>

<para>After a port appears broken on every build combination
multiple times, it is time to mark it <literal>BROKEN</literal>.
Two weeks' notification for the maintainers seems fair.</para>

<note>
<para>To avoid build errors with ports that need to be manually
fetched, put the distfiles into
<filename>~ftp/pub/FreeBSD/distfiles</filename>.</para>
</note>
</sect1>

<sect1 id="release">
|
|
<title>Release Builds</title>
|
|
|
|
<para>When building packages for a release, it may be
|
|
necessary to manually update the <literal>ports</literal>
|
|
and <literal>src</literal> trees to the release tag and use
|
|
<literal>-nocvs</literal> and
|
|
<literal>-noportscvs</literal>.</para>
|
|
|
|
<para>To build package sets intended for use on a CD-ROM,
|
|
use the <literal>-cdrom</literal> option to
|
|
<command>dopackages</command>.</para>
|
|
|
|
<para>If the disk space is not available on the cluster, use
|
|
<literal>-nodistfiles</literal> to avoid collecting distfiles.</para>
|
|
|
|
<para>After the initial build completes, restart the build
|
|
with
|
|
<literal>-restart -fetch-original</literal>
|
|
to collect updated distfiles as well. Then, once the
|
|
build is post-processed, take an inventory of the list
|
|
of files fetched:</para>
|
|
|
|
<screen>&prompt.user; <userinput>cd <replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable></userinput>
|
|
&prompt.user; <userinput>find distfiles > distfiles-<replaceable>${release}</replaceable></userinput></screen>
|
|
|
|
<para>This inventory file typically lives in
|
|
<filename>i386/<replaceable>${branch}</replaceable></filename>
|
|
on the cluster master.</para>
|
|
|
|
<para>This is useful to aid in periodically cleaning out
|
|
the distfiles from <hostid>ftp-master</hostid>. When space
|
|
gets tight, distfiles from recent releases can be kept while
|
|
others can be thrown away.</para>
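<para>The inventories can be compared against the current distfile
listing to find candidates for removal. The following is only a
minimal sketch: the function name <literal>list_prunable</literal>
is hypothetical, and it assumes that both the current listing and the
inventories were produced with <command>find distfiles</command> as
shown above, so that the paths are comparable. It deletes nothing;
review the output by hand before removing any file.</para>

<programlisting>
```shell
#!/bin/sh
# list_prunable CURRENT_LIST KEEP_INVENTORY...
# Prints every path in CURRENT_LIST that appears in none of the
# KEEP_INVENTORY files. Nothing is deleted; the output is only a
# candidate list for manual review.
list_prunable() {
    [ "$#" -ge 2 ] || {
        echo "usage: list_prunable CURRENT_LIST KEEP_INVENTORY..."
        return 1
    }
    current=$1
    shift
    keep=$(mktemp)
    all=$(mktemp)
    # Merge the inventories of the releases to keep into one sorted list.
    sort -u "$@" > "$keep"
    sort -u "$current" > "$all"
    # comm -23 prints the lines found only in the first file.
    comm -23 "$all" "$keep"
    rm -f "$keep" "$all"
}

# Example (file names are illustrative):
#   find distfiles > /tmp/distfiles-now
#   list_prunable /tmp/distfiles-now distfiles-7.0 distfiles-6.3
```
</programlisting>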
|
|
|
|
<para>Once the distfiles have been uploaded (see below),
|
|
the final release package set must be created. Just to be
|
|
on the safe side, run the
|
|
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/cdrom.sh</filename>
|
|
script by hand to make sure all the CD-ROM restricted packages
|
|
and distfiles have been pruned. Then, copy the
|
|
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/packages</filename>
|
|
directory to
|
|
<filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/packages-<replaceable>${release}</replaceable></filename>.
|
|
Once the packages are safely moved off, contact the &a.re;
|
|
and inform them of the release package location.</para>
|
|
|
|
<para>Remember to coordinate with the &a.re; about the timing
|
|
and status of the release builds.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="uploading">
|
|
<title>Uploading Packages</title>
|
|
|
|
<para>Once a build has completed, packages and/or distfiles
|
|
can be transferred to <hostid>ftp-master</hostid> for
|
|
propagation to the FTP mirror network. If the build was
|
|
run with <literal>-nofinish</literal>, then make sure to
|
|
follow up with
|
|
<command>dopackages -finish</command> to post-process the
|
|
packages (removes <literal>RESTRICTED</literal> and
|
|
<literal>NO_CDROM</literal> packages where appropriate,
|
|
prunes packages not listed in <filename>INDEX</filename>,
|
|
removes from <filename>INDEX</filename>
|
|
references to packages not built, and generates a
|
|
<filename>CHECKSUM.MD5</filename>
|
|
summary); and distfiles (moves them from the temporary
|
|
<filename>distfiles/.pbtmp</filename> directory into
|
|
<filename>distfiles/</filename> and removes
|
|
<literal>RESTRICTED</literal> and <literal>NO_CDROM</literal>
|
|
distfiles).</para>
|
|
|
|
<para>It is usually a good idea to run the
|
|
<command>restricted.sh</command> and/or
|
|
<command>cdrom.sh</command> scripts by hand after
|
|
<command>dopackages</command> finishes just to be safe.
|
|
Run the <command>restricted.sh</command> script before
|
|
uploading to <hostid>ftp-master</hostid>, then run
|
|
<command>cdrom.sh</command> before preparing
|
|
the final package set for a release.</para>
|
|
|
|
<para>The package subdirectories are named by whether they are for
|
|
<literal>release</literal>, <literal>stable</literal>, or
|
|
<literal>current</literal>. Examples:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para><literal>packages-6.3-release</literal></para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para><literal>packages-6-stable</literal></para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para><literal>packages-7.0-release</literal></para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para><literal>packages-7-stable</literal></para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para><literal>packages-8-current</literal></para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<note><para>Some of the directories on
|
|
<hostid>ftp-master</hostid> are, in fact, symlinks. Examples:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para><literal>packages-stable</literal></para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para><literal>packages-current</literal></para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>Be sure you move the new packages directory over the
|
|
<emphasis>real</emphasis> destination directory, and not
|
|
one of the symlinks that points to it.</para>
|
|
</note>
|
|
|
|
<para>If you are doing a completely new package set (e.g. for
|
|
a new release), copy packages to the staging area on
|
|
<hostid>ftp-master</hostid> with something like the following:</para>
|
|
|
|
<screen>&prompt.root; <userinput>cd /var/portbuild/<replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable></userinput>
|
|
&prompt.root; <userinput>tar cfv - packages/ | ssh portmgr@ftp-master tar xfC - w/ports/<replaceable>${arch}</replaceable>/tmp/<replaceable>${subdir}</replaceable></userinput></screen>
|
|
|
|
<para>Then log into <hostid>ftp-master</hostid>, verify that
|
|
the package set was transferred successfully, remove the
|
|
package set that the new package set is to replace (in
|
|
<filename>~/w/ports/<replaceable>${arch}</replaceable></filename>),
|
|
and move the new set into place. (<literal>w/</literal> is
|
|
merely a shortcut.)</para>
|
|
|
|
<para>For incremental builds, packages should be uploaded
|
|
using <command>rsync</command> so we do not put too much
|
|
strain on the mirrors.</para>
|
|
|
|
<para><emphasis>ALWAYS</emphasis> use <literal>-n</literal>
|
|
first with <command>rsync</command> and check the output
|
|
to make sure it is sane. If it looks good, re-run the
|
|
<command>rsync</command> without the <literal>-n</literal>
|
|
option.
|
|
</para>
|
|
|
|
<para>Example <command>rsync</command> command for incremental
|
|
package upload:</para>
|
|
|
|
<screen>&prompt.root; <userinput>rsync -n -r -v -l -t -p --delete packages/ portmgr@ftp-master:w/ports/<replaceable>${arch}</replaceable>/<replaceable>${subdir}</replaceable>/ | tee log</userinput></screen>
|
|
|
|
<para>Distfiles can be transferred with the
|
|
<command>cpdistfiles</command> script:</para>
|
|
|
|
<screen>&prompt.root; <userinput>/var/portbuild/scripts/cpdistfiles <replaceable>${arch}</replaceable> <replaceable>${branch}</replaceable></userinput></screen>
|
|
|
|
<para>Or you can do it by hand using the <command>rsync</command>
|
|
command:</para>
|
|
|
|
<screen>&prompt.root; <userinput>cd /var/portbuild/<replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable></userinput>
|
|
&prompt.root; <userinput>rsync -n -r -v -l -p -c distfiles/ portmgr@ftp-master:w/ports/distfiles/ | tee log</userinput></screen>
|
|
|
|
<para>Again, run the command without the <literal>-n</literal>
|
|
option after you have checked it.</para>
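<para>The dry-run-then-transfer habit described in this section can
be wrapped in a small shell function so that the real transfer never
runs before the dry run has been reviewed. This is a sketch only:
<literal>rsync_checked</literal> is a hypothetical name, not one of
the cluster scripts, and it assumes that the arguments passed to it
do not already contain <literal>-n</literal>.</para>

<programlisting>
```shell
#!/bin/sh
# rsync_checked ARGS...
# Runs rsync with -n (dry run) first, shows its output, and repeats
# the transfer for real only after the operator answers "y".
# ARGS are passed to rsync unchanged.
rsync_checked() {
    echo "=== dry run ==="
    rsync -n "$@" || return 1
    echo "Does the output look sane? Run for real? [y/N]"
    read -r answer
    case "$answer" in
        y|Y) rsync "$@" ;;
        *)   echo "Aborted, nothing transferred."
             return 1 ;;
    esac
}

# Example, using the options from the incremental package upload above
# (without -n, which the function adds itself):
#   rsync_checked -r -v -l -t -p --delete packages/ \
#       portmgr@ftp-master:w/ports/${arch}/${subdir}/
```
</programlisting>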
|
|
</sect1>
|
|
|
|
<sect1 id="expbuilds">
|
|
<title>Experimental Patches Builds</title>
|
|
|
|
<para>Experimental patches builds are run from time to time to test
|
|
new features or bugfixes to the ports infrastructure (i.e.
|
|
<literal>bsd.port.mk</literal>), or to test large sweeping
|
|
upgrades. The current experimental patches branch is
|
|
<literal>7-exp</literal> on the &i386;
|
|
architecture.</para>
|
|
|
|
<para>In general, an experimental patches build is run the same
|
|
way as any other build, except that you should first update the
|
|
ports tree to the latest version and then apply your patches.
|
|
To do the former, you can use the following:
|
|
|
|
<screen>&prompt.user; <userinput>cvs -R update -dP > update.out</userinput>
|
|
&prompt.user; <userinput>date > cvsdone</userinput></screen>
|
|
This will most closely simulate what the <literal>dopackages</literal>
|
|
script does. (While <filename>cvsdone</filename> is merely
|
|
informative, it can be a help.)</para>
|
|
|
|
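<para>You will need to review <filename>update.out</filename> and look
for lines matching <literal>^M</literal>, <literal>^C</literal>,
or <literal>^?</literal>, and then deal with them. The caret here is
the regular-expression anchor for the start of a line, not a control
character: <literal>M</literal> marks a locally modified file,
<literal>C</literal> a merge conflict, and <literal>?</literal> a
file that is not under CVS control.</para>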
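<para>Instead of reading <filename>update.out</filename> in an
editor, the interesting lines can be extracted with
<command>grep</command>. The sketch below assumes the usual
<command>cvs update</command> output format of one status letter
followed by a space and the file name; the function name
<literal>cvs_attention</literal> is made up for this example.</para>

<programlisting>
```shell
#!/bin/sh
# cvs_attention FILE
# Prints the lines of a saved "cvs update" log that need attention:
#   M file   locally modified
#   C file   conflict during merge
#   ? file   not under CVS control
# Exit status is 0 when such lines exist and 1 when the log is clean.
cvs_attention() {
    grep -E '^[MC?] ' "$1"
}

# Example:
#   cvs -R update -dP > update.out
#   cvs_attention update.out
```
</programlisting>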
|
|
|
|
<para>It is always a good idea to save
|
|
original copies of all changed files, as well as a list of what
|
|
you are changing. You can then look back on this list when doing
|
|
the final commit, to make sure you are committing exactly what you
|
|
tested.</para>
|
|
|
|
<para>Since the machine is shared, someone else may delete your
|
|
changes by mistake, so keep a copy of them in e.g. your home
|
|
directory on <hostid>freefall</hostid>. Do not use
|
|
<filename>tmp/</filename>; since <hostid>pointyhat</hostid>
|
|
itself runs some version of <literal>-CURRENT</literal>, you
|
|
can expect reboots (if nothing else, for updates).</para>
|
|
|
|
<para>In order to have a good control case with which to compare
|
|
failures, you should first do a package build of the branch on
|
|
which the experimental patches branch is based for the &i386;
|
|
architecture (currently this is <literal>6</literal>). Then, when
|
|
preparing for the experimental patches build, check out a ports
|
|
tree and a src tree with the same date as was used for the control
|
|
build. This will ensure an apples-to-apples comparison
|
|
later.</para>
|
|
|
|
<note><para>One build cluster can do the control build while the other
|
|
does the experimental patches build. This can be a great
|
|
time-saver.</para></note>
|
|
|
|
<para>Once the build finishes, compare the control build failures
|
|
to those of the experimental patches build. Use the following
|
|
commands to facilitate this (this assumes the <literal>6</literal>
|
|
branch is the control branch, and the <literal>6-exp</literal>
|
|
branch is the experimental patches branch):</para>
|
|
|
|
<screen>&prompt.user; <userinput>cd /var/portbuild/i386/6-exp/errors</userinput>
|
|
&prompt.user; <userinput>find . -name \*.log\* | sort > /tmp/6-exp-errs</userinput>
|
|
&prompt.user; <userinput>cd /var/portbuild/i386/6/errors</userinput>
|
|
&prompt.user; <userinput>find . -name \*.log\* | sort > /tmp/6-errs</userinput></screen>
|
|
|
|
<note><para>If it has been a long time since one of the builds
|
|
finished, the logs may have been automatically compressed with
|
|
bzip2. In that case, you must use <literal>sort | sed
|
|
's,\.bz2,,g'</literal> instead.</para></note>
|
|
|
|
<screen>&prompt.user; <userinput>comm -3 /tmp/6-errs /tmp/6-exp-errs | less</userinput></screen>
|
|
|
|
<para>This last command will produce a two-column report. The
|
|
first column lists ports that failed in the control build but not in
|
|
the experimental patches build; the second column is vice versa.
|
|
Reasons that the port might be in the first column
|
|
include:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>Port was fixed since the control build was run, or was
|
|
upgraded to a newer version that is also broken (thus the
|
|
newer version should appear in the second column)
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Port is fixed by the patches in the experimental patches
|
|
build
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Port did not build under the experimental patches build
|
|
due to a dependency failure
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>Reasons for a port appearing in the second column
|
|
include:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>Port was broken by the experimental patches [1]</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Port was upgraded since the control build and has become
|
|
broken [2]
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Port was broken due to a transient error (e.g. FTP site
|
|
down, package client error, etc.)
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
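<para>The comparison described above can also be scripted so that
compressed logs are handled automatically and each column is printed
as its own labelled list. This is a sketch under assumptions: the
function names <literal>list_errors</literal> and
<literal>compare_errors</literal> are hypothetical, and the
<literal>.bz2</literal> suffix is stripped before sorting so that
both lists stay in the order <command>comm</command> expects.</para>

<programlisting>
```shell
#!/bin/sh
# list_errors DIR
# Prints a sorted list of the error logs under DIR, with any .bz2
# suffix removed so compressed and uncompressed logs compare equal.
list_errors() {
    (cd "$1" || exit 1; find . -name '*.log*' | sed 's,\.bz2$,,' | sort)
}

# compare_errors CONTROL_DIR EXP_DIR
# Prints two labelled lists instead of the two-column comm -3 report.
compare_errors() {
    control=$(mktemp)
    exp=$(mktemp)
    list_errors "$1" > "$control"
    list_errors "$2" > "$exp"
    echo "Failed only in control build:"
    comm -23 "$control" "$exp"
    echo "Failed only in experimental patches build:"
    comm -13 "$control" "$exp"
    rm -f "$control" "$exp"
}

# Example:
#   compare_errors /var/portbuild/i386/6/errors \
#       /var/portbuild/i386/6-exp/errors | less
```
</programlisting>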
|
|
|
|
<para>Both columns should be investigated and the reason for the
|
|
errors understood before committing the experimental patches set.
|
|
To differentiate between [1] and [2] above, you can do a rebuild
|
|
of the affected packages under the control branch:</para>
|
|
|
|
<screen>&prompt.user; <userinput>cd /var/portbuild/i386/6/ports</userinput></screen>
|
|
|
|
<note><para>Be sure to <literal>cvs update</literal> this tree to the same date as
|
|
the experimental patches tree.</para></note>
|
|
|
|
<para>The following command will set up the control branch for
|
|
the partial build:</para>
|
|
|
|
<screen>&prompt.user; <userinput>/var/portbuild/scripts/dopackages.6 -noportscvs -nobuild -nocvs -nofinish</userinput></screen>
|
|
|
|
<para>The builds must be performed from the
|
|
<literal>packages/All</literal> directory. This directory should
|
|
initially be empty except for the Makefile symlink. If this
|
|
symlink does not exist, it must be created:</para>
|
|
|
|
<screen>&prompt.user; <userinput>cd /var/portbuild/i386/6/packages/All</userinput>
|
|
&prompt.user; <userinput>ln -sf ../../Makefile .</userinput>
|
|
&prompt.user; <userinput>make -k -j<#> <list of packages to build></userinput></screen>
|
|
|
|
<note><para><#> is the concurrency of the build to
|
|
attempt. It is usually the sum of the weights listed in
|
|
<filename>/var/portbuild/i386/mlist</filename> unless you have a
|
|
reason to run a heavier or lighter build.</para>
|
|
|
|
<para>The list of packages to build should be a list of package
|
|
names (including versions) as they appear in
|
|
<filename>INDEX</filename>. The <literal>PKGSUFFIX</literal>
|
|
(i.e. .tgz or .tbz) is optional.</para></note>
|
|
|
|
<para>This will build only those packages listed as well as all
|
|
of their dependencies.</para>
|
|
|
|
<para>You can check the progress of this
|
|
partial build the same way you would a regular build.</para>
|
|
|
|
<para>Once all
|
|
the errors have been resolved, you can commit the experimental patches set.
|
|
After committing, it is customary to send a <literal>HEADS
|
|
UP</literal> email to <ulink
|
|
url="mailto:ports@FreeBSD.org">ports@FreeBSD.org</ulink> and
|
|
copy <ulink
|
|
url="mailto:ports-developers@FreeBSD.org">ports-developers@FreeBSD.org</ulink>
|
|
informing people of the changes. A summary of all changes
|
|
should also be committed to
|
|
<filename>/usr/ports/CHANGES</filename>.</para>
|
|
</sect1>
|
|
|
|
<sect1 id="disk-failure">
|
|
<title>Procedures for Dealing with Disk Failures</title>
|
|
|
|
<para>When a machine has a disk failure (e.g. panics due to read errors),
take the following steps:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem><para>Note the time and failure mode (e.g. paste in the
|
|
relevant console output) in
|
|
<filename>/var/portbuild/<replaceable>${arch}</replaceable>/reboots</filename></para></listitem>
|
|
|
|
<listitem><para>For i386 gohan clients, scrub the disk by touching
|
|
<filename>/SCRUB</filename> in the nfsroot (e.g.
|
|
<filename>/a/nfs/8.dir1/SCRUB</filename>) and rebooting. This will run
<command>dd if=/dev/zero of=/dev/ad0</command> and force the drive to
|
|
remap any bad sectors it finds, if it has enough spares left. This is
|
|
a temporary measure to extend the lifetime of a drive that is on the
|
|
way out.</para>
|
|
|
|
<note><para>For the i386 blade systems another signal of a failing
|
|
disk seems to be that the blade will completely hang and be
|
|
unresponsive to either console break, or even NMI.</para></note>
|
|
|
|
<para>For other build systems that do not newfs their disk at boot (e.g.
|
|
amd64 systems) this step has to be skipped.</para></listitem>
|
|
|
|
<listitem><para>If the problem recurs, then the disk is probably toast.
|
|
Take the machine out of <filename>mlist</filename> and (for ATA disks)
|
|
run <command>smartctl</command> on the drive:</para>
|
|
|
|
<screen>smartctl -t long /dev/ad0</screen>
|
|
|
|
<para>The test takes about half an hour:</para>
|
|
|
|
<screen>gohan51# smartctl -t long /dev/ad0
|
|
smartctl version 5.38 [i386-portbld-freebsd8.0] Copyright (C) 2002-8
|
|
Bruce Allen
|
|
Home page is http://smartmontools.sourceforge.net/
|
|
|
|
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
|
|
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
|
|
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
|
|
Testing has begun.
|
|
Please wait 31 minutes for test to complete.
|
|
Test will complete after Fri Jul 4 03:59:56 2008
|
|
|
|
Use smartctl -X to abort test.</screen>
|
|
|
|
<para>Then <command>smartctl -a /dev/ad0</command> shows the status
|
|
after it finishes:</para>
|
|
|
|
<screen># SMART Self-test log structure revision number 1
|
|
# Num Test_Description Status Remaining
|
|
LifeTime(hours) LBA_of_first_error
|
|
# 1 Extended offline Completed: read failure 80% 15252 319286</screen>
|
|
|
|
<para>It will also display other data including a log of previous drive
|
|
errors. It is possible for the drive to show previous DMA errors
|
|
without failing the self-test though (because of sector
|
|
remapping).</para></listitem>
|
|
</itemizedlist>
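<para>The self-test result can be checked from a script rather than
by eye. The sketch below only looks for the status string that
appears in the sample output above,
<literal>Completed: read failure</literal>; other failure statuses
that <command>smartctl</command> may report are not covered, and the
function name <literal>selftest_failed</literal> is hypothetical.</para>

<programlisting>
```shell
#!/bin/sh
# selftest_failed
# Reads "smartctl -a" output on standard input and succeeds (exit 0)
# when the self-test log reports a test that ended in a read failure.
selftest_failed() {
    grep -q 'Completed: read failure'
}

# Example:
#   smartctl -a /dev/ad0 | selftest_failed || echo "no read failure logged"
```
</programlisting>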
|
|
|
|
<para>When a disk has failed, please inform &a.kris; so he can try to get it
|
|
replaced.</para>
|
|
</sect1>
|
|
</article>
|