doc/en_US.ISO8859-1/articles/portbuild/article.sgml

<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
<!ENTITY % articles.ent PUBLIC "-//FreeBSD//ENTITIES DocBook FreeBSD Articles Entity Set//EN">
%articles.ent;
]>

<article>
  <articleinfo>
    <title>Package Building Procedures</title>

    <authorgroup>
      <corpauthor>The &os; Ports Management Team</corpauthor>
    </authorgroup>

    <pubdate>$FreeBSD$</pubdate>

    <copyright>
      <year>2003</year>
      <year>2004</year>
      <year>2005</year>
      <year>2006</year>
      <year>2007</year>
      <year>2008</year>
      <holder role="mailto:portmgr@FreeBSD.org">The &os; Ports
	Management Team</holder>
    </copyright>

    <legalnotice id="trademarks" role="trademarks">
      &tm-attrib.freebsd;
      &tm-attrib.intel;
      &tm-attrib.sparc;
      &tm-attrib.general;
    </legalnotice>
  </articleinfo>

  <sect1 id="intro">
    <title>Introduction and Conventions</title>

    <para>In order to provide pre-compiled binaries of third-party
      applications for &os;, the Ports Collection is regularly
      built on one of the <quote>Package Building Clusters.</quote>
      Currently, the main cluster in use is at
      <ulink url="http://pointyhat.FreeBSD.org"></ulink>.</para>

    <para>Most of the package building magic occurs under the
      <filename>/var/portbuild</filename> directory.  Unless
      otherwise specified, all paths will be relative to
      this location.  <replaceable>${arch}</replaceable> will
      be used to specify one of the package architectures
      (amd64, &i386;, and &sparc64;), and
      <replaceable>${branch}</replaceable> will be used
      to specify the build branch (6, 6-exp, 7, 7-exp, 8, 8-exp).
    </para>

    <note>
      <para>Packages are no longer built for Release 4 or 5, or
	for the alpha or ia64 architectures.</para>
    </note>

    <para>The scripts that control all of this live in
      <filename>/var/portbuild/scripts/</filename>.  These are the
      checked-out copies from
      <filename>/usr/ports/Tools/portbuild/scripts/</filename>.</para>

    <para>Typically, incremental builds are done that use previous
      packages as dependencies; this takes less time, and puts less
      load on the mirrors.  Full builds are usually only done:</para>

    <itemizedlist>
      <listitem><para>right after release time, for the
	<literal>-STABLE</literal> branches</para></listitem>

      <listitem><para>every month or so, for <literal>-CURRENT</literal>
	</para></listitem>

      <listitem><para>for experimental builds</para></listitem>
    </itemizedlist>
  </sect1>

  <sect1 id="management">
    <title>Build Client Management</title>

    <para>The &i386; clients currently
      netboot from <hostid>pointyhat</hostid>; the other clients
      are self-hosted.  In all cases they set themselves
      up at boot-time to prepare to build packages.</para>

    <para>In addition to connected nodes,
      <replaceable>disconnected</replaceable> cluster nodes are
      also supported.  A disconnected node is
      one that does not mount the cluster master via NFS; a remote
      node is a typical example.  The cluster master uses rsync to copy
      the interesting data (ports and src trees, bindist tarballs,
      scripts, etc.) to disconnected nodes during the node-setup
      phase.  Then, the disconnected portbuild directory is
      nullfs-mounted for chroot builds.</para>

    <para>The
      <username>ports-<replaceable>${arch}</replaceable></username>
      user can &man.ssh.1; to the client nodes to monitor them.
      Use <command>sudo</command> and check the
      <filename>portbuild.<replaceable>hostname</replaceable>.conf</filename>
      file for the user and access details.</para>

    <para>The <command>scripts/allgohans</command> script can
      be used to run a command on all of the
      <replaceable>${arch}</replaceable> clients.</para>

    <para>The <command>scripts/checkmachines</command> script
      is used to monitor the load on all the nodes of the
      build cluster, and schedule which nodes build which ports.
      This script is not very robust, and has a tendency to die.
      It is best to start up this script on the build master
      (e.g. <hostid>pointyhat</hostid>)
      after boot time using a &man.while.1; loop.
    </para>
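
    <para>Because <command>checkmachines</command> tends to exit
      unexpectedly, one way to keep it running is a restart loop.
      The following is a minimal sketch and not the cluster's actual
      startup procedure: the <literal>keep_running</literal> helper,
      its restart limit, and the delay between runs are assumptions
      made for illustration.</para>

```shell
#!/bin/sh
# keep_running MAX DELAY COMMAND [ARGS...]
# Hypothetical helper (not part of the portbuild scripts): run COMMAND
# again each time it exits, at most MAX times, sleeping DELAY seconds
# after each run.  Prints the number of runs performed.
keep_running() {
    max=$1
    delay=$2
    shift 2
    runs=0
    while [ "$runs" -lt "$max" ]; do
        "$@" || true            # ignore the exit status; just restart
        runs=$((runs + 1))
        sleep "$delay"
    done
    echo "$runs"
}

# On the build master this might be used as follows (not executed here):
#   cd /var/portbuild
#   keep_running 1000 60 scripts/checkmachines
```

    <para>A bounded loop is used here so that a script which fails
      immediately does not spin forever; an unbounded
      <literal>while true</literal> loop would work the same
      way.</para>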
  </sect1>

  <sect1 id="setup">
    <title>Chroot Build Environment Setup</title>

    <para>Package builds are performed in a
      <literal>chroot</literal> populated by the
      <filename>portbuild</filename> script using the
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/builds/<replaceable>${buildid}</replaceable>/bindist.tar</filename>
      file.</para>

    <para>The following command builds a world from the
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/src</filename>
      tree and installs it into
      <replaceable>${worlddir}</replaceable>.  The tree will
      be updated first unless <literal>-nocvs</literal> is
      specified.</para>

    <screen>/var/portbuild&prompt.root; <userinput>scripts/makeworld <replaceable>${arch}</replaceable> <replaceable>${branch}</replaceable> <replaceable>${buildid}</replaceable> [-nocvs]</userinput></screen>

    <para>The <filename>bindist.tar</filename> tarball is created from the
      previously installed world by the <command>mkbindist</command>
      script.  It should be run as <username>root</username> with the following
      command:</para>

    <screen>/var/portbuild&prompt.root; <userinput>scripts/mkbindist <replaceable>${arch}</replaceable> <replaceable>${branch}</replaceable> <replaceable>${buildid}</replaceable></userinput></screen>

    <para>The per-machine tarballs are located in
      <filename><replaceable>${arch}</replaceable>/clients</filename>.</para>

    <para>The <filename>bindist.tar</filename> file is extracted
      onto each client at client boot time, and at the start of
      each pass of the <command>dopackages</command>
      script.
    </para>
  </sect1>

  <sect1 id="starting">
    <title>Starting the Build</title>

    <para>Several separate builds for each architecture and branch combination
      are supported.  All data private to a build (ports tree, src tree,
      packages, distfiles, log files, bindist, Makefile, etc.) is located under
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/builds/<replaceable>${buildid}</replaceable></filename>.
      The most recently created build can also be referenced by the buildid
      <literal>latest</literal>, and the one before it by
      <literal>previous</literal>.</para>

    <para>New builds are cloned from the <literal>latest</literal>, which is
      fast since it uses ZFS.</para>
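
    <para>The directory layout described above can be expressed as a
      small helper.  This is only a sketch: the
      <literal>build_dir</literal> function and the
      <literal>PB_ROOT</literal> variable are hypothetical names, not
      part of the portbuild scripts.</para>

```shell
#!/bin/sh
# Root of the package build tree, as used throughout this article.
# PB_ROOT is a hypothetical variable name chosen for this sketch.
: "${PB_ROOT:=/var/portbuild}"

# build_dir ARCH BRANCH [BUILDID]
# Print the directory holding the private data of one build.
# BUILDID defaults to "latest", the most recently created build.
build_dir() {
    echo "${PB_ROOT}/$1/$2/builds/${3:-latest}"
}

build_dir amd64 8
build_dir i386 6-exp previous
```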

    <sect2 id="build-dopackages">
      <title><command>dopackages</command> scripts</title>

    <para>The <filename>scripts/dopackages*</filename> scripts
      are used to perform the builds.  Most useful are:</para>

    <itemizedlist>
      <listitem>
	<para><command>dopackages.6</command> - Perform
	  a 6.X build
	</para>
      </listitem>

      <listitem>
	<para><command>dopackages.6-exp</command> - Perform
	  a 6.X build with experimental patches
	  (6-exp branch)
	</para>
      </listitem>

      <listitem>
	<para><command>dopackages.7</command> - Perform
	  a 7.X build
	</para>
      </listitem>

      <listitem>
	<para><command>dopackages.7-exp</command> - Perform
	  a 7.X build with experimental patches
	  (7-exp branch)
	</para>
      </listitem>

      <listitem>
	<para><command>dopackages.8</command> - Perform
	  an 8.X build
	</para>
      </listitem>

      <listitem>
	<para><command>dopackages.8-exp</command> - Perform
	  an 8.X build with experimental patches
	  (8-exp branch)
	</para>
      </listitem>
    </itemizedlist>

    <para>These are wrappers around <command>dopackages</command>,
      and are all symlinked to <command>dopackages.wrapper</command>.
      New branch wrapper scripts can be created by symlinking
      <command>dopackages.${branch}</command> to
      <command>dopackages.wrapper</command>.  These scripts
      take a number of arguments.  For example:</para>

    <screen><command>dopackages.6 <replaceable>${arch}</replaceable> <literal>[-options]</literal></command></screen>

    <para><literal>[-options]</literal> may be zero or more of the
      following:</para>

    <itemizedlist>
      <listitem>
	<para><literal>-keep</literal> - Do not delete this build in the
	  future, when it would be normally deleted as part of the
	  <literal>latest</literal> - <literal>previous</literal> cycle.
	  Don't forget to clean it up manually when you no longer need it.
	</para>
      </listitem>

      <listitem>
	<para><literal>-nofinish</literal> - Do not perform
	  post-processing once the build is complete.  Useful
	  if you expect that the build will need to be restarted
	  once it finishes.  If you use this option, don't forget to clean up
	  the clients when you don't need the build anymore.
	</para>
      </listitem>

      <listitem>
	<para><literal>-finish</literal> - Perform
	  post-processing only.
	</para>
      </listitem>

      <listitem>
	<para><literal>-nocleanup</literal> - By default, when the
	  <literal>-finish</literal> stage of the build is complete, the build
	  data will be deleted from the clients.  This option will prevent
	  that.</para>
      </listitem>

      <listitem>
	<para><literal>-restart</literal> - Restart an interrupted
	  (or non-<literal>finish</literal>ed) build from the
	  beginning.  Ports that failed on the previous build will
	  be rebuilt.
	</para>
      </listitem>

      <listitem>
	<para><literal>-continue</literal> - Restart an interrupted
	  (or non-<literal>finish</literal>ed) build.  Will not
	  rebuild ports that failed on the previous build.
	</para>
      </listitem>

      <listitem>
	<para><literal>-incremental</literal> - Compare the
	  interesting fields of the new
	  <literal>INDEX</literal> with the previous one,
	  remove packages and log files for the old ports that
	  have changed, and rebuild only what is then missing.  This
	  cuts down on build times substantially since
	  unchanged ports do not get rebuilt every time.
	</para>
      </listitem>

      <listitem>
	<para><literal>-cdrom</literal> - This package build is
	  intended to end up on a CD-ROM, so
	  <literal>NO_CDROM</literal> packages and distfiles
	  should be deleted in post-processing.
	</para>
      </listitem>

      <listitem>
	<para><literal>-nobuild</literal> - Perform all
	  the preprocessing steps, but do not actually do
	  the package build.
	</para>
      </listitem>

      <listitem>
	<para><literal>-noindex</literal> - Do not rebuild
	  <filename>INDEX</filename> during preprocessing.
	</para>
      </listitem>

      <listitem>
	<para><literal>-noduds</literal> - Do not rebuild the
	  <filename>duds</filename> file (ports that are never
	  built, e.g.  those marked <literal>IGNORE</literal>,
	  <literal>NO_PACKAGE</literal>, etc.) during
	  preprocessing.
	</para>
      </listitem>

      <listitem>
	<para><literal>-trybroken</literal> - Try to build
	  <literal>BROKEN</literal> ports.  This is off by default:
	  the amd64/&i386; clusters are now fast enough that, when
	  doing incremental builds, more time was being spent
	  rebuilding things that were going to fail anyway, while
	  the other clusters are slow enough that trying to build
	  <literal>BROKEN</literal> ports would be a waste of time.
	</para>
      </listitem>

      <listitem>
	<para><literal>-nosrc</literal> - Do not update the
	  <literal>src</literal> tree from the ZFS snapshot; keep the tree from
	  the previous build instead.
	</para>
      </listitem>

      <listitem>
	<para><literal>-srccvs</literal> - Do not update the
	  <literal>src</literal> tree from the ZFS snapshot, update it with
	  <literal>cvs update</literal> instead.
	</para>
      </listitem>

      <listitem>
	<para><literal>-noports</literal> - Do not update the
	  <literal>ports</literal> tree from the ZFS snapshot; keep the tree from
	  the previous build instead.
	</para>
      </listitem>

      <listitem>
	<para><literal>-portscvs</literal> - Do not update the
	  <literal>ports</literal> tree from the ZFS snapshot, update it with
	  <literal>cvs update</literal> instead.
	</para>
      </listitem>

      <listitem>
	<para><literal>-norestr</literal> - Do not attempt to build
	  <literal>RESTRICTED</literal> ports.
	</para>
      </listitem>

      <listitem>
	<para><literal>-plistcheck</literal> - Make it fatal for
	  ports to leave behind files after deinstallation.
	</para>
      </listitem>

      <listitem>
	<para><literal>-nodistfiles</literal> - Do not collect distfiles
	  that pass <command>make checksum</command> for later
	  uploading to <hostid>ftp-master</hostid>.
	</para>
      </listitem>

      <listitem>
	<para><literal>-fetch-original</literal> - Fetch the
	  distfile from the original <literal>MASTER_SITES</literal>
	  rather than <hostid>ftp-master</hostid>.
	</para>
      </listitem>
    </itemizedlist>
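
    <para>The comparison behind the <literal>-incremental</literal>
      option in the list above can be pictured with the following
      sketch.  It is not the code used by
      <command>dopackages</command>: the
      <literal>changed_ports</literal> function is hypothetical, and
      treating the whole <filename>INDEX</filename> line as the
      <quote>interesting fields</quote> is a simplifying assumption.
      It relies only on the <filename>INDEX</filename> format, in which
      fields are separated by <literal>|</literal> and the first field
      is the package name.</para>

```shell
#!/bin/sh
# changed_ports OLD_INDEX NEW_INDEX
# Hypothetical helper: print the package name (first "|"-separated
# field) of every entry in OLD_INDEX whose line does not appear
# unchanged in NEW_INDEX.  These are the old packages that an
# incremental build would have to discard and rebuild.
changed_ports() {
    awk -F'|' '
        FILENAME == ARGV[1] { new[$0] = 1; next }  # remember new lines
        !($0 in new)        { print $1 }           # old line not kept
    ' "$2" "$1"
}

# Example (not executed here):
#   changed_ports INDEX.previous INDEX.new
```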

    <para>If the last build finished cleanly you do not need to delete
      anything.  If it was interrupted, or you selected
      <literal>-nocleanup</literal>, you need to clean up the clients by
      running:
    </para>

    <para><command>build cleanup <replaceable>${arch}</replaceable> <replaceable>${branch}</replaceable> <replaceable>${buildid}</replaceable> -full</command></para>

    <para><filename>errors/</filename>,
      <filename>logs/</filename>, <filename>packages/</filename>, and so
      forth, are cleaned by the scripts.  If you are short of space,
      you can also clean out <filename>ports/distfiles/</filename>.
      Leave the <filename>latest/</filename> directory alone; it is
      a symlink for the webserver.</para>

    <note>
      <para><literal>dosetupnodes</literal> is supposed to be run from
	the <literal>dopackages</literal> script in the
	<literal>-restart</literal> case, but it can be a good idea to
	run it by hand and then verify that the clients all have the
	expected job load.  Sometimes,
	<literal>dosetupnode</literal> cannot clean up a build and you
	need to do it by hand.  (This is a bug.)</para>
    </note>

    <para>Make sure the <replaceable>${arch}</replaceable> build
      is run as the ports-<replaceable>${arch}</replaceable> user
      or it will complain loudly.</para>

    <note><para>The actual package build itself occurs in two
      identical phases.  The reason for this is that sometimes
      transient problems (e.g. NFS failures, FTP sites being
      unreachable, etc.) may halt a build.  Doing things
      in two phases is a workaround for these types of
      problems.</para></note>

    <para>Be careful that <filename>ports/Makefile</filename>
      does not specify any empty subdirectories.  This is especially
      important if you are doing an -exp build.  If the build
      process encounters an empty subdirectory, both package build
      phases will stop short, and an error similar to the following
      will be written to
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/make.[0|1]</filename>:
    </para>

    <screen><literal>don't know how to make dns-all(continuing)</literal></screen>

    <para>To correct this problem, simply comment out or remove
      the <literal>SUBDIR</literal> entries that point to empty
      subdirectories.  After doing this, you can restart the build
      by running the proper <command>dopackages</command> command
      with the <literal>-restart</literal> option.
    </para>

    <note>
      <para>This problem also appears if you create a new category
	<filename>Makefile</filename> with no <makevar>SUBDIR</makevar>s
	in it.  This is probably a bug.</para>
    </note>
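
    <para>Before starting a build, the <literal>SUBDIR</literal>
      entries can be checked with something like the sketch below.
      The <literal>empty_subdirs</literal> function is hypothetical,
      and it assumes that each entry is written on its own line in the
      form <literal>SUBDIR += name</literal>; entries written in any
      other form are not detected.</para>

```shell
#!/bin/sh
# empty_subdirs DIR
# Hypothetical helper: print each SUBDIR entry of DIR/Makefile whose
# directory is missing or contains no files at all.
empty_subdirs() {
    awk '$1 == "SUBDIR" && $2 == "+=" { print $3 }' "$1/Makefile" |
    while read -r sub; do
        if [ ! -d "$1/$sub" ] || [ -z "$(ls -A "$1/$sub")" ]; then
            echo "$sub"
        fi
    done
}

# Example (not executed here): check the top-level Makefile of a
# ports tree, then one category.
#   empty_subdirs ports
#   empty_subdirs ports/dns
```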

    <example>
      <title>Update the i386-6 tree and do a complete build</title>

      <para><command>dopackages.6 i386 -nosrc -norestr -nofinish</command></para>
    </example>

    <example>
      <title>Restart an interrupted amd64-8 build without updating</title>

      <para><command>dopackages.8 amd64 -nosrc -noports -norestr -continue -noindex -noduds -nofinish</command></para>
    </example>

    <example>
      <title>Post-process a completed sparc64-7 tree</title>

      <para><command>dopackages.7 sparc64 -finish</command></para>
    </example>
    </sect2>

    <sect2 id="build-command">
      <title><command>build</command> command</title>

      <para>You may need to manipulate the build data before starting it,
	especially for experimental builds.  This is done with the
	<command>build</command> command.</para>

      <itemizedlist>
	<listitem>
	  <para><literal>build list <replaceable>arch</replaceable>
	    <replaceable>branch</replaceable></literal> - Shows the current set
	    of build ids.
	  </para>
	</listitem>

	<listitem>
	  <para><literal>build clone <replaceable>arch</replaceable>
	    <replaceable>branch</replaceable> <replaceable>oldid</replaceable>
	    [<replaceable>newid</replaceable>]</literal> - Clones
	    <replaceable>oldid</replaceable> to
	    <replaceable>newid</replaceable> (or a datestamp if not specified).
	  </para>
	</listitem>

	<listitem>
	  <para><literal>build srcupdate <replaceable>arch</replaceable>
	    <replaceable>branch</replaceable>
	    <replaceable>buildid</replaceable></literal> - Replaces the src
	    tree with a new ZFS snapshot.  Don't forget to use the
	    <literal>-nosrc</literal> flag to <command>dopackages</command>
	    later!
	  </para>
	</listitem>

	<listitem>
	  <para><literal>build portsupdate <replaceable>arch</replaceable>
	    <replaceable>branch</replaceable></literal> - Replaces the ports
	    tree with a new ZFS snapshot.  Don't forget to use the
	    <literal>-noports</literal> flag to <command>dopackages</command>
	    later!
	  </para>
	</listitem>

      </itemizedlist>
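
      <para>Put together, preparing an experimental build might look
	like the following sketch.  It only prints the commands rather
	than running them.  The <literal>prepare_exp_build</literal>
	function and the example build id are hypothetical, and the
	sketch assumes that the freshly cloned build is the one
	<command>dopackages</command> will then operate on.</para>

```shell
#!/bin/sh
# prepare_exp_build ARCH BRANCH NEWID
# Hypothetical helper: print (not run) the commands that clone the
# latest build to NEWID, replace its src tree with a new snapshot, and
# start dopackages without updating src a second time.
prepare_exp_build() {
    echo "scripts/build clone $1 $2 latest $3"
    echo "scripts/build srcupdate $1 $2 $3"
    echo "scripts/dopackages.$2 $1 -nosrc"
}

prepare_exp_build amd64 8-exp 20090101000000
```

      <para>Printing the commands first makes it easy to review them
	before running anything that modifies build data.</para>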
    </sect2>

    <sect2 id="build-one">
      <title>Building a single package</title>

      <para>Sometimes there is a need to rebuild a single package from the
	package set.  This can be accomplished with the following
	invocation:</para>

      <para><command>/var/portbuild/evil/qmanager/packagebuild <replaceable>amd64</replaceable> <replaceable>7-exp</replaceable> <replaceable>20080904212103</replaceable> <replaceable>aclock-0.2.3_2</replaceable></command></para>
    </sect2>
  </sect1>

  <sect1 id="anatomy">
    <title>Anatomy of a Build</title>

    <para>A full build without any <literal>-no</literal>
      options performs the following operations in the
      specified order:</para>

    <orderedlist>
      <listitem>
	<para>An update of the current <literal>ports</literal>
	  tree from the ZFS snapshot [*]
	</para>
      </listitem>

      <listitem>
	<para>An update of the running branch's
	  <literal>src</literal> tree from the ZFS snapshot [*]
	</para>
      </listitem>

      <listitem>
	<para>Checks which ports do not have a
	  <literal>SUBDIR</literal> entry in their respective
	  category's <filename>Makefile</filename> [*]
	</para>
      </listitem>

      <listitem>
	<para>Creates the <filename>duds</filename> file, which
	  is a list of ports not to build [*] [+]
	</para>
      </listitem>

      <listitem>
	<para>Generates a fresh <filename>INDEX</filename>
	  file [*] [+]
	</para>
      </listitem>

      <listitem>
	<para>Sets up the nodes that will be used in the
	  build [*] [+]
	</para>
      </listitem>

      <listitem>
	<para>Builds a list of restricted ports [*] [+]</para>
      </listitem>

      <listitem>
	<para>Builds packages (phase 1) [++]</para>
      </listitem>

      <listitem>
	<para>Performs another node setup [+]</para>
      </listitem>

      <listitem>
	<para>Builds packages (phase 2) [++]</para>
      </listitem>
    </orderedlist>

    <para>[*] Status of these steps can be found in
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/build.log</filename>
      as well as on stderr of the tty running the
      <command>dopackages</command> command.</para>

    <para>[+] If any of these steps fail, the build will stop
      cold in its tracks.</para>

    <para>[++] Status of these steps can be found in
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/make.[0|1]</filename>,
      where <filename>make.0</filename> is the log file used by
      phase 1 of the package build and <filename>make.1</filename>
      is the log file used by phase 2.  Individual ports will write
      their build logs to
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/logs</filename>
      and their error logs to
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/errors</filename>.
    </para>

    <para>Formerly, the docs tree was also checked out; however, this
      was found to be unnecessary.
    </para>
  </sect1>

  <sect1 id="interrupting">
    <title>Interrupting a Build</title>

    <para>Interrupting a build is a bit messy.  First you need to
      identify the tty in which it is running (either record the output
      of &man.tty.1; when you start the build, or use <command>ps x</command>
      to identify it).  You need to make sure that nothing else important
      is running in this tty, e.g. with <command>ps -t p1</command>.
      If nothing else is, you can just kill off the whole terminal with
      <command>pkill -t p1</command>; otherwise issue a
      <command>kill -HUP</command> in there by, for example,
      <command>ps -t p1 -o pid= | xargs kill -HUP</command>.  Replace
      <replaceable>p1</replaceable> with whatever the tty is, of course.</para>

    <para>The
      package builds dispatched by <command>make</command> to
      the client machines will clean themselves up after a
      few minutes (check with <command>ps x</command> until they
      all go away).</para>

    <para>If you do not kill &man.make.1;, then it will spawn more jobs.
     If you do not kill <command>dopackages</command>, then it will restart
     the entire build.  If you do not kill the <command>pdispatch</command>
     processes, they'll keep going (or respawn) until they've built their
     package.</para>

    <para>To free up resources, you will need to clean up client machines by
      running the <command>build cleanup</command> command.  For example:</para>

    <screen>&prompt.user; <userinput>/var/portbuild/scripts/build cleanup i386 6-exp 20080714120411 -full</userinput></screen>

    <para>If you forget to do this, then the old build
      <literal>chroot</literal>s will not be cleaned up for 24 hours, and no
      new jobs will be dispatched in their place since
      <hostid>pointyhat</hostid> thinks the job slot is still occupied.</para>

    <para>To check, run <command>cat ~/loads/*</command> to display the
      status of client machines; the first column is the number of jobs
      each client is thought to be running, and this should be roughly
      consistent with the load average.  <literal>loads</literal> is refreshed
      every 2 minutes.  If <command>ps x | grep pdispatch</command>
      shows fewer processes than the number of jobs that <literal>loads</literal>
      thinks are in use, you're in trouble.</para>
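
    <para>The check can be scripted roughly as follows.  This is a
      sketch under assumptions: the function names are hypothetical,
      and each file in <filename>loads</filename> is assumed to hold
      one line per client whose first whitespace-separated column is
      the job count.</para>

```shell
#!/bin/sh
# jobs_in_loads DIR
# Hypothetical helper: sum the first column of every file in DIR.
jobs_in_loads() {
    cat "$1"/* | awk '{ total += $1 } END { print total + 0 }'
}

# pdispatch_count
# Count the pdispatch processes of the current user.  The bracket in
# the pattern keeps grep from counting itself.
pdispatch_count() {
    ps x | grep -c '[p]dispatch'
}

# On the build master this might be used as follows (not executed here):
#   if [ "$(pdispatch_count)" -lt "$(jobs_in_loads ~/loads)" ]; then
#       echo "job slots are marked busy with no pdispatch behind them"
#   fi
```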

    <para>You may have problems with the <command>umount</command>
      commands hanging.  If so, you are going to have to use the
      <command>allgohans</command> script to run an &man.ssh.1;
      command across all clients for that buildenv.  For example:
<screen>ssh -l root gohan24 df</screen>

      will get you a df, and

<screen>allgohans "umount -f pointyhat.freebsd.org:/var/portbuild/i386/6-exp/ports"
allgohans "umount -f pointyhat.freebsd.org:/var/portbuild/i386/6-exp/src"</screen>

      are supposed to get rid of the hanging mounts.  You will have to
      keep doing them since there can be multiple mounts.</para>

    <note>
      <para>Ignore the following:

<screen>umount: pointyhat.freebsd.org:/var/portbuild/i386/6-exp/ports: statfs: No such file or directory
umount: pointyhat.freebsd.org:/var/portbuild/i386/6-exp/ports: unknown file system
umount: Cleanup of /x/tmp/6-exp/chroot/53837/compat/linux/proc failed!
/x/tmp/6-exp/chroot/53837/compat/linux/proc: not a file system root directory</screen>

      The first two mean that the client did not have those mounted;
      the last two are a bug.</para>

      <para>You may also see messages about <literal>procfs</literal>.</para>
    </note>

    <para>After you have done all the above, remove the
      <filename><replaceable>${arch}</replaceable>/lock</filename>
      file before trying to restart the build.  If you do not,
      <filename>dopackages</filename> will simply exit.
    </para>
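
    <para>Removing the lock can be wrapped in a small check so that it
      is obvious whether a stale lock was actually present.  The
      <literal>clear_lock</literal> function below is a hypothetical
      sketch; it should only be run after the cleanup steps above have
      been completed.</para>

```shell
#!/bin/sh
# clear_lock ARCH [ROOT]
# Hypothetical helper: remove ROOT/ARCH/lock if it exists and report
# what was done.  ROOT defaults to /var/portbuild.
clear_lock() {
    lock="${2:-/var/portbuild}/$1/lock"
    if [ -e "$lock" ]; then
        rm -f "$lock"
        echo "removed $lock"
    else
        echo "no lock at $lock"
    fi
}

# Example (not executed here):
#   clear_lock i386
```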

    <para>If you have to do a <command>cvs update</command> before
      restarting, you may have to rebuild either <filename>duds</filename>,
      <filename>INDEX</filename>, or both.  If you are doing the latter
      manually, you will also have to rebuild
      <filename>packages/All/Makefile</filename> via the
      <command>makeparallel</command> script.</para>
  </sect1>

  <sect1 id="monitoring">
    <title>Monitoring the Build</title>

    <para>You can use the <command>qclient</command> command to monitor the status
      of build nodes, and to list the currently scheduled jobs:</para>

    <para><command>python /var/portbuild/evil/qmanager/qclient jobs</command></para>
    <para><command>python /var/portbuild/evil/qmanager/qclient status</command></para>

    <para>The
      <command>scripts/stats <replaceable>${branch}</replaceable></command>
      command shows the number of packages already built.</para>

    <para>Running <command>cat /var/portbuild/*/loads/*</command>
      shows the client loads and number of concurrent builds in
      progress.  The files that have been recently updated are the clients
      that are online; the others are the offline clients.</para>

    <note>
      <para>The <command>pdispatch</command> command does the dispatching
        of work onto the client, and post-processing.
        <command>ptimeout.host</command> is a watchdog that kills a build
        after timeouts.  So, having 50 <command>pdispatch</command>
        processes but only 4 &man.ssh.1; processes means 46
        <command>pdispatch</command>es are idle, waiting to get an
        idle node.</para>
    </note>

    <para>Running <command>tail -f <replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/build.log</command>
      shows the overall build progress.</para>

    <para>If a port build is failing, and it is not immediately obvious
      from the log as to why, you can preserve the
      <literal>WRKDIR</literal> for further analysis.  To do this,
      touch a file called <filename>.keep</filename> in the port's
      directory.  The next time the cluster tries to build this port,
      it will tar, compress, and copy the <literal>WRKDIR</literal>
      to
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/wrkdirs</filename>.
    </para>
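
    <para>Creating the marker file is a one-line operation; the sketch
      below merely adds a check that the port directory exists.  The
      <literal>keep_wrkdir</literal> function is hypothetical, and the
      location of the ports tree in the usage comment is an assumption
      based on the build layout described earlier.</para>

```shell
#!/bin/sh
# keep_wrkdir PORTS_TREE CATEGORY/PORT
# Hypothetical helper: create the .keep marker in a port's directory
# so that its WRKDIR is preserved on the next build attempt.
keep_wrkdir() {
    if [ -d "$1/$2" ]; then
        touch "$1/$2/.keep"
    else
        echo "no such port directory: $1/$2" >&2
        return 1
    fi
}

# Example (not executed here; the path is an assumed layout):
#   keep_wrkdir /var/portbuild/i386/7/builds/latest/ports category/portname
```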

    <para>If you find that the system is looping trying to build the
      same package over and over again, you may be able to fix the
      problem by rebuilding the offending package by hand.</para>

    <para>If all the builds start failing with complaints that they
      cannot load the dependent packages, check to see that
      <application>httpd</application> is still running, and restart
      it if not.</para>

    <para>Keep an eye on &man.df.1; output.  If the
      <filename>/var/portbuild</filename> file system becomes full
      then <trademark>Bad Things</trademark> happen.
    </para>

    <para>The status of all current builds is generated twice an hour
      and posted to
      <ulink url="http://pointyhat.FreeBSD.org/errorlogs/packagestats.html"></ulink>.
      For each <literal>buildenv</literal>, the following is displayed:</para>

    <itemizedlist>
      <listitem>
	<para><literal>cvs date</literal> is the contents of
	  <filename>cvsdone</filename>.  This is why we recommend that you
	  update <filename>cvsdone</filename> for <literal>-exp</literal>
	  runs (see below).</para>
      </listitem>

      <listitem>
	<para>date of <literal>latest log</literal></para>
      </listitem>

      <listitem>
	<para>number of lines in <literal>INDEX</literal></para>
      </listitem>

      <listitem>
	<para>the number of current <literal>build logs</literal></para>
      </listitem>

      <listitem>
	<para>the number of completed <literal>packages</literal></para>
      </listitem>

      <listitem>
	<para>the number of <literal>errors</literal></para>
      </listitem>

      <listitem>
	<para>the number of duds (shown as <literal>skipped</literal>)</para>
      </listitem>

      <listitem>
	<para><literal>missing</literal> shows the difference between
	  <filename>INDEX</filename> and the other columns.  If you have
	  restarted a run after a <command>cvs update</command>, there
	  will likely be duplicates in the packages and error columns,
	  and this column will be meaningless.  (The script is naive.)</para>
      </listitem>

      <listitem>
	<para><literal>running</literal> and <literal>completed</literal>
	  are guesses based on a &man.grep.1; of <filename>build.log</filename>.
	</para>
      </listitem>
    </itemizedlist>
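
    <para>As a worked example of the <literal>missing</literal>
      column, the sketch below assumes that <quote>the other
      columns</quote> means packages, errors, and skipped; the status
      script itself may compute the value differently.  The function
      name and the figures are invented for illustration.</para>

```shell
#!/bin/sh
# missing_count INDEX_LINES PACKAGES ERRORS SKIPPED
# Hypothetical helper: ports listed in INDEX that are not yet
# accounted for as a package, an error, or a dud.
missing_count() {
    echo $(( $1 - $2 - $3 - $4 ))
}

missing_count 20000 17500 300 1200   # prints 1000
```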
  </sect1>

  <sect1 id="errors">
    <title>Dealing With Build Errors</title>

    <para>The easiest way to track build failures is to receive
      the emailed logs and sort them to a folder, so you can maintain a
      running list of current failures and detect new ones easily.
      To do this, add an email address to
      <filename><replaceable>${branch}</replaceable>/portbuild.conf</filename>.
      You can easily bounce the new ones to maintainers.</para>

    <para>After a port appears broken on every build combination
      multiple times, it is time to mark it <literal>BROKEN</literal>.
      Two weeks' notification for the maintainers seems fair.</para>

    <note>
      <para>To avoid build errors with ports that need to be manually
	fetched, put the distfiles into
	<filename>~ftp/pub/FreeBSD/distfiles</filename>.</para>
    </note>
  </sect1>

  <sect1 id="release">
    <title>Release Builds</title>

    <para>When building packages for a release, it may be
      necessary to manually update the <literal>ports</literal>
      and <literal>src</literal> trees to the release tag and use
      <literal>-nocvs</literal> and
      <literal>-noportscvs</literal>.</para>

    <para>To build package sets intended for use on a CD-ROM,
      use the <literal>-cdrom</literal> option to
      <command>dopackages</command>.</para>

    <para>If the disk space is not available on the cluster, use
      <literal>-nodistfiles</literal> to avoid collecting distfiles.</para>

    <para>After the initial build completes, restart the build
      with
      <literal>-restart -fetch-original</literal>
      to collect updated distfiles as well.  Then, once the
      build is post-processed, take an inventory of the list
      of files fetched:</para>

    <screen>&prompt.user; <userinput>cd <replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable></userinput>
&prompt.user; <userinput>find distfiles > distfiles-<replaceable>${release}</replaceable></userinput></screen>

    <para>This inventory file typically lives in
      <filename>i386/<replaceable>${branch}</replaceable></filename>
      on the cluster master.</para>

    <para>This is useful to aid in periodically cleaning out
      the distfiles from <hostid>ftp-master</hostid>.  When space
      gets tight, distfiles from recent releases can be kept while
      others can be thrown away.</para>
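
    <para>One way to use these inventory files when space gets tight
      is sketched below: given a listing of what is currently present
      and the inventories of the releases to keep, it prints the
      entries that belong to none of them.  The
      <literal>prunable_distfiles</literal> function and the file names
      in the usage comment are hypothetical.</para>

```shell
#!/bin/sh
# prunable_distfiles CURRENT_LIST KEEP_LIST...
# Hypothetical helper: print the entries of CURRENT_LIST that appear
# in none of the KEEP_LISTs.  Each list holds one path per line, as
# written by find(1).
prunable_distfiles() {
    if [ "$#" -lt 2 ]; then
        echo "usage: prunable_distfiles CURRENT_LIST KEEP_LIST..." >&2
        return 2
    fi
    current=$1
    shift
    tmp=$(mktemp) || return 1
    sort -u "$@" > "$tmp"                    # union of the kept lists
    sort -u "$current" | comm -23 - "$tmp"   # only in the current list
    rm -f "$tmp"
}

# Example (not executed here; the file names are placeholders):
#   prunable_distfiles distfiles-now distfiles-7.0 distfiles-7.1
```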

    <para>Once the distfiles have been uploaded (see below),
      the final release package set must be created.  Just to be
      on the safe side, run the
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/cdrom.sh</filename>
      script by hand to make sure all the CD-ROM restricted packages
      and distfiles have been pruned.  Then, copy the
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/packages</filename>
      directory to
      <filename><replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable>/packages-<replaceable>${release}</replaceable></filename>.
      Once the packages are safely moved off, contact the &a.re;
      and inform them of the release package location.</para>

    <para>Remember to coordinate with the &a.re; about the timing
      and status of the release builds.
    </para>
  </sect1>

  <sect1 id="uploading">
    <title>Uploading Packages</title>

    <para>Once a build has completed, packages and/or distfiles
      can be transferred to <hostid>ftp-master</hostid> for
      propagation to the FTP mirror network.  If the build was
      run with <literal>-nofinish</literal>, then make sure to
      follow up with
      <command>dopackages -finish</command> to post-process the
      results.  For packages, this removes
      <literal>RESTRICTED</literal> and
      <literal>NO_CDROM</literal> packages where appropriate,
      prunes packages not listed in <filename>INDEX</filename>,
      removes from <filename>INDEX</filename>
      references to packages not built, and generates a
      <filename>CHECKSUM.MD5</filename>
      summary.  For distfiles, it moves them from the temporary
      <filename>distfiles/.pbtmp</filename> directory into
      <filename>distfiles/</filename> and removes
      <literal>RESTRICTED</literal> and <literal>NO_CDROM</literal>
      distfiles.</para>

    <para>It is usually a good idea to run the
      <command>restricted.sh</command> and/or
      <command>cdrom.sh</command> scripts by hand after
      <command>dopackages</command> finishes just to be safe.
      Run the <command>restricted.sh</command> script before
      uploading to <hostid>ftp-master</hostid>, then run
      <command>cdrom.sh</command> before preparing
      the final package set for a release.</para>

    <para>The package subdirectories are named by whether they are for
      <literal>release</literal>, <literal>stable</literal>, or
      <literal>current</literal>.  Examples:</para>

    <itemizedlist>
      <listitem>
	<para><literal>packages-6.3-release</literal></para>
      </listitem>

      <listitem>
	<para><literal>packages-6-stable</literal></para>
      </listitem>

      <listitem>
	<para><literal>packages-7.0-release</literal></para>
      </listitem>

      <listitem>
	<para><literal>packages-7-stable</literal></para>
      </listitem>

      <listitem>
	<para><literal>packages-8-current</literal></para>
      </listitem>
    </itemizedlist>

    <note><para>Some of the directories on
      <hostid>ftp-master</hostid> are, in fact, symlinks.  Examples:</para>

      <itemizedlist>
	<listitem>
	  <para><literal>packages-stable</literal></para>
	</listitem>

	<listitem>
	  <para><literal>packages-current</literal></para>
	</listitem>
      </itemizedlist>

      <para>Be sure to move the new packages directory over the
	<emphasis>real</emphasis> destination directory, and not
	one of the symlinks that points to it.</para>
    </note>

    <para>If you are doing a completely new package set (e.g. for
      a new release), copy packages to the staging area on
      <hostid>ftp-master</hostid> with something like the following:</para>

    <screen>&prompt.root; <userinput>cd /var/portbuild/<replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable></userinput>
&prompt.root; <userinput>tar cfv - packages/ | ssh portmgr@ftp-master tar xfC - w/ports/<replaceable>${arch}</replaceable>/tmp/<replaceable>${subdir}</replaceable></userinput></screen>

    <para>Then log into <hostid>ftp-master</hostid>, verify that
      the package set was transferred successfully, remove the
      package set that the new package set is to replace (in
      <filename>~/w/ports/<replaceable>${arch}</replaceable></filename>),
      and move the new set into place.  (<literal>w/</literal> is
      merely a shortcut.)</para>

    <para>For incremental builds, packages should be uploaded
      using <command>rsync</command> so we do not put too much
      strain on the mirrors.</para>

    <para><emphasis>ALWAYS</emphasis> use <literal>-n</literal>
      first with <command>rsync</command> and check the output
      to make sure it is sane.  If it looks good, re-run the
      <command>rsync</command> without the <literal>-n</literal>
      option.
    </para>

    <para>Example <command>rsync</command> command for incremental
      package upload:</para>

    <screen>&prompt.root; <userinput>rsync -n -r -v -l -t -p --delete packages/ portmgr@ftp-master:w/ports/<replaceable>${arch}</replaceable>/<replaceable>${subdir}</replaceable>/ | tee log</userinput></screen>
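
    <para>One way to check the dry-run output is to look first at
      what <literal>--delete</literal> would remove.  The following is
      a sketch only: it assumes the <filename>log</filename> file
      written by <command>tee</command> above and the usual verbose
      output of <command>rsync</command>, in which each pending
      removal is reported on a line starting with
      <literal>deleting</literal>.  The function name is
      hypothetical:</para>

    <programlisting># Hypothetical helper: summarize the removals in a saved "rsync -n" log.
# usage: rsync_log_deletions log
rsync_log_deletions() {
    # grep -c prints the number of matching lines (0 if there are none).
    echo "would delete: $(grep -c '^deleting ' "$1")"
    # Show the first few so that an unexpectedly large removal stands out.
    grep '^deleting ' "$1" | head -n 20
}</programlisting>

    <para>A large count on an incremental upload usually deserves a
      closer look before the command is re-run without
      <literal>-n</literal>.</para>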

    <para>Distfiles can be transferred with the
      <command>cpdistfiles</command> script:</para>

    <screen>&prompt.root; <userinput>/var/portbuild/scripts/cpdistfiles <replaceable>${arch}</replaceable> <replaceable>${branch}</replaceable></userinput></screen>

    <para>Or you can do it by hand using the <command>rsync</command>
      command:</para>

    <screen>&prompt.root; <userinput>cd /var/portbuild/<replaceable>${arch}</replaceable>/<replaceable>${branch}</replaceable></userinput>
&prompt.root; <userinput>rsync -n -r -v -l -p -c distfiles/ portmgr@ftp-master:w/ports/distfiles/ | tee log</userinput></screen>

    <para>Again, run the command without the <literal>-n</literal>
      option after you have checked it.</para>
  </sect1>

  <sect1 id="expbuilds">
    <title>Experimental Patches Builds</title>

    <para>Experimental patches builds are run from time to time to
      test new features or bugfixes to the ports infrastructure (i.e.
      <literal>bsd.port.mk</literal>), or to test large sweeping
      upgrades.  The current experimental patches branch is
      <literal>7-exp</literal> on the &i386;
      architecture.</para>

    <para>In general, an experimental patches build is run the same
      way as any other build, except that you should first update the
      ports tree to the latest version and then apply your patches.
      To do the former, you can use the following:

      <screen>&prompt.user; <userinput>cvs -R update -dP > update.out</userinput>
&prompt.user; <userinput>date > cvsdone</userinput></screen>
      This will most closely simulate what the <literal>dopackages</literal>
      script does.  (While <filename>cvsdone</filename> is merely
      informative, it can be a help.)</para>

    <para>You will need to search <filename>update.out</filename>
      for lines beginning with <literal>M</literal>, <literal>C</literal>,
      or <literal>?</literal> (locally modified files, merge conflicts,
      and files unknown to CVS, respectively) and then deal with them.</para>
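
    <para>A minimal sketch of that search follows.  The function name
      is hypothetical; the pattern relies only on the status letter
      that <command>cvs update</command> prints in the first
      column:</para>

    <programlisting># Hypothetical helper: show only the cvs update lines that need attention.
#   M = locally modified, C = conflict during merge, ? = unknown to CVS
# usage: cvs_update_attention update.out
cvs_update_attention() {
    # Inside a bracket expression "?" is an ordinary character.
    grep '^[MC?] ' "$1"
}</programlisting>

    <para>No output means the update produced nothing that needs
      manual attention.</para>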

    <para>It is always a good idea to save
      original copies of all changed files, as well as a list of what
      you are changing.  You can then look back on this list when doing
      the final commit, to make sure you are committing exactly what you
      tested.</para>

    <para>Since the machine is shared, someone else may delete your
      changes by mistake, so keep a copy of them in e.g. your home
      directory on <hostid>freefall</hostid>.  Do not use
      <filename>tmp/</filename>; since <hostid>pointyhat</hostid>
      itself runs some version of <literal>-CURRENT</literal>, you
      can expect reboots (if nothing else, for updates).</para>

    <para>In order to have a good control case with which to compare
      failures, you should first do a package build of the branch on
      which the experimental patches branch is based for the &i386;
      architecture (currently this is <literal>6</literal>).  Then, when
      preparing for the experimental patches build, check out a ports
      tree and a src tree with the same date as was used for the control
      build.  This will ensure an apples-to-apples comparison
      later.</para>

    <note><para>One build cluster can do the control build while the other
      does the experimental patches build.  This can be a great
      time-saver.</para></note>

    <para>Once the build finishes, compare the control build failures
      to those of the experimental patches build.  Use the following
      commands to facilitate this (this assumes the <literal>6</literal>
      branch is the control branch, and the <literal>6-exp</literal>
      branch is the experimental patches branch):</para>

    <screen>&prompt.user; <userinput>cd /var/portbuild/i386/6-exp/errors</userinput>
&prompt.user; <userinput>find . -name \*.log\* | sort > /tmp/6-exp-errs</userinput>
&prompt.user; <userinput>cd /var/portbuild/i386/6/errors</userinput>
&prompt.user; <userinput>find . -name \*.log\* | sort > /tmp/6-errs</userinput></screen>

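    <note><para>If it has been a long time since one of the builds
      finished, the logs may have been automatically compressed with
      bzip2.  In that case, replace <literal>sort</literal> in the
      commands above with <literal>sed 's,\.bz2$,,' | sort</literal>,
      so that the suffix is stripped before sorting and both lists
      stay in the order <command>comm</command> expects.</para></note>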

    <screen>&prompt.user; <userinput>comm -3 /tmp/6-errs /tmp/6-exp-errs | less</userinput></screen>

    <para>This last command will produce a two-column report; lines
      common to both lists are suppressed.  The first column lists
      ports that failed in the control build but not in the
      experimental patches build; the second column lists the reverse.
      Reasons that a port might be in the first column
      include:</para>

    <itemizedlist>
      <listitem>
	<para>Port was fixed since the control build was run, or was
	  upgraded to a newer version that is also broken (thus the
	  newer version should appear in the second column)
	</para>
      </listitem>

      <listitem>
	<para>Port is fixed by the patches in the experimental patches
	  build
	</para>
      </listitem>

      <listitem>
	<para>Port did not build under the experimental patches build
	  due to a dependency failure
	</para>
      </listitem>
    </itemizedlist>

    <para>Reasons for a port appearing in the second column
      include:</para>

    <itemizedlist>
      <listitem>
	<para>Port was broken by the experimental patches [1]</para>
      </listitem>

      <listitem>
	<para>Port was upgraded since the control build and has become
	  broken [2]
	</para>
      </listitem>

      <listitem>
	<para>Port was broken due to a transient error (e.g. FTP site
	  down, package client error, etc.)
	</para>
      </listitem>
    </itemizedlist>
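
    <para>The comparison described above can also be wrapped in a
      small helper that prints each column as its own list.  This is
      a sketch under stated assumptions: the function names are
      hypothetical, and the two arguments are the
      <filename>errors</filename> directories of the control and
      experimental patches builds:</para>

    <programlisting># Hypothetical helper: list the error logs under an errors directory,
# stripping any .bz2 suffix before sorting so both lists collate alike.
# usage: error_logs errors-dir
error_logs() (
    LC_ALL=C; export LC_ALL
    cd "$1" || exit 1
    find . -name '*.log*' | sed 's,\.bz2$,,' | sort
)

# usage: compare_errors control-errors-dir exp-errors-dir
compare_errors() (
    LC_ALL=C; export LC_ALL
    ctl=$(mktemp "${TMPDIR:-/tmp}/ctl.XXXXXX") || exit 1
    exp=$(mktemp "${TMPDIR:-/tmp}/exp.XXXXXX") || exit 1
    error_logs "$1" > "$ctl"
    error_logs "$2" > "$exp"
    echo "== only in control build"
    comm -23 "$ctl" "$exp"      # lines unique to the first list
    echo "== only in experimental build"
    comm -13 "$ctl" "$exp"      # lines unique to the second list
    rm -f "$ctl" "$exp"
)</programlisting>

    <para>For the branches used in this example, it would be invoked
      as <literal>compare_errors /var/portbuild/i386/6/errors
      /var/portbuild/i386/6-exp/errors</literal>.</para>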

    <para>Both columns should be investigated and the reason for the
      errors understood before committing the experimental patches set.
      To differentiate between [1] and [2] above, you can do a rebuild
      of the affected packages under the control branch:</para>

    <screen>&prompt.user; <userinput>cd /var/portbuild/i386/6/ports</userinput></screen>

    <note><para>Be sure to <literal>cvs update</literal> this tree to the same date as
      the experimental patches tree.</para></note>

    <para>The following command will set up the control branch for
      the partial build:</para>

    <screen>&prompt.user; <userinput>/var/portbuild/scripts/dopackages.6 -noportscvs -nobuild -nocvs -nofinish</userinput></screen>

    <para>The builds must be performed from the
      <literal>packages/All</literal> directory.  This directory should
      initially be empty except for the Makefile symlink.  If this
      symlink does not exist, it must be created:</para>

    <screen>&prompt.user; <userinput>cd /var/portbuild/i386/6/packages/All</userinput>
&prompt.user; <userinput>ln -sf ../../Makefile .</userinput>
&prompt.user; <userinput>make -k -j&lt;#&gt; &lt;list of packages to build&gt;</userinput></screen>

    <note><para>&lt;#&gt; is the concurrency of the build to
      attempt.  It is usually the sum of the weights listed in
      <filename>/var/portbuild/i386/mlist</filename> unless you have a
      reason to run a heavier or lighter build.</para>

    <para>The list of packages to build should be a list of package
      names (including versions) as they appear in
      <filename>INDEX</filename>.  The <literal>PKGSUFFIX</literal>
      (i.e. .tgz or .tbz) is optional.</para></note>

    <para>This will build only those packages listed as well as all
      of their dependencies.</para>

    <para>You can check the progress of this
      partial build the same way you would a regular build.</para>

    <para>Once all
      the errors have been resolved, you can commit the patch set.
      After committing, it is customary to send a <literal>HEADS
      UP</literal> email to <ulink
      url="mailto:ports@FreeBSD.org">ports@FreeBSD.org</ulink> and
      copy <ulink
      url="mailto:ports-developers@FreeBSD.org">ports-developers@FreeBSD.org</ulink>
      informing people of the changes.  A summary of all changes
      should also be committed to
      <filename>/usr/ports/CHANGES</filename>.</para>
  </sect1>

  <sect1 id="disk-failure">
    <title>Procedures for Dealing With Disk Failures</title>

    <para>When a machine has a disk failure (e.g. it panics due to read
      errors), take the following steps:</para>

    <itemizedlist>
      <listitem><para>Note the time and failure mode (e.g. paste in the
	relevant console output) in
	<filename>/var/portbuild/<replaceable>${arch}</replaceable>/reboots</filename></para></listitem>

      <listitem><para>For i386 gohan clients, scrub the disk by touching
	<filename>/SCRUB</filename> in the nfsroot (e.g.
	<filename>/a/nfs/8.dir1/SCRUB</filename>) and rebooting.  This will
	run <command>dd if=/dev/zero of=/dev/ad0</command>, forcing the drive
	to remap any bad sectors it finds, provided it has enough spares left.
	This is a temporary measure to extend the lifetime of a drive that is
	on the way out.</para>

	<note><para>For the i386 blade systems another signal of a failing
	  disk seems to be that the blade will completely hang and be
	  unresponsive to either console break, or even NMI.</para></note>

	<para>For other build systems that do not <command>newfs</command>
	  their disk at boot (e.g. amd64 systems), this step has to be
	  skipped.</para></listitem>

      <listitem><para>If the problem recurs, then the disk is probably toast.
	Take the machine out of <filename>mlist</filename> and (for ATA disks)
	run <command>smartctl</command> on the drive:</para>

	<screen>smartctl -t long /dev/ad0</screen>

	<para>The test takes about half an hour:</para>

	<screen>gohan51# smartctl -t long /dev/ad0
smartctl version 5.38 [i386-portbld-freebsd8.0] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 31 minutes for test to complete.
Test will complete after Fri Jul  4 03:59:56 2008

Use smartctl -X to abort test.</screen>

	<para>Then <command>smartctl -a /dev/ad0</command> shows the status
	  after it finishes:</para>

	<screen># SMART Self-test log structure revision number 1
# Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
#   1  Extended offline    Completed: read failure       80%     15252    319286</screen>

	<para>It will also display other data including a log of previous drive
	  errors.  It is possible for the drive to show previous DMA errors
	  without failing the self-test though (because of sector
	  remapping).</para></listitem>
    </itemizedlist>
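
    <para>When several drives are checked, the self-test result can be
      tested mechanically.  The following is only a sketch: the
      function name is hypothetical, and it assumes the output of
      <command>smartctl -a</command> has been saved to a file and
      reports failed tests in the form shown above (for example,
      <literal>Completed: read failure</literal>):</para>

    <programlisting># Hypothetical helper: succeed if saved "smartctl -a" output contains a
# self-test entry that ended in a failure.
# usage: selftest_failed smartctl-output-file
selftest_failed() {
    grep -q 'Completed: .*failure' "$1"
}</programlisting>

    <para>A passing self-test does not clear the drive; as noted
      above, earlier errors can be hidden by sector remapping, so the
      error log in the same output is still worth reading.</para>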

    <para>When a disk has failed, please inform &a.kris; so he can try to get it
      replaced.</para>
  </sect1>
</article>