doc/en_US.ISO8859-1/books/developers-handbook/secure/chapter.sgml

<!--
     The FreeBSD Documentation Project

     $FreeBSD$
-->

    <chapter id="secure">
      <chapterinfo>
	<authorgroup>
	  <author>
	    <firstname>Murray</firstname>
	    <surname>Stokely</surname>
	    <contrib>Contributed by </contrib>
	  </author>
	</authorgroup>
      </chapterinfo>

      <title>Secure Programming</title>

      <sect1 id="secure-synopsis"><title>Synopsis</title>

      <para>This chapter describes some of the security issues that
      have plagued &unix; programmers for decades and some of the new
      tools available to help programmers avoid writing exploitable
      code.</para>
      </sect1>

      <sect1 id="secure-philosophy"><title>Secure Design
      Methodology</title>

      <para>Writing secure applications takes a very scrutinous and
      pessimistic outlook on life.  Applications should be run with
      the principle of <quote>least privilege</quote> so that no
      process is ever running with more than the bare minimum access
      that it needs to accomplish its function.  Previously tested
      code should be reused whenever possible to avoid common
      mistakes that others may have already fixed.</para>

      <para>One of the pitfalls of the &unix; environment is how easy it
      is to make assumptions about the sanity of the environment.
      Applications should never trust user input (in all its forms),
      system resources, inter-process communication, or the timing of
      events.  &unix; processes do not execute synchronously so logical
      operations are rarely atomic.</para>
      </sect1>

      <sect1 id="secure-bufferov"><title>Buffer Overflows</title>

      <para>Buffer Overflows have been around since the very
      beginnings of the Von-Neuman <xref linkend="COD"> architecture.

      <indexterm><primary>buffer overflow</primary></indexterm>
      <indexterm><primary>Von-Neuman</primary></indexterm>

      They first gained widespread notoriety in 1988 with the Morris
      Internet worm.  Unfortunately, the same basic attack remains

      <indexterm><primary>Morris Internet worm</primary></indexterm>

      effective today.
      By far the most common type of buffer overflow attack is based
      on corrupting the stack.</para>

      <indexterm><primary>stack</primary></indexterm>
      <indexterm><primary>arguments</primary></indexterm>

      <para>Most modern computer systems use a stack to pass arguments
      to procedures and to store local variables.  A stack is a last
      in first out (LIFO) buffer in the high memory area of a process
      image.  When a program invokes a function a new "stack frame" is

      <indexterm><primary>LIFO</primary></indexterm>
      <indexterm>
        <primary>process image</primary>
	  <secondary>stack pointer</secondary>
      </indexterm>

      created.  This stack frame consists of the arguments passed to
      the function as well as a dynamic amount of local variable
      space.  The "stack pointer" is a register that holds the current

      <indexterm><primary>stack frame</primary></indexterm>
      <indexterm><primary>stack pointer</primary></indexterm>

      location of the top of the stack.  Since this value is
      constantly changing as new values are pushed onto the top of the
      stack, many implementations also provide a "frame pointer" that
      is located near the beginning of a stack frame so that local
      variables can more easily be addressed relative to this
      value. <xref linkend="COD"> The return address for function

      <indexterm><primary>frame pointer</primary></indexterm>
      <indexterm>
        <primary>process image</primary>
        <secondary>frame pointer</secondary>
      </indexterm>
      <indexterm><primary>return address</primary></indexterm>
      <indexterm><primary>stack-overflow</primary></indexterm>

      calls is also stored on the stack, and this is the cause of
      stack-overflow exploits since overflowing a local variable in a
      function can overwrite the return address of that function,
      potentially allowing a malicious user to execute any code he or
      she wants.</para>

      <para>Although stack-based attacks are by far the most common,
      it would also be possible to overrun the stack with a heap-based
      (malloc/free) attack.</para>

      <para>The C programming language does not perform automatic
      bounds checking on arrays or pointers as many other languages
      do.  In addition, the standard C library is filled with a
      handful of very dangerous functions.</para>

      <informaltable frame="none" pgwide="1">
        <tgroup cols="2">
          <tbody>
          <row><entry><function>strcpy</function>(char *dest, const char
          *src)</entry>
          <entry><simpara>May overflow the dest buffer</simpara></entry>
          </row>

          <row><entry><function>strcat</function>(char *dest, const char
          *src)</entry>
          <entry><simpara>May overflow the dest buffer</simpara></entry>
          </row>

          <row><entry><function>getwd</function>(char *buf)</entry>
          <entry><simpara>May overflow the buf buffer</simpara></entry>
          </row>

          <row><entry><function>gets</function>(char *s)</entry>
          <entry><simpara>May overflow the s buffer</simpara></entry>
          </row>

          <row><entry><function>[vf]scanf</function>(const char *format,
          ...)</entry>
          <entry><simpara>May overflow its arguments.</simpara></entry>
          </row>

          <row><entry><function>realpath</function>(char *path, char
          resolved_path[])</entry>
          <entry><simpara>May overflow the path buffer</simpara></entry>
          </row>

          <row><entry><function>[v]sprintf</function>(char *str, const char
          *format, ...)</entry>
          <entry><simpara>May overflow the str buffer.</simpara></entry>
          </row>
          </tbody>
        </tgroup>
      </informaltable>

      <sect2><title>Example Buffer Overflow</title>

      <para>The following example code contains a buffer overflow
      designed to overwrite the return address and skip the
      instruction immediately following the function call.  (Inspired
      by <xref linkend="Phrack">)</para>

<programlisting>#include &lt;stdio.h&gt;

void manipulate(char *buffer) {
  char newbuffer[80];
  strcpy(newbuffer,buffer);
}

int main() {
  char ch,buffer[4096];
  int i=0;

  while ((buffer[i++] = getchar()) != '\n') {};

  i=1;
  manipulate(buffer);
  i=2;
  printf("The value of i is : %d\n",i);
  return 0;
}</programlisting>

      <para>Let us examine what the memory image of this process would
      look like if we were to input 160 spaces into our little program
      before hitting return.</para>

      <para>[XXX figure here!]</para>

      <para>Obviously more malicious input can be devised to execute
      actual compiled instructions (such as exec(/bin/sh)).</para>
      </sect2>

      <sect2><title>Avoiding Buffer Overflows</title>

      <para>The most straightforward solution to the problem of
      stack-overflows is to always use length restricted memory and
      string copy functions.  <function>strncpy</function> and
      <function>strncat</function> are part of the standard C library.

      <indexterm>
        <primary>string copy functions</primary>
	  <secondary>strncpy</secondary>
      </indexterm>
      <indexterm>
        <primary>string copy functions</primary>
	  <secondary>strncat</secondary>
      </indexterm>

      These functions accept a length value as a parameter which
      should be no larger than the size of the destination buffer.
      These functions will then copy up to `length' bytes from the
      source to the destination.  However there are a number of
      problems with these functions.  Neither function guarantees NUL
      termination if the size of the input buffer is as large as the

      <indexterm><primary>NUL termination</primary></indexterm>

      destination.  The length parameter is also used inconsistently
      between strncpy and strncat so it is easy for programmers to get
      confused as to their proper usage.  There is also a significant
      performance loss compared to <function>strcpy</function> when
      copying a short string into a large buffer since
      <function>strncpy</function> NUL fills up the size
      specified.</para>

      <para>In OpenBSD, another memory copy implementation has been

      <indexterm><primary>OpenBSD</primary></indexterm>

      created to get around these problem.  The
      <function>strlcpy</function> and <function>strlcat</function>
      functions guarantee that they will always null terminate the
      destination string when given a non-zero length argument.  For
      more information about these functions see <xref
      linkend="OpenBSD">.  The OpenBSD <function>strlcpy</function> and
      <function>strlcat</function> instructions have been in FreeBSD
      since 3.3.</para>

      <indexterm>
        <primary>string copy functions</primary>
	  <secondary>strlcpy</secondary>
      </indexterm>

      <indexterm>
	<primary>string copy functions</primary>
          <secondary>strlcat</secondary>
      </indexterm>

        <sect3><title>Compiler based run-time bounds checking</title>

	<indexterm><primary>bounds checking</primary>
	<secondary>compiler-based</secondary></indexterm>

        <para>Unfortunately there is still a very large assortment of
        code in public use which blindly copies memory around without
        using any of the bounded copy routines we just discussed.
        Fortunately, there is a way to help prevent such attacks &mdash;
        run-time bounds checking, which is implemented by several
        C/C++ compilers.</para>

	<indexterm><primary>ProPolice</primary></indexterm>
	<indexterm><primary>StackGuard</primary></indexterm>
	<indexterm><primary>gcc</primary></indexterm>

	<para>ProPolice is one such compiler feature, and is integrated
	  into &man.gcc.1; versions 4.1 and later.  It replaces and
	  extends the earlier StackGuard &man.gcc.1; extension.</para>

	<para>ProPolice helps to protect against stack-based buffer
	  overflows and other attacks by laying pseudo-random numbers in
	  key areas of the stack before calling any function.  When a
	  function returns, these <quote>canaries</quote> are checked
	  and if they are found to have been changed the executable is
	  immediately aborted.  Thus any attempt to modify the return
	  address or other variable stored on the stack in an attempt to
	  get malicious code to run is unlikely to succeed, as the
	  attacker would have to also manage to leave the pseudo-random
	  canaries untouched.</para>

        <indexterm><primary>buffer overflow</primary></indexterm>

        <para>Recompiling your application with ProPolice is an
        effective means of stopping most buffer-overflow attacks, but
        it can still be compromised.</para>

        </sect3>

        <sect3><title>Library based run-time bounds checking</title>

  	<indexterm>
	  <primary>bounds checking</primary>
	  <secondary>library-based</secondary>
	</indexterm>

        <para>Compiler-based mechanisms are completely useless for
        binary-only software for which you cannot recompile.  For
        these situations there are a number of libraries which
        re-implement the unsafe functions of the C-library
        (<function>strcpy</function>, <function>fscanf</function>,
        <function>getwd</function>, etc..) and ensure that these
        functions can never write past the stack pointer.</para>

        <itemizedlist>
        <listitem><simpara>libsafe</simpara></listitem>
        <listitem><simpara>libverify</simpara></listitem>
        <listitem><simpara>libparanoia</simpara></listitem>
        </itemizedlist>

	<para>Unfortunately these library-based defenses have a number
        of shortcomings.  These libraries only protect against a very
        small set of security related issues and they neglect to fix
        the actual problem.  These defenses may fail if the
        application was compiled with -fomit-frame-pointer.  Also, the
        LD_PRELOAD and LD_LIBRARY_PATH environment variables can be
        overwritten/unset by the user.</para>
        </sect3>

        </sect2>
      </sect1>

      <sect1 id="secure-setuid"><title>SetUID issues</title>

      <indexterm><primary>seteuid</primary></indexterm>

      <para>There are at least 6 different IDs associated with any
      given process.  Because of this you have to be very careful with
      the access that your process has at any given time.  In
      particular, all seteuid applications should give up their
      privileges as soon as it is no longer required.</para>

      <indexterm>
        <primary>user IDs</primary>
          <secondary>real user ID</secondary>
      </indexterm>
      <indexterm>
        <primary>user IDs</primary>
          <secondary>effective user ID</secondary>
      </indexterm>

      <para>The real user ID can only be changed by a superuser
      process.  The <application>login</application> program sets this
      when a user initially logs in and it is seldom changed.</para>

      <para>The effective user ID is set by the
      <function>exec()</function> functions if a program has its
      seteuid bit set.  An application can call
      <function>seteuid()</function> at any time to set the effective
      user ID to either the real user ID or the saved set-user-ID.
      When the effective user ID is set by <function>exec()</function>
      functions, the previous value is saved in the saved set-user-ID.</para>

      </sect1>

      <sect1 id="secure-chroot"><title>Limiting your program's environment</title>

      <indexterm><primary>chroot()</primary></indexterm>

      <para>The traditional method of restricting a process
      is with the <function>chroot()</function> system call.  This
      system call changes the root directory from which all other
      paths are referenced for a process and any child processes.  For
      this call to succeed the process must have execute (search)
      permission on the directory being referenced.  The new
      environment does not actually take effect until you
      <function>chdir()</function> into your new environment.  It
      should also be noted that a process can easily break out of a
      chroot environment if it has root privilege.  This could be
      accomplished by creating device nodes to read kernel memory,
      attaching a debugger to a process outside of the &man.chroot.8;
      environment, or in
      many other creative ways.</para>

      <para>The behavior of the <function>chroot()</function> system
      call can be controlled somewhat with the
      kern.chroot_allow_open_directories <command>sysctl</command>
      variable.  When this value is set to 0,
      <function>chroot()</function> will fail with EPERM if there are
      any directories open.  If set to the default value of 1, then
      <function>chroot()</function> will fail with EPERM if there are
      any directories open and the process is already subject to a
      <function>chroot()</function> call.  For any other value, the
      check for open directories will be bypassed completely.</para>

      <sect2><title>FreeBSD's jail functionality</title>

      <indexterm><primary>jail</primary></indexterm>

      <para>The concept of a Jail extends upon the
      <function>chroot()</function> by limiting the powers of the
      superuser to create a true `virtual server'.  Once a prison is
      set up all network communication must take place through the
      specified IP address, and the power of "root privilege" in this
      jail is severely constrained.</para>

      <para>While in a prison, any tests of superuser power within the
      kernel using the <function>suser()</function> call will fail.
      However, some calls to <function>suser()</function> have been
      changed to a new interface <function>suser_xxx()</function>.
      This function is responsible for recognizing or denying access
      to superuser power for imprisoned processes.</para>

      <para>A superuser process within a jailed environment has the
      power to:</para>

      <itemizedlist>
      <listitem><simpara>Manipulate credential with
        <function>setuid</function>, <function>seteuid</function>,
        <function>setgid</function>, <function>setegid</function>,
        <function>setgroups</function>, <function>setreuid</function>,
        <function>setregid</function>, <function>setlogin</function></simpara></listitem>
      <listitem><simpara>Set resource limits with <function>setrlimit</function></simpara></listitem>
      <listitem><simpara>Modify some sysctl nodes
        (kern.hostname)</simpara></listitem>
      <listitem><simpara><function>chroot()</function></simpara></listitem>
      <listitem><simpara>Set flags on a vnode:
        <function>chflags</function>,
        <function>fchflags</function></simpara></listitem>
      <listitem><simpara>Set attributes of a vnode such as file
        permission, owner, group, size, access time, and modification
        time.</simpara></listitem>
      <listitem><simpara>Bind to privileged ports in the Internet
        domain (ports &lt; 1024)</simpara></listitem>
      </itemizedlist>

      <para><function>Jail</function> is a very useful tool for
      running applications in a secure environment but it does have
      some shortcomings.  Currently, the IPC mechanisms have not been
      converted to the <function>suser_xxx</function> so applications
      such as MySQL cannot be run within a jail.  Superuser access
      may have a very limited meaning within a jail, but there is
      no way to specify exactly what "very limited" means.</para>
      </sect2>

      <sect2><title>&posix;.1e Process Capabilities</title>

      <indexterm><primary>POSIX.1e Process Capabilities</primary></indexterm>
      <indexterm><primary>TrustedBSD</primary></indexterm>

      <para>&posix; has released a working draft that adds event
      auditing, access control lists, fine grained privileges,
      information labeling, and mandatory access control.</para>
      <para>This is a work in progress and is the focus of the <ulink
      url="http://www.trustedbsd.org/">TrustedBSD</ulink> project.  Some
      of the initial work has been committed to &os.current;
      (cap_set_proc(3)).</para>

      </sect2>

      </sect1>

      <sect1 id="secure-trust"><title>Trust</title>

      <para>An application should never assume that anything about the
      users environment is sane.  This includes (but is certainly not
      limited to): user input, signals, environment variables,
      resources, IPC, mmaps, the filesystem working directory, file
      descriptors, the # of open files, etc.</para>

      <indexterm><primary>positive filtering</primary></indexterm>
      <indexterm><primary>data validation</primary></indexterm>

      <para>You should never assume that you can catch all forms of
      invalid input that a user might supply.  Instead, your
      application should use positive filtering to only allow a
      specific subset of inputs that you deem safe.  Improper data
      validation has been the cause of many exploits, especially with
      CGI scripts on the world wide web.  For filenames you need to be
      extra careful about paths ("../", "/"), symbolic links, and
      shell escape characters.</para>

      <indexterm><primary>Perl Taint mode</primary></indexterm>

      <para>Perl has a really cool feature called "Taint" mode which
      can be used to prevent scripts from using data derived outside
      the program in an unsafe way.  This mode will check command line
      arguments, environment variables, locale information, the
      results of certain syscalls (<function>readdir()</function>,
      <function>readlink()</function>,
      <function>getpwxxx()</function>), and all file input.</para>

      </sect1>

      <sect1 id="secure-race-conditions">
      <title>Race Conditions</title>

      <para>A race condition is anomalous behavior caused by the
      unexpected dependence on the relative timing of events.  In
      other words, a programmer incorrectly assumed that a particular
      event would always happen before another.</para>

      <indexterm><primary>race conditions</primary>
      <secondary>signals</secondary></indexterm>

      <indexterm><primary>race conditions</primary>
      <secondary>access checks</secondary></indexterm>

      <indexterm><primary>race conditions</primary>
      <secondary>file opens</secondary></indexterm>

      <para>Some of the common causes of race conditions are signals,
      access checks, and file opens.  Signals are asynchronous events
      by nature so special care must be taken in dealing with them.
      Checking access with <function>access(2)</function> then
      <function>open(2)</function> is clearly non-atomic.  Users can
      move files in between the two calls.  Instead, privileged
      applications should <function>seteuid()</function> and then call
      <function>open()</function> directly.  Along the same lines, an
      application should always set a proper umask before
      <function>open()</function> to obviate the need for spurious
      <function>chmod()</function> calls.</para>

      </sect1>

     </chapter>