Minor wording and grammar fixes.
Approved by: murray
parent
4f09d6756f
commit
6611bdac7c
Notes:
svn2git
2020-12-08 03:00:23 +00:00
svn path=/head/; revision=25527
1 changed file with 77 additions and 75 deletions
|
@ -36,7 +36,7 @@
|
||||||
|
|
||||||
<para>This text documents the way I created the gjournal
|
<para>This text documents the way I created the gjournal
|
||||||
facility, starting with learning how to do kernel
|
facility, starting with learning how to do kernel
|
||||||
programming. It's assumed the reader is familiar with C
|
programming. It is assumed that the reader is familiar with C
|
||||||
userland programming.</para>
|
userland programming.</para>
|
||||||
|
|
||||||
</abstract>
|
</abstract>
|
||||||
|
@ -50,8 +50,8 @@
|
||||||
<sect2 id="intro-docs">
|
<sect2 id="intro-docs">
|
||||||
<title>Documentation</title>
|
<title>Documentation</title>
|
||||||
|
|
||||||
<para>Documentation on kernel programming is scarce - it's one of
|
<para>Documentation on kernel programming is scarce - it is one of
|
||||||
few areas where there's nearly nothing in the way of friendly
|
few areas where there is nearly nothing in the way of friendly
|
||||||
tutorials, and the phrase <quote>use the source!</quote> really
|
tutorials, and the phrase <quote>use the source!</quote> really
|
||||||
holds true. However, there are some bits and pieces (some of
|
holds true. However, there are some bits and pieces (some of
|
||||||
them seriously outdated) floating around that should be studied
|
them seriously outdated) floating around that should be studied
|
||||||
|
@ -59,14 +59,14 @@
|
||||||
|
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
|
|
||||||
<listitem><para><ulink
|
<listitem><para>The <ulink
|
||||||
url="&url.books.developers-handbook;/index.html">FreeBSD
|
url="&url.books.developers-handbook;/index.html">FreeBSD
|
||||||
Developer's Handbook</ulink> - part of the documentation
|
Developer's Handbook</ulink> - part of the documentation
|
||||||
project, it doesn't contain anything specific to kernel-land
|
project, it does not contain anything specific to kernel-land
|
||||||
programming, but rather some general
|
programming, but rather some general
|
||||||
information.</para></listitem>
|
information.</para></listitem>
|
||||||
|
|
||||||
<listitem><para><ulink
|
<listitem><para>The <ulink
|
||||||
url="&url.books.arch-handbook;/index.html">FreeBSD
|
url="&url.books.arch-handbook;/index.html">FreeBSD
|
||||||
Architecture Handbook</ulink> - also from the documentation
|
Architecture Handbook</ulink> - also from the documentation
|
||||||
project, contains descriptions of several low-level facilities
|
project, contains descriptions of several low-level facilities
|
||||||
|
@ -79,15 +79,17 @@
|
||||||
site - contains several interesting articles on kernel
|
site - contains several interesting articles on kernel
|
||||||
facilities.</para></listitem>
|
facilities.</para></listitem>
|
||||||
|
|
||||||
<listitem><para>The man pages in section 9 - most important
|
<listitem><para>The man pages in section 9 - for important
|
||||||
kernel-land calls are documented here.</para></listitem>
|
documentation on kernel functions.</para></listitem>
|
||||||
|
|
||||||
<listitem><para>The &man.geom.4; man page and PHK's GEOM slides
|
<listitem><para>The &man.geom.4; man page and <ulink
|
||||||
|
url="http://phk.freebsd.dk/pubs/">PHK's GEOM slides</ulink>
|
||||||
- for general introduction of the GEOM
|
- for general introduction of the GEOM
|
||||||
subsystem.</para></listitem>
|
subsystem.</para></listitem>
|
||||||
|
|
||||||
<listitem><para>&man.style.9; man page, if the code should go to
|
<listitem><para>The &man.style.9; man page - for documentation on
|
||||||
FreeBSD CVS tree</para></listitem>
|
the coding-style conventions which must be followed for any code
|
||||||
|
which is to be committed to the FreeBSD CVS tree.</para></listitem>
|
||||||
|
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
|
@ -97,18 +99,18 @@
|
||||||
<sect1 id="prelim">
|
<sect1 id="prelim">
|
||||||
<title>Preliminaries</title>
|
<title>Preliminaries</title>
|
||||||
|
|
||||||
<para>The best way to do kernel developing is to have (at least)
|
<para>The best way to do kernel development is to have (at least)
|
||||||
two separate computers. One of these would contain the
|
two separate computers. One of these would contain the
|
||||||
development environment and sources, and the other would be used
|
development environment and sources, and the other would be used
|
||||||
to test the newly written code by network-booting and
|
to test the newly written code by network-booting and
|
||||||
network-mounting filesystems from the first one. This way if
|
network-mounting filesystems from the first one. This way if
|
||||||
the new code contains bugs and crashes the machine, it won't
|
the new code contains bugs and crashes the machine, it will not
|
||||||
mess up the sources (and other <quote>live</quote> data). The
|
mess up the sources (and other <quote>live</quote> data). The
|
||||||
second system doesn't event have to have a proper display - it
|
second system does not even require a proper display. Instead, it
|
||||||
could be connected with a serial cable or KVM to the first
|
could be connected with a serial cable or KVM to the first
|
||||||
one.</para>
|
one.</para>
|
||||||
|
|
||||||
<para>But, since not everybody has two+ computers handy, there are
|
<para>But, since not everybody has two or more computers handy, there are
|
||||||
a few things that can be done to prepare an otherwise "live"
|
a few things that can be done to prepare an otherwise "live"
|
||||||
system for developing kernel code.</para>
|
system for developing kernel code.</para>
|
||||||
|
|
||||||
|
@ -116,7 +118,7 @@
|
||||||
<title>Converting a system for development</title>
|
<title>Converting a system for development</title>
|
||||||
|
|
||||||
<para>For any kernel programming a kernel with
|
<para>For any kernel programming a kernel with
|
||||||
<option>INVARIANTS</option> enabled is a must have. So enter
|
<option>INVARIANTS</option> enabled is a must-have. So enter
|
||||||
these in your kernel configuration file:</para>
|
these in your kernel configuration file:</para>
|
||||||
|
|
||||||
<programlisting> options INVARIANT_SUPPORT
|
<programlisting> options INVARIANT_SUPPORT
|
||||||
|
@ -129,7 +131,7 @@
|
||||||
|
|
||||||
<para>With the usual way of installing the kernel (<command>make
|
<para>With the usual way of installing the kernel (<command>make
|
||||||
installkernel</command>) the debug kernel will not be
|
installkernel</command>) the debug kernel will not be
|
||||||
automatically installed. It's called
|
automatically installed. It is called
|
||||||
<filename>kernel.debug</filename> and located in
|
<filename>kernel.debug</filename> and located in
|
||||||
<filename>/usr/obj/usr/src/sys/KERNELNAME/</filename>. For
|
<filename>/usr/obj/usr/src/sys/KERNELNAME/</filename>. For
|
||||||
convenience it should be copied to
|
convenience it should be copied to
|
||||||
|
@ -143,18 +145,18 @@
|
||||||
options DDB
|
options DDB
|
||||||
options KDB_TRACE</programlisting>
|
options KDB_TRACE</programlisting>
|
||||||
|
|
||||||
<para>For this to work you might need to set a sysctl (if it's
|
<para>For this to work you might need to set a sysctl (if it is
|
||||||
not on by default):</para>
|
not on by default):</para>
|
||||||
|
|
||||||
<programlisting> debug.debugger_on_panic=1</programlisting>
|
<programlisting> debug.debugger_on_panic=1</programlisting>
|
||||||
|
|
||||||
<para>Kernel panics will happen, so care should be taken with
|
<para>Kernel panics will happen, so care should be taken with
|
||||||
the filesystem cache. In particular, having softupdates might
|
the filesystem cache. In particular, having softupdates might
|
||||||
mean a latest file version could be lost if a panic occurs
|
mean the latest file version could be lost if a panic occurs
|
||||||
before it's committed to storage. Disabling softupdates
|
before it is committed to storage. Disabling softupdates
|
||||||
yields a great performance hit (and it still doesn't guarantee
|
yields a great performance hit, and still does not guarantee
|
||||||
data consistency - mounting filesystem with the "sync" option
|
data consistency. Mounting filesystem with the "sync" option
|
||||||
is needed for that) so for a compromise, the cache delays can
|
is needed for that. For a compromise, the cache delays can
|
||||||
be shortened. There are three sysctl's that are useful for
|
be shortened. There are three sysctl's that are useful for
|
||||||
this (best to be set in
|
this (best to be set in
|
||||||
<filename>/etc/sysctl.conf</filename>):</para>
|
<filename>/etc/sysctl.conf</filename>):</para>
|
||||||
|
@ -168,11 +170,11 @@
|
||||||
<para>For debugging kernel panics, kernel core dumps are
|
<para>For debugging kernel panics, kernel core dumps are
|
||||||
required. Since a kernel panic might make filesystems
|
required. Since a kernel panic might make filesystems
|
||||||
unusable, this crash dump is first written to a raw
|
unusable, this crash dump is first written to a raw
|
||||||
partition. Usually, this is the swap partition (it must be at
|
partition. Usually, this is the swap partition. This partition must be at
|
||||||
least as large as the physical RAM in the machine). On the
|
least as large as the physical RAM in the machine. On the
|
||||||
next boot (after filesystems are checked and mounted and
|
next boot, the dump is copied to a regular file.
|
||||||
before swap is enabled), the dump is copied to a regular
|
This happens after filesystems are checked and mounted, and
|
||||||
file. This is controlled with two
|
before swap is enabled. This is controlled with two
|
||||||
<filename>/etc/rc.conf</filename> variables:</para>
|
<filename>/etc/rc.conf</filename> variables:</para>
|
||||||
|
|
||||||
<programlisting> dumpdev="/dev/ad0s4b"
|
<programlisting> dumpdev="/dev/ad0s4b"
|
||||||
|
@ -184,24 +186,24 @@
|
||||||
|
|
||||||
<para>Writing kernel core dumps is slow and takes a long time so
|
<para>Writing kernel core dumps is slow and takes a long time so
|
||||||
if you have lots of memory (>256M) and lots of panics it could
|
if you have lots of memory (>256M) and lots of panics it could
|
||||||
be frustrating to sit and wait while it's done (twice - first
|
be frustrating to sit and wait while it is done (twice - first
|
||||||
to write it to swap, then to relocate it to filesystem). It's
|
to write it to swap, then to relocate it to filesystem). It is
|
||||||
convenient then to limit the amount of RAM the system will use
|
convenient then to limit the amount of RAM the system will use
|
||||||
via a <filename>/boot/loader.conf</filename> tunable:</para>
|
via a <filename>/boot/loader.conf</filename> tunable:</para>
|
||||||
|
|
||||||
<programlisting> hw.physmem="256M"</programlisting>
|
<programlisting> hw.physmem="256M"</programlisting>
|
||||||
|
|
||||||
<para>If the panics are frequent and filesystems large (or you
|
<para>If the panics are frequent and filesystems large (or you
|
||||||
simply don't trust softupdates+background fsck) it's advisable
|
simply do not trust softupdates+background fsck) it is advisable
|
||||||
to turn background fsck off via
|
to turn background fsck off via
|
||||||
<filename>/etc/rc.conf</filename> variable:</para>
|
<filename>/etc/rc.conf</filename> variable:</para>
|
||||||
|
|
||||||
<programlisting> background_fsck="NO"</programlisting>
|
<programlisting> background_fsck="NO"</programlisting>
|
||||||
|
|
||||||
<para>This way, the filesystems will always get checked when
|
<para>This way, the filesystems will always get checked when
|
||||||
needed (with background fsck, a new panic could happen while
|
needed. Note that with background fsck, a new panic could happen while
|
||||||
it's checking the disks). Again, the safest way is not to have
|
it is checking the disks. Again, the safest way is not to have
|
||||||
many local filesystems by using another computer as NFS
|
many local filesystems by using another computer as an NFS
|
||||||
server.</para>
|
server.</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
|
@ -210,20 +212,20 @@
|
||||||
|
|
||||||
<para>For the purpose of making gjournal, a new empty
|
<para>For the purpose of making gjournal, a new empty
|
||||||
subdirectory was created under an arbitrary user-accessible
|
subdirectory was created under an arbitrary user-accessible
|
||||||
directory. You don't have to create the module directory under
|
directory. You do not have to create the module directory under
|
||||||
<filename>/usr/src</filename>.</para>
|
<filename>/usr/src</filename>.</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
<sect2 id="prelim-makefile">
|
<sect2 id="prelim-makefile">
|
||||||
<title>The Makefile</title>
|
<title>The Makefile</title>
|
||||||
|
|
||||||
<para>It's good practice to create
|
<para>It is good practice to create
|
||||||
<filename>Makefile</filename>s for every nontrivial coding
|
<filename>Makefile</filename>s for every nontrivial coding
|
||||||
project, which of course includes kernel modules.</para>
|
project, which of course includes kernel modules.</para>
|
||||||
|
|
||||||
<para>Creating the <filename>Makefile</filename> is simple
|
<para>Creating the <filename>Makefile</filename> is simple
|
||||||
thanks to extensive set of helper routines provided by the
|
thanks to extensive set of helper routines provided by the
|
||||||
system. In short, here's how it looks:</para>
|
system. In short, here is how it looks:</para>
|
||||||
|
|
||||||
<programlisting> SRCS=g_journal.c
|
<programlisting> SRCS=g_journal.c
|
||||||
KMOD=geom_journal
|
KMOD=geom_journal
|
||||||
|
@ -259,9 +261,9 @@
|
||||||
<filename>sys/malloc.h</filename> headers must be
|
<filename>sys/malloc.h</filename> headers must be
|
||||||
included.</para>
|
included.</para>
|
||||||
|
|
||||||
<para>There's another mechanism for allocating memory, the UMA
|
<para>There is another mechanism for allocating memory, the UMA
|
||||||
(Universal Memory Allocator). See &man.uma.9; for details, but
|
(Universal Memory Allocator). See &man.uma.9; for details, but
|
||||||
it's a special type of allocator mainly used for speedy
|
it is a special type of allocator mainly used for speedy
|
||||||
allocation of lists comprised of same-sized items (for
|
allocation of lists comprised of same-sized items (for
|
||||||
example, dynamic arrays of structs).</para>
|
example, dynamic arrays of structs).</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
|
@ -273,10 +275,10 @@
|
||||||
things needs to be maintained. Fortunately, this data
|
things needs to be maintained. Fortunately, this data
|
||||||
structure is implemented (in several ways) by the C macros
|
structure is implemented (in several ways) by the C macros
|
||||||
included in the system. The most used list type is TAILQ
|
included in the system. The most used list type is TAILQ
|
||||||
because it's the most flexible. It's also the one with largest
|
because it is the most flexible. It is also the one with largest
|
||||||
memory requirements (its elements are doubly-linked) and
|
memory requirements (its elements are doubly-linked) and
|
||||||
theoretically the slowest (though the speed variation is on
|
theoretically the slowest (though the speed variation is on
|
||||||
the order of several CPU instructions more, so it shouldn't be
|
the order of several CPU instructions more, so it should not be
|
||||||
taken seriously).</para>
|
taken seriously).</para>
|
||||||
|
|
||||||
<para>If data retrieval speed is very important, see
|
<para>If data retrieval speed is very important, see
|
||||||
|
@ -295,8 +297,8 @@
|
||||||
|
|
||||||
<para>The important thing here is that bios are dealt with
|
<para>The important thing here is that bios are dealt with
|
||||||
asynchronously. That means that, in most parts of the code,
|
asynchronously. That means that, in most parts of the code,
|
||||||
there's no analogue to userland's &man.read.2; and
|
there is no analogue to userland's &man.read.2; and
|
||||||
&man.write.2; calls that don't return until a request is
|
&man.write.2; calls that do not return until a request is
|
||||||
done. Rather, a developer-supplied function is called as a
|
done. Rather, a developer-supplied function is called as a
|
||||||
notification when the request gets completed (or results in
|
notification when the request gets completed (or results in
|
||||||
error).</para>
|
error).</para>
|
||||||
|
@ -306,8 +308,8 @@
|
||||||
than the much more used imperative one (at least it takes a
|
than the much more used imperative one (at least it takes a
|
||||||
while to get used to it). In some cases helper routines
|
while to get used to it). In some cases helper routines
|
||||||
<function>g_write_data</function>() and
|
<function>g_write_data</function>() and
|
||||||
<function>g_read_data</function>() can be used (NOT
|
<function>g_read_data</function>() can be used, but <emphasis>NOT
|
||||||
ALWAYS!).</para>
|
ALWAYS</emphasis>!.</para>
|
||||||
|
|
||||||
</sect2>
|
</sect2>
|
||||||
</sect1>
|
</sect1>
|
||||||
|
@ -320,7 +322,7 @@
|
||||||
|
|
||||||
<para>If maximum performance is not needed, a much simpler way
|
<para>If maximum performance is not needed, a much simpler way
|
||||||
of making a data transformation is to implement it in userland
|
of making a data transformation is to implement it in userland
|
||||||
via the ggate (GEOM gate) facility. Unfortunately, there's no
|
via the ggate (GEOM gate) facility. Unfortunately, there is no
|
||||||
easy way to convert between, or even share code between the
|
easy way to convert between, or even share code between the
|
||||||
two approaches.</para>
|
two approaches.</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
|
@ -329,7 +331,7 @@
|
||||||
<title>GEOM class</title>
|
<title>GEOM class</title>
|
||||||
|
|
||||||
<para>GEOM class has several "class methods" that get called
|
<para>GEOM class has several "class methods" that get called
|
||||||
when there's no geom instance available (or they're simply not
|
when there is no geom instance available (or they are simply not
|
||||||
bound to a single instance):</para>
|
bound to a single instance):</para>
|
||||||
|
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
|
@ -372,11 +374,11 @@
|
||||||
|
|
||||||
<para>The name <quote>softc</quote> is a legacy term for
|
<para>The name <quote>softc</quote> is a legacy term for
|
||||||
<quote>driver private data</quote>. The name most probably
|
<quote>driver private data</quote>. The name most probably
|
||||||
comes from archaic term <quote>software control block</quote>.
|
comes from the archaic term <quote>software control block</quote>.
|
||||||
In GEOM, it's a structure (more precise: pointer to a
|
In GEOM, it is a structure (more precise: pointer to a
|
||||||
structure) that can be attached to a geom instance to hold
|
structure) that can be attached to a geom instance to hold
|
||||||
whatever data is private to the geom instance. In gjournal
|
whatever data is private to the geom instance. In gjournal
|
||||||
(and most of the other GEOM classes), some of it's members
|
(and most of the other GEOM classes), some of its members
|
||||||
are:</para>
|
are:</para>
|
||||||
|
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
|
@ -387,7 +389,7 @@
|
||||||
consumer this geom consumes</para></listitem>
|
consumer this geom consumes</para></listitem>
|
||||||
|
|
||||||
<listitem><para><varname>struct g_consumer **disks</varname> : Array
|
<listitem><para><varname>struct g_consumer **disks</varname> : Array
|
||||||
of <varname>struct g_consumer*</varname>. (It's not possible
|
of <varname>struct g_consumer*</varname>. (It is not possible
|
||||||
to use just single indirection because struct g_consumer*
|
to use just single indirection because struct g_consumer*
|
||||||
are created on our behalf by GEOM).</para></listitem>
|
are created on our behalf by GEOM).</para></listitem>
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
@ -412,14 +414,14 @@
|
||||||
|
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
<para>It's assumed that geom classes know how to handle metadata
|
<para>It is assumed that geom classes know how to handle metadata
|
||||||
with version ID's lower than theirs.</para>
|
with version ID's lower than theirs.</para>
|
||||||
|
|
||||||
<para>Metadata is located in the last sector of the provider
|
<para>Metadata is located in the last sector of the provider
|
||||||
(and thus must fit in it).</para>
|
(and thus must fit in it).</para>
|
||||||
|
|
||||||
<para>(All this is implementation-dependent but all existing
|
<para>(All this is implementation-dependent but all existing
|
||||||
code works like that, and it's supported by libraries.)</para>
|
code works like that, and it is supported by libraries.)</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
<sect2 id="geom-creating">
|
<sect2 id="geom-creating">
|
||||||
|
@ -429,10 +431,10 @@
|
||||||
|
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
|
|
||||||
<listitem><para>user calls &man.geom.8; utility (or one of it's
|
<listitem><para>user calls &man.geom.8; utility (or one of its
|
||||||
hardlinked friends)</para></listitem>
|
hardlinked friends)</para></listitem>
|
||||||
|
|
||||||
<listitem><para>the utility figures out which geom class it's
|
<listitem><para>the utility figures out which geom class it is
|
||||||
supposed to handle and searches for
|
supposed to handle and searches for
|
||||||
<filename>geom_<replaceable>CLASSNAME</replaceable>.so</filename>
|
<filename>geom_<replaceable>CLASSNAME</replaceable>.so</filename>
|
||||||
library (usually in
|
library (usually in
|
||||||
|
@ -450,10 +452,10 @@
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
|
|
||||||
<listitem><para>&man.geom.8; looks in the command-line definition
|
<listitem><para>&man.geom.8; looks in the command-line definition
|
||||||
for the command (usually "label"), calls a helper
|
for the command (usually "label"), and calls a helper
|
||||||
function.</para></listitem>
|
function.</para></listitem>
|
||||||
|
|
||||||
<listitem><para>helper function checks parameters & gathers
|
<listitem><para>helper function checks parameters and gathers
|
||||||
metadata, which it proceeds to write to all concerned
|
metadata, which it proceeds to write to all concerned
|
||||||
providers.</para></listitem>
|
providers.</para></listitem>
|
||||||
|
|
||||||
|
@ -465,7 +467,7 @@
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
<para>(The above sequence of events is implementation-dependent
|
<para>(The above sequence of events is implementation-dependent
|
||||||
but all existing code works like that, and it's supported by
|
but all existing code works like that, and it is supported by
|
||||||
libraries.)</para>
|
libraries.)</para>
|
||||||
|
|
||||||
</sect2>
|
</sect2>
|
||||||
|
@ -532,10 +534,10 @@
|
||||||
<listitem><para><function>.spoiled</function> : called when some
|
<listitem><para><function>.spoiled</function> : called when some
|
||||||
underlying provider gets written to</para></listitem>
|
underlying provider gets written to</para></listitem>
|
||||||
|
|
||||||
<listitem><para><function>.start</function> : handles IO</para></listitem>
|
<listitem><para><function>.start</function> : handles I/O</para></listitem>
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
<para>These functions are called from g_down? kernel thread and
|
<para>These functions are called from the g_down? kernel thread and
|
||||||
there can be no sleeping in this context (no blocking on a
|
there can be no sleeping in this context (no blocking on a
|
||||||
mutex or any kind of locks) which limits what can be done
|
mutex or any kind of locks) which limits what can be done
|
||||||
quite a bit, but forces the handling to be fast.</para>
|
quite a bit, but forces the handling to be fast.</para>
|
||||||
|
@ -567,16 +569,16 @@
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
<para>When a user process issues <quote>read data X at offset Y
|
<para>When a user process issues <quote>read data X at offset Y
|
||||||
of a file</quote> request, this is what happenes:</para>
|
of a file</quote> request, this is what happens:</para>
|
||||||
|
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
|
|
||||||
<listitem><para>The filesystem converts the request into struct bio
|
<listitem><para>The filesystem converts the request into a struct bio
|
||||||
instance and passes it to GEOM subsystem. It knows what geom
|
instance and passes it to the GEOM subsystem. It knows what geom
|
||||||
instance should handle it because filesystems are hosted
|
instance should handle it because filesystems are hosted
|
||||||
directly on a geom instance.</para></listitem>
|
directly on a geom instance.</para></listitem>
|
||||||
|
|
||||||
<listitem><para>The request ends up as a call to
|
<listitem><para>The request ends up as a call to the
|
||||||
<function>.start</function>() function made on the g_down
|
<function>.start</function>() function made on the g_down
|
||||||
thread and reaches the top-level geom instance.</para></listitem>
|
thread and reaches the top-level geom instance.</para></listitem>
|
||||||
|
|
||||||
|
@ -612,12 +614,12 @@
|
||||||
|
|
||||||
<para>See &man.g.bio.9; man page for information how the data is
|
<para>See &man.g.bio.9; man page for information how the data is
|
||||||
passed back and forth in the <structname>bio</structname>
|
passed back and forth in the <structname>bio</structname>
|
||||||
structure (note particular the <varname>bio_parent</varname>
|
structure (note in particular the <varname>bio_parent</varname>
|
||||||
and <varname>bio_children</varname> fields and how they are
|
and <varname>bio_children</varname> fields and how they are
|
||||||
handled).</para>
|
handled).</para>
|
||||||
|
|
||||||
<para>One important feature is: THERE CAN BE NO SLEEPING IN G_UP
|
<para>One important feature is: <emphasis>THERE CAN BE NO SLEEPING IN G_UP
|
||||||
AND G_DOWN THREADS. This means that none of the following
|
AND G_DOWN THREADS</emphasis>. This means that none of the following
|
||||||
things can be done in those threads (the list is of course not
|
things can be done in those threads (the list is of course not
|
||||||
complete, but only informative):</para>
|
complete, but only informative):</para>
|
||||||
|
|
||||||
|
@ -637,11 +639,11 @@
|
||||||
<listitem><para>sx locks</para></listitem>
|
<listitem><para>sx locks</para></listitem>
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
<para>This restriction is here to stop geom code clogging the IO
|
<para>This restriction is here to stop geom code clogging the I/O
|
||||||
request path, because sleeping in the code is usually not
|
request path, because sleeping in the code is usually not
|
||||||
time-bound and there can be no guarantiees on how long will it
|
time-bound and there can be no guarantiees on how long will it
|
||||||
take (there are some other, more technical reasons also). It
|
take (there are some other, more technical reasons also). It
|
||||||
also means that there's not much that can be done in those
|
also means that there is not much that can be done in those
|
||||||
threads; for example, almost any complex thing requires memory
|
threads; for example, almost any complex thing requires memory
|
||||||
allocation. Fortunately, there is a way out: creating
|
allocation. Fortunately, there is a way out: creating
|
||||||
additional kernel threads.</para>
|
additional kernel threads.</para>
|
||||||
|
@ -652,20 +654,20 @@
|
||||||
|
|
||||||
<para>Kernel threads are created with &man.kthread.create.9;
|
<para>Kernel threads are created with &man.kthread.create.9;
|
||||||
function, and they are sort of similar to userland threads in
|
function, and they are sort of similar to userland threads in
|
||||||
behaviour, only they can't return to caller to signify
|
behaviour, only they cannot return to caller to signify
|
||||||
termination, but must call &man.kthread.exit.9;.</para>
|
termination, but must call &man.kthread.exit.9;.</para>
|
||||||
|
|
||||||
<para>In geom code, the usual use of threads is to offload
|
<para>In geom code, the usual use of threads is to offload
|
||||||
processing of requests from <literal>g_down</literal> thread
|
processing of requests from <literal>g_down</literal> thread
|
||||||
(the <function>.start</function>() function). These threads
|
(the <function>.start</function>() function). These threads
|
||||||
look like <quote>event handlers</quote>: they have a linked
|
look like <quote>event handlers</quote>: they have a linked
|
||||||
list of event associated with them (on which events can posted
|
list of event associated with them (on which events can be posted
|
||||||
by various functions in various threads so it must be
|
by various functions in various threads so it must be
|
||||||
protected by a mutex), take the events from the list one by
|
protected by a mutex), take the events from the list one by
|
||||||
one and process them in a big <literal>switch</literal>()
|
one and process them in a big <literal>switch</literal>()
|
||||||
statement.</para>
|
statement.</para>
|
||||||
|
|
||||||
<para>The main benefit of using a thread to handle IO requests
|
<para>The main benefit of using a thread to handle I/O requests
|
||||||
is that it can sleep when needed. Now, this sounds good, but
|
is that it can sleep when needed. Now, this sounds good, but
|
||||||
should be carefully thought out. Sleeping is well and very
|
should be carefully thought out. Sleeping is well and very
|
||||||
convenient but can very effectively destroy performance of the
|
convenient but can very effectively destroy performance of the
|
||||||
|
@ -683,11 +685,11 @@
|
||||||
|
|
||||||
<para>Mutexes in FreeBSD kernel (see &man.mutex.9; man page) have
|
<para>Mutexes in FreeBSD kernel (see &man.mutex.9; man page) have
|
||||||
one distinction from their more common userland cousins - they
|
one distinction from their more common userland cousins - they
|
||||||
disallow sleeping (meaning: the code can't sleep while holding
|
disallow sleeping (meaning: the code cannot sleep while holding
|
||||||
a mutex). If the code needs to sleep a lot, &man.sx.9; locks
|
a mutex). If the code needs to sleep a lot, &man.sx.9; locks
|
||||||
may be more appropriate. (On the other hand, if you do almost
|
may be more appropriate. On the other hand, if you do almost
|
||||||
everything in a single thread, you may get away with no
|
everything in a single thread, you may get away with no
|
||||||
mutexes at all).</para>
|
mutexes at all.</para>
|
||||||
|
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
|
|