Whitespace-only fixes to indentation and line wrap. Translators may
ignore.

Approved by:	gjb (mentor)
Warren Block 2012-01-10 02:50:25 +00:00
parent b3a79080bd
commit 75e065b0fa
Notes: svn2git 2020-12-08 03:00:23 +00:00
svn path=/head/; revision=38171


@ -10,7 +10,7 @@ $FreeBSD$
<chapterinfo>
<authorgroup>
<author>
<firstname>Sergey</firstname>
<firstname>Sergey</firstname>
<surname>Lyubka</surname>
<contrib>Contributed by </contrib>
</author> <!-- devnull@uptsoft.com 12 Jun 2002 -->
@ -64,39 +64,32 @@ $FreeBSD$
<informaltable frame="none" pgwide="0">
<tgroup cols="2">
<tbody>
<row>
<entry><para>Output (may vary)</para></entry>
<tbody>
<row>
<entry><para>Output (may vary)</para></entry>
<entry><para>BIOS (firmware) messages</para></entry>
</row>
</row>
<row>
<entry><para>
<screen>F1 FreeBSD
<row>
<entry><para><screen>F1 FreeBSD
F2 BSD
F5 Disk 2</screen>
</para></entry>
F5 Disk 2</screen></para></entry>
<entry><para><literal>boot0</literal></para></entry>
</row>
</row>
<row>
<entry><para>
<screen>&gt;&gt;FreeBSD/i386 BOOT
<row>
<entry><para><screen>&gt;&gt;FreeBSD/i386 BOOT
Default: 1:ad(1,a)/boot/loader
boot:</screen>
</para></entry>
boot:</screen></para></entry>
<entry><para><literal>boot2</literal><footnote><para>This
prompt will appear if the user presses a key just after
selecting an OS to boot at the <literal>boot0</literal>
stage.</para></footnote></para></entry>
</row>
prompt will appear if the user presses a key just
after selecting an OS to boot at the
<literal>boot0</literal>
stage.</para></footnote></para></entry>
</row>
<row>
<entry><para>
<screen>BTX loader 1.0 BTX version is 1.01
<row>
<entry><para><screen>BTX loader 1.0 BTX version is 1.01
BIOS drive A: is disk0
BIOS drive C: is disk1
BIOS 639kB/64512kB available memory
@ -105,24 +98,20 @@ Console internal video/keyboard
(jkh@bento.freebsd.org, Mon Nov 20 11:41:23 GMT 2000)
/kernel text=0x1234 data=0x2345 syms=[0x4+0x3456]
Hit [Enter] to boot immediately, or any other key for command prompt
Booting [kernel] in 9 seconds..._</screen>
</para></entry>
Booting [kernel] in 9 seconds..._</screen></para></entry>
<entry><para>loader</para></entry>
</row>
<entry><para>loader</para></entry>
</row>
<row>
<entry><para>
<screen>Copyright (c) 1992-2002 The FreeBSD Project.
<row>
<entry><para><screen>Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.6-RC #0: Sat May 4 22:49:02 GMT 2002
devnull@kukas:/usr/obj/usr/src/sys/DEVNULL
Timecounter "i8254" frequency 1193182 Hz</screen></para></entry>
<entry><para>kernel</para></entry>
</row>
</tbody>
<entry><para>kernel</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>
</sect1>
@ -150,8 +139,8 @@ Timecounter "i8254" frequency 1193182 Hz</screen></para></entry>
so that it points to a BIOS memory block.</para>
<para>BIOS stands for <emphasis>Basic Input Output
System</emphasis>, and it is a chip on the motherboard that has
a relatively small amount of read-only memory (ROM). This
System</emphasis>, and it is a chip on the motherboard that
has a relatively small amount of read-only memory (ROM). This
memory contains various low-level routines that are specific to
the hardware supplied with the motherboard. So, the processor
will first jump to the address 0xfffffff0, which really resides
@ -167,7 +156,7 @@ Timecounter "i8254" frequency 1193182 Hz</screen></para></entry>
CD-ROM, harddisk etc.</para>
<para>The very last thing in the POST is the <literal>INT
0x19</literal> instruction. That instruction reads 512 bytes
0x19</literal> instruction. That instruction reads 512 bytes
from the first sector of boot device into the memory at address
0x7c00. The term <emphasis>first sector</emphasis> originates
from harddrive architecture, where the magnetic plate is divided
@ -178,9 +167,9 @@ Timecounter "i8254" frequency 1193182 Hz</screen></para></entry>
from 0, but sectors - starting from 1), has a special meaning.
It is also called Master Boot Record, or MBR. The remaining
sectors on the first track are never used <footnote><para>Some
utilities such as &man.disklabel.8; may store the information in
this area, mostly in the second
sector.</para></footnote>.</para>
utilities such as &man.disklabel.8; may store the
information in this area, mostly in the second
sector.</para></footnote>.</para>
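<para>The numbering convention above (cylinders and heads counted from 0, sectors from 1) can be sketched as a conversion to a zero-based linear sector number. This is only an illustration: the <literal>sketch_</literal> names and the geometry passed in by a caller are assumptions, not part of the boot code.</para>

```c
#include <assert.h>
#include <stdint.h>

/* Disk geometry; any values a caller supplies are illustrative assumptions. */
struct sketch_geometry {
	uint32_t heads;             /* heads per cylinder */
	uint32_t sectors_per_track; /* sectors per track */
};

/*
 * Convert a cylinder/head/sector triple to a zero-based linear sector
 * number.  Cylinders and heads count from 0, sectors count from 1, so
 * the MBR at cylinder 0, head 0, sector 1 maps to linear sector 0.
 */
uint32_t
sketch_chs_to_lba(const struct sketch_geometry *g, uint32_t cyl,
    uint32_t head, uint32_t sector)
{
	assert(head < g->heads);
	assert(sector >= 1 && sector <= g->sectors_per_track);
	return ((cyl * g->heads + head) * g->sectors_per_track + (sector - 1));
}
```

<para>With a hypothetical geometry of 16 heads and 63 sectors per track, the MBR (0, 0, 1) is linear sector 0 and the first sector of the next track (0, 1, 1) is linear sector 63.</para>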
</sect1>
<sect1 id="boot-boot0">
@ -190,7 +179,8 @@ Timecounter "i8254" frequency 1193182 Hz</screen></para></entry>
<para>Take a look at the file <filename>/boot/boot0</filename>.
This is a small 512-byte file, and it is exactly what FreeBSD's
installation procedure wrote to your harddisk's MBR if you chose
the <quote>bootmanager</quote> option at installation time.</para>
the <quote>bootmanager</quote> option at installation
time.</para>
<para>As mentioned previously, the <literal>INT 0x19</literal>
instruction loads an MBR, i.e. the <filename>boot0</filename>
@ -214,19 +204,19 @@ Timecounter "i8254" frequency 1193182 Hz</screen></para></entry>
<itemizedlist>
<listitem>
<para>the 1-byte filesystem type</para>
<para>the 1-byte filesystem type</para>
</listitem>
<listitem>
<para>the 1-byte bootable flag</para>
<para>the 1-byte bootable flag</para>
</listitem>
<listitem>
<para>the 6-byte descriptor in CHS format</para>
<para>the 6-byte descriptor in CHS format</para>
</listitem>
<listitem>
<para>the 8-byte descriptor in LBA format</para>
<para>the 8-byte descriptor in LBA format</para>
</listitem>
</itemizedlist>
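<para>The four fields listed above add up to 16 bytes per record. A minimal sketch of such a record in C follows. The field order and the table offset 0x1be are those of the conventional PC MBR layout rather than something stated in the list above, and all <literal>sketch_</literal> and <literal>SKETCH_</literal> names are hypothetical.</para>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MBR_SIZE      512
#define SKETCH_PART_OFFSET   0x1be /* conventional start of the table */
#define SKETCH_NPART         4
#define SKETCH_FLAG_BOOTABLE 0x80

/* One 16-byte partition record; byte arrays avoid padding and alignment. */
struct sketch_mbr_part {
	uint8_t boot_flag;    /* 1-byte bootable flag */
	uint8_t chs_start[3]; /* first half of the 6-byte CHS descriptor */
	uint8_t fs_type;      /* 1-byte filesystem type */
	uint8_t chs_end[3];   /* second half of the CHS descriptor */
	uint8_t lba_start[4]; /* first half of the 8-byte LBA descriptor */
	uint8_t lba_count[4]; /* second half: sector count */
};

/* Decode a little-endian 32-bit value. */
uint32_t
sketch_le32(const uint8_t p[4])
{
	return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/*
 * Return the index (0..3) of the first record flagged bootable in a
 * 512-byte MBR image and copy it to *out, or return -1 if none is.
 */
int
sketch_first_bootable(const uint8_t sector[SKETCH_MBR_SIZE],
    struct sketch_mbr_part *out)
{
	struct sketch_mbr_part e;
	int i;

	assert(sizeof(e) == 16);
	for (i = 0; i < SKETCH_NPART; i++) {
		memcpy(&e, sector + SKETCH_PART_OFFSET + i * sizeof(e),
		    sizeof(e));
		if (e.boot_flag == SKETCH_FLAG_BOOTABLE) {
			if (out != NULL)
				*out = e;
			return (i);
		}
	}
	return (-1);
}
```

<para>The sketch only scans for the bootable flag. It makes no claim about how <filename>boot0</filename> itself walks the table.</para>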
@ -346,30 +336,36 @@ boot2: boot2.ldr boot2.bin ${BTX}/btx/btx
<indexterm><primary>virtual v86 mode</primary></indexterm>
<itemizedlist>
<listitem><para>virtual v86 mode. That means, the BTX is a v86
monitor. Real mode instructions like pushf, popf, cli, sti, if
called by the client, will work.</para></listitem>
<listitem>
<para>virtual v86 mode. That means, the BTX is a v86 monitor.
Real mode instructions like pushf, popf, cli, sti, if called
by the client, will work.</para>
</listitem>
<listitem><para>Interrupt Descriptor Table (IDT) is set up so
all hardware interrupts are routed to the default BIOS's
handlers, and interrupt 0x30 is set up to be the syscall
gate.</para></listitem>
<listitem>
<para>Interrupt Descriptor Table (IDT) is set up so all
hardware interrupts are routed to the default BIOS's
handlers, and interrupt 0x30 is set up to be the syscall
gate.</para>
</listitem>
<listitem><para>Two system calls: <function>exec</function> and
<function>exit</function>, are defined:</para>
<listitem>
<para>Two system calls: <function>exec</function> and
<function>exit</function>, are defined:</para>
<programlisting><filename>sys/boot/i386/btx/lib/btxsys.s:</filename>
<programlisting><filename>sys/boot/i386/btx/lib/btxsys.s:</filename>
.set INT_SYS,0x30 # Interrupt number
#
# System call: exit
#
__exit: xorl %eax,%eax # BTX system
__exit: xorl %eax,%eax # BTX system
int $INT_SYS # call 0x0
#
# System call: exec
#
__exec: movl $0x1,%eax # BTX system
int $INT_SYS # call 0x1</programlisting></listitem>
__exec: movl $0x1,%eax # BTX system
int $INT_SYS # call 0x1</programlisting>
</listitem>
</itemizedlist>
<para>BTX creates a Global Descriptor Table (GDT):</para>
@ -392,8 +388,8 @@ gdt: .word 0x0,0x0,0x0,0x0 # Null entry
segment pointed to by the SEL_SCODE (supervisor code) selector,
as shown from the code that creates an IDT:</para>
<programlisting> mov $SEL_SCODE,%dh # Segment selector
init.2: shr %bx # Handle this int?
<programlisting> mov $SEL_SCODE,%dh # Segment selector
init.2: shr %bx # Handle this int?
jnc init.3 # No
mov %ax,(%di) # Set handler offset
mov %dh,0x2(%di) # and selector
@ -438,21 +434,22 @@ struct bootinfo {
u_int32_t bi_modulep; /* preloaded modules */
};</programlisting>
<para><literal>boot2</literal> enters into an infinite loop waiting
for user input, then calls <function>load()</function>. If the
user does not press anything, the loop breaks by a timeout, so
<function>load()</function> will load the default file
(<filename>/boot/loader</filename>). Functions <function>ino_t
lookup(char *filename)</function> and <function>int xfsread(ino_t
inode, void *buf, size_t nbyte)</function> are used to read the
content of a file into memory. <filename>/boot/loader</filename>
is an ELF binary, but where the ELF header is prepended with
a.out's <literal>struct exec</literal> structure.
<function>load()</function> scans the loader's ELF header, loading
the content of <filename>/boot/loader</filename> into memory, and
passing the execution to the loader's entry:</para>
<para><literal>boot2</literal> enters into an infinite loop
waiting for user input, then calls <function>load()</function>.
If the user does not press anything, the loop breaks by a
timeout, so <function>load()</function> will load the default
file (<filename>/boot/loader</filename>). Functions
<function>ino_t lookup(char *filename)</function> and
<function>int xfsread(ino_t inode, void *buf, size_t
nbyte)</function> are used to read the content of a file into
memory. <filename>/boot/loader</filename> is an ELF binary, but
where the ELF header is prepended with a.out's <literal>struct
exec</literal> structure. <function>load()</function> scans the
loader's ELF header, loading the content of
<filename>/boot/loader</filename> into memory, and passing the
execution to the loader's entry:</para>
<programlisting><filename>sys/boot/i386/boot2/boot2.c:</filename>
<programlisting><filename>sys/boot/i386/boot2/boot2.c:</filename>
__exec((caddr_t)addr, RB_BOOTINFO | (opts &amp; RBX_MASK),
MAKEBOOTDEV(dev_maj[dsk.type], 0, dsk.slice, dsk.unit, dsk.part),
0, 0, 0, VTOP(&amp;bootinfo));</programlisting>
@ -470,7 +467,7 @@ struct bootinfo {
the kernel is loaded into memory, it is being called by the
loader:</para>
<programlisting><filename>sys/boot/common/boot.c:</filename>
<programlisting><filename>sys/boot/common/boot.c:</filename>
/* Call the exec handler from the loader matching the kernel */
module_formats[km-&gt;m_loader]-&gt;l_exec(km);</programlisting>
</sect1>
@ -478,10 +475,10 @@ struct bootinfo {
<sect1 id="boot-kernel">
<title>Kernel Initialization</title>
<para>Let us take a look at the command that links the kernel. This
will help identify the exact location where the loader passes
execution to the kernel. This location is the kernel's actual entry
point.</para>
<para>Let us take a look at the command that links the kernel.
This will help identify the exact location where the loader
passes execution to the kernel. This location is the kernel's
actual entry point.</para>
<programlisting><filename>sys/conf/Makefile.i386:</filename>
ld -elf -Bdynamic -T /usr/src/sys/conf/ldscript.i386 -export-dynamic \
@ -489,8 +486,8 @@ ld -elf -Bdynamic -T /usr/src/sys/conf/ldscript.i386 -export-dynamic \
&lt;lots of kernel .o files&gt;</programlisting>
<indexterm><primary>ELF</primary></indexterm>
<para>A few interesting things can be seen here. First,
the kernel is an ELF dynamically linked binary, but the dynamic
<para>A few interesting things can be seen here. First, the
kernel is an ELF dynamically linked binary, but the dynamic
linker for kernel is <filename>/red/herring</filename>, which is
definitely a bogus file. Second, taking a look at the file
<filename>sys/conf/ldscript.i386</filename> gives an idea about
@ -498,13 +495,13 @@ ld -elf -Bdynamic -T /usr/src/sys/conf/ldscript.i386 -export-dynamic \
compiling a kernel. Reading through the first few lines, the
string</para>
<programlisting><filename>sys/conf/ldscript.i386:</filename>
<programlisting><filename>sys/conf/ldscript.i386:</filename>
ENTRY(btext)</programlisting>
<para>says that a kernel's entry point is the symbol `btext'.
This symbol is defined in <filename>locore.s</filename>:</para>
<programlisting><filename>sys/i386/i386/locore.s:</filename>
<programlisting><filename>sys/i386/i386/locore.s:</filename>
.text
/**********************************************************************
*
@ -513,9 +510,9 @@ ENTRY(btext)</programlisting>
*/
NON_GPROF_ENTRY(btext)</programlisting>
<para>First, the register EFLAGS is set to a
predefined value of 0x00000002. Then all the segment
registers are initialized:</para>
<para>First, the register EFLAGS is set to a predefined value of
0x00000002. Then all the segment registers are
initialized:</para>
<programlisting><filename>sys/i386/i386/locore.s:</filename>
/* Don't trust what the BIOS gives for eflags. */
@ -539,34 +536,31 @@ NON_GPROF_ENTRY(btext)</programlisting>
<informaltable frame="none" pgwide="1">
<tgroup cols="2" align="left">
<tbody>
<row>
<entry><function>recover_bootinfo</function></entry>
<tbody>
<row>
<entry><function>recover_bootinfo</function></entry>
<entry>This routine parses the parameters to the kernel
passed from the bootstrap. The kernel may have been
booted in 3 ways: by the loader, described above, by the
old disk boot blocks, or by the old diskless boot
procedure. This function determines the booting method,
and stores the <literal>struct bootinfo</literal>
structure into the kernel memory.</entry>
</row>
<entry>This routine parses the parameters to the kernel
passed from the bootstrap. The kernel may have been
booted in 3 ways: by the loader, described above, by the
old disk boot blocks, or by the old diskless boot
procedure. This function determines the booting method,
and stores the <literal>struct bootinfo</literal>
structure into the kernel memory.</entry>
</row>
<row>
<entry><function>identify_cpu</function></entry>
<entry>This function tries to find out what CPU it is
running on, storing the value found in a variable
<varname>_cpu</varname>.</entry>
</row>
<row>
<entry><function>identify_cpu</function></entry>
<entry>This function tries to find out what CPU it is
running on, storing the value found in a variable
<varname>_cpu</varname>.</entry>
</row>
<row>
<entry><function>create_pagetables</function></entry>
<entry>This function allocates and fills out a Page Table
Directory at the top of the kernel memory area.</entry>
</row>
</tbody>
<row>
<entry><function>create_pagetables</function></entry>
<entry>This function allocates and fills out a Page Table
Directory at the top of the kernel memory area.</entry>
</row>
</tbody>
</tgroup>
</informaltable>
@ -580,6 +574,7 @@ NON_GPROF_ENTRY(btext)</programlisting>
movl %eax, %cr4</programlisting>
<para>Then, enabling paging:</para>
<programlisting>/* Now enable paging */
movl R(_IdlePTD), %eax
movl %eax,%cr3 /* load ptd addr into mmu */
@ -617,57 +612,56 @@ begin:</programlisting>
<title><function>init386()</function></title>
<para><function>init386()</function> is defined in
<filename>sys/i386/i386/machdep.c</filename> and performs
low-level initialization specific to the i386 chip. The
switch to protected mode was performed by the loader. The
loader has created the very first task, in which the kernel
continues to operate. Before looking at the
code, consider the tasks the processor must complete
to initialize protected mode execution:</para>
<filename>sys/i386/i386/machdep.c</filename> and performs
low-level initialization specific to the i386 chip. The
switch to protected mode was performed by the loader. The
loader has created the very first task, in which the kernel
continues to operate. Before looking at the code, consider
the tasks the processor must complete to initialize protected
mode execution:</para>
<itemizedlist>
<listitem>
<listitem>
<para>Initialize the kernel tunable parameters, passed from
the bootstrapping program.</para>
</listitem>
</listitem>
<listitem>
<listitem>
<para>Prepare the GDT.</para>
</listitem>
<listitem>
<listitem>
<para>Prepare the IDT.</para>
</listitem>
<listitem>
<listitem>
<para>Initialize the system console.</para>
</listitem>
<listitem>
<listitem>
<para>Initialize the DDB, if it is compiled into
the kernel.</para>
</listitem>
<listitem>
<listitem>
<para>Initialize the TSS.</para>
</listitem>
<listitem>
<listitem>
<para>Prepare the LDT.</para>
</listitem>
<listitem>
<listitem>
<para>Set up proc0's pcb.</para>
</listitem>
</itemizedlist>
<indexterm><primary>parameters</primary></indexterm>
<para><function>init386()</function>
initializes the tunable parameters passed from bootstrap
by setting the environment pointer (envp) and calling
<function>init_param1()</function>. The envp pointer has been
passed from loader in the <literal>bootinfo</literal>
structure:</para>
<indexterm><primary>parameters</primary></indexterm>
<para><function>init386()</function> initializes the tunable
parameters passed from bootstrap by setting the environment
pointer (envp) and calling <function>init_param1()</function>.
The envp pointer has been passed from loader in the
<literal>bootinfo</literal> structure:</para>
<programlisting><filename>sys/i386/i386/machdep.c:</filename>
kern_envp = (caddr_t)bootinfo.bi_envp + KERNBASE;
@ -676,45 +670,45 @@ begin:</programlisting>
init_param1();</programlisting>
<para><function>init_param1()</function> is defined in
<filename>sys/kern/subr_param.c</filename>. That file has a
number of sysctls, and two functions,
<function>init_param1()</function> and
<function>init_param2()</function>, that are called from
<function>init386()</function>:</para>
<filename>sys/kern/subr_param.c</filename>. That file has a
number of sysctls, and two functions,
<function>init_param1()</function> and
<function>init_param2()</function>, that are called from
<function>init386()</function>:</para>
<programlisting><filename>sys/kern/subr_param.c:</filename>
hz = HZ;
TUNABLE_INT_FETCH("kern.hz", &amp;hz);</programlisting>
<para>TUNABLE_&lt;typename&gt;_FETCH is used to fetch the value
from the environment:</para>
from the environment:</para>
<programlisting><filename>/usr/src/sys/sys/kernel.h:</filename>
<programlisting><filename>/usr/src/sys/sys/kernel.h:</filename>
#define TUNABLE_INT_FETCH(path, var) getenv_int((path), (var))
</programlisting>
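<para>The fetch pattern above can be imitated in userland as a rough sketch: a compiled-in default is set first and overwritten only if the environment supplies a parsable integer. The environment table, <function>sketch_getenv_int()</function> and <literal>SKETCH_TUNABLE_INT_FETCH</literal> are hypothetical stand-ins, and leaving the default untouched on a missing or malformed value is an assumption of this sketch, not a statement about the kernel's <function>getenv_int()</function>.</para>

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name/value pair standing in for the loader environment. */
struct sketch_env {
	const char *name;
	const char *value;
};

/*
 * Look up "name" and store its integer value in *var.  Returns 1 on
 * success; returns 0 and leaves *var alone if the name is missing or
 * the value does not parse as an int.
 */
int
sketch_getenv_int(const struct sketch_env *env, size_t n, const char *name,
    int *var)
{
	char *end;
	long v;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(env[i].name, name) != 0)
			continue;
		errno = 0;
		v = strtol(env[i].value, &end, 0);
		if (end == env[i].value || *end != '\0' || errno == ERANGE ||
		    v < INT_MIN || v > INT_MAX)
			return (0);
		*var = (int)v;
		return (1);
	}
	return (0);
}

#define SKETCH_TUNABLE_INT_FETCH(env, n, path, var) \
	sketch_getenv_int((env), (n), (path), (var))
```

<para>A caller would first assign the default (for example <literal>hz = 100</literal>) and then invoke the fetch macro, mirroring the two lines quoted from <filename>subr_param.c</filename>.</para>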
<para>Sysctl <literal>kern.hz</literal> is the system clock tick.
Additionally, these sysctls are set by
<function>init_param1()</function>: <literal>kern.maxswzone,
kern.maxbcache, kern.maxtsiz, kern.dfldsiz, kern.maxdsiz, kern.dflssiz,
kern.maxssiz, kern.sgrowsiz</literal>.</para>
<para>Sysctl <literal>kern.hz</literal> is the system clock
tick. Additionally, these sysctls are set by
<function>init_param1()</function>: <literal>kern.maxswzone,
kern.maxbcache, kern.maxtsiz, kern.dfldsiz, kern.maxdsiz,
kern.dflssiz, kern.maxssiz, kern.sgrowsiz</literal>.</para>
<indexterm><primary>Global Descriptor Table (GDT)</primary></indexterm>
<indexterm><primary>Global Descriptor Table
(GDT)</primary></indexterm>
<para>Then <function>init386()</function> prepares the Global
Descriptor Table (GDT). Every task on an x86 is running in
its own virtual address space, and this space is addressed by
a segment:offset pair. If, for instance, the current
instruction to be executed by the processor lies at CS:EIP,
then the linear virtual address for that instruction would be
<quote>the virtual address of code segment CS</quote> + EIP. For
convenience, segments begin at virtual address 0 and end at a
4 GB boundary. Therefore, the instruction's linear virtual
address for this example would just be the value of EIP.
Segment registers such as CS, DS etc. are the selectors,
i.e. indexes, into the GDT (to be more precise, an index is not a
selector itself, but the INDEX field of a selector).
FreeBSD's GDT holds descriptors for 15 selectors per
CPU:</para>
Descriptor Table (GDT). Every task on an x86 is running in
its own virtual address space, and this space is addressed by
a segment:offset pair. If, for instance, the current
instruction to be executed by the processor lies at CS:EIP,
then the linear virtual address for that instruction would be
<quote>the virtual address of code segment CS</quote> + EIP.
For convenience, segments begin at virtual address 0 and end
at a 4 GB boundary. Therefore, the instruction's linear
virtual address for this example would just be the value of
EIP. Segment registers such as CS, DS etc. are the selectors,
i.e. indexes, into the GDT (to be more precise, an index is not a
selector itself, but the INDEX field of a selector). FreeBSD's
GDT holds descriptors for 15 selectors per CPU:</para>
<programlisting><filename>sys/i386/i386/machdep.c:</filename>
union descriptor gdt[NGDT * MAXCPU]; /* global descriptor table */
@ -740,22 +734,23 @@ union descriptor gdt[NGDT * MAXCPU]; /* global descriptor table */
#define GBIOSARGS_SEL 14 /* BIOS interface (Arguments) */</programlisting>
<para>Note that those #defines are not selectors themselves, but
just a field INDEX of a selector, so they are exactly the
indices of the GDT. For example, an actual selector for the
kernel code (GCODE_SEL) has the value 0x08.</para>
just a field INDEX of a selector, so they are exactly the
indices of the GDT. For example, an actual selector for the
kernel code (GCODE_SEL) has the value 0x08.</para>
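<para>The relation between an index and a selector can be sketched as follows. An x86 selector holds the INDEX field in bits 15..3, a table indicator in bit 2 and the requested privilege level in bits 1..0. The <literal>sketch_</literal> names are hypothetical, and the index value 1 for the kernel code descriptor is inferred here from the selector value 0x08 quoted above.</para>

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TI_GDT     0 /* table indicator: descriptor is in the GDT */
#define SKETCH_TI_LDT     1 /* table indicator: descriptor is in the LDT */
#define SKETCH_RPL_KERNEL 0 /* most privileged level */
#define SKETCH_RPL_USER   3 /* least privileged level */

/* Build a selector from its three fields. */
uint16_t
sketch_make_selector(unsigned int index, unsigned int ti, unsigned int rpl)
{
	return ((uint16_t)((index << 3) | ((ti & 1u) << 2) | (rpl & 3u)));
}

/* Extract the INDEX field, i.e. the position in the descriptor table. */
unsigned int
sketch_selector_index(uint16_t sel)
{
	return ((unsigned int)sel >> 3);
}
```

<para>Index 1 in the GDT at kernel privilege yields 0x08, matching the value given for the kernel code selector. The same index requested at user privilege would be 0x0b.</para>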
<indexterm><primary>Interrupt Descriptor Table (IDT)</primary></indexterm>
<indexterm><primary>Interrupt Descriptor Table
(IDT)</primary></indexterm>
<para>The next step is to initialize the Interrupt Descriptor
Table (IDT). This table is referenced by the processor
when a software or hardware interrupt occurs. For example, to
make a system call, a user application issues the <literal>INT
0x80</literal> instruction. This is a software interrupt, so
the processor's hardware looks up a record with index 0x80 in
the IDT. This record points to the routine that handles this
interrupt; in this particular case, this will be the kernel's
syscall gate. The IDT may have a maximum of 256 (0x100)
records. The kernel allocates NIDT records for the IDT, where
NIDT is the maximum (256):</para>
Table (IDT). This table is referenced by the processor when a
software or hardware interrupt occurs. For example, to make a
system call, a user application issues the <literal>INT
0x80</literal> instruction. This is a software interrupt, so
the processor's hardware looks up a record with index 0x80 in
the IDT. This record points to the routine that handles this
interrupt; in this particular case, this will be the kernel's
syscall gate. The IDT may have a maximum of 256 (0x100)
records. The kernel allocates NIDT records for the IDT, where
NIDT is the maximum (256):</para>
<programlisting><filename>sys/i386/i386/machdep.c:</filename>
static struct gate_descriptor idt0[NIDT];
@ -763,18 +758,18 @@ struct gate_descriptor *idt = &amp;idt0[0]; /* interrupt descriptor table */
</programlisting>
<para>For each interrupt, an appropriate handler is set. The
syscall gate for <literal>INT 0x80</literal> is set as
well:</para>
syscall gate for <literal>INT 0x80</literal> is set as
well:</para>
<programlisting><filename>sys/i386/i386/machdep.c:</filename>
setidt(0x80, &amp;IDTVEC(int0x80_syscall),
SDT_SYS386TGT, SEL_UPL, GSEL(GCODE_SEL, SEL_KPL));</programlisting>
<para>So when a userland application issues the <literal>INT
0x80</literal> instruction, control will transfer to the
function <function>_Xint0x80_syscall</function>, which is in
the kernel code segment and will be executed with supervisor
privileges.</para>
0x80</literal> instruction, control will transfer to the
function <function>_Xint0x80_syscall</function>, which is in
the kernel code segment and will be executed with supervisor
privileges.</para>
<para>Console and DDB are then initialized:</para>
<indexterm><primary>DDB</primary></indexterm>
@ -789,13 +784,13 @@ struct gate_descriptor *idt = &amp;idt0[0]; /* interrupt descriptor table */
#endif</programlisting>
<para>The Task State Segment is another x86 protected mode
structure. The TSS is used by the hardware to store task
information when a task switch occurs.</para>
structure. The TSS is used by the hardware to store task
information when a task switch occurs.</para>
<para>The Local Descriptor Table is used to reference userland
code and data. Several selectors are defined to point to the
LDT. They are the system call gates and the user code and data
selectors:</para>
code and data. Several selectors are defined to point to the
LDT. They are the system call gates and the user code and data
selectors:</para>
<programlisting><filename>/usr/include/machine/segments.h:</filename>
#define LSYS5CALLS_SEL 0 /* forced by intel BCS */
@ -810,28 +805,28 @@ struct gate_descriptor *idt = &amp;idt0[0]; /* interrupt descriptor table */
#define NLDT (LBSDICALLS_SEL + 1)
</programlisting>
<para>Next, proc0's Process Control Block (<literal>struct
pcb</literal>) structure is initialized. proc0 is a
<literal>struct proc</literal> structure that describes a kernel
process. It is always present while the kernel is running,
therefore it is declared as global:</para>
<para>Next, proc0's Process Control Block (<literal>struct
pcb</literal>) structure is initialized. proc0 is a
<literal>struct proc</literal> structure that describes a
kernel process. It is always present while the kernel is
running, therefore it is declared as global:</para>
<programlisting><filename>sys/kern/kern_init.c:</filename>
<programlisting><filename>sys/kern/kern_init.c:</filename>
struct proc proc0;</programlisting>
<para>The structure <literal>struct pcb</literal> is a part of a
proc structure. It is defined in
<filename>/usr/include/machine/pcb.h</filename> and has a
process's information specific to the i386 architecture, such as
register values.</para>
<para>The structure <literal>struct pcb</literal> is a part of a
proc structure. It is defined in
<filename>/usr/include/machine/pcb.h</filename> and has a
process's information specific to the i386 architecture, such
as register values.</para>
</sect2>
<sect2>
<title><function>mi_startup()</function></title>
<para>This function performs a bubble sort of all the system
initialization objects and then calls the entry of each object
one by one:</para>
initialization objects and then calls the entry of each object
one by one:</para>
<programlisting><filename>sys/kern/init_main.c:</filename>
for (sipp = sysinit; *sipp; sipp++) {
@ -843,18 +838,18 @@ struct gate_descriptor *idt = &amp;idt0[0]; /* interrupt descriptor table */
/* ... skipped ... */
}</programlisting>
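<para>The sort-then-call behaviour described above can be sketched in isolation. The structure, the logging callback and every <literal>sketch_</literal> name below are hypothetical. The sort is a simple exchange sort over a NULL-terminated pointer array, keyed first on subsystem and then on order, written to illustrate the idea rather than to reproduce <function>mi_startup()</function>.</para>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a sysinit object. */
struct sketch_sysinit {
	unsigned int subsystem;     /* major key */
	unsigned int order;         /* minor key within a subsystem */
	void (*func)(const void *); /* entry to call */
	const void *udata;          /* argument passed to func */
};

#define SKETCH_LOG_MAX 16
int sketch_log[SKETCH_LOG_MAX]; /* records the order of the calls */
int sketch_log_len;

/* Example entry: append the integer that udata points to onto the log. */
void
sketch_record(const void *udata)
{
	if (sketch_log_len < SKETCH_LOG_MAX)
		sketch_log[sketch_log_len++] = *(const int *)udata;
}

/* Sort the NULL-terminated set in place, then call each entry once. */
void
sketch_startup(struct sketch_sysinit **set)
{
	struct sketch_sysinit **sipp, **xipp, *save;

	for (sipp = set; *sipp != NULL; sipp++) {
		for (xipp = sipp + 1; *xipp != NULL; xipp++) {
			if ((*sipp)->subsystem < (*xipp)->subsystem ||
			    ((*sipp)->subsystem == (*xipp)->subsystem &&
			    (*sipp)->order <= (*xipp)->order))
				continue; /* already in order */
			save = *sipp;
			*sipp = *xipp;
			*xipp = save;
		}
	}
	for (sipp = set; *sipp != NULL; sipp++)
		(*sipp)->func((*sipp)->udata);
}
```

<para>Objects registered in any order are therefore run lowest subsystem first, and within one subsystem lowest order first.</para>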
<para>Although the sysinit framework is described in the
<ulink
url="&url.doc.langbase;/books/developers-handbook">Developers'
<para>Although the sysinit framework is described in the
<ulink
url="&url.doc.langbase;/books/developers-handbook">Developers'
Handbook</ulink>, I will discuss the internals of it.</para>
<indexterm><primary>sysinit objects</primary></indexterm>
<para>Every system initialization object (sysinit object) is
created by calling a SYSINIT() macro. Let us take as an example the
<literal>announce</literal> sysinit object. This object prints
the copyright message:</para>
<indexterm><primary>sysinit objects</primary></indexterm>
<para>Every system initialization object (sysinit object) is
created by calling a SYSINIT() macro. Let us take as an example
the <literal>announce</literal> sysinit object. This object
prints the copyright message:</para>
<programlisting><filename>sys/kern/init_main.c:</filename>
<programlisting><filename>sys/kern/init_main.c:</filename>
static void
print_caddr_t(void *data __unused)
{
@ -862,17 +857,18 @@ print_caddr_t(void *data __unused)
}
SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright)</programlisting>
<para>The subsystem ID for this object is SI_SUB_COPYRIGHT
(0x0800001), which comes right after the SI_SUB_CONSOLE
(0x0800000). So, the copyright message will be printed out
first, just after the console initialization.</para>
<para>The subsystem ID for this object is SI_SUB_COPYRIGHT
(0x0800001), which comes right after the SI_SUB_CONSOLE
(0x0800000). So, the copyright message will be printed out
first, just after the console initialization.</para>
<para>Let us take a look at what exactly the macro
<literal>SYSINIT()</literal> does. It expands to a
<literal>C_SYSINIT()</literal> macro. The
<literal>C_SYSINIT()</literal> macro then expands to a static
<literal>struct sysinit</literal> structure declaration with
another <literal>DATA_SET</literal> macro call:</para>
<para>Let us take a look at what exactly the macro
<literal>SYSINIT()</literal> does. It expands to a
<literal>C_SYSINIT()</literal> macro. The
<literal>C_SYSINIT()</literal> macro then expands to a static
<literal>struct sysinit</literal> structure declaration with
another <literal>DATA_SET</literal> macro call:</para>
<programlisting><filename>/usr/include/sys/kernel.h:</filename>
#define C_SYSINIT(uniquifier, subsystem, order, func, ident) \
static struct sysinit uniquifier ## _sys_init = { \ subsystem, \
@ -883,11 +879,11 @@ SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright)</p
C_SYSINIT(uniquifier, subsystem, order, \
(sysinit_cfunc_t)(sysinit_nfunc_t)func, (void *)ident)</programlisting>
<para>The <literal>DATA_SET()</literal> macro expands to a
<literal>MAKE_SET()</literal>, and that macro is the point where
all the sysinit magic is hidden:</para>
<para>The <literal>DATA_SET()</literal> macro expands to a
<literal>MAKE_SET()</literal>, and that macro is the point
where all the sysinit magic is hidden:</para>
<programlisting><filename>/usr/include/linker_set.h:</filename>
<programlisting><filename>/usr/include/linker_set.h:</filename>
#define MAKE_SET(set, sym) \
static void const * const __set_##set##_sym_##sym = &amp;sym; \
__asm(".section .set." #set ",\"aw\""); \
@ -897,9 +893,9 @@ SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright)</p
#define TEXT_SET(set, sym) MAKE_SET(set, sym)
#define DATA_SET(set, sym) MAKE_SET(set, sym)</programlisting>
<para>In our case, the following declaration will occur:</para>
<para>In our case, the following declaration will occur:</para>
<programlisting>static struct sysinit announce_sys_init = {
<programlisting>static struct sysinit announce_sys_init = {
SI_SUB_COPYRIGHT,
SI_ORDER_FIRST,
(sysinit_cfunc_t)(sysinit_nfunc_t) print_caddr_t,
@ -912,22 +908,24 @@ __asm(".section .set.sysinit_set" ",\"aw\"");
__asm(".long " "announce_sys_init");
__asm(".previous");</programlisting>
<para>The first <literal>__asm</literal> instruction will create
an ELF section within the kernel's executable. This will happen
at kernel link time. The section will have the name
<literal>.set.sysinit_set</literal>. The content of this section is one 32-bit
value, the address of announce_sys_init structure, and that is
what the second <literal>__asm</literal> is. The third
<literal>__asm</literal> instruction marks the end of a section.
If a directive with the same section name occurred before, the
content, i.e. the 32-bit value, will be appended to the existing
section, so forming an array of 32-bit pointers.</para>
<para>The first <literal>__asm</literal> instruction will create
an ELF section within the kernel's executable. This will
happen at kernel link time. The section will have the name
<literal>.set.sysinit_set</literal>. The content of this
section is one 32-bit value, the address of announce_sys_init
structure, and that is what the second
<literal>__asm</literal> is. The third
<literal>__asm</literal> instruction marks the end of a
section. If a directive with the same section name occurred
before, the content, i.e. the 32-bit value, will be appended
to the existing section, so forming an array of 32-bit
pointers.</para>
<para>Running <application>objdump</application> on a kernel
binary, you may notice the presence of such small
sections:</para>
<para>Running <application>objdump</application> on a kernel
binary, you may notice the presence of such small
sections:</para>
<screen>&prompt.user; <userinput>objdump -h /kernel</userinput>
<screen>&prompt.user; <userinput>objdump -h /kernel</userinput>
7 .set.cons_set 00000014 c03164c0 c03164c0 002154c0 2**2
CONTENTS, ALLOC, LOAD, DATA
8 .set.kbddriver_set 00000010 c03164d4 c03164d4 002154d4 2**2
@ -941,39 +939,40 @@ __asm(".previous");</programlisting>
12 .set.sysinit_set 00000664 c0316e90 c0316e90 00215e90 2**2
CONTENTS, ALLOC, LOAD, DATA</screen>
<para>This screen dump shows that the size of the
<literal>.set.sysinit_set</literal> section is 0x664 bytes,
so <literal>0x664/sizeof(void *)</literal> sysinit objects are
compiled into the kernel. With 4-byte pointers on i386, that
is 1636 / 4 = 409 objects. The other sections, such as
<literal>.set.sysctl_set</literal>, represent other linker
sets.</para>
<para>Declaring a variable of type <literal>struct
linker_set</literal> causes the content of the
<literal>.set.sysinit_set</literal> section to be
<quote>collected</quote> into that variable:</para>
<programlisting><filename>sys/kern/init_main.c:</filename>
extern struct linker_set sysinit_set; /* XXX */</programlisting>
<para>The <literal>struct linker_set</literal> is defined as
follows:</para>
<programlisting><filename>/usr/include/linker_set.h:</filename>
struct linker_set {
int ls_length;
void *ls_items[1]; /* really ls_length of them, trailing NULL */
};</programlisting>
<para>The first member holds the number of sysinit objects, and
the second member is a NULL-terminated array of pointers to
them.</para>
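<para>As a rough user-space illustration of this layout, the
sketch below builds a <literal>struct linker_set</literal> by
hand and walks it up to the NULL terminator. In the kernel the
linker fills the set in. The helpers
<literal>demo_set_build()</literal> and
<literal>demo_set_walk()</literal> are hypothetical and exist
only for this example.</para>

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct linker_set {
	int	ls_length;
	void	*ls_items[1];	/* really ls_length of them, trailing NULL */
};

/* Hypothetical helper: build a set from an array of n pointers. */
static struct linker_set *
demo_set_build(void **items, int n)
{
	struct linker_set *set;
	int i;

	assert(n >= 0);
	/* ls_items[1] already provides the slot for the trailing NULL. */
	set = malloc(sizeof(*set) + (size_t)n * sizeof(void *));
	if (set == NULL)
		return (NULL);
	set->ls_length = n;
	for (i = 0; i < n; i++)
		set->ls_items[i] = items[i];
	set->ls_items[n] = NULL;
	return (set);
}

/* Walk the array up to the NULL terminator and count the entries. */
static int
demo_set_walk(const struct linker_set *set)
{
	void *const *item;
	int count = 0;

	for (item = set->ls_items; *item != NULL; item++) {
		printf("item %d at %p\n", count, *item);
		count++;
	}
	return (count);
}
```

<para>For a correctly built set, the value returned by
<literal>demo_set_walk()</literal> equals
<literal>ls_length</literal>. The caller releases the set with
<function>free()</function>.</para>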
<para>Returning to the <function>mi_startup()</function>
discussion, it should now be clear how the sysinit objects are
organized. The <function>mi_startup()</function> function
sorts them and calls each. The very last object is the system
scheduler:</para>
<programlisting><filename>/usr/include/sys/kernel.h:</filename>
enum sysinit_sub_id {
SI_SUB_DUMMY = 0x0000000, /* not executed; for linker*/
SI_SUB_DONE = 0x0000001, /* processed*/
@ -983,17 +982,18 @@ enum sysinit_sub_id {
SI_SUB_RUN_SCHEDULER = 0xfffffff /* scheduler: no return*/
};</programlisting>
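<para>The sort-then-call behavior can be sketched as follows.
This is a simplified illustration, not the code of
<function>mi_startup()</function>: the structure
<literal>demo_sysinit</literal> and the functions
<literal>demo_startup()</literal> and
<literal>demo_announce()</literal> are hypothetical, and the
standard <function>qsort()</function> stands in for whatever
sort the kernel uses. Entries whose subsystem is 0
(<literal>SI_SUB_DUMMY</literal>) are skipped, matching the
<quote>not executed</quote> comment in the enumeration.</para>

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified, hypothetical version of a sysinit object. */
struct demo_sysinit {
	unsigned int subsystem;		/* SI_SUB_* value */
	unsigned int order;		/* SI_ORDER_* value */
	void (*func)(const void *);
	const void *udata;
};

/* Example init function: print the name passed as udata. */
static void
demo_announce(const void *udata)
{
	printf("init: %s\n", (const char *)udata);
}

/* Order by subsystem first, then by order within the subsystem. */
static int
demo_sysinit_cmp(const void *a, const void *b)
{
	const struct demo_sysinit *x = *(const struct demo_sysinit *const *)a;
	const struct demo_sysinit *y = *(const struct demo_sysinit *const *)b;

	if (x->subsystem != y->subsystem)
		return (x->subsystem < y->subsystem ? -1 : 1);
	if (x->order != y->order)
		return (x->order < y->order ? -1 : 1);
	return (0);
}

/* Sort the array of pointers in place, then call each entry in turn. */
static int
demo_startup(struct demo_sysinit **set, size_t n)
{
	size_t i;
	int called = 0;

	qsort(set, n, sizeof(*set), demo_sysinit_cmp);
	for (i = 0; i < n; i++) {
		if (set[i]->subsystem == 0)	/* SI_SUB_DUMMY */
			continue;
		set[i]->func(set[i]->udata);
		called++;
	}
	return (called);
}
```

<para>Because the scheduler entry carries the largest subsystem
value, <literal>0xfffffff</literal>, it ends up last after
sorting, which is why it is the final object to run.</para>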
<para>The system scheduler sysinit object is defined in the file
<filename>sys/vm/vm_glue.c</filename>, and the entry point for
that object is <function>scheduler()</function>. That
function is actually an infinite loop, and it represents a
process with PID 0, the swapper process. The proc0 structure,
mentioned before, is used to describe it.</para>
<para>The first user process, called <emphasis>init</emphasis>,
is created by the sysinit object
<literal>init</literal>:</para>
<programlisting><filename>sys/kern/init_main.c:</filename>
static void
create_init(const void *udata __unused)
{
@ -1011,31 +1011,30 @@ create_init(const void *udata __unused)
}
SYSINIT(init,SI_SUB_CREATE_INIT, SI_ORDER_FIRST, create_init, NULL)</programlisting>
<para>The <function>create_init()</function> function allocates
a new process by calling <function>fork1()</function>, but
does not mark it runnable. When the scheduler later runs this
new process, <function>start_init()</function> is called.
That function is defined in <filename>init_main.c</filename>.
It tries to load and exec the <filename>init</filename>
binary, probing <filename>/sbin/init</filename> first, then
<filename>/sbin/oinit</filename>,
<filename>/sbin/init.bak</filename>, and finally
<filename>/stand/sysinstall</filename>:</para>
<programlisting><filename>sys/kern/init_main.c:</filename>
static char init_path[MAXPATHLEN] =
#ifdef INIT_PATH
__XSTRING(INIT_PATH);
#else
"/sbin/init:/sbin/oinit:/sbin/init.bak:/stand/sysinstall";
#endif</programlisting>
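<para>Probing a colon-separated list like
<literal>init_path</literal> can be sketched in user space as
shown below. This is not the code of
<function>start_init()</function>, which attempts to exec each
candidate inside the kernel. The hypothetical
<literal>demo_find_init()</literal> uses
<function>access()</function> as a stand-in and only reports
the first path that names an executable file.</para>

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Copy the first entry of the colon-separated path_list that names
 * an executable file into buf.  Returns 0 on success, or -1 if no
 * entry qualifies.  Empty entries and entries too long for buf are
 * skipped.
 */
static int
demo_find_init(const char *path_list, char *buf, size_t buflen)
{
	const char *p = path_list;

	while (*p != '\0') {
		size_t len = strcspn(p, ":");	/* length of this entry */

		if (len > 0 && len < buflen) {
			memcpy(buf, p, len);
			buf[len] = '\0';
			if (access(buf, X_OK) == 0)
				return (0);
		}
		p += len;
		if (*p == ':')
			p++;			/* step over the separator */
	}
	return (-1);
}
```

<para>Entries are tried strictly from left to right, so a
fallback such as <filename>/sbin/init.bak</filename> is
considered only when every earlier entry fails.</para>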
</sect2>
</sect1>
</chapter>
<!--
Local Variables:
mode: sgml
sgml-declaration: "../chapter.decl"