<!--
|
|
The FreeBSD Documentation Project
|
|
|
|
$Id: chapter.sgml,v 1.11 1999-08-05 20:48:13 nik Exp $
|
|
-->
|
|
|
|
<chapter id="internals">
|
|
<title>FreeBSD Internals</title>
|
|
|
|
<sect1 id="booting">
|
|
<title>The FreeBSD Booting Process</title>
|
|
|
|
<para><emphasis>Contributed by &a.phk;. v1.1, April
|
|
26th.</emphasis></para>
|
|
|
|
<para>Booting FreeBSD is essentially a three-step process: load the
kernel, determine the root filesystem, and initialize user-land things.
This leads to some interesting possibilities, shown below.</para>
|
|
|
|
<sect2>
|
|
<title>Loading a kernel</title>
|
|
|
|
<para>We presently have three basic mechanisms for loading the kernel as
|
|
described below: they all pass some information to the kernel to help
|
|
the kernel decide what to do next.</para>
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>Biosboot</term>
|
|
|
|
<listitem>
|
|
<para>Biosboot is our “bootblocks”. It consists of
|
|
two files which will be installed in the first 8Kbytes of the
|
|
floppy or hard-disk slice to be booted from.</para>
|
|
|
|
<para>Biosboot can load a kernel from a FreeBSD filesystem.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Dosboot</term>
|
|
|
|
<listitem>
|
|
<para>Dosboot was written by DI. Christian Gusenbauer, and is
|
|
unfortunately at this time one of the few pieces of code that
|
|
will not compile under FreeBSD itself because it is written for
|
|
Microsoft compilers.</para>
|
|
|
|
<para>Dosboot will boot the kernel from an MS-DOS file or from a
FreeBSD filesystem partition on the disk. It attempts to
negotiate with the various and strange kinds of memory manglers
that lurk in high memory on MS-DOS systems, and usually wins them
over to its cause.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Netboot</term>
|
|
|
|
<listitem>
|
|
<para>Netboot will try to find a supported Ethernet card, and use
|
|
BOOTP, TFTP and NFS to find a kernel file to boot.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Determine the root filesystem</title>
|
|
|
|
<para>Once the kernel is loaded and the boot-code jumps to it, the
|
|
kernel will initialize itself, trying to determine what hardware is
|
|
present and so on; it then needs to find a root filesystem.</para>
|
|
|
|
<para>Presently we support the following types of root
|
|
filesystems:</para>
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>UFS</term>
|
|
|
|
<listitem>
|
|
<para>This is the most normal type of root filesystem. It can
|
|
reside on a floppy or on hard disk.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>MSDOS</term>
|
|
|
|
<listitem>
|
|
<para>While this is technically possible, it is not particularly
|
|
useful because of the <acronym>FAT</acronym> filesystem's
|
|
inability to deal with links, device nodes and other such
|
|
“UNIXisms”.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>MFS</term>
|
|
|
|
<listitem>
|
|
<para>This is actually a UFS filesystem which has been compiled
|
|
into the kernel. That means that the kernel does not really
|
|
need any hard disks, floppies or other hardware to
|
|
function.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>CD9660</term>
|
|
|
|
<listitem>
|
|
<para>This is for using a CD-ROM as root filesystem.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>NFS</term>
|
|
|
|
<listitem>
|
|
<para>This is for using a fileserver as root filesystem, basically
|
|
making it a diskless machine.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Initialize user-land things</title>
|
|
|
|
<para>To get the user-land going, the kernel, when it has finished
|
|
initialization, will create a process with <literal>pid == 1</literal>
|
|
and execute a program on the root filesystem; this program is normally
|
|
<filename>/sbin/init</filename>.</para>
|
|
|
|
<para>You can substitute any program for <command>/sbin/init</command>,
|
|
as long as you keep in mind that:</para>
|
|
|
|
<para>There is no stdin/out/err unless you open it yourself. If you
|
|
exit, the machine panics. Signal handling is special for <literal>pid
|
|
== 1</literal>.</para>
|
|
|
|
<para>An example of this is the <command>/stand/sysinstall</command>
|
|
program on the installation floppy.</para>
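<para>The constraints on a replacement for <command>/sbin/init</command>
can be illustrated with a short sketch. It is not taken from
<command>/stand/sysinstall</command>; the function names
<literal>attach_stdio</literal> and <literal>init_forever</literal> are
hypothetical, and the use of <filename>/dev/console</filename> as the
device to open is an assumption. Signal handling is not covered.</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` read/write and make it file descriptors 0, 1 and 2.
 * A process started as pid 1 inherits no open descriptors, so this
 * has to happen before anything tries to print.
 * Returns 0 on success, -1 on failure.
 */
static int
attach_stdio(const char *path)
{
	int fd, i;

	assert(path != NULL);
	fd = open(path, O_RDWR);
	if (fd == -1)
		return (-1);
	for (i = 0; i <= 2; i++)
		if (fd != i && dup2(fd, i) == -1)
			return (-1);
	if (fd > 2)
		close(fd);	/* the three duplicates are enough */
	return (0);
}

/*
 * Hypothetical tail of an init replacement: it never returns,
 * because pid 1 exiting makes the machine panic.
 */
void
init_forever(void)
{
	(void)attach_stdio("/dev/console");	/* assumed console device */
	for (;;)
		pause();	/* sleep until a signal arrives, then sleep again */
}
```
]]></programlisting>

<para>A real replacement would do its actual work between the call to
<literal>attach_stdio</literal> and the final loop.</para>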
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Interesting combinations</title>
|
|
|
|
<para>Boot a kernel with an MFS in it with a special
|
|
<filename>/sbin/init</filename> which...</para>
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>A — Using DOS</term>
|
|
|
|
<listitem>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>mounts your <filename>C:</filename> as
|
|
<filename>/C:</filename></para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>Attaches <filename>C:/freebsd.fs</filename> on
|
|
<filename>/dev/vn0</filename></para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>mounts <filename>/dev/vn0</filename> as
|
|
<filename>/rootfs</filename></para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>makes symlinks
|
|
<filename>/rootfs/bin</filename> ->
|
|
<filename>/bin</filename>
|
|
<filename>/rootfs/etc</filename> ->
|
|
<filename>/etc</filename>
|
|
<filename>/rootfs/sbin</filename> ->
|
|
<filename>/sbin</filename> (etc...)</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>Now you are running FreeBSD without repartitioning your hard
|
|
disk...</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>B — Using NFS</term>
|
|
|
|
<listitem>
|
|
<para>NFS mounts your <filename>server:~you/FreeBSD</filename> as
|
|
<filename>/nfs</filename>, chroots to <filename>/nfs</filename>
|
|
and executes <filename>/sbin/init</filename> there</para>
|
|
|
|
<para>Now you are running FreeBSD diskless, even though you do not
|
|
control the NFS server...</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>C — Start an X-server</term>
|
|
|
|
<listitem>
|
|
<para>Now you have an X-terminal, which is better than that dingy
|
|
X-under-windows-so-slow-you-can-see-what-it-does thing that your
|
|
boss insists is better than forking out money on hardware.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>D — Using a tape</term>
|
|
|
|
<listitem>
|
|
<para>Takes a copy of <filename>/dev/rwd0</filename> and writes it
|
|
to a remote tape station or fileserver.</para>
|
|
|
|
<para>Now you finally get that backup you should have made a year
|
|
ago...</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>E — Acts as a firewall/web-server/what do I
|
|
know...</term>
|
|
|
|
<listitem>
|
|
<para>This is particularly interesting since you can boot from a
|
|
write-protected floppy, but still write to your root
|
|
filesystem...</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="memoryuse">
|
|
<title>PC Memory Utilization</title>
|
|
|
|
<para><emphasis>Contributed by &a.joerg;. 16 Apr
|
|
1995.</emphasis></para>
|
|
|
|
<para><emphasis>A short description of how FreeBSD uses memory on the i386
|
|
platform</emphasis></para>
|
|
|
|
<para>The boot sector will be loaded at <literal>0:0x7c00</literal>, and
|
|
relocates itself immediately to <literal>0x7c0:0</literal>. (This is
|
|
nothing magic, just an adjustment for the <literal>%cs</literal>
|
|
selector, done by an <literal>ljmp</literal>.)</para>
|
|
|
|
<para>It then loads the first 15 sectors at <literal>0x10000</literal>
|
|
(segment <makevar>BOOTSEG</makevar> in the biosboot Makefile), and sets
|
|
up the stack to work below <literal>0x1fff0</literal>. After this, it
|
|
jumps to the entry of boot2 within that code. I.e., it jumps over
|
|
itself and the (dummy) partition table, and it is going to adjust the
|
|
%cs selector—we are still in 16-bit mode there.</para>
|
|
|
|
<para>boot2 asks for the boot file, and examines the
|
|
<filename>a.out</filename> header. It masks the file entry point
|
|
(usually <literal>0xf0100000</literal>) by
|
|
<literal>0x00ffffff</literal>, and loads the file there. Hence the
|
|
usual load point is 1 MB (<literal>0x00100000</literal>). During load,
|
|
the boot code toggles back and forth between real and protected mode, to
|
|
use the BIOS in real mode.</para>
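<para>The address arithmetic in the preceding paragraphs can be checked
with a few lines of C. This is only a sketch of the calculations, not code
from the boot blocks; the function names are made up for
illustration.</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Real-mode physical address: (segment << 4) + offset. */
static uint32_t
real_mode_phys(uint16_t seg, uint16_t off)
{
	return (((uint32_t)seg << 4) + off);
}

/* Load address as described above: entry point masked by 0x00ffffff. */
static uint32_t
kernel_load_addr(uint32_t entry)
{
	return (entry & 0x00ffffffU);
}

/* Checks the figures quoted in the text. */
static void
boot_addr_selfcheck(void)
{
	/* 0:0x7c00 and 0x7c0:0 name the same byte; only %cs differs. */
	assert(real_mode_phys(0x0000, 0x7c00) ==
	    real_mode_phys(0x07c0, 0x0000));
	/* A segment value of 0x1000 addresses physical 0x10000. */
	assert(real_mode_phys(0x1000, 0x0000) == 0x10000U);
	/* The usual entry point 0xf0100000 is loaded at 1 MB. */
	assert(kernel_load_addr(0xf0100000U) == 0x00100000U);
}
```
]]></programlisting>

<para>Calling <literal>boot_addr_selfcheck()</literal> succeeds silently
when all three relationships hold.</para>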
|
|
|
|
<para>The boot code itself uses segment selectors <literal>0x18</literal>
|
|
and <literal>0x20</literal> for <literal>%cs</literal> and
|
|
<literal>%ds/%es</literal> in protected mode, and
|
|
<literal>0x28</literal> to jump back into real mode. The kernel is
|
|
finally started with <literal>%cs</literal> <literal>0x08</literal> and
|
|
<literal>%ds/%es/%ss</literal> <literal>0x10</literal>, which refer to
|
|
dummy descriptors covering the entire address space.</para>
|
|
|
|
<para>The kernel will be started at its load point. Since it has been
|
|
linked for another (high) address, it will have to execute
position-independent code (PIC) until the
page table and page directory stuff is set up properly, at which point
|
|
paging will be enabled and the kernel will finally run at the address
|
|
for which it was linked.</para>
|
|
|
|
<para><emphasis>Contributed by &a.dg;. 16 Apr
|
|
1995.</emphasis></para>
|
|
|
|
<para>The physical pages immediately following the kernel BSS contain
|
|
proc0's page directory, page tables, and upages. Some time later when
|
|
the VM system is initialized, the physical memory between
<literal>0x1000</literal> and <literal>0x9ffff</literal> and the physical memory after the
|
|
kernel (text+data+bss+proc0 stuff+other misc) is made available in the
|
|
form of general VM pages and added to the global free page list.</para>
|
|
</sect1>
|
|
|
|
<sect1 id="dma">
|
|
<title>DMA: What it Is and How it Works</title>
|
|
|
|
<para><emphasis>Copyright © 1995,1997 &a.uhclem;, All Rights
|
|
Reserved. 10 December 1996. Last Update 8 October
|
|
1997.</emphasis></para>
|
|
|
|
<para>Direct Memory Access (DMA) is a method of allowing data to be moved
|
|
from one location to another in a computer without intervention from the
|
|
central processor (CPU).</para>
|
|
|
|
<para>The way that the DMA function is implemented varies between computer
|
|
architectures, so this discussion will limit itself to the
|
|
implementation and workings of the DMA subsystem on the IBM Personal
|
|
Computer (PC), the IBM PC/AT and all of its successors and
|
|
clones.</para>
|
|
|
|
<para>The PC DMA subsystem is based on the Intel 8237 DMA controller. The
|
|
8237 contains four DMA channels that can be programmed independently and
|
|
any one of the channels may be active at any moment. These channels are
|
|
numbered 0, 1, 2 and 3. Starting with the PC/AT, IBM added a second
|
|
8237 chip, and numbered those channels 4, 5, 6 and 7.</para>
|
|
|
|
<para>The original DMA controller (0, 1, 2 and 3) moves one byte in each
|
|
transfer. The second DMA controller (4, 5, 6, and 7) moves 16 bits from
|
|
two adjacent memory locations in each transfer, with the first byte
|
|
always coming from an even-numbered address. The two controllers are
|
|
identical components and the difference in transfer size is caused by
|
|
the way the second controller is wired into the system.</para>
|
|
|
|
<para>The 8237 has two electrical signals for each channel, named DRQ and
|
|
-DACK. There are additional signals with the names HRQ (Hold Request),
|
|
HLDA (Hold Acknowledge), -EOP (End of Process), and the bus control
|
|
signals -MEMR (Memory Read), -MEMW (Memory Write), -IOR (I/O Read), and
|
|
-IOW (I/O Write).</para>
|
|
|
|
<para>The 8237 DMA is known as a “fly-by” DMA controller.
|
|
This means that the data being moved from one location to another does
|
|
not pass through the DMA chip and is not stored in the DMA chip.
|
|
Consequently, the DMA can only transfer data between an I/O port and a
|
|
memory address, but not between two I/O ports or two memory
|
|
locations.</para>
|
|
|
|
<note>
|
|
<para>The 8237 does allow two channels to be connected together to allow
|
|
memory-to-memory DMA operations in a non-“fly-by” mode,
|
|
but nobody in the PC industry uses this scarce resource this way since
|
|
it is faster to move data between memory locations using the
|
|
CPU.</para>
|
|
</note>
|
|
|
|
<para>In the PC architecture, each DMA channel is normally activated only
|
|
when the hardware that uses a given DMA channel requests a transfer by
|
|
asserting the DRQ line for that channel.</para>
|
|
|
|
<sect2>
|
|
<title>A Sample DMA transfer</title>
|
|
|
|
<para>Here is an example of the steps that occur to cause and perform a
|
|
DMA transfer. In this example, the floppy disk controller (FDC) has
|
|
just read a byte from a diskette and wants the DMA to place it in
|
|
memory at location 0x00123456. The process begins by the FDC
|
|
asserting the DRQ2 signal (the DRQ line for DMA channel 2) to alert
|
|
the DMA controller.</para>
|
|
|
|
<para>The DMA controller will note that the DRQ2 signal is asserted. The
|
|
DMA controller will then make sure that DMA channel 2 has been
|
|
programmed and is unmasked (enabled). The DMA controller also makes
|
|
sure that none of the other DMA channels are active or want to be
|
|
active and have a higher priority. Once these checks are complete,
|
|
the DMA asks the CPU to release the bus so that the DMA may use the
|
|
bus. The DMA requests the bus by asserting the HRQ signal which goes
|
|
to the CPU.</para>
|
|
|
|
<para>The CPU detects the HRQ signal, and will complete executing the
|
|
current instruction. Once the processor has reached a state where it
|
|
can release the bus, it will. Now all of the signals normally
|
|
generated by the CPU (-MEMR, -MEMW, -IOR, -IOW and a few others) are
|
|
placed in a tri-stated condition (neither high nor low) and then the
|
|
CPU asserts the HLDA signal which tells the DMA controller that it is
|
|
now in charge of the bus.</para>
|
|
|
|
<para>Depending on the processor, the CPU may be able to execute a few
|
|
additional instructions now that it no longer has the bus, but the CPU
|
|
will eventually have to wait when it reaches an instruction that must
|
|
read something from memory that is not in the internal processor cache
|
|
or pipeline.</para>
|
|
|
|
<para>Now that the DMA “is in charge”, the DMA activates its
|
|
-MEMR, -MEMW, -IOR, -IOW output signals, and the address outputs from
|
|
the DMA are set to 0x3456, which will be used to direct the byte that
|
|
is about to be transferred to a specific memory location.</para>
|
|
|
|
<para>The DMA will then let the device that requested the DMA transfer
|
|
know that the transfer is commencing. This is done by asserting the
|
|
-DACK signal, or in the case of the floppy disk controller, -DACK2 is
|
|
asserted.</para>
|
|
|
|
<para>The floppy disk controller is now responsible for placing the byte
|
|
to be transferred on the bus Data lines. Unless the floppy controller
|
|
needs more time to get the data byte on the bus (and if the peripheral
|
|
does need more time it alerts the DMA via the READY signal), the DMA
|
|
will wait one DMA clock, and then de-assert the -MEMW and -IOR signals
|
|
so that the memory will latch and store the byte that was on the bus,
|
|
and the FDC will know that the byte has been transferred.</para>
|
|
|
|
<para>Since the DMA cycle only transfers a single byte at a time, the
|
|
FDC now drops the DRQ2 signal, so the DMA knows that it is no longer
|
|
needed. The DMA will de-assert the -DACK2 signal, so that the FDC
|
|
knows it must stop placing data on the bus.</para>
|
|
|
|
<para>The DMA will now check to see if any of the other DMA channels
|
|
have any work to do. If none of the channels have their DRQ lines
|
|
asserted, the DMA controller has completed its work and will now
|
|
tri-state the -MEMR, -MEMW, -IOR, -IOW and address signals.</para>
|
|
|
|
<para>Finally, the DMA will de-assert the HRQ signal. The CPU sees
|
|
this, and de-asserts the HLDA signal. Now the CPU activates its
|
|
-MEMR, -MEMW, -IOR, -IOW and address lines, and it resumes executing
|
|
instructions and accessing main memory and the peripherals.</para>
|
|
|
|
<para>For a typical floppy disk sector, the above process is repeated
|
|
512 times, once for each byte. Each time a byte is transferred, the
|
|
address register in the DMA is incremented and the counter in the DMA
|
|
that shows how many bytes are to be transferred is decremented.</para>
|
|
|
|
<para>When the counter reaches zero, the DMA asserts the EOP signal,
|
|
which indicates that the counter has reached zero and no more data
|
|
will be transferred until the DMA controller is reprogrammed by the
|
|
CPU. This event is also called the Terminal Count (TC). There is only
|
|
one EOP signal, and since only one DMA channel can be active at any
|
|
instant, the DMA channel that is currently active must be the DMA
|
|
channel that just completed its task.</para>
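<para>The bookkeeping described above (one byte per cycle, address
incremented, counter decremented, Terminal Count at the end) can be
modelled in software. The sketch below is a simplified, hypothetical
model and not a description of the 8237's internal registers: it counts
the bytes that remain, and the structure and function names are invented
for illustration.</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of one DMA channel in Single mode. */
struct dma_chan {
	uint8_t	*mem;		/* stands in for physical memory */
	size_t	 addr;		/* current address register */
	size_t	 remaining;	/* bytes still to be transferred */
	bool	 tc;		/* Terminal Count reached (EOP asserted) */
};

/*
 * One DRQ/-DACK cycle: store the byte the peripheral placed on the
 * bus, increment the address and decrement the counter.  Returns
 * false once the channel has reached Terminal Count, because it
 * must then be reprogrammed before it moves any more data.
 */
static bool
dma_single_cycle(struct dma_chan *c, uint8_t bus_data)
{
	if (c->tc)
		return (false);
	assert(c->remaining > 0);
	c->mem[c->addr++] = bus_data;
	if (--c->remaining == 0)
		c->tc = true;
	return (true);
}

/* Transfer one 512-byte sector, one bus cycle per byte. */
static void
dma_sector_selfcheck(void)
{
	static uint8_t mem[1024];
	struct dma_chan c = { mem, 0x100, 512, false };
	int cycles = 0;

	while (dma_single_cycle(&c, (uint8_t)cycles))
		cycles++;
	assert(cycles == 512);
	assert(c.tc && c.addr == 0x100 + 512);
	assert(mem[0x100] == 0 && mem[0x100 + 511] == (uint8_t)511);
}
```
]]></programlisting>

<para>In <literal>dma_sector_selfcheck()</literal> the loop body runs
exactly 512 times, once per byte of the sector, and the next request is
refused because the channel has reached Terminal Count.</para>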
|
|
|
|
<para>If a peripheral wants to generate an interrupt when the transfer
|
|
of a buffer is complete, it can test for its -DACKn signal and the EOP
|
|
signal both being asserted at the same time. When that happens, it
|
|
means the DMA will not transfer any more information for that
|
|
peripheral without intervention by the CPU. The peripheral can then
|
|
assert one of the interrupt signals to get the processor's attention.
|
|
In the PC architecture, the DMA chip itself is not capable of
|
|
generating an interrupt. The peripheral and its associated hardware
|
|
is responsible for generating any interrupt that occurs.
|
|
Consequently, it is possible to have a peripheral that uses DMA but
|
|
does not use interrupts.</para>
|
|
|
|
<para>It is important to understand that although the CPU always
|
|
releases the bus to the DMA when the DMA makes the request, this
|
|
action is invisible to both applications and the operating system,
except for slight changes in the amount of time the processor takes to
execute instructions when the DMA is active. Consequently, the
|
|
processor must poll the peripheral, poll the registers in the DMA
|
|
chip, or receive an interrupt from the peripheral to know for certain
|
|
when a DMA transfer has completed.</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>DMA Page Registers and 16Meg address space limitations</title>
|
|
|
|
<para>You may have noticed that instead of setting the
address lines to 0x00123456, as the example required, the DMA only set
them to 0x3456. The reason for this takes a bit of explaining.</para>
|
|
|
|
<para>When the original IBM PC was designed, IBM elected to use both DMA
|
|
and interrupt controller chips that were designed for use with the
|
|
8085, an 8-bit processor with an address space of 16 bits (64K).
|
|
Since the IBM PC supported more than 64K of memory, something had to
|
|
be done to allow the DMA to read or write memory locations above the
|
|
64K mark. What IBM did to solve this problem was to add an external
|
|
data latch for each DMA channel that holds the upper bits of the
|
|
address to be read from or written to. Whenever a DMA channel is
|
|
active, the contents of that latch are written to the address bus and
|
|
kept there until the DMA operation for the channel ends. IBM called
|
|
these latches “Page Registers”.</para>
|
|
|
|
<para>So for our example above, the DMA would put the 0x3456 part of the
|
|
address on the bus, and the Page Register for DMA channel 2 would put
|
|
0x0012xxxx on the bus. Together, these two values form the complete
|
|
address in memory that is to be accessed.</para>
|
|
|
|
<para>Because the Page Register latch is independent of the DMA chip,
|
|
the area of memory to be read or written must not span a 64K physical
|
|
boundary. For example, if the DMA accesses memory location 0xffff,
|
|
after that transfer the DMA will then increment the address register
|
|
and the DMA will access the next byte at location 0x0000, not 0x10000.
|
|
The results of letting this happen are probably not intended.</para>
|
|
|
|
<note>
|
|
<para>“Physical” 64K boundaries should not be confused
|
|
with 8086-mode 64K “Segments”, which are created by
|
|
mathematically adding a segment register with an offset register.
|
|
Page Registers have no address overlap and are mathematically OR-ed
|
|
together.</para>
|
|
</note>
|
|
|
|
<para>To further complicate matters, the external DMA address latches on
|
|
the PC/AT hold only eight bits, so that gives us 8+16=24 bits, which
|
|
means that the DMA can only point at memory locations between 0 and
|
|
16Meg. For newer computers that allow more than 16Meg of memory, the
|
|
standard PC-compatible DMA cannot access memory locations above
|
|
16Meg.</para>
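<para>The way the Page Register and the 8237 address register combine,
and what happens at a 64K boundary, can be sketched in C. The structure
and function names below are hypothetical; the model assumes an 8-bit
channel (0 to 3) with the 8-bit PC/AT page latch.</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Addressing state of one 8-bit DMA channel. */
struct dma_chan_addr {
	uint8_t	 page;		/* external Page Register (upper 8 bits) */
	uint16_t offset;	/* 8237 address register (lower 16 bits) */
};

/* Address seen on the bus: page bits OR-ed with the 8237's 16 bits. */
static uint32_t
dma_bus_addr(const struct dma_chan_addr *c)
{
	return (((uint32_t)c->page << 16) | c->offset);
}

/* After each transfer only the 8237's 16-bit register is incremented. */
static void
dma_advance(struct dma_chan_addr *c)
{
	c->offset++;	/* wraps from 0xffff to 0x0000; page is untouched */
}

static void
dma_wrap_selfcheck(void)
{
	struct dma_chan_addr c = { 0x12, 0x3456 };

	assert(dma_bus_addr(&c) == 0x00123456U);

	/* Crossing a 64K boundary wraps inside the same page. */
	c.offset = 0xffff;
	dma_advance(&c);
	assert(dma_bus_addr(&c) == 0x00120000U);	/* not 0x00130000 */

	/* 8 + 16 = 24 bits: the highest reachable address is 16 MB - 1. */
	c.page = 0xff;
	c.offset = 0xffff;
	assert(dma_bus_addr(&c) == 0x00ffffffU);
}
```
]]></programlisting>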
|
|
|
|
<para>To get around this restriction, operating systems will reserve a
|
|
RAM buffer in an area below 16Meg that also does not span a physical
|
|
64K boundary. Then the DMA will be programmed to transfer data from
|
|
the peripheral and into that buffer. Once the DMA has moved the data
|
|
into this buffer, the operating system will then copy the data from
|
|
the buffer to the address where the data is really supposed to be
|
|
stored.</para>
|
|
|
|
<para>When writing data from an address above 16Meg to a DMA-based
|
|
peripheral, the data must be first copied from where it resides into a
|
|
buffer located below 16Meg, and then the DMA can copy the data from
|
|
the buffer to the hardware. In FreeBSD, these reserved buffers are
|
|
called “Bounce Buffers”. In the MS-DOS world, they are
|
|
sometimes called “Smart Buffers”.</para>
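<para>Whether a given buffer can be handed to the DMA directly or has to
go through a bounce buffer follows from the two restrictions above. The
helper below is a minimal sketch of that decision for an 8-bit channel;
<literal>dma_range_ok</literal> and the two constants are hypothetical
names, not FreeBSD kernel interfaces.</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_ADDR_LIMIT	0x01000000U	/* 16 MB: reach of the 24-bit address */
#define DMA_PAGE_SIZE	0x00010000U	/* 64 KB: reach of the 8237 register */

/*
 * Return true if the physical range [phys, phys + len) lies entirely
 * below 16 MB and inside a single 64K page, so the DMA can use it
 * directly.  Return false if a bounce buffer is needed instead.
 */
static bool
dma_range_ok(uint32_t phys, size_t len)
{
	uint64_t last;

	if (len == 0 || len > DMA_PAGE_SIZE)
		return (false);
	last = (uint64_t)phys + len - 1;
	if (last >= DMA_ADDR_LIMIT)
		return (false);		/* reaches above 16 MB */
	return (phys / DMA_PAGE_SIZE == last / DMA_PAGE_SIZE);
}

static void
dma_range_selfcheck(void)
{
	assert(dma_range_ok(0x00123456U, 512));		/* fits in page 0x12 */
	assert(!dma_range_ok(0x0012ff00U, 512));	/* spans 0x130000 */
	assert(!dma_range_ok(0x01000000U, 512));	/* above 16 MB */
	assert(dma_range_ok(0x00120000U, 0x10000));	/* exactly one page */
}
```
]]></programlisting>

<para>When the check fails, the data is staged in a reserved buffer for
which it succeeds and copied to or from its real location by the
CPU.</para>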
|
|
|
|
<note>
|
|
<para>A new implementation of the 8237, called the 82374, allows 16
bits of page register to be specified, which allows access to the entire
32-bit address space without the use of bounce buffers.</para>
|
|
</note>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>DMA Operational Modes and Settings</title>
|
|
|
|
<para>The 8237 DMA can be operated in several modes. The main ones
|
|
are:</para>
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>Single</term>
|
|
|
|
<listitem>
|
|
<para>A single byte (or word) is transferred. The DMA must
|
|
release and re-acquire the bus for each additional byte. This is
|
|
commonly used by devices that cannot transfer the entire block
|
|
of data immediately. The peripheral will request the DMA each
|
|
time it is ready for another transfer.</para>
|
|
|
|
<para>The standard PC-compatible floppy disk controller (NEC 765)
|
|
only has a one-byte buffer, so it uses this mode.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Block/Demand</term>
|
|
|
|
<listitem>
|
|
<para>Once the DMA acquires the system bus, an entire block of
|
|
data is transferred, up to a maximum of 64K. If the peripheral
|
|
needs additional time, it can assert the READY signal to suspend
|
|
the transfer briefly. READY should not be used excessively, and
|
|
for slow peripheral transfers, the Single Transfer Mode should
|
|
be used instead.</para>
|
|
|
|
<para>The difference between Block and Demand is that once a Block
|
|
transfer is started, it runs until the transfer count reaches
|
|
zero. DRQ only needs to be asserted until -DACK is asserted.
|
|
Demand Mode will transfer one or more bytes until DRQ is
|
|
de-asserted, at which point the DMA suspends the transfer and
|
|
releases the bus back to the CPU. When DRQ is asserted later,
|
|
the transfer resumes where it was suspended.</para>
|
|
|
|
<para>Older hard disk controllers used Demand Mode until CPU
|
|
speeds increased to the point that it was more efficient to
|
|
transfer the data using the CPU, particularly if the memory
|
|
locations used in the transfer were above the 16Meg mark.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Cascade</term>
|
|
|
|
<listitem>
|
|
<para>This mechanism allows a DMA channel to request the bus, but
|
|
then the attached peripheral device is responsible for placing
|
|
the addressing information on the bus instead of the DMA. This
|
|
is also used to implement a technique known as “Bus
|
|
Mastering”.</para>
|
|
|
|
<para>When a DMA channel in Cascade Mode receives control of the
|
|
bus, the DMA does not place addresses and I/O control signals on
|
|
the bus like the DMA normally does when it is active. Instead,
|
|
the DMA only asserts the -DACK signal for the active DMA
|
|
channel.</para>
|
|
|
|
<para>At this point it is up to the peripheral connected to that
|
|
DMA channel to provide address and bus control signals. The
|
|
peripheral has complete control over the system bus, and can do
|
|
reads and/or writes to any address below 16Meg. When the
|
|
peripheral is finished with the bus, it de-asserts the DRQ line,
|
|
and the DMA controller can then return control to the CPU or to
|
|
some other DMA channel.</para>
|
|
|
|
<para>Cascade Mode can be used to chain multiple DMA controllers
|
|
together, and this is exactly what DMA Channel 4 is used for in
|
|
the PC architecture. When a peripheral requests the bus on DMA
|
|
channels 0, 1, 2 or 3, the slave DMA controller asserts HRQ,
but this wire is actually connected to DRQ4 on the primary DMA
controller instead of to the CPU. The primary DMA controller,
thinking it has work to do on Channel 4, requests the bus from
the CPU using its own HRQ signal. Once the CPU grants the bus to the
|
|
primary DMA controller, -DACK4 is asserted, and that wire is
|
|
actually connected to the HLDA signal on the slave DMA
|
|
controller. The slave DMA controller then transfers data for
|
|
the DMA channel that requested it (0, 1, 2 or 3), or the slave
|
|
DMA may grant the bus to a peripheral that wants to perform its
|
|
own bus-mastering, such as a SCSI controller.</para>
|
|
|
|
<para>Because of this wiring arrangement, only DMA channels 0, 1,
|
|
2, 3, 5, 6 and 7 are usable with peripherals on PC/AT
|
|
systems.</para>
|
|
|
|
<note>
|
|
<para>DMA channel 0 was reserved for refresh operations in early
|
|
IBM PC computers, but is generally available for use by
|
|
peripherals in modern systems.</para>
|
|
</note>
|
|
|
|
<para>When a peripheral is performing Bus Mastering, it is
|
|
important that the peripheral transmit data to or from memory
|
|
constantly while it holds the system bus. If the peripheral
|
|
cannot do this, it must release the bus frequently so that the
|
|
system can perform refresh operations on main memory.</para>
|
|
|
|
<para>The Dynamic RAM used in all PCs for main memory must be
|
|
accessed frequently to keep the bits stored in the components
|
|
“charged”. Dynamic RAM essentially consists of
|
|
millions of capacitors with each one holding one bit of data.
|
|
These capacitors are charged with power to represent a
|
|
<literal>1</literal> or drained to represent a
|
|
<literal>0</literal>. Because all capacitors leak, power must
|
|
be added at regular intervals to keep the <literal>1</literal>
|
|
values intact. The RAM chips actually handle the task of
|
|
pumping power back into all of the appropriate locations in RAM,
|
|
but they must be told when to do it by the rest of the computer
|
|
so that the refresh activity won't interfere with the computer
|
|
wanting to access RAM normally. If the computer is unable to
|
|
refresh memory, the contents of memory will become corrupted in
|
|
just a few milliseconds.</para>
|
|
|
|
<para>Since memory read and write cycles “count” as
|
|
refresh cycles (a dynamic RAM refresh cycle is actually an
|
|
incomplete memory read cycle), as long as the peripheral
|
|
controller continues reading or writing data to sequential
|
|
memory locations, that action will refresh all of memory.</para>
|
|
|
|
<para>Bus-mastering is found in some SCSI host interfaces and
|
|
other high-performance peripheral controllers.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Autoinitialize</term>
|
|
|
|
<listitem>
|
|
<para>This mode causes the DMA to perform Single, Block or Demand
|
|
transfers, but when the DMA transfer counter reaches zero, the
|
|
counter and address are set back to where they were when the DMA
|
|
channel was originally programmed. This means that as long as
|
|
the peripheral requests transfers, they will be granted. It is
|
|
up to the CPU to move new data into the fixed buffer ahead of
|
|
where the DMA is about to transfer it when doing output
|
|
operations, and read new data out of the buffer behind where the
|
|
DMA is writing when doing input operations.</para>
|
|
|
|
<para>This technique is frequently used on audio devices that have
|
|
small or no hardware “sample” buffers. There is
|
|
additional CPU overhead to manage this “circular”
|
|
buffer, but in some cases this may be the only way to eliminate
|
|
the latency that occurs when the DMA counter reaches zero and
|
|
the DMA stops transfers until it is reprogrammed.</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
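<para>The Autoinitialize behaviour described in the list above can be
modelled in a few lines. As before, this is a simplified sketch with
invented names that counts the bytes remaining in the current pass; it
only shows the reload that turns a fixed buffer into a circular
one.</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of a channel in Autoinitialize mode. */
struct dma_auto_chan {
	uint16_t base_addr;	/* address originally programmed */
	uint16_t base_count;	/* byte count originally programmed */
	uint16_t cur_addr;	/* working address register */
	uint16_t cur_count;	/* bytes left in the current pass */
};

/* Account for one transferred byte; returns 1 when the buffer wraps. */
static int
dma_auto_step(struct dma_auto_chan *c)
{
	assert(c->cur_count > 0);
	c->cur_addr++;
	if (--c->cur_count == 0) {
		/* Reload instead of stopping and waiting for the CPU. */
		c->cur_addr = c->base_addr;
		c->cur_count = c->base_count;
		return (1);
	}
	return (0);
}

static void
dma_auto_selfcheck(void)
{
	struct dma_auto_chan c = { 0x4000, 4, 0x4000, 4 };
	int i, wraps = 0;

	for (i = 0; i < 10; i++)
		wraps += dma_auto_step(&c);
	assert(wraps == 2);		/* wrapped after bytes 4 and 8 */
	assert(c.cur_addr == 0x4002 && c.cur_count == 2);
}
```
]]></programlisting>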
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Programming the DMA</title>
|
|
|
|
<para>The DMA channel that is to be programmed should always be
|
|
“masked” before loading any settings. This is because the
|
|
hardware might unexpectedly assert the DRQ for that channel, and the
|
|
DMA might respond, even though not all of the parameters have been
|
|
loaded or updated.</para>
|
|
|
|
<para>Once the channel is masked, the host must specify the direction of
the transfer (memory-to-I/O or I/O-to-memory) and the mode of DMA
operation to be used for the transfer (Single, Block, Demand, Cascade,
etc.), and then load the address and length of the transfer. The length
|
|
that is loaded is one less than the amount you expect the DMA to
|
|
transfer. The LSB and MSB of the address and length are written to
|
|
the same 8-bit I/O port, so another port must be written to first to
|
|
guarantee that the DMA accepts the first byte as the LSB and the
|
|
second byte as the MSB of the length and address.</para>
|
|
|
|
<para>Then, be sure to update the Page Register, which is external to
|
|
the DMA and is accessed through a different set of I/O ports.</para>
|
|
|
|
<para>Once all the settings are ready, the DMA channel can be un-masked.
|
|
That DMA channel is now considered to be “armed”, and will
|
|
respond when the DRQ line for that channel is asserted.</para>
|
|
|
|
<para>Refer to a hardware data book for precise programming details for
|
|
the 8237. You will also need to refer to the I/O port map for the PC
|
|
system, which describes where the DMA and Page Register ports are
|
|
located. A complete port map table is located below.</para>
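<para>The programming sequence described in this section can be sketched
as follows for DMA channel 2. The sketch records the port writes in an
array instead of touching hardware, so <literal>outb_logged</literal>
and <literal>dma_arm_ch2_read</literal> are hypothetical names. The port
numbers and the mode byte follow the conventional PC/AT assignments and
are assumptions here; verify them against the port map and an 8237 data
book before relying on them.</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Controller #1 ports used for channel 2 (assumed standard PC map). */
#define DMA1_CH2_ADDR	0x04	/* address register, LSB then MSB */
#define DMA1_CH2_COUNT	0x05	/* count register, LSB then MSB */
#define DMA1_MASK	0x0a	/* single mask register bit */
#define DMA1_MODE	0x0b	/* mode register */
#define DMA1_CLEAR_FF	0x0c	/* resets the LSB/MSB flip-flop */
#define DMA_PAGE_CH2	0x81	/* channel 2 Page Register */

#define DMA_MODE_SINGLE_TO_MEM	0x44	/* Single mode, I/O-to-memory */

/* Log standing in for real port output. */
struct port_write {
	uint16_t port;
	uint8_t	 val;
};
static struct port_write port_log[16];
static size_t port_log_len;

static void
outb_logged(uint16_t port, uint8_t val)
{
	assert(port_log_len < sizeof(port_log) / sizeof(port_log[0]));
	port_log[port_log_len].port = port;
	port_log[port_log_len].val = val;
	port_log_len++;
}

/* Arm channel 2 to move `len` bytes from the device to `phys`. */
static void
dma_arm_ch2_read(uint32_t phys, uint32_t len)
{
	uint16_t count;

	assert(len >= 1 && len <= 0x10000);
	assert(phys < 0x01000000U);			/* below 16 MB */
	assert((phys >> 16) == ((phys + len - 1) >> 16)); /* one 64K page */
	count = (uint16_t)(len - 1);	/* programmed as length - 1 */

	outb_logged(DMA1_MASK, 0x04 | 2);	/* mask channel 2 */
	outb_logged(DMA1_MODE, DMA_MODE_SINGLE_TO_MEM | 2);
	outb_logged(DMA1_CLEAR_FF, 0);		/* next byte is the LSB */
	outb_logged(DMA1_CH2_ADDR, (uint8_t)(phys & 0xff));
	outb_logged(DMA1_CH2_ADDR, (uint8_t)((phys >> 8) & 0xff));
	outb_logged(DMA1_CH2_COUNT, (uint8_t)(count & 0xff));
	outb_logged(DMA1_CH2_COUNT, (uint8_t)(count >> 8));
	outb_logged(DMA_PAGE_CH2, (uint8_t)((phys >> 16) & 0xff));
	outb_logged(DMA1_MASK, 2);		/* unmask: channel is armed */
}
```
]]></programlisting>

<para>For the earlier example, <literal>dma_arm_ch2_read(0x00123456,
512)</literal> loads 0x56 and 0x34 into the address register, 0xff and
0x01 (that is, 511) into the count register, and 0x12 into the Page
Register. In a real driver the logging function would be replaced by
actual port output.</para>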
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>DMA Port Map</title>
|
|
|
|
<para>All systems based on the IBM-PC and PC/AT have the DMA hardware
|
|
located at the same I/O ports. The complete list is provided below.
|
|
Ports assigned to DMA Controller #2 are undefined on non-AT
|
|
designs.</para>
|
|
|
|
<sect3>
|
|
<title>0x00–0x1f DMA Controller #1 (Channels 0, 1, 2 and
|
|
3)</title>
|
|
|
|
<para>DMA Address and Count Registers</para>
|
|
|
|
<informaltable frame="none">
|
|
<tgroup cols="3">
|
|
<tbody>
|
|
<row>
|
|
<entry>0x00</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 0 starting address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x00</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 0 current address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x01</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 0 starting word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x01</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 0 remaining word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x02</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 1 starting address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x02</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 1 current address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x03</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 1 starting word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x03</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 1 remaining word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x04</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 2 starting address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x04</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 2 current address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x05</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 2 starting word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x05</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 2 remaining word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x06</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 3 starting address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x06</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 3 current address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x07</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 3 starting word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x07</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 3 remaining word count</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
|
|
<para>DMA Command Registers</para>
|
|
|
|
<informaltable frame="none">
|
|
<tgroup cols="3">
|
|
<tbody>
|
|
<row>
|
|
<entry>0x08</entry>
|
|
<entry>write</entry>
|
|
<entry>Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x08</entry>
|
|
<entry>read</entry>
|
|
<entry>Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x09</entry>
|
|
<entry>write</entry>
|
|
<entry>Request Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x09</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0a</entry>
|
|
<entry>write</entry>
|
|
<entry>Single Mask Register Bit</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0a</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0b</entry>
|
|
<entry>write</entry>
|
|
<entry>Mode Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0b</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0c</entry>
|
|
<entry>write</entry>
|
|
<entry>Clear LSB/MSB Flip-Flop</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0c</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0d</entry>
|
|
<entry>write</entry>
|
|
<entry>Master Clear/Reset</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0d</entry>
|
|
<entry>read</entry>
|
|
<entry>Temporary Register (not available on newer
|
|
versions)</entry>
|
|
</row>
|
|
<row>
|
|
<entry>0x0e</entry>
|
|
<entry>write</entry>
|
|
<entry>Clear Mask Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0e</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0f</entry>
|
|
<entry>write</entry>
|
|
<entry>Write All Mask Register Bits</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x0f</entry>
|
|
<entry>read</entry>
|
|
<entry>Read All Mask Register Bits (only in Intel
|
|
82374)</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>0xc0–0xdf DMA Controller #2 (Channels 4, 5, 6 and
|
|
7)</title>
|
|
|
|
<para>DMA Address and Count Registers</para>
|
|
|
|
<informaltable frame="none">
|
|
<tgroup cols="3">
|
|
<tbody>
|
|
<row>
|
|
<entry>0xc0</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 4 starting address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc0</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 4 current address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc2</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 4 starting word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc2</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 4 remaining word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc4</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 5 starting address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc4</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 5 current address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc6</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 5 starting word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc6</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 5 remaining word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc8</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 6 starting address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xc8</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 6 current address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xca</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 6 starting word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xca</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 6 remaining word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xcc</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 7 starting address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xcc</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 7 current address</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xce</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 7 starting word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xce</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 7 remaining word count</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
|
|
<para>DMA Command Registers</para>
|
|
|
|
<informaltable frame="none">
|
|
<tgroup cols="3">
|
|
<tbody>
|
|
<row>
|
|
<entry>0xd0</entry>
|
|
<entry>write</entry>
|
|
<entry>Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd0</entry>
|
|
<entry>read</entry>
|
|
<entry>Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd2</entry>
|
|
<entry>write</entry>
|
|
<entry>Request Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd2</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd4</entry>
|
|
<entry>write</entry>
|
|
<entry>Single Mask Register Bit</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd4</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd6</entry>
|
|
<entry>write</entry>
|
|
<entry>Mode Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd6</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd8</entry>
|
|
<entry>write</entry>
|
|
<entry>Clear LSB/MSB Flip-Flop</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xd8</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xda</entry>
|
|
<entry>write</entry>
|
|
<entry>Master Clear/Reset</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xda</entry>
|
|
<entry>read</entry>
|
|
<entry>Temporary Register (not present in Intel
|
|
82374)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xdc</entry>
|
|
<entry>write</entry>
|
|
<entry>Clear Mask Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xdc</entry>
|
|
<entry>read</entry>
|
|
<entry>-</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xde</entry>
|
|
<entry>write</entry>
|
|
<entry>Write All Mask Register Bits</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0xde</entry>
|
|
<entry>read</entry>
|
|
<entry>Read All Mask Register Bits (only in Intel
|
|
82374)</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>0x80–0x9f DMA Page Registers</title>
|
|
|
|
<informaltable frame="none">
|
|
<tgroup cols="3">
|
|
<tbody>
|
|
<row>
|
|
<entry>0x87</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 0 Low byte (23-16) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x83</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 1 Low byte (23-16) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x81</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 2 Low byte (23-16) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x82</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 3 Low byte (23-16) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x8b</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 5 Low byte (23-16) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x89</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 6 Low byte (23-16) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x8a</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 7 Low byte (23-16) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x8f</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Low byte page Refresh</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>0x400–0x4ff 82374 Enhanced DMA Registers</title>
|
|
|
|
<para>The Intel 82374 EISA System Component (ESC) was introduced in
|
|
early 1996 and includes a DMA controller that provides a superset of
|
|
8237 functionality as well as other PC-compatible core peripheral
|
|
components in a single package. This chip is targeted at both EISA
|
|
and PCI platforms, and provides modern DMA features like
|
|
scatter-gather and ring buffers, as well as direct access by the system
|
|
DMA to all 32 bits of address space.</para>
|
|
|
|
<para>If these features are used, code should also be included to
|
|
provide similar functionality in the previous 16 years' worth of
|
|
PC-compatible computers. For compatibility reasons, some of the
|
|
82374 registers must be programmed <emphasis>after</emphasis>
|
|
programming the traditional 8237 registers for each transfer.
|
|
Writing to a traditional 8237 register forces the contents of some
|
|
of the 82374 enhanced registers to zero to provide backward software
|
|
compatibility.</para>
|
|
|
|
<informaltable frame="none">
|
|
<tgroup cols="3">
|
|
<tbody>
|
|
<row>
|
|
<entry>0x401</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 0 High byte (bits 23-16) word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x403</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 1 High byte (bits 23-16) word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x405</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 2 High byte (bits 23-16) word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x407</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 3 High byte (bits 23-16) word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4c6</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 5 High byte (bits 23-16) word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4ca</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 6 High byte (bits 23-16) word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4ce</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 7 High byte (bits 23-16) word count</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x487</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 0 High byte (bits 31-24) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x483</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 1 High byte (bits 31-24) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x481</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 2 High byte (bits 31-24) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x482</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 3 High byte (bits 31-24) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x48b</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 5 High byte (bits 31-24) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x489</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 6 High byte (bits 31-24) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x48a</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 7 High byte (bits 31-24) page Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x48f</entry>
|
|
<entry>r/w</entry>
|
|
<entry>High byte page Refresh</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4e0</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 0 Stop Register (bits 7-2)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4e1</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 0 Stop Register (bits 15-8)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4e2</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 0 Stop Register (bits 23-16)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4e4</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 1 Stop Register (bits 7-2)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4e5</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 1 Stop Register (bits 15-8)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4e6</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 1 Stop Register (bits 23-16)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4e8</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 2 Stop Register (bits 7-2)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4e9</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 2 Stop Register (bits 15-8)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4ea</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 2 Stop Register (bits 23-16)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4ec</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 3 Stop Register (bits 7-2)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4ed</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 3 Stop Register (bits 15-8)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4ee</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 3 Stop Register (bits 23-16)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4f4</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 5 Stop Register (bits 7-2)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4f5</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 5 Stop Register (bits 15-8)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4f6</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 5 Stop Register (bits 23-16)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4f8</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 6 Stop Register (bits 7-2)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4f9</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 6 Stop Register (bits 15-8)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4fa</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 6 Stop Register (bits 23-16)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4fc</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 7 Stop Register (bits 7-2)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4fd</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 7 Stop Register (bits 15-8)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4fe</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 7 Stop Register (bits 23-16)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x40a</entry>
|
|
<entry>write</entry>
|
|
<entry>Channels 0-3 Chaining Mode Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x40a</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel Interrupt Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4d4</entry>
|
|
<entry>write</entry>
|
|
<entry>Channels 4-7 Chaining Mode Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x4d4</entry>
|
|
<entry>read</entry>
|
|
<entry>Chaining Mode Status</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x40c</entry>
|
|
<entry>read</entry>
|
|
<entry>Chain Buffer Expiration Control Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x410</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 0 Scatter-Gather Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x411</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 1 Scatter-Gather Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x412</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 2 Scatter-Gather Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x413</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 3 Scatter-Gather Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x415</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 5 Scatter-Gather Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x416</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 6 Scatter-Gather Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x417</entry>
|
|
<entry>write</entry>
|
|
<entry>Channel 7 Scatter-Gather Command Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x418</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 0 Scatter-Gather Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x419</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 1 Scatter-Gather Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x41a</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 2 Scatter-Gather Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x41b</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 3 Scatter-Gather Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x41d</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 5 Scatter-Gather Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x41e</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 6 Scatter-Gather Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x41f</entry>
|
|
<entry>read</entry>
|
|
<entry>Channel 7 Scatter-Gather Status Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x420-0x423</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 0 Scatter-Gather Descriptor Table Pointer
|
|
Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x424-0x427</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 1 Scatter-Gather Descriptor Table Pointer
|
|
Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x428-0x42b</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 2 Scatter-Gather Descriptor Table Pointer
|
|
Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x42c-0x42f</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 3 Scatter-Gather Descriptor Table Pointer
|
|
Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x434-0x437</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 5 Scatter-Gather Descriptor Table Pointer
|
|
Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x438-0x43b</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 6 Scatter-Gather Descriptor Table Pointer
|
|
Register</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>0x43c-0x43f</entry>
|
|
<entry>r/w</entry>
|
|
<entry>Channel 7 Scatter-Gather Descriptor Table Pointer
|
|
Register</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</informaltable>
|
|
</sect3>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="internals-vm">
|
|
<title>The FreeBSD VM System</title>
|
|
|
|
<para><emphasis>Contributed by &a.dillon;. 6 Feb 1999</emphasis></para>
|
|
|
|
<sect2>
|
|
<title>Management of physical
|
|
memory—<literal>vm_page_t</literal></title>
|
|
|
|
<para>Physical memory is managed on a page-by-page basis through the
|
|
<literal>vm_page_t</literal> structure. Pages of physical memory are
|
|
categorized through the placement of their respective
|
|
<literal>vm_page_t</literal> structures on one of several paging
|
|
queues.</para>
|
|
|
|
<para>A page can be in a wired, active, inactive, cache, or free state.
|
|
Except for the wired state, the page is typically placed in a doubly
|
|
linked list queue representing the state that it is in. Wired pages
|
|
are not placed on any queue.</para>
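<para>The relationship between page states and queues can be sketched in a few lines of C. This is only an illustration under stated assumptions: <literal>struct page_sketch</literal>, the <literal>ST_*</literal> state names and <literal>page_set_state()</literal> are hypothetical and are not the kernel's actual definitions. Each non-wired state has one doubly linked queue, and a wired page is linked on none of them.</para>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state names; ST_WIRED is 0 so a zeroed page starts off-queue. */
enum page_state { ST_WIRED, ST_ACTIVE, ST_INACTIVE, ST_CACHE, ST_FREE, ST_NSTATES };

struct page_sketch {
    enum page_state state;
    struct page_sketch *prev, *next;   /* links within the state's queue */
};

struct page_queue {
    struct page_sketch *head, *tail;
    size_t count;
};

/* One queue per state; the ST_WIRED slot is never used. */
static struct page_queue queues[ST_NSTATES];

/* Unlink a page from the queue of its current state, if it is on one. */
static void page_unqueue(struct page_sketch *p)
{
    struct page_queue *q;

    if (p->state == ST_WIRED)
        return;
    q = &queues[p->state];
    if (p->prev != NULL) p->prev->next = p->next; else q->head = p->next;
    if (p->next != NULL) p->next->prev = p->prev; else q->tail = p->prev;
    p->prev = p->next = NULL;
    q->count--;
}

/* Move a page to a new state, appending it to the tail of that queue. */
static void page_set_state(struct page_sketch *p, enum page_state new_state)
{
    struct page_queue *q;

    page_unqueue(p);
    p->state = new_state;
    if (new_state == ST_WIRED)
        return;                        /* wired pages sit on no queue */
    q = &queues[new_state];
    p->prev = q->tail;
    p->next = NULL;
    if (q->tail != NULL) q->tail->next = p; else q->head = p;
    q->tail = p;
    q->count++;
}
```

<para>Appending at the tail and scanning from the head gives each queue the rough least-recently-used ordering that the paging code relies on.</para>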
|
|
|
|
<para>FreeBSD implements a more involved paging queue for cached and
|
|
free pages in order to implement page coloring. Each of these states
|
|
involves multiple queues arranged according to the size of the
|
|
processor's L1 and L2 caches. When a new page needs to be allocated,
|
|
FreeBSD attempts to obtain one that is reasonably well aligned from
|
|
the point of view of the L1 and L2 caches relative to the VM object
|
|
the page is being allocated for.</para>
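<para>The idea behind page coloring can be shown with a small calculation. The sketch below assumes a generic scheme in which the number of colors is the cache size divided by the product of associativity and page size, and a page's preferred color is derived from its index within the VM object. The function names and the fallback search are hypothetical and are not taken from the FreeBSD sources.</para>

```c
#include <assert.h>
#include <stddef.h>

/* Number of distinct page colors for a cache of the given geometry. */
static unsigned num_colors(size_t cache_bytes, unsigned ways, size_t page_bytes)
{
    assert(ways > 0 && page_bytes > 0);
    return (unsigned)(cache_bytes / (ways * page_bytes));
}

/* Preferred color for page `pindex` of an object whose first page has color `obj_color`. */
static unsigned page_color(unsigned long pindex, unsigned obj_color, unsigned ncolors)
{
    assert(ncolors > 0);
    return (unsigned)((pindex + obj_color) % ncolors);
}

/*
 * Pick a color that has free pages, starting at the preferred one and
 * trying the following colors in turn.  Returns -1 if nothing is free.
 */
static int pick_free_color(const unsigned *free_count, unsigned ncolors, unsigned want)
{
    unsigned i;

    for (i = 0; i < ncolors; i++) {
        unsigned c = (want + i) % ncolors;
        if (free_count[c] > 0)
            return (int)c;
    }
    return -1;
}
```

<para>With a 256 KB, 4-way cache and 4 KB pages this gives 16 colors, so consecutive pages of an object prefer consecutive colors and tend not to compete for the same cache lines.</para>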
|
|
|
|
<para>Additionally, a page may be held with a reference count or locked
|
|
with a busy count. The VM system also implements an “ultimate
|
|
locked” state for a page using the PG_BUSY bit in the page's
|
|
flags.</para>
|
|
|
|
<para>In general terms, each of the paging queues operates in an LRU
|
|
fashion. A page is typically placed in a wired or active state
|
|
initially. When wired, the page is usually associated with a page
|
|
table somewhere. The VM system ages the page by scanning pages in a
|
|
more active paging queue (LRU) in order to move them to a less-active
|
|
paging queue. Pages that get moved into the cache are still
|
|
associated with a VM object but are candidates for immediate reuse.
|
|
Pages in the free queue are truly free. FreeBSD attempts to minimize
|
|
the number of pages in the free queue, but a certain minimum number of
|
|
truly free pages must be maintained in order to accommodate page
|
|
allocation at interrupt time.</para>
|
|
|
|
<para>If a process attempts to access a page that does not exist in its
|
|
page table but does exist in one of the paging queues (such as the
|
|
inactive or cache queues), a relatively inexpensive page reactivation
|
|
fault occurs which causes the page to be reactivated. If the page
|
|
does not exist in system memory at all, the process must block while
|
|
the page is brought in from disk.</para>
|
|
|
|
<para>FreeBSD dynamically tunes its paging queues and attempts to
|
|
maintain reasonable ratios of pages in the various queues as well as
|
|
a reasonable breakdown of clean versus dirty pages.
|
|
The amount of rebalancing that occurs depends on the system's memory
|
|
load. This rebalancing is implemented by the pageout daemon and
|
|
involves laundering dirty pages (syncing them with their backing
|
|
store), noticing when pages are actively referenced (resetting their
|
|
position in the LRU queues or moving them between queues), migrating
|
|
pages between queues when the queues are out of balance, and so forth.
|
|
FreeBSD's VM system is willing to take a reasonable number of
|
|
reactivation page faults to determine how active or how idle a page
|
|
actually is. This leads to better decisions being made as to when to
|
|
launder or swap-out a page.</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>The unified buffer
|
|
cache—<literal>vm_object_t</literal></title>
|
|
|
|
<para>FreeBSD implements the idea of a generic “VM object”.
|
|
VM objects can be associated with backing store of various
|
|
types—unbacked, swap-backed, physical device-backed, or
|
|
file-backed storage. Since the filesystem uses the same VM objects to
|
|
manage in-core data relating to files, the result is a unified buffer
|
|
cache.</para>
|
|
|
|
<para>VM objects can be <emphasis>shadowed</emphasis>. That is, they
|
|
can be stacked on top of each other. For example, you might have a
|
|
swap-backed VM object stacked on top of a file-backed VM object in
|
|
order to implement a MAP_PRIVATE mmap()ing. This stacking is also
|
|
used to implement various sharing properties, including copy-on-write
|
|
for forked address spaces.</para>
|
|
|
|
<para>It should be noted that a <literal>vm_page_t</literal> can only be
|
|
associated with one VM object at a time. The VM object shadowing
|
|
implements the perceived sharing of the same page across multiple
|
|
instances.</para>
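<para>Shadowing can be pictured as a chain of objects that is searched from the top down. The following C sketch is a simplified model only: <literal>struct obj_sketch</literal>, <literal>obj_lookup()</literal> and <literal>obj_install()</literal> are hypothetical names, and pages are represented by plain pointers rather than <literal>vm_page_t</literal> structures.</para>

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGES 4

struct obj_sketch {
    const int *page[SKETCH_PAGES];   /* NULL means "not resident in this object" */
    struct obj_sketch *backing;      /* the object this one shadows, or NULL */
};

/* Find a page by walking from the top object down through what it shadows. */
static const int *obj_lookup(const struct obj_sketch *obj, size_t idx)
{
    assert(idx < SKETCH_PAGES);
    for (; obj != NULL; obj = obj->backing)
        if (obj->page[idx] != NULL)
            return obj->page[idx];
    return NULL;
}

/*
 * Copy-on-write step: the private copy is installed in the top object
 * only, so the shadowed object keeps its original page.
 */
static void obj_install(struct obj_sketch *top, size_t idx, const int *private_copy)
{
    assert(idx < SKETCH_PAGES);
    top->page[idx] = private_copy;
}
```

<para>Before a write, a lookup through the shadow object falls through to the file-backed object's page. After <literal>obj_install()</literal>, the same lookup finds the private copy, while a lookup that starts at the file-backed object is unaffected.</para>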
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Filesystem I/O—<literal>struct buf</literal></title>
|
|
|
|
<para>vnode-backed VM objects, such as file-backed objects, generally
|
|
need to maintain their own clean/dirty info independent from the VM
|
|
system's idea of clean/dirty. For example, when the VM system decides
|
|
to synchronize a physical page to its backing store, the VM system
|
|
needs to mark the page clean before the page is actually written to
|
|
its backing store. Additionally, filesystems need to be able to map
|
|
portions of a file or file metadata into KVM in order to operate on
|
|
it.</para>
|
|
|
|
<para>The entities used to manage this are known as filesystem buffers,
|
|
<literal>struct buf</literal>'s, or
|
|
<literal>bp</literal>'s. When a filesystem needs to operate on a
|
|
portion of a VM object, it typically maps part of the object into a
|
|
struct buf and then maps the pages in the struct buf into KVM. In the
|
|
same manner, disk I/O is typically issued by mapping portions of
|
|
objects into buffer structures and then issuing the I/O on the buffer
|
|
structures. The underlying <literal>vm_page_t</literal>'s are typically busied for the
|
|
duration of the I/O. Filesystem buffers also have their own notion of
|
|
being busy, which is useful to filesystem driver code which would
|
|
rather operate on filesystem buffers instead of hard VM pages.</para>
|
|
|
|
<para>FreeBSD reserves a limited amount of KVM to hold mappings from
|
|
struct bufs, but it should be made clear that this KVM is used solely
|
|
to hold mappings and does not limit the ability to cache data.
|
|
Physical data caching is strictly a function of
|
|
<literal>vm_page_t</literal>'s, not filesystem buffers. However,
|
|
since filesystem buffers are used as placeholders for I/O, they do inherently
|
|
limit the amount of concurrent I/O possible. As there are usually a
|
|
few thousand filesystem buffers available, this is not usually a
|
|
problem.</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Mapping Page Tables - <literal>vm_map_t</literal>, <literal>vm_entry_t</literal></title>
|
|
|
|
<para>FreeBSD separates the physical page table topology from the VM
|
|
system. All hard per-process page tables can be reconstructed on the
|
|
fly and are usually considered throwaway. Special page tables such as
|
|
those managing KVM are typically permanently preallocated. These page
|
|
tables are not throwaway.</para>
|
|
|
|
<para>FreeBSD associates portions of vm_objects with address ranges in
|
|
virtual memory through <literal>vm_map_t</literal> and
|
|
<literal>vm_entry_t</literal> structures. Page tables are directly
|
|
synthesized from the
|
|
<literal>vm_map_t</literal>/<literal>vm_entry_t</literal>/
|
|
<literal>vm_object_t</literal> hierarchy. Remember when I mentioned
|
|
that physical pages are only directly associated with a
|
|
<literal>vm_object_t</literal>? Well, that is not quite true.
|
|
<literal>vm_page_t</literal>'s are also linked into page tables that
|
|
they are actively associated with. One <literal>vm_page_t</literal>
|
|
can be linked into several <emphasis>pmaps</emphasis>, as page tables
|
|
are called. However, the hierarchical association holds so all
|
|
references to the same page in the same object reference the same
|
|
<literal>vm_page_t</literal> and thus give us buffer cache unification
|
|
across the board.</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>KVM Memory Mapping</title>
|
|
|
|
<para>FreeBSD uses KVM to hold various kernel structures. The single
|
|
largest entity held in KVM is the filesystem buffer cache. That is,
|
|
mappings relating to <literal>struct buf</literal> entities.</para>
|
|
|
|
<para>Unlike Linux, FreeBSD does NOT map all of physical memory into
|
|
KVM. This means that FreeBSD can handle memory configurations up to
|
|
4GB on 32-bit platforms. In fact, if the MMU were capable of it,
|
|
FreeBSD could theoretically handle memory configurations up to 8TB on
|
|
a 32-bit platform. However, since most 32-bit platforms are only
|
|
capable of mapping 4GB of RAM, this is a moot point.</para>
|
|
|
|
<para>KVM is managed through several mechanisms. The main mechanism
|
|
used to manage KVM is the <emphasis>zone allocator</emphasis>. The
|
|
zone allocator takes a chunk of KVM and splits it up into
|
|
constant-sized blocks of memory in order to allocate a specific type
|
|
of structure. You can use <command>vmstat -m</command> to get an
|
|
overview of current KVM utilization broken down by zone.</para>
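<para>The splitting of a chunk into constant-sized blocks can be illustrated with a small fixed-size allocator. This is a minimal sketch of the general technique, assuming a pointer-aligned chunk and an item size that is a multiple of the pointer size. <literal>struct zone_sketch</literal> and the <literal>zone_*()</literal> functions are hypothetical and do not reproduce the kernel's zone allocator interface.</para>

```c
#include <assert.h>
#include <stddef.h>

struct zone_sketch {
    size_t item_size;
    void *free_list;     /* free items are chained through their first word */
    size_t free_count;
};

/* Carve `chunk` into items of `item_size` bytes and put them all on the free list. */
static void zone_init(struct zone_sketch *z, void *chunk, size_t chunk_size,
                      size_t item_size)
{
    unsigned char *base = chunk;
    size_t off;

    assert(item_size >= sizeof(void *) && item_size % sizeof(void *) == 0);
    z->item_size = item_size;
    z->free_list = NULL;
    z->free_count = 0;
    for (off = 0; off + item_size <= chunk_size; off += item_size) {
        *(void **)(base + off) = z->free_list;
        z->free_list = base + off;
        z->free_count++;
    }
}

/* Take one item from the zone, or return NULL when the zone is exhausted. */
static void *zone_alloc(struct zone_sketch *z)
{
    void *item = z->free_list;

    if (item != NULL) {
        z->free_list = *(void **)item;
        z->free_count--;
    }
    return item;
}

/* Return an item to the zone; it becomes the next one handed out. */
static void zone_free(struct zone_sketch *z, void *item)
{
    *(void **)item = z->free_list;
    z->free_list = item;
    z->free_count++;
}
```

<para>Because every item in a zone has the same size, allocation and release are constant-time list operations and the zone cannot fragment internally, which is why this kind of allocator suits frequently allocated kernel structures.</para>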
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Tuning the FreeBSD VM system</title>
|
|
|
|
<para>A concerted effort has been made to make the FreeBSD kernel
|
|
dynamically tune itself. Typically you do not need to mess with
|
|
anything beyond the <literal>maxusers</literal> and
|
|
<literal>NMBCLUSTERS</literal> kernel config options. That is, kernel
|
|
compilation options specified in (typically)
|
|
<filename>/usr/src/sys/i386/conf/<replaceable>CONFIG_FILE</replaceable></filename>.
|
|
A description of all available kernel configuration options can be
|
|
found in <filename>/usr/src/sys/i386/conf/LINT</filename>.</para>
|
|
|
|
<para>In a large system configuration you may wish to increase
|
|
<literal>maxusers</literal>. Values typically range from 10 to 128.
|
|
Note that raising <literal>maxusers</literal> too high can cause the
|
|
system to overflow available KVM resulting in unpredictable operation.
|
|
It is better to leave <literal>maxusers</literal> at some reasonable number and add other
|
|
options, such as <literal>NMBCLUSTERS</literal>, to increase specific
|
|
resources.</para>
|
|
|
|
<para>If your system is going to use the network heavily, you may want
|
|
to increase <literal>NMBCLUSTERS</literal>. Typical values range from
|
|
1024 to 4096.</para>
|
|
|
|
<para>The <literal>NBUF</literal> parameter is also traditionally used
|
|
to scale the system. This parameter determines the amount of KVA the
|
|
system can use to map filesystem buffers for I/O. Note that this
|
|
parameter has nothing whatsoever to do with the unified buffer cache!
|
|
This parameter is dynamically tuned in 3.0-CURRENT and later kernels
|
|
and should generally not be adjusted manually. We recommend that you
|
|
<emphasis>not</emphasis> try to specify an <literal>NBUF</literal>
|
|
parameter. Let the system pick it. Too small a value can result in
|
|
extremely inefficient filesystem operation while too large a value can
|
|
starve the page queues by causing too many pages to become wired
|
|
down.</para>
|
|
|
|
<para>By default, FreeBSD kernels are not optimized. You can set
|
|
debugging and optimization flags with the
|
|
<literal>makeoptions</literal> directive in the kernel configuration.
|
|
Note that you should not use <option>-g</option> unless you can
|
|
accommodate the large (typically 7 MB+) kernels that result.</para>
|
|
|
|
<programlisting>makeoptions DEBUG="-g"
|
|
makeoptions COPTFLAGS="-O2 -pipe"</programlisting>
|
|
|
|
<para>Sysctl provides a way to tune kernel parameters at run-time. You
|
|
typically do not need to mess with any of the sysctl variables,
|
|
especially the VM related ones.</para>
|
|
|
|
<para>Run-time VM and system tuning is relatively straightforward.
|
|
First, use softupdates on your UFS/FFS filesystems whenever possible.
|
|
<filename>/usr/src/contrib/sys/softupdates/README</filename> contains
|
|
instructions (and restrictions) on how to configure it.</para>
|
|
|
|
<para>Second, configure sufficient swap. You should have a swap
|
|
partition configured on each physical disk, up to four, even on your
|
|
&ldquo;work&rdquo; disks. You should have at least twice as much swap
|
|
space as main memory, and possibly even more if you do not have a
|
|
lot of memory. You should also size your swap partition based on the
|
|
maximum memory configuration you ever intend to put on the machine so
|
|
you do not have to repartition your disks later on. If you want to be
|
|
able to accommodate a crash dump, your first swap partition must be at
|
|
least as large as main memory and <filename>/var/crash</filename> must
|
|
have sufficient free space to hold the dump.</para>
|
|
|
|
<para>NFS-based swap is perfectly acceptable on FreeBSD 4.x or later systems,
|
|
but you must be aware that the NFS server will take the brunt of the
|
|
paging load.</para>
|
|
</sect2>
|
|
</sect1>
|
|
</chapter>
|
|
|
|
<!--
|
|
Local Variables:
|
|
mode: sgml
|
|
sgml-declaration: "../chapter.decl"
|
|
sgml-indent-data: t
|
|
sgml-omittag: nil
|
|
sgml-always-quote-attributes: t
|
|
sgml-parent-document: ("../handbook.sgml" "part" "chapter")
|
|
End:
|
|
-->
|
|
|