doc/en/handbook/README
Nik Clayton a497ab6892 After an extended absence (quit job, set up company, get girlfriend, get
contract, start contract, work too many hours per day) I'm back working
on the DocBook conversion :-)

Create two new entities, prompt.root and prompt.user. Use these where the
user is shown an OS prompt, to indicate whether they should be a normal
user or do it as root.

Everything else that looks like a prompt (e.g., C:\> which occurs here
and there) is also marked up as <prompt>.
1998-10-01 06:11:44 +00:00

This will be the location of the DocBook version of the FreeBSD Handbook,
which will eventually obsolete the version currently in doc/handbook/.
Interested parties should examine
<URL:http://www.nothing-going-on.demon.co.uk/FreeBSD/docbook-migration.html>
and get in touch with Nik Clayton (either to nik@FreeBSD.ORG or via the
FreeBSD-doc mailing list) if they have specific questions.
All the scripts mentioned here can also be downloaded by going to
<URL:http://www.freebsd.org/~nik/script_name>
for example,
<URL:http://www.freebsd.org/~nik/entity-cdata.pl>
------------------------------------------------------------------------
The Handbook is midway through the conversion process. It will almost
certainly not convert to other formats cleanly.
------------------------------------------------------------------------
Actions
This list explains what's been done so far, so the Japanese team can
track my changes. All actions took place on freefall.
1. Initial conversion to DocBook
Checked out a copy of the doc repository to ~/cvs/. Then used 2 scripts
to convert the handbook to its initial DocBook format. The 2 scripts are
2docbook.sh and entity-cdata.pl, both of which can be found in ~nik/bin/.
2docbook.sh calls entity-cdata.pl as necessary.
% cd ~/cvs/doc/handbook
% 2docbook.sh
This created handbook-db.sgml in ~/cvs/doc/handbook. This file contains
syntactically valid (but quite ugly) SGML. This file was then moved to
the doc/en/handbook directory and renamed to handbook.sgml. The
conversion process left a few spurious changes in the old handbook files
which I don't want to commit, so I removed them and updated the
repository.
The new file was then committed.
% mv handbook-db.sgml ~/cvs/doc/en/handbook/handbook.sgml
% rm *.sgml
% cvs update
% cd ~/cvs/doc/en/handbook
% cvs add handbook.sgml
% cvs commit
2. handbook.sgml was loaded into XEmacs 20.30 (straight from the ports
collection) and sgml-mode was turned on. My .emacs file contains the
following hook:
(add-hook 'sgml-mode-hook
          (function
           (lambda ()
             (setq sgml-omittag nil)
             (setq sgml-indent-data t))))
This configures psgml to not omit any tags that the DTD lists as
omittable, and to indent data in the same way that markup is indented.
The following function was pasted into the *scratch* buffer, and then
"M-x eval-current-buffer" was run.
(defun sgml-indent-buffer ()
  "Indents the current buffer, one line at a time"
  (interactive "*")
  (save-excursion
    (goto-char (point-min))
    (while (= (forward-line 1) 0)
      (sgml-indent-or-tab))))
In the handbook.sgml buffer, the point was placed on the first character
of the first line, and "M-x sgml-indent-buffer" was run.
The changes were then committed.
3. Refilled the Handbook -- this rewraps the lines as necessary. This was
done by placing the point on the first <book> tag, and running
"M-x sgml-fill-element".
This takes about 10 minutes to run.
It also reformats some sections that should not be reformatted, including
examples of text on the screen, PGP key blocks and so on. They will be
fixed in a later commit.
4. Removed spurious markup. The conversion process has left a lot of
<para></para>
entries in the handbook, and they need to be removed. There are a
number of places this happens, and the rules are slightly different
each time.
For example,
====================================================================
Original markup                     Changed markup
--------------------------------------------------------------
<listitem>                          <listitem>
  <para></para>                       <para>A real paragraph</para>
  <para>A real paragraph</para>
--------------------------------------------------------------
  <para>A real paragraph</para>       <para>A real paragraph</para>
  <para></para>                     </listitem>
</listitem>
--------------------------------------------------------------
<para>A real paragraph</para>       <para>A real paragraph</para>
<para></para>                       <para>Another paragraph</para>
<para>Another paragraph</para>
====================================================================
Notice the last example. It's not enough to simply put together a
regexp that matches (all whitespace)<para></para>(all whitespace)
and removes it, since that would leave you with
<para>A real paragraph</para><para>Another paragraph</para>
In the end I got bored of trying to write this using Emacs regexps,
and knocked together a quick Perl script to do it. It's ~nik/bin/para.pl
on freefall.
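
The idea can be sketched in Python. This is not para.pl itself, only a
minimal illustration under one assumption: each empty <para></para> sits on
a line of its own, so deleting that whole line (and nothing else) keeps the
neighbouring paragraphs on separate lines. The name strip_empty_paras is
hypothetical.

```python
import re

# An empty <para></para> plus the rest of its own line. Removing exactly one
# line avoids gluing the previous and next elements together.
EMPTY_PARA = re.compile(r"^[ \t]*<para>\s*</para>[ \t]*\n", re.MULTILINE)


def strip_empty_paras(sgml: str) -> str:
    """Delete lines that hold nothing but an empty <para></para> element."""
    return EMPTY_PARA.sub("", sgml)


sample = (
    "<para>A real paragraph</para>\n"
    "<para></para>\n"
    "<para>Another paragraph</para>\n"
)
print(strip_empty_paras(sample), end="")
```

Because only the line holding the empty element is removed, the two real
paragraphs stay on separate lines.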
5. Got halfway through looking for filenames, and marking them up as such.
There are a lot ( :-( ) of filenames in the Handbook. The conversion
process did a pretty good job of marking them as <filename>...</filename>
but it wasn't perfect.
I'm halfway through (line 16704) going through the Handbook, eyeballing
each line and changing things like <emphasis remap="tt">...</emphasis>
to <filename>...</filename> where appropriate.
The remainder will follow tomorrow evening.
6. Finished the first sweep marking up filenames.
If it looked like a filename (but wasn't a command for the user to type
in) it's been marked up with <filename>...</filename>.
If it had already been marked up as <filename>...</filename> but wasn't
a filename, the markup was changed to <emphasis remap="tt">...</emphasis>
PSGML and XEmacs are very useful for this, using "C-c =" to change
existing markup.
Synchronising with changes 5 and 6 will involve examining the diffs
and changing by hand, I'm afraid. It could not be automated.
7. Start replacing `` and '' with <quote> and </quote>. Don't change
things indiscriminately, but look at the context to see if the change is
appropriate. There are still many `` and '' occurrences which should be
changed to some other element.
This was done using a regexp search/replace, looking for the regexp
``\([^']+\)''
and replacing with
<quote>\1</quote>
Not all the `` '' pairs were changed, since in some cases they delimit
filenames, options and so on.
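
For readers without Emacs to hand, a rough Python equivalent of the
search is sketched below. It assumes the pattern is meant to match one or
more characters between the quotes, and it only lists candidates with a
proposed replacement, since each one still has to be judged in context.
The name propose_quotes is hypothetical.

```python
import re

# Python translation of the Emacs regexp; "+" is an assumption so that
# quoted text longer than one character is found.
QUOTED = re.compile(r"``([^']+)''")


def propose_quotes(text: str):
    """Yield (line_number, original, proposed) for every `` '' pair found."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in QUOTED.finditer(line):
            yield lineno, match.group(0), f"<quote>{match.group(1)}</quote>"


for candidate in propose_quotes("Track the ``stable'' branch.\nNothing here."):
    print(candidate)
```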
8. As with change 7, but replace with <command> ... </command> as
necessary.
9. Remove the `` and '' from options.
``<option>...</option>'' becomes <option>...</option>
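
A small Python sketch of the same change, assuming the quotes wrap exactly
one <option> element with no nested markup. The name unquote_options is
hypothetical.

```python
import re

# `` and '' wrapped directly around a single <option> element.
QUOTED_OPTION = re.compile(r"``(<option>[^<]*</option>)''")


def unquote_options(sgml: str) -> str:
    """Drop the `` and '' surrounding an <option> element."""
    return QUOTED_OPTION.sub(r"\1", sgml)


print(unquote_options("Use ``<option>-v</option>'' for verbose output."))
```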
10. Converted appropriate occurrences of
<emphasis remap=tt>...</emphasis>
to
<filename>...</filename>
11. As above, but changing to <command>...</command>. Modified the Emacs
regexp slightly to search for
<emphasis[ \n\t]+remap=tt>\([^<]+\)</emphasis>
which matches elements spread over two lines.
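
The same pattern carries over to Python almost unchanged. The sketch below
is an illustration only: it retags every match, whereas the steps above
were applied selectively, and the helper name retag is hypothetical.

```python
import re

# Whitespace between "emphasis" and "remap" may include a newline, and
# [^<]+ also matches newlines, so elements split over two lines are found.
EMPHASIS_TT = re.compile(r"<emphasis[ \n\t]+remap=tt>([^<]+)</emphasis>")


def retag(sgml: str, element: str) -> str:
    """Rewrite <emphasis remap=tt>...</emphasis> as <element>...</element>."""
    return EMPHASIS_TT.sub(rf"<{element}>\1</{element}>", sgml)


print(retag("Run <emphasis\n  remap=tt>ls -l</emphasis> now.", "command"))
```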
12. Looked for explanatory notes in the text (typically prefixed by "note",
"Note:" or "<para><blockquote><para><emphasis role=bf>Note:</emphasis>")
and marked them up as 'note' elements.
This change involves markup changes *and* text changes. This is because
text like
<para>Note: The foo file is only used once, and can be deleted.</para>
became
<note>
<para>The foo file is only used once, and can be deleted.</para>
</note>
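
The simplest of these cases can be sketched in Python. The sketch handles
only paragraphs that open with a literal "Note:" (or "note:"); the
<blockquote> and <emphasis role=bf> variants are left out, and the name
to_note is hypothetical.

```python
import re

# A paragraph whose text starts with "Note:"; the prefix itself is dropped,
# which is the text change mentioned above.
NOTE_PARA = re.compile(r"<para>\s*[Nn]ote:\s*(.*?)</para>", re.DOTALL)


def to_note(sgml: str) -> str:
    """Wrap 'Note:' paragraphs in <note> and strip the textual prefix."""
    return NOTE_PARA.sub(r"<note>\n<para>\1</para>\n</note>", sgml)


print(to_note("<para>Note: The foo file is only used once, and can be deleted.</para>"))
```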
13. Look for text marked up as an acronym and alter as necessary. The
automatic conversion tended to mark any string of upper case letters
as acronyms, which is not always right.
The difference between an <acronym> and <abbrev> is subtle -- in a
nutshell, an acronym is pronounceable, an abbreviation isn't.
14. Another sweep for "`" and "``" (and their closing equivalents),
replacing them with the right markup (since most of the time they're
used to 'delimit' filenames or options from the surrounding text).
The only quotes left now are either around items for which I'm not 100%
sure which element to use, or in literal blocks as part of commands the
user types in.
15. Look for double quotes not used in attributes and alter to the
appropriate markup (or remove as necessary). A useful Emacs regexp
when doing the search replace is
\([^=]\)"\([^ \t\n]+[^"]+\)"\([^>]\)
and replace with
\1<quote>\2</quote>\3
or whatever the replacement element is.
Converted '"' into <quote>, <literal>, <command>, <application>,
<filename>, <emphasis>, <option> or removed it as necessary.
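
A Python rendering of that regexp, as a sketch only. It always substitutes
<quote> unless told otherwise, while the real sweep chose the element case
by case; the name requote and its element parameter are hypothetical.

```python
import re

# Group 1: any character other than "=", so attribute values are skipped.
# Group 2: the quoted text. Group 3: any character other than ">".
DOUBLE_QUOTED = re.compile(r'([^=])"([^ \t\n]+[^"]+)"([^>])')


def requote(text: str, element: str = "quote") -> str:
    """Replace "..." outside attributes with <element>...</element>."""
    return DOUBLE_QUOTED.sub(rf"\1<{element}>\2</{element}>\3", text)


print(requote('See the "quick start" guide.'))
```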
16. A general cleanup to get it to validate. The original conversion
process left some <sect?>'s with just a title, which is invalid;
they must contain a <para> or similar element.
Also fixed a couple of typos in the tags. The document should now
validate, save for the undefined external entities.
17. Created a new FreeBSD Doc. Project DTD in the ../../sgml directory.
Changed the declaration at the top of the handbook to use this new
DTD.
18. Yet more things that should be filenames marked up as such.
19. Use the new <hostid> element to mark up hostnames, IP addresses and
such. The markup choice is as follows.
<hostid>...</hostid> is a simple hostname.
<hostid role="ipaddr">...</hostid> is an IP address.
<hostid role="domainname">...</hostid> is a domain name.
<hostid role="fqdn">...</hostid> is a fully qualified domain name.
<hostid role="netmask">...</hostid> is a netmask.
<hostid role="mac">...</hostid> is a network card MAC address.
These might migrate to being separate elements in the future. However,
if they do then changing the markup can be done automatically.
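
As an illustration of why such a migration could be automated, the Python
sketch below rewrites each role into an element of the same name. Those
element names (<ipaddr>, <fqdn> and so on) are hypothetical; no such
elements are defined in the DTD described here.

```python
import re

# A <hostid> carrying a role attribute; a plain <hostid> is left alone.
HOSTID_ROLE = re.compile(r'<hostid role="([a-z]+)">(.*?)</hostid>', re.DOTALL)


def split_hostid(sgml: str) -> str:
    """Turn <hostid role="X">...</hostid> into a hypothetical <X>...</X>."""
    return HOSTID_ROLE.sub(r"<\1>\2</\1>", sgml)


print(split_hostid('<hostid role="ipaddr">10.0.0.1</hostid>'))
```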
20. Convert <emphasis remap=it>...</emphasis> to plain <emphasis> in some
cases. I'm pretty certain that all the <emphasis>...</emphasis>
markup is correct now, which makes searching for markup that does
need changing much easier.
21. Replace the last few occurrences of curly quoted items (`` and '')
with the right markup.
22. Almost the last lot. I missed a diff I'd done at home. There's a
section in the handbook that talks about kernel options, where the
quoted options are quoted with `` and ''. Fix them so that standard
double quotes are used (so they can be cut-n-pasted).
23. Start working on <emphasis remap=bf>...</emphasis>
Convert the first lot to <command>...</command>
24. Fixed manual page references to use the right markup, which is
<citerefentry>
<refentrytitle>page_name</refentrytitle>
<manvolnum>number</manvolnum>
</citerefentry>
Did this with a regexp search for
\([a-z-_\.]+\)(\([1-9]\))
and replacing with
<citerefentry><refentrytitle>\1</refentrytitle><manvolnum>\2</manvolnum></citerefentry>
Since most of the page references had <command>, <emphasis>, or
<ulink> elements wrapped around them, you then have to sweep through the
file looking for "><cite" and using C-c C-k to kill the markup
immediately before and after.
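
The search and replace can be approximated in Python as follows. This is a
sketch: the character class is a close translation of the Emacs one rather
than an exact copy, the replacement closes the <citerefentry> element as in
the markup shown above, and the name cite_manpages is hypothetical. The
wrapping <command>, <emphasis> or <ulink> elements still need removing by
hand afterwards.

```python
import re

# A page name of lower-case letters, "_", "." or "-", followed by a
# single-digit section number in parentheses, e.g. ls(1).
MANPAGE = re.compile(r"([a-z_.-]+)\(([1-9])\)")

CITEREF = (
    r"<citerefentry><refentrytitle>\1</refentrytitle>"
    r"<manvolnum>\2</manvolnum></citerefentry>"
)


def cite_manpages(text: str) -> str:
    """Mark up name(section) references as <citerefentry> elements."""
    return MANPAGE.sub(CITEREF, text)


print(cite_manpages("See rc.conf(5) for details."))
```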
25. <emphasis remap=..>...</emphasis> -> <literal>...</literal>
26. <emphasis remap=..>...</emphasis> -> <makevar>...</makevar>
27. <emphasis remap=..>...</emphasis> -> <maketarget>...</maketarget>
28. Fix up some uses of <screen> and the use of <emphasis> elements within
and near it. Most of the time this consisted of replacing the <emphasis>
with <replaceable> or <userinput>.
29. Fixed up more references to manpages that used <ulink> to use
<citerefentry>. These were missed at step 24 because they didn't
include a section number. No references to man.cgi now exist in
handbook.sgml.
30. Create two entities, prompt.root and prompt.user. Use these anywhere
the OS prompt is displayed, depending on whether the user should be
a normal user or root.
Also mark up other prompts (e.g., the DOS prompt C:\> that occurs in
some places) as <prompt>s.
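
A Python sketch of how example lines might be switched over to the
entities. It assumes, purely for illustration, that "#" at the start of a
line is the root prompt and "%" the normal user prompt; the name
mark_prompt is hypothetical.

```python
# Assumed mapping from a literal prompt character to its entity reference.
PROMPT_ENTITIES = {"#": "&prompt.root;", "%": "&prompt.user;"}


def mark_prompt(line: str) -> str:
    """Replace a leading '# ' or '% ' with the matching entity reference."""
    if len(line) >= 2 and line[0] in PROMPT_ENTITIES and line[1] == " ":
        return PROMPT_ENTITIES[line[0]] + line[1:]
    return line


print(mark_prompt("% cvs update"))
print(mark_prompt("# make install"))
```

Lines that merely start with one of these characters, such as a C
"#define", are left untouched because no space follows the character.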