This will be the location of the DocBook version of the FreeBSD Handbook,
which will eventually obsolete the version currently in doc/handbook/.

Interested parties should examine

    <URL:http://www.nothing-going-on.demon.co.uk/FreeBSD/docbook-migration.html>

and get in touch with Nik Clayton (either at nik@FreeBSD.ORG or via the
FreeBSD-doc mailing list) if they have specific questions.

All the scripts mentioned here can also be downloaded by going to

    <URL:http://www.freebsd.org/~nik/script_name>

for example,

    <URL:http://www.freebsd.org/~nik/entity-cdata.pl>

------------------------------------------------------------------------
The Handbook is midway through the conversion process. It will almost
certainly not convert to other formats cleanly.
------------------------------------------------------------------------

Actions

This list explains what's been done so far, so the Japanese team can
track my changes. All actions took place on freefall.

1. Initial conversion to DocBook

Checked out a copy of the doc repository to ~/cvs/. Then used two scripts
to convert the Handbook to its initial DocBook format. The two scripts are
2docbook.sh and entity-cdata.pl, both of which can be found in ~nik/bin/.

2docbook.sh calls entity-cdata.pl as necessary.

    % cd ~/cvs/doc/handbook
    % 2docbook.sh

This created handbook-db.sgml in ~/cvs/doc/handbook. This file contains
syntactically valid (but quite ugly) SGML. It was then moved to the
doc/en/handbook directory and renamed to handbook.sgml. The conversion
process left a few spurious changes in the old handbook files which I
didn't want to commit, so I removed the modified files and ran "cvs update"
to restore them from the repository.

The new file was then committed.

    % mv handbook-db.sgml ~/cvs/doc/en/handbook/handbook.sgml
    % rm *.sgml
    % cvs update
    % cd ~/cvs/doc/en/handbook
    % cvs add handbook.sgml
    % cvs commit

2. handbook.sgml was loaded into XEmacs 20.30 (straight from the ports
collection) and sgml-mode was turned on. My .emacs file contains the
following hook:

    (add-hook 'sgml-mode-hook
              (function
               (lambda ()
                 ;; Never omit tags, even where the DTD allows it.
                 (setq sgml-omittag nil)
                 ;; Indent character data the same way as markup.
                 (setq sgml-indent-data t))))

This configures psgml to not omit any tags that the DTD lists as
omittable, and to indent data in the same way that markup is indented.

The following function was pasted into the *scratch* buffer, and then
"M-x eval-current-buffer" was run.

    (defun sgml-indent-buffer ()
      "Indent the current buffer, one line at a time."
      (interactive "*")
      (save-excursion
        (goto-char (point-min))
        ;; forward-line returns non-zero once the end of the buffer
        ;; is reached, which ends the loop.
        (while (= (forward-line 1) 0)
          (sgml-indent-or-tab))))

In the handbook.sgml buffer, the point was placed on the first character
of the first line, and "M-x sgml-indent-buffer" was run.

The changes were then committed.

3. Refilled the Handbook -- this rewraps the lines as necessary. This was
done by placing the point on the first <book> tag, and running
"M-x sgml-fill-element".

This takes about 10 minutes to run.

It also reformats some sections that should not be reformatted, including
examples of text on the screen, PGP key blocks and so on. They will be
fixed in a later commit.

4. Removed spurious markup. The conversion process has left a lot of

    <para></para>

entries in the Handbook, and they need to be removed. There are a
number of places this happens, and the rules are slightly different
each time.

For example,

    ====================================================================
    Original markup                    Changed markup
    --------------------------------------------------------------------

    <listitem>                         <listitem>
    <para></para>                      <para>A real paragraph</para>

    <para>A real paragraph</para>

    --------------------------------------------------------------------

    <para>A real paragraph</para>      <para>A real paragraph</para>
                                       </listitem>
    <para></para>

    </listitem>

    --------------------------------------------------------------------

    <para>A real paragraph</para>      <para>A real paragraph</para>

    <para></para>                      <para>Another paragraph</para>

    <para>Another paragraph</para>
    ====================================================================

Notice the last example. It's not enough to simply put together a
regexp that matches (all whitespace)<para></para>(all whitespace)
and removes it, since that would leave you with

    <para>A real paragraph</para>
    <para>Another paragraph</para>

In the end I got bored of trying to write this using Emacs regexps,
and knocked together a quick Perl script to do it. It's ~nik/bin/para.pl
on freefall.

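The three cases in the table for change 4 can be sketched in Python. This
is not para.pl, which is not reproduced here; the function name
strip_empty_paras and the exact blank-line rules are assumptions read off
the three examples, so treat it as an illustration only.

```python
import re

# A line holding nothing but an empty paragraph.
EMPTY_PARA = re.compile(r"^\s*<para>\s*</para>\s*$")


def strip_empty_paras(text):
    """Drop empty <para></para> lines, then tidy the blank lines left over."""
    kept = [ln for ln in text.split("\n") if not EMPTY_PARA.match(ln)]
    out = []
    for i, ln in enumerate(kept):
        if ln.strip() == "":
            prev = out[-1].strip() if out else ""
            nxt = next((k.strip() for k in kept[i + 1:] if k.strip()), "")
            # Skip blank lines that are leading or repeated, that directly
            # follow <listitem>, or that lead up to </listitem>.
            if prev == "" or prev == "<listitem>" or nxt == "</listitem>":
                continue
        out.append(ln)
    return "\n".join(out)
```

Two ordinary paragraphs keep exactly one blank line between them, which is
the case a single whitespace-swallowing regexp gets wrong.
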
5. Got halfway through looking for filenames, and marking them up as such.

There are a lot ( :-( ) of filenames in the Handbook. The conversion
process did a pretty good job of marking them as <filename>...</filename>
but it wasn't perfect.

I'm halfway through (line 16704) going through the Handbook, eyeballing
each line and changing things like <emphasis remap="tt">...</emphasis>
to <filename>...</filename> where appropriate.

The remainder will follow tomorrow evening.

6. Finished the first sweep marking up filenames.

If it looked like a filename (but wasn't a command for the user to type
in) it's been marked up with <filename>...</filename>.

If it had already been marked up as <filename>...</filename> but wasn't
a filename, the markup was changed to <emphasis remap="tt">...</emphasis>.

PSGML and XEmacs are very useful for this, using "C-c =" to change
existing markup.

Synchronising with changes 5 and 6 will involve examining the diffs
and making the changes by hand, I'm afraid. It could not be automated.

7. Start replacing `` and '' with <quote> and </quote>. Don't change
things indiscriminately, but look at the context to see if the change is
appropriate. There are still many `` and '' occurrences which should be
changed to some other element.

This was done using a regexp search/replace, looking for the regexp

    ``\([^']+\)''

and replacing with

    <quote>\1</quote>

Not all the `` '' pairs were changed, since in some cases they delimit
filenames, options and so on.

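For anyone repeating change 7 outside Emacs, here is a rough Python
equivalent of the search. It assumes the group is meant to match one or
more characters between the quotes; the names QUOTED, candidates and
to_quote are made up for this sketch.

```python
import re

# ``text'' with no apostrophe inside the quoted text.
QUOTED = re.compile(r"``([^']+)''")


def candidates(text):
    """Return every ``...'' span so each one can be checked in context."""
    return [m.group(0) for m in QUOTED.finditer(text)]


def to_quote(text):
    """Replace every ``...'' span with <quote>...</quote>."""
    return QUOTED.sub(r"<quote>\1</quote>", text)
```

to_quote changes every match, so it only suits text where candidates has
already been reviewed; pairs that delimit filenames or options need a
different element.
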
8. As with change 7, but replace with <command> ... </command> as
necessary.

9. Remove the `` and '' from options.

    ``<option>...</option>'' becomes <option>...</option>

10. Converted appropriate occurrences of

    <emphasis remap=tt>...</emphasis>

to

    <filename>...</filename>

11. As above, but changing to <command>...</command>. Modified the Emacs
regexp slightly to search for

    <emphasis[ \n\t]+remap=tt>\([^<]+\)</emphasis>

which matches elements spread over two lines.

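The regexp from change 11 carries over to Python almost unchanged. The
sketch below is only an illustration; the helper name emphasis_to and its
"element" parameter are invented here, and in practice each match was
still judged by eye before being changed.

```python
import re

# Same pattern as the Emacs regexp; the whitespace class lets the tag
# name and its attribute sit on different lines.
EMPHASIS_TT = re.compile(r"<emphasis[ \n\t]+remap=tt>([^<]+)</emphasis>")


def emphasis_to(element, text):
    """Rewrite <emphasis remap=tt>X</emphasis> as <element>X</element>."""
    def swap(match):
        return "<%s>%s</%s>" % (element, match.group(1), element)
    return EMPHASIS_TT.sub(swap, text)
```

Because the group is [^<]+, an emphasis element that contains further
markup is left alone.
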
12. Looked for explanatory notes in the text (typically prefixed by "note",
"Note:" or "<para><blockquote><para><emphasis role=bf>Note:</emphasis>")
and marked them up as <note> elements.

This change involves markup changes *and* text changes. This is because
text like

    <para>Note: The foo file is only used once, and can be deleted.</para>

became

    <note>
      <para>The foo file is only used once, and can be deleted.</para>
    </note>

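The simplest case in change 12, a paragraph that starts with "Note:", can
be sketched as below. This is an assumption about how it could be
automated, not how it was done; the names NOTE_PARA and para_to_note are
hypothetical, and the other prefixes listed above are not handled.

```python
import re

# A paragraph whose text starts with "Note:"; DOTALL lets it span lines.
NOTE_PARA = re.compile(r"<para>\s*Note:\s*(.*?)</para>", re.DOTALL)


def para_to_note(text):
    """Wrap "Note:" paragraphs in <note> and drop the textual prefix."""
    return NOTE_PARA.sub(r"<note>\n  <para>\1</para>\n</note>", text)
```

Dropping the "Note:" prefix is the text change mentioned above.
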
13. Look for text marked up as an acronym and alter as necessary. The
automatic conversion tended to mark any string of upper case letters
as an acronym, which is not always right.

The difference between an <acronym> and an <abbrev> is subtle -- in a
nutshell, an acronym is pronounceable, an abbreviation isn't.

14. Another sweep for "`" and "``" (and their closing equivalents),
replacing them with the right markup (since most of the time they're
used to delimit filenames or options from the surrounding text).

The only quotes left now are either around items for which I'm not 100%
sure which element to use, or in literal blocks as part of commands the
user types in.

15. Look for double quotes not used in attributes and alter to the
appropriate markup (or remove as necessary). A useful Emacs regexp
when doing the search and replace is

    \([^=]\)"\([^ \t\n]+[^"]+\)"\([^>]\)

and replace with

    \1<quote>\2</quote>\3

or whatever the replacement element is.

Converted '"' into <quote>, <literal>, <command>, <application>,
<filename>, <emphasis>, <option> or removed it as necessary.

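The regexp in change 15 translates to Python as shown below. This is a
sketch only: DQUOTED and requote are names invented here, and the
"element" parameter stands in for "whatever the replacement element is".

```python
import re

# Group 1: a character before the quote that is not "=" (so attribute
# values are skipped). Group 2: the quoted text. Group 3: a character
# after the closing quote that is not ">".
DQUOTED = re.compile(r'([^=])"([^ \t\n]+[^"]+)"([^>])')


def requote(text, element="quote"):
    """Replace "..." outside attributes with <element>...</element>."""
    def swap(m):
        return "%s<%s>%s</%s>%s" % (m.group(1), element, m.group(2),
                                    element, m.group(3))
    return DQUOTED.sub(swap, text)
```

The pattern needs a character on each side of the quoted text, so a quote
at the very start or end of the input is not matched.
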
16. A general cleanup to get it to validate. The original conversion
process left some <sect?> elements with just a title, which is invalid;
they must contain a <para> or similar element.

Also fixed a couple of typos in the tags. The document should now
validate, save for the undefined external entities.

17. Created a new FreeBSD Doc. Project DTD in the ../../sgml directory.
Changed the declaration at the top of the Handbook to use this new
DTD.

18. Yet more things that should be filenames marked up as such.

19. Use the new <hostid> element to mark up hostnames, IP addresses and
such. The markup choice is as follows.

    <hostid>...</hostid> is a simple hostname.
    <hostid role="ipaddr">...</hostid> is an IP address.
    <hostid role="domainname">...</hostid> is a domain name.
    <hostid role="fqdn">...</hostid> is a fully qualified domain name.
    <hostid role="netmask">...</hostid> is a netmask.
    <hostid role="mac">...</hostid> is a network card MAC address.

These might migrate to being separate elements in the future. However,
if they do then changing the markup can be done automatically.

20. Convert <emphasis remap=it>...</emphasis> to plain <emphasis> in some
cases. I'm pretty certain that all the <emphasis>...</emphasis>
markup is correct now, which makes searching for markup that does
need changing much easier.

21. Replace the last few occurrences of curly quoted items (`` and '')
with the right markup.

22. Almost the last lot. I missed a diff I'd done at home. There's a
section in the Handbook that talks about kernel options, where the
quoted options are quoted with `` and ''. Fix them so that standard
double quotes are used (so they can be cut-n-pasted).

23. Start working on <emphasis remap=bf>...</emphasis>

Convert the first lot to <command>...</command>

24. Fixed manual page references to use the right markup, which is

    <citerefentry>
      <refentrytitle>page_name</refentrytitle>
      <manvolnum>number</manvolnum>
    </citerefentry>

Did this with a regexp search for

    \([a-z-_\.]+\)(\([1-9]\))

and replacing with

    <citerefentry><refentrytitle>\1</refentrytitle><manvolnum>\2</manvolnum></citerefentry>

Since most of the page references had <command>, <emphasis>, or
<ulink> elements wrapped around them, you then have to sweep through the
file looking for "><cite" and using C-c C-k to kill the markup
immediately before and after.

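A Python rendering of the search and replace in change 24 is sketched
below. The character class is written as [a-z_.-], which is an assumption
about what the Emacs class was meant to cover; MAN_REF, CITE and
mark_manpages are names made up for this example.

```python
import re

# A page name made of lower case letters, "_", "." or "-", followed by
# a single-digit section number in parentheses, e.g. ls(1).
MAN_REF = re.compile(r"([a-z_.-]+)\(([1-9])\)")

CITE = (r"<citerefentry><refentrytitle>\1</refentrytitle>"
        r"<manvolnum>\2</manvolnum></citerefentry>")


def mark_manpages(text):
    """Wrap name(N) manual page references in <citerefentry> markup."""
    return MAN_REF.sub(CITE, text)
```

References that were already wrapped in another element end up containing
"><cite", which is the string the follow-up sweep searches for.
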
25. <emphasis remap=..>...</emphasis> -> <literal>...</literal>

26. <emphasis remap=..>...</emphasis> -> <makevar>...</makevar>

27. <emphasis remap=..>...</emphasis> -> <maketarget>...</maketarget>

28. Fix up some uses of <screen> and the use of <emphasis> elements within
and near it. Most of the time this consisted of replacing the <emphasis>
with <replaceable> or <userinput>.

29. Fixed up more references to manpages that used <ulink> to use
<citerefentry>. These were missed in step 24 because they didn't
include a section number. No references to man.cgi now exist in
handbook.sgml.

30. Create two entities, prompt.root and prompt.user. Use these anywhere
the OS prompt is displayed, depending on whether the user should be
a normal user or root.

Also mark up other prompts (e.g., the DOS prompt C:\> that occurs in
some places) as <prompt>s.