* Fixup use of <symbol> with more appropriate element
* Fixup wrong occurrence of $Id$
* Fixup references to 'make' variables, and trim off the surrounding
${...}; it can be added back by the stylesheet at presentation time.
* More insertions or deletions of <para>...</para> as appropriate.
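
A minimal sketch of the 'make' variable fixup above, in Python rather than
as an Emacs search/replace. The element name is passed in as a parameter
because this log does not say which element marks up 'make' variables;
the 'envar' used in the usage note is purely a placeholder, and the
function name is hypothetical.

```python
import re

def strip_make_braces(sgml, element):
    """Turn <element>${NAME}</element> into <element>NAME</element>.

    `element` is whichever element marks up 'make' variables; the
    caller supplies it, since the commit log does not name it.
    """
    tag = re.escape(element)
    # Match the element only when its whole content is a single ${...}.
    pattern = "<" + tag + r">\$\{([^}]+)\}</" + tag + ">"
    return re.sub(
        pattern,
        lambda m: "<%s>%s</%s>" % (element, m.group(1), element),
        sgml,
    )
```

For example, strip_make_braces('<envar>${PREFIX}</envar>', 'envar') leaves
the element in place and drops only the ${...} wrapper.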
And with this commit, ladies and gentlemen, we're almost there as far as
the DocBook conversion goes. I still need to:
- Split the big handbook.sgml into its constituent files and directories.
- Sort out the files that will contain entities, and put in the correct
SGML to use them.
- Merge in the changes that have happened to doc/handbook over the past
7 or so months.
- Build the Makefile framework and supporting apps to do .txt, .ps, .rtf
and .pdf conversions.
But the mind-numbingly tedious stuff is over. Of course, there's
always more to do (like the whole bibliography section should be marked
up as a bibliography) and I'm putting together the "This is how the
handbook should be marked up" document as well. Oh, and organising my
notes on how the Handbook could be re-arranged. But apart from that,
it's done :-)
id="bar">
...
changed to
<foo id="bar">
...
Before people complain that "Hang on, now you can't find out what the
allocated ID values are with a simple 'grep'" I'll say that's not a
problem. I plan to introduce a target in the Makefile (probably
something like 'handbook.id') which will automatically generate this
list by doing a proper SGML parse.
to <email>.
Can't do this globally. Some of the links are odd: the link text
is not their e-mail address but their name, e.g.,
<ulink url="mailto:nik@freebsd.org">Nik Clayton</ulink>
which would turn to
<email>Nik Clayton</email>
which isn't very useful. Ignore these ones, and do the others.
(i.e., the ones that look like
<ulink url="mailto:nik@freebsd.org">nik@freebsd.org</ulink>
)
This Emacs regexp does the job.
Search for: <ulink\s-+url="mailto[^>]+>\([^<]+\)</ulink>
Replace with: <email>\1</email>
Step 2. A lot of the <email>...</email> sets will have '&lt;' and '&gt;'
embedded in them (as entities). These can be removed, since the
stylesheet will add them.
Search for: <email>&lt;\([^&]+\)&gt;</email>
Replace with: <email>\1</email>
Step 3. The trick now is to turn
<ulink url="mailto:nik@freebsd.org">Nik Clayton</ulink>
into
Nik Clayton <email>nik@freebsd.org</email>
This step could (possibly) have been done first, and then steps
1 and 2 could be done globally. I haven't done this because of
concerns about the ordering of names within languages. This
transformation is fairly simple in English; I've no idea what
it's like in Japanese.
Search for: <ulink\s-+url="mailto:\([^"]+\)">\([^<]+\)</ulink>
Replace with: \2 <email>\1</email>
Step 4. Remove leading and trailing spaces that may have slipped in.
Search for: <email>\s-+
Replace with: <email>
Search for: \s-+</email>
Replace with: </email>
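
The four steps above can be sketched as one Python function, for anyone
who would rather script the conversion than drive it from Emacs. This is
an approximation, not the procedure actually used: step 1 was applied
selectively by hand, whereas the sketch assumes that "link text containing
an '@'" is a good enough test for "the link text is the address". It also
assumes the embedded angle brackets appear as the entities &lt; and &gt;.
The function name is hypothetical.

```python
import re

def convert_mailto_links(sgml):
    """Rewrite <ulink url="mailto:..."> elements as <email> elements."""
    # Step 1: link text that looks like an address becomes <email>.
    # Assumption: "contains an '@'" stands in for the manual check.
    sgml = re.sub(
        r'<ulink\s+url="mailto:[^">]+">([^<]*@[^<]*)</ulink>',
        r'<email>\1</email>',
        sgml,
    )
    # Step 2: drop &lt; and &gt; entities wrapped around the address.
    sgml = re.sub(r'<email>&lt;([^&]+)&gt;</email>', r'<email>\1</email>', sgml)
    # Step 3: remaining links carry a name; emit "Name <email>addr</email>".
    sgml = re.sub(
        r'<ulink\s+url="mailto:([^"]+)">([^<]+)</ulink>',
        r'\2 <email>\1</email>',
        sgml,
    )
    # Step 4: trim whitespace just inside the <email> tags.
    sgml = re.sub(r'<email>\s+', '<email>', sgml)
    sgml = re.sub(r'\s+</email>', '</email>', sgml)
    return sgml
```

Step 3 still hard-codes the English "name, then address" ordering, so the
concern about other languages applies to the sketch as well.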
<literal remap=..> -> <literal>
<command remap=..> -> <command>
Or deleted <emphasis ..> altogether in some cases.
More redundant <para>..</para>'s removed.
things.
I'm now working through from the beginning of the handbook to end,
correcting as I go. I'll commit in chunks of 5,000 lines (or
thereabouts).
Most of the changes fall into the following categories.
* <emphasis remap=bf> --> <emphasis>
* Spurious <para>s around <*list>s deleted (but not reformatted)
"C-c -" in Emacs SGML mode (when the point is on an element's start
or end tag) will delete that element's start and end tags.
* Marked smileys with <!-- smiley --> for possible future deletion
* Deleting redundant <emphasis> inside <term>:
<term><emphasis>...</emphasis></term> -> <term>...</term>
* Fine tuning markup choices in some cases
- <filename>C:</filename> -> <devicename>C:</devicename>
* Extra <note>s here and there.
* Some <*list>s to <procedure> (and <listitem>s to <step>)
* ASCII emphasis converted to <emphasis>
i.e., do it like *this* -> do it like <emphasis>this</emphasis>
* <symbol> -> <replaceable>
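
As an illustration of the ASCII emphasis item in the list above, here is a
minimal Python sketch. It is deliberately conservative (it only matches
words between a pair of asterisks, with no space next to either asterisk)
and is an assumption about how such a conversion could be scripted; the
changes in this commit were made by hand. The function name is
hypothetical.

```python
import re

# One or more space-separated words wrapped in asterisks, where the
# asterisks are not themselves glued to a neighbouring word character.
ASCII_EMPHASIS = re.compile(r"(?<!\w)\*(\w+(?: \w+)*)\*(?!\w)")

def convert_ascii_emphasis(text):
    """Rewrite *word* as <emphasis>word</emphasis>."""
    return ASCII_EMPHASIS.sub(r"<emphasis>\1</emphasis>", text)
```

Anything inside program listings would still need to be checked by eye,
since an asterisk there is more likely to be code than emphasis.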
There are very few whitespace changes, although a few have probably
cropped up. The vast majority of the whitespace changes will happen in
one megacommit, hopefully some time next week.
This does the first 5,000 lines or so.
this (in Emacs) by searching for
\s-+</para>
and replacing with
</para>
Do this for all occurrences *except* where the element immediately before
the </para> is one of <itemizedlist>, <orderedlist>, <variablelist>,
<procedure>. The <para>...</para> wrapping these elements is mostly
redundant, and will be removed later.
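
A minimal Python sketch of the same search and replace, including the
exception for list-like elements. It assumes "the element immediately
before the </para>" means that the text just ahead of the whitespace ends
with that element's closing tag; the function and constant names are
hypothetical.

```python
import re

# Closing tags after which the whitespace before </para> is left alone.
LIST_CLOSERS = (
    "</itemizedlist>",
    "</orderedlist>",
    "</variablelist>",
    "</procedure>",
)

def strip_space_before_para_close(sgml):
    """Delete whitespace before </para>, except right after a list element."""
    def fix(match):
        # Look at the text ending where the whitespace run begins.
        if sgml.endswith(LIST_CLOSERS, 0, match.start()):
            return match.group(0)   # leave this occurrence untouched
        return "</para>"
    return re.sub(r"\s+</para>", fix, sgml)
```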
<para> There is some leading space here.</para>
Get rid of it by doing an Emacs search/replace for
<para> +\([^ ]\)
and replacing with
<para>\1
This can be done globally.
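
For reference, the same replacement as a Python sketch. The regexp is a
direct translation of the Emacs one above (Emacs writes groups as
\( ... \), Python as ( ... )); the function name is hypothetical.

```python
import re

def strip_space_after_para_open(sgml):
    """Delete spaces between <para> and the first non-space character."""
    return re.sub(r"<para> +([^ ])", r"<para>\1", sgml)
```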
Some parts of the handbook had single spaces after stops, some had double
or triple. While the typographical convention for monospaced fonts may
be to use double spaces after them, that doesn't apply here. TeX will
ignore them, as will HTML. If we need them for a plain text version of the
Handbook then the stylesheet / conversion mechanism can insert them
as necessary.
Searching for
_\([;:!\.\?,]\) +_
in Emacs and replacing with
_\1 _
(ignore the '_', they're just to delineate the regexps) does the job
quite nicely. However, you can't do this everywhere, since some of the
double spaces might be in program listings or other literal sections
(e.g., the BSD Copyright), so you need to sit and bounce on the 'y' or
'n' key as appropriate for each occurrence of a stop.
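
A rough Python sketch of a non-interactive version, for illustration
only. It collapses runs of two or more spaces after a stop, and skips
anything inside <programlisting>, <screen> or <literallayout>. Treating
those three elements as "the literal sections" is an assumption of the
sketch; literal text marked up any other way would still be altered, so
this does not replace the manual y/n pass described above. All names are
hypothetical.

```python
import re

# Assumption: these elements hold the text whose spacing must be kept.
LITERAL_BLOCK = re.compile(
    r"<(programlisting|screen|literallayout)\b.*?</\1>", re.DOTALL
)
# A stop followed by two or more spaces.
STOP_SPACES = re.compile(r"([;:!.?,]) {2,}")

def collapse_spaces_after_stops(sgml):
    """Reduce multiple spaces after punctuation to one, outside literal blocks."""
    out = []
    pos = 0
    for block in LITERAL_BLOCK.finditer(sgml):
        # Fix the ordinary text before this literal block...
        out.append(STOP_SPACES.sub(r"\1 ", sgml[pos:block.start()]))
        # ...and copy the literal block through unchanged.
        out.append(block.group(0))
        pos = block.end()
    out.append(STOP_SPACES.sub(r"\1 ", sgml[pos:]))
    return "".join(out)
```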
but moderated).
NOTE: All old subscribers were dropped, so if you were on the list
before and still want to be on the list now (after it has gone open),
you'll have to re-subscribe.