doc/en/handbook
Nik Clayton 01a9af0f16 Finished the first sweep for <filename>...</filename>.
My next commit opportunity will probably be this time next Monday. Stay
tuned. . .
1998-04-03 21:21:55 +00:00

This will be the location of the DocBook version of the FreeBSD Handbook,
which will eventually obsolete the version currently in doc/handbook/.

Interested parties should examine

  <URL:http://www.nothing-going-on.demon.co.uk/FreeBSD/docbook-migration.html>

and get in touch with Nik Clayton (either to nik@FreeBSD.ORG or via the
FreeBSD-doc mailing list) if they have specific questions.

All the scripts mentioned here can also be downloaded by going to

  <URL:http://www.freebsd.org/~nik/script_name>

for example,

  <URL:http://www.freebsd.org/~nik/entity-cdata.pl>

  
   ------------------------------------------------------------------------
    The Handbook is midway through the conversion process. It will almost
                certainly not convert to other formats cleanly.
   ------------------------------------------------------------------------


				   Actions

  This list explains what's been done so far, so the Japanese team can
  track my changes. All actions took place on freefall.

  1. Initial conversion to DocBook

     Checked out a copy of the doc repository to ~/cvs/. Then used 2 scripts
     to convert the handbook to its initial DocBook format. The 2 scripts are
     2docbook.sh and entity-cdata.pl, both of which can be found in ~nik/bin/.

     2docbook.sh calls entity-cdata.pl as necessary.

         % cd ~/cvs/doc/handbook
         % 2docbook.sh

     This created handbook-db.sgml in ~/cvs/doc/handbook. This file contains
     syntactically valid (but quite ugly) SGML. This file was then moved to
     the doc/en/handbook directory and renamed to handbook.sgml. The
     conversion process left a few spurious changes in the old handbook files
     which I don't want to commit, so I removed them and updated the
     repository.

     The new file was then committed.

         % mv handbook-db.sgml ~/cvs/doc/en/handbook/handbook.sgml
         % rm *.sgml
         % cvs update
         % cd ~/cvs/doc/en/handbook
         % cvs add handbook.sgml
         % cvs commit

  2. handbook.sgml was loaded into XEmacs 20.30 (straight from the ports
     collection) and sgml-mode was turned on. My .emacs file contains the
     following hook:

         (add-hook 'sgml-mode-hook
              (function
               (lambda()
                 (setq sgml-omittag nil)
                 (setq sgml-indent-data t))))

     This configures psgml to not omit any tags that the DTD lists as
     omittable, and to indent data in the same way that markup is indented.

     The following function was pasted into the *scratch* buffer, and then
     "M-x eval-current-buffer" was run.

         (defun sgml-indent-buffer ()
            "Indents the current buffer, one line at a time"
            (interactive "*")
            (save-excursion
              (goto-char (point-min))
              ;; forward-line returns 0 for as long as it was able to move
              (while (= (forward-line 1) 0)
                (sgml-indent-or-tab))))

     In the handbook.sgml buffer, the point was placed on the first character
     of the first line, and "M-x sgml-indent-buffer" was run.

     The changes were then committed.

  3. Refilled the Handbook -- this rewraps the lines as necessary. This was
     done by placing the point on the first <book> tag, and running
     "M-x sgml-fill-element".

     This takes about 10 minutes to run.

     It also reformats some sections that should not be reformatted, including
     examples of text on the screen, PGP key blocks and so on. They will be
     fixed in a later commit.
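
     To find the sections that need checking after the refill, something
     like the following could be used. This is only a sketch and was not
     part of the conversion. It assumes the line-sensitive content sits in
     the DocBook elements <screen>, <programlisting> and <literallayout>;
     the function name verbatim_locations is made up for this example.

```python
import re

# Assumption: these DocBook elements hold the content whose line breaks
# matter (screen output, listings, key blocks).
VERBATIM = ("screen", "programlisting", "literallayout")
OPEN_TAG = re.compile(r"<(%s)\b" % "|".join(VERBATIM))


def verbatim_locations(text):
    """Return (line_number, element_name) for each verbatim element start."""
    found = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in OPEN_TAG.finditer(line):
            found.append((line_no, match.group(1)))
    return found
```

     The line numbers it reports are only places to look at by hand; it
     cannot tell whether the refill actually damaged a given block.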

  4. Removed spurious markup. The conversion process left a lot of

         <para></para>

     entries in the Handbook, and they needed to be removed. There are a
     number of places where this happens, and the rules are slightly
     different each time.

     For example,

     ====================================================================
       Original markup                       Changed markup
       --------------------------------------------------------------

       <listitem>                            <listitem>
         <para></para>                         <para>A real paragraph</para>

         <para>A real paragraph</para>

       --------------------------------------------------------------

         <para>A real paragraph</para>         <para>A real paragraph</para>
                                             </listitem>
         <para></para>

       </listitem>

       --------------------------------------------------------------

         <para>A real paragraph</para>         <para>A real paragraph</para>

         <para></para>                         <para>Another paragraph</para>

         <para>Another paragraph</para>
     ====================================================================

     Notice the last example. It's not enough to simply put together a
     regexp that matches (all whitespace)<para></para>(all whitespace)
     and removes it, since that would also remove the blank line that
     separates the two real paragraphs, leaving you with

       <para>A real paragraph</para>
       <para>Another paragraph</para>

     In the end I got bored of trying to write this using Emacs regexps,
     and knocked together a quick Perl script to do it. It's ~nik/bin/para.pl
     on freefall.
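
     The three cases in the table can be sketched as follows. This is not
     para.pl itself, just an illustration in Python of the rules described
     above. The function name strip_empty_paras, and the decision to treat
     any line holding only an opening tag like <listitem>, are assumptions
     made for this example.

```python
import re

EMPTY_PARA = re.compile(r"\s*<para>\s*</para>\s*")
CLOSE_TAG = re.compile(r"\s*</")
# A line that holds nothing but a single opening tag, e.g. "<listitem>".
OPEN_ONLY = re.compile(r"\s*<[A-Za-z][^<>/]*>\s*")


def strip_empty_paras(text):
    """Remove empty <para></para> lines, fixing up the blank lines around them."""
    lines = text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if not EMPTY_PARA.fullmatch(line):
            out.append(line)
            i += 1
            continue
        # Drop the empty paragraph and the blank lines on both sides of it.
        while out and not out[-1].strip():
            out.pop()
        i += 1
        while i < len(lines) and not lines[i].strip():
            i += 1
        # Put one blank line back unless the gap follows an opening tag,
        # precedes a closing tag, or sits at the start or end of the text.
        after_open = bool(out) and OPEN_ONLY.fullmatch(out[-1]) is not None
        before_close = i < len(lines) and CLOSE_TAG.match(lines[i]) is not None
        at_edge = not out or i >= len(lines)
        if not (after_open or before_close or at_edge):
            out.append("")
    return "\n".join(out) + "\n"
```

     Applied to the third case in the table, this keeps exactly one blank
     line between the two real paragraphs, which is what a plain
     whitespace-eating regexp fails to do.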

  5. Got halfway through looking for filenames, and marking them up as such.

     There are a lot ( :-( ) of filenames in the Handbook. The conversion
     process did a pretty good job of marking them as <filename>...</filename>
     but it wasn't perfect.

     I'm halfway through (line 16704) going through the Handbook, eyeballing
     each line and changing things like <emphasis remap="tt">...</emphasis>
     to <filename>...</filename> where appropriate.

     The remainder will follow tomorrow evening.

  6. Finished the first sweep marking up filenames.

     If it looked like a filename (but wasn't a command for the user to type
     in) it's been marked up with <filename>...</filename>.

     If it had already been marked up as <filename>...</filename> but wasn't
     a filename, the markup was changed to <emphasis remap="tt">...</emphasis>.

     PSGML and XEmacs are very useful for this, using "C-c =" to change
     existing markup.
     
     Synchronising with changes 5 and 6 will involve examining the diffs
     and making the changes by hand, I'm afraid. It could not be automated.
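
     The decision itself has to be made by eye, but the places to look at
     can be listed mechanically. The following is a rough sketch, not
     something that was used for this sweep. It prints nothing and changes
     nothing; it only reports <emphasis remap="tt"> spans whose content
     looks like a path. The "contains a slash and no whitespace" test and
     the function names are assumptions made for this example.

```python
import re

TT_SPAN = re.compile(r'<emphasis remap="tt">(.*?)</emphasis>', re.DOTALL)


def looks_like_path(content):
    """Crude heuristic (an assumption): has a slash and no whitespace."""
    return "/" in content and not any(ch.isspace() for ch in content)


def filename_candidates(text):
    """Return (line_number, content) for tt spans that may be filenames."""
    found = []
    for match in TT_SPAN.finditer(text):
        content = match.group(1)
        if looks_like_path(content):
            # Line number of the opening tag, counting from 1.
            line_no = text.count("\n", 0, match.start()) + 1
            found.append((line_no, content))
    return found
```

     Each reported span still needs a human to decide whether it is a
     filename or a command for the user to type in.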