Edit for clarity and style. Try to persuade the hippo and the pogo stick
that they are not good for each other.
parent
9ec2aca708
commit
5bbb3c8791
Notes:
svn2git
2020-12-08 03:00:23 +00:00
svn path=/head/; revision=42257
1 changed files with 225 additions and 246 deletions
|
@ -34,10 +34,10 @@
|
|||
<chapter id="xml-primer">
|
||||
<title>XML Primer</title>
|
||||
|
||||
<para>The majority of FDP documentation is written in applications
|
||||
of XML. This chapter explains exactly what that means, how to
|
||||
read and understand the source to the documentation, and the sort
|
||||
of XML tricks you will see used in the documentation.</para>
|
||||
<para>Most FDP documentation is written with markup languages based
|
||||
on <acronym>XML</acronym>. This chapter explains what that means, how to
|
||||
read and understand the documentation source, and the
|
||||
<acronym>XML</acronym> techniques used.</para>
|
||||
|
||||
<para>Portions of this section were inspired by Mark Galassi's
|
||||
<ulink
|
||||
|
@ -47,31 +47,31 @@
|
|||
<sect1 id="xml-primer-overview">
|
||||
<title>Overview</title>
|
||||
|
||||
<para>Way back when, electronic text was simple to deal with.
|
||||
Admittedly, you had to know which character set your document
|
||||
was written in (ASCII, EBCDIC, or one of a number of others) but
|
||||
<para>In the original days of computers, electronic text was simple.
|
||||
|
||||
There were a few character sets like <acronym>ASCII</acronym> or <acronym>EBCDIC</acronym>, but
|
||||
that was about it. Text was text, and what you saw really was
|
||||
what you got. No frills, no formatting, no intelligence.</para>
|
||||
|
||||
<para>Inevitably, this was not enough. Once you have text in a
|
||||
machine-usable format, you expect machines to be able to use it
|
||||
and manipulate it intelligently. You would like to indicate
|
||||
<para>Inevitably, this was not enough. When text is in a
|
||||
machine-usable format, machines are expected to be able to use
|
||||
and manipulate it intelligently. Authors want to indicate
|
||||
that certain phrases should be emphasized, or added to a
|
||||
glossary, or be hyperlinks. You might want filenames to be
|
||||
glossary, or made into hyperlinks. Filenames could be
|
||||
shown in a <quote>typewriter</quote> style font for viewing on
|
||||
screen, but as <quote>italics</quote> when printed, or any of a
|
||||
myriad of other options for presentation.</para>
|
||||
|
||||
<para>It was once hoped that Artificial Intelligence (AI) would
|
||||
make this easy. Your computer would read in the document and
|
||||
make this easy. The computer would read the document and
|
||||
automatically identify key phrases, filenames, text that the
|
||||
reader should type in, examples, and more. Unfortunately, real
|
||||
life has not happened quite like that, and our computers require
|
||||
some assistance before they can meaningfully process our
|
||||
life has not happened quite like that, and computers still require
|
||||
assistance before they can meaningfully process
|
||||
text.</para>
|
||||
|
||||
<para>More precisely, they need help identifying what is what.
|
||||
Let's look at this text:</para>
|
||||
Consider this text:</para>
|
||||
|
||||
<blockquote>
|
||||
<para>To remove <filename>/tmp/foo</filename> use
|
||||
|
@ -100,42 +100,40 @@
|
|||
document must typically be done by a person—after all, if
|
||||
computers could recognize the text sufficiently well to add the
|
||||
markup then there would be no need to add it in the first place.
|
||||
This <emphasis>increases the cost</emphasis> (i.e., the effort
|
||||
This <emphasis>increases the cost</emphasis> (the effort
|
||||
required) to create the document.</para>
|
||||
|
||||
<para>The previous example is actually represented in this
|
||||
document like this:</para>
|
||||
|
||||
<programlisting><![CDATA[<para>To remove <filename>/tmp/foo</filename> use &man.rm.1;.</para>
|
||||
<programlisting><sgmltag class="starttag">para</sgmltag>To remove <sgmltag class="starttag">filename</sgmltag>/tmp/foo<sgmltag class="endtag">filename</sgmltag> use &man.rm.1;.<sgmltag class="endtag">para</sgmltag>
|
||||
|
||||
<screen>&prompt.user; <userinput>rm /tmp/foo</userinput></screen>]]></programlisting>
|
||||
<sgmltag class="starttag">screen</sgmltag>&prompt.user; <sgmltag class="starttag">userinput</sgmltag>rm /tmp/foo<sgmltag class="endtag">userinput</sgmltag><sgmltag class="endtag">screen</sgmltag></programlisting>
|
||||
|
||||
<para>As you can see, the markup is clearly separate from the
|
||||
<para>The markup is clearly separate from the
|
||||
content.</para>
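The separation of markup from content can be illustrated with a short sketch. It uses Python's standard xml.etree.ElementTree module, which is not part of the FDP toolchain, and a simplified copy of the example in which the &man.rm.1; entity is replaced by plain text (an assumption made so the fragment parses on its own).

```python
import xml.etree.ElementTree as ET

# Simplified copy of the example above. The &man.rm.1; entity is replaced
# with plain text because it is only defined in the FDP entity files.
source = "<para>To remove <filename>/tmp/foo</filename> use rm(1).</para>"

para = ET.fromstring(source)

# itertext() yields only the character data and skips every tag,
# so this is the content with the markup stripped away.
content = "".join(para.itertext())

# iter() walks the elements themselves, which is the markup side.
element_names = [elem.tag for elem in para.iter()]

print(content)
print(element_names)
```

The same text yields two independent views: the words a reader sees, and the element names a processing tool acts on.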
|
||||
|
||||
<para>Obviously, if you are going to use markup you need to define
|
||||
what your markup means, and how it should be interpreted. You
|
||||
will need a markup language that you can follow when marking up
|
||||
your documents.</para>
|
||||
<para>Markup languages define what
|
||||
the markup means and how it should be interpreted.</para>
|
||||
|
||||
<para>Of course, one markup language might not be enough. A
|
||||
markup language for technical documentation has very different
|
||||
requirements than a markup language that was to be used for
|
||||
requirements than a markup language that is intended for
|
||||
cookery recipes. This, in turn, would be very different from a
|
||||
markup language used to describe poetry. What you really need
|
||||
is a first language that you use to write these other markup
|
||||
markup language used to describe poetry. What is really needed
|
||||
is a first language used to write these other markup
|
||||
languages. A <emphasis>meta markup language</emphasis>.</para>
|
||||
|
||||
<para>This is exactly what the eXtensible Markup
|
||||
Language (XML) is. Many markup languages have been written in
|
||||
XML, including the two most used by the FDP, XHTML and
|
||||
Language (<acronym>XML</acronym>) is. Many markup languages have been written in
|
||||
<acronym>XML</acronym>, including the two most used by the <acronym>FDP</acronym>, <acronym>XHTML</acronym> and
|
||||
DocBook.</para>
|
||||
|
||||
<para>Each language definition is more properly called a grammar,
|
||||
vocabulary, schema or Document Type Definition (DTD). There
|
||||
are various languages to specify an XML grammar, for example,
|
||||
DTD (yes, it also means the specification language itself),
|
||||
XML Schema (XSD) or RELAX NG. The schema specifies the name
|
||||
vocabulary, schema or Document Type Definition (<acronym>DTD</acronym>). There
|
||||
are various languages to specify an <acronym>XML</acronym> grammar, for example,
|
||||
<acronym>DTD</acronym> (yes, it also means the specification language itself),
|
||||
<acronym>XML</acronym> Schema (<acronym>XSD</acronym>) or <acronym>RELAX NG</acronym>. The schema specifies the name
|
||||
of the elements that can be used, what order they appear in (and
|
||||
whether some markup can be used inside other markup) and related
|
||||
information.</para>
|
||||
|
@ -144,7 +142,7 @@
|
|||
<emphasis>complete</emphasis> specification of all the elements
|
||||
that are allowed to appear, the order in which they should
|
||||
appear, which elements are mandatory, which are optional, and so
|
||||
forth. This makes it possible to write an XML
|
||||
forth. This makes it possible to write an <acronym>XML</acronym>
|
||||
<emphasis>parser</emphasis> which reads in both the schema and a
|
||||
document which claims to conform to the schema. The parser can
|
||||
then confirm whether or not all the elements required by the vocabulary
|
||||
|
@ -155,34 +153,34 @@
|
|||
<note>
|
||||
<para>This processing simply confirms that the choice of
|
||||
elements, their ordering, and so on, conforms to that listed
|
||||
in the grammar. It does <emphasis>not</emphasis> check that you
|
||||
have used <emphasis>appropriate</emphasis> markup for the
|
||||
content. If you tried to mark up all the filenames in your
|
||||
document as function names, the parser would not flag this as
|
||||
an error (assuming, of course, that your schema defines elements
|
||||
in the grammar. It does <emphasis>not</emphasis> check whether
|
||||
<emphasis>appropriate</emphasis> markup has been used for the
|
||||
content. If all the filenames in a
|
||||
document were marked up as function names, the parser would not flag this as
|
||||
an error (assuming, of course, that the schema defines elements
|
||||
for filenames and functions, and that they are allowed to
|
||||
appear in the same place).</para>
|
||||
</note>
|
||||
|
||||
<para>It is likely that most of your contributions to the
|
||||
Documentation Project will consist of content marked up in
|
||||
either XHTML or DocBook, rather than alterations to the schemas.
|
||||
For this reason this book will not touch on how to write a
|
||||
<para>It is likely that most contributions to the
|
||||
Documentation Project will be content marked up in
|
||||
either <acronym>XHTML</acronym> or DocBook, rather than alterations to the schemas.
|
||||
For this reason, this book will not touch on how to write a
|
||||
vocabulary.</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="xml-primer-elements">
|
||||
<title>Elements, Tags, and Attributes</title>
|
||||
|
||||
<para>All the vocabularies written in XML share certain characteristics.
|
||||
This is hardly surprising, as the philosophy behind XML will
|
||||
<para>All the vocabularies written in <acronym>XML</acronym> share certain characteristics.
|
||||
This is hardly surprising, as the philosophy behind <acronym>XML</acronym> will
|
||||
inevitably show through. One of the most obvious manifestations
|
||||
of this philosophy is that of <emphasis>content</emphasis> and
|
||||
<emphasis>elements</emphasis>.</para>
|
||||
|
||||
<para>Your documentation (whether it is a single web page, or a
|
||||
lengthy book) is considered to consist of content. This content
|
||||
is then divided (and further subdivided) into elements. The
|
||||
<para>Documentation, whether it is a single web page, or a
|
||||
lengthy book, is considered to consist of content. This content
|
||||
is then divided and further subdivided into elements. The
|
||||
purpose of adding markup is to name and identify the boundaries
|
||||
of these elements for further processing.</para>
|
||||
|
||||
|
@ -195,21 +193,21 @@
|
|||
that was direct speech, or the name of a character in the
|
||||
story.</para>
|
||||
|
||||
<para>You might like to think of this as <quote>chunking</quote>
|
||||
content. At the very top level you have one chunk, the book.
|
||||
Look a little deeper, and you have more chunks, the individual
|
||||
<para>It may be helpful to think of this as <quote>chunking</quote>
|
||||
content. At the very top level is one chunk, the book.
|
||||
Look a little deeper, and there are more chunks, the individual
|
||||
chapters. These are chunked further into paragraphs, footnotes,
|
||||
character names, and so on.</para>
|
||||
|
||||
<para>Notice how you can make this differentiation between
|
||||
different elements of the content without resorting to any XML
|
||||
terms. It really is surprisingly straightforward. You could do
|
||||
this with a highlighter pen and a printout of the book, using
|
||||
<para>Notice how this differentiation between
|
||||
different elements of the content can be made without resorting to any <acronym>XML</acronym>
|
||||
terms. It really is surprisingly straightforward. This could be done
|
||||
with a highlighter pen and a printout of the book, using
|
||||
different colors to indicate different chunks of content.</para>
|
||||
|
||||
<para>Of course, we do not have an electronic highlighter pen, so
|
||||
we need some other way of indicating which element each piece of
|
||||
content belongs to. In languages written in XML (XHTML,
|
||||
content belongs to. In languages written in <acronym>XML</acronym> (<acronym>XHTML</acronym>,
|
||||
DocBook, et al) this is done by means of
|
||||
<emphasis>tags</emphasis>.</para>
|
||||
|
||||
|
@ -223,59 +221,54 @@
|
|||
<para>For an element called
|
||||
<replaceable>element-name</replaceable> the start tag will
|
||||
normally look like
|
||||
<sgmltag><replaceable>element-name</replaceable></sgmltag>. The
|
||||
<sgmltag class="starttag"><replaceable>element-name</replaceable></sgmltag>. The
|
||||
corresponding closing tag for this element is
|
||||
<sgmltag>/<replaceable>element-name</replaceable></sgmltag>.</para>
|
||||
<sgmltag class="endtag"><replaceable>element-name</replaceable></sgmltag>.</para>
|
||||
|
||||
<example>
|
||||
<title>Using an Element (Start and End Tags)</title>
|
||||
|
||||
<para>XHTML has an element for indicating that the content
|
||||
<para><acronym>XHTML</acronym> has an element for indicating that the content
|
||||
enclosed by the element is a paragraph, called
|
||||
<sgmltag>p</sgmltag>.</para>
|
||||
|
||||
<programlisting><![CDATA[<p>This is a paragraph. It starts with the start tag for
|
||||
<programlisting><sgmltag class="starttag">p</sgmltag>This is a paragraph. It starts with the start tag for
|
||||
the 'p' element, and it will end with the end tag for the 'p'
|
||||
element.</p>
|
||||
element.<sgmltag class="endtag">p</sgmltag>
|
||||
|
||||
<p>This is another paragraph. But this one is much shorter.</p>]]></programlisting>
|
||||
<sgmltag class="starttag">p</sgmltag>This is another paragraph. But this one is much shorter.<sgmltag class="endtag">p</sgmltag></programlisting>
|
||||
</example>
|
||||
|
||||
<para>Some elements have no
|
||||
content. For example, in XHTML you can indicate that you want a
|
||||
horizontal line to appear in the document.</para>
|
||||
|
||||
<para>For such elements, that have no content at all, XML introduced
|
||||
a shorthand form, which is completely equivalent to the above
|
||||
form:</para>
|
||||
|
||||
<programlisting><![CDATA[<hr/>]]></programlisting>
|
||||
content. For example, in <acronym>XHTML</acronym>, a
|
||||
horizontal line can be included in the document.
|
||||
For these <quote>empty</quote> elements, <acronym>XML</acronym> introduced
|
||||
a shorthand form that is completely equivalent to the two-tag
|
||||
version:</para>
|
||||
|
||||
<example>
|
||||
<title>Using an Element (Without Content)</title>
|
||||
<title>Using an Element Without Content</title>
|
||||
|
||||
<para>XHTML has an element for indicating a horizontal rule,
|
||||
<para><acronym>XHTML</acronym> has an element for indicating a horizontal rule,
|
||||
called <sgmltag>hr</sgmltag>. This element does not wrap
|
||||
content, so it looks like this.</para>
|
||||
content, so it looks like this:</para>
|
||||
|
||||
<programlisting><![CDATA[<p>One paragraph.</p>
|
||||
<hr></hr>
|
||||
<programlisting><sgmltag class="starttag">p</sgmltag>One paragraph.<sgmltag class="endtag">p</sgmltag>
|
||||
<sgmltag class="starttag">hr</sgmltag><sgmltag class="endtag">hr</sgmltag>
|
||||
|
||||
<p>This is another paragraph. A horizontal rule separates this
|
||||
from the previous paragraph.</p>]]></programlisting>
|
||||
<sgmltag class="starttag">p</sgmltag>This is another paragraph. A horizontal rule separates this
|
||||
from the previous paragraph.<sgmltag class="endtag">p</sgmltag></programlisting>
|
||||
|
||||
<para>For such elements, that have no content at all, XML introduced
|
||||
a shorthand form, which is completely equivalent to the above
|
||||
form:</para>
|
||||
<para>The shorthand version consists of a single tag:</para>
|
||||
|
||||
<programlisting><![CDATA[<p>One paragraph.</p>
|
||||
<hr/>
|
||||
<programlisting><sgmltag class="starttag">p</sgmltag>One paragraph.<sgmltag class="endtag">p</sgmltag>
|
||||
<sgmltag class="emptytag">hr</sgmltag>
|
||||
|
||||
<p>This is another paragraph. A horizontal rule separates this
|
||||
from the previous paragraph.</p>]]></programlisting>
|
||||
<sgmltag class="starttag">p</sgmltag>This is another paragraph. A horizontal rule separates this
|
||||
from the previous paragraph.<sgmltag class="endtag">p</sgmltag></programlisting>
|
||||
</example>
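As a rough check of the claim that the two forms are equivalent, the sketch below parses both with Python's standard xml.etree.ElementTree module and compares what the parser reports. The describe() helper is hypothetical and exists only for this illustration.

```python
import xml.etree.ElementTree as ET

two_tag = ET.fromstring("<hr></hr>")
shorthand = ET.fromstring("<hr/>")


def describe(elem):
    """Collect what a parser exposes: name, text, children, attributes."""
    return (elem.tag, elem.text, list(elem), dict(elem.attrib))


# Both forms produce an element named 'hr' with no text, no children,
# and no attributes, so the comparison holds.
same = describe(two_tag) == describe(shorthand)
print(same)
```

Once parsed, a tool working on the document cannot tell which spelling the author used.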
|
||||
|
||||
<para>If it is not obvious by now, elements can contain other
|
||||
<para>As shown above, elements can contain other
|
||||
elements. In the book example earlier, the book element
|
||||
contained all the chapter elements, which in turn contained all
|
||||
the paragraph elements, and so on.</para>
|
||||
|
@ -283,11 +276,11 @@
|
|||
<example>
|
||||
<title>Elements within Elements; <sgmltag>em</sgmltag></title>
|
||||
|
||||
<programlisting><![CDATA[<p>This is a simple <em>paragraph</em> where some
|
||||
of the <em>words</em> have been <em>emphasized</em>.</p>]]></programlisting>
|
||||
<programlisting><sgmltag class="starttag">p</sgmltag>This is a simple <sgmltag class="starttag">em</sgmltag>paragraph<sgmltag class="endtag">em</sgmltag> where some
|
||||
of the <sgmltag class="starttag">em</sgmltag>words<sgmltag class="endtag">em</sgmltag> have been <sgmltag class="starttag">em</sgmltag>emphasized<sgmltag class="endtag">em</sgmltag>.<sgmltag class="endtag">p</sgmltag></programlisting>
|
||||
</example>
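The nesting in this example can be inspected programmatically. This is a minimal sketch using Python's standard xml.etree.ElementTree module rather than any FDP tool; the variable names are chosen for illustration.

```python
import xml.etree.ElementTree as ET

source = (
    "<p>This is a simple <em>paragraph</em> where some\n"
    "of the <em>words</em> have been <em>emphasized</em>.</p>"
)

p = ET.fromstring(source)

# Iterating over an element yields its direct children.
children = [child.tag for child in p]

# iter("em") finds every em element anywhere below p.
emphasized = [em.text for em in p.iter("em")]

print(children)
print(emphasized)
```

The p element is the parent and each em element is a child, which mirrors the book, chapter, and paragraph hierarchy described earlier.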
|
||||
|
||||
<para>The grammar will specify the rules detailing which elements can
|
||||
<para>The grammar consists of rules that describe which elements can
|
||||
contain other elements, and exactly what they can
|
||||
contain.</para>
|
||||
|
||||
|
@ -298,10 +291,10 @@
|
|||
|
||||
<para>An element is a conceptual part of your document. An
|
||||
element has a defined start and end. The tags mark where the
|
||||
element starts and end.</para>
|
||||
element starts and ends.</para>
|
||||
|
||||
<para>When this document (or anyone else knowledgeable about
|
||||
XML) refers to <quote>the <sgmltag>p</sgmltag> tag</quote>
|
||||
<acronym>XML</acronym>) refers to <quote>the <sgmltag class="starttag">p</sgmltag> tag</quote>
|
||||
they mean the literal text consisting of the three characters
|
||||
<literal><</literal>, <literal>p</literal>, and
|
||||
<literal>></literal>. But the phrase <quote>the
|
||||
|
@ -323,13 +316,13 @@
|
|||
take the form
|
||||
<literal><replaceable>attribute-name</replaceable>="<replaceable>attribute-value</replaceable>"</literal>.</para>
|
||||
|
||||
<para>In XHTML, the
|
||||
<para>In <acronym>XHTML</acronym>, the
|
||||
<sgmltag>p</sgmltag> element has an attribute called
|
||||
<sgmltag>align</sgmltag>, which suggests an alignment
|
||||
<sgmltag class="attribute">align</sgmltag>, which suggests an alignment
|
||||
(justification) for the paragraph to the program displaying the
|
||||
XHTML.</para>
|
||||
<acronym>XHTML</acronym>.</para>
|
||||
|
||||
<para>The <literal>align</literal> attribute can take one of four
|
||||
<para>The <sgmltag class="attribute">align</sgmltag> attribute can take one of four
|
||||
defined values, <literal>left</literal>,
|
||||
<literal>center</literal>, <literal>right</literal> and
|
||||
<literal>justify</literal>. If the attribute is not specified
|
||||
|
@ -338,59 +331,57 @@
|
|||
<example>
|
||||
<title>Using An Element with An Attribute</title>
|
||||
|
||||
<programlisting><![CDATA[<p align="left">The inclusion of the align attribute
|
||||
on this paragraph was superfluous, since the default is left.</p>
|
||||
<programlisting><sgmltag class="starttag">p align="left"</sgmltag>The inclusion of the align attribute
|
||||
on this paragraph was superfluous, since the default is left.<sgmltag class="endtag">p</sgmltag>
|
||||
|
||||
<p align="center">This may appear in the center.</p>]]></programlisting>
|
||||
<sgmltag class="starttag">p align="center"</sgmltag>This may appear in the center.<sgmltag class="endtag">p</sgmltag></programlisting>
|
||||
</example>
|
||||
|
||||
<para>Some attributes will only take specific values, such as
|
||||
<para>Some attributes only take specific values, such as
|
||||
<literal>left</literal> or <literal>justify</literal>. Others
|
||||
will allow you to enter anything you want.</para>
|
||||
allow any value.</para>
|
||||
|
||||
<example>
|
||||
<title>Single Quotes Around Attributes</title>
|
||||
|
||||
<programlisting><![CDATA[<p align='right'>I am on the right!</p>]]></programlisting>
|
||||
<programlisting><sgmltag class="starttag">p align='right'</sgmltag>I am on the right!<sgmltag class="endtag">p</sgmltag></programlisting>
|
||||
</example>
|
||||
|
||||
<para>XML requires you to quote each attribute value with either
|
||||
single or double quotes. It is more habitual to use double quotes
|
||||
but you may use single quotes, as well. Using single quotes is
|
||||
practical if you want to include double quotes in the attribute
|
||||
value.</para>
|
||||
<para>Attribute values in <acronym>XML</acronym> must be enclosed
|
||||
in either single or double quotes. Double quotes are
|
||||
traditional. Single quotes are useful when the attribute
|
||||
value contains double quotes.</para>
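The quoting rules can be tried out with a small sketch. It uses Python's standard xml.etree.ElementTree module, which only checks well-formedness and does not validate against a DTD. The title attribute and the is_well_formed() helper are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

double_quoted = ET.fromstring('<p align="right">I am on the right!</p>')
single_quoted = ET.fromstring("<p align='right'>I am on the right!</p>")

# Single quotes allow the value itself to contain double quotes.
nested = ET.fromstring("""<p title='the "p" element'>text</p>""")


def is_well_formed(text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# An attribute value without quotes is rejected by an XML parser.
unquoted_ok = is_well_formed("<p align=right>I am on the right!</p>")

print(double_quoted.get("align"), single_quoted.get("align"))
print(nested.get("title"))
print(unquoted_ok)
```

Either quote style gives the same attribute value after parsing, so the choice only matters for what the value needs to contain.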
|
||||
|
||||
<para>The information on attributes, elements, and tags is stored
|
||||
in XML catalogs. The various Documentation Project tools use
|
||||
these catalog files to validate your work. The tools in
|
||||
<filename role="package">textproc/docproj</filename> include a
|
||||
variety of XML catalog files. The FreeBSD Documentation
|
||||
Project includes its own set of catalog files. Your tools need
|
||||
to know about both sorts of catalog files.</para>
|
||||
<para>Information about attributes, elements, and tags is stored
|
||||
in catalog files. The Documentation Project uses standard
|
||||
DocBook catalogs and includes additional catalogs for
|
||||
&os;-specific features. Paths to the catalog files are defined
|
||||
in an environment variable so they can be found by the document
|
||||
build tools.</para>
|
||||
|
||||
<sect2>
|
||||
<title>For You to Do…</title>
|
||||
<title>To Do…</title>
|
||||
|
||||
<para>In order to run the examples in this document you will
|
||||
need to install some software on your system and ensure that
|
||||
an environment variable is set correctly.</para>
|
||||
<para>Before running the examples in this document,
|
||||
application software must be installed and the catalog
|
||||
environment variable configured.</para>
|
||||
|
||||
<procedure>
|
||||
<step>
|
||||
<para>Download and install
|
||||
<para>Install
|
||||
<filename role="package">textproc/docproj</filename> from
|
||||
the FreeBSD ports system. This is a
|
||||
<emphasis>meta-port</emphasis> that should download and
|
||||
install all of the programs and supporting files that are
|
||||
used by the Documentation Project.</para>
|
||||
the &os; Ports Collection. This is a
|
||||
<emphasis>meta-port</emphasis> that downloads and
|
||||
installs the standard programs and supporting files needed
|
||||
by the Documentation Project.</para>
|
||||
</step>
|
||||
|
||||
<step>
|
||||
<para>Add lines to your shell startup files to set
|
||||
<envar>SGML_CATALOG_FILES</envar>. (If you are not working
|
||||
on the English version of the documentation, you will want
|
||||
to substitute the correct directory for your
|
||||
language.)</para>
|
||||
<para>Add lines to the shell startup files to set
|
||||
<envar>SGML_CATALOG_FILES</envar>. When working on non-English
|
||||
versions of the documentation, replace
|
||||
<replaceable>en_US.ISO8859-1</replaceable> with the appropriate directory for the
|
||||
target language.</para>
|
||||
|
||||
<example id="xml-primer-envars">
|
||||
<title><filename>.profile</filename>, for &man.sh.1; and
|
||||
|
@ -402,7 +393,7 @@ SGML_CATALOG_FILES=${SGML_ROOT}/docbook/4.1/catalog:$SGML_CATALOG_FILES
|
|||
SGML_CATALOG_FILES=${SGML_ROOT}/html/catalog:$SGML_CATALOG_FILES
|
||||
SGML_CATALOG_FILES=${SGML_ROOT}/iso8879/catalog:$SGML_CATALOG_FILES
|
||||
SGML_CATALOG_FILES=/usr/doc/share/xml/catalog:$SGML_CATALOG_FILES
|
||||
SGML_CATALOG_FILES=/usr/doc/en_US.ISO8859-1/share/xml/catalog:$SGML_CATALOG_FILES
|
||||
SGML_CATALOG_FILES=/usr/doc/<replaceable>en_US.ISO8859-1</replaceable>/share/xml/catalog:$SGML_CATALOG_FILES
|
||||
export SGML_CATALOG_FILES</programlisting>
|
||||
</example>
|
||||
|
||||
|
@ -416,11 +407,11 @@ setenv SGML_CATALOG_FILES ${SGML_ROOT}/docbook/4.1/catalog:$SGML_CATALOG_FILES
|
|||
setenv SGML_CATALOG_FILES ${SGML_ROOT}/html/catalog:$SGML_CATALOG_FILES
|
||||
setenv SGML_CATALOG_FILES ${SGML_ROOT}/iso8879/catalog:$SGML_CATALOG_FILES
|
||||
setenv SGML_CATALOG_FILES /usr/doc/share/xml/catalog:$SGML_CATALOG_FILES
|
||||
setenv SGML_CATALOG_FILES /usr/doc/en_US.ISO8859-1/share/xml/catalog:$SGML_CATALOG_FILES</programlisting>
|
||||
setenv SGML_CATALOG_FILES /usr/doc/<replaceable>en_US.ISO8859-1</replaceable>/share/xml/catalog:$SGML_CATALOG_FILES</programlisting>
|
||||
</example>
|
||||
|
||||
<para>Then either log out, and log back in again, or run
|
||||
those commands from the command line to set the variable
|
||||
<para>After making these changes, either log out and log back in again, or run
|
||||
the commands from the command line to set the variable
|
||||
values.</para>
|
||||
</step>
|
||||
</procedure>
|
||||
|
@ -428,67 +419,65 @@ setenv SGML_CATALOG_FILES /usr/doc/en_US.ISO8859-1/share/xml/catalog:$SGML_CATAL
|
|||
<procedure>
|
||||
<step>
|
||||
<para>Create <filename>example.xml</filename>, and enter
|
||||
the following text:</para>
|
||||
this text:</para>
|
||||
|
||||
<programlisting><![CDATA[<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||||
<programlisting><sgmltag class="starttag">!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"</sgmltag>
|
||||
|
||||
<html xmlns="http://www.w3.org/1999/xhtml">
|
||||
<head>
|
||||
<title>An Example XHTML File</title>
|
||||
</head>
|
||||
<sgmltag class="starttag">html xmlns="http://www.w3.org/1999/xhtml"</sgmltag>
|
||||
<sgmltag class="starttag">head</sgmltag>
|
||||
<sgmltag class="starttag">title</sgmltag>An Example XHTML File<sgmltag class="endtag">title</sgmltag>
|
||||
<sgmltag class="endtag">head</sgmltag>
|
||||
|
||||
<body>
|
||||
<p>This is a paragraph containing some text.</p>
|
||||
<sgmltag class="starttag">body</sgmltag>
|
||||
<sgmltag class="starttag">p</sgmltag>This is a paragraph containing some text.<sgmltag class="endtag">p</sgmltag>
|
||||
|
||||
<p>This paragraph contains some more text.</p>
|
||||
<sgmltag class="starttag">p</sgmltag>This paragraph contains some more text.<sgmltag class="endtag">p</sgmltag>
|
||||
|
||||
<p align="right">This paragraph might be right-justified.</p>
|
||||
</body>
|
||||
</html>]]></programlisting>
|
||||
<sgmltag class="starttag">p align="right"</sgmltag>This paragraph might be right-justified.<sgmltag class="endtag">p</sgmltag>
|
||||
<sgmltag class="endtag">body</sgmltag>
|
||||
<sgmltag class="endtag">html</sgmltag></programlisting>
|
||||
</step>
|
||||
|
||||
<step>
|
||||
<para>Try to validate this file using an XML parser.</para>
|
||||
<para>Try to validate this file using an <acronym>XML</acronym> parser.</para>
|
||||
|
||||
<para>Part of
|
||||
<filename role="package">textproc/docproj</filename> is
|
||||
<para><filename role="package">textproc/docproj</filename> includes
|
||||
the <command>xmllint</command>
|
||||
<link linkend="xml-primer-validating">validating
|
||||
parser</link>.</para>
|
||||
|
||||
<para>Use <command>xmllint</command> in the following way to
|
||||
check that your document is valid:</para>
|
||||
<para>Use <command>xmllint</command> to
|
||||
validate the document:</para>
|
||||
|
||||
<screen>&prompt.user; <userinput>xmllint --valid --noout example.xml</userinput></screen>
|
||||
|
||||
<para>As you will see, <command>xmllint</command> returns
|
||||
without displaying any output. This means that your
|
||||
<para><command>xmllint</command> returns
|
||||
without displaying any output, showing that the
|
||||
document validated successfully.</para>
|
||||
</step>
|
||||
|
||||
<step>
|
||||
<para>See what happens when required elements are omitted.
|
||||
Try removing the <sgmltag>title</sgmltag> and
|
||||
<sgmltag>/title</sgmltag> tags, and re-run the
|
||||
Delete the line with the <sgmltag class="starttag">title</sgmltag> and
|
||||
<sgmltag class="endtag">title</sgmltag> tags, and re-run the
|
||||
validation.</para>
|
||||
|
||||
<screen>&prompt.user; <userinput>xmllint --valid --noout example.xml</userinput>
|
||||
example.xml:5: element head: validity error : Element head content does not follow the DTD, expecting ((script | style | meta | link | object | isindex)* , ((title , (script | style | meta | link | object | isindex)* , (base , (script | style | meta | link | object | isindex)*)?) | (base , (script | style | meta | link | object | isindex)* , title , (script | style | meta | link | object | isindex)*))), got ()</screen>
|
||||
|
||||
<para>This line tells you that the validation error comes from
|
||||
<para>This shows that the validation error comes from
|
||||
the <replaceable>fifth</replaceable> line of the
|
||||
<replaceable>example.xml</replaceable> file and that the
|
||||
content of the <sgmltag>head</sgmltag> is the part, which
|
||||
does not follow the rules described by the XHTML grammar.</para>
|
||||
content of the <sgmltag class="starttag">head</sgmltag> is the part which
|
||||
does not follow the rules of the <acronym>XHTML</acronym> grammar.</para>
|
||||
|
||||
<para>Below this line <command>xmllint</command> will show you
|
||||
the line where the error has been found and will also mark the
|
||||
exact character position with a ^ sign.</para>
|
||||
<para>Then <command>xmllint</command> shows
|
||||
the line where the error was found and marks the
|
||||
exact character position with a <literal>^</literal> sign.</para>
|
||||
</step>
|
||||
|
||||
<step>
|
||||
<para>Put the <sgmltag>title</sgmltag> element back
|
||||
in.</para>
|
||||
<para>Replace the <sgmltag>title</sgmltag> element.</para>
|
||||
</step>
|
||||
</procedure>
|
||||
</sect2>
|
||||
|
@ -497,17 +486,15 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<sect1 id="xml-primer-doctype-declaration">
|
||||
<title>The DOCTYPE Declaration</title>
|
||||
|
||||
<para>The beginning of each document that you write may specify
|
||||
the name of the DTD that the document conforms to in case you use
|
||||
the DTD specification language. Other specification languages, like
|
||||
XML Schema and RELAX NG are not referred in the source document.
|
||||
This DOCTYPE declaration serves the XML parsers so that they can
|
||||
determine the DTD and ensure that the document does conform to it.</para>
|
||||
<para>The beginning of each document can specify
|
||||
the name of the <acronym>DTD</acronym> to which the document conforms.
|
||||
This DOCTYPE declaration is used by <acronym>XML</acronym> parsers to
|
||||
identify the <acronym>DTD</acronym> and ensure that the document does conform to it.</para>
|
||||
|
||||
<para>A typical declaration for a document written to conform with
|
||||
version 1.0 of the XHTML DTD looks like this:</para>
|
||||
version 1.0 of the <acronym>XHTML</acronym> <acronym>DTD</acronym> looks like this:</para>
|
||||
|
||||
<programlisting><![CDATA[<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">]]></programlisting>
|
||||
<programlisting><sgmltag class="starttag">!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"</sgmltag></programlisting>
|
||||
|
||||
<para>That line contains a number of different components.</para>
|
||||
|
||||
|
@ -516,9 +503,8 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<term><literal><!</literal></term>
|
||||
|
||||
<listitem>
|
||||
<para>Is the <emphasis>indicator</emphasis> that indicates
|
||||
that this is an XML declaration. This line is declaring
|
||||
the document type.</para>
|
||||
<para>The <emphasis>indicator</emphasis> shows
|
||||
this is an <acronym>XML</acronym> declaration.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
|
@@ -526,7 +512,7 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<term><literal>DOCTYPE</literal></term>
|
||||
|
||||
<listitem>
|
||||
<para>Shows that this is an XML declaration for the
|
||||
<para>Shows that this is an <acronym>XML</acronym> declaration of the
|
||||
document type.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@@ -545,18 +531,18 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<term><literal>PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"</literal></term>
|
||||
|
||||
<listitem>
|
||||
<para>Lists the Formal Public Identifier (FPI)
|
||||
<para>Lists the Formal Public Identifier (<acronym>FPI</acronym>)
|
||||
<indexterm>
|
||||
<primary>Formal Public Identifier</primary>
|
||||
</indexterm>
|
||||
for the DTD that this document conforms to. Your XML
|
||||
parser will use this to find the correct DTD when
|
||||
for the <acronym>DTD</acronym> to which this document conforms. The <acronym>XML</acronym>
|
||||
parser uses this to find the correct <acronym>DTD</acronym> when
|
||||
processing this document.</para>
|
||||
|
||||
<para><literal>PUBLIC</literal> is not a part of the FPI,
|
||||
but indicates to the XML processor how to find the DTD
|
||||
referenced in the FPI. Other ways of telling the XML
|
||||
parser how to find the DTD are shown <link
|
||||
<para><literal>PUBLIC</literal> is not a part of the <acronym>FPI</acronym>,
|
||||
but indicates to the <acronym>XML</acronym> processor how to find the <acronym>DTD</acronym>
|
||||
referenced in the <acronym>FPI</acronym>. Other ways of telling the <acronym>XML</acronym>
|
||||
parser how to find the <acronym>DTD</acronym> are shown <link
|
||||
linkend="xml-primer-fpi-alternatives">later</link>.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@@ -565,7 +551,7 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<term><literal>"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"</literal></term>
|
||||
|
||||
<listitem>
|
||||
<para>A local filename or an URL to find the DTD.</para>
|
||||
<para>A local filename or a <acronym>URL</acronym> to find the <acronym>DTD</acronym>.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
|
@@ -573,25 +559,24 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<term><literal>></literal></term>
|
||||
|
||||
<listitem>
|
||||
<para>Returns to the document.</para>
|
||||
<para>Ends the declaration and returns to the document.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
|
||||
<sect2>
|
||||
<title>Formal Public Identifiers (FPIs)
|
||||
<title>Formal Public Identifiers (<acronym>FPI</acronym>s)
|
||||
<indexterm significance="preferred">
|
||||
<primary>Formal Public Identifier</primary>
|
||||
</indexterm></title>
|
||||
|
||||
<note>
|
||||
<para>You do not need to know this, but it is useful
|
||||
background, and might help you debug problems when your XML
|
||||
processor can not locate the DTD you are using.</para>
|
||||
<para>It is not necessary to know this, but it is useful
|
||||
background, and might help debug problems when the <acronym>XML</acronym>
|
||||
processor cannot locate the <acronym>DTD</acronym>.</para>
|
||||
</note>
|
||||
|
||||
<para>FPIs must follow a specific syntax. This syntax is as
|
||||
follows:</para>
|
||||
<para><acronym>FPI</acronym>s must follow a specific syntax:</para>
|
||||
|
||||
<programlisting>"<replaceable>Owner</replaceable>//<replaceable>Keyword</replaceable> <replaceable>Description</replaceable>//<replaceable>Language</replaceable>"</programlisting>
|
||||
|
||||
|
@@ -600,16 +585,16 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<term><replaceable>Owner</replaceable></term>
|
||||
|
||||
<listitem>
|
||||
<para>This indicates the owner of the FPI.</para>
|
||||
<para>The owner of the <acronym>FPI</acronym>.</para>
|
||||
|
||||
<para>If this string starts with <quote>ISO</quote> then
|
||||
this is an ISO owned FPI. For example, the FPI
|
||||
<para>The beginning of the string identifies the owner
|
||||
of the <acronym>FPI</acronym>. For example, the <acronym>FPI</acronym>
|
||||
<literal>"ISO 8879:1986//ENTITIES Greek
|
||||
Symbols//EN"</literal> lists
|
||||
<literal>ISO 8879:1986</literal> as being the owner for
|
||||
the set of entities for Greek symbols. ISO 8879:1986 is
|
||||
the ISO number for the SGML standard, the predecessor
|
||||
(and a superset) of XML.</para>
|
||||
the set of entities for Greek symbols. <acronym>ISO</acronym> 8879:1986 is
|
||||
the International Organization for Standardization (<acronym>ISO</acronym>) number for the <acronym>SGML</acronym> standard, the predecessor
|
||||
(and a superset) of <acronym>XML</acronym>.</para>
|
||||
|
||||
<para>Otherwise, this string will either look like
|
||||
<literal>-//<replaceable>Owner</replaceable></literal>
|
||||
|
@@ -620,21 +605,21 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
|
||||
<para>If the string starts with <literal>-</literal> then
|
||||
the owner information is unregistered, with a
|
||||
<literal>+</literal> it identifies it as being
|
||||
<literal>+</literal> identifying it as
|
||||
registered.</para>
|
||||
|
||||
<para>ISO 9070:1991 defines how registered names are
|
||||
generated; it might be derived from the number of an ISO
|
||||
publication, an ISBN code, or an organization code
|
||||
assigned according to ISO 6523. In addition, a
|
||||
<para><acronym>ISO</acronym> 9070:1991 defines how registered names are
|
||||
generated. It might be derived from the number of an <acronym>ISO</acronym>
|
||||
publication, an <acronym>ISBN</acronym> code, or an organization code
|
||||
assigned according to <acronym>ISO</acronym> 6523. Additionally, a
|
||||
registration authority could be created in order to
|
||||
assign registered names. The ISO council delegated this
|
||||
assign registered names. The <acronym>ISO</acronym> council delegated this
|
||||
to the American National Standards Institute
|
||||
(ANSI).</para>
|
||||
(<acronym>ANSI</acronym>).</para>
|
||||
|
||||
<para>Because the FreeBSD Project has not been registered
|
||||
the owner string is <literal>-//FreeBSD</literal>. And
|
||||
as you can see, the W3C are not a registered owner
|
||||
<para>Because the &os; Project has not been registered,
|
||||
the owner string is <literal>-//&os;</literal>. As
|
||||
seen in the example, the <acronym>W3C</acronym> are not a registered owner
|
||||
either.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@@ -648,10 +633,10 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
keywords are <literal>DTD</literal>,
|
||||
<literal>ELEMENT</literal>, <literal>ENTITIES</literal>,
|
||||
and <literal>TEXT</literal>. <literal>DTD</literal> is
|
||||
used only for DTD files, <literal>ELEMENT</literal> is
|
||||
usually used for DTD fragments that contain only entity
|
||||
used only for <acronym>DTD</acronym> files, <literal>ELEMENT</literal> is
|
||||
usually used for <acronym>DTD</acronym> fragments that contain only entity
|
||||
or element declarations. <literal>TEXT</literal> is
|
||||
used for XML content (text and tags).</para>
|
||||
used for <acronym>XML</acronym> content (text and tags).</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
|
@@ -659,10 +644,10 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<term><replaceable>Description</replaceable></term>
|
||||
|
||||
<listitem>
|
||||
<para>Any description you want to supply for the contents
|
||||
<para>Any description can be given for the contents
|
||||
of this file. This may include version numbers or any
|
||||
short text that is meaningful to you and unique for the
|
||||
XML system.</para>
|
||||
short text that is meaningful and unique for the
|
||||
<acronym>XML</acronym> system.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
|
@@ -670,7 +655,7 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<term><replaceable>Language</replaceable></term>
|
||||
|
||||
<listitem>
|
||||
<para>This is an ISO two-character code that identifies
|
||||
<para>An <acronym>ISO</acronym> two-character code that identifies
|
||||
the native language for the file. <literal>EN</literal>
|
||||
is used for English.</para>
|
||||
</listitem>
|
||||
|
@@ -680,48 +665,45 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
<sect3>
|
||||
<title><filename>catalog</filename> Files</title>
|
||||
|
||||
<para>If you use the syntax above and process this document
|
||||
using an XML processor, the processor will need to have
|
||||
some way of turning the FPI into the name of the file on
|
||||
your computer that contains the DTD.</para>
|
||||
|
||||
<para>In order to do this it can use a catalog file. A
|
||||
<para>With the syntax above,
|
||||
an <acronym>XML</acronym> processor needs to have
|
||||
some way of turning the <acronym>FPI</acronym> into the name of the file
|
||||
containing the <acronym>DTD</acronym>. A
|
||||
catalog file (typically called <filename>catalog</filename>)
|
||||
contains lines that map FPIs to filenames. For example, if
|
||||
contains lines that map <acronym>FPI</acronym>s to filenames. For example, if
|
||||
the catalog file contained the line:</para>
|
||||
|
||||
<!-- XXX: mention XML catalog or maybe replace this totally and only cover XML catalog -->
|
||||
|
||||
<programlisting>PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "1.0/transitional.dtd"</programlisting>
|
||||
|
||||
<para>The XML processor would know to look up the DTD from
|
||||
<filename>transitional.dtd</filename> in the
|
||||
<filename>1.0</filename> subdirectory of whichever directory
|
||||
held the <filename>catalog</filename> file that contained
|
||||
that line.</para>
|
||||
<para>The <acronym>XML</acronym> processor knows that the <acronym>DTD</acronym> is
|
||||
called <filename>transitional.dtd</filename> in the
|
||||
<filename>1.0</filename> subdirectory of the directory that
|
||||
held the <filename>catalog</filename> file.</para>
|
||||
|
||||
<para>Look at the contents of
|
||||
<para>Examine the contents of
|
||||
<filename>/usr/local/share/xml/dtd/xhtml/catalog.xml</filename>.
|
||||
This is the catalog file for the XHTML DTDs that will have
|
||||
been installed as part of the <filename
|
||||
This is the catalog file for the <acronym>XHTML</acronym> <acronym>DTD</acronym>s that was
|
||||
installed as part of the <filename
|
||||
role="package">textproc/docproj</filename> port.</para>
|
||||
</sect3>
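The catalog lookup described above can be sketched in a few lines of Python. This is only an illustration, not how any particular XML processor implements it: the `parse_catalog` helper and the `/example/dtd/xhtml` directory are hypothetical, and only `PUBLIC` entries of the form shown in the example line are handled.

```python
import posixpath
import re

# Matches catalog entries of the form: PUBLIC "fpi" "relative/or/absolute/path"
PUBLIC_ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]+)"\s+"([^"]+)"\s*$')


def parse_catalog(text, catalog_dir):
    """Map each FPI in a catalog to a path resolved against catalog_dir.

    Hypothetical helper: real processors support more entry types.
    """
    mapping = {}
    for line in text.splitlines():
        match = PUBLIC_ENTRY.match(line)
        if match:
            fpi, system_id = match.groups()
            # Relative paths are taken relative to the directory
            # holding the catalog file.
            mapping[fpi] = posixpath.normpath(
                posixpath.join(catalog_dir, system_id)
            )
    return mapping


# The catalog line from the text; the directory is an assumption.
catalog_text = (
    'PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "1.0/transitional.dtd"\n'
)
entries = parse_catalog(catalog_text, "/example/dtd/xhtml")
print(entries["-//W3C//DTD XHTML 1.0 Transitional//EN"])
```

With the assumed directory, the FPI resolves to `transitional.dtd` inside the `1.0` subdirectory of the directory that holds the catalog.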
|
||||
|
||||
<sect3>
|
||||
<title><envar>SGML_CATALOG_FILES</envar></title>
|
||||
|
||||
<para>In order to locate a <filename>catalog</filename> file,
|
||||
your XML processor will need to know where to look. Many
|
||||
of them feature command line parameters for specifying the
|
||||
<para>To locate a <filename>catalog</filename> file,
|
||||
the <acronym>XML</acronym> processor must know where to look. Many
|
||||
feature command line parameters for specifying the
|
||||
path to one or more catalogs.</para>
|
||||
|
||||
<para>In addition, you can set
|
||||
<envar>SGML_CATALOG_FILES</envar> to point to the files.
|
||||
This environment variable should consist of a
|
||||
<para>In addition,
|
||||
<envar>SGML_CATALOG_FILES</envar> can be set to point to the files.
|
||||
This environment variable consists of a
|
||||
colon-separated list of catalog files (including their full
|
||||
path).</para>
|
||||
|
||||
<para>Typically, you will want to include the following
|
||||
<para>Typically, the list includes these
|
||||
files:</para>
|
||||
|
||||
<itemizedlist>
|
||||
|
@ -742,33 +724,30 @@ example.xml:5: element head: validity error : Element head content does not foll
|
|||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>You should <link linkend="xml-primer-envars">already
|
||||
have done this</link>.</para>
|
||||
<para>This was done <link linkend="xml-primer-envars">earlier</link>.</para>
|
||||
</sect3>
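As a rough sketch of how a tool might read this variable, the following Python splits SGML_CATALOG_FILES on colons. The `catalog_paths` function name and the example paths are assumptions made for illustration; the actual catalog locations depend on the installation.

```python
import os


def catalog_paths(environ=None):
    """Return the catalog files listed in SGML_CATALOG_FILES, in order.

    Hypothetical helper: an unset or empty variable yields an empty list.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("SGML_CATALOG_FILES", "")
    # The variable is a colon-separated list of full paths.
    return [path for path in value.split(":") if path]


# Example paths are placeholders, not real installation paths.
example_env = {
    "SGML_CATALOG_FILES": "/example/a/catalog:/example/b/catalog",
}
print(catalog_paths(example_env))
```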
|
||||
</sect2>
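The FPI syntax described in this section (owner, keyword, description, and language separated by `//`) can be taken apart with a short Python sketch. The `FPI` tuple and `parse_fpi` function are hypothetical names, and the parser assumes well-formed input in exactly the shapes shown in this section; it does not validate against ISO 9070.

```python
from typing import NamedTuple, Optional


class FPI(NamedTuple):
    owner: str
    registered: Optional[bool]  # True for "+", False for "-", None if no prefix
    keyword: str
    description: str
    language: str


def parse_fpi(fpi):
    """Split an FPI into its components (illustrative, minimal checking)."""
    parts = fpi.split("//")
    if parts[0] in ("-", "+"):
        # "-//Owner//..." is unregistered, "+//Owner//..." is registered.
        if len(parts) != 4:
            raise ValueError("expected prefix//owner//text//language")
        registered = parts[0] == "+"
        owner, text, language = parts[1:]
    else:
        # No prefix: the owner string comes first, as in the ISO example.
        if len(parts) != 3:
            raise ValueError("expected owner//text//language")
        registered = None
        owner, text, language = parts
    # The first word of the middle field is the keyword (DTD, ENTITIES, ...).
    keyword, _, description = text.partition(" ")
    return FPI(owner, registered, keyword, description, language)


print(parse_fpi("-//W3C//DTD XHTML 1.0 Transitional//EN"))
print(parse_fpi("ISO 8879:1986//ENTITIES Greek Symbols//EN"))
```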
|
||||
|
||||
<sect2 id="xml-primer-fpi-alternatives">
|
||||
<title>Alternatives to FPIs</title>
|
||||
<title>Alternatives to <acronym>FPI</acronym>s</title>
|
||||
|
||||
<para>Instead of using an FPI to indicate the DTD that the
|
||||
document conforms to (and therefore, which file on the system
|
||||
contains the DTD) you can explicitly specify the name of the
|
||||
file.</para>
|
||||
<para>Instead of using an <acronym>FPI</acronym> to indicate the <acronym>DTD</acronym> to which
|
||||
the document conforms (and therefore, which file on the system
|
||||
contains the <acronym>DTD</acronym>), the filename can be explicitly specified.</para>
|
||||
|
||||
<para>The syntax for this is slightly different:</para>
|
||||
<para>The syntax is slightly different:</para>
|
||||
|
||||
<programlisting><![CDATA[<!DOCTYPE html SYSTEM "/path/to/file.dtd">]]></programlisting>
|
||||
<programlisting><sgmltag class="starttag">!DOCTYPE html SYSTEM "/path/to/file.dtd"</sgmltag></programlisting>
|
||||
|
||||
<para>The <literal>SYSTEM</literal> keyword indicates that the
|
||||
XML processor should locate the DTD in a system specific
|
||||
fashion. This typically (but not always) means the DTD will
|
||||
<acronym>XML</acronym> processor should locate the <acronym>DTD</acronym> in a system-specific
|
||||
fashion. This typically (but not always) means the <acronym>DTD</acronym> will
|
||||
be provided as a filename.</para>
|
||||
|
||||
<para>Using FPIs is preferred for reasons of portability. You
|
||||
do not want to have to ship a copy of the DTD around with your
|
||||
document, and if you used the <literal>SYSTEM</literal>
|
||||
identifier then everyone would need to keep their DTDs in the
|
||||
same place.</para>
|
||||
<para>Using <acronym>FPI</acronym>s is preferred for reasons of portability.
|
||||
If the <literal>SYSTEM</literal>
|
||||
identifier is used, then the <acronym>DTD</acronym> must be provided and kept in the same location
|
||||
for everyone.</para>
|
||||
</sect2>
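To make the difference between the two forms concrete, here is a small Python sketch that recognizes both a `PUBLIC` and a `SYSTEM` DOCTYPE declaration. The regular expression and the `describe_doctype` name are assumptions for illustration; they cover only double-quoted declarations like the ones shown in this section, not the full XML grammar.

```python
import re

# Handles only the two double-quoted forms shown in the text.
DOCTYPE = re.compile(
    r'<!DOCTYPE\s+(?P<root>\S+)\s+'
    r'(?:PUBLIC\s+"(?P<fpi>[^"]*)"\s+"(?P<public_uri>[^"]*)"'
    r'|SYSTEM\s+"(?P<system_uri>[^"]*)")\s*>'
)


def describe_doctype(declaration):
    """Return the root element, FPI (if any), and system identifier."""
    match = DOCTYPE.search(declaration)
    if match is None:
        return None
    if match["fpi"] is not None:
        system_id = match["public_uri"]
    else:
        system_id = match["system_uri"]
    return {"root": match["root"], "fpi": match["fpi"], "system_id": system_id}


public_form = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)
system_form = '<!DOCTYPE html SYSTEM "/path/to/file.dtd">'

print(describe_doctype(public_form))
print(describe_doctype(system_form))
```

The `SYSTEM` form carries no FPI, so a processor has only the filename or URL to go on, which is why the `PUBLIC` form is the more portable of the two.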
|
||||
</sect1>
|
||||
|
||||
|
|