mirror of git://git.code.sf.net/p/zsh/code (synced 2025-01-20 11:51:24 +01:00)
24811: update introductory multibyte documentation
parent f125ca3293
commit b1b941c30b
3 changed files with 49 additions and 42 deletions
@@ -1,5 +1,8 @@
 2008-04-14 Peter Stephenson <pws@csr.com>
 
+* 24811: Doc/Zsh/roadmap.yo, Etc/FAQ.yo: update introductory
+documentation on multibyte support.
+
 * 24810 (slightly edited to move added text later):
 Src/Zle/zle_tricky.c: after unmetafying the command line ensure
 we're not on a combining character.
@@ -44,6 +44,13 @@ variables (referred to in the documentation as parameters) tt(HISTFILE),
 tt(HISTSIZE) and tt(SAVEHIST) in ifzman(zmanref(zshparam))\
 ifnzman(noderef(Parameters Used By The Shell)).
 
+The shell now supports the UTF-8 character set (and also others if
+supported by the operating system). This is (mostly) handled transparently
+by the shell, but the degree of support in terminal emulators is variable.
+There is some discussion of this in the shell FAQ,
+http://zsh.dotsrc.org/FAQ/ . Note in particular that for combining
+characters to be handled the option tt(COMBINING_CHARS) needs to be set.
+
 subsect(Completion)
 
 Completion is a feature present in many shells. It allows the user to
81
Etc/FAQ.yo
@@ -126,11 +126,11 @@ Chapter 4: The mysteries of completion
 4.5. How do I get started with programmable completion?
 4.6. Suppose I want to complete all files during a special completion?
 
-Chapter 5: Multibyte input
+Chapter 5: Multibyte input and output
 
 5.1. What is multibyte input?
-5.2. How does zsh handle multibyte input?
-5.3. How do I ensure multibyte input works on my system?
+5.2. How does zsh handle multibyte input and output?
+5.3. How do I ensure multibyte input and output work on my system?
 5.4. How can I input characters that aren't on my keyboard?
 
 Chapter 6: The future of zsh
@@ -1961,7 +1961,7 @@ sect(Suppose I want to complete all files during a special completion?)
 such as expansion or approximate completion.
 
 
-chapter(Multibyte input)
+chapter(Multibyte input and output)
 label(c5)
 
 sect(What is multibyte input?)
@@ -2012,7 +2012,7 @@ sect(What is multibyte input?)
 in those formats.)
 
 
-sect(How does zsh handle multibyte input?)
+sect(How does zsh handle multibyte input and output?)
 
 Until version 4.3, zsh didn't handle multibyte input properly at all.
 Each octet in a multibyte character would look to the shell like a
@@ -2021,50 +2021,44 @@ sect(How does zsh handle multibyte input?)
 cause all sorts of odd effects. (It was possible to edit in zsh using
 single-byte extensions of ASCII such as the ISO 8859 family, however.)
 
-From version 4.3, multibyte input is handled in the line editor if zsh
-has been compiled with the appropriate definitions. This will happen
-automatically if the compiler defines __STDC_ISO_10646__, which is true
-for many recent GNU-based systems. On other systems you must configure
-zsh with the argument --enable-multibyte to configure. Explicit use of
---enable-multibyte should work on many other recent UNIX systems; if it
-works on yours, and that's not mentioned in the shell documentation,
-please report this to zsh-workers@sunsite.dk, and if it doesn't but you
-can work out why not we'd also be interested in hearing.
+From version 4.3.4, multibyte input is handled in the line editor if zsh
+has been compiled with the appropriate definitions, and is automatically
+activated. This is indicated by the option tt(MULTIBYTE), which is
+set by default on shells that support multibyte mode. Hence you
+can test this with a standard option test: `tt([[ -o multibyte ]])'.
 
-(The reason for the test for __STDC_ISO_10646__ is that its presence
-happens to indicate that the required library support is likely to be
-present, short-circuiting a large number of configuration tests. This
-isn't strictly guaranteed, since the definition indicates the rather more
-limited fact that the wide character representation used internally by
-the shell is Unicode. However, in practice such systems provide the
-right level of support for zsh to use. It would be better to test
-individually for the library features the shell needs; unfortunately
-there are a lot of them.)
+The tt(MULTIBYTE) option affects the entire shell: parameter expansion,
+pattern matching, etc. count valid multibyte character strings as a
+single character. You can unset the option locally in a function to
+revert to single-byte operation.
 
-You can test if multibyte handling is compiled into your version of the
-shell by running:
-verb(
-(bindkey -m)
-)
-which should output a warning:
-verb(
-bindkey: warning: `bindkey -m' disables multibyte support
-)
-If it doesn't, you don't have multibyte support in your shell. The
-parentheses are there to run the command in a subshell, which protects
-your interactive shell from the effects being warned about.
+Note that if the shell is emulating a Bourne shell the tt(MULTIBYTE)
+option is unset by default. This allows various POSIX modes to
+work normally (POSIX does not deal with multibyte characters). If
+you use a "sh" or "ksh" emulation interactively you should probably
+set the tt(MULTIBYTE) option.
 
-Multibyte strings are not yet handled anywhere else in the shell. This
-means, for example, patterns treat multibyte characters as a set of single
-octets and the ${#var} syntax counts octets, not characters. There will
-probably be new syntax to ensure that zsh can work both in its traditional
-way as well as when interpreting multibyte characters.
+The other option that affects multibyte support is tt(COMBINING_CHARS),
+new in version 4.3.7. When this is set, any zero-length punctuation
+characters that follow an alphanumeric character (the base character) are
+assumed to be modifications (accents etc.) to the base character and to
+be displayed within the same screen area as the base character. As not
+all terminals handle this, even if they correctly display the base
+multibyte character, this option is not on by default. The KDE terminal
+emulator tt(konsole) is known to handle combining characters.
+
+The tt(COMBINING_CHARS) option only affects output; combining characters
+may always be input, but when the option is off will be displayed
+specially. By default this is as a code point (the index of the
+character in the character set) between angle brackets, usually
+in inverse video. Highlighting of such special characters can
+be modified using the new array parameter tt(zle_highlight).
 
 
-sect(How do I ensure multibyte input works on my system?)
+sect(How do I ensure multibyte input and output work on my system?)
 
 Once you have a version of zsh with multibyte support, you need to
-ensure the envivronment is correct. We'll assume you're using UTF-8.
+ensure the environment is correct. We'll assume you're using UTF-8.
 Many modern systems may come set up correctly already. Try one of
 the editing widgets described in the next section to see.
 
@@ -2163,6 +2157,9 @@ url(http://www.unicode.org/charts/)(http://www.unicode.org/charts/).
 however, using UTF-8 massively extends the number of valid characters
 that can be produced.
 
+See also url(http://www.cl.cam.ac.uk/~mgk25/unicode.html#input)(http://www.cl.cam.ac.uk/~mgk25/unicode.html#input)
+for general information on entering Unicode characters from a keyboard.
+
 
 chapter(The future of zsh)
 