mirror of
git://git.code.sf.net/p/zsh/code
synced 2025-01-19 23:41:31 +01:00
89 lines
3.3 KiB
Text
89 lines
3.3 KiB
Text
COMMENT(!MOD!zsh/pcre
|
|
Interface to the PCRE library.
|
|
!MOD!)
|
|
cindex(regular expressions, perl-compatible)
|
|
The tt(zsh/pcre) module makes some commands available as builtins:
|
|
|
|
startitem()
|
|
findex(pcre_compile)
|
|
item(tt(pcre_compile) [ tt(-aimxs) ] var(PCRE))(
|
|
Compiles a perl-compatible regular expression.
|
|
|
|
Option tt(-a) will force the pattern to be anchored.
|
|
Option tt(-i) will compile a case-insensitive pattern.
|
|
Option tt(-m) will compile a multi-line pattern; that is,
|
|
tt(^) and tt($) will match newlines within the pattern.
|
|
Option tt(-x) will compile an extended pattern, wherein
|
|
whitespace and tt(#) comments are ignored.
|
|
Option tt(-s) makes the dot metacharacter match all characters,
|
|
including those that indicate newline.
|
|
)
|
|
findex(pcre_study)
|
|
item(tt(pcre_study))(
|
|
Studies the previously-compiled PCRE which may result in faster
|
|
matching.
|
|
)
|
|
findex(pcre_match)
|
|
item(tt(pcre_match) [ tt(-v) var(var) ] [ tt(-a) var(arr) ] \
|
|
[ tt(-n) var(offset) ] [ tt(-b) ] var(string))(
|
|
Returns successfully if tt(string) matches the previously-compiled
|
|
PCRE.
|
|
|
|
Upon successful match,
|
|
if the expression captures substrings within parentheses,
|
|
tt(pcre_match) will set the array tt(match) to those
|
|
substrings, unless the tt(-a) option is given, in which
|
|
case it will set the array var(arr). Similarly, the variable
|
|
tt(MATCH) will be set to the entire matched portion of the
|
|
string, unless the tt(-v) option is given, in which case the variable
|
|
var(var) will be set.
|
|
No variables are altered if there is no successful match.
|
|
A tt(-n) option starts searching for a match from the
|
|
byte var(offset) position in var(string). If the tt(-b) option is given,
|
|
the variable tt(ZPCRE_OP) will be set to an offset pair string,
|
|
representing the byte offset positions of the entire matched portion
|
|
within the var(string). For example, a tt(ZPCRE_OP) set to "32 45" indicates
|
|
that the matched portion began on byte offset 32 and ended on byte offset 44.
|
|
Here, byte offset position 45 is the position directly after the matched
|
|
portion. Keep in mind that the byte position isn't necessarily the same
|
|
as the character position when UTF-8 characters are involved.
|
|
Consequently, the byte offset positions are only to be relied on in the
|
|
context of using them for subsequent searches on var(string), using an offset
|
|
position as an argument to the tt(-n) option. This is mostly
|
|
used to implement the "find all non-overlapping matches" functionality.
|
|
|
|
A simple example of "find all non-overlapping matches":
|
|
|
|
example(string="The following zip codes: 78884 90210 99513"
|
|
pcre_compile -m "\d{5}"
|
|
accum=()
|
|
pcre_match -b -- $string
|
|
while [[ $? -eq 0 ]] do
|
|
b=($=ZPCRE_OP)
|
|
accum+=$MATCH
|
|
pcre_match -b -n $b[2] -- $string
|
|
done
|
|
print -l $accum)
|
|
)
|
|
enditem()
|
|
|
|
The tt(zsh/pcre) module makes available the following test condition:
|
|
|
|
startitem()
|
|
findex(pcre-match)
|
|
item(var(expr) tt(-pcre-match) var(pcre))(
|
|
Matches a string against a perl-compatible regular expression.
|
|
|
|
For example,
|
|
|
|
example([[ "$text" -pcre-match ^d+$ ]] &&
|
|
print text variable contains only "d's".)
|
|
|
|
pindex(REMATCH_PCRE)
|
|
pindex(NO_CASE_MATCH)
|
|
If the tt(REMATCH_PCRE) option is set, the tt(=~) operator is equivalent to
|
|
tt(-pcre-match), and the tt(NO_CASE_MATCH) option may be used. Note that
|
|
tt(NO_CASE_MATCH) never applies to the tt(pcre_match) builtin, instead use
|
|
the tt(-i) switch of tt(pcre_compile).
|
|
)
|
|
enditem()
|