PCRE (Perl Compatible Regular Expression) text processor (cli)
A small CLI program that replaces or extracts text read from stdin, using
patterns given on the command line, and writes the result to stdout. The
patterns use the syntax known from Perl/PCRE.
(This application is basically the sample application of the class published
here. It links statically against libpcre/libpcrecpp and dynamically against
libc/libstdc++.)
Binaries (download)
Help text
NAME
pcref
SYNOPSIS
pcref [-h|--help|-v] '<pattern1>' ['<pattern2>'] [...]
DESCRIPTION
Perl Compatible Regular Expression text Filter.
The program performs match-extract / search-replace operations with
patterns as known from PCRE (or Perl: stdout = stdin =~ <pattern>), where
one of [`/`, `|`, `#`] can be chosen as the pattern separator.
`pcref 'm/pattern/'` or `pcref '/pattern/'`: Prints the first match to
stdout.
`pcref 's/pattern/replace/'`: Replaces the first occurrence of the pattern
(all occurrences with the modifier `g`) with the replace text (accepted
subexpression references are `\1`, `\2`, etc. and `$1`, `$2`, etc.; both
forms have the same meaning).
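The argument format described above can be illustrated with a small parser. This is only a sketch in Python, not pcref's actual implementation: the names `Spec` and `parse_spec` are hypothetical, and the handling of escaped separators (kept verbatim) is an assumption.

```python
from typing import NamedTuple, Optional

SEPARATORS = "/|#"  # separators named in the help text


class Spec(NamedTuple):
    op: str                 # "m" = match/extract, "s" = search/replace
    pattern: str
    replace: Optional[str]  # None when no replace part was given
    modifiers: str


def parse_spec(arg: str) -> Spec:
    """Split an argument such as 's/a/b/gi' into its parts (hypothetical helper)."""
    op = "m"  # a bare '/pattern/' is treated like 'm/pattern/'
    if len(arg) >= 2 and arg[0] in "ms" and arg[1] in SEPARATORS:
        op, arg = arg[0], arg[1:]
    if not arg or arg[0] not in SEPARATORS:
        raise ValueError("expected one of / | # as pattern separator")
    sep = arg[0]

    parts, current, i = [], "", 1
    while i < len(arg):
        if arg[i] == "\\" and i + 1 < len(arg):
            current += arg[i:i + 2]   # assumption: keep escaped characters verbatim
            i += 2
        elif arg[i] == sep:
            parts.append(current)     # an unescaped separator ends a part
            current = ""
            i += 1
        else:
            current += arg[i]
            i += 1
    parts.append(current)             # whatever follows the last separator

    if len(parts) == 2 and op == "m":
        return Spec(op, parts[0], None, parts[1])
    if len(parts) == 3:
        return Spec(op, parts[0], parts[1], parts[2])
    raise ValueError("malformed pattern: " + arg)
```

For instance, `parse_spec("s/a/b/gi")` yields the operation `s`, the pattern `a`, the replace text `b` and the modifiers `gi`.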
Modifiers:
Modifiers are appended to the pattern as known from Perl / PCRE
(`pcref /pattern/modifiers` or `pcref /pattern/replace/modifiers`).
`i` Ignore case (as in Perl).
`x` Permit whitespace and comments in the pattern (as in Perl).
`m` Multi line: `^` and `$` match at the start/end of each line (as in Perl).
`s` `.` matches newlines as well (as in Perl).
`g` Replace not only first match, but all matches.
`$` `$` matches only at the very end of the text (otherwise `$` behaves as usual).
`!` Meaning of `*?` and `*` swapped (`*?` now consumes as much as possible).
`*` Disable parenthesis (subexpression) capturing.
`X` Extra (PCRE strict escape parsing).
`U` Disable UTF support.
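To get a feeling for the Perl-style modifiers `i`, `x`, `m` and `s`, the following sketch maps them onto the flags of Python's `re` module, which follows the same conventions for these four. The helper `to_flags` is hypothetical, and the pcref-specific modifiers (`g`, `$`, `!`, `*`, `X`, `U`) are deliberately ignored because `re` has no direct flag for them.

```python
import re

# Perl-style modifier letters and their counterparts in Python's re module.
FLAG_MAP = {
    "i": re.IGNORECASE,
    "x": re.VERBOSE,
    "m": re.MULTILINE,
    "s": re.DOTALL,
}


def to_flags(modifiers: str) -> int:
    """Combine the known modifier letters into one flag value (hypothetical helper)."""
    flags = 0
    for letter in modifiers:
        flags |= FLAG_MAP.get(letter, 0)  # unknown letters are skipped in this sketch
    return flags


text = "one\ntwo\n"

# Without `m`, ^ only matches at the very start of the text.
starts_plain = re.findall(r"^\w+", text, to_flags(""))
# With `m`, ^ also matches after every newline.
starts_multi = re.findall(r"^\w+", text, to_flags("m"))

# Without `s`, the dot does not match a newline; with `s` it does.
dot_plain = re.search(r"one.two", text, to_flags(""))
dot_all = re.search(r"one.two", text, to_flags("s"))
```

Here `starts_plain` contains only `one`, `starts_multi` contains `one` and `two`, `dot_plain` is `None`, and `dot_all` is a match.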
Sequential execution:
You can specify multiple expressions as command line arguments; they are
processed sequentially, each operating on the result of the previous one,
and the final result is printed to stdout. E.g.
echo 'ABC DEF YES' | pcref 's/ABC[\s]?/X/' '/(\w+)\s(\w+)/$1=$2/'
(1st pattern --> XDEF YES) (2nd pattern --> XDEF=YES)
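The two steps of this example can be retraced with Python's `re` module. This is an analogous sketch, not pcref itself; the replace text is written with `\1`/`\2` because Python's templates do not accept the `$1` spelling.

```python
import re

text = "ABC DEF YES"

# Step 1: s/ABC[\s]?/X/  (no `g` modifier, so only the first match is replaced)
step1 = re.sub(r"ABC[\s]?", "X", text, count=1)

# Step 2: /(\w+)\s(\w+)/$1=$2/  (match with a replace part, i.e. an extract)
match = re.search(r"(\w+)\s(\w+)", step1)
step2 = match.expand(r"\1=\2") if match else ""

print(step1)  # XDEF YES
print(step2)  # XDEF=YES
```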
Examples:
- Remove trailing spaces of each line:
pcref 's/^(.*?)[\s]+(\n|$)/$1$2/gm'
- Extract body from HTML:
pcref '|< [\s]* body .*? > (.*?) <[\s]* / [\s]* body |$1|smix'
- Convert a section of an ini file to a JSON-like object:
pcref '/(.*)/\n$1\n/sm' \
'/.*? \n \[SECTION_NAME\] [\s]* (.*?) \n (\[|$) /$1/smix' \
's/^([\w]+) [\s]* = [\s]* (.*) ($|\n)/$1: "$2"/imgx' \
's#\n#, #imgx' \
'm|(.*)|{ $1 }|g'
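The ini example is the hardest one to follow, so here is a step-by-step retrace in Python's `re` module. It is a sketch under assumptions: the sample input and the helper `extract` are made up for illustration, `$1` is written as `\1`, and the behaviour is that of Python's regex engine, which agrees with Perl/PCRE for the constructs used here. Note that the keys in the result are unquoted, so the output is an object literal rather than strict JSON.

```python
import re


def extract(pattern, template, text, flags=0):
    """Match once and rearrange the match via the template (hypothetical helper)."""
    m = re.search(pattern, text, flags)
    return m.expand(template) if m else ""


# Made-up sample input for illustration.
ini = "[OTHER]\nx = 1\n[SECTION_NAME]\nname = demo\nport = 8080\n[NEXT]\ny = 2\n"

SMIX = re.S | re.M | re.I | re.X
IMX = re.I | re.M | re.X

# 1) /(.*)/\n$1\n/sm : wrap the whole text in newlines
t = extract(r"(.*)", r"\n\1\n", ini, re.S | re.M)

# 2) extract the body of [SECTION_NAME] up to the next section or the end
t = extract(r".*? \n \[SECTION_NAME\] [\s]* (.*?) \n (\[|$) ", r"\1", t, SMIX)

# 3) turn every `key = value` line into `key: "value"` (global replace)
t = re.sub(r"^([\w]+) [\s]* = [\s]* (.*) ($|\n)", r'\1: "\2"', t, flags=IMX)

# 4) join the lines with ", " (global replace)
t = re.sub(r"\n", ", ", t, flags=IMX)

# 5) m|(.*)|{ $1 }|g : wrap the result in braces
result = extract(r"(.*)", r"{ \1 }", t)

print(result)  # { name: "demo", port: "8080" }
```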
Annotations:
- The replace function is non-global by default. Use the modifier `g`
to replace all matches.
- The match operation (optionally) takes a replace part to rearrange
the matched string using subexpressions (`m/<pattern>/replace/mods`),
so that the match operation is practically an extract operation.
- Replace returns the input string unchanged if the pattern does not match;
extract returns an empty string if the pattern does not match.
- The program always reads the complete text (to memory) before processing.
Hence, large texts cause a higher memory consumption.
- On error the program does not return any text to stdout.
- The program understands common escape sequences in the replace text:
\n, \r, \t, \v, \f, \a, \b.
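The difference between the two operations when nothing matches, and the equivalence of `$1` and `\1`, can be sketched as follows. This is Python for illustration only: `normalize_refs`, `replace_op` and `extract_op` are hypothetical names, and pcref's internals may differ.

```python
import re


def normalize_refs(replace):
    """Rewrite $1, $2, ... as backslash references so both spellings behave alike."""
    return re.sub(r"\$(\d+)", r"\\\1", replace)


def replace_op(pattern, replace, text, global_=False):
    """Search-replace: non-global by default, input returned unchanged on no match."""
    count = 0 if global_ else 1  # in re.sub, count=0 means "replace all"
    return re.sub(pattern, normalize_refs(replace), text, count=count)


def extract_op(pattern, text, replace=None):
    """Match-extract: empty string on no match, optional rearranging template."""
    m = re.search(pattern, text)
    if m is None:
        return ""
    if replace is None:
        return m.group(0)
    return m.expand(normalize_refs(replace))


no_match_replace = replace_op(r"x", "y", "abc")         # input comes back as-is
no_match_extract = extract_op(r"x", "abc")              # empty string
first_only = replace_op(r"a", "b", "aaa")               # only the first match
all_matches = replace_op(r"a", "b", "aaa", global_=True)
rearranged = extract_op(r"(\w+)\s(\w+)", "XDEF YES", "$1=$2")
```

With these inputs, `no_match_replace` is `abc`, `no_match_extract` is empty, `first_only` is `baa`, `all_matches` is `bbb`, and `rearranged` is `XDEF=YES`.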
ARGUMENTS
-h, --help Show this help
-v, --verbose Increased verbosity (outputs to stderr)
-vv, --debug High verbosity (debug information, if compiled in)
<pattern> A Perl compatible regex pattern as described above.
RETURN VALUES
returns 0 on success,
1 on error
SEE ALSO
perlre, pcregrep, grep, egrep, sed, awk, ex
pcref v1.0, stfwi; credits to libpcre author(s).