Diffstat (limited to 'doc')
-rw-r--r-- doc/manual.md | 254
1 files changed, 37 insertions, 217 deletions
diff --git a/doc/manual.md b/doc/manual.md
index aa3d7013..b57b964a 100644
--- a/doc/manual.md
+++ b/doc/manual.md
@@ -1618,8 +1618,9 @@ as running it.
### Requirements for Linux and BSD
-First, Linux and BSD systems need either the [GNU C compiler][] (*gcc*) or
-[Clang][] (*clang*), as well as [GNU Make][] (*make* or *gmake*). BSD users
+First, Linux and BSD systems need either the [GNU C compiler][] (*gcc*) version
+4.9 or later (circa early 2014) or [Clang][] (*clang*), [libstdc++][] 4.9 or
+later (circa early 2014), and [GNU Make][] (*make* or *gmake*). BSD users
additionally need to have [pkg-config][] and [libiconv][] installed. All of
these should be available for your distribution through a package manager. For
example, Ubuntu includes these tools in the "build-essential" package.
@@ -1639,6 +1640,7 @@ users _also_ need "libncursesw5-dev".)
[GNU C compiler]: http://gcc.gnu.org
[Clang]: http://clang.llvm.org/
+[libstdc++]: http://gcc.gnu.org
[GNU Make]: http://www.gnu.org/software/make/
[pkg-config]: http://www.freedesktop.org/wiki/Software/pkg-config/
[libiconv]: http://www.gnu.org/software/libiconv/
@@ -1649,15 +1651,18 @@ users _also_ need "libncursesw5-dev".)
Compiling Textadept on Windows is no longer supported. The preferred way to
compile for Windows is cross-compiling from Linux. In order to do so, you need
-[MinGW][] with the Windows header files. Your package manager should offer them.
+[MinGW][] or [mingw-w64][] version 4.9 or later with the Windows header files.
+Your package manager should offer them.
Note: compiling on Windows requires a C compiler that supports the C99 standard,
-the [GTK+ for Windows bundle][] (2.24 is recommended), and
+a C++ compiler that supports the C++11 standard, a C++ standard library that
+supports C++11, the [GTK+ for Windows bundle][] version 2.24, and
[libiconv for Windows][] (the "Developer files" and "Binaries" zip files). The
terminal (pdcurses) version requires my [win32curses bundle][] instead of GTK+
and libiconv.
[MinGW]: http://mingw.org
+[mingw-w64]: http://mingw-w64.org/
[GTK+ for Windows bundle]: http://www.gtk.org/download/windows.php
[libiconv for Windows]: http://gnuwin32.sourceforge.net/packages/libiconv.htm
[win32curses bundle]: download/win32curses.zip
@@ -1665,10 +1670,15 @@ and libiconv.
### Requirements for Mac OSX
Compiling Textadept on Mac OSX is no longer supported. The preferred way is
-cross-compiling from Linux. In order to do so, you need the
-[Apple Cross-compiler][] binaries.
+cross-compiling from Linux. In order to do so, you need to install an [OSX cross
+toolchain][] _with GCC_ version 4.9 or later. You will need to run
+`./build_binutils.sh` _before_ `./build_gcc.sh`. OSX SDK tarballs like
+*MacOSX10.5.tar.gz* can be found readily on the internet.
-[Apple Cross-compiler]: https://launchpad.net/~flosoft/+archive/cross-apple
+Note that building an OSX toolchain can easily take 30 minutes or more and
+ultimately consume nearly 3.5GB of disk space.
+
+[OSX cross toolchain]: https://github.com/tpoechtrager/osxcross
## Compiling
@@ -1729,12 +1739,13 @@ Similarly, `make curses` and `make curses install` installs the curses version.
When cross-compiling from within Linux, first make a note of your MinGW
compiler names. You may have to either modify the `CROSS` variable in the
-"win32" block of *src/Makefile* or append something like "CROSS=i486-mingw32-"
-when running `make`. After considering your MinGW compiler names, run
-`make win32-deps` or `make CROSS=i486-mingw32- win32-deps` to prepare the build
-environment followed by `make win32` or `make CROSS=i486-mingw32- win32` to
-build *../textadept.exe* and *../textadeptjit.exe*. Finally, copy the dll files
-from *src/win32gtk/bin/* to the directory containing the Textadept executables.
+"win32" block of *src/Makefile* or append something like
+"CROSS=i586-mingw32msvc-" when running `make`. After considering your MinGW
+compiler names, run `make win32-deps` or
+`make CROSS=i586-mingw32msvc- win32-deps` to prepare the build environment
+followed by `make win32` or `make CROSS=i586-mingw32msvc- win32` to build
+*../textadept.exe* and *../textadeptjit.exe*. Finally, copy the dll files from
+*src/win32gtk/bin/* to the directory containing the Textadept executables.
Similarly for the terminal version, run `make win32-curses` or its variant as
suggested above to build *../textadept-curses.exe* and
@@ -1869,210 +1880,12 @@ Textadept has a [mailing list][] and a [wiki][].
## Regular Expressions
-Textadept uses [TRE][] as its regular expression library. TRE is a "lightweight,
-robust, and efficient POSIX compliant regexp matching library".
-
-The following is from the [TRE Regexp Syntax][].
-
-This section describes the POSIX 1003.2 extended RE (ERE) syntax as implemented
-by TRE, and the TRE extensions to the ERE syntax. A simple Extended Backus-Naur
-Form (EBNF) style notation is used to describe the grammar.
+Textadept's regular expressions are based on the C++11 standard for ECMAScript.
+There are a number of references for this syntax on the internet including:
-**Alternation operator**
-
- extended-regexp ::= branch
- | extended-regexp "|" branch
-
-An extended regexp (ERE) is one or more branches, separated by `|`. An ERE
-matches anything that matches one or more of the branches.
-
-**Catenation of REs**
-
- branch ::= piece
- | branch piece
-
-A branch is one or more pieces concatenated. It matches a match for the first
-piece, followed by a match for the second piece, and so on.
-
- piece ::= atom
- | atom repeat-operator
- | atom approx-settings
-
-A piece is an atom possibly followed by a repeat operator or an expression
-controlling approximate matching parameters for the atom.
-
- atom ::= "(" extended-regexp ")"
- | bracket-expression
- | "."
- | assertion
- | literal
- | back-reference
- | "(?#" comment-text ")"
- | "(?" options ")" extended-regexp
- | "(?" options ":" extended-regexp ")"
-
-An atom is either an ERE enclosed in parenthesis, a bracket expression, a `.`
-(period), an assertion, or a literal.
-
-The dot (`.`) matches any single character.
-
-Comment-text can contain any characters except for a closing parenthesis `)`.
-The text in the comment is completely ignored by the regex parser and it used
-solely for readability purposes.
-
-**Repeat operators**
-
- repeat-operator ::= "*"
- | "+"
- | "?"
- | bound
- | "*?"
- | "+?"
- | "??"
- | bound ?
-
-An atom followed by `*` matches a sequence of 0 or more matches of the atom. `+`
-is similar to `*`, matching a sequence of 1 or more matches of the atom. An atom
-followed by `?` matches a sequence of 0 or 1 matches of the atom.
-
-A bound is one of the following, where *m* and *n* are unsigned decimal integers
-between 0 and `RE_DUP_MAX`:
-
-1. {*m*,*n*}
-2. {*m*,}
-3. {*m*}
-
-An atom followed by [1] matches a sequence of *m* through *n* (inclusive)
-matches of the atom. An atom followed by [2] matches a sequence of *m* or more
-matches of the atom. An atom followed by [3] matches a sequence of exactly *m*
-matches of the atom.
-
-Adding a `?` to a repeat operator makes the subexpression minimal, or
-non-greedy. Normally a repeated expression is greedy, that is, it matches as
-many characters as possible. A non-greedy subexpression matches as few
-characters as possible. Note that this does not (always) mean the same thing as
-matching as many or few repetitions as possible.
-
-**Bracket expressions**
-
- bracket-expression ::= "[" item+ "]"
- | "[^" item+ "]"
-
-A bracket expression specifies a set of characters by enclosing a nonempty list
-of items in brackets. Normally anything matching any item in the list is
-matched. If the list begins with `^` the meaning is negated; any character
-matching no item in the list is matched.
-
-An item is any of the following:
-
-* A single character, matching that character.
-* Two characters separated by `-`. This is shorthand for the full range of
- characters between those two (inclusive) in the collating sequence. For
- example, `[0-9]` in ASCII matches any decimal digit.
-* A collating element enclosed in `[.` and `.]`, matching the collating element.
- This can be used to include a literal `-` or a multi-character collating
- element in the list.
-* A collating element enclosed in `[=` and `=]` (an equivalence class), matching
- all collating elements with the same primary collation weight as that element,
- including the element itself.
-* The name of a character class enclosed in `[:` and `:]`, matching any
- character belonging to the class. The set of valid names depends on the
- `LC_CTYPE` category of the current locale, but the following names are valid
- in all locales:
- + `alnum` -- alphanumeric characters
- + `alpha` -- alphabetic characters
- + `blank` -- blank characters
- + `cntrl` -- control characters
- + `digit` -- decimal digits (0 through 9)
- + `graph` -- all printable characters except space
- + `lower` -- lower-case letters
- + `print` -- printable characters including space
- + `punct` -- printable characters not space or alphanumeric
- + `space` -- white-space characters
- + `upper` -- upper case letters
- + `xdigit` -- hexadecimal digits
-
-To include a literal `-` in the list, make it either the first or last item, the
-second endpoint of a range, or enclose it in `[.` and `.]` to make it a
-collating element. To include a literal `]` in the list, make it either the
-first item, the second endpoint of a range, or enclose it in `[.` and `.]`. To
-use a literal `-` as the first endpoint of a range, enclose it in `[.` and `.].`
-
-**Assertions**
-
- assertion ::= "^"
- | "$"
- | "\" assertion-character
-
-The expressions `^` and `$` are called "left anchor" and "right anchor",
-respectively. The left anchor matches the empty string at the beginning of the
-string. The right anchor matches the empty string at the end of the string.
-
-An assertion-character can be any of the following:
-
-* `<` -- Beginning of word
-* `>` -- End of word
-* `b` -- Word boundary
-* `B` -- Non-word boundary
-* `d` -- Digit character (equivalent to `[[:digit:]]`)
-* `D` -- Non-digit character (equivalent to `[^[:digit:]]`)
-* `s` -- Space character (equivalent to `[[:space:]]`)
-* `S` -- Non-space character (equivalent to `[^[:space:]]`)
-* `w` -- Word character (equivalent to `[[:alnum:]_]`)
-* `W` -- Non-word character (equivalent to `[^[:alnum:]_]`)
-
-**Literals**
-
- literal ::= ordinary-character
- | "\x" ["1"-"9" "a"-"f" "A"-"F"]{0,2}
- | "\x{" ["1"-"9" "a"-"f" "A"-"F"]* "}"
- | "\" character
-
-A literal is either an ordinary character (a character that has no other
-significance in the context), an 8 bit hexadecimal encoded character (e.g.
-`\x1B`), a wide hexadecimal encoded character (e.g. `\x{263a}`), or an escaped
-character. An escaped character is a `\` followed by any character, and matches
-that character. Escaping can be used to match characters which have a special
-meaning in regexp syntax. A `\` cannot be the last character of an ERE. Escaping
-also allows you to include a few non-printable characters in the regular
-expression. These special escape sequences include:
-
-* `\a` -- Bell character (ASCII code 7)
-* `\e` -- Escape character (ASCII code 27)
-* `\f` -- Form-feed character (ASCII code 12)
-* `\n` -- New-line/line-feed character (ASCII code 10)
-* `\r` -- Carriage return character (ASCII code 13)
-* `\t` -- Horizontal tab character (ASCII code 9)
-
-An ordinary character is just a single character with no other significance, and
-matches that character. A `{` followed by something else than a digit is
-considered an ordinary character.
-
-**Back references**
-
- back-reference ::= "\" ["1"-"9"]
-
-A back reference is a backslash followed by a single non-zero decimal digit *d*.
-It matches the same sequence of characters matched by the *d*th parenthesized
-subexpression.
-
-**Options**
-
- options ::= ["i" "n" "r" "U"]* ("-" ["i" "n" "r" "U"]*)?
-
-Options allow compile time options to be turned on/off for particular parts of
-the regular expression. If the option is specified in the first section, it is
-turned on. If it is specified in the second section (after the `-`), it is
-turned off.
-
-* `i` -- Case insensitive.
-* `n` -- Forces special handling of the new line character.
-* `r` -- Causes the regex to be matched in a right associative manner rather than
- the normal left associative manner.
-* `U` -- Forces repetition operators to be non-greedy unless a `?` is appended.
-
-[TRE]: https://github.com/laurikari/tre
-[TRE Regexp Syntax]: http://laurikari.net/tre/documentation/regex-syntax/
+* [ECMAScript syntax C++ reference](http://www.cplusplus.com/reference/regex/ECMAScript/)
+* [Modified ECMAScript regular expression grammar](http://en.cppreference.com/w/cpp/regex/ecmascript)
+* [Regular Expressions (C++)](https://docs.microsoft.com/en-us/cpp/standard-library/regular-expressions-cpp)
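
As an illustration of the ECMAScript grammar referenced above, here is a minimal C++11 sketch built on `std::regex`. It is not Textadept's own code. The helper names `found`, `first_match`, and `replace_all` are hypothetical and exist only for this example. It shows word boundaries (`\b`), non-greedy repetition, POSIX-style character classes inside brackets, back references, and `$n` group references in replacements. The removed TRE text lists `\<` and `\>` as word-start and word-end assertions; the ECMAScript grammar does not define those, so `\b` is the likely substitute.

```cpp
#include <regex>
#include <string>

// Returns true if `pattern` matches anywhere in `text`.
// std::regex::ECMAScript is the default grammar; it is spelled out for clarity.
bool found(const std::string& text, const std::string& pattern) {
  return std::regex_search(text, std::regex(pattern, std::regex::ECMAScript));
}

// Returns the text of the first match of `pattern` in `text`,
// or an empty string if there is no match.
std::string first_match(const std::string& text, const std::string& pattern) {
  std::smatch m;
  std::regex re(pattern);
  return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Replaces every match of `pattern` in `text` with `fmt`.
// In `fmt`, "$1", "$2", ... refer to the captured groups.
std::string replace_all(const std::string& text, const std::string& pattern,
                        const std::string& fmt) {
  return std::regex_replace(text, std::regex(pattern), fmt);
}
```

For example, `first_match("<a><b>", "<.+?>")` yields `<a>` because `+?` is non-greedy, while `first_match("<a><b>", "<.+>")` yields `<a><b>`. Likewise, `replace_all("2014-01-31", R"((\d+)-(\d+)-(\d+))", "$3/$2/$1")` yields `31/01/2014`.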
## Lua Patterns
@@ -2277,12 +2090,19 @@ Simply copying the contents of your *~/.textadept/properties.lua* into
Lexers are now written in a more object-oriented way. Legacy lexers are still
supported, but it is recommended that you [migrate them][].
+[migrate them]: api.html#lexer.Migrating.Legacy.Lexers
+
#### Key Bindings Changes
The terminal version's key sequence for `Ctrl+Space` is now `'c '` instead of
`'c@'`.
-[migrate them]: api.html#lexer.Migrating.Legacy.Lexers
+#### Regex Changes
+
+Textadept now uses [C++11's ECMAScript regex syntax](#Regular.Expressions)
+instead of [TRE][].
+
+[TRE]: https://github.com/laurikari/tre
### Textadept 8 to 9