Module pl.lexer
Lexical scanner for creating a sequence of tokens from text.
lexer.scan(s) returns an iterator over all tokens found in the string s. This iterator returns two values: a token type string (such as ‘string’ for a quoted string, ‘iden’ for an identifier) and the value of the token.
Versions specialized for Lua and C are available; these also handle block comments and classify keywords as ‘keyword’ tokens. For example:
> s = 'for i=1,n do'
> for t,v in lexer.lua(s) do print(t,v) end
keyword for
iden i
= =
number 1
, ,
iden n
keyword do
See the Guide for further discussion.
Functions
scan (s, matches, filter, options): create a plain token iterator from a string or file-like object.
insert (tok, a1, a2): insert tokens into a stream.
getline (tok): get everything in a stream up to a newline.
lineno (tok): get current line number.
getrest (tok): get the rest of the stream.
get_keywords (): get the Lua keywords as a set-like table.
lua (s, filter, options): create a Lua token iterator from a string or file-like object.
cpp (s, filter, options): create a C/C++ token iterator from a string or file-like object.
get_separated_list (tok, endtoken, delim): get a list of parameters separated by a delimiter from a stream.
skipws (tok): get the next non-space token from the stream.
expecting (tok, expected_type, no_skip_ws): get the next token, which must be of the expected type.
Functions
scan (s, matches, filter, options)
create a plain token iterator from a string or file-like object.
Parameters:
  s: the string
  matches: an optional match table (a set of pattern-action pairs)
  filter: a table of token types to exclude, by default {space=true}
  options: a table of options; by default {number=true,string=true}, which means convert numbers and strip string quotes.
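For example, a plain scan of a short assignment (a minimal sketch, assuming Penlight is installed and required as ‘pl.lexer’; note that punctuation tokens use the character itself as their type):

```lua
local lexer = require 'pl.lexer'

-- scan with the default matches, filter and options
local tok = lexer.scan('x = 42')

print(tok())  -- iden    x
print(tok())  -- =       =
print(tok())  -- number  42  (converted to a number, since options.number is true)
```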
insert (tok, a1, a2)
insert tokens into a stream.
Parameters:
  tok: a token stream
  a1: if a string, the token type; if a table, a token list; if a function, assumed to be a token-like iterator (returning type and value)
  a2: if a1 is a string, the token value
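A hedged sketch of pushing a single token back into a stream; the inserted token should be returned by the next call to the iterator:

```lua
local lexer = require 'pl.lexer'

local tok = lexer.scan('y + 1')
tok()  -- consume 'y' (an iden token)

-- push a replacement identifier into the stream (type string plus value string)
lexer.insert(tok, 'iden', 'offset')

print(tok())  -- iden  offset  (the inserted token comes out first)
print(tok())  -- +     +       (then the stream resumes)
```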
getline (tok)
get everything in a stream up to a newline.
Parameters:
  tok: a token stream
Returns:
  a string
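A minimal sketch of grabbing the rest of the current line after consuming a token (assuming Penlight is installed; the exact whitespace returned may vary):

```lua
local lexer = require 'pl.lexer'

local tok = lexer.scan('for i = 1,10 do\nprint(i)\nend')
tok()                           -- consume the first token, 'for'
local rest = lexer.getline(tok) -- the remainder of the first line, as a string
print(rest)
```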
lineno (tok)
get current line number.
Only available if the input source is a file-like object.
Parameters:
  tok: a token stream
Returns:
  the line number and current column
getrest (tok)
get the rest of the stream.
Parameters:
  tok: a token stream
Returns:
  a string
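A short sketch of retrieving everything not yet consumed (assuming Penlight is installed; leading whitespace in the result may vary):

```lua
local lexer = require 'pl.lexer'

local tok = lexer.scan('x = 42')
tok()                            -- consume 'x'
local rest = lexer.getrest(tok)  -- everything remaining, e.g. ' = 42'
print(rest)
```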
get_keywords ()
get the Lua keywords as a set-like table.
So res["and"] etc. would be true.
Returns:
  a table
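For example, membership tests against the keyword set:

```lua
local lexer = require 'pl.lexer'

local kw = lexer.get_keywords()
print(kw['and'])    -- true
print(kw['while'])  -- true
print(kw['foo'])    -- nil: not a Lua keyword
```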
lua (s, filter, options)
create a Lua token iterator from a string or file-like object.
Will return the token type and value.
Parameters:
  s: the string
  filter: a table of token types to exclude, by default {space=true,comments=true}
  options: a table of options; by default {number=true,string=true}, which means convert numbers and strip string quotes.
cpp (s, filter, options)
create a C/C++ token iterator from a string or file-like object.
Will return the token type and value.
Parameters:
  s: the string
  filter: a table of token types to exclude, by default {space=true,comments=true}
  options: a table of options; by default {number=true,string=true}, which means convert numbers and strip string quotes.
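A sketch analogous to the Lua example above, using the C/C++ scanner with its default filter (assuming Penlight is installed; C keywords such as ‘int’ are classified as ‘keyword’ tokens):

```lua
local lexer = require 'pl.lexer'

for t, v in lexer.cpp('int x = 0;') do
  print(t, v)
end
-- keyword  int
-- iden     x
-- =        =
-- number   0
-- ;        ;
```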
get_separated_list (tok, endtoken, delim)
get a list of parameters separated by a delimiter from a stream.
Parameters:
  tok: the token stream
  endtoken: end of list (default ‘)’). Can be ‘\n’
  delim: separator (default ‘,’)
Returns:
  a list of token lists.
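A hedged sketch of collecting a parenthesised argument list with the defaults (endtoken ‘)’, delim ‘,’); one token list is collected per parameter:

```lua
local lexer = require 'pl.lexer'

local tok = lexer.scan('(alpha, beta, gamma)')
lexer.expecting(tok, '(')                   -- consume the opening parenthesis
local args = lexer.get_separated_list(tok)  -- collect up to the matching ')'

print(#args)  -- 3: one token list per parameter
```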
skipws (tok)
get the next non-space token from the stream.
Parameters:
  tok: the token stream.
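This is only useful when space tokens are not already filtered out. A minimal sketch, passing an empty filter so space tokens survive:

```lua
local lexer = require 'pl.lexer'

-- empty filter: keep space tokens in the stream
local tok = lexer.scan('a   b', nil, {})

print(lexer.skipws(tok))  -- iden  a
print(lexer.skipws(tok))  -- iden  b  (the space token was skipped)
```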
expecting (tok, expected_type, no_skip_ws)
get the next token, which must be of the expected type.
Throws an error if this type does not match!
Parameters:
  tok: the token stream
  expected_type: the token type
  no_skip_ws: whether we should skip whitespace
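A sketch of a small expect-driven parse (assuming Penlight is installed); expecting returns the token's value when the type matches, and raises an error otherwise:

```lua
local lexer = require 'pl.lexer'

local tok = lexer.lua('count = 10')
local name  = lexer.expecting(tok, 'iden')    -- returns 'count'
lexer.expecting(tok, '=')                     -- punctuation type is the char itself
local value = lexer.expecting(tok, 'number')  -- returns 10

print(name, value)  -- count  10
-- lexer.expecting(tok, 'string') would raise an error here: no token is left
```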