Module pl.lexer

Lexical scanner for creating a sequence of tokens from text.

lexer.scan(s) returns an iterator over all tokens found in the string s. This iterator returns two values: a token type string (such as ‘string’ for a quoted string, or ‘iden’ for an identifier) and the value of the token.

Versions specialized for Lua and C are available; these also handle block comments and classify keywords as ‘keyword’ tokens. For example:

> s = 'for i=1,n do'
> for t,v in lexer.lua(s)  do print(t,v) end
keyword for
iden    i
=       =
number  1
,       ,
iden    n
keyword do

See the Guide for further discussion.

Functions

scan (s, matches, filter, options) create a plain token iterator from a string or file-like object.
insert (tok, a1, a2) insert tokens into a stream.
getline (tok) get everything in a stream up to a newline.
lineno (tok) get current line number.
getrest (tok) get the rest of the stream.
get_keywords () get the Lua keywords as a set-like table.
lua (s, filter, options) create a Lua token iterator from a string or file-like object.
cpp (s, filter, options) create a C/C++ token iterator from a string or file-like object.
get_separated_list (tok, endtoken, delim) get a list of parameters separated by a delimiter from a stream.
skipws (tok) get the next non-space token from the stream.
expecting (tok, expected_type, no_skip_ws) get the next token, which must be of the expected type.


Functions

scan (s, matches, filter, options)
create a plain token iterator from a string or file-like object.

Parameters:

  • s: the string or file-like object
  • matches: an optional match table (a set of pattern-action pairs)
  • filter: a table of token types to exclude, by default {space=true}
  • options: a table of options; by default {number=true,string=true}, which means convert numbers to numeric values and strip quotes from strings.
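
For example, a minimal sketch that keeps the default match table but passes an empty filter, so that ‘space’ tokens are also produced (the later sketches assume the same require):

local lexer = require 'pl.lexer'

-- an empty filter table excludes nothing, so 'space' tokens come through
for t,v in lexer.scan('alpha = 2', nil, {}) do
    print(t,v)
end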
insert (tok, a1, a2)
insert tokens into a stream.

Parameters:

  • tok: a token stream
  • a1: if a string, it is the token type; if a table, a token list; if a function, it is assumed to be a token-like iterator returning a type and a value
  • a2: the token value, when a1 is a type string
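
A typical use is pushing a token back after looking ahead; a sketch, assuming tok is an existing stream from scan or lua:

local t,v = tok()             -- look ahead one token
if t ~= 'keyword' then
    lexer.insert(tok, t, v)   -- not what we wanted: push it back
end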
getline (tok)
get everything in a stream up to a newline.

Parameters:

  • tok: a token stream

Returns:

    a string
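
One use is simple error recovery; a sketch, again assuming an existing stream tok:

-- discard the remainder of a malformed line and carry on tokenizing
local skipped = lexer.getline(tok)
print('ignored: '..skipped)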
lineno (tok)
get current line number.
Only available if the input source is a file-like object.

Parameters:

  • tok: a token stream

Returns:

    the line number and current column
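
A sketch of position reporting; lineno needs file-like input, so a hypothetical file ‘script.lua’ is opened first:

local f = io.open('script.lua')
local tok = lexer.lua(f)
for t,v in tok do
    if t == 'keyword' and v == 'function' then
        local line, col = lexer.lineno(tok)
        print('function at line '..line..', column '..col)
    end
end
f:close()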
getrest (tok)
get the rest of the stream.

Parameters:

  • tok: a token stream

Returns:

    a string
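
A sketch of tokenizing a prefix and taking the remainder as raw text:

local tok = lexer.scan('head: one two three')
tok()   -- consume 'head'
tok()   -- consume ':'
print(lexer.getrest(tok))   -- the untokenized remainder of the input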
get_keywords ()
get the Lua keywords as a set-like table, so that res["and"] etc. would be true.

Returns:

    a table
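
For example:

local keywords = lexer.get_keywords()
print(keywords['and'])      --> true
print(keywords['foobar'])   --> nil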
lua (s, filter, options)
create a Lua token iterator from a string or file-like object. Will return the token type and value.

Parameters:

  • s: the string or file-like object
  • filter: a table of token types to exclude, by default {space=true,comments=true}
  • options: a table of options; by default {number=true,string=true}, which means convert numbers to numeric values and strip quotes from strings.
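
For example, passing a filter that excludes only spaces keeps comment tokens (output shown approximately):

> for t,v in lexer.lua('x = 1 -- set x', {space=true}) do print(t,v) end
iden    x
=       =
number  1
comment -- set x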
cpp (s, filter, options)
create a C/C++ token iterator from a string or file-like object. Will return the token type and value.

Parameters:

  • s: the string or file-like object
  • filter: a table of token types to exclude, by default {space=true,comments=true}
  • options: a table of options; by default {number=true,string=true}, which means convert numbers to numeric values and strip quotes from strings.
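
For example (output shown approximately):

> for t,v in lexer.cpp('int x = 10;') do print(t,v) end
keyword int
iden    x
=       =
number  10
;       ;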
get_separated_list (tok, endtoken, delim)
get a list of parameters separated by a delimiter from a stream.

Parameters:

  • tok: the token stream
  • endtoken: end of list (default ‘)’). Can be ‘\n’
  • delim: separator (default ‘,’)

Returns:

    a list of token lists.
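
A sketch of collecting an argument list up to the closing parenthesis:

local tok = lexer.scan('alpha, beta, 2*gamma) rest')
local args = lexer.get_separated_list(tok)   -- defaults: endtoken ')', delim ','
print(#args)   --> 3, one token list per parameter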
skipws (tok)
get the next non-space token from the stream.

Parameters:

  • tok: the token stream.
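
For example, when spaces have not been filtered out:

local tok = lexer.scan('   alpha', nil, {})   -- empty filter keeps spaces
local t,v = lexer.skipws(tok)                 -- skips the leading 'space' token
print(t,v)   --> iden   alpha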
expecting (tok, expected_type, no_skip_ws)
get the next token, which must be of the expected type. Throws an error if this type does not match!

Parameters:

  • tok: the token stream
  • expected_type: the token type
  • no_skip_ws: set to true to suppress the usual skipping of whitespace before reading the token
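
A sketch of a tiny assignment parser (that expecting returns the token’s value is an assumption based on the Penlight sources):

local tok = lexer.lua('x = 42')
local name = lexer.expecting(tok, 'iden')      -- 'x'
lexer.expecting(tok, '=')
local value = lexer.expecting(tok, 'number')   -- 42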