Lexer
Olivier Duhart edited this page Aug 31, 2020 · 13 revisions
CSLY comes with two kinds of lexer:
- a regex-based lexer inspired by this post. It is not a very efficient lexer: it is slow and is the bottleneck of the whole lexer/parser chain.
- from version 2.0.0, a "GenericLexer": an FSM-backed lexer designed for performance, at the cost of some restrictions on what the lexer can express.
Generic and regex lexemes cannot be mixed in the same lexer.
The full lexer configuration is done in a C# enum:
- The enum lists all the possible tokens (no special constraint here except public visibility).
- Each enum value has a [Lexeme] attribute to mark it as a lexeme.
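As an illustration, here is a minimal sketch of what such enums might look like, one for each kind of lexer. The enum names (RegexToken, GenericLexerToken) and their members are hypothetical. The [Lexeme] usages assume the sly.lexer namespace from the sly NuGet package, and the exact attribute overloads may differ between CSLY versions.

```csharp
using sly.lexer; // assumption: provided by the "sly" NuGet package

// Hypothetical token enum for the regex-based lexer:
// each value carries the regular expression that matches it.
public enum RegexToken
{
    [Lexeme("[0-9]+")]
    INT = 1,

    [Lexeme("\\+")]
    PLUS = 2,

    // Assumption: the second argument marks the lexeme as skippable,
    // so whitespace is matched but never handed to the parser.
    [Lexeme("[ \\t]+", true)]
    WS = 3
}

// Hypothetical token enum for the GenericLexer (version 2.0.0 and later):
// each value refers to a predefined GenericToken kind instead of a regex.
public enum GenericLexerToken
{
    [Lexeme(GenericToken.Identifier)]
    IDENTIFIER = 1,

    [Lexeme(GenericToken.Int)]
    INT = 2,

    [Lexeme(GenericToken.KeyWord, "if")]
    IF = 3,

    [Lexeme(GenericToken.SugarToken, "+")]
    PLUS = 4
}
```

Each enum stays within a single style, since generic and regex lexemes cannot be mixed.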
For a more detailed description, see the Lexer section.
The lexer can be used apart from the parser. It provides a method that returns an IEnumerable<Token<T>> (where T is the tokens enum) from a string:
IList<Token<T>> tokens = Lexer.Tokenize(source).ToList<Token<T>>();
You can also build only a lexer, without a parser, using:
var source = "some source to be lexed";
ILexer<ExpressionToken> lexer = LexerBuilder.BuildLexer<ExpressionToken>();
var tokens = lexer.Tokenize(source).ToList();