Releases: ehwan/RustyLR
v3.3.0
v3.0.0
Full Changelog: v2.7.0...v3.0.0
Breaking API
- removed Parser::feed(), Parser::begin()
  - Context can be created with Context::new()
  - feed() moved to Context::feed()
- removed lalr! macro
  - LALR parser can be generated with %lalr directive.
- cleaned README and cargo doc
- changed LICENSE from MIT to MIT OR APACHE 2.0
v2.7.0
Full Changelog: v2.6.0...v2.7.0
Resolving Ambiguities
In a GLR parser, both a shift and a reduce action can be possible at the same time, which makes the grammar ambiguous.
You can resolve the ambiguities through the reduce action.
- Returning Result::Err(Error) from the reduce action will revoke the current reducing path.
- Setting the predefined variable shift: &mut bool to false will revoke the shift action with the lookahead token.
Consider the following example:
E : E plus E
| E star E
| digit
;
Suppose you are trying to feed 1 + 2 * 3 + 4 eof to the parser.
There are 5 ways to represent the input sequence:
((1 + 2) * 3) + 4
(1 + (2 * 3)) + 4
1 + ((2 * 3) + 4)
1 + (2 * (3 + 4))
(1 + 2) * (3 + 4)
However, we know the 2nd path is the only correct one, since star has higher precedence than plus, and both are left-associative.
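The count of five readings can be checked mechanically. The following is a minimal standalone sketch, not part of RustyLR: the helper name readings is hypothetical, and it simply enumerates every way to pick the operator that is applied last.

```rust
/// Enumerate every fully parenthesized reading of
/// `operands[0] ops[0] operands[1] ops[1] ...` together with its value.
/// `readings` is a hypothetical helper, not a RustyLR function.
fn readings(operands: &[i64], ops: &[char]) -> Vec<(String, i64)> {
    if operands.len() == 1 {
        return vec![(operands[0].to_string(), operands[0])];
    }
    let mut out = Vec::new();
    // Choose which operator sits at the root of the parse tree.
    for i in 0..ops.len() {
        let lefts = readings(&operands[..=i], &ops[..i]);
        let rights = readings(&operands[i + 1..], &ops[i + 1..]);
        for (ls, lv) in &lefts {
            for (rs, rv) in &rights {
                let value = match ops[i] {
                    '+' => lv + rv,
                    '*' => lv * rv,
                    other => panic!("unsupported operator {other}"),
                };
                out.push((format!("({ls} {} {rs})", ops[i]), value));
            }
        }
    }
    out
}

fn main() {
    // 1 + 2 * 3 + 4
    let all = readings(&[1, 2, 3, 4], &['+', '*', '+']);
    println!("{} readings", all.len());
    for (text, value) in &all {
        println!("{text} = {value}");
    }
}
```

Two of the five readings evaluate to 11, so the value alone does not identify the intended tree; only the structure ((1 + (2 * 3)) + 4) reflects both precedence and left-associativity.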
To resolve the ambiguity, you can write the reduce action as follows:
E : E plus E {
      match *lookahead {
          '*' => {
              // no reduce if the next token is '*'
              // this prevents
              //   E + E   /   *
              //               ^ lookahead
              // from becoming  E * ...
              //                ^ (E + E)
              return Err("".to_string());
          }
          _ => {
              // revoke the shift action
              // this prevents
              //   E + E   /   +
              //               ^ lookahead
              // from becoming  E + E + ...
              // and enforces that the reduced token takes its place
              //   E + ...
              //   ^ (E + E)
              *shift = false;
          }
      }
  }
  | E star E {
      *shift = false;
  }
  | Number
  ;
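To see which tree these two rules leave standing, here is a small deterministic stand-in written in plain Rust. It is not RustyLR's GLR engine: Tok, Decision, decide and parse are hypothetical names, and the sketch assumes a well-formed token stream. decide mirrors the reduce actions above: returning Err maps to "keep only the shift path", and *shift = false maps to "keep only the reduce path".

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Num(i64),
    Plus,
    Star,
    Eof,
}

#[derive(Debug, PartialEq)]
enum Decision {
    Reduce, // the reduce path survives (`*shift = false`)
    Shift,  // the reduce action returned Err, only the shift path survives
}

/// Hypothetical mirror of the reduce actions for `E plus E` and `E star E`.
fn decide(rule_op: Tok, lookahead: Tok) -> Decision {
    match (rule_op, lookahead) {
        (Tok::Plus, Tok::Star) => Decision::Shift,
        _ => Decision::Reduce,
    }
}

/// Parse a well-formed token stream and return the resulting tree as text.
fn parse(tokens: &[Tok]) -> String {
    let mut exprs: Vec<String> = Vec::new();
    let mut ops: Vec<Tok> = Vec::new();
    for &tok in tokens {
        match tok {
            Tok::Num(n) => exprs.push(n.to_string()),
            lookahead => {
                // Keep reducing the rule on top of the stack while allowed.
                while let Some(&op) = ops.last() {
                    if decide(op, lookahead) == Decision::Shift {
                        break;
                    }
                    ops.pop();
                    let rhs = exprs.pop().expect("missing right operand");
                    let lhs = exprs.pop().expect("missing left operand");
                    let symbol = if op == Tok::Plus { '+' } else { '*' };
                    exprs.push(format!("({lhs} {symbol} {rhs})"));
                }
                if lookahead != Tok::Eof {
                    ops.push(lookahead);
                }
            }
        }
    }
    exprs.pop().expect("empty input")
}

fn main() {
    use Tok::*;
    let input = [Num(1), Plus, Num(2), Star, Num(3), Plus, Num(4), Eof];
    println!("{}", parse(&input));
}
```

For the input 1 + 2 * 3 + 4 eof this yields ((1 + (2 * 3)) + 4), the 2nd reading in the list above.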
v2.6.0
Add GLR parser generator
Adding the %glr; directive will switch the parser generation to a GLR parser.
With this directive, Shift/Reduce and Reduce/Reduce conflicts will not be treated as errors.
Resolving Ambiguities
You can resolve the ambiguities through the reduce action.
Simply put, returning Result::Err(Error) from the reduce action will revoke the current path.
Add tree feature
[dependencies]
rusty_lr = { version = "2.6.0", features=["tree"] }
For debugging purposes, the tree feature enables automatic construction of the syntax tree.
You can call context.to_tree_list() to get the reduced syntax tree.
let parser = Parser::new();
let mut context = parser.begin();
// feed tokens...
println!( "{:?}", context.to_tree_list() ); // print tree list with `Debug` trait
println!( "{}", context.to_tree_list() ); // print tree list with `Display` trait
TreeList
├─A
│ └─M
│   └─P
│     └─Number
│       ├─WS0
│       │ └─_space_Star1
│       │   └─_space_Plus0
│       │     ├─_space_Plus0
│       │     │ └─' '
│       │     └─' '
│       ├─_Digit_Plus3
│       │ └─Digit
│       │   └─_TerminalSet2
│       │     └─'1'
│       └─WS0
│         └─_space_Star1
│           └─_space_Plus0
│             └─' '
├─'+'
├─M
│ └─P
... continue
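The branch characters in this kind of output encode nesting: ├─ marks a child with later siblings, └─ marks the last child, and │ continues an open branch. The following standalone sketch shows one way to produce such a layout. Node, node, render and to_tree_string are hypothetical names, not the types behind to_tree_list().

```rust
/// A hypothetical syntax-tree node; not the type returned by `to_tree_list()`.
struct Node {
    label: String,
    children: Vec<Node>,
}

fn node(label: &str, children: Vec<Node>) -> Node {
    Node { label: label.to_string(), children }
}

/// Append the children of `parent` to `out`, one line per node.
fn render(parent: &Node, prefix: &str, out: &mut String) {
    let count = parent.children.len();
    for (i, child) in parent.children.iter().enumerate() {
        let last = i + 1 == count;
        out.push_str(prefix);
        out.push_str(if last { "└─" } else { "├─" });
        out.push_str(&child.label);
        out.push('\n');
        // Below a non-last child the vertical bar must continue.
        let child_prefix = format!("{prefix}{}", if last { "  " } else { "│ " });
        render(child, &child_prefix, out);
    }
}

fn to_tree_string(root: &Node) -> String {
    let mut out = String::new();
    out.push_str(&root.label);
    out.push('\n');
    render(root, "", &mut out);
    out
}

fn main() {
    let tree = node(
        "TreeList",
        vec![node("A", vec![node("M", vec![])]), node("'+'", vec![])],
    );
    print!("{}", to_tree_string(&tree));
}
```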
v2.5.0-test
Full Changelog: v2.4.0...v2.5.0
Add GLR parser generator
Adding the %glr; directive will switch the parser generation to a GLR parser.
With this directive, Shift/Reduce and Reduce/Reduce conflicts will not be treated as errors.
Resolving Ambiguities
You can resolve the ambiguities through the reduce action.
Simply put, returning Result::Err(Error) from the reduce action will revoke the current path.
Note on GLR Parser
- Still in development and not yet tested enough (patches are welcome!).
- Since there are multiple paths, the reduce action can be called multiple times, even if the result will be thrown away in the future.
  - Every RuleType and Term must implement the Clone trait.
- The user must be aware of the points where shift/reduce or reduce/reduce conflicts occur. Every time the parser diverges, the calculation cost will increase.
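A toy model can make the cost argument concrete. The sketch below is not how RustyLR stores its paths internally; Path, diverge and simulate are hypothetical, and the model assumes the worst case where every token hits a conflict and no path is ever revoked. Each live path splits into a shift branch and a reduce branch, both branches need their own copy of the values (hence Clone), and the reduce action runs once per path whether or not that path survives.

```rust
/// One live parsing path with its own value stack (toy model).
#[derive(Clone, Debug)]
struct Path {
    stack: Vec<i64>,
}

/// At a shift/reduce conflict, split every path into a shift branch and a
/// reduce branch. The "reduce action" here just adds the two top values.
fn diverge(paths: &[Path], reduce_calls: &mut usize) -> Vec<Path> {
    let mut next = Vec::new();
    for path in paths {
        // Shift branch: keeps the values untouched, so it needs a copy.
        next.push(path.clone());
        // Reduce branch: the action runs on another copy.
        let mut reduced = path.clone();
        *reduce_calls += 1;
        if let (Some(rhs), Some(lhs)) = (reduced.stack.pop(), reduced.stack.pop()) {
            reduced.stack.push(lhs + rhs);
            next.push(reduced);
        }
    }
    next
}

/// Feed `tokens`, hitting a conflict before each one.
/// Returns (number of live paths, number of reduce-action calls).
fn simulate(tokens: &[i64]) -> (usize, usize) {
    let mut paths = vec![Path { stack: vec![1, 2] }];
    let mut reduce_calls = 0;
    for &token in tokens {
        paths = diverge(&paths, &mut reduce_calls);
        for path in &mut paths {
            path.stack.push(token);
        }
    }
    (paths.len(), reduce_calls)
}

fn main() {
    let (live, calls) = simulate(&[3, 4, 5]);
    println!("{live} live paths after {calls} reduce calls");
}
```

In this model three unresolved conflicts already produce eight live paths, which is why revoking paths early through the reduce action keeps the work bounded.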
Add lookahead predefined variable in reduce action
In the reduce action, the lookahead variable refers to the token that caused the reduce action.
v2.4.0
Full Changelog: v2.1.0...v2.4.0
syntax added - token with lookahead
P / term
P / [term1 term_start-term_last]
P / [^term1 term_start-term_last]
Pattern P followed by one of the given terminal set. Lookaheads are not consumed.
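The "not consumed" part is the key property. As a rough illustration in plain Rust over characters rather than parser tokens, the hypothetical helper below matches a run of digits (standing in for P) only when the next character is in the given set, and returns a position that stops before that character.

```rust
/// Match one or more ASCII digits starting at byte offset `pos`, but only if
/// the character right after them is in `lookahead`. Returns the offset just
/// past the digits; the lookahead character itself is left unconsumed.
/// Hypothetical helper for illustration, not RustyLR code.
fn digits_followed_by(input: &str, pos: usize, lookahead: &[char]) -> Option<usize> {
    let bytes = input.as_bytes();
    let mut end = pos;
    while end < bytes.len() && bytes[end].is_ascii_digit() {
        end += 1;
    }
    if end == pos {
        return None; // the pattern itself did not match
    }
    // Peek at the next character without advancing past it.
    let next = input[end..].chars().next()?;
    if lookahead.contains(&next) {
        Some(end)
    } else {
        None
    }
}

fn main() {
    let input = "12+3";
    if let Some(end) = digits_followed_by(input, 0, &['+', '-']) {
        println!("matched {:?}, rest {:?}", &input[..end], &input[end..]);
    }
}
```

The remaining input still starts with the lookahead character, so the next rule can consume it normally.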
syntax added - parenthesis group patterns
E : A p=( P1 P2 P3 ) B { A + p.0 + p.1 + p.2 + B } ;
Captures the subsequence P1 P2 P3 as a single token.
- If none of the patterns hold a value, the group itself will not hold any value.
- If only one of the patterns holds a value, the group will hold the value of that pattern, and the variable name will be the same as the pattern's.
  (i.e. if P1 holds a value and the others don't, then (P1 P2 P3) will hold the value of P1, and can be accessed via the name P1)
- If multiple patterns hold values, the group will hold a Tuple of the values. There is no default variable name for the group; you must define the variable name explicitly with the = operator.
NoRuleType: ... ;
I(i32): ... ;
// I will be chosen
A: (NoRuleType I NoRuleType) {
    println!( "Value of I: {:?}", I ); // can access by 'I'
    I
};
// ( i32, i32 )
B: i2=( I NoRuleType I ) {
    println!( "Value of I: {:?}", i2 ); // must explicitly define the variable name
};
v2.1.0
Add feature for build.rs support
- This build-scripting tool provides much more detailed, pretty-printed error messages than the procedural macros, at compile time.
- Generated code will contain the same structs and functions as the procedural macros.
- In your actual source code, you can include! the generated file.
You can enable the build feature to use it in the build script.
[build-dependencies]
rusty_lr = { version = "...", features = ["build"] }
// build.rs
use rusty_lr::build;
fn main() {
    println!("cargo::rerun-if-changed=src/parser.rs");
    let output = format!("{}/parser.rs", std::env::var("OUT_DIR").unwrap());
    build::Builder::new()
        .file("src/parser.rs") // path to the input file
        // .lalr() // to generate LALR(1) parser
        .build(&output); // path to the output file
}
The program searches for %% in the input file, not the lr1! or lalr1! macro. The contents before %% will be copied into the output file as-is, and the context-free grammar must come after %%.
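The file layout described here can be pictured as a split at the first %% marker. The sketch below is only an illustration of that layout using the standard library; split_grammar_file is a hypothetical name and the builder's real implementation may differ.

```rust
/// Split an input file into the Rust part (copied verbatim) and the grammar
/// part, at the first `%%`. Hypothetical helper, not the builder's own code.
fn split_grammar_file(source: &str) -> Option<(&str, &str)> {
    source.split_once("%%")
}

fn main() {
    let source = "use std::collections::HashMap;\n%%\nE : E plus E | digit ;\n";
    match split_grammar_file(source) {
        Some((rust_part, grammar_part)) => {
            println!("copied as-is:\n{rust_part}");
            println!("parsed as grammar:\n{grammar_part}");
        }
        None => eprintln!("no %% marker found"),
    }
}
```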
If there are any errors when building a grammar, it will print error messages to stderr and then panic, so that the messages are shown during compilation.
v2.0.0
Full Changelog: v1.6.0...v2.0.0
v2.0.0 Release (stable)
- removed feed_callback()
- fixed ParseError variants
  - InvalidNonTerm is deleted since it never happens. Moved it to unreachable!
  - CallbackError is deleted
- add type aliases for Rule, State, ParseError in generated structs
- add exclamation ! pattern to ignore <RuleType> of token in production rule
v1.6.0
v1.5.0
Full Changelog: v1.3.0...v1.5.0
add feature fxhash
[dependencies]
rusty_lr = { version = "1", features = ["fxhash"] }
This replaces the std::collections::HashMap of the DFA with FxHashMap.
removed trait bounds PartialOrd, Ord from terminal symbol type.
This allows std::mem::discriminant to be used for an enum Token type.
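As a rough illustration of why this matters, an enum token with an f64 payload cannot derive Ord, Eq or Hash, yet std::mem::discriminant still gives a hashable, comparable key that identifies the variant while ignoring the payload. The Token enum and kind_table below are hypothetical, and how RustyLR itself uses the discriminant is not shown here.

```rust
use std::collections::HashMap;
use std::mem::{discriminant, Discriminant};

/// Hypothetical token type. The f64 payload rules out deriving Ord/Eq/Hash.
#[derive(Debug)]
enum Token {
    Num(f64),
    Ident(String),
    Plus,
}

/// Map each variant (regardless of payload) to a human-readable kind.
fn kind_table() -> HashMap<Discriminant<Token>, &'static str> {
    let mut table = HashMap::new();
    table.insert(discriminant(&Token::Num(0.0)), "number");
    table.insert(discriminant(&Token::Ident(String::new())), "identifier");
    table.insert(discriminant(&Token::Plus), "plus");
    table
}

fn main() {
    let table = kind_table();
    for token in [Token::Num(3.5), Token::Ident("x".to_string()), Token::Plus] {
        println!("{:?} -> {}", token, table[&discriminant(&token)]);
    }
}
```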