Releases: ehwan/RustyLR

v3.3.0

23 Oct 05:20

Full Changelog: v3.0.0...v3.3.0

  • added error and tree features for debugging purposes
  • added Context::can_feed() to check whether a given terminal can be fed in the current state without changing the context
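
The difference between checking and feeding can be sketched with a toy context. The ToyContext type and its balanced-parenthesis rule below are hypothetical and are not RustyLR's generated code; the only idea taken from the note above is that can_feed() inspects the context without changing it, while feed() mutates it.

```rust
/// Hypothetical toy context (not RustyLR's generated type): it accepts
/// balanced parentheses and tracks the current nesting depth.
#[derive(Debug, Clone, PartialEq)]
struct ToyContext {
    depth: usize,
}

impl ToyContext {
    fn new() -> Self {
        ToyContext { depth: 0 }
    }

    /// Check whether `term` would be accepted, without touching `self`.
    fn can_feed(&self, term: char) -> bool {
        match term {
            '(' => true,
            ')' => self.depth > 0,
            _ => false,
        }
    }

    /// Consume `term`, mutating the context, or report an error.
    fn feed(&mut self, term: char) -> Result<(), String> {
        if !self.can_feed(term) {
            return Err(format!("unexpected terminal {:?}", term));
        }
        match term {
            '(' => self.depth += 1,
            _ => self.depth -= 1, // only ')' reaches this arm
        }
        Ok(())
    }
}

fn main() {
    let mut ctx = ToyContext::new();
    let before = ctx.clone();

    // Probing does not change the context.
    assert!(!ctx.can_feed(')'));
    assert_eq!(ctx, before);

    ctx.feed('(').unwrap();
    assert!(ctx.can_feed(')'));
    println!("depth after one '(': {}", ctx.depth);
}
```

Taking &self in the check and &mut self in the feed lets the compiler enforce that probing a terminal cannot alter the parsing state.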

v3.0.0

02 Sep 15:40

Full Changelog: v2.7.0...v3.0.0

Breaking API

  • removed Parser::feed(), Parser::begin()

    • Context can be created with Context::new()
    • feed() moved to Context::feed()
  • removed lalr! macro

    • LALR parser can be generated with %lalr directive.
  • cleaned README and cargo doc

  • changed LICENSE from MIT to MIT OR Apache-2.0

v2.7.0

01 Sep 05:04

Full Changelog: v2.6.0...v2.7.0

Resolving Ambiguities

In a GLR parser, a shift action and a reduce action can both be possible at the same time, which makes the grammar ambiguous.
You can resolve the ambiguities through the reduce action.

  • Returning Result::Err(Error) from the reduce action revokes the current reduce path.
  • Setting the predefined variable shift: &mut bool to false revokes the shift action on the lookahead token.

Consider the following example:

E : E plus E
  | E star E
  | digit
  ;

Suppose you feed 1 + 2 * 3 + 4 eof to the parser.
There are 5 ways to parse the input sequence:

  • ((1 + 2) * 3) + 4
  • (1 + (2 * 3)) + 4
  • 1 + ((2 * 3) + 4)
  • 1 + (2 * (3 + 4))
  • (1 + 2) * (3 + 4)

However, only the 2nd parse is correct,
since star has higher precedence than plus, and both are left-associative.
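
The count of 5 can be checked by brute force. The following is a minimal sketch that does not use RustyLR at all; the Expr type and the all_parses function are hypothetical names introduced for illustration. It enumerates every binary tree over the operands 1, 2, 3, 4 with the operators in their original order.

```rust
/// A binary expression tree over integer leaves.
#[derive(Debug, Clone)]
enum Expr {
    Num(i64),
    Bin(Box<Expr>, char, Box<Expr>),
}

impl Expr {
    /// Evaluate the tree; any operator other than '+' is treated as '*'.
    fn eval(&self) -> i64 {
        match self {
            Expr::Num(n) => *n,
            Expr::Bin(l, '+', r) => l.eval() + r.eval(),
            Expr::Bin(l, _, r) => l.eval() * r.eval(),
        }
    }

    /// Render the tree fully parenthesized.
    fn show(&self) -> String {
        match self {
            Expr::Num(n) => n.to_string(),
            Expr::Bin(l, op, r) => format!("({} {} {})", l.show(), op, r.show()),
        }
    }
}

/// Build every binary tree whose leaves are `nums` in order and whose
/// operators are `ops` in order (`ops.len() == nums.len() - 1`).
fn all_parses(nums: &[i64], ops: &[char]) -> Vec<Expr> {
    if nums.len() == 1 {
        return vec![Expr::Num(nums[0])];
    }
    let mut out = Vec::new();
    // Choose which operator sits at the root of the tree.
    for i in 0..ops.len() {
        for l in all_parses(&nums[..=i], &ops[..i]) {
            for r in all_parses(&nums[i + 1..], &ops[i + 1..]) {
                out.push(Expr::Bin(Box::new(l.clone()), ops[i], Box::new(r)));
            }
        }
    }
    out
}

fn main() {
    let parses = all_parses(&[1, 2, 3, 4], &['+', '*', '+']);
    for p in &parses {
        println!("{} = {}", p.show(), p.eval());
    }
    println!("number of parses: {}", parses.len());
}
```

Two of the trees evaluate to the same number, so the value alone does not identify the intended parse; the tree shape, which associativity and precedence determine, is what matters.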

To resolve the ambiguity, you can write the reduce action as follows:

E : E plus E {
      match *lookahead {
          '*' => {
              // Do not reduce if the next token is '*'.
              // This prevents
              //   E + E   /   *
              //               ^ lookahead
              // from becoming
              //   E * ...
              //   ^ (E + E)
              return Err("".to_string());
          }
          _ => {
              // Revoke the shift action.
              // This prevents
              //   E + E   /   +
              //               ^ lookahead
              // from becoming E + E + ...
              // and forces the reduced token to take its place:
              //   E + ...
              //   ^ (E + E)
              *shift = false;
          }
      }
  }
  | E star E {
      *shift = false;
  }
  | digit
  ;

v2.6.0

29 Aug 10:53

Add GLR parser generator

Adding the %glr; directive switches parser generation to a GLR parser.
With this directive, Shift/Reduce and Reduce/Reduce conflicts are not treated as errors.

Resolving Ambiguities

You can resolve the ambiguities through the reduce action.
Returning Result::Err(Error) from the reduce action revokes the current path.


Add tree feature

[dependencies]
rusty_lr = { version = "2.6.0", features=["tree"] }

For debugging purposes, the tree feature enables automatic construction of the syntax tree.
You can call context.to_tree_list() to get the reduced syntax tree.

let parser = Parser::new();
let mut context = parser.begin();
// feed tokens...
println!( "{:?}", context.to_tree_list() ); // print tree list with `Debug` trait
println!( "{}", context.to_tree_list() );   // print tree list with `Display` trait
TreeList
├─A
│ └─M
│   └─P
│     └─Number
│       ├─WS0
│       │ └─_space_Star1
│       │   └─_space_Plus0
│       │     ├─_space_Plus0
│       │     │ └─' '
│       │     └─' '
│       ├─_Digit_Plus3
│       │ └─Digit
│       │   └─_TerminalSet2
│       │     └─'1'
│       └─WS0
│         └─_space_Star1
│           └─_space_Plus0
│             └─' '
├─'+'
├─M
│ └─P
... continue

v2.5.0-test

24 Aug 11:05
Pre-release

Full Changelog: v2.4.0...v2.5.0


Add GLR parser generator

Adding the %glr; directive switches parser generation to a GLR parser.
With this directive, Shift/Reduce and Reduce/Reduce conflicts are not treated as errors.

Resolving Ambiguities

You can resolve the ambiguities through the reduce action.
Returning Result::Err(Error) from the reduce action revokes the current path.

Note on GLR Parser

  • Still in development and not yet tested enough (patches are welcome!).
  • Since there are multiple paths, the reduce action can be called multiple times, even if the result is later thrown away.
    • Every RuleType and Term must implement the Clone trait.
  • Users must be aware of the points where shift/reduce or reduce/reduce conflicts occur.
    Every time the parser diverges, the computation cost increases.

Add lookahead predefined variable in reduce action

In the reduce action, the lookahead variable refers to the token that caused the reduce action.


v2.4.0

21 Aug 07:21

Full Changelog: v2.1.0...v2.4.0

syntax added - token with lookahead

  • P / term
  • P / [term1 term_start-term_last]
  • P / [^term1 term_start-term_last]
    Pattern P followed by one of the terminals in the given set. Lookaheads are not consumed.

syntax added - parenthesis group patterns

E : A p=( P1 P2 P3 ) B { A + p.0 + p.1 + p.2 + B } ;

Captures the subsequence P1 P2 P3 as a single token.

  • If none of the patterns holds a value, the group itself will not hold any value.
  • If only one of the patterns holds a value, the group will hold the value of that pattern, and the variable name will be the same as the pattern's.
    (i.e. if P1 holds a value and the others don't, then (P1 P2 P3) will hold the value of P1, which can be accessed via the name P1)
  • If multiple patterns hold values, the group will hold a tuple of the values. There is no default variable name for the group, so you must define the variable name explicitly with the = operator.
NoRuleType: ... ;

I(i32): ... ;

// I will be chosen
A: (NoRuleType I NoRuleType) {
    println!( "Value of I: {:?}", I ); // can access by 'I'
    I
};

// ( i32, i32 )
B: i2=( I NoRuleType I ) {
    println!( "Value of I: {:?}", i2 ); // must explicitly define the variable name
};

v2.1.0

18 Aug 06:15

Add feature for build.rs support

  • This build-script tool provides much more detailed, pretty-printed error messages at compile time than the procedural macros do.
  • The generated code contains the same structs and functions as the procedural macros.
  • In your actual source code, you can include! the generated file.
    Enable the build feature to use it in the build script.
[build-dependencies]
rusty_lr = { version = "...", features = ["build"] }
// build.rs
use rusty_lr::build;

fn main() {
    println!("cargo::rerun-if-changed=src/parser.rs");

    let output = format!("{}/parser.rs", std::env::var("OUT_DIR").unwrap());
    build::Builder::new()
        .file("src/parser.rs") // path to the input file
    //  .lalr()                // to generate LALR(1) parser
        .build(&output);       // path to the output file
}

The program searches for %% in the input file rather than for the lr1! or lalr1! macros. The contents before %% are copied into the output file as is, and the context-free grammar must follow the %%.
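
The file layout described above can be illustrated with a small sketch. The split_at_separator function is a hypothetical helper, not part of RustyLR, and the real tool may well locate %% differently (for example, a plain text search like this one would be fooled by %% inside a string literal).

```rust
/// Split an input file's contents at the first `%%` separator.
/// Returns (Rust code to copy verbatim, grammar text), or None if the
/// separator is missing. Illustration only, not RustyLR's implementation.
fn split_at_separator(input: &str) -> Option<(&str, &str)> {
    input.split_once("%%")
}

fn main() {
    let input = "use std::collections::HashMap;\n%%\nE : E plus E | digit ;\n";
    match split_at_separator(input) {
        Some((prelude, grammar)) => {
            println!("copied as is:\n{}", prelude);
            println!("grammar:\n{}", grammar.trim());
        }
        None => eprintln!("no %% separator found"),
    }
}
```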

If there are any errors when building the grammar, it prints the error messages to stderr and then panics, so the messages are shown during compilation.

v2.0.0

14 Aug 13:50

Full Changelog: v1.6.0...v2.0.0

v2.0.0 Release (stable)

  • removed feed_callback()
  • fixed ParseError variants
    - InvalidNonTerm is deleted since it never happens; that case is now handled with unreachable!
    - CallbackError is deleted
  • add type aliases for Rule, State, ParseError in generated structs
  • add exclamation ! pattern to ignore the <RuleType> of a token in a production rule

v1.6.0

12 Aug 15:39

Full Changelog: v1.5.0...v1.6.0

The rustylr executable now prints pretty error messages

v1.5.0

11 Aug 13:22
Compare
Choose a tag to compare

Full Changelog: v1.3.0...v1.5.0

add feature fxhash

[dependencies]
rusty_lr = { version = "1", features = ["fxhash"] }

This replaces std::collections::HashMap with FxHashMap in the DFA.

removed trait bounds PartialOrd and Ord from the terminal symbol type.

This allows std::mem::discriminant to be used with an enum Token type.
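
As a rough illustration of why dropping the ordering bounds matters: std::mem::Discriminant implements Eq and Hash but not Ord, so it can only serve as a terminal key when ordering is not required. The Token enum and same_terminal helper below are hypothetical and not taken from RustyLR.

```rust
use std::collections::HashSet;
use std::mem::{discriminant, Discriminant};

/// Hypothetical token type; it derives neither PartialOrd nor Ord.
#[derive(Debug, Clone)]
enum Token {
    Num(i32),
    Plus,
    Star,
}

/// Two tokens are the same terminal when their variants match,
/// regardless of the payload they carry.
fn same_terminal(a: &Token, b: &Token) -> bool {
    discriminant(a) == discriminant(b)
}

fn main() {
    assert!(same_terminal(&Token::Num(1), &Token::Num(42)));
    assert!(!same_terminal(&Token::Plus, &Token::Star));

    // Discriminant<T> implements Hash and Eq, so it can key a hash set or map.
    let mut seen: HashSet<Discriminant<Token>> = HashSet::new();
    for tok in [Token::Num(1), Token::Plus, Token::Num(2), Token::Star] {
        seen.insert(discriminant(&tok));
    }
    println!("distinct terminals: {}", seen.len());
}
```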