WIP: Use nom-locate to add span information in the AST #129

alixinne · 2020-07-16T14:44:29Z

The main target for this PR is issue #74. By using nom_locate we can rewrite the parsing functions to operate on input with span information added in. I'll post an update when the functionality is usable (for now I've just changed the parser and tests to use nom_locate::LocatedSpan).
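
To illustrate what "input with span information" means, here is a minimal standard-library sketch. The `Located` type and the `identifier` function are hypothetical names made up for this example. They are not nom_locate's API and not the code in this PR; they only show the idea of an input slice that remembers where it starts in the source.

```rust
/// Hypothetical stand-in for a located span: a slice of the source plus
/// its position. This only illustrates the idea, it is not nom_locate's type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Located<'a> {
    pub fragment: &'a str,
    pub offset: usize, // byte offset from the start of the source
    pub line: u32,     // 1-based line number
}

impl<'a> Located<'a> {
    pub fn new(source: &'a str) -> Self {
        Located { fragment: source, offset: 0, line: 1 }
    }

    /// Split off the first `n` bytes, returning (consumed, rest).
    /// Both halves know where they start in the original source.
    pub fn split_at(self, n: usize) -> (Located<'a>, Located<'a>) {
        let (head, tail) = self.fragment.split_at(n);
        let newlines = head.bytes().filter(|&b| b == b'\n').count() as u32;
        let consumed = Located { fragment: head, ..self };
        let rest = Located {
            fragment: tail,
            offset: self.offset + n,
            line: self.line + newlines,
        };
        (consumed, rest)
    }
}

/// Parse an identifier (ASCII alphanumerics or `_`) and keep its position.
/// Returns (remaining input, parsed identifier), or None if nothing matched.
pub fn identifier(input: Located<'_>) -> Option<(Located<'_>, Located<'_>)> {
    let len = input
        .fragment
        .bytes()
        .take_while(|b| b.is_ascii_alphanumeric() || *b == b'_')
        .count();
    if len == 0 {
        return None;
    }
    let (ident, rest) = input.split_at(len);
    Some((rest, ident))
}
```

The parsed value carries its own offset and line, which is the information an AST node would need to store.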

This is flagged as WIP to discuss how to integrate this in the current architecture. One potential side effect is that it would allow parsing comments (as requested in #116). We could have a way to associate "non-significant" spans (whitespace and comments) with the relevant syntax nodes in order to parse comments relative to the syntax tree, as suggested in https://www.oilshell.org/blog/2017/02/11.html.
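
As a rough sketch of what associating non-significant spans with syntax elements could look like, the example below attaches whitespace and `//` comments to the token that follows them. Everything here (`Trivia`, `Token`, `tokenize`) is a hypothetical illustration, not something implemented in this PR, and it assumes comments are preceded by whitespace and tokens are whitespace-delimited.

```rust
/// Hypothetical trivia kinds: text that does not affect the syntax tree.
#[derive(Debug, Clone, PartialEq)]
pub enum Trivia {
    Whitespace(String),
    Comment(String),
}

/// A token together with the trivia that precedes it.
#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub leading: Vec<Trivia>,
    pub text: String,
}

/// Split `src` into whitespace-delimited tokens. Whitespace and `//` comments
/// are attached to the next token; trivia after the last token is returned
/// separately so that nothing from the source is lost.
pub fn tokenize(src: &str) -> (Vec<Token>, Vec<Trivia>) {
    let mut tokens = Vec::new();
    let mut leading = Vec::new();
    let mut rest = src;
    while !rest.is_empty() {
        if rest.starts_with("//") {
            // A line comment runs up to (not including) the next newline.
            let end = rest.find('\n').unwrap_or(rest.len());
            leading.push(Trivia::Comment(rest[..end].to_string()));
            rest = &rest[end..];
        } else if rest.starts_with(char::is_whitespace) {
            let end = rest
                .find(|c: char| !c.is_whitespace())
                .unwrap_or(rest.len());
            leading.push(Trivia::Whitespace(rest[..end].to_string()));
            rest = &rest[end..];
        } else {
            let end = rest.find(char::is_whitespace).unwrap_or(rest.len());
            tokens.push(Token {
                // Hand the accumulated trivia to this token and start afresh.
                leading: std::mem::take(&mut leading),
                text: rest[..end].to_string(),
            });
            rest = &rest[end..];
        }
    }
    (tokens, leading)
}
```

Because each comment ends up next to a specific token, a later pass can decide which syntax node it documents.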

This one is more of a stretch and might make more sense as a second iteration on this feature, but the same approach would allow preserving pre-processing directives in non-top-level contexts, which is currently broken (see issues #117 and #64).

hadronized · 2020-07-23T18:49:54Z

Thanks! I’ll try to dedicate some time to read this asap!

alixinne · 2020-07-24T17:47:39Z

In the last commit I decided to try implementing a ParseInput type. This allows passing in the LocatedSpan from nom_locate as well as a parsing context. The role of the parsing context is for now (to test this strategy) to record the comments as they are successfully parsed.

This requires quite a lot of changes, since it introduces lifetimes everywhere. The ParseContext structure uses a RefCell so that the ParseInput type can be Copy while holding only a shared reference. This might not be the prettiest solution, but it works without having to rewrite the entire parser. It also sidesteps the fact that many of the nom combinators won't accept an FnMut, which is what we would end up with if we held an exclusive reference to the parsing context.
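
A simplified sketch of this pattern, using only the standard library: the type names match the ones mentioned above, but the fields and the `line_comment` parser are assumptions made for illustration and do not reproduce the actual code in the commit.

```rust
use std::cell::RefCell;

/// Simplified parsing context: collects comments as they are parsed.
/// (Assumed layout, for illustration only.)
#[derive(Debug, Default)]
pub struct ParseContext {
    comments: RefCell<Vec<String>>,
}

impl ParseContext {
    /// Snapshot of the comments recorded so far.
    pub fn comments(&self) -> Vec<String> {
        self.comments.borrow().clone()
    }
}

/// Input handed to every parser. It only holds shared references,
/// so it can be `Copy` and passed by value like a plain `&str`.
#[derive(Debug, Clone, Copy)]
pub struct ParseInput<'c, 'd> {
    pub text: &'d str,
    pub context: &'c ParseContext,
}

/// Parse a `//` comment up to the end of the line and record it in the context.
/// Returns (remaining input, comment body), or None if there is no comment here.
pub fn line_comment<'c, 'd>(
    input: ParseInput<'c, 'd>,
) -> Option<(ParseInput<'c, 'd>, &'d str)> {
    let body = input.text.strip_prefix("//")?;
    let end = body.find('\n').unwrap_or(body.len());
    let (comment, rest) = body.split_at(end);
    // Interior mutability: we only hold `&ParseContext`, yet we can still push.
    input.context.comments.borrow_mut().push(comment.to_string());
    Some((ParseInput { text: rest, ..input }, comment))
}
```

Since `line_comment` needs no exclusive borrow, it stays a plain function that can be called any number of times, which is the property the combinators rely on.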

Using a parsing context will allow replacing the Node structure with one that only holds an Option<NonZeroUsize> to reference a span by its identifier in the parsing context's list of spans (not yet implemented). This will make the memory impact of these changes minimal, and also optional (for example, programmatically-generated syntax trees might not have span information, so we can just use None and no context).
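
Since this is not implemented yet, the following is only a sketch of how such identifiers could work. `SpanTable`, `Span` and the `span_id` field are hypothetical names; the one thing taken from the standard library documentation is that `Option<NonZeroUsize>` has the same size as `usize`.

```rust
use std::num::NonZeroUsize;

/// Hypothetical span record stored once in the context.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub offset: usize,
    pub len: usize,
}

/// Hypothetical list of spans owned by the parsing context.
#[derive(Debug, Default)]
pub struct SpanTable {
    spans: Vec<Span>,
}

impl SpanTable {
    /// Store a span and return its identifier (index + 1, so never zero).
    pub fn insert(&mut self, span: Span) -> NonZeroUsize {
        self.spans.push(span);
        NonZeroUsize::new(self.spans.len()).expect("length is at least 1 after push")
    }

    /// Look a span up by identifier.
    pub fn get(&self, id: NonZeroUsize) -> Option<Span> {
        self.spans.get(id.get() - 1).copied()
    }
}

/// A syntax node that refers to its span by identifier only.
/// `None` means "no span", e.g. for programmatically generated trees.
#[derive(Debug, Clone, PartialEq)]
pub struct Node<T> {
    pub contents: T,
    pub span_id: Option<NonZeroUsize>,
}
```

Each node then pays one machine word for span support, whether or not a span is present.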

alixinne · 2020-07-24T17:50:51Z

glsl/src/parse_tests.rs


  let expected = syntax::TranslationUnit(syntax::NonEmpty(vec![block]));

-  assert_eq!(translation_unit(src), Ok(("", expected)));
+  assert_eq_parser(translation_unit, src, Ok(("", expected)), &ctx);


Currently broken on Windows because of how line endings are checked out by git with autocrlf on.

alixinne · 2020-08-31T21:50:46Z

Spans are now owned by nodes instead of the parsing context. This simplifies a big part of the code (and probably more when I finish refactoring). I think it's a better idea to have span information working first, and then to optimize how this is handled if it proves necessary.

To avoid having too many changes required on the parsing tests, there is now a NodeContentsEq trait with an associated assert_ceq! macro to compare syntax trees ignoring span information (PartialEq still compares everything in the tree). Its implementation is handled by the proc macro in glsl-impl.
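
For readers who want a picture of the idea, here is a hand-written sketch. The trait and macro names are the ones mentioned above, but the method name `contents_eq`, the `Node` and `Span` layouts, and the manual impls are assumptions; in the PR the implementations are generated by the proc macro rather than written by hand.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub offset: usize,
    pub len: usize,
}

/// Assumed node layout: contents plus optional span information.
/// The derived `PartialEq` compares both fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Node<T> {
    pub contents: T,
    pub span: Option<Span>,
}

/// Equality that ignores span information.
pub trait NodeContentsEq {
    fn contents_eq(&self, other: &Self) -> bool;
}

// Leaf data has no span, so plain equality is enough.
impl NodeContentsEq for String {
    fn contents_eq(&self, other: &Self) -> bool {
        self == other
    }
}

// A node compares only its contents, recursively, and skips the span.
impl<T: NodeContentsEq> NodeContentsEq for Node<T> {
    fn contents_eq(&self, other: &Self) -> bool {
        self.contents.contents_eq(&other.contents)
    }
}

/// Like `assert_eq!`, but compares with `NodeContentsEq`.
macro_rules! assert_ceq {
    ($left:expr, $right:expr) => {{
        let (left, right) = (&$left, &$right);
        assert!(
            NodeContentsEq::contents_eq(left, right),
            "contents differ:\n  left: {:?}\n right: {:?}",
            left,
            right
        );
    }};
}
```

With this, a test can compare a parsed tree (with spans) against an expected tree built by hand (without spans), while `==` still distinguishes the two.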

alixinne · 2020-09-01T14:41:55Z

With the proc-macro in place we can also automatically generate type aliases of the form pub type OldNodeType = Node<RenamedOldNodeType>, which allows integrating span information in the AST with minimal breakage to the API (the type names remain the same, except when explicitly matching on enum variants). This also avoids having Node<T> everywhere in the API. The into_node() method was also removed as part of this, so into() now converts node data into a node with no span information.
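
A hand-written sketch of what the generated aliases and the `into()` conversion amount to. `IdentifierData`, the `Node` and `Span` layouts, and `describe` are hypothetical stand-ins chosen for the example; the real names are produced by the proc macro.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub offset: usize,
    pub len: usize,
}

/// Assumed node layout: contents plus optional span information.
#[derive(Debug, Clone, PartialEq)]
pub struct Node<T> {
    pub contents: T,
    pub span: Option<Span>,
}

/// Converting plain data into a node yields a node without span information.
impl<T> From<T> for Node<T> {
    fn from(contents: T) -> Self {
        Node { contents, span: None }
    }
}

/// The renamed data type (hypothetical name for what used to be `Identifier`).
#[derive(Debug, Clone, PartialEq)]
pub struct IdentifierData(pub String);

/// The old public name now refers to the node wrapper.
pub type Identifier = Node<IdentifierData>;

/// Existing signatures that mention `Identifier` keep compiling unchanged.
pub fn describe(ident: &Identifier) -> String {
    match ident.span {
        Some(span) => format!("{} at offset {}", ident.contents.0, span.offset),
        None => format!("{} (no span)", ident.contents.0),
    }
}
```

Code that builds trees programmatically can write `IdentifierData(...).into()` and never mention spans.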

Update the parser to use nom_locate

3fedabc

alixinne mentioned this pull request Feb 1, 2021

Try to optimize the expression parser #54

Open

alixinne closed this by deleting the head repository Jan 17, 2024