-
-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Syntactical extensions for GRIN #32
Comments
Please could you give examples how they are now and what will be the intended result? |
How FName will fit to the current analyses? Currently we have unified Names everywhere. When projecting back to source code, type env, chould this introduce difficulties? |
Last return Lets use the |
This design decision was taken to be compatible with the recursion schemes library, which already I was considering introduce a type safe layer which would use a GADT to construct only meaningful programs, but I rejected the idea out of pure laziness (pun intended). Maybe it is a good idea to have that layer for the future frontends, forcing them (another pun intended) to create a correct syntax. But for us GRIN contributors I think it is overkill to create such a type-safety when implementing transformations, analyses, at least for now, as we plan to refactor the IR. If we want to do such a type safe internal representation, I would suggest to implement it when we plan to do the first official release. |
The arguments to the respective data constructors would require |
During pretty printing, I don't think it would introduce any difficulties. These are just wrappers. |
Alright. |
Makes sense. I did not consider this. |
Naming: |
Please could you update the description of the proposal? |
Exactly. Constants could only be introduced in the above form. Everywhere else, we would be just passing variables around. |
The separation of external and ordinary function application has been scrapped. The reason being is that it makes writing code for function application cumbersome. Whenever we need the fact that the function is an external, we will usually have to look it up from the external mapping anyway. This lookup will determine whether the function is an external or not. This means, that the pattern match on the type of the function application is redundant. |
Extended the proposal with the removal of low-level GRIN construct.
|
As part of the ICFP tutorial there is a GADT, which helps to create syntactically expected GRIN programs. The rule indexed expressions show what can be combined how. This approach will live in the Syntax.GADT module. This change can be found in the #44 PR. |
I added a brief description for the GADT-style AST, and the new datalog based model for the AST. The DL model will need named case alternatives. |
Added a detailed explanation for named case alternatives. |
Syntactical extensions for GRIN
This document proposes new syntactic constructs for the GRIN IR, as well as modifications to some of the already existing ones.
Motivation
The current syntax of GRIN can pose some difficulties for analyses. As an example, the created-by analysis requires each binding to have a variable pattern. That is, each binding must have name. Also, analyzing programs where every intermediate value has an explicit name is much easier.
Furthermore, the current Haskell definition of the syntax allows certain erroneous programs. Currently, we rely on the linter to reject these programs, but a more rigid definition of the syntax could prevent errors statically.
Datalog based analysis
Currently, the analyses are implemented via compiled abstract interpretation, where the abstarct program is generated from Haskell and it can be run in both Haskell and C++. However, in the future, we plan to transition to Soufflé Datalog based analyses. In order to build a Datalog model for the GRIN AST, every single instruction must have a name. It is not a question of convenince, the model requires it. This is yet another reason to give an explicit name to everything in the AST.
Note: Soufflé has many advantages over our current implementation, but this is out of the scope of this proposal.
Naming
These syntactic constraints mainly deal with the naming of intermediate values. They make sure, that most of the currently available values can always be referred to by an explicit name. In other words, constants could only be introduced through the
pure
function. Everywhere else, we would be passing variables around. All of these constraints can be implemented trivially. Namely, we only have to change the below constructions so that they require aName
instead of a value.store
argumentAn observant reader might have noticed, that the pattern matching construct for bindings isn't addressed above. That is because we will deal with them separately. Also, note that
fetch
andupdate
already only accept variable names as arguments.New
These syntactic constructs are completely new to the GRIN language.
@patterns
As mentioned earlier, each binding should have a name, the variable it binds. Currently, the syntax allows for pattern bindings, which do not bind a variable to the left-hand side computation. This problem could be fixed by introducing @patterns.
@patterns combine the flexibility of value patterns with the rigidness of the naming convention. By removing value patterns from the language, and introducing @patterns, we could achieve the same expressive power as before, meanwhile making sure that everything can be referenced by an explicit name.
Note that unlike the @pattern in Haskell, here the variable name and the pattern are swapped. This is to keep the syntax consistent with named case alternatives.
Furthermore, we should restrict the available patterns only to nodes. Currently literals and even variable names are allowed. This is too general.
Function application(scrapped)Currently, normal functions and externals share the same syntactic node for application. Externals are stored together with GRIN programs, and they are differentiated from normal functions by looking up the applied function's name in the stored external list. This makes analyzing function applications quite inconvenient.
We could introduce different syntactic nodes for normal functions and externals, but that would pose an unnecessary overhead on analyses and transformations in certain use cases. Instead, we will introduce different types of names for functions and externals. The application node will differentiate functions from externals by wrapping their names in different data constructors.
Named case alternatives
To model the GRIN AST in Datalog, each syntactic construct must have an identifier. Currently, the alternatives in case expression don't have unique identifiers. We can solve this issue by naming each alternative.
The syntax would be similar to that of the @patterns, but the variable name and the pattern would be swapped. This is to improve the readabilityo of the code: with long variable names, the patterns can get lost. Also, there could be an arbitrary number of spaces between the @ symbol and the pattern/variable. This is also to improve readability though proper indentation. Readability is only important for unit tests, not for generated GRIN code.
Semantics
The names of the case alternatives would be the scrutinee restricted to the given alternative. For example, in the above example, we would know that
v1
must be a node costructed with theCNil
tag. Similarly,v2
is also a node, but is constructed with theCCons
tag.Named alternatives would make dataflow more explicit for case expressions. Currently, the analyses restrict the scrutinee when they are interpreting a given alternative (they are path sensitive), but with these changes, that kind of dataflow would made visible.
Structural
These modifications impose certain structural constraints on GRIN programs.
Last pure
Binding sequences should always end in a
pure
. This will make the control flow a little bit more explicit. However, this change could be impacted by the introduction of basic blocks. It might be wiser to delay this change.Program, function, definition
The above-mentioned notions should be different syntactic constructs. They should form a hierarchy, so it is always clear whether a transformation/analysis works on an entire GRIN program, a single function, or only on an expression.
Currently available, but has to be more precise
These modifications would turn currently linter-checked properties into static constraints. For example, only real expressions could appear on the left-hand side of a binding, or the alternatives of a case expression could really only be alternatives.
No longer needed
By introducing the above mentioned syntactic modification, some currently available constraints become redundant:
LPat
,SimpleVal
.Also, some of the low-level GRIN constructs will be removed from the AST:
VarTag
node and indexedFetch
. These constructs were originally needed for RISC code generation. Since then, GRIN transitioned to LLVM, and over the years these constructs proved to be unnecessary.Questions
Parsing function applications(scrapped)The current syntax does not differentiate normal function application from external function application. The new syntax will differentiate them. This means, we must decide whether a name corresponds to a normal function or to an external while parsing. Two solutions that might work are the following.
Use previoulsy parsed information(scrapped)We could add some sort of state to the parsers that keeps track of the available externals. Since externals can be added through the
PrimOpsPrelude
, we would also need to pass those as an extra argument to the parser.Naming convention for externals(scrapped)We could introduce a new naming convention for externals. They would always begin with an undersocre.
Implicit parsing of unit patterns
Currently, the parser can implictly parse bindings that have a unit pattern. The most common example for this is
update
. The string"update p v"
is parsed as if it was"() <- update p v"
. The new syntax does not allow unit patterns. We must think of an alternative way to express this. Maybe a wild card patterns?Also, wild card patterns could be parsed implicitly as well:
Answer
Every instruction must have a name, even function calls that only return a unit. Binding their result to a variable is necessary for unique identification, hence, the idea of unit and wildcard patterns are scrapped.
Future
Basic blocks
As for the future, we plan to introduce basic blocks into the language. This will bring its own syntactic modifications, but they will be mostly independent of the above-discussed changes.
GADT-style AST
Also, GRIN will have a GADT-style based AST as well. This is for the front end users of GRIN who want to generate well-structured GRIN programs. It will make all syntactic restrictions explicit, hence it will force the user to build a syntactically sound AST.
Represesnting the AST with GADTs has a serious drawback though: recursion schemes don't support GADTs at the moment. This means that neither the currently implemented analyses, nor the transformations will work on the GADT-style AST. This is the reason why we will only use it for the GRIN code generation. The front end will generate a GADT-style GRIN AST, then we will transform it into a plain old ADT, so that we continue working with it.
Prototype AST
The below examples only include the changes regarding to the naming convention.
New constructs
Changes
The text was updated successfully, but these errors were encountered: