Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use smart-pointer and generate C++ code for bison/flex #410

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

hangingman
Copy link

@hangingman hangingman commented Jan 29, 2022

ref #293

PR description:
I modified large amount of source code related with issue #293 .
I would like to take a review. If there are any problems in this PR, please point them out.
And If it's OK, tell me what unit tests should be added.

PR Summary

  • bnfc

    • Changes on Makefile and file extension
    • Switching C++ AST/Test code generator by flag
    • use using instead of typedef
    • Introduce std::shared_ptr
    • Modified Bison/Flex code genrator
  • bnfc-system-tests

    • Make enable filtering of bnfc-system-tests

PR contents

  • Makefile and file extension changes:

    • If bnfc not received --ansi flag, Makefile would use -std=c++14 compile option otherwise use --ansi flag.
    • If bnfc not received --ansi flag, bnfc will generate .cc, .hh, .yy, .ll files for bison/flex. In the nutshell, it represents C++ things.
  • C++ AST/Test code generator switching

    • If bnfc not received --ansi flag, bnfc would generate codes using smart-pointer (std::shared_ptr), otherwise bnfc generate codes same as before.
  • use using instead of typedef

/********************   TypeDef Section    ********************/

using Integer = int;
using Char = char;
using Double = double;
using String = std::string;
using Ident = std::string;

using Hex = std::string;
using Label = std::string;
  • Introduce std::shared_ptr, like following:
    • So far, just my personal opinion, bnfc generated C++ code requires users to recursively destroy created instances after parsed AST. Because, AST classes hold new-ed class instances and there is no code to release those instances. (... but it was not so required because after parsed something and exited the program, memory will be released by OS. So I wrote it as a my personal opinion.)
    • After introduced std::unique_ptr, it's not necessary to release memory manually. It's convenient. (std::unique_ptr is not copy-able. This is not good for storing parsed AST and returning it. On the other hand, std::shared_ptr is copy-able , furthermore it will be automatically released after their reference count will be 0. So I introduced shared_ptr. We can find several implmentations using std::shared_ptr as bison's semantic type in the world GitHub. So, this is common implementation.)
/********************   Abstract Syntax Classes    ********************/

class Program : public Visitable
{
public:
    virtual std::shared_ptr<Program> clone() const = 0;

};

...

class Prog : public Program
{
public:
    std::shared_ptr<ListStatement> liststatement_;

    Prog(std::shared_ptr<ListStatement> p1)
    : Program(), liststatement_{p1}
    {};


    virtual void accept(Visitor *v) override;
    std::shared_ptr<Program>  clone() const override;
};
  • I modified Impl class as simple.
void Prog::accept(Visitor *v)
{
    v->visitProg(this);
}

std::shared_ptr<Program> Prog::clone() const
{
    return std::make_shared<Prog>(*this);
}
  • Bison/Flex code genrator modified:
    • To generate C++ code by bison, I selected lalr1.cc as skeleton.
    • Use variant instead of union. It requires bison >= 3.2.
    • bnfc will generate "Driver" class to handle parser/lexer classes. It will hold parsed AST as their field. I referred Flex and Bison in C++ to implement it.

bison's yy file will be like following; You can find bison's yy file using variant, and shared_ptr.

/* Generate header file for lexer. */
%defines "bison.hh"
%define api.namespace {Xxxx}
/* Specify the namespace for the C++ parser class. */

/* Reentrant parser */
/* lalr1.cc always pure parser. needless to define %define api.pure full */

%define api.parser.class {XxxxParser}
%code top {
#include <memory>
}
%code requires{
#include "absyn.hh"

    namespace Xxxx {
        class XxxxScanner;
        class XxxxDriver;
    }
}
%parse-param { XxxxScanner  &scanner  }
%parse-param { XxxxDriver  &driver  }

%locations
/* variant based implementation of semantic values for C++ */
%require "3.2"
%define api.value.type variant
/* 'yacc.c' does not support variant, so use skeleton 'lalr1.cc' */
%skeleton "lalr1.cc"

%code{
/* Begin C++ preamble code */
#include <algorithm> /* for std::reverse */
#include <iostream>
#include <cstdlib>
#include <fstream>

/* include for all driver functions */
#include "driver.hh"

#undef yylex
#define yylex scanner.lex
}


%token              _ERROR_

...

%type <std::shared_ptr<Program>> Program
%type <std::shared_ptr<ListStatement>> ListStatement

%start Program

%%

Program : ListStatement { $1->reverse();$$ = std::make_shared<Prog>($1); driver.program_ = $$; }
;
ListStatement : Statement { $$ = std::make_shared<ListStatement>(); $$->cons($1); driver.liststatement_ = $$; }
  | Statement ListStatement { $2->cons($1); $$ = $2; driver.liststatement_ = $$; }
  • I modified bnfc to generate entrypoint has std::istream &stream as their parameters. Because it can handle file contents and stdin data. Entrypoints will be generated in "Driver" class.
namespace Xxxx{

class  XxxxDriver{
public:

...
    std::shared_ptr<Program> pProgram(std::istream &stream);
  • Use smart pointer in test C++ source
    • Finally, we can see smart-pointer edition of Test.cc ( Compiler Front-End Test )
int main(int argc, char ** argv)
{
    int quiet = 0;
    char *filename = NULL;

    if (argc > 1) {
        if (strcmp(argv[1], "-s") == 0) {
            quiet = 1;
            if (argc > 2) {
                filename = argv[2];
            }
        } else {
            filename = argv[1];
        }
    }

    /* The default entry point is used. For other options see Parser.H */
    std::shared_ptr<Program> parse_tree = nullptr;
    try {

        auto driver = std::make_unique<Xxxx::XxxxDriver>();
        if (filename) {
            std::ifstream input(filename);
            if ( ! input.good() ) {
                usage();
                exit(1);
            }
            parse_tree = driver->pProgram(input);
        } else {
            parse_tree = driver->pProgram(std::cin);
        }

    } catch( parse_error &e) {
        std::cerr << "Parse error on line " << e.getLine() << "\n";
    }

    if (parse_tree)
    {
        printf("\nParse Successful!\n");
        if (!quiet) {
            printf("\n[Abstract Syntax]\n");
            auto s = std::make_unique<ShowAbsyn>(ShowAbsyn());
            printf("%s\n\n", s->show(parse_tree.get()));
            printf("[Linearized Tree]\n");
            auto p = std::make_unique<PrintAbsyn>(PrintAbsyn());
            printf("%s\n\n", p->print(parse_tree.get()));
      }

      return 0;
    }
    return 1;
}

@andreasabel
Copy link
Member

Thanks for the update, @hangingman ! I will have a look soon, but I decided to make 2.9.4 a conservative release, and only then merge new features.

@andreasabel andreasabel force-pushed the issues/293_enhance_new_cpp_poc_2 branch from 5f6eebb to cc2c700 Compare March 3, 2022 10:36
@andreasabel
Copy link
Member

Thanks @hangingman for the PR!
Please make sure the tests still work. You can run the testsuite from the /testing directory:

/testing$ make

I pushed a commit that puts the C++ tests first.

At the moment, they error if flag -l is supplied (which adds position information).

Also, the plain --ansi version fails to compile:

Calc.y:40:33: error: typedef redefinition with different types ('struct yy_buffer_state *' vs
      'struct calc__buffer_state *')
typedef struct yy_buffer_state *YY_BUFFER_STATE;

(This is for bnfc -m --cpp --ansi /examples/Calc.cf && make.)

@andreasabel andreasabel added AST Concerning the generated abstract syntax C++ labels Mar 3, 2022
@hangingman
Copy link
Author

@andreasabel Thank you !

At the moment, they error if flag -l is supplied (which adds position information).

Got it. I'll look into it.

Also, the plain --ansi version fails to compile:

OK, I will fix it.

@andreasabel
Copy link
Member

andreasabel commented Mar 13, 2022

Looks like Haskell-CI is currently broken because of

I'll fix a push to master and then this PR can be rebased onto master.

@hangingman
Copy link
Author

@andreasabel Oh, I got it.

@andreasabel
Copy link
Member

I pushed the fix for #416 to master just now.

@hangingman hangingman force-pushed the issues/293_enhance_new_cpp_poc_2 branch from 67111e6 to a700506 Compare March 13, 2022 13:16
@andreasabel
Copy link
Member

Sorry broke the stack build by accident, but now both builds should work. (Fixed in #417.)

hangingman and others added 7 commits March 14, 2022 08:22
* Modify include file and Parser header file name as psXXX
* Add compile option to `src/BNFC/Backend/CPP/Makefile.hs`
* Modify C++ generator module for mainly using unique_ptr
* Implement switching bison %union and variant
* Implement C++ parser with driver setting
* Implement C++ driver/scanner class
* Fix file extension
* Fix token type
* Fix flex buffer related code
* Refactor makefile compile option
* Implement reverse function in ListXXX class
Add member fields for Driver class to receive parsed objects
Update CFtoBisonC and CFtoSTLAbs for wrapping std::uniuqe_ptr
Update C++ Abs files and Printer class
Update PrettyPrinter code for using smart-ptr
Enable to compile testcc
* add driver entry codes
* modify SkelSTL.hs for header file extension
* Enable debugging of lexer/parser

Change bison yy file to use shared_ptr instead of unique_ptr
Fix following:
* Dont' generate `override` keyword on `--ansi` mode 
* Remove doubly defined YY_BUFFER_STATE  in `--ansi` mode
* Switch `--ansi` and `-std=c++14` mode in `source/src/BNFC/Backend/CPP/PrettyPrinter.hs`
* Fix Parser.h, switch entry points interfaces
* Use C++ location/position interface
@hangingman hangingman force-pushed the issues/293_enhance_new_cpp_poc_2 branch from a700506 to e0d21a2 Compare March 13, 2022 23:22
@hangingman
Copy link
Author

hangingman commented Mar 22, 2022

@andreasabel CI is passed. But, I realized test cases under the testing directory is failing.
I'm checking it.

working on this PR hangingman#5

)

* Make enable filtering of bnfc-system-tests
* Remove override keyword from clone() C++
* Add work-around of "transition table overflow, automaton is too big" https://stackoverflow.com/a/63461031/2565527
* Add yylval->emplace<char>(); for escaped charcters
* Deal with C++ NO STL
* Add switching of smart-pointer on pp and absyn
* Fix skelton code generator
* Absyn list should append the last when adding the element
* Remove dummy FlexLexer class
* Fix C/C++ namespace problem
* Fix bnfc define statement problem
* Switch using shared-ptr or not, according to base-classes or token-classes
* Fix cons impl in definedRules
* Fix error handling
* deal with reversible classes such that it can do left recursion
@hangingman
Copy link
Author

@andreasabel Hi,
I make it passed all C/C++ system tests. hangingman#5 (comment)
I think bnfc code in this branch can generate decent C++ code.
Please take a look it.

@tokomine
Copy link

@andreasabel @VBeatrice Hi,
I have already tried this code in my project and it works very well. Could you please tell me when you will be able to merge this pull request? Thank you for your time and attention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AST Concerning the generated abstract syntax C++
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants