file renaming

karlstroetmann · Nov 2, 2024 · a5e5146 · a5e5146
1 parent d634236
commit a5e5146

Showing 12 changed files with 488 additions and 17 deletions.
diff --git a/Lecture-Notes/formal-languages.pdf b/Lecture-Notes/formal-languages.pdf
diff --git a/Python/Chapter-07/Calculator.g4 → Python/ANTLR/Calculator.g4 b/Python/Chapter-07/Calculator.g4 → Python/ANTLR/Calculator.g4
diff --git a/Python/Chapter-07/Differentiator.g4 → Python/ANTLR/Differentiator.g4 b/Python/Chapter-07/Differentiator.g4 → Python/ANTLR/Differentiator.g4
diff --git a/Python/Chapter-07/Expr.g4 → Python/ANTLR/Expr.g4 b/Python/Chapter-07/Expr.g4 → Python/ANTLR/Expr.g4
diff --git a/Python/Chapter-07/Interpreter.ipynb → Python/ANTLR/Interpreter.ipynb b/Python/Chapter-07/Interpreter.ipynb → Python/ANTLR/Interpreter.ipynb
diff --git a/Python/Chapter-07/Program.g4 → Python/ANTLR/Program.g4 b/Python/Chapter-07/Program.g4 → Python/ANTLR/Program.g4
diff --git a/Python/Chapter-07/Pure.g4 → Python/ANTLR/Pure.g4 b/Python/Chapter-07/Pure.g4 → Python/ANTLR/Pure.g4
diff --git a/Python/ANTLR/PureParser.ipynb b/Python/ANTLR/PureParser.ipynb
@@ -0,0 +1,372 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from IPython.core.display import HTML\n",
+    "with open('../style.css') as file:\n",
+    "    css = file.read()\n",
+    "HTML(css)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Testing an <span style=\"font-variant:small-caps;\">Antlr</span> Grammar via `grun`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In order for the examples using <span style=\"font-variant:small-caps;\">Antlr</span> to work, \n",
+    "we first have to install <span style=\"font-variant:small-caps;\">Antlr</span>.  This can be done by executing \n",
+    "the following commands in an *Anaconda environment* that has been activated:\n",
+    "```\n",
+    "conda install -y -c conda-forge antlr4-python3-runtime\n",
+    "conda install -y -c conda-forge antlr\n",
+    "```\n",
+    "Alternatively, you can download https://www.antlr.org/download/antlr-4.13.1-complete.jar.  I will assume that this `.jar` file is \n",
+    "stored in the directory `/usr/local/lib/`.  Furthermore, I assume that both a *Java runtime*\n",
+    "and a *Java compiler* are available. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!conda install -y -c conda-forge antlr4-python3-runtime"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!conda install -y -c conda-forge antlr"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Our grammar is stored in the file `Expr.g4`.  In order to inspect it, we use the command line tool `cat`.  This works on macOS and Linux.  On Windows,\n",
+    "either use PowerShell, which understands `cat`, or use the command `type` instead.  The option `-n` of `cat` provides numbered output."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!cat -n Expr.g4"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Note that this grammar does not contain any *embedded actions*.  \n",
+    "Hence we cannot compute anything with it.  We will only be able to \n",
+    "check whether a given string is generated by this grammar.  We can generate both the scanner and the parser using the following command:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!antlr4 -Dlanguage=Python3 Expr.g4"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!ls -l"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The files `ExprLexer.py` and `ExprParser.py` contain the generated scanner and parser, respectively.\n",
+    "If we want to test the parser in this notebook, we have to import these files."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from ExprLexer  import ExprLexer\n",
+    "from ExprParser import ExprParser"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Of course, we also have to import `antlr4`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import antlr4"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we are able to parse a string.  The function `parse_string` takes a string `string` as its argument and checks\n",
+    "whether this string can be parsed as an arithmetic expression.  This is done in five steps:\n",
+    "- The string is converted into an `antlr4.InputStream`.\n",
+    "- The input stream is converted into a lexer.\n",
+    "- The lexer is converted into an `antlr4.CommonTokenStream`.\n",
+    "- The token stream is converted into a parser.\n",
+    "- The parser tries to parse the token stream, starting from the grammar rule `expr`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def parse_string(string): \n",
+    "    inputStream = antlr4.InputStream(string)\n",
+    "    lexer       = ExprLexer(inputStream)\n",
+    "    tokenStream = antlr4.CommonTokenStream(lexer)\n",
+    "    parser      = ExprParser(tokenStream)\n",
+    "    parser.expr()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "parse_string('1 + 2 * 3 - 4')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As there is no syntax error, the string `'1 + 2 * 3 - 4'` adheres to the specification given by our grammar.\n",
+    "Let's try a string that is not generated by our grammar."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "parse_string('1 + 2 * 3 ** 4')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As the operator `**` is not supported by our grammar, we get a *syntax error* at the \n",
+    "last occurrence of the character `*` in the given string.  \n",
+    "Note that the column count starts at `0`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "parse_string('1 < 2')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This time we get a *lexical error* as the character `<` is not a legal token."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can also generate a *parse tree* with our grammar.  However, for this to work, <span style=\"font-variant:small-caps;\">Antlr</span>\n",
+    "first has to generate a Java parser.  Hence we have to run <span style=\"font-variant:small-caps;\">Antlr</span> again, but this time with `Java` as the target language."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!java -jar /usr/local/lib/antlr-4.13.1-complete.jar -Dlanguage=Java Expr.g4 "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This command has generated some files for us that contain both a lexer and a parser.  \n",
+    "However, this time these are `.java` files."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!ls -l *.java"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We have to compile the generated `.java` files.  Below, you might have to change the path to \n",
+    "the file `antlr-4.13.1-complete.jar` to make this work."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!javac -cp .:/usr/local/lib/antlr-4.13.1-complete.jar *.java"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Next, we can start the so-called *TestRig* to generate and display the <em style=\"color:blue\">parse tree</em> for a given string."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!echo \"1+2*3-4\" | java -cp .:/usr/local/lib/antlr-4.13.1-complete.jar org.antlr.v4.gui.TestRig Expr expr -gui"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let us clean up the working directory."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!ls"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!rm *.py *.tokens *.interp *.java *.class\n",
+    "!rm -r __pycache__/"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!ls -l"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.5"
+  },
+  "varInspector": {
+   "cols": {
+    "lenName": 16,
+    "lenType": 16,
+    "lenVar": 40
+   },
+   "kernels_config": {
+    "python": {
+     "delete_cmd_postfix": "",
+     "delete_cmd_prefix": "del ",
+     "library": "var_list.py",
+     "varRefreshCmd": "print(var_dic_list())"
+    },
+    "r": {
+     "delete_cmd_postfix": ") ",
+     "delete_cmd_prefix": "rm(",
+     "library": "var_list.r",
+     "varRefreshCmd": "cat(var_dic_list()) "
+    }
+   },
+   "types_to_exclude": [
+    "module",
+    "function",
+    "builtin_function_or_method",
+    "instance",
+    "_Feature"
+   ],
+   "window_display": false
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
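The clean-up cell near the end of `PureParser.ipynb` relies on `rm`, which is not available in every shell, and `rm *.py` removes every Python file in the directory. A more selective variant could look like the sketch below. It assumes that all generated files start with the grammar name (for example `ExprLexer.py`, `ExprParser.java`, `Expr.tokens`); the helper name `remove_generated` and the suffix set are hypothetical and not part of the commit.

```python
from pathlib import Path

# Suffixes of the files that the notebook's clean-up cell deletes.
GENERATED_SUFFIXES = {'.py', '.tokens', '.interp', '.java', '.class'}

def remove_generated(directory, grammar):
    """Delete generated files for `grammar` in `directory` and return their names.

    Assumption: every generated file name starts with the grammar name.
    The grammar file itself (suffix .g4) is never touched.
    """
    removed = []
    for path in sorted(Path(directory).glob(grammar + '*')):
        if path.is_file() and path.suffix in GENERATED_SUFFIXES:
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling `remove_generated('.', 'Expr')` in the notebook directory would then play the role of the `!rm` line. Note that a hand-written file whose name also starts with `Expr` and has one of these suffixes would be deleted as well, and that the `__pycache__` directory is left alone.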
diff --git a/Python/ANTLR/RegularExpressions.g4 b/Python/ANTLR/RegularExpressions.g4
@@ -0,0 +1,20 @@
+grammar RegularExpressions;
+
+regExp returns [result]
+    : e=regExp '+' p=product {$result = ('+', $e.result, $p.result)}
+    | p=product              {$result = $p.result                  }    
+    ;
+
+product returns [result]
+    : p=product '⋅' f=factor {$result = ('⋅', $p.result, $f.result)}
+    | f=factor               {$result = $f.result                  }
+    ;
+
+factor returns [result]
+    : f=factor '*'           {$result = ('*', $f.result) }
+    | '(' e=regExp ')'       {$result = $e.result        }
+    | c=LETTER               {$result = $c.text          }
+    ;
+
+LETTER : [a-zA-Z];
+WS     : [ \t\n\r] -> skip; 
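The embedded actions in `RegularExpressions.g4` build nested tuples: a binary operator yields `(op, left, right)`, the Kleene star yields `('*', operand)`, and a letter yields its text. To make the shape of these results concrete without running ANTLR, here is a hand-written recursive-descent sketch that is intended to produce the same tuples. It is not the generated parser: the function names are hypothetical, and the left-recursive rules are replaced by loops, which keeps `+` and `⋅` left-associative.

```python
DOT = '⋅'  # the product operator, copied from the grammar

def tokenize(text):
    """Split text into one-character tokens, skipping whitespace like the WS rule."""
    tokens = []
    for ch in text:
        if ch in ' \t\n\r':
            continue
        if ch in '+*()' + DOT or (ch.isascii() and ch.isalpha()):
            tokens.append(ch)
        else:
            raise SyntaxError(f'illegal character {ch!r}')
    return tokens

def parse_regexp(tokens, pos):
    # regExp : regExp '+' product | product
    result, pos = parse_product(tokens, pos)
    while pos < len(tokens) and tokens[pos] == '+':
        right, pos = parse_product(tokens, pos + 1)
        result = ('+', result, right)
    return result, pos

def parse_product(tokens, pos):
    # product : product '⋅' factor | factor
    result, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == DOT:
        right, pos = parse_factor(tokens, pos + 1)
        result = (DOT, result, right)
    return result, pos

def parse_factor(tokens, pos):
    # factor : factor '*' | '(' regExp ')' | LETTER
    if pos >= len(tokens):
        raise SyntaxError('unexpected end of input')
    tok = tokens[pos]
    if tok == '(':
        result, pos = parse_regexp(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ')':
            raise SyntaxError("expected ')'")
        pos += 1
    elif tok.isalpha():
        result, pos = tok, pos + 1
    else:
        raise SyntaxError(f'unexpected token {tok!r}')
    while pos < len(tokens) and tokens[pos] == '*':
        result, pos = ('*', result), pos + 1
    return result, pos

def parse(text):
    """Parse a regular expression and return its nested-tuple representation."""
    tokens = tokenize(text)
    result, pos = parse_regexp(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f'unexpected token {tokens[pos]!r}')
    return result
```

For example, `parse('a⋅b* + c')` returns `('+', ('⋅', 'a', ('*', 'b')), 'c')`: the star binds tightest, then the product, then the sum, which mirrors the layering of `factor`, `product`, and `regExp` in the grammar.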
diff --git a/Python/Chapter-07/Simple.g4 → Python/ANTLR/Simple.g4 b/Python/Chapter-07/Simple.g4 → Python/ANTLR/Simple.g4