more renaming
karlstroetmann committed Nov 3, 2024
1 parent 7428e7e commit e668458
Showing 4 changed files with 877 additions and 2 deletions.
@@ -44,8 +44,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"There is only a single token that need to be defined via a regular expression. \n",
-"The other tokens, namely the operators `+` and `*` and the parenthesis consist \n",
+"We only need to define four tokens. \n",
+"The other tokens, namely the operators `+`, `-`, `*`, and `/` and the parenthesis consist \n",
"only of a single character and can therefore be defined as literals."
]
},
198 changes: 198 additions & 0 deletions Python/Chapter-07/02-RegExp-Parser.ipynb
@@ -0,0 +1,198 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%load_ext nb_mypy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parsing Regular Expressions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The grammar for regular expressions is stored in the file `RegularExpressions.g4`. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!cat -n RegularExpressions.g4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We start by generating both the scanner and the parser. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!antlr4 -Dlanguage=Python3 RegularExpressions.g4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The files `RegularExpressionsLexer.py` and `RegularExpressionsParser.py` contain the generated scanner and parser, respectively. We have to import these files. Furthermore, the runtime of \n",
"<span style=\"font-variant:small-caps;\">Antlr</span>\n",
"needs to be imported."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from RegularExpressionsLexer import RegularExpressionsLexer\n",
"from RegularExpressionsParser import RegularExpressionsParser\n",
"import antlr4"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Recursive type alias: a leaf is a string, an inner node is a tuple of subtrees.\n",
"NestedTuple = str | tuple['NestedTuple', ...]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The function `ast` takes a string `s` as input. It parses this string and returns the resulting abstract syntax tree. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def ast(s: str) -> NestedTuple:\n",
"    input_stream = antlr4.InputStream(s)\n",
"    lexer = RegularExpressionsLexer(input_stream)\n",
"    token_stream = antlr4.CommonTokenStream(lexer)\n",
"    parser = RegularExpressionsParser(token_stream)\n",
"    return parser.regExp().result"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%nb_mypy Off"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ast('a+b*⋅(a+b⋅b*)')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!rm *.py *.tokens *.interp\n",
"!rm -r __pycache__/"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls -l"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}