renaming
karlstroetmann committed Nov 3, 2024
1 parent da6ece7 commit e1648c0
Showing 1 changed file with 198 additions and 0 deletions.
198 changes: 198 additions & 0 deletions Python/ANTLR/02-RegExp-Parser.ipynb
@@ -0,0 +1,198 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%load_ext nb_mypy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parsing Regular Expressions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The grammar for regular expressions is stored in the file `RegularExpressions.g4`. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!cat -n RegularExpressions.g4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We start by generating both the scanner and the parser. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!antlr4 -Dlanguage=Python3 RegularExpressions.g4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The files `RegularExpressionsLexer.py` and `RegularExpressionsParser.py` contain the generated scanner and parser, respectively. We have to import these files. Furthermore, the runtime of \n",
"<span style=\"font-variant:small-caps;\">Antlr</span>\n",
"needs to be imported."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from RegularExpressionsLexer import RegularExpressionsLexer\n",
"from RegularExpressionsParser import RegularExpressionsParser\n",
"import antlr4"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
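"from typing import TypeAlias\n",
"\n",
"# A syntax tree is either a string (a leaf) or a tuple of syntax trees.\n",
"# The quoted name is a forward reference, which makes the alias recursive.\n",
"NestedTuple: TypeAlias = str | tuple['NestedTuple', ...]"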
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The function `ast` takes a string `s` as input. This string is parsed and the resulting abstract syntax tree is returned. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
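"def ast(s: str) -> NestedTuple:\n",
"    # Wrap the string so that the lexer can read it.\n",
"    input_stream = antlr4.InputStream(s)\n",
"    lexer = RegularExpressionsLexer(input_stream)\n",
"    # Buffer the tokens produced by the lexer for the parser.\n",
"    token_stream = antlr4.CommonTokenStream(lexer)\n",
"    parser = RegularExpressionsParser(token_stream)\n",
"    # Parse with the rule `regExp` and return its `result` attribute.\n",
"    return parser.regExp().result"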
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%nb_mypy Off"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ast('a+b*⋅(a+b⋅b*)')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!rm *.py *.tokens *.interp\n",
"!rm -r __pycache__/"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls -l"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}
