Merge branch 'main' into feature/multi_column_delete_to_exist
aman-db authored Nov 11, 2024
2 parents 8058738 + 27b143b commit 4ae2519
Showing 62 changed files with 3,746 additions and 3,932 deletions.
7 changes: 7 additions & 0 deletions .github/workflows/push.yml
@@ -149,6 +149,9 @@ jobs:
cache-dependency-path: '**/pyproject.toml'
python-version: '3.10'

- name: Install hatch
run: pip install hatch==$HATCH_VERSION

- name: Initialize Python virtual environment for StandardInputPythonSubprocess
run: make dev

@@ -220,6 +223,9 @@ jobs:
cache-dependency-path: '**/pyproject.toml'
python-version: '3.10'

- name: Install hatch
run: pip install hatch==$HATCH_VERSION

- name: Install Databricks CLI
uses: databricks/setup-cli@main

@@ -272,3 +278,4 @@ jobs:

- name: Run Lint Test with Maven
run: mvn compile -DskipTests --update-snapshots -B exec:java -pl linter --file pom.xml -Dexec.args="-i core/src/main/antlr4 -o .venv/linter/grammar -c true"
continue-on-error: true
3 changes: 2 additions & 1 deletion .gitignore
@@ -17,4 +17,5 @@ spark-warehouse/
remorph_transpile/
/linter/gen/
/linter/src/main/antlr4/library/gen/
.databricks-login.json
.databricks-login.json
/core/src/main/antlr4/com/databricks/labs/remorph/parsers/*/gen/
33 changes: 33 additions & 0 deletions CHANGELOG.md

Large diffs are not rendered by default.

36 changes: 34 additions & 2 deletions CONTRIBUTING.md
@@ -73,8 +73,34 @@ Map(log_level -> disabled, name -> foo)

This section provides a step-by-step guide to set up and start working on the project. These steps will help you set up your project environment and dependencies for efficient development.

To begin, run `make dev` to install [Hatch](https://github.com/pypa/hatch), create the default environment and install development dependencies, assuming you've already cloned the github repo.
To begin, install prerequisites:

`wget` is required by the Maven installer:
```shell
brew install wget
```

`maven` is the dependency manager for JVM-based languages:
```shell
brew install maven
```

`jdk11` is the JDK used by remorph.
Download it from [OpenJDK11](https://www.openlogic.com/openjdk-downloads?field_java_parent_version_target_id=406&field_operating_system_target_id=431&field_architecture_target_id=391&field_java_package_target_id=396) and install it.

`python` is the language runtime required by `hatch` and the project's Python tooling:
```shell
brew install python
```

`hatch` is a Python project manager:
```shell
pip install hatch
```

Then run the project-specific install scripts:

`make dev` creates the default environment and installs development dependencies, assuming you've already cloned the GitHub repo:
```shell
make dev
```
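
Before running `make dev`, it can help to confirm that the prerequisites listed above are actually on your `PATH`. The following is a minimal sketch, not part of the project: the executable names (`mvn` for Maven, `java` for the JDK, `python3` for Python) and the helper `missing_tools` are assumptions made for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names for the prerequisites; adjust them for your system.
REQUIRED_TOOLS = ("wget", "mvn", "java", "python3", "hatch")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites were found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough. Note that the sketch checks presence only, not versions, so it will not tell you whether the installed JDK is actually JDK 11.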
@@ -89,7 +115,13 @@ To ensure your integrated development environment (IDE) uses the newly created v
hatch run python -c "import sys; print(sys.executable)"
```

Configure your IDE to use this Python path so that you work within the virtual environment when developing the project:
At the time of writing, we only support IntelliJ IDEA CE 2024.1. Development using more recent versions doesn't work yet.
Download and install [IntelliJ IDEA](https://www.jetbrains.com/idea/download/other.html).

Configure your IDE to:
- use OpenJDK11 as the SDK for the project
- install the IntelliJ Scala plugin version 2024.1.25. Do not use more recent versions; they do not work.
- use this Python venv path so that you work within the virtual environment when developing the project:
![IDE Setup](docs/img/remorph_intellij.gif)

Before every commit, apply consistent formatting to the code, as we want our codebase to look consistent:
1 change: 0 additions & 1 deletion Makefile
@@ -4,7 +4,6 @@ clean:
rm -fr .venv clean htmlcov .mypy_cache .pytest_cache .ruff_cache .coverage coverage.xml

dev:
pip3 install hatch
hatch env create
hatch run pip install -e '.[test]'
hatch run which python
13 changes: 12 additions & 1 deletion NOTICE
@@ -64,6 +64,16 @@ cryptography - https://github.com/pyca/cryptography
Copyright 2013-2023 The Python Cryptographic Authority and individual contributors.
License - https://github.com/pyca/cryptography/blob/main/LICENSE

circe - https://github.com/circe/circe
Copyright (c) 2015, Ephox Pty Ltd, Mark Hibberd, Sean Parsons, Travis Brown, and other contributors. All rights reserved.
https://github.com/circe/circe/blob/series/0.14.x/LICENSE

circe-generic-extras - https://github.com/circe/circe-generic-extras
https://github.com/circe/circe-generic-extras/blob/main/LICENSE

circe-jackson - https://github.com/circe/circe-jackson
https://github.com/circe/circe-jackson/blob/main/LICENSE

This software contains code from the following open source projects, licensed under the BSD license:

ANTLR v4 - https://github.com/antlr/antlr4
@@ -77,4 +87,5 @@ Copyright (2023) Databricks, Inc.
https://github.com/databrickslabs/blueprint/blob/main/LICENSE

Databricks Connect - https://pypi.org/project/databricks-connect/
Copyright (2019) Databricks Inc.
Copyright (2019) Databricks Inc.

103 changes: 60 additions & 43 deletions core/pom.xml
@@ -1,6 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>com.databricks.labs</groupId>
@@ -24,8 +24,7 @@
<scala-logging.version>3.9.5</scala-logging.version>
<databricks-sdk-java.version>0.27.1</databricks-sdk-java.version>
<databricks-connect.version>15.1.0</databricks-connect.version>
<ujson.version>4.0.0</ujson.version>
<upickle.version>4.0.0</upickle.version>
<circe.version>0.14.2</circe.version>
<mssql-jdbc.version>12.8.0.jre8</mssql-jdbc.version>
<snowflake-jdbc.version>3.18.0</snowflake-jdbc.version>
<os-lib.version>0.10.1</os-lib.version>
@@ -123,15 +122,24 @@
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.lihaoyi</groupId>
<artifactId>upickle_2.12</artifactId>
<version>${upickle.version}</version>
<groupId>io.circe</groupId>
<artifactId>circe-core_${scala.binary.version}</artifactId>
<version>${circe.version}</version>
</dependency>
<dependency>
<groupId>com.lihaoyi</groupId>
<artifactId>ujson_${scala.binary.version}</artifactId>
<version>${ujson.version}</version>
<scope>compile</scope>
<groupId>io.circe</groupId>
<artifactId>circe-generic_${scala.binary.version}</artifactId>
<version>${circe.version}</version>
</dependency>
<dependency>
<groupId>io.circe</groupId>
<artifactId>circe-generic-extras_${scala.binary.version}</artifactId>
<version>${circe.version}</version>
</dependency>
<dependency>
<groupId>io.circe</groupId>
<artifactId>circe-jackson215_${scala.binary.version}</artifactId>
<version>${circe.version}</version>
</dependency>
<dependency>
<groupId>com.lihaoyi</groupId>
@@ -211,6 +219,15 @@
<listener>false</listener>
<sourceDirectory>src/main/antlr4</sourceDirectory>
<treatWarningsAsErrors>true</treatWarningsAsErrors>
<libDirectory>${project.basedir}/src/main/antlr4/com/databricks/labs/remorph/parsers/lib</libDirectory>
<outputDirectory>${project.build.directory}/generated-sources/antlr4</outputDirectory>
<includes>
<include>**/*.g4</include> <!-- Include all .g4 files -->
</includes>
<excludes>
<exclude>**/lib/*.g4</exclude> <!-- But exclude the library grammars-->
<exclude>**/basesnowflake.g4</exclude>
</excludes>
</configuration>
</plugin>
<plugin>
@@ -255,39 +272,39 @@
<artifactId>exec-maven-plugin</artifactId>
<version>3.4.1</version>
<executions>
<!-- &lt;!&ndash; npm install antlr-format-cli &ndash;&gt;-->
<!-- <execution>-->
<!-- <id>npm install antlr-format antlr-format-cli</id>-->
<!-- <phase>validate</phase>-->
<!-- <goals>-->
<!-- <goal>exec</goal>-->
<!-- </goals>-->
<!-- <configuration>-->
<!-- <executable>${project.build.directory}/node/npm</executable>-->
<!-- <arguments>-->
<!-- <argument>install</argument>-->
<!-- <argument>-g</argument>-->
<!-- <argument>&#45;&#45;save-dev</argument>-->
<!-- <argument>antlr-format</argument>-->
<!-- <argument>antlr-format-cli</argument>-->
<!-- </arguments>-->
<!-- </configuration>-->
<!-- </execution>-->
<!-- &lt;!&ndash; antlr-fmt &ndash;&gt;-->
<!-- <execution>-->
<!-- <id>antlr-fmt</id>-->
<!-- <phase>validate</phase>-->
<!-- <goals>-->
<!-- <goal>exec</goal>-->
<!-- </goals>-->
<!-- <configuration>-->
<!-- <executable>${project.build.directory}/bin/antlr-format</executable>-->
<!-- <arguments>-->
<!-- <argument>-v</argument>-->
<!-- <argument>${basedir}/src/main/antlr4/**/*.g4</argument>-->
<!-- </arguments>-->
<!-- </configuration>-->
<!-- </execution>-->
<!-- &lt;!&ndash; npm install antlr-format-cli &ndash;&gt;-->
<!-- <execution>-->
<!-- <id>npm install antlr-format antlr-format-cli</id>-->
<!-- <phase>validate</phase>-->
<!-- <goals>-->
<!-- <goal>exec</goal>-->
<!-- </goals>-->
<!-- <configuration>-->
<!-- <executable>${project.build.directory}/node/npm</executable>-->
<!-- <arguments>-->
<!-- <argument>install</argument>-->
<!-- <argument>-g</argument>-->
<!-- <argument>&#45;&#45;save-dev</argument>-->
<!-- <argument>antlr-format</argument>-->
<!-- <argument>antlr-format-cli</argument>-->
<!-- </arguments>-->
<!-- </configuration>-->
<!-- </execution>-->
<!-- &lt;!&ndash; antlr-fmt &ndash;&gt;-->
<!-- <execution>-->
<!-- <id>antlr-fmt</id>-->
<!-- <phase>validate</phase>-->
<!-- <goals>-->
<!-- <goal>exec</goal>-->
<!-- </goals>-->
<!-- <configuration>-->
<!-- <executable>${project.build.directory}/bin/antlr-format</executable>-->
<!-- <arguments>-->
<!-- <argument>-v</argument>-->
<!-- <argument>${basedir}/src/main/antlr4/**/*.g4</argument>-->
<!-- </arguments>-->
<!-- </configuration>-->
<!-- </execution>-->
<execution>
<goals>
<goal>java</goal>
@@ -0,0 +1,14 @@
# ANTLR Grammar Library

This directory contains ANTLR grammar files that are common to more than one SQL dialect. One example is the grammar that covers
stored procedures, which all dialects of SQL support in some form, and for which we have a universal grammar.

ANTLR processes included grammars as pure text, in the same way that, say, the C pre-processor processes `#include` directives.
This means that you must be careful to ensure that:
- new tokens defined in an included grammar do not clash with tokens in the including grammar.
- new rules defined in an included grammar do not clash with rules in the including grammar.

In particular, you must avoid creating ambiguities in rule/token prediction. ANTLR will still try to create
a parser, but the generated code performs extremely long token lookahead and is therefore very slow.

In other words, you cannot just arbitrarily throw together some common Lexer and Parser rules and expect them
to just work.
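
As a rough aid for the clash checks described above, the sketch below lists names that are defined in both an including grammar and an included one. It is a heuristic, not an ANTLR tool: it assumes every rule or token definition starts at the beginning of a line, and the helper names `defined_names` and `clashing_names` are hypothetical.

```python
import re
from typing import Set

# Assumption: a definition starts at column 0, optionally prefixed by `fragment`,
# and the colon follows the name on the same line or on the next line.
_DEFINITION = re.compile(
    r"^(?:fragment\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*:", re.MULTILINE
)


def defined_names(grammar_text: str) -> Set[str]:
    """Collect the names of rules and tokens defined in a grammar (heuristic)."""
    # Drop comments first so that commented-out definitions are not counted.
    # This is a plain text substitution, so comment markers inside string
    # literals are treated as comments too.
    text = re.sub(r"/\*.*?\*/", "", grammar_text, flags=re.DOTALL)
    text = re.sub(r"//[^\n]*", "", text)
    return set(_DEFINITION.findall(text))


def clashing_names(including: str, included: str) -> Set[str]:
    """Return the names that both grammars define."""
    return defined_names(including) & defined_names(included)
```

A name reported by `clashing_names` is only a candidate for review. The check cannot detect prediction ambiguities between differently named rules, which still have to be found by inspecting the generated parser's behaviour.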