Merge ParBlockDiagonal #18

Merged 4 commits on Sep 17, 2024
Binary file added .DS_Store
Binary file not shown.
33 changes: 33 additions & 0 deletions .github/workflows/DocCleanup.yml
@@ -0,0 +1,33 @@
name: Doc Preview Cleanup

on:
  pull_request:
    types: [closed]

# Ensure that only one "Doc Preview Cleanup" workflow is force pushing at a time
concurrency:
  group: doc-preview-cleanup
  cancel-in-progress: false

jobs:
  doc-preview-cleanup:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - name: Checkout gh-pages branch
        uses: actions/checkout@v4
        with:
          ref: gh-pages
      - name: Delete preview and history + push changes
        run: |
          if [ -d "${preview_dir}" ]; then
              git config user.name "Documenter.jl"
              git config user.email "[email protected]"
              git rm -rf "${preview_dir}"
              git commit -m "delete preview"
              git branch gh-pages-new $(echo "delete history" | git commit-tree HEAD^{tree})
              git push --force origin gh-pages-new:gh-pages
          fi
        env:
          preview_dir: previews/PR${{ github.event.number }}
22 changes: 22 additions & 0 deletions .github/workflows/Documenter.yml
@@ -0,0 +1,22 @@
name: Documenter
on:
  push:
    branches:
      - main
    tags: '*'
  pull_request:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: julia-actions/setup-julia@latest
        with:
          version: '1.9.4'
      - name: Install dependencies
        run: julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()'
      - name: Build and deploy
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
        run: julia --project=docs/ docs/make.jl
31 changes: 31 additions & 0 deletions .github/workflows/TagBot.yml
@@ -0,0 +1,31 @@
name: TagBot
on:
  issue_comment:
    types:
      - created
  workflow_dispatch:
    inputs:
      lookback:
        default: 3
permissions:
  actions: read
  checks: read
  contents: write
  deployments: read
  issues: read
  discussions: read
  packages: read
  pages: read
  pull-requests: read
  repository-projects: read
  security-events: read
  statuses: read
jobs:
  TagBot:
    if: github.event_name == 'workflow_dispatch' || github.actor == 'JuliaTagBot'
    runs-on: ubuntu-latest
    steps:
      - uses: JuliaRegistries/TagBot@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          ssh: ${{ secrets.DOCUMENTER_KEY }}
1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@ examples/.ipynb_checkpoints
*.svg
*.png
*.dot
!logo.png
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

-Copyright (c) 2022 SLIM Group @ Georgia Institute of Technology
+Copyright (c) 2024 SLIM Group @ Georgia Institute of Technology

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
20 changes: 18 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
name = "ParametricOperators"
uuid = "db9e0614-c73c-4112-a40c-114e5b366d0d"
-authors = ["Thomas Grady <tgrady@gatech.edu>", "Richard Rex <richardr2926@gatech.edu>"]
-version = "0.1.0"
+authors = ["Richard Rex <richardr2926@gatech.edu>", "Thomas Grady <tgrady@gatech.edu>"]
+version = "1.1.3"

[deps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
@@ -19,3 +19,19 @@ Match = "7eb4fadd-790c-5f42-8a69-bfa0b872bfbf"
OMEinsum = "ebe7aa44-baf0-506c-a96f-8464559b3922"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"

[compat]
CUDA = "5.2.0"
ChainRulesCore = "1.23.0"
Combinatorics = "1.0.2"
DataStructures = "0.18.18"
FFTW = "1.8.0"
Flux = "0.14.15"
GraphViz = "0.2.0"
JSON = "0.21.4"
LRUCache = "1.6.1"
LaTeXStrings = "1.3.1"
MPI = "0.20.19"
Match = "2.0.0"
OMEinsum = "0.8.1"
julia = "1.9.4"
48 changes: 48 additions & 0 deletions README.md
@@ -0,0 +1,48 @@
# ParametricOperators.jl

[![][license-img]][license-status]
[![Documenter](https://github.com/slimgroup/ParametricOperators.jl/actions/workflows/Documenter.yml/badge.svg)](https://github.com/slimgroup/ParametricOperators.jl/actions/workflows/Documenter.yml)
[![TagBot](https://github.com/slimgroup/ParametricOperators.jl/actions/workflows/TagBot.yml/badge.svg)](https://github.com/slimgroup/ParametricOperators.jl/actions/workflows/TagBot.yml)

<!-- [![][zenodo-img]][zenodo-status] -->

`ParametricOperators.jl` is a Julia Language-based scientific library designed to facilitate the creation and manipulation of tensor operations involving large-scale data using Kronecker products. It provides an efficient and mathematically consistent way to express tensor programs and distribution in the context of machine learning.

> [!NOTE]
> [`ParametericDFNOs.jl`](https://github.com/slimgroup/ParametericDFNOs.jl/) is built on `ParametricOperators.jl`

## Features
- <b>Kronecker Product Operations:</b> Implement tensor operations using Kronecker products for linear operators acting along multiple dimensions.
- <b>Parameterization Support:</b> Enables parametric functions in tensor programs, crucial for statistical optimization algorithms.
- <b>High-Level Abstractions:</b> Close to the underlying mathematics, providing a seamless user experience for scientific practitioners.
- <b>Distributed Computing:</b> Scales Kronecker product tensor programs and gradient computation to multi-node distributed systems.
- <b>Domain-Specific Language:</b> Optimized for Julia's just-in-time compilation, allowing for the construction of complex operators entirely at compile time.

## Setup

```julia
julia> using Pkg
julia> Pkg.activate("path/to/your/environment")
julia> Pkg.add("ParametricOperators")
```

This will add `ParametricOperators.jl` as a dependency to your project.

## Documentation

Check out the [Documentation](https://slimgroup.github.io/ParametricOperators.jl) for more or get started by running some [examples](https://github.com/turquoisedragon2926/ParametricOperators.jl-Examples)!

## Issues

This section will contain common issues and their corresponding fixes. Currently, we only provide support for Julia 1.9.

## Authors

Richard Rex, [[email protected]](mailto:[email protected]) <br/>
Thomas Grady <br/>
Mark Glines <br/>

[license-status]:LICENSE
<!-- [zenodo-status]:https://doi.org/10.5281/zenodo.6799258 -->
[license-img]:http://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat?style=plastic
<!-- [zenodo-img]:https://zenodo.org/badge/DOI/10.5281/zenodo.3878711.svg?style=plastic -->
Binary file added docs/.DS_Store
Binary file not shown.
2 changes: 2 additions & 0 deletions docs/.gitignore
@@ -0,0 +1,2 @@
build/
site/
3 changes: 3 additions & 0 deletions docs/Project.toml
@@ -0,0 +1,3 @@
[deps]
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
ParametricOperators = "db9e0614-c73c-4112-a40c-114e5b366d0d"
31 changes: 31 additions & 0 deletions docs/make.jl
@@ -0,0 +1,31 @@
using Documenter
using ParametricOperators

makedocs(
    sitename = "ParametricOperators.jl",
    format = Documenter.HTML(),
    # modules = [ParametricOperators],
    pages=[
        "Introduction" => "index.md",
        "Quick Start" => "quickstart.md",
        "Distribution" => "distribution.md",
        "Examples" => [
            "3D FFT" => "examples/3D_FFT.md",
            "Distributed 3D FFT" => "examples/3D_DFFT.md",
            "3D Conv" => "examples/3D_Conv.md",
            "Distributed 3D Conv" => "examples/3D_DConv.md",
        ],
        "API" => "api.md",
        "Future Work" => "future.md",
        "Citation" => "citation.md"
    ]
)

# Automatically deploy documentation to gh-pages.
deploydocs(
    repo = "github.com/slimgroup/ParametricOperators.jl.git",
    push_preview=true,
    devurl = "dev",
    devbranch = "main",
    versions = ["dev" => "dev", "stable" => "v^"],
)
Binary file added docs/src/.DS_Store
Binary file not shown.
3 changes: 3 additions & 0 deletions docs/src/api.md
@@ -0,0 +1,3 @@
### API usage for different operators

Coming soon...
Binary file added docs/src/assets/.DS_Store
Binary file not shown.
Binary file added docs/src/assets/logo.png
12 changes: 12 additions & 0 deletions docs/src/citation.md
@@ -0,0 +1,12 @@
### Citation

If you use `ParametricOperators.jl`, please cite the following:
```
@presentation {rex2023ML4SEISMIClsp,
title = {Large-scale parametric PDE approximations with model-parallel Fourier neural operators},
year = {2023},
month = {11},
url = {https://slim.gatech.edu/Publications/Public/Conferences/ML4SEISMIC/2023/rex2023ML4SEISMIClsp},
author = {Richard Rex and Thomas J. Grady II and Rishi Khan and Ziyi Yin and Felix J. Herrmann}
}
```
162 changes: 162 additions & 0 deletions docs/src/distribution.md
@@ -0,0 +1,162 @@
# Distribution as Linear Algebra

We adopt the approach of treating the distribution of tensor computations as linear algebra operations.

This allows `ParametricOperators.jl` to offer several high-level APIs for performing controlled parallelism as part of your tensor program in the context of machine learning.

## Kronecker Distribution

### Distributed Fourier Transform

Let's consider the example of the Fourier transform as seen in the [Fourier Transform Example](@ref):

```julia
# Define type and the size of our global tensor
T = Float32
gx, gy, gz = 10, 20, 30

Fx = ParDFT(T, gx)
Fy = ParDFT(Complex{T}, gy)
Fz = ParDFT(Complex{T}, gz)

F = Fz ⊗ Fy ⊗ Fx
```

Assume that our data is partitioned across multiple machines according to the following scheme:

```julia
partition = [1, 1, 2]
```

Each element of `partition` denotes the number of processing elements that divide our input tensor along that dimension.

For example, given the above partition and global size, our local tensor would be of size:

```julia
x = rand(T, 10, 20, 15)
```

Or, equivalently:

```julia
localx, localy, localz = [gx, gy, gz] .÷ partition
x = rand(T, localx, localy, localz)
```
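
As a minimal sketch of this bookkeeping in plain Julia: the helper name `local_shape` is hypothetical and is not part of `ParametricOperators.jl`, and it assumes that every global dimension divides evenly by its entry in `partition`.

```julia
# Hypothetical helper: per-worker tensor shape for a given global size and partition.
# Assumes each dimension is split evenly; otherwise it throws an error.
function local_shape(global_size, partition)
    length(global_size) == length(partition) ||
        throw(ArgumentError("global_size and partition must have the same length"))
    all(global_size .% partition .== 0) ||
        throw(ArgumentError("each global dimension must be divisible by its partition"))
    return Tuple(global_size .÷ partition)
end

gx, gy, gz = 10, 20, 30
partition = [1, 1, 2]

shape = local_shape([gx, gy, gz], partition)   # (10, 20, 15)
nworkers = prod(partition)                     # 2 processing elements in total
x = rand(Float32, shape...)
```

The product of the entries of `partition` gives the total number of processing elements the tensor is spread over.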

Now, following the method seen in several recent works (Grady et al., [2022](https://arxiv.org/pdf/2204.01205.pdf)) and [traditional distributed FFTs](https://jipolanco.github.io/PencilFFTs.jl/dev/tutorial/), we can distribute the application of our linearly separable transform across multiple processing elements by simply doing:

```julia
F = distribute(F, partition)
```

Now, to apply the Fourier Transform to our tensor, one can do:

```julia
F * vec(x)
```
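
To see why a Kronecker product acting on `vec(x)` is "linearly separable", the following sketch checks the underlying identity with small dense matrices from Julia's standard library. The matrices `A`, `B`, and `C` are illustrative stand-ins, not `ParDFT` operators, and nothing here is distributed.

```julia
using LinearAlgebra

# Small dense stand-ins for the per-dimension operators (not ParDFT).
nx, ny, nz = 2, 3, 4
A = rand(nx, nx)   # acts along dimension 1
B = rand(ny, ny)   # acts along dimension 2
C = rand(nz, nz)   # acts along dimension 3
X = rand(nx, ny, nz)

# Apply the full Kronecker product to the flattened (column-major) tensor.
y_kron = kron(C, B, A) * vec(X)

# Apply each factor along its own dimension instead.
Y = zeros(nx, ny, nz)
for k in 1:nz, j in 1:ny, i in 1:nx
    Y[i, j, k] = sum(A[i, a] * B[j, b] * C[k, c] * X[a, b, c]
                     for a in 1:nx, b in 1:ny, c in 1:nz)
end

@assert y_kron ≈ vec(Y)
```

Because each factor touches only one dimension, a dimension that is not partitioned can be transformed locally, which is what makes this structure convenient to distribute.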

Another out-of-the-box example can be seen at [Distributed FFT of a 3D Tensor](@ref).

### Distributed Convolution

!!! note "Definition of Convolution"
    Convolution here refers to the application of a linear transform along the channel dimension.

Now, in order to extend this to a convolution layer, let's consider the following partitioned tensor:

```julia
T = Float32

gx, gy, gc = 10, 30, 50
partition = [2, 2, 1]

nx, ny, nc = [gx, gy, gc] .÷ partition
x = rand(T, nx, ny, nc)
```

Our tensor is sharded across the x and y dimensions, with 2 processing elements along each dimension.

We can define the operators of our convolution as:

```julia
Sx = ParIdentity(T, gx)
Sy = ParIdentity(T, gy)
Sc = ParMatrix(T, gc, gc)
```

Chain our operators and distribute them:

```julia
S = Sc ⊗ Sy ⊗ Sx
S = distribute(S, partition)
```

Parametrize and apply our transform:

```julia
θ = init(S)
S(θ) * vec(x)
```
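
The structure `Sc ⊗ Sy ⊗ Sx` with identities along x and y can be illustrated with plain dense matrices. This is only a sketch of the math under that assumption: `W` is a stand-in for the channel matrix, not a `ParMatrix`, and the "shard" is a simple array slice rather than a real distributed tensor.

```julia
using LinearAlgebra

nx, ny, nc = 2, 3, 4
W = rand(nc, nc)          # dense stand-in for the channel matrix (not a ParMatrix)
X = rand(nx, ny, nc)

# Kronecker form: identity along x and y, W along the channel dimension.
Ix = Matrix{Float64}(I, nx, nx)
Iy = Matrix{Float64}(I, ny, ny)
y_kron = kron(W, Iy, Ix) * vec(X)

# Equivalent matrix form: flatten the spatial dimensions and mix channels.
Y = reshape(X, nx * ny, nc) * transpose(W)
@assert y_kron ≈ vec(Y)

# Because x and y only see identities, a spatial shard can be processed on its own.
shard = X[1:1, :, :]                                   # one slab along x
Y_shard = reshape(shard, 1 * ny, nc) * transpose(W)
@assert reshape(Y_shard, 1, ny, nc) ≈ reshape(Y, nx, ny, nc)[1:1, :, :]
```

The last check shows that the result for a spatial slab depends only on that slab, so sharding along x and y requires no data exchange for the forward application.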

Take the gradient of the parameters w.r.t. some objective by simply doing:

```julia
θ′ = gradient(θ -> sum(S(θ) * vec(x)), θ)
```

Another out-of-the-box example can be seen at [Distributed Parametrized Convolution of a 3D Tensor](@ref).

## Sharing Weights

Sharing weights can be thought of as a broadcasting operation.

In order to share weights of an operator across multiple processing elements, we can do:

```julia
A = ParMatrix(T, 20, 20)
A = distribute(A)
```

Assume the following partition and tensor shape:

```julia
gc, gx = 20, 100
partition = [1, 4]

nc, nx = [gc, gx] .÷ partition
x = rand(T, nc, nx)
```

Initialize and apply the matrix operator on the sharded tensor:

```julia
θ = init(A)
A(θ) * x
```

Compute the gradient by doing:

```julia
θ′ = gradient(θ -> sum(A(θ) * x), θ)
```
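
The "broadcast" view of weight sharing has a natural counterpart in the backward pass: if every worker applies the same weights to its own shard, the gradient of the shared weights is the sum of the per-worker gradients. The sketch below simulates this on a single process with plain arrays. It shows only the math; how `distribute(A)` realizes it internally is not shown here, and `local_grad` is a hypothetical helper written for the specific objective `sum(A * x)`.

```julia
gc, gx = 20, 100
nshards = 4
A = rand(Float32, gc, gc)
X = rand(Float32, gc, gx)

# Split the global tensor into column blocks, one per simulated worker.
nx = gx ÷ nshards
shards = [X[:, (r - 1) * nx + 1 : r * nx] for r in 1:nshards]

# Forward pass: every worker applies the same (broadcast) weights to its shard.
Y_local = [A * xs for xs in shards]
@assert hcat(Y_local...) ≈ A * X

# For the objective sum(A * x), the gradient w.r.t. A is ones(gc) * sum(x, dims=2)'.
local_grad(xs) = ones(Float32, gc) * transpose(vec(sum(xs, dims=2)))

# Shared weights receive the sum of the per-worker gradients.
grad_shared = sum(local_grad(xs) for xs in shards)
@assert grad_shared ≈ local_grad(X)
```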

## Reduction Operation

In order to perform a reduction operation, more commonly known as an `ALL_REDUCE` operation, we can define:

```julia
R = ParReduce(T)
```

Given any local vector or matrix, we can do:

```julia
x = rand(T, 100)
R * x
```

To compute the gradient of the input w.r.t. some objective:

```julia
x′ = gradient(x -> sum(R * x), x)
```
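
For readers unfamiliar with all-reduce, the following single-process sketch simulates its semantics. It assumes the reduction is a sum, which the text above does not state explicitly, and `allreduce_sum` is a hypothetical stand-in rather than the `ParReduce` implementation or an MPI call.

```julia
# Simulate 4 workers, each holding its own local vector.
nworkers = 4
locals = [rand(Float32, 100) for _ in 1:nworkers]

# Hypothetical stand-in for a sum all-reduce: every worker ends up with the same total.
allreduce_sum(xs) = [reduce(+, xs) for _ in xs]

reduced = allreduce_sum(locals)

@assert all(r ≈ sum(locals) for r in reduced)   # every worker holds the global sum
@assert length(reduced) == nworkers
```

In a real multi-node run each worker would hold only its own entry of `locals`, and the communication library would perform the exchange.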