Commit 0932cbc: ALDS paper and v2.1.0 release
lucaslie committed Oct 12, 2021 (parent: b753745)
Showing 154 changed files with 9,546 additions and 164 deletions.
36 changes: 25 additions & 11 deletions README.md
@@ -1,8 +1,8 @@
# Neural Network Pruning
[Lucas Liebenwein](https://people.csail.mit.edu/lucasl/),
[Cenk Baykal](http://www.mit.edu/~baykal/),
[Alaa Maalouf](https://www.linkedin.com/in/alaa-maalouf/),
[Igor Gilitschenski](https://www.gilitschenski.org/igor/),
[Harry Lang](https://www.csail.mit.edu/person/harry-lang),
[Dan Feldman](http://people.csail.mit.edu/dannyf/),
[Daniela Rus](http://danielarus.csail.mit.edu/)

@@ -15,6 +15,7 @@ This repository contains code to reproduce the results from the following
papers:
| Paper | Venue | Title & Link |
| :---: | :---: | :--- |
| **ALDS** | NeurIPS 2021 | [Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition](https://arxiv.org/abs/2107.11442) |
| **Lost** | MLSys 2021 | [Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy](https://proceedings.mlsys.org/paper/2021/hash/2a79ea27c279e471f4d180b08d62b00a-Abstract.html) |
| **PFP** | ICLR 2020 | [Provable Filter Pruning for Efficient Neural Networks](https://openreview.net/forum?id=BJxkOlSYDH) |
| **SiPP** | arXiv | [SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks](https://arxiv.org/abs/1910.05422) |
@@ -34,16 +35,16 @@ about the paper and scripts and parameter configuration to reproduce the exact
results from the paper.
| Paper | Location |
| :---: | :---: |
| **ALDS** | [paper/alds](./paper/alds) |
| **Lost** | [paper/lost](./paper/lost) |
| **PFP** | [paper/pfp](./paper/pfp) |
| **SiPP** | [paper/sipp](./paper/sipp) |

## Setup
We provide three ways to install the codebase:
1. [Github repo + full conda environment](#1-github-repo)
2. [Installation via pip](#2-pip-installation)
3. [Docker image](#3-docker-image)

### 1. Github Repo
Clone the github repo:
@@ -97,14 +98,27 @@ using the codebase.
| --- | --- |
| [src/torchprune/README.md](./src/torchprune) | more details on pruning neural networks, how to use and set up the data sets, how to implement custom pruning methods, and how to add your own data sets and networks. |
| [src/experiment/README.md](./src/experiment) | more details on how to configure and run your own experiments, and more information on how to reproduce the results. |
| [paper/alds/README.md](./paper/alds) | more information on the [ALDS](https://arxiv.org/abs/2107.11442) paper. |
| [paper/lost/README.md](./paper/lost) | more information on the [Lost](https://proceedings.mlsys.org/paper/2021/hash/2a79ea27c279e471f4d180b08d62b00a-Abstract.html) paper. |
| [paper/pfp/README.md](./paper/pfp) | more information on the [PFP](https://openreview.net/forum?id=BJxkOlSYDH) paper. |
| [paper/sipp/README.md](./paper/sipp) | more information on the [SiPP](https://arxiv.org/abs/1910.05422) paper. |

## Citations
Please cite the respective papers when using our work.

### [Towards Determining the Optimal Layer-wise Decomposition](https://arxiv.org/abs/2107.11442)
```
@inproceedings{liebenwein2021alds,
author = {Lucas Liebenwein and Alaa Maalouf and Dan Feldman and Daniela Rus},
booktitle = {Advances in Neural Information Processing Systems},
title = {Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition},
url = {https://arxiv.org/abs/2107.11442},
volume = {34},
year = {2021}
}
```

### [Lost In Pruning](https://proceedings.mlsys.org/paper/2021/hash/2a79ea27c279e471f4d180b08d62b00a-Abstract.html)
```
@article{liebenwein2021lost,
title={Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy},
@@ -115,7 +129,7 @@ year={2021}
}
```

### [Provable Filter Pruning](https://openreview.net/forum?id=BJxkOlSYDH)
```
@inproceedings{liebenwein2020provable,
title={Provable Filter Pruning for Efficient Neural Networks},
@@ -126,12 +140,12 @@ url={https://openreview.net/forum?id=BJxkOlSYDH}
}
```

### [SiPPing Neural Networks](https://arxiv.org/abs/1910.05422)
```
@article{baykal2019sipping,
title={SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks},
author={Baykal, Cenk and Liebenwein, Lucas and Gilitschenski, Igor and Feldman, Dan and Rus, Daniela},
journal={arXiv preprint arXiv:1910.05422},
year={2019}
}
```
3 changes: 3 additions & 0 deletions misc/Dockerfile
@@ -31,6 +31,9 @@ RUN touch /etc/bashrc_custom && \
# tell bash to source the custom bashrc when a shell is started
ENV BASH_ENV /etc/bashrc_custom

# Ensure that GPUs are not visible by default
ENV NVIDIA_VISIBLE_DEVICES none

# Copy source files and requirements file
COPY ./src /src
COPY ./misc /misc
Binary file added misc/imgs/alds_overview.png
106 changes: 106 additions & 0 deletions paper/alds/README.md
@@ -0,0 +1,106 @@
# Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition
[Lucas Liebenwein*](https://people.csail.mit.edu/lucasl/),
[Alaa Maalouf*](https://www.linkedin.com/in/alaa-maalouf/),
[Dan Feldman](http://people.csail.mit.edu/dannyf/),
[Daniela Rus](http://danielarus.csail.mit.edu/)

***Equal contribution**

<p align="center">
<img align="center" src="../../misc/imgs/alds_overview.png" width="100%">
</p>
<!-- <br clear="left"/> -->

We present a global compression framework for deep neural networks that
automatically analyzes each layer to identify the optimal per-layer compression
ratio, while simultaneously achieving the desired overall compression. Our
algorithm hinges on the idea of compressing each convolutional (or
fully-connected) layer by slicing its channels into multiple groups and
decomposing each group via low-rank decomposition.
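
For intuition, here is a minimal, self-contained sketch of that building block
(this is our illustration, not the torchprune implementation; the group count
and rank are arbitrary example values): the input channels of a weight matrix
are sliced into groups, and each group is replaced by a truncated SVD.
```python
# Illustrative only: grouped low-rank compression of a single weight matrix.
# num_groups and rank are assumptions for this example, not values chosen by ALDS.
import numpy as np

def grouped_low_rank(W, num_groups=3, rank=8):
    """Slice input channels into groups, truncate each group's SVD, and
    return the per-group factors plus the relative reconstruction error."""
    groups = np.array_split(np.arange(W.shape[1]), num_groups)
    factors, W_hat = [], np.zeros_like(W)
    for idx in groups:
        U, S, Vt = np.linalg.svd(W[:, idx], full_matrices=False)
        k = min(rank, len(S))
        U_k, SVt_k = U[:, :k], S[:k, None] * Vt[:k]   # two thin factors per group
        factors.append((idx, U_k, SVt_k))
        W_hat[:, idx] = U_k @ SVt_k
    rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
    return factors, rel_err

W = np.random.randn(256, 384)   # e.g. a flattened convolutional or fc weight
factors, rel_err = grouped_low_rank(W)
n_params = sum(U.size + SVt.size for _, U, SVt in factors)
print(f"params: {W.size} -> {n_params}, relative error: {rel_err:.3f}")
```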

We frame compression as an optimization problem in which we minimize the
maximum compression error across layers, and we propose an efficient algorithm
to solve it.
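
In schematic form (the notation below is ours and is not taken verbatim from
the paper), the per-layer choices are picked to minimize the worst-case
relative error subject to an overall size budget:
```latex
% Schematic only: k_l denotes the per-layer decomposition choice
% (number of groups and ranks), B the overall compression budget.
\min_{k_1,\dots,k_L}\; \max_{\ell \in \{1,\dots,L\}}\;
  \frac{\lVert W_\ell - \hat{W}_\ell(k_\ell) \rVert}{\lVert W_\ell \rVert}
\quad \text{s.t.} \quad
  \sum_{\ell=1}^{L} \operatorname{size}\bigl(\hat{W}_\ell(k_\ell)\bigr) \le B
```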

Compared to a manual solution (i.e., compressing each layer by hand), our
algorithm (_Automatic Layer-wise Decomposition Selector_ or `ALDS`)
automatically determines the decomposition for each layer, enabling higher
compression ratios at the same level of accuracy without substantial manual
hyperparameter tuning.

## Setup
Check out the main [README.md](../../README.md) and the respective packages for
more information on the code base.

## Overview

### Implementation of main algorithm (`ALDS`)
Our main algorithm (_Automatic Layer-wise Decomposition Selector_ or `ALDS`) is
integrated into the [`torchprune`](../../src/torchprune) package.

The implementation can be found
[here](../../src/torchprune/torchprune/method/alds).

### Run compression experiments
The experiment configurations are located [here](./param). To reproduce the
experiments for a specific configuration, run:
```bash
python -m experiment.main param/cifar/prune/resnet20.yaml
```

### Visualize results

You should be able to retrieve the nominal prune-accuracy trade-offs
from the `data/results` folder.

You can also visualize the results using the
[`results_viewer.py`](./script/results_viewer.py) script:
```bash
python results_viewer.py
```
Run it from inside the [`script`](./script) folder. The script can
also be run interactively as a Jupyter notebook.

### Load network checkpoint

If you want to use the network checkpoints in your own experiments or code,
follow the [load_networks.py](./script/load_networks.py) script. It should be
self-explanatory.
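
For orientation, restoring a stored PyTorch checkpoint typically looks like the
minimal sketch below; the file name, architecture, and checkpoint keys are
placeholders rather than the actual ones used by `load_networks.py`.
```python
# Hypothetical example of loading a saved checkpoint; the real paths, keys,
# and network construction live in script/load_networks.py.
import torch
from torchvision.models import resnet18  # stand-in architecture

model = resnet18(num_classes=10)
checkpoint = torch.load("checkpoint.pth", map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)  # support either layout
model.load_state_dict(state_dict)
model.eval()
```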

### Hyperparameter sweeps

If you want to run a hyperparameter sweep over different amounts of retraining,
you can run
```bash
python -m experiment.main param/cifar/retrainsweep/resnet20.yaml
```

Note that this experiment is computationally expensive: it repeats the
compression experiment for different amounts of retraining and different
compression methods over multiple repetitions.

To visualize the results use the
[retrain_sweep.py](./script/retrain_sweep.py) script:
```bash
python retrain_sweep.py
```
Run it from inside the [`script`](./script) folder. The script can
also be run interactively as a Jupyter notebook.

## Citation
Please cite the following paper when using our work.

### Paper link
[Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition](https://arxiv.org/abs/2107.11442)

### Bibtex
```
@inproceedings{liebenwein2021alds,
author = {Lucas Liebenwein and Alaa Maalouf and Dan Feldman and Daniela Rus},
booktitle = {Advances in Neural Information Processing Systems},
title = {Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition},
url = {https://arxiv.org/abs/2107.11442},
volume = {34},
year = {2021}
}
```
28 changes: 28 additions & 0 deletions paper/alds/param/cifar/prune/common.yaml
@@ -0,0 +1,28 @@
retraining:
  numEpochs: 0

experiments:
  methods:
    - "ALDSNet"
    - "PCANet"
    - "SVDFrobeniusNet"
    - "SVDNet"
    - "LearnedRankNetScheme0"
    - "FilterThresNet"
    - "PFPNet"
  mode: "retrain"

  numRepetitions: 1
  numNets: 3

  plotting:
    minVal: 0.2
    maxVal: 0.95

  spacing:
    - type: "geometric"
      numIntervals: 20
      maxVal: 0.97
      minVal: 0.2

  retrainIterations: -1
9 changes: 9 additions & 0 deletions paper/alds/param/cifar/prune/densenet22.yaml
@@ -0,0 +1,9 @@
network:
  name: "densenet22"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/densenet.yaml"

file: "paper/alds/param/cifar/prune/common.yaml"
9 changes: 9 additions & 0 deletions paper/alds/param/cifar/prune/resnet110.yaml
@@ -0,0 +1,9 @@
network:
  name: "resnet110"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/resnet.yaml"

file: "paper/alds/param/cifar/prune/common.yaml"
9 changes: 9 additions & 0 deletions paper/alds/param/cifar/prune/resnet20.yaml
@@ -0,0 +1,9 @@
network:
  name: "resnet20"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/resnet.yaml"

file: "paper/alds/param/cifar/prune/common.yaml"
30 changes: 30 additions & 0 deletions paper/alds/param/cifar/prune/resnet20_plus.yaml
@@ -0,0 +1,30 @@
network:
  name: "resnet20"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/resnet.yaml"

file: "paper/alds/param/cifar/prune/common.yaml"

experiments:
  methods:
    - "ALDSNetPlus"
    - "ALDSNet"
  mode: "retrain"

  numRepetitions: 1
  numNets: 3

  plotting:
    minVal: 0.2
    maxVal: 0.95

  spacing:
    - type: "geometric"
      numIntervals: 20
      maxVal: 0.97
      minVal: 0.2

  retrainIterations: -1
9 changes: 9 additions & 0 deletions paper/alds/param/cifar/prune/resnet56.yaml
@@ -0,0 +1,9 @@
network:
  name: "resnet56"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/resnet.yaml"

file: "paper/alds/param/cifar/prune/common.yaml"
9 changes: 9 additions & 0 deletions paper/alds/param/cifar/prune/vgg16.yaml
@@ -0,0 +1,9 @@
network:
  name: "vgg16_bn"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/vgg.yaml"

file: "paper/alds/param/cifar/prune/common.yaml"
9 changes: 9 additions & 0 deletions paper/alds/param/cifar/prune/wrn16_8.yaml
@@ -0,0 +1,9 @@
network:
  name: "wrn16_8"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/wrn.yaml"

file: "paper/alds/param/cifar/prune/common.yaml"
27 changes: 27 additions & 0 deletions paper/alds/param/cifar/pruneablation/common.yaml
@@ -0,0 +1,27 @@
retraining:
  numEpochs: 0

experiments:
  methods:
    - "ALDSNet"
    - "ALDSNetErrorOnly"
    - "ALDSNetSimple"
    - "ALDSNetSimple5"
    - "MessiNet"
    - "SVDErrorNet"
  mode: "retrain"

  numRepetitions: 1
  numNets: 3

  plotting:
    minVal: 0.2
    maxVal: 0.95

  spacing:
    - type: "geometric"
      numIntervals: 20
      maxVal: 0.96
      minVal: 0.2

  retrainIterations: -1
9 changes: 9 additions & 0 deletions paper/alds/param/cifar/pruneablation/resnet20.yaml
@@ -0,0 +1,9 @@
network:
  name: "resnet20"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/resnet.yaml"

file: "paper/alds/param/cifar/pruneablation/common.yaml"
9 changes: 9 additions & 0 deletions paper/alds/param/cifar/pruneablation/wrn16_8.yaml
@@ -0,0 +1,9 @@
network:
  name: "wrn16_8"
  dataset: "CIFAR10"
  outputSize: 10

training:
  file: "training/cifar/wrn.yaml"

file: "paper/alds/param/cifar/pruneablation/common.yaml"
