Merge branch 'dev_metadata' of https://github.com/theislab/pertpy into dev_metadata

* 'dev_metadata' of https://github.com/theislab/pertpy:
  Documentation examples (#391)
  [pre-commit.ci] pre-commit autoupdate (#395)
  Speed up tests by subsampling (#398)
  Installation Apple Silicon (#393)
  Add new distances (#304)
  Fix cinema OT test (#392)
  [pre-commit.ci] pre-commit autoupdate (#390)
  wasserstein distance return type float (#386)
  fix naming of example data in doc examples (#387)
  Add test for test_distances.py
  Catches error as reported in Issue #385.
  Fix mypy warning for distances
  Type hint for `groups` reverted, Iterable is too general.
scverse · Oct 16, 2023 · 5eeec8f
2 parents 862a5c3 + 252b10a
commit 5eeec8f
Showing 33 changed files with 1,471 additions and 234 deletions.
diff --git a/.github/workflows/publish_docs.yml b/.github/workflows/publish_docs.yml
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -25,12 +25,12 @@ repos:
             # https://github.com/jupyterlab/jupyterlab/issues/12675
             language_version: "17.9.1"
     - repo: https://github.com/astral-sh/ruff-pre-commit
-      rev: v0.0.291
+      rev: v0.0.292
       hooks:
           - id: ruff
             args: [--fix, --exit-non-zero-on-fix]
     - repo: https://github.com/pre-commit/pre-commit-hooks
-      rev: v4.4.0
+      rev: v4.5.0
       hooks:
           - id: detect-private-key
           - id: check-ast

diff --git a/docs/contributing.md b/docs/contributing.md
@@ -13,10 +13,16 @@ In addition to the packages needed to _use_ this package, you need additional py
 the documentation_. It's easy to install them using `pip`:
 
 ```bash
+git clone https://github.com/theislab/pertpy.git
 cd pertpy
 pip install -e ".[dev,test,doc]"
 ```
 
+```{note}
+If you're working on an Apple Silicon machine, the installation is slightly more complex. In that case, follow
+the steps described in the Apple Silicon section of the installation guide and replace the last two steps described there with the code above.
+```
+
 ## Code-style
 
 This project uses [pre-commit][] to enforce consistent code-styles. On every commit, pre-commit checks will either

diff --git a/docs/installation.md b/docs/installation.md
@@ -40,8 +40,53 @@ Once you have a copy of the source, you can install it with:
 $ make install
 ```
 
+## Apple Silicon
+
+If you want to install and use pertpy on a Mac with an Apple Silicon (M-series) chip, the installation is slightly more complex.
+This is because pertpy depends on [scvi-tools], which currently runs on Apple Silicon machines only when installed
+with a native Python build (due to a dependency on jax, which cannot run under Rosetta).
+
+Follow these steps to install pertpy on an Apple Silicon machine (tested on a MacBook Pro with M1 chip and macOS 14.0):
+
+1. Install [Homebrew]
+
+2. Install the Apple Silicon version of Mambaforge (if you already have Anaconda/Miniconda installed, make sure
+   that having both mamba and conda won't cause conflicts)
+
+    ```console
+    $ brew install --cask mambaforge
+    ```
+
+3. Create a new environment using mamba (here with Python 3.10) and activate it
+
+    ```console
+    $ mamba create -n pertpy-env python=3.10
+    $ mamba activate pertpy-env
+    ```
+
+4. Clone the GitHub Repository
+
+    ```console
+    $ git clone https://github.com/theislab/pertpy.git
+    ```
+
+5. Go inside the pertpy folder and install pertpy
+
+    ```console
+    $ cd pertpy
+    $ pip install .
+    ```
+
+    Now you're ready to use pertpy as usual within the environment (`import pertpy`).
+
+    ```
+
+    ```
+
 [github repo]: https://github.com/theislab/pertpy
 [pip]: https://pip.pypa.io
 [poetry]: https://python-poetry.org/
 [python installation guide]: http://docs.python-guide.org/en/latest/starting/installation/
 [tarball]: https://github.com/theislab/pertpy/tarball/master
+[scvi-tools]: https://docs.scvi-tools.org/en/latest/installation.html
+[Homebrew]: https://brew.sh/
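
The Apple Silicon instructions above depend on running a native (arm64) Python rather than one translated by Rosetta. A quick check before installing can confirm which kind of interpreter is active. The following is a minimal sketch using only the standard library; the helper name `is_native_apple_silicon` is hypothetical and not part of pertpy. It assumes that `platform.machine()` reports `arm64` for a native interpreter on Apple Silicon and `x86_64` for an x86_64 build running under Rosetta.

```python
import platform


def is_native_apple_silicon(system=None, machine=None):
    """Return True if the interpreter runs natively on Apple Silicon.

    Hypothetical helper for illustration; the arguments default to the
    values reported for the running interpreter.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # macOS reports "Darwin"; a native arm64 Python reports "arm64",
    # while an x86_64 build (e.g. under Rosetta) reports "x86_64".
    return system == "Darwin" and machine == "arm64"


if __name__ == "__main__":
    print(platform.system(), platform.machine(), is_native_apple_silicon())
```

If this prints `False` on an M-series Mac, the active environment was probably created with an x86_64 conda or Python distribution.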
diff --git a/pertpy/plot/_augur.py b/pertpy/plot/_augur.py
@@ -26,6 +26,23 @@ def dp_scatter(results: pd.DataFrame, top_n=None, ax: Axes = None, return_figure
 
         Returns:
             Axes of the plot.
+
+        Examples:
+            >>> import pertpy as pt
+            >>> adata = pt.dt.bhattacherjee()
+            >>> ag_rfc = pt.tl.Augur("random_forest_classifier")
+
+            >>> data_15 = ag_rfc.load(adata, condition_label="Maintenance_Cocaine", treatment_label="withdraw_15d_Cocaine")
+            >>> adata_15, results_15 = ag_rfc.predict(data_15, random_state=None, n_threads=4)
+            >>> adata_15_permute, results_15_permute = ag_rfc.predict(data_15, augur_mode="permute", n_subsamples=100, random_state=None, n_threads=4)
+
+            >>> data_48 = ag_rfc.load(adata, condition_label="Maintenance_Cocaine", treatment_label="withdraw_48h_Cocaine")
+            >>> adata_48, results_48 = ag_rfc.predict(data_48, random_state=None, n_threads=4)
+            >>> adata_48_permute, results_48_permute = ag_rfc.predict(data_48, augur_mode="permute", n_subsamples=100, random_state=None, n_threads=4)
+
+            >>> pvals = ag_rfc.predict_differential_prioritization(augur_results1=results_15, augur_results2=results_48,
+            ...     permuted_results1=results_15_permute, permuted_results2=results_48_permute)
+            >>> pt.pl.ag.dp_scatter(pvals)
         """
         x = results["mean_augur_score1"]
         y = results["mean_augur_score2"]
@@ -69,6 +86,14 @@ def important_features(
 
         Returns:
             Axes of the plot.
+
+        Examples:
+            >>> import pertpy as pt
+            >>> adata = pt.dt.sc_sim_augur()
+            >>> ag_rfc = pt.tl.Augur("random_forest_classifier")
+            >>> loaded_data = ag_rfc.load(adata)
+            >>> v_adata, v_results = ag_rfc.predict(loaded_data, subsample_size=20, select_variance_features=True, n_threads=4)
+            >>> pt.pl.ag.important_features(v_results)
         """
         if isinstance(data, AnnData):
             results = data.uns[key]
@@ -115,6 +140,14 @@ def lollipop(
 
         Returns:
             Axes of the plot.
+
+        Examples:
+            >>> import pertpy as pt
+            >>> adata = pt.dt.sc_sim_augur()
+            >>> ag_rfc = pt.tl.Augur("random_forest_classifier")
+            >>> loaded_data = ag_rfc.load(adata)
+            >>> v_adata, v_results = ag_rfc.predict(loaded_data, subsample_size=20, select_variance_features=True, n_threads=4)
+            >>> pt.pl.ag.lollipop(v_results)
         """
         if isinstance(data, AnnData):
             results = data.uns[key]
@@ -157,6 +190,15 @@ def scatterplot(
 
         Returns:
             Axes of the plot.
+
+        Examples:
+            >>> import pertpy as pt
+            >>> adata = pt.dt.sc_sim_augur()
+            >>> ag_rfc = pt.tl.Augur("random_forest_classifier")
+            >>> loaded_data = ag_rfc.load(adata)
+            >>> h_adata, h_results = ag_rfc.predict(loaded_data, subsample_size=20, n_threads=4)
+            >>> v_adata, v_results = ag_rfc.predict(loaded_data, subsample_size=20, select_variance_features=True, n_threads=4)
+            >>> pt.pl.ag.scatterplot(v_results, h_results)
         """
         cell_types = results1["summary_metrics"].columns
 

diff --git a/pertpy/plot/_coda.py b/pertpy/plot/_coda.py
@@ -108,6 +108,15 @@ def stacked_barplot(  # pragma: no cover
 
         Returns:
             A :class:`~matplotlib.axes.Axes` object
+
+        Examples:
+            Example with scCODA:
+            >>> import pertpy as pt
+            >>> haber_cells = pt.dt.haber_2017_regions()
+            >>> sccoda = pt.tl.Sccoda()
+            >>> mdata = sccoda.load(haber_cells, type="cell_level", generate_sample_level=True, cell_type_identifier="cell_label",
+            ...     sample_identifier="batch", covariate_obs=["condition"])
+            >>> pt.pl.coda.stacked_barplot(mdata, feature_name="samples")
         """
         if isinstance(data, MuData):
             data = data[modality_key]
@@ -196,6 +205,17 @@ def effects_barplot(  # pragma: no cover
         Returns:
             Depending on `plot_facets`, returns a :class:`~matplotlib.axes.Axes` (`plot_facets = False`)
             or :class:`~sns.axisgrid.FacetGrid` (`plot_facets = True`) object
+
+        Examples:
+            Example with scCODA:
+            >>> import pertpy as pt
+            >>> haber_cells = pt.dt.haber_2017_regions()
+            >>> sccoda = pt.tl.Sccoda()
+            >>> mdata = sccoda.load(haber_cells, type="cell_level", generate_sample_level=True, cell_type_identifier="cell_label",
+            ...     sample_identifier="batch", covariate_obs=["condition"])
+            >>> mdata = sccoda.prepare(mdata, formula="condition", reference_cell_type="Endocrine")
+            >>> sccoda.run_nuts(mdata, num_warmup=100, num_samples=1000, rng_key=42)
+            >>> pt.pl.coda.effects_barplot(mdata)
         """
         if args_barplot is None:
             args_barplot = {}
@@ -366,6 +386,15 @@ def boxplots(  # pragma: no cover
         Returns:
             Depending on `plot_facets`, returns a :class:`~matplotlib.axes.Axes` (`plot_facets = False`)
             or :class:`~sns.axisgrid.FacetGrid` (`plot_facets = True`) object
+
+        Examples:
+            Example with scCODA:
+            >>> import pertpy as pt
+            >>> haber_cells = pt.dt.haber_2017_regions()
+            >>> sccoda = pt.tl.Sccoda()
+            >>> mdata = sccoda.load(haber_cells, type="cell_level", generate_sample_level=True, cell_type_identifier="cell_label",
+            ...     sample_identifier="batch", covariate_obs=["condition"])
+            >>> pt.pl.coda.boxplots(mdata, feature_name="condition", add_dots=True)
         """
         if args_boxplot is None:
             args_boxplot = {}
@@ -570,6 +599,17 @@ def rel_abundance_dispersion_plot(  # pragma: no cover
 
         Returns:
             A :class:`~matplotlib.axes.Axes` object
+
+        Examples:
+            Example with scCODA:
+            >>> import pertpy as pt
+            >>> haber_cells = pt.dt.haber_2017_regions()
+            >>> sccoda = pt.tl.Sccoda()
+            >>> mdata = sccoda.load(haber_cells, type="cell_level", generate_sample_level=True, cell_type_identifier="cell_label",
+            ...     sample_identifier="batch", covariate_obs=["condition"])
+            >>> mdata = sccoda.prepare(mdata, formula="condition", reference_cell_type="Endocrine")
+            >>> sccoda.run_nuts(mdata, num_warmup=100, num_samples=1000, rng_key=42)
+            >>> pt.pl.coda.rel_abundance_dispersion_plot(mdata)
         """
         if isinstance(data, MuData):
             data = data[modality_key]
@@ -677,6 +717,22 @@ def draw_tree(  # pragma: no cover
 
         Returns:
             Depending on `show`, returns :class:`ete3.TreeNode` and :class:`ete3.TreeStyle` (`show = False`) or plots the tree inline (`show = True`)
+
+        Examples:
+            Example with tascCODA:
+            >>> import pertpy as pt
+            >>> adata = pt.dt.smillie()
+            >>> tasccoda = pt.tl.Tasccoda()
+            >>> mdata = tasccoda.load(
+            ...     adata, type="sample_level",
+            ...     levels_agg=["Major_l1", "Major_l2", "Major_l3", "Major_l4", "Cluster"],
+            ...     key_added="lineage", add_level_name=True
+            ... )
+            >>> mdata = tasccoda.prepare(
+            ...     mdata, formula="Health", reference_cell_type="automatic", tree_key="lineage", pen_args={"phi": 0}
+            ... )
+            >>> tasccoda.run_nuts(mdata, num_samples=1000, num_warmup=100, rng_key=42)
+            >>> pt.pl.coda.draw_tree(mdata, tree="lineage")
         """
         if isinstance(data, MuData):
             data = data[modality_key]
@@ -741,6 +797,22 @@ def draw_effects(  # pragma: no cover
         Returns:
             Depending on `show`, returns :class:`ete3.TreeNode` and :class:`ete3.TreeStyle` (`show = False`)
             or plots the tree inline (`show = True`)
+
+        Examples:
+            Example with tascCODA:
+            >>> import pertpy as pt
+            >>> adata = pt.dt.smillie()
+            >>> tasccoda = pt.tl.Tasccoda()
+            >>> mdata = tasccoda.load(
+            ...     adata, type="sample_level",
+            ...     levels_agg=["Major_l1", "Major_l2", "Major_l3", "Major_l4", "Cluster"],
+            ...     key_added="lineage", add_level_name=True
+            ... )
+            >>> mdata = tasccoda.prepare(
+            ...     mdata, formula="Health", reference_cell_type="automatic", tree_key="lineage", pen_args={"phi": 0}
+            ... )
+            >>> tasccoda.run_nuts(mdata, num_samples=1000, num_warmup=100, rng_key=42)
+            >>> pt.pl.coda.draw_effects(mdata, covariate="Health[T.Inflamed]", tree="lineage")
         """
         if isinstance(data, MuData):
             data = data[modality_key]
@@ -895,6 +967,19 @@ def effects_umap(  # pragma: no cover
 
         Returns:
             If `show==False`, a :class:`~matplotlib.axes.Axes` or a list of them.
+
+        Examples:
+            Example with scCODA:
+            >>> import pertpy as pt
+            >>> haber_cells = pt.dt.haber_2017_regions()
+            >>> sccoda = pt.tl.Sccoda()
+            >>> mdata = sccoda.load(haber_cells, type="cell_level", generate_sample_level=True, cell_type_identifier="cell_label",
+            ...     sample_identifier="batch", covariate_obs=["condition"])
+            >>> mdata = sccoda.prepare(mdata, formula="condition", reference_cell_type="Endocrine")
+            >>> sccoda.run_nuts(mdata, num_warmup=100, num_samples=1000, rng_key=42)
+
+            >>> pt.pl.coda.effects_umap(mdata, effect_name="", cluster_key="")
+            #TODO: Add effect_name parameter and cluster_key and test the example
         """
         data_rna = data[modality_key_1]
         data_coda = data[modality_key_2]

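
Multi-line calls in docstring examples are sensitive to the prompt used on continuation lines. The following is a minimal sketch using the standard library's `doctest.DocTestParser`; the two sample strings are hypothetical and not taken from pertpy. It shows that a statement continued with `...` is parsed as one example, while repeating `>>>` on each line splits it into several incomplete ones.

```python
import doctest

# Hypothetical docstring fragments, not taken from pertpy.
WITH_PS2 = """
>>> total = sum(
...     [1, 2, 3]
... )
>>> total
6
"""

WITH_PS1_ONLY = """
>>> total = sum(
>>>     [1, 2, 3]
>>> )
>>> total
6
"""

parser = doctest.DocTestParser()
good = parser.get_examples(WITH_PS2)
bad = parser.get_examples(WITH_PS1_ONLY)

# With "..." the whole call is a single example; with ">>>" every line
# becomes its own (syntactically incomplete) example.
print(len(good), [ex.source for ex in good])
print(len(bad), [ex.source for ex in bad])
```

The same applies to a trailing backslash followed by an unprefixed line: without `...`, the parser treats the second line as expected output rather than as source.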
diff --git a/pertpy/plot/_dialogue.py b/pertpy/plot/_dialogue.py
@@ -28,6 +28,16 @@ def split_violins(
 
         Returns:
             A :class:`~matplotlib.axes.Axes` object
+
+        Examples:
+            >>> import pertpy as pt
+            >>> import scanpy as sc
+            >>> adata = pt.dt.dialogue_example()
+            >>> sc.pp.pca(adata)
+            >>> dl = pt.tl.Dialogue(sample_id="clinical.status", celltype_key="cell.subtypes",
+            ...     n_counts_key="nCount_RNA", n_mpcs=3)
+            >>> adata, mcps, ws, ct_subs = dl.calculate_multifactor_PMD(adata, normalize=True)
+            >>> pt.pl.dl.split_violins(adata, split_key='gender', celltype_key='cell.subtypes')
         """
         df = sc.get.obs_df(adata, [celltype_key, mcp, split_key])
         if split_which is None:
@@ -56,6 +66,19 @@ def pairplot(self, adata: AnnData, celltype_key: str, color: str, sample_id: str
 
         Returns:
             Seaborn Pairgrid object.
+
+        Examples:
+            >>> import pertpy as pt
+            >>> import scanpy as sc
+            >>> adata = pt.dt.dialogue_example()
+            >>> sc.pp.pca(adata)
+            >>> dl = pt.tl.Dialogue(sample_id="clinical.status", celltype_key="cell.subtypes",
+            ...     n_counts_key="nCount_RNA", n_mpcs=3)
+            >>> adata, mcps, ws, ct_subs = dl.calculate_multifactor_PMD(adata, normalize=True)
+            #>>> dl_pl=pt.pl.dl()
+            #>>> dl_pl.pairplot(adata=adata, celltype_key="cell.subtypes", color="gender", sample_id="clinical.status")
+            >>> pt.pl.dl.pairplot(adata, celltype_key="cell.subtypes", color="gender", sample_id="clinical.status")
+            #TODO: Is self parameter there on purpose -> create DialoguePlot object first?
         """
         mean_mcps = adata.obs.groupby([sample_id, celltype_key])[mcp].mean()
         mean_mcps = mean_mcps.reset_index()

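
The TODO in the `pairplot` docstring asks whether the `self` parameter is intentional. Whether `pt.pl.dl` refers to a class or an instance is not visible in this diff, so the following is only a sketch of the general Python behaviour with a hypothetical `DemoPlot` class (not pertpy's actual plotting class). It shows why a method that declares `self` behaves differently when called on the class than on an instance.

```python
class DemoPlot:
    """Hypothetical stand-in for a plotting class; not pertpy's implementation."""

    def pairplot(self, adata, celltype_key):
        # Regular method: the first positional argument is bound to `self`.
        return adata, celltype_key

    @staticmethod
    def split_violins(adata, celltype_key):
        # Static method: no implicit first argument.
        return adata, celltype_key


# A static method can be called directly on the class.
static_result = DemoPlot.split_violins("adata", celltype_key="cell.subtypes")

# A regular method works as expected on an instance.
instance_result = DemoPlot().pairplot("adata", celltype_key="cell.subtypes")

# Called on the class, "adata" is consumed as `self`, so `adata` is missing.
try:
    DemoPlot.pairplot("adata", celltype_key="cell.subtypes")
    class_call_error = None
except TypeError as err:
    class_call_error = str(err)

print(static_result, instance_result, class_call_error)
```

Under these assumptions, either dropping `self` (and marking the method static) or instantiating the plotting class first would make the documented call pattern consistent.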
diff --git a/pertpy/plot/_guide_rna.py b/pertpy/plot/_guide_rna.py
@@ -41,6 +41,17 @@ def heatmap(
          Returns:
              List of Axes. Alternatively you can pass save or show parameters as they will be passed to sc.pl.heatmap.
              Order of cells in the y axis will be saved on adata.obs[key_to_save_order] if provided.
+
+        Examples:
+            Each cell is assigned to the gRNAs that occur at least 5 times in that cell; the assignment is then
+            visualized as a heatmap.
+
+            >>> import pertpy as pt
+            >>> mdata = pt.dt.papalexi_2021()
+            >>> gdo = mdata.mod['gdo']
+            >>> ga = pt.pp.GuideAssignment()
+            >>> ga.assign_by_threshold(gdo, assignment_threshold=5)
+            >>> pt.pl.guide.heatmap(gdo)
         """
         data = adata.X if layer is None else adata.layers[layer]
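
The threshold-based assignment described in the `heatmap` docstring can be pictured as a simple element-wise comparison. The following is a minimal pure-Python sketch with made-up counts; `assign_by_threshold_sketch` is a hypothetical helper, and pertpy's `GuideAssignment.assign_by_threshold` may store or compute the result differently.

```python
def assign_by_threshold_sketch(counts, threshold):
    """Return a 0/1 matrix marking guides whose count is >= threshold.

    `counts` is a list of rows (one per cell) of per-guide counts.
    Hypothetical helper for illustration only.
    """
    return [[1 if c >= threshold else 0 for c in row] for row in counts]


# Three cells, three guides (made-up counts).
counts = [
    [7, 0, 1],
    [2, 5, 9],
    [0, 4, 0],
]
assigned = assign_by_threshold_sketch(counts, threshold=5)
print(assigned)
```

With a threshold of 5, a cell can end up with one guide, several guides, or none, which is the pattern the heatmap is meant to reveal.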