GitHub - alfredsimkin/copia_regulates_plasticity: scripts needed to reproduce the figures of the manuscript "The Retrotransposon Copia regulates structural synaptic plasticity at the Drosophila Neuromuscular Junction"

About:
This repository is for people who would like to reproduce the computational results of our manuscript "The Retrotransposon Copia regulates structural synaptic plasticity at the Drosophila Neuromuscular Junction".

This repository assumes users have sudo access to an Ubuntu Linux distribution (either physical or in a virtual machine) and some knowledge of Unix (e.g. editing your .bashrc file so that your $PATH variable points to binary files, and/or copying binary files into an existing $PATH folder).

This repository also makes the implicit assumption that users who want to know how any given result was gathered can follow the flow of a snakefile, and have some knowledge of Python scripting. Please write to me if you have questions!
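
For readers less familiar with $PATH, here is a minimal sketch of checking whether a folder is already on it. The helper name on_path and the folder $HOME/bin are assumptions for illustration and are not part of this repository; any folder you control would work the same way.

```shell
# Hypothetical helper: succeeds if directory $1 is already listed in $PATH.
on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# "$HOME/bin" is an assumed location for downloaded binaries; adjust as needed.
BIN_DIR="$HOME/bin"

if on_path "$BIN_DIR"; then
    echo "$BIN_DIR is already on PATH"
else
    echo "Add this line to ~/.bashrc, then open a new terminal:"
    echo "export PATH=\"$BIN_DIR:\$PATH\""
fi
```

The script only prints a suggestion and does not edit .bashrc itself, so it is safe to run repeatedly.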

Setup:

Download or clone this repository and unzip it on an Ubuntu machine, and cd to it.
Obtain snakemake:
https://snakemake.readthedocs.io/en/stable/getting_started/installation.html
I followed the instructions, and obtained mambaforge by downloading a shell script for my linux distro from here:
https://github.com/conda-forge/miniforge#mambaforge
Then I obtained snakemake with this command:
mamba create -c conda-forge -c bioconda -n snakemake snakemake
I tested this on snakemake version 7.19.1
Download long sequencing reads and Illumina reads, and set up renamed shortcuts to files:
cd to the downloaded_data folder
conda activate snakemake
snakemake -s download_data.smk --cores {your_desired_core_count}
Obtain FLAIR from here:
https://flair.readthedocs.io/en/latest/
I used FLAIR version 1.7.0 for testing and installed snakemake alongside FLAIR in a single conda environment with this command:
conda create -n flair -c conda-forge -c bioconda flair snakemake
Obtain blat from here:
http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/blat/
Copy the executable file to a folder on your $PATH
Obtain the dm6 genome from here (I use wget for this):
https://hgdownload.soe.ucsc.edu/goldenPath/dm6/bigZips/dm6.fa.gz
Unzip it with gunzip dm6.fa.gz
Obtain minimap2:
install curl (if needed) with (e.g.) sudo apt install curl
Obtain minimap2 binaries from here:
curl -L https://github.com/lh3/minimap2/releases/download/v2.17/minimap2-2.17_x64-linux.tar.bz2 | tar -jxvf -
cd to minimap2-2.17_x64-linux
Copy the minimap2 executable to a folder on your $PATH
Obtain salmon:
sudo apt install salmon
I used salmon 1.4.0
Obtain scipy and plotly:
conda activate snakemake
conda install scipy
conda install plotly
Obtain kaleido:
sudo apt install python-is-python3
sudo apt install pip
pip install -U kaleido
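
Once setup is finished, a quick pre-flight check can catch a tool that never made it onto $PATH. This is only a sketch: report_missing is a hypothetical helper, and the executable names below (in particular "flair") are assumptions based on the tool names in the steps above, so adjust them if your installation names them differently.

```shell
# Hypothetical helper: print each named command that cannot be found on $PATH.
# Returns 1 if at least one command is missing, 0 otherwise.
report_missing() {
    any_missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            any_missing=1
        fi
    done
    return "$any_missing"
}

# Run this inside the conda environment you plan to use, because snakemake
# (and FLAIR, in the flair environment) are only visible once it is activated.
if report_missing snakemake flair blat minimap2 salmon; then
    echo "all tools found"
fi
```

Run it once in the snakemake environment and once in the flair environment, since each environment exposes a different set of tools.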

Reproducing CNS vs. BWM isoform abundance results:

cd to the FLAIR_isoforms folder
conda activate flair
snakemake -s run_flair.smk --cores {your_desired_core_count}
examine FLAIR_isoforms/output_files/mapped_reads.counts.tsv
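
Before opening the counts table, a quick size check can confirm that the run produced something sensible. The sketch below assumes only that the file is tab-separated (as the .tsv extension suggests) and says nothing about what the columns hold; tsv_shape is a hypothetical helper name, and the path is meant to be used from the repository root.

```shell
# Hypothetical helper: print "<line count><TAB><field count of the first line>".
tsv_shape() {
    awk -F '\t' 'NR == 1 { cols = NF } END { printf "%d\t%d\n", NR, cols }' "$1"
}

# Path taken from the step above, relative to the repository root.
COUNTS="FLAIR_isoforms/output_files/mapped_reads.counts.tsv"

if [ -f "$COUNTS" ]; then
    tsv_shape "$COUNTS"
    # Show the first few lines for a visual check.
    head -n 5 "$COUNTS"
else
    echo "not found (run run_flair.smk first): $COUNTS"
fi
```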

Reconstructing TE seeds:

cd to the construct_TE_seeds folder
conda activate snakemake
snakemake -s construct_seeds.smk --cores {your_desired_core_count}
examine construct_TE_seeds/output_files/dm6_TE_seeds.fa
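
One simple way to examine the seeds file is to count its records and list the first few names. This sketch relies only on the general FASTA convention that each record starts with a ">" line; count_fasta_records is a hypothetical helper, and the path is meant to be used from the repository root.

```shell
# Hypothetical helper: count FASTA records, i.e. lines starting with ">".
count_fasta_records() {
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
    grep -c '^>' "$1" || true
}

# Path taken from the step above, relative to the repository root.
SEEDS="construct_TE_seeds/output_files/dm6_TE_seeds.fa"

if [ -f "$SEEDS" ]; then
    echo "records: $(count_fasta_records "$SEEDS")"
    # Show the first few record names.
    grep '^>' "$SEEDS" | head -n 5
else
    echo "not found (run construct_seeds.smk first): $SEEDS"
fi
```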

Measuring copia chimeric long reads (requires "reconstructing TE seeds" as a precursor):

cd to the chimeric_copia_reads folder
conda activate snakemake
snakemake -s chimeric_copia.smk --cores {your_desired_core_count}
examine these output files:
chimeric_copia_reads/output_files/CNS_final_summary.tsv - see especially column B of the 7 copia reads that map to refseq genes and RNAi regions, around row 1275
chimeric_copia_reads/output_files/seed_mapping/CNS_seed_results.tsv (there are 1225 reads that have a >100 nucleotide match to consensus copia, but only 1224 of them have a >100 nucleotide match to the genome)
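
Because this step depends on the TE seeds, a small guard can be run from the repository root before starting it. This is a sketch, not part of the repository: require_file is a hypothetical helper, and it assumes the seeds file named in the "Reconstructing TE seeds" section is the output this step needs.

```shell
# Hypothetical helper: succeed only if file $1 exists and is not empty.
require_file() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Assumed precursor output, relative to the repository root.
if require_file construct_TE_seeds/output_files/dm6_TE_seeds.fa; then
    echo "precursor present; chimeric_copia.smk can be run"
else
    echo "run construct_seeds.smk first"
fi
```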

Measuring skews in gene expression following RNAi:

cd to the RNAi_skewed_gene_expression folder
conda activate snakemake
snakemake -s modular_snakefile.smk --cores {your_desired_core_count}
examine RNAi_skewed_gene_expression/output_files/summarized_final_stats folder

Searching for RNAi regions in mature and immature Refseq sequences (requires "reconstructing TE seeds" and "measuring copia chimeric long reads" as precursors):

cd to the RNAi_refseq_genomic folder
conda activate snakemake
snakemake -s RNAi_refseq_genomic.smk --cores {your_desired_core_count}
examine these output files:
RNAi_refseq_genomic/output_files/summaries/dm6_genes_containment_stats.tsv
RNAi_refseq_genomic/output_files/summaries/dm6_genes_plus_introns_containment_stats.tsv
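
Since this section has two precursors, it can help to see the three dependent steps in the order they need to run. The sketch below only prints the commands rather than running them; print_plan is a hypothetical helper, the folder and snakefile names are copied from the sections above, and the core count of 4 is an arbitrary placeholder.

```shell
# Hypothetical helper: print the dependent steps in order, using $1 cores each.
print_plan() {
    cores="$1"
    while read -r folder snakefile; do
        echo "(cd $folder && snakemake -s $snakefile --cores $cores)"
    done <<'EOF'
construct_TE_seeds construct_seeds.smk
chimeric_copia_reads chimeric_copia.smk
RNAi_refseq_genomic RNAi_refseq_genomic.smk
EOF
}

# 4 is a placeholder; use your own desired core count.
print_plan 4
```

Each printed line can be pasted into a terminal at the repository root with the snakemake conda environment active.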

Searching for copia-mapping regions of the genome:

cd to the copia_consensus_genome_mappers folder
conda activate snakemake
snakemake -s copia_genomic.smk --cores {your_desired_core_count}
examine these output files:
copia_consensus_genome_mappers/output_files/region_stats.tsv
copia_consensus_genome_mappers/output_files/summarized_genomic_copia.tsv
