This repository contains the custom scripts and documentation for the analyses in our bioRxiv manuscript, which applies our TALON pipeline to long-read transcriptomes from PacBio and direct-RNA Oxford Nanopore sequencing.
You can find the preprint here.
To download the TALON program, please visit https://github.com/mortazavilab/TALON.
To download the TranscriptClean program, please visit https://github.com/mortazavilab/TranscriptClean.
To download the ENCODE DCC deployment of the TALON pipeline, please visit https://github.com/ENCODE-DCC/long-read-rna-pipeline.
plotting_scripts: Final versions of data visualization scripts used in the paper.
Figure_2: Describes exactly how the panels of Figure 2 in the paper were generated.
Figure_3: Describes exactly how the panels of Figure 3 in the paper were generated.
Figure_4: Describes exactly how the panels of Figure 4 in the paper were generated.
Figure_5: Describes exactly how the panels of Figure 5 in the paper were generated.
Supplement: Describes exactly how the figures in the supplement were generated.
refs: Contains instructions for downloading reference genomes, GENCODE annotations, and variant files. Also includes scripts we used to prepare input files for TranscriptClean and TALON initializations.
Illumina: Scripts for downloading short-read Illumina RNA-seq data from the ENCODE consortium and running kallisto on them to perform short-read quantification.
data_processing: Scripts used to run TALON and post-TALON utilities (i.e., filtering, generating GTFs, and generating abundance matrices), as well as various other analysis scripts.