Welcome to the diffsplicing repository! Here you can find the code that we implemented for our paper "Analysis of differential splicing suggests different modes of short-term splicing regulation". In the paper, we study short-term changes in splicing during the signalling response within a cell line, using RNA-seq time series data on the estrogen receptor alpha (ERα) signalling response in the MCF7 breast cancer cell line. The data were introduced by Honkela, A. et al. and are accessible in the Gene Expression Omnibus (GEO) database under accession number GSE62789.
In brief, our method consists of the following steps:
- Alignment of the RNA-seq reads against the reference transcriptome
- Estimation of expression levels of the transcripts
- Mean and variance estimation in three settings:
  - Overall gene expression levels
  - Absolute transcript expression levels
  - Relative transcript expression levels
- GP modeling in all settings (two alternative GP models fitted to each gene / transcript: "time-dependent" and "time-independent")
- Ranking genes and transcripts by Bayes factors
To reproduce the results presented in the paper, follow the instructions below. If you would instead like to apply our method to your own data, we recommend our GPrank R package, which offers a better user experience. GPrank is available on both CRAN and GitHub.
The functions below are defined in run_functions.R, so load them first: source("run_functions.R")
Mean and variance estimation in three settings:
- Computing overall gene expression:
  computeGeneExpr()
- Computing scaling factors:
  computeScaleFac()
- Scaling the overall gene expression levels and the relative and absolute transcript expression levels:
  scaleExpr()
- Computing BitSeq means and variances:
  computeBSmeanVar()
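To make the three settings concrete, here is a minimal sketch of how overall gene expression and relative transcript expression relate to absolute transcript expression. It assumes that overall gene expression is the sum of the expression levels of a gene's transcripts, and that relative expression is each transcript's share of that sum. The objects tx_expr and tx_gene are hypothetical toy inputs and are not the data structures used by run_functions.R.

```r
# Hypothetical toy data: rows are transcripts, columns are time points.
tx_expr <- matrix(
  c(10, 20, 30,
    30, 20, 10,
     5,  5,  5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("tx1", "tx2", "tx3"), c("t0", "t1", "t2"))
)
# Hypothetical transcript-to-gene mapping, in the same order as the rows above.
tx_gene <- c("geneA", "geneA", "geneB")

# Overall gene expression (assumed definition): sum over each gene's transcripts.
gene_expr <- rowsum(tx_expr, group = tx_gene)

# Relative transcript expression: each transcript's share of its gene's total.
rel_expr <- tx_expr / gene_expr[tx_gene, , drop = FALSE]
```

With this definition, the relative levels of the transcripts of a gene sum to one at every time point, which is why they can be treated as compositional data in the transformation step.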
Feature transformation methods for relative transcript expression levels:
- Compute the means and variances after applying the isometric log-ratio transformation (ILRT), as well as its unlogged counterpart (IRT), to the relative expression levels:
  applyTrans()
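For readers unfamiliar with the isometric log-ratio transformation, the following is a minimal sketch of the standard ILR transform for a single composition. It is not the code behind applyTrans(). The orthonormal basis built from Helmert contrasts is an assumed choice, and the basis used in this repository may differ. The function name ilr and the vector rel are hypothetical.

```r
# Isometric log-ratio (ILR) transform of one composition
# (strictly positive parts, e.g. relative transcript levels of one gene).
ilr <- function(x) {
  stopifnot(length(x) >= 2, all(x > 0))
  D <- length(x)
  # Centred log-ratio: log parts minus their mean.
  clr <- log(x) - mean(log(x))
  # Orthonormal basis of the zero-sum subspace, built from Helmert
  # contrasts (an assumed choice of basis).
  H <- contr.helmert(D)
  V <- sweep(H, 2, sqrt(colSums(H^2)), "/")
  # Project the D-dimensional clr vector onto D - 1 coordinates.
  as.vector(crossprod(V, clr))
}

rel <- c(0.2, 0.3, 0.5)  # hypothetical relative levels of three transcripts
z <- ilr(rel)
```

The transform maps D relative levels that sum to one onto D - 1 unconstrained coordinates while preserving distances between compositions, which removes the sum-to-one constraint before modeling.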
Modeled variances in the L-shaped experimental design:
- Run the Metropolis-Hastings algorithm to estimate the hyper-parameters alpha and beta at the gene level and the absolute transcript level:
  runMH()
- Write the means and modeled variances to separate files:
  computeMeanModeledVar()
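As an illustration of the sampling step, here is a minimal random-walk Metropolis-Hastings sketch for two positive hyper-parameters. It is not the implementation of runMH(). The target density is a placeholder: the toy data obs_var are assumed to follow a Gamma(shape = alpha, rate = beta) distribution with a flat prior on the log scale, which may not match the variance model used in the paper. The names run_mh_sketch, log_post and obs_var are hypothetical.

```r
set.seed(1)
# Hypothetical data: observed variances, assumed Gamma(shape = alpha, rate = beta).
obs_var <- rgamma(200, shape = 2, rate = 4)

# Log-posterior of theta = (log alpha, log beta), assuming a flat prior on theta.
log_post <- function(theta) {
  sum(dgamma(obs_var, shape = exp(theta[1]), rate = exp(theta[2]), log = TRUE))
}

run_mh_sketch <- function(log_post, init, n_iter = 5000, step = 0.1) {
  chain <- matrix(NA_real_, nrow = n_iter, ncol = length(init))
  cur <- init
  cur_lp <- log_post(cur)
  accepted <- 0
  for (i in seq_len(n_iter)) {
    # Symmetric Gaussian random-walk proposal.
    prop <- cur + rnorm(length(cur), sd = step)
    prop_lp <- log_post(prop)
    # Accept with probability min(1, posterior ratio).
    if (log(runif(1)) < prop_lp - cur_lp) {
      cur <- prop
      cur_lp <- prop_lp
      accepted <- accepted + 1
    }
    chain[i, ] <- cur
  }
  list(chain = chain, accept_rate = accepted / n_iter)
}

fit <- run_mh_sketch(log_post, init = c(0, 0))
# Posterior means of alpha and beta, discarding the first half as burn-in.
post_mean <- colMeans(exp(fit$chain[-(1:2500), ]))
```

Sampling on the log scale keeps alpha and beta positive without rejecting proposals at the boundary. The step size and burn-in length here are arbitrary and would need tuning for real data.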
Ranking genes and transcripts by Bayes factors:
- GPs with BitSeq variances:
  computeBF_bitseqVar()
- GPs with modeled variances:
  computeBF_modeledVar()
- GPs without fixed variances (naive):
  computeBF_naive()
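The ranking idea can be sketched as follows: for each gene or transcript, compare the marginal likelihood of a "time-dependent" GP with that of a "time-independent" one, and sort by the log Bayes factor. This is only a sketch under strong assumptions and not the code behind the computeBF_* functions. The kernel hyper-parameters (sf2, ell, white) are fixed, hypothetical values rather than optimized ones, the squared-exponential kernel is an assumed choice, and the toy series in expr are simulated.

```r
# Log marginal likelihood of a zero-mean GP: log N(y | 0, K).
gp_log_ml <- function(y, K) {
  R <- chol(K)                            # K = t(R) %*% R, R upper triangular
  a <- backsolve(R, y, transpose = TRUE)  # solves t(R) %*% a = y
  -0.5 * sum(a^2) - sum(log(diag(R))) - 0.5 * length(y) * log(2 * pi)
}

# Log Bayes factor of a time-dependent model (squared-exponential kernel plus
# noise) against a time-independent model (noise only), with fixed
# hyper-parameters (an assumption; in practice they would be optimized).
log_bf <- function(y, t, noise_var, sf2 = 1, ell = 1, white = 0.1) {
  y <- y - mean(y)
  K_noise <- diag(noise_var + white, nrow = length(y))
  K_rbf <- sf2 * exp(-outer(t, t, "-")^2 / (2 * ell^2))
  gp_log_ml(y, K_rbf + K_noise) - gp_log_ml(y, K_noise)
}

set.seed(42)
t <- 1:10
# Hypothetical toy series: one with a temporal trend, one without.
expr <- list(
  trending = sin(t / 2) + rnorm(10, sd = 0.1),
  flat     = rnorm(10, sd = 0.1)
)
noise_var <- rep(0.01, 10)  # hypothetical fixed variances per time point

scores <- sapply(expr, log_bf, t = t, noise_var = noise_var)
ranking <- names(sort(scores, decreasing = TRUE))
```

The fixed per-time-point variances enter on the diagonal of both covariance matrices, so the Bayes factor reflects only the evidence for temporal structure beyond the noise.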
GPs with BitSeq variances for transformed expression levels:
- ILRT transformation:
  computeBF_ilrt()
- IRT transformation:
  computeBF_irt()