NOTE

This is a fresh software. Initially, I recommend you cross-check the results with InfernoRDN (homepage). This one is a rewrite of the RRollup method found in InfernoRDN.

Known discrepancies from original:

Filtering of a dataset containing ~1200 proteins led to this version filtering out four proteins not filtered by InfernoRDN. Closer inspection showed that these didn't fulfill the criteria of 50% non-missing values in at least one peptide (10/21 present) but was anyway retained by InfernoRDN.

I wish:

Optionally write the scaled peptides, and the scaled peptides without outlier matrices
Optional profile plots of peptides and resulting protein

Install dependencies

install.packages(c("tidyverse", "outliers", "argparser", "matrixStats"))

Basic run examples from terminal

Can be executed from terminal using the protein_rollup.R script present in the RWorkflowModules Git repo (https://github.com/Jakob37/RWorkflowModules).

With design matrix

If you have multiple annotation columns in your data matrix, you need to use a design matrix to distinguish these

Rscript run_protein_rollup.R \
    --rdf_fp testdata/dummy_df.tsv \
    --ddf_fp ProteinRollup/testdata/dummy_df_design.tsv \
    --sample_col sample \
    --protein_col Protein \
    --out_fp test_out.tsv

With single-column annotation

If having an expression dataset with a single annotation column specifying the proteins, the following call can be executed.

Rscript run_protein_rollup.R \
    --rdf_fp testdata/dummy_df.tsv \
    --out_fp test_out.tsv \
    --one_column_mode TRUE

Basic run example from R

library(tidyverse)
source("ProteinRollup.R")

peptide_rdf <- read_tsv("testdata/dummy_df.tsv")
peptide_mat <- peptide_rdf[, -1] %>% as.matrix()
protein_ids <- peptide_rdf$Protein

protein_rollup(protein_ids, peptide_mat)

What is RRollup?

A reference peptide is selected based on (1) fewest missing values and (2) if several peptides ties - the one with highest median intensity
Ratios to reference are calculated for all. The ratio is simply subtracting the log-scale reference from each peptide
For each non-reference peptide - the median distance to the reference are calculated
- If lower than min_overlap, then set to zero in the median calculation (this is from Inferno, maybe to NA would make more sense? Now bias towards reference)
The other peptides are shifted to the reference median. Intrapeptide variances are retained.
Outlier removal is performed
- minPs specifies how many peptides needs to be present with value in a sample for the outlier calculation to be carried out (default 5)
- pvalue specify how certain the Grubb outlier test should be for removal
Peptides can optionally be centered (TODO: What consequence?)
A rollup score is calculated (meaning what? why is not outlier removed peptides used here?)
The final protein matrix is returned

Key parameters for rollup

min_overlap - How many values should be in common between reference and other peptides for the other peptides to be considered

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
R		R
inst/testdata		inst/testdata
man		man
tests		tests
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
NAMESPACE		NAMESPACE
README.md		README.md
run_protein_rollup.R		run_protein_rollup.R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NOTE

Install dependencies

Basic run examples from terminal

With design matrix

With single-column annotation

Basic run example from R

What is RRollup?

Key parameters for rollup

About

Releases

Packages

Contributors 2

Languages

License

ComputationalProteomics/ProteinRollup

Folders and files

Latest commit

History

Repository files navigation

NOTE

Install dependencies

Basic run examples from terminal

With design matrix

With single-column annotation

Basic run example from R

What is RRollup?

Key parameters for rollup

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages