Single cell label-free DIA quantification #426
RalfG started this conversation in Potential new module to discuss
Aim of the new module
This proposed module addresses the need to benchmark identification and quantification workflows for low-input and single-cell proteomics (SCP). It is currently not known whether benchmarking results from high-input samples transfer to low-input proteomics. Typical low-input and single-cell datasets differ from bulk proteomics datasets in several ways, for instance shorter gradients and a larger impact of contaminants.
Full description of the new module
Many acquisition and labeling strategies have been proposed and tested for SCP. Currently, the field seems to be converging on label-free data-independent acquisition (DIA) as the SCP method of choice. Compared to the various labeling strategies, label-free proteomics is highly accessible and simplifies sample preparation. Compared to data-dependent acquisition, DIA promises fewer missing values and more consistent quantification accuracy.
We therefore propose to use a label-free DIA dataset as a first SCP benchmarking module. Similarly to the existing bulk proteomics quantification benchmarking modules, multiple low-input multispecies spike-in datasets are available that approximate a ground truth for assessing quantification accuracy. Using a similar type of dataset allows us to reuse much of the codebase and the metrics of the existing quantification ProteoBench modules.
Recently, a two-proteome mix of HeLa and yeast acquired on a Thermo Orbitrap Astral was described in a preprint on bioRxiv (https://doi.org/10.1101/2024.02.01.578358). The dataset contains low-input samples of 250 pg in three different ratios of the two species. We propose to use the 240:10 and 200:50 ratios in the benchmark, as these are the closest to each other and therefore better reflect real-life differential protein abundances.
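To make the approximate ground truth concrete, the expected log2 fold changes per species follow directly from the two proposed mixes. The sketch below is illustrative only: it assumes the first number in each ratio is the HeLa amount and the second the yeast amount (in pg), and the condition labels "A" and "B" as well as the function name are hypothetical, not taken from the ProteoBench codebase.

```python
import math

# Assumption: amounts are in pg, written as HeLa:yeast (240:10 and 200:50).
# The labels "A" and "B" are hypothetical names for the two conditions.
CONDITIONS = {
    "A": {"HeLa": 240.0, "yeast": 10.0},
    "B": {"HeLa": 200.0, "yeast": 50.0},
}


def expected_log2_fc(conditions, numerator="A", denominator="B"):
    """Return the expected log2(numerator / denominator) for each species."""
    return {
        species: math.log2(
            conditions[numerator][species] / conditions[denominator][species]
        )
        for species in conditions[numerator]
    }


expected = expected_log2_fc(CONDITIONS)
for species, log2_fc in expected.items():
    print(f"{species}: {log2_fc:+.3f}")
# HeLa: +0.263
# yeast: -2.322
```

Under these assumptions, HeLa proteins are expected to change only slightly (1.2-fold) between the conditions, while yeast proteins are expected to change 5-fold in the opposite direction.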
Starting from the code base of the existing DIA quantification module, only limited changes will be required. For instance, only two species are present in this dataset, instead of three. Secondary metrics that are more specific to low-input proteomics could also be added, such as the missingness levels.
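As an illustration of what a missingness metric could look like, the sketch below computes the fraction of missing values per run and the fraction of precursors quantified in every run. The input layout (one list of intensities per run, aligned by precursor), the function names, the toy values, and the choice to treat `None` and NaN as missing are all assumptions made for this example; none of them is an existing ProteoBench definition.

```python
import math


def is_missing(value):
    """Assumption: a value is missing if it is None or NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missingness_per_run(intensities):
    """Fraction of missing values in each run.

    intensities maps a run name to a list of intensities, where position i
    refers to the same precursor in every run.
    """
    return {
        run: sum(is_missing(v) for v in values) / len(values)
        for run, values in intensities.items()
    }


def fraction_complete(intensities):
    """Fraction of precursors with a value in every run."""
    rows = list(zip(*intensities.values()))
    complete = sum(all(not is_missing(v) for v in row) for row in rows)
    return complete / len(rows)


# Toy data (made up for illustration): four precursors across three runs.
runs = {
    "run_1": [1200.0, None, 310.0, 95.0],
    "run_2": [1100.0, 40.0, float("nan"), 90.0],
    "run_3": [1300.0, None, 305.0, None],
}

per_run = missingness_per_run(runs)
complete = fraction_complete(runs)
print(per_run)   # {'run_1': 0.25, 'run_2': 0.25, 'run_3': 0.5}
print(complete)  # 0.25
```

Whether zero intensities should also count as missing depends on the quantification tool and would need to be decided per tool.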
Various datasets were considered for this module. For instance, a technical HeLa dataset spiked with E. coli proteins was recently acquired on a timsTOF Pro 2 instrument (PXD053462, https://doi.org/10.1038/s41467-024-52605-x). However, we opted not to include this dataset for benchmarking yet, as the newer timsTOF Ultra 2 and the timsTOF SCP are more commonly used for low-input proteomics. Interestingly, this dataset contains dedicated higher-input runs to be used as a matching enhancer (i.e., a run that can be used for matching-between-runs, optimizing the search engine’s internal spectral library for the experiment). As this method is steadily gaining popularity, we propose to add a specific module to benchmark such approaches in the near future.
SCP is a young and fast-moving field. We expect that additional modules for other labeling and acquisition setups could be added to ProteoBench in the near future. Notably, there is strong interest in novel DIA-compatible multiplexing strategies that increase sample throughput.
Unfortunately, low-input spike-in samples still do not accurately reflect real-world single cells, but they are currently the only way to obtain an approximation of ground truth for quantification. If alternative benchmarking methods based on actual single-cell data become available, a new module should be added with such a dataset.
Potential reviewers
Laurent Gatto (@lgatto)
Will you be able to work on the implementation (coding) yourself, with additional help from the ProteoBench maintainers?
Any other information
This proposal was conceived at the pre-conference hackathon session of the iSCMS 2024 conference at DTU, Denmark by the following participants: