Questions about searching the Pierce iRT peptides. #1673

vindr20 · 2024-07-15T03:29:37Z

- Upload your log file
(If a log file hasn't been generated, go to the 'Run' tab in FragPipe, click 'Export Log', zip the resulting "log_[date_time].txt" file to avoid truncation, then attach the zipped file by drag & drop here.)
log_2024-07-14_20-07-56.txt
log_2024-07-14_20-27-43.txt

- Describe the issue or question:
I'm having issues working with Pierce iRT standards in my samples. In general, FragPipe seems to have a lot of trouble identifying them (0 or 1 peptide IDs), even when I inject pure standards and add C-terminal heavy lysine/arginine as fixed modifications. I've tested with both DIA and DDA methods, and manual examination in Skyline shows quite convincing spectra acquired by both methods. My FASTA file is currently just the Pierce standards plus decoys and contaminants, but I have also experienced this issue with a full H. sapiens FASTA with the Pierce standards appended.

Could you please advise me as to what, if anything, I may be doing incorrectly?

fcyu · 2024-07-15T03:44:37Z

It seems that something is wrong with your LC-MS files or FASTA file. Some hints:
DDA

[progress: 262/262 (100%) - 2278 spectra/s] 0.1s | remapping alternative proteins and postprocessing 0.2 s

DIA

[progress: 3169/3169 (100%) - 9632 spectra/s] 0.3s

There are too few scans in both DDA and DIA.

If you like, could you upload your fasta files and raw files to https://www.dropbox.com/request/0OzwbMC4xGe8PQCUBqJB ? I will take a closer look.

Best,

Fengchao

vindr20 · 2024-07-16T02:08:51Z

I've uploaded my raw files and the pierce retention time standards fasta. In fragpipe, I add decoys and contaminants before searching.

The number of scans seems about right to me though; we have an older/slower instrument (QE+) and these are short runs, so DIA doesn't generate that many scans, and I don't think DDA would be expected to trigger many acquisitions when the sample is pure standards. Let me know if I misunderstood your point though.

In case it is useful: I also tried spiking in the standard peptides into a standard digest and analyzing over a longer gradient with DIA, with similar issues; I can also share those files if you'd like.

Thank you for your help! I really do appreciate it.

fcyu · 2024-07-16T02:48:18Z

Thanks for uploading your files. The * characters in your FASTA file broke the program:

>PIERCE_88320
SSAAPPPPPR*
GISNEGQNASIK*
HVLTSIGEK*
DIPVPKPK*
IGDYAGIK*
TASEFDSAIAQDK*
SAAGAFGPELSR*
ELGQSGVDTYLQTK*
GLILVGGYGTR*
GILFVGSGVSGGEEGAR*
SFANQPLEVVYSK*
LTILEELR*
NGFILDGFPR*
ELASGLSFPVGFK*
LSSEAPALFQFDLK

After removing the stars (asterisks), FragPipe detected all 15 iRT peptides:
log_2024-07-15_22-42-22.txt
peptide.zip
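
For readers who hit the same problem, the cleanup described above can be sketched in a few lines of Python. This is only an illustration, not part of FragPipe: the function name `clean_fasta` and the canonical-letter check are assumptions added here, and the example input is an excerpt of the FASTA shown above.

```python
# Minimal sketch: strip '*' from FASTA sequence lines and flag other odd letters.
# clean_fasta is a hypothetical helper, not a FragPipe function.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids

def clean_fasta(text):
    """Return (cleaned FASTA text, list of (line number, unexpected letters))."""
    cleaned, warnings = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith(">") or not line.strip():
            cleaned.append(line)  # header and blank lines pass through unchanged
            continue
        sequence = line.strip().replace("*", "")
        unexpected = sorted(set(sequence.upper()) - CANONICAL)
        if unexpected:
            warnings.append((number, "".join(unexpected)))
        cleaned.append(sequence)
    return "\n".join(cleaned) + "\n", warnings

example = ">PIERCE_88320\nSSAAPPPPPR*\nGISNEGQNASIK*\nLSSEAPALFQFDLK\n"
fixed, warnings = clean_fasta(example)
print(fixed, end="")
print("lines with unexpected letters:", warnings)
```

Running a check like this on a FASTA file borrowed from another pipeline, before loading it into FragPipe, would surface the asterisks up front.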

Best,

Fengchao

vindr20 · 2024-07-16T03:41:36Z

Thank you! I was using a fasta file from another software pipeline, and didn't look too closely at it to see that it was atypical.

Removing the asterisks enabled fragpipe to find these peptides in the DDA data, as expected.

If it's acceptable to ask a follow-up question: Is there a way to search a sample with these peptides spiked-in without enabling variable modifications for the c-terminal heavy label across the whole proteome? I notice that I get a few IDs for proteins with heavy isotopic labels, which is obviously incorrect, and the search generally finds fewer proteins/peptides. But if I don't specify the heavy label as a fixed or variable modification, I can't find the standard peptides at all.

It seems to me that it would be better to search a database that has only light peptides for the proteome but still contains the heavy peptide standards; however, I can't find an option for that.

fcyu · 2024-07-16T13:00:53Z

You can do that with a small trick:

Change the heavy K to B in your FASTA file.
Change the heavy R to J in your FASTA file.
Set the fixed modifications of B and J to the masses of heavy K and R, respectively.
In the digest rules, change the cleavage residues from KR to KRBJ. Alternatively, put the iRT peptides into separate protein entries.
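
The steps above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not FragPipe code: the helper names and the `iRT_heavy_` header prefix are hypothetical, the heavy labels are assumed to be 13C6 15N2 lysine (+8.014199 Da) and 13C6 15N4 arginine (+10.008269 Da), B and J are assumed to have zero mass in the search engine before the fixed modification is applied, and the toy digest ignores missed cleavages and proline rules.

```python
# Peptides copied from the FASTA earlier in this thread (first three, asterisks removed).
IRT_PEPTIDES = ["SSAAPPPPPR", "GISNEGQNASIK", "HVLTSIGEK"]

HEAVY_LETTER = {"K": "B", "R": "J"}  # substitution suggested above

# Monoisotopic residue mass plus the assumed label shift (Da).
HEAVY_RESIDUE_MASS = {
    "B": 128.094963 + 8.014199,   # heavy K, assuming 13C6 15N2
    "J": 156.101111 + 10.008269,  # heavy R, assuming 13C6 15N4
}

def relabel_c_terminus(peptide):
    """Replace a C-terminal K or R with B or J; internal residues are untouched."""
    last = peptide[-1]
    return peptide[:-1] + HEAVY_LETTER.get(last, last)

def irt_fasta(peptides, prefix="iRT_heavy_"):
    """Write each relabeled peptide as its own protein entry (hypothetical headers)."""
    entries = [
        f">{prefix}{index:02d}\n{relabel_c_terminus(peptide)}"
        for index, peptide in enumerate(peptides, start=1)
    ]
    return "\n".join(entries) + "\n"

def digest(sequence, cut_after="KRBJ"):
    """Toy digest: cleave after every residue in cut_after, no missed cleavages."""
    peptides, start = [], 0
    for position, residue in enumerate(sequence, start=1):
        if residue in cut_after:
            peptides.append(sequence[start:position])
            start = position
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

concatenated = "".join(relabel_c_terminus(p) for p in IRT_PEPTIDES)
print(digest(concatenated, cut_after="KR"))    # one uncleaved piece
print(digest(concatenated, cut_after="KRBJ"))  # the three relabeled peptides
print(irt_fasta(IRT_PEPTIDES), end="")
```

The two digest calls show why the rule change matters when the standards are concatenated into one entry: with only KR as cleavage residues, a sequence ending in B or J is never cut. Writing each peptide as its own entry sidesteps that.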

Best,

Fengchao

vindr20 · 2024-07-17T00:39:24Z

I have attempted this, but it seems that specifying custom amino acids breaks DIANN. Log file attached:
log_2024-07-16_17-31-20.txt

I attempted defining heavy lysine/arginine as modifications to B and J in the DIANN command line options, but it didn't seem to help.

fcyu · 2024-07-17T00:41:19Z

It is not DIA-NN; the error comes from MSBooster. @yangkl96

Best,

Fengchao

fcyu · 2024-07-24T14:49:48Z

@yangkl96 Any updates about this MSBooster error?

Thanks,

Fengchao

yangkl96 · 2024-07-24T14:56:56Z

Sorry I just saw this. MSBooster is not currently equipped to handle custom amino acids. I can implement this right now and get back to you ASAP

yangkl96 · 2024-07-24T17:30:24Z

Hi @vindr20 ,

Attached below is a new MSBooster version that should support B and J. Please let us know if this works for you

https://www.dropbox.com/scl/fi/9v0men3eae218icysokfd/MSBooster-1.2.39.jar?rlkey=axfwxfbkxpec0fjl51htunaql&dl=0

Best,
Kevin

vindr20 · 2024-07-24T17:47:04Z

Thank you for your help! I don't seem to have permissions/access to that dropbox link though. Could you adjust it so I can access the files?

yangkl96 · 2024-07-24T17:50:27Z

Yes you should have permissions now: https://www.dropbox.com/scl/fi/9v0men3eae218icysokfd/MSBooster-1.2.39.jar?rlkey=axfwxfbkxpec0fjl51htunaql&dl=0

vindr20 · 2024-07-24T18:51:02Z

Okay, I had a chance to try this. Unfortunately, the pipeline still breaks, albeit further down this time. Log file attached. If I had to guess from looking at it, easypqp doesn't know how to handle the new amino acids either.
log_2024-07-24_11-39-08.txt

I did check that disabling fixed modifications to B/J, and setting trypsin to only cleave at 'KR' allowed the pipeline to process as per usual.

fcyu · 2024-07-24T19:34:00Z

Thank you so much for the testing.

The error is because EasyPQP doesn't support the noncanonical amino acids. I have fixed it (grosenberger/easypqp@17d49cd) and released a new version. Could you upgrade EasyPQP in the FragPipe "config" tab and try again?

Thanks,

Fengchao

vindr20 · 2024-07-24T22:23:19Z

I updated easypqp to 0.1.48 and tried again, but it still failed. Log file attached.
log_2024-07-24_15-21-37.txt

Nesvilab/FragPipe#1673 (comment)

fcyu · 2024-07-25T02:05:44Z

I apologize for the oversight. I should have tested it before pushing the commits.

It is actually more complicated than I thought. I pushed a new commit, Nesvilab/easypqp@83247ba, trying to fix it, BUT OpenMS, which is a C++ library used by EasyPQP, threw another error

RuntimeError: the value 'B' was used but is not valid; Modification '': origin must be a letter from A to Y, excluding B and J.

Changing the C++ library is complicated because it requires coordinating with the whole OpenMS team. I have submitted a ticket, OpenMS/OpenMS#7554. Let's hope that they will implement this feature soon.

For now, you could use U and O for labeled K and R, respectively. Note that U has the non-zero mass 150.95363 and O has the non-zero mass 237.14773. You need to set the fixed modifications equal to the mass difference of labeled K/R and U/O.
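
As a worked example of that mass difference, here is a small Python sketch. The U and O masses are the ones quoted above; the K and R monoisotopic residue masses are standard values; the label shifts are an assumption (13C6 15N2 lysine and 13C6 15N4 arginine), so adjust them if your standards use a different label. The function name is hypothetical.

```python
# Monoisotopic residue masses (Da).
LYS_MASS = 128.094963
ARG_MASS = 156.101111

# Assumed heavy-label shifts (Da): 13C6 15N2 Lys and 13C6 15N4 Arg.
HEAVY_LYS_SHIFT = 8.014199
HEAVY_ARG_SHIFT = 10.008269

# Placeholder residue masses quoted in the comment above.
U_MASS = 150.95363
O_MASS = 237.14773

def fixed_mod_delta(residue_mass, label_shift, placeholder_mass):
    """Fixed modification that turns the placeholder residue into the labeled one."""
    return round(residue_mass + label_shift - placeholder_mass, 5)

delta_u = fixed_mod_delta(LYS_MASS, HEAVY_LYS_SHIFT, U_MASS)  # U stands in for heavy K
delta_o = fixed_mod_delta(ARG_MASS, HEAVY_ARG_SHIFT, O_MASS)  # O stands in for heavy R
print(f"fixed modification on U: {delta_u:+.5f} Da")
print(f"fixed modification on O: {delta_o:+.5f} Da")
```

Under these assumptions both values come out negative (about -14.844 Da for U and -71.038 Da for O), because U and O are heavier than the labeled residues they stand in for.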

Let me know if you have any questions or get any errors when running FragPipe.

Best,

Fengchao

vindr20 · 2024-07-25T03:22:33Z

I have attempted using O and U, and successfully identified several Pierce standards spiked into a sample, but to be honest, this doesn't seem like it is performing well compared to allowing heavy c-terminal residues as a variable modification. To elaborate:

- Using O/U to describe heavy lysine/arginine resulted in fewer identified standard peptides (6) than allowing variable heavy K/R C-termini globally (14 peptides identified). I'm not sure why this is, but it is very problematic.
- Using the Skyline export feature, Skyline does not import any of these standard peptides, despite all of them being easily found manually. Presumably this is because Skyline does not support O/U. This wasn't a huge problem because I knew what to look for, but this approach means that I lose the benefit of importing any predicted spectra for the standard peptides.
- Using the spike-in standards for retention time alignment fails when using O/U to encode the heavy lysine/arginine. Log file attached for this one.
log_2024-07-24_19-46-26.txt

I tend to think it would be more elegant if there were a way to specify protein-specific modifications. That way only the standards would be modified, and all software involved would agree that they were looking at heavy lysine/arginine. I know MaxQuant has that feature, but I suspect it's not trivial to implement.

In any case, thank you for your help! I hope this is an area that can see active development; if the software can take advantage of them, these spike-in standards have a lot of value for some of our clinical test R&D.

fcyu · 2024-07-25T03:37:25Z

Yes, I agree. Using noncanonical amino acids to replace the labeled ones is not ideal. We will discuss whether we can implement protein-specific modifications easily.

Best,

Fengchao

fcyu self-assigned this Jul 15, 2024

fcyu changed the title ~~Difficulty finding iRT peptides in pure iRT sample~~ * in the protein sequences broke the program Jul 16, 2024

fcyu assigned yangkl96 Jul 17, 2024

fcyu mentioned this issue Jul 22, 2024

Optimization of Fragpipe for Special Sample with Isotopologue Peptides #1686

Closed

fcyu referenced this issue in grosenberger/easypqp Jul 24, 2024

Add BJXZ amino acids to the amino acid mass list.

17d49cd

fcyu added a commit to Nesvilab/easypqp that referenced this issue Jul 25, 2024

Fix a bug left by 17d49cd

83247ba

Nesvilab/FragPipe#1673 (comment)

fcyu mentioned this issue Jul 25, 2024

Support "non-existing" amino acids BJXZ OpenMS/OpenMS#7554

Closed

fcyu changed the title ~~* in the protein sequences broke the program~~ Questions about searching the Pierce iRT peptides. Jul 25, 2024

fcyu mentioned this issue Jul 25, 2024

Fix or remove the warning from ProteinProphet: WARNING: Found the following zero-mass residues in protein entry *** Nesvilab/philosopher#501

Open
