Signal/Transit peptides in Reactome (12 confirmed cases) #304

Open
nataled opened this issue Dec 6, 2023 · 3 comments


@nataled
Collaborator

nataled commented Dec 6, 2023

The following is the list of signal or transit peptides given as EWASes in Reactome:

1-58 of O00116 is a transit peptide (Reactome:R-HSA-9033519)
1-26 of P09110 is a transit peptide (Reactome:R-HSA-9033534)
1-47 of P06576 is a transit peptide (Reactome:R-HSA-8986172)
1-52 of Q9H4I9 is a transit peptide (Reactome:R-HSA-8949642)
1-28 of O95169 is a transit peptide (Reactome:R-HSA-8986166)
1-34 of Q96H96 is a transit peptide (Reactome:R-HSA-8986157)
1-32 of P00480 is a transit peptide (Reactome:R-HSA-8986137)
1-30 of O14832 is a transit peptide (Reactome:R-HSA-9033518)
1-26 of P10809 is a transit peptide (Reactome:R-HSA-8986171)
1-41 of Q16595 is a transit peptide (Reactome:R-HSA-8986117)
1-61 of P05496 is a transit peptide (Reactome:R-HSA-8986143)
1-25 of P01160 is a signal peptide (Reactome:R-HSA-5578791)

Note that these were found by cross-checking the signal/transit peptide annotations in UniProtKB against the sequence ranges given in Reactome, so only exact matches are found. In other words, if the two differ (for example, if Reactome has an old transit peptide range of 1-30 but UniProtKB has since changed it to 1-31), I won't be able to find that case.
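The cross-check described above can be sketched roughly as follows. This is only an illustration, not the script actually used: the `Feature` and `Ewas` record types, the `exact_matches` helper, and the `X00001` / `R-HSA-0000000` entries are hypothetical, and the data is hard-coded rather than pulled from UniProtKB or Reactome.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A UniProtKB-style signal/transit peptide annotation (hypothetical record type)."""
    accession: str
    start: int
    end: int
    kind: str  # "signal peptide" or "transit peptide"


class Ewas(NamedTuple):
    """A Reactome-style EWAS with a sequence range (hypothetical record type)."""
    stable_id: str
    accession: str
    start: int
    end: int


def exact_matches(features, ewases):
    """Return EWASs whose (accession, start, end) exactly equals a peptide feature."""
    by_range = {(f.accession, f.start, f.end): f.kind for f in features}
    hits = []
    for e in ewases:
        kind = by_range.get((e.accession, e.start, e.end))
        if kind is not None:
            hits.append((e.stable_id, e.accession, e.start, e.end, kind))
    return hits


features = [
    Feature("P01160", 1, 25, "signal peptide"),
    Feature("P10809", 1, 26, "transit peptide"),
    Feature("X00001", 1, 31, "transit peptide"),  # hypothetical accession
]
ewases = [
    Ewas("R-HSA-5578791", "P01160", 1, 25),
    Ewas("R-HSA-8986171", "P10809", 1, 26),
    Ewas("R-HSA-0000000", "X00001", 1, 30),  # hypothetical EWAS with a stale range
]

hits = exact_matches(features, ewases)
for stable_id, accession, start, end, kind in hits:
    print(f"{start}-{end} of {accession} is a {kind} (Reactome:{stable_id})")
```

The hypothetical `X00001` entry is not reported, because its Reactome range (1-30) no longer equals the UniProtKB range (1-31). This is the limitation noted above.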

At least two of these (R-HSA-5578791 and R-HSA-8986143; the last two on the list) seem to be just outputs of proteolytic processing.

@deustp01
Collaborator

deustp01 commented Dec 6, 2023

Indeed, Reactome has not annotated any role for any of the amino-terminal peptide fragments on this sample list. For most of the ones derived from proteins that are synthesized in the cytosol and translocated to the mitochondrial matrix, Reactome has annotated the degradation of the amino-terminal peptide once it is inside the mitochondrion.

I'm passing the list on to Reactome curators to discuss how we want to annotate such targeting peptides (and thus whether they need separate proteoform IDs). Specifically,

The issue for discussion now is whether we actually gain any useful information by annotating these EWASs. Could we simply create unbalanced reactions with an input of the full-length protein and an output of the trimmed mature protein (or equivalently black box events, thereby evading the requirement for mass balance of inputs and outputs)?

My reaction is that while that seems OK for these leader / transit peptide cases, I don’t see a reliable way to distinguish them from cases in which both products of the cleavage reaction have functions (even if we haven’t fully annotated those functions yet). Without a good, general way to make that distinction, I don’t see how to write a curation guideline backed up by a QA script, or user documentation to explain our curation rationale. On the other hand, should we be creating EWASs that have no function?

A practical consideration is that the total number of useless EWASs involved here is small, especially as in most cases we start our annotations at the point in a protein’s life at which it is already mature: cleaved and correctly localized. As a result, whatever we decide will have a small effect on our content, and we should reach a conclusion that allows consistent annotation with the smallest possible amount of work spent on making new rules and cleaning up existing exceptions.
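To make the mass-balance point concrete, here is a minimal sketch of the kind of check a QA script might run on a cleavage reaction. It is an assumption-laden illustration, not Reactome's actual QA code: the function names, the representation of fragments as `(start, end)` residue ranges, and the 100-residue precursor are all hypothetical.

```python
def missing_residues(input_range, output_ranges):
    """Return positions of the input protein not covered by any output fragment.

    Ranges are inclusive (start, end) residue coordinates.
    """
    start, end = input_range
    covered = set()
    for s, e in output_ranges:
        covered.update(range(s, e + 1))
    return [pos for pos in range(start, end + 1) if pos not in covered]


def is_balanced(input_range, output_ranges):
    """A cleavage is balanced here if the outputs account for every input residue."""
    return not missing_residues(input_range, output_ranges)


# Hypothetical 100-residue precursor with a 1-25 targeting peptide.
precursor = (1, 100)

# Both products annotated: targeting peptide plus mature protein.
balanced_outputs = [(1, 25), (26, 100)]

# Only the trimmed mature protein annotated (the "unbalanced reaction" option).
unbalanced_outputs = [(26, 100)]

print(is_balanced(precursor, balanced_outputs))
print(is_balanced(precursor, unbalanced_outputs))
print(len(missing_residues(precursor, unbalanced_outputs)), "residues unaccounted for")
```

A check like this can report that residues are unaccounted for, but it cannot tell whether the missing fragment is a functionless targeting peptide or a product with a role of its own, which is the distinction at issue here.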

More thoughts?

@MarijaOM

MarijaOM commented Dec 8, 2023

In my opinion, having two subclasses of cleavage reactions, one where all cleavage products are shown (because all participate in downstream events) and another where only some are shown, would be confusing to curators and users and inconsistent from the standpoint of our data model. Also, the class of black box events would be overloaded by reactions whose products are clearly experimentally defined and simple to annotate, but which we choose not to annotate. In addition, we do provide coordinates and controlled vocabulary names for all cleavage fragments, so they can be clearly distinguished from other products of posttranslational processing that refer to the same UniProt entry.

@deustp01
Collaborator

deustp01 commented Dec 8, 2023

In my opinion, ...

This sounds right. We already work around this problem, as noted two steps above, by simply not annotating the cleavage step of protein maturation unless we have some specific reason to do it: the protein springs into existence already mature, correctly located, and ready to function. So perhaps the most useful question is whether we need some guidelines for when to make an exception and annotate the cleavage steps of maturation, accounting for all of the products, to keep our own annotation consistent and to explain it to users.
