We don't know any metadata of large messages before fully downloading #5888
Comments
A blurhash is probably enough and is even smaller than a low-res image. The metadata should also contain other information such as the message type, the filename, and so on.
Messages are only split when they are large. If your limit is higher (which is likely the case), those messages will be downloaded automatically anyway, so I don't think this is a rare edge case.
I think in the beginning this should be the full message as the metadata message will be hidden on old clients, as far as I understood the idea.
Could also be both, then the double email would only appear in the case that user uses both DC and an encrypted MUA, in which case they would likely already have a filter rule to filter dc messages into the dc folder.
I think yes. we say in the FAQ that read receipt doesn't mean the other party has read or understood it.
I would just say: add the new metadata message and do not modify the full message. DC then decides, based on the size IMAP reports, whether the full message should be downloaded.
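The size-based decision could look roughly like the sketch below. It assumes the size comes from the IMAP `RFC822.SIZE` fetch item and uses the 160 KiB minimum from the issue description as the default limit; the function names are hypothetical and not taken from the Delta Chat core.

```python
import re

# 160 KiB is the minimum "auto-download messages" value named in the issue.
AUTO_DOWNLOAD_LIMIT = 160 * 1024


def parse_rfc822_size(fetch_line: bytes) -> int:
    """Extract the RFC822.SIZE value from one FETCH response line."""
    match = re.search(rb"RFC822\.SIZE (\d+)", fetch_line)
    if match is None:
        raise ValueError("no RFC822.SIZE in FETCH response")
    return int(match.group(1))


def should_download_fully(reported_size: int, limit: int = AUTO_DOWNLOAD_LIMIT) -> bool:
    """Hypothetical helper: fetch the whole message only if it is within the limit."""
    return reported_size <= limit
```

A message whose reported size exceeds the limit would then only have its metadata fetched until the user asks for the full download.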
Yes, but in the vast majority of cases, the metadata email will be received first because it's sent first. I edited my original post because it was formulated confusingly. Still, probably we should handle even edge cases like message reordering on the server.
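One way to tolerate reordering is to key both emails by the Message-ID of the attachment email and link them whenever the second of the two shows up. The following is only a sketch of that bookkeeping; the class and its return values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class PendingStore:
    """Hypothetical bookkeeping for split messages that may arrive in any order."""

    # Metadata keyed by the Message-ID of the attachment email it references.
    metadata_by_ref: Dict[str, dict] = field(default_factory=dict)
    # Attachment emails that arrived before their metadata email.
    orphans: Set[str] = field(default_factory=set)

    def on_metadata(self, referenced_id: str, metadata: dict) -> str:
        self.metadata_by_ref[referenced_id] = metadata
        if referenced_id in self.orphans:
            self.orphans.discard(referenced_id)
            return "linked"
        return "waiting-for-attachment"

    def on_attachment(self, message_id: str) -> str:
        if message_id in self.metadata_by_ref:
            return "linked"
        self.orphans.add(message_id)
        return "waiting-for-metadata"
```

With this shape, the common case (metadata first) and the reordered case end in the same "linked" state.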
I don't think it's possible to hide messages on old clients, though I didn't check. The idea is that old clients simply show it as two messages, i.e. one message with the attachment and one with the text. I know some people who always send attachments this way on WhatsApp (i.e. first send the image and then send another message with the text), so I don't think this will be confusing to users. Again, I updated the original post accordingly.
I added this as "Solution 3"
I also like the simplicity, but again, it seems unlikely we can make old DC clients ignore messages. |
Seems I confused this with the internal message hidden parameter in core.
Then they will have double messages, metadata messages could have a different text in the text part, like the text and an additional info line that tells you to update DC, that info line gets shown by old clients and email, but ignored in new clients because there the headers are used (or json file, or however we encode the metadata):
Second part must contain the same text, because the first part may be dropped by spam filter.
I don't see any complexity that we can remove, it is always possible that first message does not arrive and it is always possible to receive large message from non-Delta Chat. Is there anything specific that could be removed? |
IMAP can also download individual parts of a message, so it is better to send one multipart message instead of two messages. Then there is no need to handle the case where one part arrives and the other does not.
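For reference, fetching a single MIME part uses the section syntax of the IMAP FETCH command (RFC 3501), e.g. `BODY.PEEK[1]` for the first part. The helper below only builds that data item; its name is hypothetical, and the commented usage assumes Python's standard `imaplib` with an already authenticated connection.

```python
def part_fetch_item(part: str) -> str:
    """Build the FETCH data item for one MIME part, e.g. "1" or "2.1".

    PEEK is used so that fetching the part does not set the \\Seen flag.
    """
    numbers = part.split(".")
    if not all(n.isdigit() and int(n) > 0 for n in numbers):
        raise ValueError(f"invalid MIME part specifier: {part!r}")
    return f"(BODY.PEEK[{part}])"


# Usage with imaplib (not executed here, needs a real server):
#   conn = imaplib.IMAP4_SSL("imap.example.org")
#   conn.login(user, password)
#   conn.select("INBOX", readonly=True)
#   typ, data = conn.uid("FETCH", "1234", part_fetch_item("1"))
```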
I just talked to @link2xt and @adbenitez about this:
--BQCKQU395rfxVg9YO0H4HceR868ZwN
Content-Description: PGP/MIME version identification
Content-Type: application/pgp-encrypted
Version: 1
+[[[HERE]]]
--BQCKQU395rfxVg9YO0H4HceR868ZwN
Content-Description: OpenPGP encrypted message
Content-Disposition: inline; filename="encrypted.asc";
Content-Type: application/octet-stream; name="encrypted.asc"
-----BEGIN PGP MESSAGE-----
...
-----END PGP MESSAGE-----
+[[[OR HERE]]]
--BQCKQU395rfxVg9YO0H4HceR868ZwN--
... and then ask the IMAP server to give us only the first 100 KB (or so) of the email, extract the metadata part, and decrypt it.
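A rough sketch of the "fetch a prefix, extract the metadata part" step, using only Python's standard `email` package. The prefix itself could be requested with an IMAP partial fetch such as `BODY.PEEK[]<0.102400>`. The `Content-Description` value used to find the metadata part is a made-up marker, since the thread does not define one, and the sample message contains placeholders instead of real ciphertext. Decryption is out of scope here.

```python
from email import message_from_bytes
from email.errors import CloseBoundaryNotFoundDefect
from typing import Optional

# Hypothetical marker for the extra part; not defined anywhere in the thread.
METADATA_DESCRIPTION = "Encrypted metadata"
BOUNDARY = "BQCKQU395rfxVg9YO0H4HceR868ZwN"

# Sample message with the metadata part in the first [[[HERE]]] position.
SAMPLE = "\r\n".join([
    'Content-Type: multipart/encrypted; protocol="application/pgp-encrypted";',
    f' boundary="{BOUNDARY}"',
    "",
    f"--{BOUNDARY}",
    "Content-Description: PGP/MIME version identification",
    "Content-Type: application/pgp-encrypted",
    "",
    "Version: 1",
    f"--{BOUNDARY}",
    f"Content-Description: {METADATA_DESCRIPTION}",
    "Content-Type: application/octet-stream",
    "",
    "METADATA-CIPHERTEXT-PLACEHOLDER",
    f"--{BOUNDARY}",
    "Content-Description: OpenPGP encrypted message",
    'Content-Type: application/octet-stream; name="encrypted.asc"',
    "",
    "LARGE-CIPHERTEXT-PLACEHOLDER" * 100,
    f"--{BOUNDARY}--",
    "",
]).encode("ascii")


def extract_metadata_part(prefix: bytes) -> Optional[bytes]:
    """Return the metadata part's body if it is fully contained in the prefix."""
    msg = message_from_bytes(prefix)
    if not msg.is_multipart():
        return None  # the prefix ended before the first boundary
    parts = msg.get_payload()
    truncated = any(
        isinstance(defect, CloseBoundaryNotFoundDefect) for defect in msg.defects
    )
    if truncated:
        # The last parsed part may be cut off, so it is not trusted.
        parts = parts[:-1]
    for part in parts:
        if part.get("Content-Description") == METADATA_DESCRIPTION:
            return part.get_payload(decode=True)
    return None
```

Dropping the last parsed part of a truncated prefix is a deliberately conservative choice: a part only counts as complete once the boundary that follows it has been seen.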
One security concern is that a MITM can replay old metadata, swap metadata between messages, etc. The metadata should therefore reference some token from the full message (this could be the Message-ID from the protected headers). If they don't match after downloading the full message, the downloaded full part should be discarded and an error should be added to the message.
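The check described above could be as small as the following sketch. It assumes the metadata carries the Message-ID of the full message and that the same ID is read from the protected (encrypted and signed) headers after the full download. All names are hypothetical.

```python
from typing import Optional, Tuple


def normalize_message_id(value: str) -> str:
    """Strip whitespace and the surrounding angle brackets of a Message-ID."""
    return value.strip().strip("<>")


def accept_full_download(
    metadata_ref_id: str, protected_message_id: str, payload: bytes
) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (payload, None) on a match, or (None, error text) on a mismatch."""
    if normalize_message_id(metadata_ref_id) != normalize_message_id(protected_message_id):
        # Discard the downloaded part, as suggested above, and report an error.
        return None, "metadata does not match the downloaded message"
    return payload, None
```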
We can add "intermediate signatures" in the form of some header which signs all other protected headers and the text part, this is also a compatible change. |
@iequidoo Not sure if I understood your idea correctly; is this what you mean:
Or are "intermediate signatures" some standard thing, if so, could you share a link explaining them? |
Just PGP-encrypt the message as usual; this doesn't change. Otherwise yes, that's the idea. I'm not sure whether a standard exists for this, but it makes sense to look for implementations doing something similar.
Could we put the chat assignment and metadata headers first and cap the metadata header at 1 or 2 KB? The more complex solution would be to download the first chunks of the message until the whole header is received, i.e. download the first KB, then the second KB, and so on until the header is fully decrypted. I think that's too complex for the beginning. It may be something to revisit if we do chunk-wise downloading anyway to offer resumable downloads (then we could also keep the already downloaded header/metadata part and fetch only the remaining bytes of the message to save some traffic, though that may not be worth the complexity).
Excursion on resumable downloads: It would be interesting if we were able to do that without requesting chunks, e.g. by counting received bytes until the connection is lost. But that might be too complicated, and it has the disadvantage that it would block the IMAP connection until the file is fully downloaded. So chunked is possibly better, even though chunked downloads cause extra outbound data from the extra requests.
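The "download 1 KB at a time until the header is complete" idea can be sketched independently of IMAP and of the decryption step. Here `fetch_range(offset, count)` stands in for a partial fetch and `is_complete(buffer)` stands in for "the header could be fully decrypted"; both are placeholders supplied by the caller, not existing APIs.

```python
from typing import Callable, Tuple


def fetch_until(
    fetch_range: Callable[[int, int], bytes],
    is_complete: Callable[[bytes], bool],
    chunk_size: int = 1024,
    max_bytes: int = 16 * 1024,
) -> Tuple[bytes, bool]:
    """Fetch consecutive chunks until is_complete() holds or a limit is hit.

    Returns the bytes fetched so far and whether the buffer is complete.
    The buffer length is also the offset at which a later download of the
    remaining bytes could resume.
    """
    buf = bytearray()
    while len(buf) < max_bytes:
        chunk = fetch_range(len(buf), min(chunk_size, max_bytes - len(buf)))
        if not chunk:
            break  # end of message reached
        buf += chunk
        if is_complete(bytes(buf)):
            return bytes(buf), True
    return bytes(buf), False
```

Each loop iteration corresponds to one extra request, which is the outbound overhead mentioned above.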
Looks like rPGP can't do streaming decryption. To be fair I didn't try very long, but I can't come up with any promising next steps. So, unless someone figures out how to do streaming decryption, I'm going to try separately encrypting the metadata and putting them into the email body (#5888 (comment)). |
I am also in favor of placing metadata as a separate message into the first MIME part of |
SEIPDv2 packet already allows chunked decryption: https://www.rfc-editor.org/rfc/rfc9580.html#section-5.13.2 |
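To get a feeling for the numbers: in the linked section of RFC 9580, the chunk size is derived from a one-octet value `c` as `1 << (c + 6)`, and each encrypted chunk is followed by its authentication tag. The sketch below assumes a 16-octet tag and ignores packet headers and ASCII armor, so it only gives an upper bound on how many chunks a downloaded prefix could contain.

```python
TAG_LEN = 16  # assumption: 16-octet AEAD authentication tag


def chunk_size(c: int) -> int:
    """Chunk size in octets for chunk size octet c, per RFC 9580 section 5.13.2."""
    if not 0 <= c <= 16:
        raise ValueError("chunk size octet outside the range 0..16")
    return 1 << (c + 6)


def complete_chunks(ciphertext_bytes: int, c: int) -> int:
    """Upper bound on fully received chunks (chunk data plus tag) in a prefix."""
    return ciphertext_bytes // (chunk_size(c) + TAG_LEN)
```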
If this is the sole problem, then it needs to be weighed against the implementation/cost of doing everything in a single message which seems to need changes in rpgp, imap-crates/commands, and some careful cryptographic design of the "signature over the preview/first-part of a message" as far as i gather. |
The options are:
Only the third option requires chunked decryption and detached signature. Single MIME message with preview OpenPGP message hidden somewhere in the first part or in the headers does not need any rpgp changes. |
I'm regularly updating this description to reflect the current state of the discussion.
In DC, we have a setting "auto-download messages" with the minimum value "160KiB". Delta Chat then won't automatically download larger messages, in order not to waste mobile data. But until the message is completely downloaded, we know almost nothing about it, since all the headers and the body are encrypted together. Sometimes we can't even assign a message to the correct chat.
This issue is about fixing this by separately encrypting some metadata, which can then be fetched without fetching the big attachment. This probably won't require any UI changes.
Motivation
Detailed Solution
Original idea
When sending out a message that would be over 160KiB in total, split it up into two messages internally:
- First, send a small "metadata" email with all the metadata, and the message text if it fits. It has a special (encrypted) header referencing the Message-Id of the second message.
  - In the future, this could even contain a low-resolution preview of images.
- Second, send a big "attachment-only" email with just the attachment (and possibly with the text, if it didn't fit - not sure if doing this creates too much code complexity, though).
- The message should only be marked as OutDelivered after emails were sent out.

When receiving the "metadata" email:
After an intermediate period:
New, better ideas:
We don't know any metadata of large messages before fully downloading #5888 (comment)
Things to look out for:
Make sure to delete both emails when the message is deleted, whether by ephemeral messages, manual deletion, or delete-server-after config.
Open questions
Questions that are resolved by the new, better ideas
- [ ] We shouldn't split up messages into 2 emails when sending to classical email users
  - Solution 1: Only split up encrypted messages; assuming that most classical email users can't encrypt. Since the headers are not a problem anyway,
  - Solution 2: Remember which contacts are classical email users: https://github.com//issues/2970
- [ ] Should a read receipt be sent after the user saw a partially downloaded message?
- [ ] Should the "attachment-only" email include a cleartext header that marks it as attachment-only?
  - For now, I don't see a reason why we would need this.
  - We _could_ use this header to always completely ignore these emails until the user downloads them.
- [ ] How exactly will old clients view split messages?
  - I don't _think_ it's possible to hide emails on old clients, though I didn't check. The current idea is that old clients simply show it as two messages, i.e. one message with the attachment and one with the text. I know some people who always send attachments this way on WhatsApp (i.e. first send the image and then send another message with the text), so I don't think this will be confusing to users.
  - In case it's possible to hide messages on old clients, we could send one "full" and one metadata-only email. Both emails would contain the message's text, and if the text is too long it would be truncated in the metadata-only email.