A solution for the Microsoft Malware Classification Challenge (BIG 2015)
Classify malware into families based on file content and characteristics. This notebook achieves a log loss of 0.00670 on the private set and 0.00809 on the public set, which is near-perfect classification, using a mix of features from both binaries and source code.
Credits: This notebook uses both https://www.kaggle.com/datasets/muhammad4hmed/malwaremicrosoftbig and https://www.kaggle.com/datasets/songwonmin/malware-only-byte
Alejandro Mosquera (http://amsqr.com)
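
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of multiclass log loss, not the competition's official scoring code. The function name `multiclass_logloss`, the `eps` clipping value, and the row renormalisation are assumptions made for illustration; labels are assumed to be zero-based integer indices over the nine malware families.

```python
import numpy as np

def multiclass_logloss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: integer class indices, shape (n_samples,)
    probs:  predicted probabilities, shape (n_samples, n_classes)
    eps:    clipping bound that avoids log(0) (assumed value)
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)
    # Renormalise so every row sums to 1 after clipping.
    probs = probs / probs.sum(axis=1, keepdims=True)
    y_true = np.asarray(y_true)
    # Pick the probability predicted for each sample's true class.
    picked = probs[np.arange(len(y_true)), y_true]
    return float(-np.mean(np.log(picked)))

# Baseline: uniform guessing over 9 families scores ln(9).
uniform = np.full((3, 9), 1 / 9)
baseline = multiclass_logloss([0, 1, 2], uniform)
```

A uniform guess over nine classes scores ln(9), roughly 2.197, which gives a sense of scale for the scores reported in this notebook.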