Authors: Hemant Rathore (BITS Pilani), Adithya Samavedhi (BITS Pilani), Sanjay K. Sahay (BITS Pilani), and Mohit Sewak (Microsoft)



The last decade has witnessed exponential growth in smartphones and their users, which has drawn massive attention from malware designers. Current malware detection engines are unable to cope with the volume, velocity, and variety of incoming malware. Thus, the anti-malware community is investigating the use of machine learning and deep learning to develop malware detection models. However, research in other domains suggests that machine learning and deep learning models are vulnerable to adversarial attacks. Therefore, in this work, we propose a framework to construct malware detection models that are robust against adversarial attacks. We first construct twelve different malware detection models using a variety of classification algorithms. Then, acting as an adversary, we propose the Gradient-based Adversarial Attack Network to perform adversarial attacks on these detection models. The attack is designed to convert the maximum number of malware samples into adversarial samples with minimal modifications to each sample. The proposed attack achieves an average fooling rate of 98.68% against twelve permission-based malware detection models and 90.71% against twelve intent-based malware detection models. We also identify the list of vulnerable permissions and intents that an adversary can use to force misclassifications in detection models. We then propose three adversarial defense strategies to counter the attacks performed on the detection models. The proposed Hybrid Distillation based defense strategy improves the average accuracy by 54.21% for the twelve permission-based detection models and 59.14% for the intent-based detection models. We conclude that an adversarial study of this kind improves the performance and robustness of malware detection models and is essential before any real-world deployment.
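
To make the notions of "minimal modifications" and "fooling rate" concrete, the following is a minimal sketch of a generic gradient-guided attack on binary permission vectors. It is not the Gradient-based Adversarial Attack Network itself, whose details are not given here. The toy logistic-regression detector, its weights, the sample data, and all function names are hypothetical. The restriction to adding features only (0 to 1), so that the app's existing behaviour is left untouched, is likewise an assumption made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def malware_probability(x, w, b):
    """Toy logistic-regression detector (hypothetical): P(malware | x)."""
    return sigmoid(x @ w + b)

def gradient_guided_attack(x, w, b, max_changes=3):
    """Greedily add features (0 -> 1) that most reduce the malware score.

    Assumption: only additions are allowed. Returns the modified copy of x
    and the number of features that were added.
    """
    x_adv = x.copy()
    changes = 0
    while changes < max_changes and malware_probability(x_adv, w, b) >= 0.5:
        p = malware_probability(x_adv, w, b)
        grad = p * (1.0 - p) * w                    # dP(malware)/dx for logistic regression
        grad = np.where(x_adv == 0, grad, np.inf)   # features already set cannot be added
        j = int(np.argmin(grad))
        if grad[j] >= 0:                            # no remaining addition lowers the score
            break
        x_adv[j] = 1.0
        changes += 1
    return x_adv, changes

def fooling_rate(X_malware, w, b, max_changes=3):
    """Fraction of initially detected malware that is classified benign after the attack."""
    detected = [x for x in X_malware if malware_probability(x, w, b) >= 0.5]
    if not detected:
        return 0.0
    fooled = 0
    for x in detected:
        x_adv, _ = gradient_guided_attack(x, w, b, max_changes)
        if malware_probability(x_adv, w, b) < 0.5:
            fooled += 1
    return fooled / len(detected)

# Hypothetical detector weights over five permissions and four malware samples.
w = np.array([2.0, 1.5, -1.0, -2.5, 0.5])
b = -0.75
X_malware = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],   # already misclassified as benign, so excluded from the rate
    [1, 1, 0, 1, 1],
], dtype=float)

if __name__ == "__main__":
    for budget in (1, 2):
        print(budget, fooling_rate(X_malware, w, b, max_changes=budget))
```

In this sketch the modification budget `max_changes` plays the role of the "minimal modifications" constraint: a larger budget raises the fooling rate at the cost of a more heavily altered sample. The features chosen most often by such an attack would correspond to the vulnerable permissions or intents mentioned above.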