In What Ways Can Artificial Intelligence Improve Malware Detection?
DOI: https://doi.org/10.61173/26b40m82
Keywords: malware detection, deep learning, feature selection, convolutional neural network
Abstract
Using a dataset of 58,942 malware and benign software samples (55.6% malicious), this study systematically evaluated four classifiers for malware detection: support vector machines (SVM), random forests (RF), convolutional neural networks (CNN), and recurrent neural networks (RNN). Data preprocessing (Z-score standardization and SMOTE oversampling) and feature selection (retaining the top 200 features ranked by a random forest) were carried out on the Weka platform, and model performance was compared using ten-fold cross-validation. CNN achieved the best overall performance, with an accuracy of 92% (F1 score of 0.92), and retained an accuracy of 80% when detecting unknown malware; RF and RNN ranked next, and SVM was relatively weak. Feature importance analysis identified API call frequency (importance 0.35) and byte entropy (importance 0.25) as the key discriminant features. These results confirm the advantages of deep learning models (DLM) in malware detection and provide a basis for feature selection when optimizing detection systems.
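
To make two of the quantities named in the abstract concrete, the following is a minimal Python sketch of a byte entropy feature and of Z-score standardization. It is not the authors' implementation: the study reports using the Weka platform, and the function names `byte_entropy` and `z_score` are hypothetical, chosen here for illustration. The sketch assumes byte entropy means the Shannon entropy of a file's byte-value distribution, which is a common definition but is not spelled out in the abstract.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte.

    Assumed definition (not stated in the abstract). The result lies in
    [0.0, 8.0]; packed or encrypted content tends toward the upper end.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # byte value -> number of occurrences
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def z_score(values: list[float]) -> list[float]:
    """Standardize one feature column to zero mean and unit variance.

    Uses the population standard deviation; a constant column is mapped
    to all zeros to avoid division by zero.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Hypothetical byte strings standing in for file contents.
    samples = [b"AAAAAAAA", b"hello world", bytes(range(256))]
    entropies = [byte_entropy(s) for s in samples]
    print(entropies)
    print(z_score(entropies))
```

In a pipeline like the one described, standardized columns of this kind would be computed for every feature before oversampling and feature ranking; those later steps are omitted here because the abstract does not give their parameters.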