Development of a Model for Malware Detection and Classification at the Byte Level Based on Transformer

Rulof Baltwin Tallane; Akhmad Unggul Priantoro; Riza Muhida

doi:10.31436/iiumej.v27i1.4009

Authors

Rulof Baltwin Tallane Universitas Budi Luhur https://orcid.org/0009-0003-9934-366X
Akhmad Unggul Priantoro Universitas Budi Luhur https://orcid.org/0009-0006-1524-9142
Riza Muhida Universitas Bandar Lampung

Keywords:

Malware, Byte level, Transformer, Polymorphism, Machine Learning, Cybersecurity

Abstract

Malware remains a critical concern in cybersecurity because increasingly complex and constantly evolving variants are difficult to detect with conventional signature-based or static rule-based methods. This research developed a byte-level Transformer model to detect and classify malware effectively and adaptively, streamlining analysis by removing the need for specialized feature tokenization. The primary objective was to design and evaluate a Transformer model that captures universal and adaptive malware patterns directly from raw byte representations, enabling cross-platform applicability. A quantitative experimental approach was employed using three public datasets: Malware Detection PE-Based Analysis, MC-dataset-binary, and Malware.zip. The processing pipeline comprised byte embedding, dilated 1D convolution, multi-head self-attention, and attention pooling. The model was optimized using AdamW with a combined scheduler, Stochastic Weight Averaging (SWA), random byte masking augmentation, and Mixup Embedding. The byte-level Transformer model achieved classification accuracies of 99%, 92%, and 94% on the three datasets, respectively. These results demonstrate that a byte-level Transformer can effectively capture universal malware patterns in binary data, offering a flexible and highly accurate approach to developing resilient defenses against modern cyber threats.

ABSTRAK: Ancaman perisian perosak (malware) merupakan isu kritikal dalam keselamatan siber, terutama apabila disebabkan oleh peningkatan kerumitan dan variasi yang sentiasa berevolusi, sekaligus menyukarkan pengesanan menggunakan kaedah konvensional berasaskan tandatangan atau peraturan statik. Kajian ini memfokuskan kepada pembangunan model berasaskan Transformasi pada peringkat bait (byte-level) bagi mengesan dan mengklasifikasikan perisian perosak secara berkesan dan adaptif, tanpa memerlukan pengekstrakan atau penandaan ciri khusus. Objektif utama kajian adalah bagi mereka bentuk dan menilai keberkesanan model Transformasi yang mampu menangkap corak perisian perosak bersifat universal dan adaptif secara langsung daripada representasi bait mentah, seterusnya membolehkan kebolehgunaan merentas platform. Pendekatan eksperimen kuantitatif digunakan dengan melibatkan tiga set data awam, iaitu Analisis Berdasarkan Pengesanan Perisian Perosak PE, Set Data Binari MC, dan Malware.zip. Pemprosesan data merangkumi pembenaman bait, konvolusi 1D terlarut (dilated), perhatian kendiri berbilang kepala, serta pengumpulan berasaskan perhatian. Pengoptimuman model dilaksanakan menggunakan AdamW dengan penjadual gabungan, Purata Pemberat Stokastik (SWA), penambahan data melalui penyamaran bait rawak, dan Penggabungan Embedding. Dapatan kajian melalui eksperimen menunjukkan bahawa model Transformasi peringkat bait mencapai ketepatan pengelasan tertinggi, iaitu masing-masing 99%, 92%, dan 94% bagi ketiga-tiga set data. Dapatan ini membuktikan bahawa Transformasi peringkat bait berupaya menangkap corak perisian perosak universal daripada data binari, sekaligus menawarkan pendekatan fleksibel dan berketepatan tinggi bagi membangunkan pertahanan yang lebih berdaya tahan terhadap ancaman siber moden.
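The abstract names random byte masking, Mixup Embedding, and attention pooling without giving their details. The following is a minimal pure-Python sketch of how these three steps are commonly formulated; it is not the authors' implementation. The reserved mask token `MASK_ID`, the function names, the toy embedding table, and the dot-product scoring used for pooling are all assumptions made for illustration, and the dilated convolution and multi-head self-attention stages are omitted.

```python
import math
import random
from typing import List, Tuple

# Assumption: byte values occupy ids 0-255, and one extra id marks a masked byte.
MASK_ID = 256


def random_byte_mask(data: bytes, mask_prob: float, rng: random.Random) -> List[int]:
    """Replace each byte with MASK_ID independently with probability mask_prob."""
    return [MASK_ID if rng.random() < mask_prob else b for b in data]


def embed(ids: List[int], table: List[List[float]]) -> List[List[float]]:
    """Look up one embedding vector per token id."""
    return [table[i] for i in ids]


def mixup(emb_a: List[List[float]], emb_b: List[List[float]], lam: float) -> List[List[float]]:
    """Convex combination of two equal-shape embedding sequences."""
    return [
        [lam * x + (1.0 - lam) * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(emb_a, emb_b)
    ]


def attention_pool(seq: List[List[float]], query: List[float]) -> Tuple[List[float], List[float]]:
    """Softmax-weighted average of the rows of seq, scored by dot product with query."""
    scores = [sum(q * x for q, x in zip(query, row)) for row in seq]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(seq[0])
    pooled = [sum(w * row[d] for w, row in zip(weights, seq)) for d in range(dim)]
    return pooled, weights


# Toy usage: a 4-dimensional embedding table with random values (illustrative only).
rng = random.Random(0)
table = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(MASK_ID + 1)]
sample_a = embed(random_byte_mask(b"MZ\x90\x00", 0.25, rng), table)
sample_b = embed(list(b"\x7fELF"), table)
mixed = mixup(sample_a, sample_b, lam=0.7)
pooled, weights = attention_pool(mixed, query=[1.0, 0.0, 0.0, 0.0])
print(len(pooled), round(sum(weights), 6))
```

In a trained model the embedding table and the pooling query would be learned parameters, and the mixing coefficient is typically drawn at random for each pair of training samples. Fixed values are used here only to keep the sketch short.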
