Stability-Aware Evaluation of a CNN–LSTM–DQN Intrusion Detection System for Zero-Day and Drifted Network Traffic

Rushendra; Kalamullah Ramli; Prima Dewi Purnamasari

doi:10.31436/iiumej.v27i2.4240

Authors

Rushendra, University of Indonesia, https://orcid.org/0000-0002-5208-7916
Kalamullah Ramli, University of Indonesia, https://orcid.org/0000-0002-0374-4465
Prima Dewi Purnamasari, University of Indonesia

Keywords:

Intrusion Detection System, Zero-Day Attack, Advanced Persistent Threat, CNN-LSTM, Deep Q-Network, Reinforcement Learning, Concept Drift, Stability-Aware Evaluation, Imbalanced Classification, Network Security

Abstract

Intrusion Detection Systems (IDS) deployed in real-world environments must operate under severe class imbalance, evolving attack strategies, and non-stationary traffic distributions. Conventional supervised and deep learning–based IDS rely on fixed decision functions, which limits their adaptability to zero-day attacks and concept drift. This paper proposes a hybrid CNN–LSTM–DQN framework combined with a stability-aware evaluation methodology. The CNN–LSTM backbone extracts spatio-temporal representations, while a Deep Q-Network (DQN) learns adaptive detection policies using an ARMF-aware reward formulation. The framework is evaluated across eleven experimental stages (E1–E11), including supervised baselines, reinforcement learning optimization, zero-day generalization (LOAO), and drift scenarios. Experimental results show that supervised models achieve high recall (up to 98.39%) but generate too many alerts (ARMF up to 44,845). The reinforcement learning model with prioritized experience replay (PER; E7) achieves a more balanced performance, with a recall of 91.40% and an ARMF of 1,031. Compared with naive reinforcement learning (E5), which achieves a recall of 42.47%, the proposed PER-based approach significantly improves detection performance while maintaining low alert rates. Further evaluation under real drift conditions shows robust recall (89–92%) with a tolerable number of alerts (ARMF ≤ 1,189). These results indicate that adaptive policy learning enables a more effective trade-off between detection performance and operational cost, while ARMF-based evaluation provides a practical complement to accuracy metrics for real-world IDS deployment.

ABSTRAK: Sistem Pengesanan Pencerobohan (IDS) dalam dunia nyata perlu beroperasi di bawah ketidakseimbangan kelas yang teruk, strategi serangan berevolusi, dan taburan trafik tidak pegun. IDS berasaskan pembelajaran penyeliaan dan pembelajaran mendalam konvensional bergantung pada fungsi keputusan tetap, mengehadkan kebolehsuaian terhadap serangan sifar hari dan hanyutan konsep. Kajian ini mencadangkan gabungan rangka kerja hibrid CNN–LSTM–DQN dan metodologi penilaian yang mementingkan kestabilan. CNN–LSTM mengekstrak representasi ruang-masa, manakala Rangkaian Q Mendalam (DQN) mempelajari dasar pengesanan adaptif menggunakan formulasi ganjaran ARMF. Rangka kerja ini dinilai dengan sebelas peringkat eksperimen (E1–E11) termasuk garis dasar penyeliaan, pengoptimuman pembelajaran pengukuhan, generalisasi sifar hari (LOAO) dan senario hanyutan. Dapatan eksperimen menunjukkan bahawa model penyeliaan mempunyai kadar ingat semula (recall) yang tinggi (sehingga 98.39%) tetapi menjana terlalu banyak amaran (ARMF sehingga 44,845). Model pembelajaran pengukuhan dengan ulangan pengalaman berprioriti (E7) mencapai prestasi lebih seimbang dengan kadar ingat semula 91.40% dan ARMF sebanyak 1,031. Pendekatan berasaskan PER yang dicadangkan meningkatkan keupayaan pengesanan dengan ketara sambil mengekalkan kadar amaran yang rendah berbanding pembelajaran pengukuhan naif (E5) dengan kadar ingat semula 42.47%. Kajian selanjutnya dalam keadaan hanyutan sebenar menunjukkan kadar ingat semula yang teguh (89–92%) dengan bilangan amaran yang boleh diterima (ARMF ≤ 1,189). Dapatan ini menunjukkan bahawa pembelajaran dasar adaptif membolehkan imbangan yang lebih berkesan antara prestasi pengesanan dan kos operasi, manakala penilaian berasaskan ARMF menyediakan pelengkap praktikal pada metrik ketepatan bagi penempatan IDS dunia nyata.
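The abstract credits prioritized experience replay (PER) with most of the recall gain over naive reinforcement learning, but it does not say which PER variant is used. As a rough illustration only, the following is a minimal sketch of the widely used proportional scheme, in which a transition's sampling probability grows with its temporal-difference (TD) error. The class name, the hyperparameters `alpha`, `beta`, and `eps`, and the toy transitions are assumptions made for this example and are not taken from the paper.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative; not the paper's code)."""
    capacity: int
    alpha: float = 0.6   # assumed: how strongly priorities skew sampling
    beta: float = 0.4    # assumed: strength of importance-sampling correction
    eps: float = 1e-3    # keeps every priority strictly positive
    items: list = field(default_factory=list)
    priorities: list = field(default_factory=list)
    next_slot: int = 0

    def add(self, transition):
        # New transitions receive the current maximum priority so that
        # each one has a good chance of being replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(p)
        else:
            # Buffer is full: overwrite the oldest slot (ring buffer).
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = p
        self.next_slot = (self.next_slot + 1) % self.capacity

    def probabilities(self):
        # P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size, rng=random):
        probs = self.probabilities()
        n = len(self.items)
        idx = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias introduced by
        # non-uniform sampling; normalised by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-self.beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.items[i] for i in idx], weights

    def update(self, indices, td_errors):
        # After a learning step, priorities follow the new absolute TD errors.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps


# Toy usage: three benign transitions and one rare attack transition.
buf = PrioritizedReplayBuffer(capacity=4)
for t in ["benign_a", "benign_b", "benign_c", "attack_x"]:
    buf.add(t)
buf.update([3], [5.0])   # pretend the attack transition had a large TD error
probs = buf.probabilities()
```

In this toy setup the attack transition ends up with a higher sampling probability than any benign one, which is the general mechanism by which PER can help a DQN under severe class imbalance. Whether the paper uses this proportional form, a rank-based form, or different hyperparameters is not stated in the abstract.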

Author Biographies

Rushendra, University of Indonesia

Department of Electrical Engineering, Universitas Indonesia

Kalamullah Ramli, University of Indonesia

Department of Electrical Engineering, Universitas Indonesia

Prima Dewi Purnamasari, University of Indonesia

Department of Electrical Engineering, Universitas Indonesia
