Efficiency-Aware Multi-Class Spinal Disorder Classification Using CBAM-Enhanced Lightweight CNNs with Dual-Branch Fusion

Teddy Surya Gunawan; Nurul Jannah; Mira Kartiwi; Noreha Abdul Malik

doi:10.31436/iiumej.v27i2.4341

Authors

Teddy Surya Gunawan International Islamic University Malaysia https://orcid.org/0000-0003-3345-4669
Nurul Jannah International Islamic University Malaysia https://orcid.org/0009-0008-8183-2162
Mira Kartiwi International Islamic University Malaysia
Noreha Abdul Malik International Islamic University Malaysia https://orcid.org/0009-0003-4235-1271


Keywords:

Spinal Disorder Classification, Deep Learning, Lightweight CNN, Attention Mechanism (CBAM), Efficiency-Aware Evaluation

Abstract

Spinal X-rays are still often read through manual measurements, yet the patients who most need timely assessment cannot afford delay, inconsistency, or heavy computational pipelines. Motivated by this clinical tension, this study proposes an efficiency-aware deep learning framework for three-class spinal disorder classification that asks a practical question rarely centered in prior work: not only which model is most accurate, but which model is accurate enough, light enough, and fast enough to matter in real screening settings. Using a public dataset of 338 subjects, five lightweight backbones, CBAM-enhanced variants, and a dual-branch fusion model were evaluated through stratified 5-fold cross-validation under multiple balancing strategies, with performance measured by accuracy, precision, recall, F1-score, parameter count, FLOPs, model size, latency, and throughput. The results reveal an unexpected pattern: bigger models do not win. MobileNetV3Small delivers the strongest efficiency-performance balance, reaching an F1-score of 0.962 with only 1.0 million parameters, while the best overall result is achieved by the Fusion_MNv3_MNAS model under augmentation-only training, with an F1-score of 0.976. Ablation findings further show that attention and fusion are not universally beneficial, but become most effective when paired with sufficient data-driven regularization, and that fine-tuning about 30% of backbone parameters yields the most favorable adaptation. Taken together, these findings show that performance in spinal X-ray classification depends less on model size alone than on the fit between architecture and training strategy. The study therefore offers a concrete and clinically relevant message: lightweight, well-regularized models can match or surpass heavier alternatives while remaining more practical for scalable deployment.
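The evaluation protocol named in the abstract (stratified 5-fold cross-validation scored with F1) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the helper names `stratified_kfold` and `macro_f1`, the class labels, the class counts, the seed, and the choice of macro averaging are all assumptions made for illustration. Only the total of 338 subjects and the fold count come from the abstract.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs whose class mix mirrors the full set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        val = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, val


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical class counts: they sum to 338 but are NOT the dataset's real distribution.
labels = ["normal"] * 88 + ["scoliosis"] * 150 + ["spondylolisthesis"] * 100
splits = list(stratified_kfold(labels, k=5, seed=42))

for fold, (train_idx, val_idx) in enumerate(splits):
    print(f"fold {fold}: train={len(train_idx)} val={len(val_idx)}")
```

In practice each fold's model would be trained on `train_idx` and scored on `val_idx`, with the per-fold F1 values averaged. Any oversampling or augmentation should be applied to the training indices only, so that the validation fold stays untouched.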



Author Biography

Teddy Surya Gunawan, International Islamic University Malaysia

Professor Teddy Surya Gunawan received the B.Eng. degree (cum laude) in electrical engineering from the Institut Teknologi Bandung (ITB), Indonesia, in 1998, the M.Eng. degree from the School of Computer Engineering, Nanyang Technological University, Singapore, in 2001, and the Ph.D. degree from the School of Electrical Engineering and Telecommunications, The University of New South Wales, Australia, in 2007. His research interests include speech and audio processing, biomedical signal processing and instrumentation, image and video processing, and parallel computing. He received the Best Researcher Award from IIUM in 2018. At International Islamic University Malaysia, he has been a Professor with the Department of Electrical and Computer Engineering since 2019, served as Head of that department from 2015 to 2016, and served as Head of Programme Accreditation and Quality Assurance for the Faculty of Engineering from 2017 to 2018. He chaired the IEEE Instrumentation and Measurement Society–Malaysia Section in 2013, 2014, and 2020. He has been a Chartered Engineer (IET, U.K.) and Insinyur Profesional Madya (PII, Indonesia) since 2016, a registered ASEAN Engineer since 2018, and an ASEAN Chartered Professional Engineer since 2020.
