Streptococcus Gallolyticus Infection and its Interrelation with Colorectal Cancer: Diagnostic Accuracy of Statistical and Machine Learning Models for Early Detection Algorithm

Authors

  • Edre Mohammad Aidid International Islamic University Malaysia
  • Hairul Aini Hamzah International Islamic University Malaysia
  • Mohd Shaiful Ehsan Shalihin International Islamic University Malaysia
  • Azmi Md Nor International Islamic University Malaysia
  • Che Muhammad Khairul Hisyam Ismail International Islamic University Malaysia

DOI:

https://doi.org/10.31436/imjm.v24i04.2895

Keywords:

Streptococcus gallolyticus, colorectal cancer, diagnostic accuracy, machine learning, Bayesian

Abstract

INTRODUCTION: Epidemiological studies have emphasized the role of Streptococcus gallolyticus subspecies gallolyticus (Sgg) infection in the development of colorectal cancer (CRC), yet this association remains underappreciated. While statistical and machine learning (ML) models can enhance CRC prediction, direct comparisons between them are rare. This study aims to assess the diagnostic accuracy of stool polymerase chain reaction (PCR) for Sgg and the immunochemical fecal occult blood test (iFOBT) for CRC detection, and to compare multivariable statistical and ML models in predicting CRC. MATERIALS AND METHODS: A hospital-based case-control study with a reversed flow design was conducted, involving 33 CRC cases and 80 controls. The analysis incorporated Asia Pacific Colorectal Screening (APCS) risk factors into three predictive models: logistic regression (LR), decision tree (DT), and ensemble Bayesian boosted decision tree (BDT). RESULTS: Combining the two tests achieved a net sensitivity of 54%, higher than either test alone (iFOBT = 12.1%; stool PCR = 48.5%). Among the models, the ensemble BDT approach demonstrated the highest classification accuracy for CRC (BDT = 78.1%; DT = 72.4%; LR = 69.9%). The DT model identified iFOBT as the sole predictor, while the BDT ensemble model ranked positive stool PCR for Sgg as the primary predictor, followed by normal-to-overweight body mass index (BMI) and age over 53 years. CONCLUSION: The ensemble ML model incorporating Sgg infection demonstrated superior predictive performance. Screening for Sgg in stool samples has potential as an early CRC detection strategy, particularly for individuals with a normal-to-overweight BMI and those older than 53 years.
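
The combined-test sensitivity reported in the Results can be sanity-checked against the textbook formula for parallel testing, in which a subject counts as positive if either test is positive. The sketch below is illustrative only: it assumes the two tests miss cases independently, which the abstract does not state, and the function name is hypothetical. The inputs are the individual sensitivities reported in the abstract.

```python
def net_sensitivity_parallel(sens_a: float, sens_b: float) -> float:
    """Net sensitivity when a case is flagged if EITHER test is positive.

    Assumption (not from the abstract): the two tests miss cases
    independently, so the chance that both miss is the product of
    their individual miss rates.
    """
    for s in (sens_a, sens_b):
        if not 0.0 <= s <= 1.0:
            raise ValueError("sensitivity must lie in [0, 1]")
    return 1.0 - (1.0 - sens_a) * (1.0 - sens_b)


# Individual sensitivities as reported in the abstract.
ifobt_sensitivity = 0.121      # iFOBT
stool_pcr_sensitivity = 0.485  # stool PCR for Sgg

combined = net_sensitivity_parallel(ifobt_sensitivity, stool_pcr_sensitivity)
print(f"Net sensitivity (parallel, independence assumed): {combined:.1%}")
```

Under the independence assumption this gives roughly 54.7%, which is in line with the reported net sensitivity of 54%. The study's own figure may have been computed directly from the observed case counts rather than from this formula.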

 

Published

01.10.2025

How to Cite

Mohammad Aidid, E., Hamzah, H. A., Shalihin, M. S. E., Md Nor, A., & Ismail, C. M. K. H. (2025). Streptococcus Gallolyticus Infection and its Interrelation with Colorectal Cancer: Diagnostic Accuracy of Statistical and Machine Learning Models for Early Detection Algorithm. IIUM Medical Journal Malaysia, 24(04). https://doi.org/10.31436/imjm.v24i04.2895