Forecasting of infection prevalence of Helicobacter pylori (H. pylori) using regression analysis

Authors

K. Usarov, A. Ahmedov, M. F. Abasiyanik, K. M. N. Ku Khalif

DOI:

https://doi.org/10.31436/iiumej.v23i2.2164

Keywords:

H. pylori, infectious disease prediction, multivariate linear regression

Abstract

Global warming may have a significant impact on human health by promoting the growth of populations of harmful bacteria such as Helicobacter pylori (H. pylori, Hp). Predicting the prevalence of a pathogen in a community quickly and cost-effectively is crucial for managing the diseases it causes. In this research, we carried out a predictive analysis of the spread of H. pylori infection with respect to weather parameters in Istanbul, using a database from Istanbul Samatya Hospital. The goal is to develop a machine learning model that predicts Hp-related diseases (e.g., gastric ulcer disease, gastritis) from climate variables. The dataset covers the years 1999 to 2003 and contains a total of 7014 rows from Samatya Hospital. Weather information for the same years and location, namely humidity (H), dew point (D), temperature (T), pressure (P), and wind speed (W), was collected from https://www.wunderground.com. The forecasting model analyzed in this paper is a non-linear multivariate linear regression model (MLRM). We applied the non-linear least squares method, which minimizes the sum of squared residuals, to find the optimal parameters of the MLRM. The Multiple Regression Method was used to determine the correlation between a criterion variable and a combination of predictor variables. Humidity was found to have the strongest influence on Hp infection. Hp prevalence was modelled with the Multiple Regression Method equation, with the average H, D, T, P, and W as the most important parameters; the reported deviation was 17% for the testing dataset and 18% for the training dataset. The statistical model thus predicts Hp prevalence with about 83% accuracy on the testing dataset (11 months) and 87% accuracy on the training dataset (42 months). With the proposed model, monthly infection levels can be predicted early, so that medical services can take preventive measures and government can prepare against the bacteria. In addition, drug producers can adjust their production rates based on the forecasting results.
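The workflow described in the abstract (regress prevalence on H, D, T, P, and W, fit by least squares on 42 months, check on 11 held-out months) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the coefficients in `true_beta` are invented, the model is a plain linear one fitted by ordinary least squares rather than the non-linear fit reported in the paper, and `mean_abs_pct_deviation` is a hypothetical helper, not necessarily the deviation measure the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Split sizes quoted in the abstract: 42 training months, 11 testing months.
n_train, n_test = 42, 11
n = n_train + n_test

# Synthetic monthly averages standing in for the real weather records.
X = np.column_stack([
    rng.uniform(55, 85, n),      # H: humidity, %
    rng.uniform(0, 20, n),       # D: dew point, deg C
    rng.uniform(3, 28, n),       # T: temperature, deg C
    rng.uniform(1005, 1025, n),  # P: pressure, hPa
    rng.uniform(5, 25, n),       # W: wind speed, km/h
])

# Design matrix with an intercept column.
A = np.column_stack([np.ones(n), X])

# Invented coefficients, used only to generate toy prevalence values (%).
true_beta = np.array([20.0, 0.5, 0.1, -0.2, 0.01, 0.05])
y = A @ true_beta + rng.normal(0.0, 2.0, n)

A_train, y_train = A[:n_train], y[:n_train]
A_test, y_test = A[n_train:], y[n_train:]

# Ordinary least squares: minimise the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)


def mean_abs_pct_deviation(y_true, y_pred):
    """Mean absolute percentage deviation between observed and predicted values."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


train_dev = mean_abs_pct_deviation(y_train, A_train @ beta_hat)
test_dev = mean_abs_pct_deviation(y_test, A_test @ beta_hat)

print("coefficients (intercept, H, D, T, P, W):", np.round(beta_hat, 3))
print(f"training deviation: {train_dev:.1f}%")
print(f"testing deviation:  {test_dev:.1f}%")
```

If "accuracy" is read as 100% minus such a deviation, a 17% testing deviation would correspond to the roughly 83% accuracy quoted above; that reading is an interpretation, not something the abstract states explicitly.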

Published

2022-07-04

How to Cite

Usarov, K., Ahmedov, A., Abasiyanik, M. F., & Ku Khalif, K. M. N. (2022). Forecasting of infection prevalence of Helicobacter pylori (H. pylori) using regression analysis. IIUM Engineering Journal, 23(2), 183–192. https://doi.org/10.31436/iiumej.v23i2.2164

Issue

Vol. 23 No. 2 (2022)

Section

Engineering Mathematics and Applied Science