SARIMA-LSTM Combination For COVID-19 Case Modeling

Authors

DOI:

https://doi.org/10.31436/iiumej.v23i2.2134

Keywords:

SARIMA, LSTM, SARIMA-LSTM, COVID-19 patients

Abstract

Combining the SARIMA method with LSTM is a promising approach for time-series data. The combination is well motivated because the collected data are numerical and recorded over time. In addition, the proposed method can handle datasets with either linear or non-linear behaviour. According to several previous studies, the SARIMA method is well suited to modeling linear data, while the LSTM method excels at modeling non-linear data. Both methods have also been shown to achieve better accuracy than some other methods. This study combines the two in stages: first, the SARIMA method is fitted to the dataset to capture the linear component; then the residuals (the non-linear component) are analysed using the LSTM method. The accuracy of the combined method is then evaluated and compared with that of the SARIMA and LSTM methods applied separately. The dataset used for the trial is COVID-19 patient data from the United States. The results show that the combined SARIMA-LSTM method performs better than either SARIMA or LSTM alone, with an RMSE of 0.33905765 and an MAE of 0.29077017.

ABSTRAK: The combination of the SARIMA method with LSTM is interesting to study. This combination is convincing and important because the data collected are numerical and stored by time. In addition, the proposed method can accommodate datasets whether they are linear or non-linear. Based on several previous studies, the SARIMA method is advantageous for linear datasets, whereas the LSTM method is useful for non-linear datasets. Furthermore, both methods have been shown to have better accuracy than several other methods. This study attempts to combine the two in several stages. The first stage applies the SARIMA method to the dataset (linear data); the residual dataset (non-linear data) is then analysed using the LSTM method. The results of the combined method are checked for accuracy and compared with the SARIMA and LSTM methods applied separately. The dataset used is COVID-19 patient data from the United States. The results show that the combined SARIMA-LSTM method has better accuracy than SARIMA or LSTM applied separately, with an RMSE of 0.33905765 and an MAE of 0.29077017.
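The two-stage procedure summarised in the abstract (a linear fit, a second model on the residuals, then an RMSE/MAE check) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the data are a toy series rather than the COVID-19 dataset, a seasonal-naive rule stands in for SARIMA, and a constant mean-residual rule stands in for the LSTM. In practice each stand-in would be replaced by a fitted SARIMA model and a trained LSTM network.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive(series, period=7):
    """ASSUMPTION: stand-in for the SARIMA stage; predicts y[t] as y[t - period]."""
    return [series[t - period] for t in range(period, len(series))]


def mean_residual(resid):
    """ASSUMPTION: stand-in for the LSTM stage; predicts the mean residual everywhere."""
    m = sum(resid) / len(resid)
    return [m] * len(resid)


def hybrid_fit(series, linear_model, residual_model):
    """Fit the linear stage, model its residuals, and add the two fits together."""
    stage1 = linear_model(series)
    # The linear stage may drop the first few points, so align the targets to it.
    target = series[len(series) - len(stage1):]
    resid = [y - f for y, f in zip(target, stage1)]
    stage2 = residual_model(resid)
    combined = [f + r for f, r in zip(stage1, stage2)]
    return target, stage1, combined


# Toy series (not real case data): linear trend plus a weekly pattern.
series = [2 * t + (t % 7) for t in range(28)]

target, stage1, combined = hybrid_fit(series, seasonal_naive, mean_residual)
print("stage 1 only:", rmse(target, stage1), mae(target, stage1))
print("hybrid:      ", rmse(target, combined), mae(target, combined))
```

On this toy series the stage 1 stand-in misses the trend by a constant 14 at every step, and the residual stage removes that gap entirely. The scores are computed in-sample on constructed data, so they only illustrate how the additive combination works and say nothing about the accuracy reported in the study.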

Published

2022-07-04

How to Cite

Tahyudin, I., Wahyudi, R., & Nambo, H. (2022). SARIMA-LSTM Combination For COVID-19 Case Modeling. IIUM Engineering Journal, 23(2), 171–182. https://doi.org/10.31436/iiumej.v23i2.2134

Issue

Section

Engineering Mathematics and Applied Science