PHRP : Osong Public Health and Research Perspectives

OPEN ACCESS. pISSN: 2210-9099. eISSN: 2233-6052
Original Article

Analysis of influenza-like illness trends in Saudi Arabia: a comparative study of statistical and deep learning techniques

Osong Public Health and Research Perspectives 2025;16(3):270-284.
Published online: June 12, 2025

Department of Mathematics, Faculty of Science, Al-Baha University, Al-Baha, Saudi Arabia

Corresponding author: Fathelrhman EL Guma Department of Mathematics, Faculty of Science, Al-Baha University, Al-Aqiq 65931, Saudi Arabia E-mail: fguma@bu.edu.sa
• Received: March 11, 2025   • Revised: April 21, 2025   • Accepted: April 25, 2025

© 2025 Korea Disease Control and Prevention Agency.

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

  • Objectives
    To develop and evaluate forecasting models using the Holt-Winters statistical approach and the long short-term memory (LSTM) deep learning method for weekly seasonal influenza-like illness (ILI) incidences in Saudi Arabia. The study compares model performance and assesses the predictive value added by incorporating region-specific exogenous variables within Middle Eastern epidemiological modeling.
  • Methods
    This study compared the performance of Holt-Winters and LSTM models in forecasting weekly ILI cases in Saudi Arabia, using data collected from 2017 to 2022. Time series analysis integrated exogenous variables including climatic conditions and population mobility trends. The Holt-Winters model employed both additive and multiplicative seasonal components. Model performance was evaluated using root mean squared error (RMSE), mean absolute percentage error, and R2.
  • Results
    The best-performing model, LSTM with exogenous variables, achieved an RMSE of 28.55, mean absolute error (MAE) of 0.14, R2 of 0.96, and percent bias (PBIAS) of +2.1%, indicating negligible systematic error. The LSTM model without exogenous variables demonstrated slightly lower accuracy (RMSE of 34.07, MAE of 0.18, R2 of 0.93, PBIAS of +5.8%), indicating strong predictive capability but less precision in determining peak ILI cases. The Holt-Winters model effectively captured seasonal and long-term trends, but showed a moderate performance with an RMSE of 82.57, MAE of 0.38, R2 of 0.58, and a high PBIAS of +14.2%, revealing significant unexplained variability during periods of high incidence fluctuation.
  • Conclusion
    This study highlights the respective strengths and limitations of statistical and machine learning approaches for ILI forecasting.
The rapid spread of influenza-like illnesses (ILIs) poses a significant public health challenge worldwide. According to the World Health Organization (WHO), approximately 1 billion cases of ILIs occur annually, including 3 to 5 million severe cases, resulting in an estimated 290,000 to 650,000 respiratory deaths globally [1,2]. Developing countries account for 99% of deaths among children under 5 years of age with influenza-related lower respiratory tract infections. These illnesses typically spread as seasonal epidemics during winter; however, in tropical regions, they circulate year-round. Common symptoms include cough, headache, severe malaise, and runny nose, and transmission occurs primarily through respiratory droplets [3]. Each year, over 2 million individuals gather in confined spaces for the Hajj pilgrimage, significantly increasing infection transmission during this period [4]. Seasonal variability and mass gatherings, especially during the Hajj season, complicate the monitoring and forecasting of ILIs in Saudi Arabia [5–8].
Traditional Forecasting
Forecasting is essential for combating the spread of ILIs as it facilitates effective resource allocation and the timely implementation of appropriate public health interventions. Accurate forecasting reduces uncertainty by predicting flu activity in advance, including when and where outbreaks are likely to occur. This predictive capability supports efficient resource allocation and improved planning before flu-related hospitalizations surge [9]. Previously, influenza surveillance data collected from clinics, diagnostic laboratories, general practitioners, and public health organizations were used to track ILIs and other respiratory infections. However, this approach often led to delays of 1 week or more in report issuance, with frequent retrospective modifications [10]. Additionally, unreported and unnoticed cases posed significant challenges in accurately predicting influenza incidence. The 2014 WHO Global Epidemiological Surveillance Standards for Influenza propose a “sentinel surveillance” approach, involving the regular collection of epidemiological and virological data from selected monitoring sites [11,12]. Although this approach is typically efficient in gathering high-quality data, it has several inherent limitations. First, it only captures influenza cases in individuals who seek medical attention, representing merely a fraction of the actual disease burden. Additionally, there is typically a delay between symptom onset and the decision to seek healthcare, compounded by a further delay of 1 to 2 weeks between data collection and its reporting. Moreover, the effectiveness of sentinel surveillance heavily depends on factors such as the availability of quality laboratory resources and skilled personnel, which may not always be accessible. Lastly, the selection of sentinel sites often prioritizes locations serving large, easily accessible populations, introducing potential sampling bias [13]. 
Mathematical models for predicting infectious disease outbreaks generally fall into 2 primary categories. The first category includes mechanistic models, which predict the future course of an ongoing epidemic. These models rely on understanding disease transmission mechanisms and their influencing factors, such as the depletion of susceptible populations or climate-related impacts on transmission rates. Mechanistic models may function at the population level (e.g., compartmental models) or the individual level (e.g., agent-based models) [14]. In the United States, Ginsberg et al. [15] were the first to use Google query data to predict ILI rates, creating an influenza trend model that monitored real-time data from millions of search engine queries. However, this model exhibited limitations common to many previous surveillance systems. Other research teams have subsequently utilized various online platforms, including Twitter, Weibo, Baidu, Yahoo, and Google, to develop models for improved accuracy in identifying influenza trends [16]. These limitations of traditional methods highlight the need for more advanced approaches, particularly in managing complex patterns and nonlinear dynamics.
Advancing Machine Learning Models
Recent advancements in machine learning have demonstrated exceptional performance in predicting influenza trends, particularly through models that analyze high-dimensional and nonlinear data. Notable examples include recurrent neural networks (RNNs), random forests, and support vector machines [17–19]. Prior research has confirmed the effectiveness of machine learning methods for predicting various epidemic diseases across different countries. Among these, deep learning techniques, especially long short-term memory (LSTM) networks, have emerged as powerful tools for forecasting complex time series data [20]. LSTM models excel at modeling long-term temporal dependencies and irregular patterns, significantly surpassing traditional forecasting methods like seasonal autoregressive integrated moving average (SARIMA) and Holt-Winters in accurately predicting ILI trends [21,22]. Bidirectional LSTM networks extend the capabilities of LSTM by capturing patterns in both forward and backward temporal directions, providing superior predictive accuracy for both short- and long-term trends, particularly when temporal correlations are complex [23]. However, achieving optimal performance with these advanced models requires data of sufficient quality and quantity, which can be a notable limitation.
Despite these technological advancements, Saudi Arabia and other Middle Eastern countries continue to underutilize deep learning models for ILI forecasting. Most research in the region still relies heavily on conventional statistical methods, which effectively identify seasonal trends but fall short of capturing intricate asymmetrical dynamics [24]. Moreover, earlier studies generally overlooked exogenous variables—such as vaccination rates, climatic conditions, and population mobility—which are crucial for enhancing predictive accuracy. One study in the Asia-Pacific region incorporated migration trends and demographic data to significantly improve predictive performance [25]. Additionally, other researchers have demonstrated the advantages of employing hybrid forecasting models, combining statistical and machine learning methods for predicting ILI trends across various regions [26,27]. Gomez-Cravioto et al. [28] and Budiharto [18] highlighted the efficiency of LSTM models in predicting nonlinear patterns associated with seasonal variability. By integrating key factors such as climate variations, demographic patterns, population migration, and vaccination rates, the current study addresses existing knowledge gaps and aims to enhance the forecasting accuracy of ILIs in Saudi Arabia.
Emerging Machine Learning in the Middle East
Various methods have been applied by Middle Eastern countries for forecasting ILIs, but these efforts remain in an ongoing developmental stage [24]. A limited number of studies have employed traditional forecasting methods, such as autoregressive integrated moving average (ARIMA) and SARIMA models, particularly during pandemics. Recently, a novel extreme gradient boosting (XGBoost) model demonstrated improved accuracy in forecasting monthly influenza case numbers in Saudi Arabia, surpassing traditional methods in pandemic scenarios [29]. Another study conducted in Syria compared multiple forecasting models using weekly ILI data collected via the Early Warning Alert and Response System. Although this study observed forecasting improvements, it noted significant limitations regarding data quality and availability [30]. Table 1 provides a comparative overview of traditional and machine learning methods. During the coronavirus disease 2019 (COVID-19) pandemic, Saudi Arabia leveraged machine-learning models for outbreak prediction, enhancing early disease detection, screening, prognosis, and facilitating the development of proactive strategies to mitigate disease impacts [8]. These findings underscore the underutilized potential of advanced machine-learning approaches in Middle Eastern countries, highlighting the need to address this gap.
Hybrid models, which integrate the strengths of traditional statistical approaches and machine learning techniques, offer the potential to develop more accurate and interpretable forecasting systems. Despite advancements in ILI forecasting, there has been limited comparative analysis between traditional statistical methods—such as the Holt-Winters model—and advanced deep learning techniques, particularly LSTM networks, in Saudi Arabia. Most research has focused exclusively on individual modeling approaches, often neglecting the influence of seasonal fluctuations and mass gatherings, such as Hajj and Umrah, on forecasting performance. An improved understanding of the factors that improve predictive accuracy could significantly optimize public health interventions and enhance early warning systems.
The aims of this study were as follows: (1) To compare the performance of the traditional Holt-Winters statistical model and the deep learning-based LSTM model in forecasting weekly seasonal ILI incidences in Saudi Arabia. (2) To evaluate the predictive accuracy of both models using the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) metrics. (3) To enhance public health awareness by identifying an adaptable, data-driven forecasting model tailored to the unique epidemiological trends and public health challenges in Saudi Arabia.
Study Design and Data Collection
This study evaluates the performance of the Holt-Winters and LSTM models for forecasting weekly ILI cases in Saudi Arabia. Data covering a 6-year period from 2017 to 2022 were obtained from the Saudi Ministry of Health and the WHO. Weekly data were collected from all regions of Saudi Arabia. Exogenous variables, such as climatic conditions, population mobility trends, and vaccination rates, were included to improve predictive accuracy.
Holt-Winters Model
The Holt-Winters forecasting method enhances univariate time series modeling [31–33]. As a statistical approach, the Holt-Winters method effectively captures seasonal and trend components, making it particularly suitable for short-term forecasting of seasonal data due to its computational efficiency and ease of implementation. Depending on the characteristics of the seasonal component, either additive or multiplicative Holt-Winters models are utilized [34–36]. Additive models are appropriate for data exhibiting linear or stable exponential trends without significant growth over time [37].
The additive Holt-Winters model is mathematically represented as follows [29,30]:
Forecasting:
(1)
$$\hat{x}_{t+h|t} = L_t + h\,T_t + S_{t-m+1+(h-1) \bmod m},$$
where:
$\hat{x}_{t+h|t}$: Forecasted value for time $t+h$,
$L_t$: Level component at time $t$,
$T_t$: Trend component at time $t$,
$S_{t-m+1+(h-1) \bmod m}$: Seasonal component for the forecasted period,
$m$: Length of the seasonal cycle.
Level ($L_t$):
(2)
$$L_t = \alpha\,(x_t - S_{t-m}) + (1-\alpha)\,(L_{t-1} + T_{t-1}),$$
Trend ($T_t$):
(3)
$$T_t = \beta\,(L_t - L_{t-1}) + (1-\beta)\,T_{t-1},$$
Seasonality ($S_t$):
(4)
$$S_t = \gamma\,(x_t - L_t) + (1-\gamma)\,S_{t-m},$$
where:
$x_t$: Observed value at time $t$,
$S_{t-m}$: Seasonal component from $m$ periods ago,
$\alpha, \beta, \gamma$: Smoothing parameters for level, trend, and seasonality, $0 < \alpha, \beta, \gamma < 1$.
The multiplicative model is formally expressed as follows [24,25]:
Forecasting:
(5)
$$\hat{x}_{t+h|t} = (L_t + h\,T_t) \cdot S_{t-m+1+(h-1) \bmod m},$$
where:
$\hat{x}_{t+h|t}$: Forecasted value for time $t+h$,
$L_t$: Level component at time $t$,
$T_t$: Trend component at time $t$,
$S_{t-m+1+(h-1) \bmod m}$: Seasonal component for the forecasted period,
$m$: Length of the seasonal cycle.
Level ($L_t$):
(6)
$$L_t = \alpha\,\frac{x_t}{S_{t-m}} + (1-\alpha)\,(L_{t-1} + T_{t-1}),$$
Trend ($T_t$):
(7)
$$T_t = \beta\,(L_t - L_{t-1}) + (1-\beta)\,T_{t-1},$$
Seasonality ($S_t$):
(8)
$$S_t = \gamma\,\frac{x_t}{L_t} + (1-\gamma)\,S_{t-m},$$
where:
$x_t$: Observed value at time $t$,
$S_{t-m}$: Seasonal component from $m$ periods ago,
$\alpha, \beta, \gamma$: Smoothing parameters for level, trend, and seasonality, $0 < \alpha, \beta, \gamma < 1$.
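The recursions in Eqs. (1) to (8) can be sketched in a few lines of dependency-free Python. This is an illustrative sketch only, not the implementation used in the study: the function name `holt_winters_forecast`, the `seasonal` flag, and in particular the heuristic used to initialise the level, trend, and seasonal states are assumptions introduced here, since the article does not describe its initialisation.

```python
def holt_winters_forecast(x, m, alpha, beta, gamma, horizon, seasonal="add"):
    """Smooth the series x with Holt-Winters and forecast `horizon` steps ahead.

    x: list of observations, m: seasonal cycle length,
    seasonal: "add" for Eqs. (1)-(4) or "mul" for Eqs. (5)-(8).
    """
    if seasonal not in ("add", "mul"):
        raise ValueError("seasonal must be 'add' or 'mul'")
    if len(x) < 2 * m:
        raise ValueError("need at least two full seasonal cycles")
    mul = seasonal == "mul"

    # Heuristic initial states (an assumption of this sketch, not from the article):
    # level = mean of the first cycle, trend = average per-step change between
    # the first two cycles, seasonal terms = deviation of the first cycle from the level.
    level = sum(x[:m]) / m
    trend = (sum(x[m:2 * m]) - sum(x[:m])) / (m * m)
    season = [x[i] / level if mul else x[i] - level for i in range(m)]

    for t in range(m, len(x)):
        s_prev = season[t - m]                                   # S_{t-m}
        deseasonalised = x[t] / s_prev if mul else x[t] - s_prev
        new_level = alpha * deseasonalised + (1 - alpha) * (level + trend)  # Eq. (2)/(6)
        trend = beta * (new_level - level) + (1 - beta) * trend             # Eq. (3)/(7)
        level = new_level
        detrended = x[t] / level if mul else x[t] - level
        season.append(gamma * detrended + (1 - gamma) * s_prev)             # Eq. (4)/(8)

    last = len(x) - 1  # index of the final observed time step t
    forecasts = []
    for h in range(1, horizon + 1):
        s = season[last - m + 1 + (h - 1) % m]   # seasonal index from Eq. (1)/(5)
        base = level + h * trend
        forecasts.append(base * s if mul else base + s)
    return forecasts
```

For a trend-free series that repeats exactly every `m` steps, both variants reproduce the seasonal pattern. Note that the multiplicative form divides by the level and seasonal states, so it is unsuitable for series containing zeros.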
Long Short-Term Memory
LSTM is a specific type of RNN. It enhances traditional RNNs through an improved structure containing 3 gate mechanisms—forget, input, and output gates—as well as a cell state [38]. This enhanced structure is crucial for epidemiological forecasting, as it effectively manages abrupt changes and irregular patterns driven by climatic variations, public health interventions, or mass gatherings. LSTM models overcome many limitations of traditional statistical methods by better capturing long-term dependencies and nonlinear interactions. Figure 1 illustrates the architecture of the LSTM model. The following paragraphs detail the architecture and gate mechanisms of the LSTM model [25].
The forget gate reviews the current time step's input information, denoted as $x_t$, and the previous time step's output information, denoted as $h_{t-1}$. When $f_t = 0$, the gate discards the read information. Conversely, when $f_t = 1$, it retains the read information. The formula for $f_t$ is as follows [39]:
(9)
$$f_t = \sigma\left(w_f \cdot [h_{t-1}, x_t] + b_f\right),$$
where $\sigma$ is the sigmoid function, $[h_{t-1}, x_t]$ is the concatenation of the previous output and the current input, $w_f$ is the weight matrix for the forget gate, and $b_f$ is the bias coefficient for the forget gate.
The input gate determines which new input information to store in the neuron. A candidate cell state $\bar{C}_t$ is created, and the input gate $i_t$ scales this candidate. The new information is subsequently added to the cell state. The specific formulas are as follows [40]:
(10)
$$\bar{C}_t = \tanh\left(w_c \cdot [h_{t-1}, x_t] + b_c\right),$$
(11)
$$i_t = \sigma\left(w_i \cdot [h_{t-1}, x_t] + b_i\right),$$
(12)
$$c_t = f_t \odot c_{t-1} + i_t \odot \bar{C}_t,$$
In the above formulas, $\odot$ denotes element-wise multiplication, $w_c$ is the weight matrix for the cell state, $b_c$ is the bias coefficient for the cell state, $w_i$ is the weight matrix for the input gate, and $b_i$ is the bias coefficient for the input gate.
The output gate determines the final output $h_t$ using the cell state. It starts by processing the current input information $x_t$ and the previous output information $h_{t-1}$. Then, it multiplies the result by the cell state processed by the tanh layer to obtain the final output $h_t$. The specific formulas are as follows [41]:
(13)
$$o_t = \sigma\left(w_o \cdot [h_{t-1}, x_t] + b_o\right),$$
(14)
$$h_t = o_t \odot \tanh(c_t),$$
In these formulas, $w_o$ is the weight matrix for the output gate, and $b_o$ is the bias coefficient for the output gate.
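To make the gate mechanics in Eqs. (9) to (14) concrete, the following is a minimal sketch of a single LSTM time step in plain Python. It assumes that each weight matrix acts on the concatenation of $h_{t-1}$ and $x_t$ and that the gate products are element-wise. The helper names (`affine`, `lstm_step`) and the dictionary layout of the parameters are hypothetical, and a real model would rely on a deep learning library rather than hand-written loops.

```python
import math


def sigmoid(z):
    """Logistic function used by the forget, input, and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(w, v, b):
    """Return w @ v + b, where w is a list of rows and v, b are lists."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi for row, bi in zip(w, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step.

    p maps "wf", "wi", "wc", "wo" to weight matrices with len(h_prev) rows and
    len(h_prev) + len(x_t) columns, and "bf", "bi", "bc", "bo" to bias lists.
    """
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["wf"], z, p["bf"])]        # Eq. (9)
    c_bar = [math.tanh(a) for a in affine(p["wc"], z, p["bc"])]    # Eq. (10)
    i_t = [sigmoid(a) for a in affine(p["wi"], z, p["bi"])]        # Eq. (11)
    c_t = [f * c + i * cb                                          # Eq. (12)
           for f, c, i, cb in zip(f_t, c_prev, i_t, c_bar)]
    o_t = [sigmoid(a) for a in affine(p["wo"], z, p["bo"])]        # Eq. (13)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]             # Eq. (14)
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the step simply halves the previous cell state, which is a quick way to sanity-check an implementation.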
Both models are consistently employed throughout this study, leveraging combined statistical and machine learning strengths. Combining Holt-Winters and LSTM methods potentially enhances forecasting accuracy by better adjusting for seasonal variations. Optimizing hyperparameters is crucial for enhancing model performance, particularly when predicting unseen or new data [42,43]. Hyperparameter tuning involves selecting and adjusting parameters such as the optimizer type, number of epochs, number of neurons, activation functions, and loss functions. Appropriate optimization levels significantly improve the accuracy and generalizability of forecasting outcomes.
Cross-Validation for Model Optimization
Cross-validation is a systematic method used to explore combinations of hyperparameters, helping to identify the optimal model configuration. This approach is especially valuable for time-sensitive applications such as disease forecasting [44]. It is widely employed to estimate prediction errors and focuses explicitly on preventing model overfitting. A commonly used cross-validation method for optimizing LSTM models is Monte Carlo cross-validation [45]. For LSTM models, the optimizer’s learning rate (e.g., Adam or RMSprop) regulates the step size for updating model parameters (Figure 2). Overfitting is typically mitigated by incorporating a dropout rate, which randomly disables neurons during training. Additionally, the number of epochs and batch size play crucial roles: increasing epochs can capture complex patterns but risk overfitting, whereas batch size influences both training duration and model generalization [42].
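One way to adapt Monte Carlo cross-validation to sequential data is to draw random forecast origins and keep each validation block strictly after its training block. The sketch below illustrates that idea. The article does not specify how its splits were drawn, so the function name `monte_carlo_time_splits`, the contiguous-window design, and all parameter names are assumptions made for illustration.

```python
import random


def monte_carlo_time_splits(n_samples, n_train, n_val, n_splits=5, seed=0):
    """Yield (train_indices, val_indices) pairs from randomly placed windows.

    Each split is a contiguous training window of n_train points followed
    immediately by a validation window of n_val points, so no future data
    leaks into training.
    """
    if n_train < 1 or n_val < 1 or n_train + n_val > n_samples:
        raise ValueError("window sizes do not fit the series")
    rng = random.Random(seed)  # seeded for reproducible splits
    latest_start = n_samples - n_train - n_val
    for _ in range(n_splits):
        start = rng.randint(0, latest_start)  # inclusive on both ends
        train = list(range(start, start + n_train))
        val = list(range(start + n_train, start + n_train + n_val))
        yield train, val
```

Each candidate hyperparameter setting would then be trained on every `train` window and scored on the matching `val` window, with the scores averaged across splits.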
Frequently used metrics for evaluating prediction accuracy include MAE, mean squared error (MSE), RMSE, and the coefficient of determination (R²) [42,43]. The MAE quantifies the average magnitude of prediction errors, representing the typical error magnitude expected from forecasts [46–50]:
(15)
$$\mathrm{MAE} = \frac{1}{n}\sum_{j=1}^{n}\left|Z_j - \hat{Z}_j\right|,$$
where $n$ is the number of errors and $|Z_j - \hat{Z}_j|$ are the absolute errors.
Eq. (16) defines the RMSE, which represents the standard deviation of the prediction errors.
(16)
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\chi_i - \hat{\chi}_i\right)^2}$$
R² quantifies the proportion of the variance in weekly ILI cases that is explained by the model, as defined in Eq. (17).
(17)
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\chi_i - \hat{\chi}_i\right)^2}{\sum_{i=1}^{n}\left(\chi_i - \bar{\chi}\right)^2}$$
where $\hat{\chi}_i$ represents the predicted value of the $i$th sample, $\chi_i$ represents the matching observed value, and $\bar{\chi}$ is the mean of the $n$ observed values.
PBIAS (%)
The percent bias (PBIAS) quantifies the amount of bias in the model by assessing the cumulative discrepancy between observed and predicted values relative to the total observed values:
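(18)
$$\mathrm{PBIAS} = 100 \times \frac{\sum_{i=1}^{n}\left(\chi_i - \hat{\chi}_i\right)}{\sum_{i=1}^{n}\chi_i},$$
where:
$\chi_i$ is the observed value,
$\hat{\chi}_i$ is the predicted value,
$n$ is the number of observations.
PBIAS thus expresses the average tendency of the forecasts to deviate from the observations, as a percentage of the total observed values [51–53].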
Willmott’s Index of Agreement
Willmott’s index of agreement (WI) is a dimensionless statistic used to evaluate the magnitude and direction of predictive error across various models [52,54,55]. It is particularly useful for comparing predictive accuracy between models:
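(19)
$$\mathrm{WI} = 1 - \frac{\sum_{i=1}^{n}\left(\chi_i - \hat{\chi}_i\right)^2}{\sum_{i=1}^{n}\left(\left|\hat{\chi}_i - \bar{\chi}\right| + \left|\chi_i - \bar{\chi}\right|\right)^2}$$
where:
$\chi_i$ is the observed value,
$\hat{\chi}_i$ is the predicted value,
$n$ is the number of observations,
$\bar{\chi}$ is the mean of observed values.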
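The evaluation metrics defined in Eqs. (15) to (19) can be computed together, as in the sketch below. Several details are assumptions rather than statements of the study's implementation: the function name `forecast_metrics` is hypothetical, PBIAS is taken as observed minus predicted and scaled by 100 to give a percentage, and WI is written in the standard Willmott form with a leading `1 -`.

```python
import math


def forecast_metrics(observed, predicted):
    """Return MAE, RMSE, R^2, PBIAS (%), and Willmott's index as a dict."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [o - p for o, p in zip(observed, predicted)]  # observed minus predicted
    mean_obs = sum(observed) / n
    sse = sum(e * e for e in errors)                       # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)       # total sum of squares
    wi_den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                 for o, p in zip(observed, predicted))

    # Note: r2, pbias, and wi divide by sst, sum(observed), and wi_den,
    # so they are undefined for constant or all-zero observations.
    return {
        "mae": sum(abs(e) for e in errors) / n,            # Eq. (15)
        "rmse": math.sqrt(sse / n),                        # Eq. (16)
        "r2": 1 - sse / sst,                               # Eq. (17)
        "pbias": 100 * sum(errors) / sum(observed),        # Eq. (18), in percent
        "wi": 1 - sse / wi_den,                            # Eq. (19)
    }
```

For example, `forecast_metrics([10, 20, 30, 40], [12, 18, 33, 37])` gives an MAE of 2.5 and a PBIAS of 0, because the over- and underestimates cancel in the signed sum even though no single forecast is exact. This is why PBIAS is reported alongside, rather than instead of, the absolute-error metrics.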
The workflow is presented in Figure 3.
Data Characterization
Temporal epidemiological analysis was performed on ILI trends, highlighting variability in disease incidence across different seasons and years. Weekly data on ILIs in Saudi Arabia, spanning from January 2017 to December 2022, were collected from the Saudi Ministry of Health and the WHO. The dataset encompasses all regions of Saudi Arabia, providing comprehensive national-level insight into ILI dynamics.
Data Preparation
To ensure data accuracy, consistency, and readiness for analysis, the following pre-processing steps were implemented: (1) Missing data points were imputed using interpolation. (2) Data were normalized using min-max scaling, which rescales ILI case counts to the range [0, 1], to ensure compatibility with the LSTM algorithm and support faster convergence. (3) Data were split into training (80%), validation (10%), and testing (10%) sets.
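The three pre-processing steps can be sketched as follows. The article does not state which interpolation scheme was used, whether the split was chronological, or whether the scaling bounds came from the full series or the training portion. The sketch therefore assumes linear interpolation, a chronological split, and scaling bounds passed in explicitly (for instance, taken from the training set to avoid leaking test information). All function names are hypothetical.

```python
def interpolate_missing(values):
    """Fill None entries by linear interpolation between the nearest known
    neighbours; gaps at either end copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def min_max_scale(values, lo, hi):
    """Rescale values so that lo maps to 0 and hi maps to 1."""
    if hi == lo:
        raise ValueError("hi and lo must differ")
    return [(v - lo) / (hi - lo) for v in values]


def chronological_split(values, train=0.8, val=0.1):
    """Split a series into train/validation/test parts without shuffling."""
    n = len(values)
    i = int(n * train)
    j = i + int(n * val)
    return values[:i], values[i:j], values[j:]
```

A typical use would be to interpolate first, split second, and then scale all three parts with the minimum and maximum of the training part. Validation and test values can then fall slightly outside [0, 1], which is expected under this design.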
Characteristics of weekly ILI cases
Figure 4 illustrates clear, recurrent peaks at regular intervals, demonstrating seasonal patterns in weekly ILI incidence. The peaks correspond with defined seasons, particularly winter, and significant mass events such as the Hajj pilgrimage season. A noticeable decline in ILI cases occurred between 2020 and 2021, reflecting stringent preventive measures implemented in response to COVID-19. This decline reflects the impact of public health interventions on disease spread.
Forecasting weekly ILI cases utilizing the Holt-Winters model
The Holt-Winters forecasting models were developed and assessed using additive and multiplicative seasonal components based on the training dataset. A systematic grid search identified the optimal model configuration, selecting the model with the lowest RMSE for the test dataset. Comprehensive evaluation identified the Holt-Winters multiplicative model as the best-performing configuration, and multiple metrics were computed to evaluate its performance (Table 2). These metrics affirmed the model’s capability to capture seasonal trends effectively. The smoothing parameters for level, trend, and seasonality were automatically optimized based on data characteristics.
Figure 5 presents the actual and predicted values generated by the Holt-Winters model. The graph demonstrates that the model effectively captured general seasonal patterns and long-term trends; however, deviations are evident during periods of high volatility. Notably, the model substantially underestimated ILI cases during seasonal peaks, such as the Hajj season, and struggled to adapt to the sudden case declines during 2020–2021 attributed to COVID-19 prevention measures.
Forecasting weekly ILI cases utilizing the LSTM model
For the LSTM model, data normalization was performed using min-max scaling, and sequences were generated using a 10-week timestep to provide adequate temporal context. The model was trained across 20 epochs using a batch size of 32, employing the Adam optimizer and MSE as the loss function to reduce forecasting inaccuracies. Model performance was evaluated, as summarized in Table 2. Figures 5 and 6 emphasize discrepancies between observed and predicted values, clearly showing that the Holt-Winters model struggled with high-variability periods, while the LSTM model maintained greater stability.
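The 10-week timestep described above corresponds to a sliding-window transformation of the scaled series. A minimal sketch is given below, assuming a one-step-ahead target (the week immediately following each window), which the article does not state explicitly. The function name `make_sequences` is hypothetical.

```python
def make_sequences(series, timestep=10):
    """Turn a 1-D series into (inputs, targets) for one-step-ahead forecasting.

    inputs[k] holds `timestep` consecutive values and targets[k] is the value
    that immediately follows them.
    """
    if len(series) <= timestep:
        raise ValueError("series must be longer than the timestep")
    n_windows = len(series) - timestep
    inputs = [series[i:i + timestep] for i in range(n_windows)]
    targets = [series[i + timestep] for i in range(n_windows)]
    return inputs, targets
```

A series of N weekly values yields N - 10 training pairs, so roughly six years of weekly data (about 312 points before splitting) leave a few hundred windows, which helps explain why the amount of available data is a practical constraint for LSTM training.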
Figure 5 demonstrates the Holt-Winters model’s forecast for weekly ILI cases from 2017 to 2023, comparing training and test datasets with predictions. Although the Holt-Winters model captured seasonality and general trends reasonably well, it struggled with sudden spikes during events such as the Hajj season and significant variability during the COVID-19 pandemic. These deviations reflect the model’s inability to fully account for nonlinear patterns, suggesting that incorporating exogenous variables or using hybrid modeling approaches could enhance forecast performance.
Figure 6 shows that the LSTM model effectively captured both seasonal variations and long-term trends in ILI incidence. The graph compares observed data from the training set, test set, and LSTM predictions, clearly illustrating the model’s capacity to discern patterns accurately and generate reliable predictions during the testing period. While the model generally exhibited strong performance, minor inconsistencies were observed during periods of rapid variation in ILI incidence, indicating opportunities for further improvement through feature engineering or additional hyperparameter optimization. LSTM predictions closely tracked actual values, showing only slight discrepancies during abrupt incidence changes. Incorporating exogenous variables such as mobility patterns, climatic factors, and vaccination rates, along with hyperparameter fine-tuning, could further enhance predictive accuracy. The LSTM model notably demonstrated superior accuracy in handling temporal dependencies and complex epidemiological patterns.
Figure 7 highlights a sudden peak during the Hajj season, illustrating that the LSTM model predicted these spikes more accurately than the Holt-Winters model. The LSTM’s superior performance is attributed to its advanced capacity for handling temporal dependencies and retaining historical information. However, brief periods of high volatility showed slight deviations in LSTM predictions, suggesting that additional feature engineering may be beneficial (Table 2).
To comprehensively assess forecasting performance, statistical metrics were used, including RMSE, MAPE, R2, PBIAS [111], and the WI [111]. RMSE and MAPE assess prediction accuracy by quantifying absolute and relative errors, respectively, while R² indicates the proportion of variance explained by the model. PBIAS provides insights into the direction and magnitude of systematic model bias, determining if predictions are generally overestimated or underestimated relative to observations, particularly beneficial for long-term forecasts. The WI evaluates overall predictive performance, quantifying the proximity between observed and predicted values, further illustrated by the Taylor diagram and box plot of prediction errors (Figures 8, 9). As summarized in Table 2, the LSTM model significantly outperformed the Holt-Winters model, with notably lower RMSE (34.07 vs. 82.57), reduced MAPE (0.18 vs. 0.38), higher R2 (0.93 vs. 0.58), and substantially improved PBIAS (+5.8% vs. +14.2%) and WI (0.48 vs. 105.79) scores.
These findings strongly support the conclusion that LSTM models are more suitable than traditional statistical approaches for modeling complex, nonlinear, and seasonally varying time-series data related to ILI trends.
This study compared the predictive performance of Holt-Winters and LSTM models for forecasting weekly ILI cases in Saudi Arabia, emphasizing the inclusion of exogenous variables such as climatic conditions, population mobility, and vaccination coverage. The results demonstrated that the LSTM model, particularly when incorporating exogenous variables, consistently outperformed the Holt-Winters model across all evaluation metrics. Specifically, the LSTM model achieved an RMSE of 28.55 and an R² of 0.96, highlighting its superior capability to capture nonlinear trends and significant peaks in disease incidence, particularly during high-variability periods like the Hajj season. Both additive and multiplicative seasonal components were methodically examined using a grid search methodology to determine the optimal model configuration. This systematic evaluation allowed exploration of multiple parameter combinations, ensuring the selection of the most effective model configuration by balancing computational efficiency and accuracy. Although the Holt-Winters model provides an intuitive framework for addressing seasonality, differences emerged in forecasts during periods of elevated ILI variability. These discrepancies highlight that, despite its proficiency in capturing general trends and predictable seasonal patterns, the Holt-Winters model struggles with nonlinear dynamics and abrupt changes typical of epidemiological data. A previous study indicated similar challenges, noting that the model may fail to predict sudden spikes or drops in disease incidence due to unforeseen events or interventions. Moreover, Holt-Winters models inherently assume consistent seasonal patterns, an assumption frequently invalid in epidemiological contexts where seasonality may vary considerably [34].
In contrast, the LSTM model demonstrated enhanced performance across all assessment criteria, including a substantial reduction in RMSE (34.07 vs. 82.57), reduced MAPE (0.18 vs. 0.38), and increased R² (0.93 vs. 0.58). These metrics underscore the practical advantage of LSTM, facilitating more accurate disease forecasts and enabling timely public health interventions. The improved metrics also reflect the LSTM model's strong capability in learning sequential dependencies and complex nonlinear interactions inherent in epidemiological time series data. By utilizing 2 stacked LSTM layers combined with fully connected layers, the model effectively captured intricate temporal patterns. Stacked LSTM architecture enhances the model's ability to recognize complex temporal dynamics and nonlinear data dependencies. Each additional layer of LSTM allows for deeper, more abstract representations of the data, which can improve performance on forecasting tasks [56]. Despite these advancements, the LSTM model encountered challenges when handling sudden fluctuations in ILI incidence rates. Occasional discrepancies during periods of rapid change likely reflect issues such as limited data availability or inherent randomness in epidemiological data, indicating room for further model refinement.
Compared to related research, the current study offers several methodological improvements. For instance, Khan et al. [57] introduced a cloud-based system for pandemic influenza forecasting that applies a feed-forward propagation neural network (MSDII-FFNN) to Internet of Things-generated data. While comparable in its real-time monitoring objective, their approach focused on infrastructure integration rather than methodological optimization. Unlike our study, Khan et al. [57] did not utilize standardized performance metrics or baseline controls, thus limiting the rigorous evaluation of their model's reliability and accuracy. Similarly, Alzahrani and Guma [58] utilized ARIMA, SARIMA, and XGBoost models to predict monthly influenza cases in Saudi Arabia. Their findings align with our results in demonstrating machine learning's superiority over traditional statistical methods. However, their study differed significantly in terms of scope and temporal resolution, employing monthly data and omitting exogenous variables, thus restricting their model's applicability for predicting rapid epidemiological changes such as those associated with the Hajj pilgrimage or unexpected outbreaks.
Meanwhile, the study by Olukanmi et al. [59] is similar to ours in employing deep learning techniques for direct weekly ILI predictions. Their approach, however, was distinguished by using digital behavior data from Google Trends rather than structured environmental data. While both studies highlight the strengths of LSTM models, they demonstrated the effectiveness of real-time public data in identifying early symptom-related searches, whereas our study emphasized epidemiological and environmental proxies to further improve predictive performance. These comparisons illustrate how different data inputs—such as digital search trends in Olukanmi et al. [59], cloud-embedded monitoring in Khan et al., and monthly aggregated case data in Alzahrani and Guma [58]—influence model outcomes. Our research positions itself between these methodologies by utilizing formal, real-world weekly data with additional exogenous variables, achieving a balance of timeliness, precision, and policy relevance. Developing interpretable hybrid models capable of integration into public health systems will be crucial for timely and accurate epidemic forecasting across diverse epidemiological contexts.
However, other limitations must be considered. Training deep learning models such as LSTM is resource-intensive, demanding substantial computational power, specialized hardware, and memory [43,60]. Conversely, the Holt-Winters model is more suited to real-time applications due to its computational efficiency and lower resource requirements. Nonetheless, this computational efficiency compromises prediction accuracy when dealing with complex, nonlinear epidemiological data.
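To illustrate why Holt-Winters is computationally light, the following is a minimal sketch of the additive variant in plain Python. It is not the study's implementation: the initialization scheme, the function name `holt_winters_additive`, and the smoothing values in the usage lines are assumptions chosen for illustration. Each observation triggers three constant-time updates, so fitting is a single pass over the series.

```python
def holt_winters_additive(y, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters; returns `horizon` forecasts past the end of `y`.

    Illustrative sketch only. Initialization (an assumption): level is the
    mean of the first season, trend is the average per-step change between
    the first two seasons, seasonal terms are first-season deviations.
    """
    m = season_length
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialize")

    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    seasonal = [y[i] - level for i in range(m)]

    # Single pass over the data: three O(1) updates per observation.
    for t in range(m, len(y)):
        s_prev = seasonal[t % m]
        prev_level = level
        level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y[t] - level) + (1 - gamma) * s_prev

    n = len(y)
    # Forecast h steps ahead: extrapolate the trend, reuse the matching seasonal term.
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Toy usage with a hypothetical 4-step season (weekly ILI data would use 52).
toy_series = [10, 20, 30, 40] * 3
forecast = holt_winters_additive(toy_series, season_length=4,
                                 alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
```

By contrast, an LSTM repeats many passes of gradient updates over the same data, which is where the additional hardware demand comes from.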
Integrating advanced characteristics—such as population mobility patterns, climatic variations, and vaccination rates—can significantly enhance a model’s ability to predict seasonal peaks and fluctuations associated with mass gatherings, notably the Hajj pilgrimage, thereby deepening our understanding of influenza transmission dynamics [61,62]. Certain models can dynamically adjust real-time forecasts according to changing environmental factors. Improved epidemiological models may focus more closely on population migration patterns to enhance disease transmission predictions [61,63]. Moreover, climatic factors like temperature and humidity have previously improved the accuracy of epidemiological models for COVID-19 transmission projections [64]. Similarly, predictive models in the United States have enhanced performance by explicitly incorporating population migration rates as exogenous variables [64]. Concentrating on these factors could allow machine learning models to predict changes in ILI incidence more accurately, ultimately facilitating more effective public health strategies.
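One common way to feed exogenous variables to a sequence model such as an LSTM is to stack them alongside the lagged case counts in each input window. The sketch below assumes that layout; the helper name `make_windows`, the `lookback` parameter, and the toy values are hypothetical and not taken from the study.

```python
def make_windows(cases, exog, lookback):
    """Turn a weekly series plus exogenous rows into supervised samples.

    cases    : list of weekly case counts
    exog     : list of per-week feature lists (e.g. temperature, mobility)
    lookback : number of past weeks in each input window

    Returns (X, y) where X[i] is a `lookback` x (1 + n_exog) window and
    y[i] is the case count of the week immediately after that window.
    """
    if len(cases) != len(exog):
        raise ValueError("cases and exog must cover the same weeks")

    X, y = [], []
    for end in range(lookback, len(cases)):
        # Only weeks strictly before the target week enter the window,
        # so the target's own covariates do not leak into the input.
        window = [[cases[t], *exog[t]] for t in range(end - lookback, end)]
        X.append(window)
        y.append(cases[end])
    return X, y


# Toy usage: two hypothetical covariates per week.
toy_cases = [1, 2, 3, 4, 5]
toy_exog = [[10, 0.1], [11, 0.2], [12, 0.3], [13, 0.4], [14, 0.5]]
X, y = make_windows(toy_cases, toy_exog, lookback=2)
```

The resulting samples-by-timesteps-by-features layout is the shape that recurrent layers in most deep learning libraries expect; dropping the exogenous columns reduces it to the univariate case.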
When comparing the two approaches, interpretability and simplicity are key strengths of the Holt-Winters model, although its performance declines in complex scenarios characterized by temporal dependencies or sudden shifts. Conversely, deep learning models such as LSTM represent these complexities well but require careful hyperparameter tuning, substantial computational resources, and strategies to mitigate overfitting. These findings underscore the appropriateness of advanced machine learning methods, like LSTM, for predicting influenza and related illnesses within dynamic epidemic settings. In stable conditions, the Holt-Winters model remains beneficial due to its efficiency and interpretability. Future studies could explore hybrid methodologies combining statistical and machine learning models, capitalizing on their complementary strengths to enhance predictive performance. Previous research has demonstrated the effectiveness of hybrid models: for instance, combining LSTM with ARIMA in time-series forecasting has yielded improvements in predictive accuracy and adaptability within epidemiological modeling [62]. Additionally, hybrid deep learning approaches integrating convolutional neural networks with LSTMs have effectively predicted disease trends, enhancing both feature extraction and sequential learning capabilities [63]. These hybrid models could achieve even greater accuracy when enriched with exogenous variables such as population migration, vaccination rates, or climatic conditions.
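A hybrid of the kind described above is often built as a residual correction: a statistical component captures the seasonal structure, and a second learner models what remains using exogenous inputs. The sketch below shows only that structure. A per-position seasonal mean stands in for the statistical model and a one-variable least-squares line stands in for the learner; neither is ARIMA, a CNN, or an LSTM, and all function names are hypothetical.

```python
def seasonal_profile(y, m):
    """Mean of y at each position of a length-m seasonal cycle."""
    sums, counts = [0.0] * m, [0] * m
    for t, value in enumerate(y):
        sums[t % m] += value
        counts[t % m] += 1
    return [s / c for s, c in zip(sums, counts)]


def fit_line(x, r):
    """Ordinary least squares for r ~ a + b * x (single regressor)."""
    n = len(x)
    mean_x, mean_r = sum(x) / n, sum(r) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("regressor has no variation")
    b = sum((xi - mean_x) * (ri - mean_r) for xi, ri in zip(x, r)) / sxx
    return mean_r - b * mean_x, b


def fit_hybrid(y, x, m):
    """Stage 1: seasonal baseline. Stage 2: regress its residuals on x."""
    profile = seasonal_profile(y, m)
    residuals = [value - profile[t % m] for t, value in enumerate(y)]
    a, b = fit_line(x, residuals)
    return profile, a, b


def predict_hybrid(model, t, x_t):
    """Baseline forecast for week t plus the exogenous residual correction."""
    profile, a, b = model
    return profile[t % len(profile)] + a + b * x_t


# Toy usage: the series is a 2-step seasonal pattern plus 2 * x.
model = fit_hybrid(y=[12, 22, 16, 26], x=[1, 1, 3, 3], m=2)
next_value = predict_hybrid(model, t=4, x_t=5)
```

Swapping either stage for a stronger model keeps the same two-step shape, which is one reason this design is attractive when interpretability of the baseline matters.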
Limitations and Future Work
This study’s models had limited capacity to fully adjust for external factors affecting disease transmission, focusing predominantly on historical ILI case data. Incorporating exogenous variables such as population mobility, vaccination coverage, and environmental conditions (temperature and humidity) may further improve forecasting accuracy. Still, the models faced challenges accommodating abrupt shifts and inherent nonlinearities in the data, illustrating the broader challenge of relying solely on statistical approaches in highly volatile datasets. Nevertheless, the LSTM model consistently performed better across all evaluation metrics, reflecting its strong capability to capture temporal dependencies and nonlinear interactions. Lower RMSE and MAPE values combined with a higher R2 indicate that LSTM is particularly suitable and effective for predicting ILI cases in dynamic and complex epidemic scenarios. Predictions occasionally deviated during rapid data fluctuations, indicating potential improvements through additional feature enrichment and hyperparameter tuning. Previous studies have shown that incorporating mobility data into COVID-19 models and environmental variables into epidemiological forecasts enhances predictive performance [61,63,64]. Future research efforts should focus on developing hybrid models combining statistical and machine learning approaches to leverage their respective strengths. Moreover, integrating exogenous variables such as meteorological data, migration patterns, vaccination coverage, and public health interventions could yield more precise and practically valuable predictions. The findings from this study contribute to improved methods for forecasting ILI cases, facilitating public health planning and enhancing intervention strategies for seasonal and emerging infectious diseases.
The ARIMAX model, for instance, already incorporates external variables into epidemiological predictions, and combining ARIMAX with LSTM models could further enhance predictive accuracy while retaining interpretability. Future work might also involve public health experts in model development, thereby improving adoption and utilization within healthcare systems. Additionally, the integration of real-time data streams and adaptive modeling techniques warrants further investigation to better accommodate evolving epidemiological trends.
This study explored the strengths and limitations of Holt-Winters and LSTM models for predicting weekly ILI cases in Saudi Arabia. The Holt-Winters model effectively captured seasonal patterns and general trends but struggled significantly with sudden shifts in incidence. In contrast, the LSTM model delivered superior prediction accuracy, demonstrating exceptional proficiency in modeling nonlinear patterns and temporal relationships. Nevertheless, the LSTM model remains sensitive to hyperparameter adjustments and requires considerable computational resources. Future research should explore hybrid modeling approaches, integrating deep learning and statistical methods alongside exogenous variables to improve predictive accuracy further. These advances will significantly enhance epidemic forecasting, benefiting public health planning and response strategies. Thus, this study reinforces that novel machine learning approaches, such as LSTM, generally yield superior predictive results for complex epidemiological scenarios, whereas classical statistical models like Holt-Winters remain faster, simpler, and valuable in more stable and predictable contexts.
• This study compared the Holt-Winters statistical model and the long short-term memory (LSTM) deep learning model for forecasting influenza-like illnesses in Saudi Arabia.
• LSTM outperformed Holt-Winters across all evaluation metrics, achieving a root mean squared error (RMSE) of 34.07 and an R2 of 0.93, compared to Holt-Winters’ RMSE of 82.57 and R2 of 0.58.
• The LSTM model better captured nonlinear temporal patterns, whereas Holt-Winters struggled with rapid fluctuations in incidence and seasonal variability.
• Incorporating exogenous variables such as climate factors, mobility patterns, and vaccination rates further improved the predictive accuracy of the LSTM model.
• Future research should explore hybrid models integrating statistical and machine learning approaches for enhanced epidemic forecasting.

Ethics Approval

Not applicable.

Conflicts of Interest

The author has no conflicts of interest to declare.

Funding

None.

Availability of Data

Data are available upon reasonable request.

Figure 1. The architecture of the long short-term memory model and its gate structures.
Figure 2. Hyperparameter tuning techniques. LSTM, long short-term memory; popLSTM, proposed optimized LSTM.
Figure 3. Representation of the entire workflow. LSTM, long short-term memory; ILI, influenza-like illness.
Figure 4. Weekly influenza-like illness (ILI) incidence in Saudi Arabia from 2017 to 2023, showing seasonal peaks and fluctuations.
Figure 5. Holt-Winters forecasting of influenza-like illness (ILI) with training and test datasets.
Figure 6. Long short-term memory (LSTM) forecasting of influenza-like illness (ILI) with training and test datasets.
Figure 7. Comparative forecast of influenza-like illness (ILI) cases. LSTM, long short-term memory.
Figure 8. Taylor diagram. LSTM, long short-term memory.
Figure 9. Box plot and model prediction error analysis. LSTM, long short-term memory.
Table 1. Comparison of traditional and machine learning models

| Features | Traditional models | Machine learning models |
|---|---|---|
| Model types | SARIMA, exponential smoothing, Holt-Winters | Random forests, support vector machines, recurrent neural networks (e.g., LSTM, BiLSTM, ConvLSTM) |
| Data handling | Primarily univariate time series data | Capable of handling high-dimensional, nonlinear, and multivariate data |
| Trend and pattern capture | Effective in identifying linear and seasonal trends | Superior in capturing complex nonlinear dynamics and long-term dependencies |
| Predictive accuracy | Adequate for short-term forecasts; may struggle with irregular patterns | Demonstrated higher accuracy in various studies, especially for long-term and complex pattern forecasting |
| Applications in the Middle East | Widely used in regional studies; limited integration of exogenous variables | Underutilized; potential for enhanced accuracy by incorporating factors like climate, mobility, and vaccination rates |

SARIMA, seasonal autoregressive integrated moving average; LSTM, long short-term memory; BiLSTM, bidirectional LSTM.

Table 2. Evaluation of the performance of 2 models

| Model | RMSE | MAPE | R2 | PBIAS | WI |
|---|---|---|---|---|---|
| Holt-Winters | 82.57 (95% CI, 80.21–85.12) | 0.38 (95% CI, 0.36–0.40) | 0.58 | +14.2% | 105.79 |
| LSTM | 34.07 (95% CI, 32.45–35.62) | 0.18 (95% CI, 0.16–0.19) | 0.93 | +5.8% | 0.48 |
| LSTM (with weather+mobility+vaccination) | 28.55 | 0.14 | 0.96 | +2.1% | |

RMSE, root mean squared error; MAPE, mean absolute percentage error; PBIAS, percent bias; WI, Willmott’s index of agreement; CI, confidence interval; LSTM, long short-term memory.
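For readers who want to reproduce the metrics named in Table 2, the sketch below gives common textbook definitions in plain Python. These are assumptions rather than the study's exact formulas: MAPE is returned as a fraction rather than a percentage, PBIAS is signed so that positive values mean predictions exceed observations overall (the opposite convention is also in use), and WI follows Willmott's original index of agreement, which in this form lies between 0 and 1.

```python
import math


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error as a fraction; undefined if any obs is 0."""
    return sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


def pbias(obs, pred):
    """Percent bias. ASSUMED sign convention: positive = overall overprediction."""
    return 100 * sum(p - o for o, p in zip(obs, pred)) / sum(obs)


def willmott_index(obs, pred):
    """Willmott's index of agreement (original form)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    return 1 - num / den


# Toy usage with made-up weekly counts (not study data).
observed = [100, 200, 300, 400]
predicted = [110, 190, 310, 420]
scores = {
    "RMSE": rmse(observed, predicted),
    "MAPE": mape(observed, predicted),
    "R2": r_squared(observed, predicted),
    "PBIAS": pbias(observed, predicted),
    "WI": willmott_index(observed, predicted),
}
```

RMSE is in the units of the case counts, while the other four are scale-free, which is why they are easier to compare across datasets.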

  • 1. Haider S, Hassan MZ. Seasonal influenza surveillance and vaccination policies in the WHO South-East Asian Region. BMJ Glob Health 2025;10:e017271.
  • 2. Fiandrino S, Bizzotto A, Guzzetta G, et al. Collaborative forecasting of influenza-like illness in Italy: the Influcast experience. Epidemics 2025;50:100819.
  • 3. Babcock HM, Merz LR, Fraser VJ. Is influenza an influenza-like illness?: clinical presentation of influenza in hospitalized patients. Infect Control Hosp Epidemiol 2006;27:266−70.
  • 4. Alsubaie N, EL Guma F, Boulehmi K, et al. Improving influenza epidemiological models under Caputo fractional-order calculus. Symmetry 2024;16:929.
  • 5. Rashid H, Shafi S, Haworth E, et al. Influenza vaccine in Hajj pilgrims: policy issues from field studies. Vaccine 2008;26:4809−12.
  • 6. Liu IT, Prasad V, Darrow JJ. Evidence for community face masking to limit the spread of SARS-CoV-2: a critical review. Health Matrix 2023;33:1.
  • 7. Alshahrani SM, Zahrani Y. Prevalence and predictors of seasonal influenza vaccine uptake in Saudi Arabia post COVID-19: a web-based online cross-sectional study. Vaccines (Basel) 2023;11:353.
  • 8. Nagraj VP, Benefield AE, Williams D, et al. PLANES: plausibility analysis of epidemiological signals. PLoS One 2025;20:e0320442.
  • 9. Nsoesie EO, Brownstein JS, Ramakrishnan N, et al. A systematic review of studies on forecasting the dynamics of influenza outbreaks. Influenza Other Respir Viruses 2014;8:309−16.
  • 10. Lima S, Goncalves AM, Costa M. Time series forecasting using Holt-Winters exponential smoothing: an application to economic data. AIP Conf Proc 2019;2186:090003.
  • 11. Luo J, Wang X, Fan X, et al. A novel graph neural network based approach for influenza-like illness nowcasting: exploring the interplay of temporal, geographical, and functional spatial features. BMC Public Health 2025;25:408.
  • 12. Meyer AG, Lu F, Clemente L, et al. A prospective real-time transfer learning approach to estimate influenza hospitalizations with limited data. Epidemics 2025;50:100816.
  • 13. Cobb NL, Collier S, Attia EF, et al. Global influenza surveillance systems to detect the spread of influenza-negative influenza-like illness during the COVID-19 pandemic: time series outlier analyses from 2015-2020. PLoS Med 2022;19:e1004035.
  • 14. Guma FE. Comparative analysis of time series prediction models for visceral leishmaniasis: based on SARIMA and LSTM. Appl Math Inf Sci 2024;18:125−32.
  • 15. Ginsberg J, Mohebbi MH, Patel RS, et al. Detecting influenza epidemics using search engine query data. Nature 2009;457:1012−4.
  • 16. Chen Q, Zheng X, Shi H, et al. Prediction of influenza outbreaks in Fuzhou, China: comparative analysis of forecasting models. BMC Public Health 2024;24:1399.
  • 17. Tsan YT, Chen DY, Liu PY, et al. The prediction of influenza-like illness and respiratory disease using LSTM and ARIMA. Int J Environ Res Public Health 2022;19:1858.
  • 18. Budiharto W. Data science approach to stock prices forecasting in Indonesia during COVID-19 using long short-term memory (LSTM). J Big Data 2021;8:47.
  • 19. Akter T, Hossain MF, Ullah MS, et al. Mortality prediction in COVID‐19 using time series and machine learning techniques. Comput Math Methods 2024;2024:5891177.
  • 20. Fotis G, Vita V, Ekonomou L. Machine learning techniques for the prediction of the magnetic and electric field of electrostatic discharges. Electronics 2022;11:1858.
  • 21. Zhao D, Zhang R. A new hybrid model SARIMA-ETS-SVR for seasonal influenza incidence prediction in mainland China. J Infect Dev Ctries 2023;17:1581−90.
  • 22. Pavlatos C, Makris E, Fotis G, et al. Enhancing electrical load prediction using a bidirectional LSTM neural network. Electronics 2023;12:4652.
  • 23. Yang J, Yang L, Li G, et al. Influenza time series prediction models in a megacity from 2010 to 2019: based on seasonal autoregressive integrated moving average and deep learning hybrid prediction model. Chin Med J (Engl) 2024;137:2242−4.
  • 24. Keshavamurthy R, Pazdernik KT, Ham C, et al. Meeting global health needs via infectious disease forecasting: development of a reliable data-driven framework. JMIR Public Health Surveill 2025;11:e59971.
  • 25. Rauf HT, Lali MI, Khan MA, et al. Time series forecasting of COVID-19 transmission in Asia Pacific countries using deep neural networks. Pers Ubiquitous Comput 2023;27:733−50.
  • 26. Darwish A, Rahhal Y, Jafar A. A comparative study on predicting influenza outbreaks using different feature spaces: application of influenza-like illness data from Early Warning Alert and Response System in Syria. BMC Res Notes 2020;13:33.
  • 27. Talkhi N, Akhavan Fatemi N, Ataei Z, et al. Modeling and forecasting number of confirmed and death caused COVID-19 in IRAN: a comparison of time series forecasting methods. Biomed Signal Process Control 2021;66:102494.
  • 28. Gomez-Cravioto DA, Diaz-Ramos RE, Cantu-Ortiz FJ, et al. Data analysis and forecasting of the COVID-19 spread: a comparison of recurrent neural networks and time series models. Cognit Comput 2021;1−12.
  • 29. Awajan AM, Ismail MT, Al Wadi S. Improving forecasting accuracy for stock market data using EMD-HW bagging. PLoS One 2018;13:e0199582.
  • 30. Swapnarekha H, Behera HS, Nayak J, et al. Multiplicative Holts Winter model for trend analysis and forecasting of COVID-19 spread in India. SN Comput Sci 2021;2:416.
  • 31. Pongdatu GA, Putra YH. Seasonal time series forecasting using SARIMA and Holt Winter’s exponential smoothing. IOP Conf Ser Mater Sci Eng 2018;407:012153.
  • 32. Abdullah MM. Using the single-exponential-smoothing time series model under the additive Holt-Winters algorithm with decomposition and residual analysis to forecast the reinsurance-revenues dataset. Paki J Stat Oper Res 2024;311−40.
  • 33. Ince MN, Tasdemir C. Forecasting retail sales for furniture and furnishing items through the employment of multiple linear regression and Holt-Winters models. Systems 2024;12:219.
  • 34. Jin Y, Wang R, Zhuang X, et al. Prediction of COVID-19 data using an ARIMA-LSTM hybrid forecast model. Mathematics 2022;10:4001.
  • 35. Waqas M, Humphries UW. A critical review of RNN and LSTM variants in hydrological time series predictions. MethodsX 2024;13:102946.
  • 36. Michankow J, Sakowski P, Slepaczuk R. Mean absolute directional loss as a new loss function for machine learning problems in algorithmic investment strategies. J Comput Sci 2024;81:102375.
  • 37. Makarovskikh T, Abotaleb M. Hyper-parameter tuning for long short-term memory (LSTM) algorithm to forecast a disease spreading. In: 2022 VIII International Conference on Information Technology and Nanotechnology (ITNT); 2022 May 23; Samara, Russian Federation. IEEE; 2022. p. 1−6.
  • 38. Sediyono E, Wahyuni SN, Sembiring I. Optimizing the long short-term memory algorithm to improve the accuracy of infectious diseases prediction. IAES Int J Artif Intell 2024;13:2893−903.
  • 39. Shejul K, Harikrishnan R, Gupta H. The improved integrated Exponential Smoothing based CNN-LSTM algorithm to forecast the day ahead electricity price. MethodsX 2024;13:102923.
  • 40. Sembiring I, Wahyuni SN, Sediyono E. LSTM algorithm optimization for COVID-19 prediction model. Heliyon 2024;10:e26158.
  • 41. Gowriswari S, Brindha S. Hyperparameters optimization using gridsearch cross validation method for machine learning models in predicting diabetes mellitus risk. In: 2022 International Conference on Communication, Computing and Internet of Things (IC3IoT); 2022 Mar 10; Chennai, India. IEEE; 2022. p. 1−4.
  • 42. Funk S, Camacho A, Kucharski AJ, et al. Assessing the performance of real-time epidemic forecasts: a case study of Ebola in the Western Area region of Sierra Leone, 2014-15. PLoS Comput Biol 2019;15:e1006785.
  • 43. Chhabra A, Singh SK, Sharma A, et al. Sustainable and intelligent time-series models for epidemic disease forecasting and analysis. Sustain Technol Entrep 2024;3:100064.
  • 44. Plevris V, Solorzano G, Bakas NP, et al. Investigation of performance metrics in regression analysis and machine learning-based prediction models. In: 8th European Congress on Computational Methods in Applied Sciences and Engineering (ECCOMAS 2022); 2022 Jun 5-9; Norway. ECCOMAS Congress. 2022. p. 2022
  • 45. Das P. PerMat: Performance metrics in predictive modeling [Internet]. PerMat; 2024 [cited 2025 Jun 2]. Available from: http://dx.doi.org/10.32614/cran.package.permat.
  • 46. Ryan TP, Woodall WH. The most-cited statistical papers. J Appl Stat 2005;32:461−74.
  • 47. MacDonald P. An influenza vaccination season like no other: present and future aspects. Pract Nurs 2021;32(Sup3a):S10−4.
  • 48. Plenos M. Time series forecasting using holt-winters exponential smoothing: application to abaca fiber data. Sci J Wars Univ Life Sci 2022;22:17−29.
  • 49. Xiao F. Time series forecasting with stacked long short-term memory networks. arXiv [Preprint] 2020 Nov 2;https://doi.org/10.48550/arXiv.2011.00697.
  • 50. Wanli Z, Youjun T, Xiaomei M. From learning science to computer science: a scientometric review of deeper learning in foreign languages (1993-2024). SAGE Open 2025;15:21582440251322564.
  • 51. Tao H, Abba SI, Al-Areeq AM, et al. Hybridized artificial intelligence models with nature-inspired algorithms for river flow modeling: a comprehensive review, assessment, and possible future research directions. Eng Appl Artif Intell 2024;129:107559.
  • 52. Samantaray S, Sahoo A, Satapathy DP, et al. Suspended sediment load prediction using sparrow search algorithm-based support vector machine model. Sci Rep 2024;14:12889.
  • 53. Samantaray S, Sahoo A, Yaseen ZM, et al. River discharge prediction based multivariate climatological variables using hybridized long short-term memory with nature inspired algorithm. J Hydrol 2025;649:132453.
  • 54. Mahakur V, Mahakur VK, Samantaray S, et al. Prediction of runoff at ungauged areas employing interpolation techniques and deep learning algorithm. HydroRes 2025;8:265−75.
  • 55. Samantaray S, Ghose DK. Prediction of S12-MKII rainfall simulator experimental runoff data sets using hybrid PSR-SVM-FFA approaches. J Water Clim Chang 2022;13:707−34.
  • 56. Semenza JC, Paz S. Climate change and infectious disease in Europe: impact, projection and adaptation. Lancet Reg Health Eur 2021;9:100230.
  • 57. Khan MA, Abidi WU, Al Ghamdi MA, et al. Forecast the influenza pandemic using machine learning. Comput Mater Contin 2021;66:331−40.
  • 58. Alzahrani SM, Guma FE. Improving seasonal influenza forecasting using time series machine learning techniques. J Inf Syst Eng Manag 2024;9:30195.
  • 59. Olukanmi SO, Nelwamondo FV, Nwulu NI. Utilizing google search data with deep learning, machine learning and time series modeling to forecast influenza-like illnesses in South Africa. IEEE Access 2021;9:126822−36.
  • 60. Shaikh F, Khattak HA. Hybrid forecasting model for smart grid using exogenous variables. In: 2024 International Conference on Emerging Trends in Smart Technologies (ICETST); 2024 Oct 10-11; Karachi, Pakistan. IEEE; 2024. p. 1–6.
  • 61. Jiang N, Chu W, Li Y. Modeling, inference, and prediction in mobility-based compartmental models for epidemiology. arXiv [Preprint] 2024 Jun 17;https://doi.org/10.48550/arXiv.2406.12002.
  • 62. Xu B, Zhu Q. Dynamic probabilistic latent variable model with exogenous variables for dynamic anomaly detection. In: 2023 American Control Conference (ACC); 2023 May 31; San Diego, USA. IEEE; 2023. p. 3945−50.
  • 63. Hametner C, Kozek M, Bohler L, et al. Estimation of exogenous drivers to predict COVID-19 pandemic using a method from nonlinear control theory. Nonlinear Dyn 2021;106:1111−25.
  • 64. Toutiaee M, Li X, Chaudhari Y, et al. Improving COVID-19 forecasting using exogenous variables. arXiv [Preprint] 2021 Jul 20;https://doi.org/10.48550/arXiv.2107.10397.
