A ROBUST FRAMEWORK FOR DRIVER FATIGUE DETECTION FROM EEG SIGNALS USING ENHANCEMENT OF MODIFIED Z-SCORE AND MULTIPLE MACHINE LEARNING ARCHITECTURES

ABSTRACT: Physiological signals, such as the electroencephalogram (EEG), are used to observe a driver’s brain activity. A portable EEG system provides several advantages, including ease of operation, cost-effectiveness, portability, and few physical restrictions. However, EEG signals can be challenging to analyse because they often contain artefacts, including muscle activity, eye blinks, and unwanted noise. This study utilised independent component analysis (ICA) to eliminate such unwanted signals from the unprocessed EEG data of 12 young, physically fit male participants between the ages of 19 and 24 who took part in a driving simulation. Driver fatigue state detection was carried out using multichannel EEG signals obtained from O1, O2, Fp1, Fp2, P3, P4, F3, and F4. An enhanced modified z-score was applied to features extracted with a time-frequency domain continuous wavelet transform (CWT) to improve the reliability of driver fatigue classification. The proposed methodology offers several advantages. First, multichannel EEG analysis improves the accuracy of sleep stage detection, which is vital for accurate driver fatigue detection. Second, the enhanced modified z-score used in feature extraction is more robust than conventional z-score techniques, making it more effective for removing outlier values and improving classification accuracy. Third, the proposed approach employs multiple classifiers: Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Artificial Neural Networks (ANNs) that utilise Long Short-Term Memory (LSTM), as well as machine learning techniques such as Support Vector Machines (SVM). The five classifiers were evaluated using 5-fold cross-validation. The results indicate that the proposed framework identifies driver fatigue with an average accuracy of 96.07%. Among the classifiers, the ANN classifier achieved the highest precision, 99.65%.


INTRODUCTION
According to statistics from the World Health Organization, roughly 127,000 individuals lose their lives in traffic accidents yearly, with nearly one-third of those casualties being teenagers and young adults [1]. Driver fatigue contributes to these fatalities, accounting for more than ten thousand deaths by a conservative estimate. Recently, warning systems have been proposed for some autonomous vehicles to prevent road accidents caused by driver fatigue. Such a system prompts drivers to take a break from prolonged driving by sounding an alarm in the vehicle, notifying the driver to stop driving and take a coffee break.
Physiological signals such as electroencephalograms (EEG) are used to observe a driver's brain activity. A portable EEG system provides several advantages over other electroencephalography systems, including ease of operation, cost-effectiveness, portability, and few physical restrictions [2]. The presence of artefacts in EEG signals, such as muscle activity, eye blinks, and unwanted noise, can pose a significant challenge for analysis. Therefore, the current paper proposes using an independent component analysis (ICA) technique to eliminate such noise from the raw EEG signal. Numerous studies have suggested that the analysis of multichannel EEG is an essential component of precise sleep stage detection [3]. Consequently, the present study considers multichannel EEG signals obtained from O1, O2, Fp1, Fp2, P3, P4, F3, and F4 for detecting driver fatigue states. Features from the time-frequency domain, obtained with the continuous wavelet transform (CWT) and an enhanced modified z-score, improved the accuracy of driver fatigue classification. It is important to choose the best features to get better results. The Morlet mother wavelet is commonly used in conventional CWT techniques because it is computationally more efficient than other choices. This is because the Morlet wavelet involves fewer computations, most of which are performed through the fast Fourier transform, and requires less code [4].
https://doi.org/10.31436/iiumej.v24i2.2799
In data analysis and quality control, identifying outliers is a crucial step in ensuring the accuracy and validity of statistical analyses. The z-score is a widely used method for detecting outliers in datasets, but it is itself sensitive to extreme values and is therefore not considered robust when such outliers are present. The modified z-score, which is less sensitive to outliers, was introduced to address this issue and has become a popular method for outlier detection in various applications. In recent years, the modified z-score has also been applied to feature extraction in machine learning and signal processing, where removing outlier values is crucial for accurate and robust analysis. This paper presents an enhancement of the modified z-score method for feature extraction in signal processing, specifically in driver fatigue detection using EEG signals.
Our proposed method has several strengths. First, using multichannel EEG analysis improves the accuracy of sleep stage detection, which is vital for accurate driver fatigue detection. Second, our use of an enhanced modified z-score in feature extraction is more robust than conventional z-score techniques, making it more effective for removing outlier values and improving classification accuracy. Third, our approach utilises various machine learning classifiers, providing a comprehensive and accurate method for driver fatigue detection.
This paper presents a methodology for the precise identification of distinct levels of driver drowsiness using diverse classifiers, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Artificial Neural Networks (ANNs) that incorporate Long Short-Term Memory (LSTM), and machine learning approaches such as Support Vector Machines (SVM). A modified z-score technique was also introduced to enhance the statistical features used for classification, which significantly improved the accuracy of the proposed method. To evaluate the effectiveness of this approach, a 5-fold cross-validation strategy was employed to distinguish between driver fatigue and normal states.

RELATED WORKS
Outlier detection is critical in various fields, including environmental monitoring, geology, epidemiology, and data mining. The modified z-score is a frequently employed technique for detecting outliers, which considers the weighted mean of adjacent data points to estimate the anticipated value of each point. Aggarwal et al. proposed a modified z-score method for detecting spatial outliers in datasets with spatial autocorrelation [5]. The technique improves the accuracy and robustness of the z-score test by using a trimmed mean instead of the usual arithmetic mean. The study evaluated the method on simulated and real-world datasets and showed promising results in detecting spatial outliers. Although the method effectively detects spatial outliers in the presence of spatial autocorrelation, it may not perform well on datasets without spatial autocorrelation. Additionally, using a trimmed mean instead of the arithmetic mean may result in the loss of valuable information from the dataset.
Sandbhor et al. investigated the importance of detecting outliers in data mining and their effect on the quality and output of prediction models [6]. The study's primary objective was to determine the most effective approach for detecting outliers when neural networks (NN) are used to forecast real estate values. The authors assessed several univariate outlier detection methods, such as Tukey's Standard Deviation (SD), median, z-score, median absolute deviation (MAD), and modified z-score, on a set of 3,094 instances of property sales data. The findings indicate that, for this particular problem, the median technique was the most efficient approach for detecting outliers. However, this conclusion may not necessarily apply to other types of datasets or outlier detection techniques. Moreover, the study only evaluated univariate outlier detection methods and did not consider multivariate techniques, which may be more effective in certain applications.
Leite et al. conducted a study to evaluate the effectiveness of the modified z-score as an indicator for identifying changes in entropy-based features to detect faults in bearings [7]. The research used 12 entropy-based features across the time, frequency, and time-frequency domains, together with three different entropy measures, namely Shannon entropy, Renyi entropy, and Jensen-Renyi divergence. The proposed technique was applied to two real bearing datasets obtained from experiments run until the point of failure. Furthermore, three bearings with different defects were examined to verify the performance of the entropy-based features. The results demonstrated that the modified z-score is a robust method for detecting changes in entropy-based features, highlighting its potential for early detection of anomalies in the vibration signals of bearings. This finding suggests that the proposed technique can be effectively utilised for fault diagnosis in bearings. However, it is important to note that the evaluation was limited to these two run-to-failure datasets and the three bearings with different defects.
Although outlier detection is a powerful tool for identifying unique data points, several limitations must be considered. For instance, in some cases, there may not be a clear definition of what constitutes an outlier, making it challenging to determine which data points to flag. Moreover, outlier detection methods may produce false positives or negatives, leading to incorrect conclusions and recommendations. Furthermore, choosing the appropriate outlier detection method for a specific dataset or problem can be complex, and there is no one-size-fits-all solution. Additionally, while outlier detection can identify anomalous data points, it may not always address the underlying cause of the outlier or provide a solution to the problem. Therefore, to get the most out of outlier detection, careful consideration of the goals and context of the analysis is essential. It is also important to use outlier detection in conjunction with other analytical tools and techniques to gain a more comprehensive understanding of the data and to develop effective solutions that address the root cause of any identified anomalies.
Several techniques have been suggested to identify the underlying mechanisms of fatigue in EEG signals. Among them, one method entails computing distinct types of entropies as feature sets based on a single channel [8]. Quintero-Rincon presented a straightforward and efficient method for identifying driver fatigue in real-time systems using a single-channel EEG signal [9]. The algorithm selects the most significant channel and extracts four feature parameters to detect fatigue using an ensemble bagged decision trees classifier. Using data obtained from the Jiangxi University of Technology database, the approach achieves an accuracy of 92.7% with a 1.8-second time delay. However, it is important to note that the study evaluated the method on a specific dataset, and further research may be needed to determine its effectiveness on other datasets and under different conditions. Additionally, the time delay of 1.8 seconds may not be practical for real-time monitoring in some situations, and it is important to consider the potential impact on driver safety if there is a delay in detecting fatigue.
In another study, Jing et al. aimed to detect driving fatigue in low-voltage and hypoxia plateau environments using subjective and objective monitoring methods [10]. EEG signals from real-time driving tests were subjected to nonlinear and linear analyses to assess the signal trend during awake, critical, and fatigue states. The (α+θ)/β and (α+β)/θ energy features were identified as potential markers of driving fatigue in these environments, providing a basis for the development of a driving fatigue warning system. However, the study was limited to field driving fatigue tests in a specific environment, and further research is needed to validate the findings in other environments and driving conditions. Additionally, Zhang et al. proposed an innovative approach known as clustering on brain networks (CBNs) to improve the performance of driver fatigue detection [11]. The CBNs approach employs a clustering algorithm to identify spatial nodes with unique connectivity features from EEG data. The wavelet entropy features obtained from these nodes are then transformed into spatiotemporal images and examined using an image edge detection technique to differentiate between various stages of fatigue. This method reduces signal interference and detects fatigue before the onset of subjective feelings, making it a potentially useful tool for early warning and accident prevention. The research demonstrated the limitations of time- and frequency-domain EEG indicators for reliable driver fatigue detection, which stem from signal mixing and limited sample sizes; however, the study lacked a comparison with existing methods and validation in real-world driving scenarios. Another study proposed an intelligent system for automated driver fatigue detection utilising EEG signals [12]. This system comprises a feature generation network that utilises texture descriptors and a hybrid feature selection method to enhance detection accuracy. The framework achieved an impressive classification accuracy of 97.29% for detecting fatigue using EEG signals, highlighting its potential for efficient driver fatigue detection. However, the framework used traditional machine learning algorithms, which may limit its ability to adapt to complex and dynamic driving environments.
A further study introduces a novel approach for efficiently detecting driver fatigue using EEG signals [13]. The method employs a new channel selection algorithm based on correlation coefficients, an ensemble classifier using random subspace k-nearest neighbours (k-NN), and power spectral density (PSD) for feature extraction. The approach achieved an impressive accuracy of 99.99% for identifying driver fatigue using EEG signals in a 0.5-second time window, demonstrating strong performance in EEG-based driver fatigue detection. However, due to its high computational complexity, a k-NN-based ensemble classifier may not be suitable for real-time applications. In another study, Hwang et al. proposed a subject-independent EEG-based driver fatigue state classification model that addresses individual performance gaps [14]. The authors utilised an adversarial training approach to induce the misclassification of subject labels in the classification model. Additionally, they incorporated an Inter-subject Feature Distance Minimization (IFDM) technique to minimise performance discrepancies between individuals. Their method enabled training on EEG datasets with limited subject labels and was evaluated on the SEED-VIG dataset, resulting in superior accuracy and decreased individual performance variability when classifying drowsiness. However, one of the major drawbacks is that EEG signals differ greatly between individuals, making it challenging to build a unified model that performs well for all individuals.
The studies reviewed propose various methods for detecting driver fatigue using EEG signals, ranging from single-channel feature extraction to more complex machine learning models. One common approach involves using power spectral density and various entropy measures as feature sets, while others utilise clustering algorithms and image edge detection to distinguish different stages of fatigue. Several studies also address individual performance gaps and subject variability by employing adversarial training strategies and component-specific batch normalisation. These studies demonstrate the potential of EEG-based driver fatigue detection for early warning and accident prevention, achieving high accuracies and providing new possibilities for extracting more information from complex EEG data. However, the methods vary in computational complexity, the number of channels required, and the level of subject independence achieved, suggesting that further research is needed to identify the most efficient and effective approach for practical applications.
Wilaiprasitporn et al. proposed a deep learning approach that combines CNN and RNN to identify individuals using affective EEG data [15]. Their study used the Database for Emotion Analysis using Physiological Signals (DEAP) dataset and showed that the proposed method outperforms an SVM baseline system with a Correct Recognition Rate (CRR) of up to 99.90-100%. Recent research suggests that CNN-GRU models outperform CNN-LSTM models in identifying individuals using EEG data from the brain's frontal region, and they are effective at countering the impact of affective states. However, the proposed method relies on EEG signals, which may require specialised equipment as well as expertise in data collection and analysis. Qin et al. proposed a deep learning model that combines CNN and LSTM to extract vein features from raw images for finger-vein biometrics [16]. The proposed model uses supervised encoding to eliminate binary vein texture, resulting in significantly improved verification accuracy when evaluated on a publicly available finger-vein database. However, deep learning models are prone to overfitting, that is, learning the training data too well and failing to generalise to new data. Techniques such as regularisation and dropout can help prevent overfitting.
Mondal et al. developed a multitask learning framework using a CNN and a bidirectional long short-term memory (Bi-LSTM) model to analyse surgical workflows from video data [17]. Their framework included a joint distribution loss function for concurrent tool usage during phase identification. The proposed method demonstrated excellent tool and phase identification performance compared to previous approaches when evaluated on the Cholec80 dataset. However, the study was only evaluated on a single dataset, and it is unclear how well the proposed approach would generalise to other surgical datasets. Hu et al. proposed the Deep Complex Convolution Recurrent Network (DCCRN), a network architecture that can handle both CNN and RNN structures and replicate complex-valued operations [18]. In the Interspeech 2020 Deep Noise Suppression (DNS) challenge, DCCRN outperformed previous networks based on objective and subjective metrics and obtained the top rank for the real-time track and the second rank for the non-real-time track based on Mean Opinion Score (MOS). The proposed DCCRN network with 3.7M parameters proved highly effective in this task. However, the study focused on speech enhancement in clean environments and did not consider noisy or reverberant conditions common in real-world scenarios.
Researchers proposed a machine learning model that utilised CNN, U-net, RNN, and LSTM architectures to create structural topology configurations that fulfilled minimum compliance and deformation criteria under various load conditions and volume fraction limitations. The model was trained using randomly generated finite element simulation data and a strategy to remove elements during training. The model outperformed traditional methods regarding time, cost, and practicality when applied to two-dimensional and three-dimensional cantilever-beam structural topology designs. This data-driven approach can speed up preliminary structural design procedures [19]. However, the study's limitations include the need for training data and the lack of validation on real-world applications. Later, other researchers focused on improving solar radiation estimation models in agriculture meteorology, where data availability is limited and data quality is low [20]. Several models (SVM, Extreme Learning Machine, CNN, and LSTM) were developed and tested in Southern Spain using different input variable configurations. Performance was analysed using various statistical indices. One limitation of this study is that it only used temperature and relative humidity as input variables for solar radiation estimation. Other climatic variables that can affect solar radiation, such as atmospheric pressure, cloud cover, and wind speed, were not included. Incorporating these variables could potentially improve the accuracy of solar radiation estimation.
The previous works discussed different deep learning approaches for various applications, including affective EEG-based person identification, finger-vein biometrics, surgical workflow analysis, speech enhancement, and structural topology design. The proposed models showed significant improvements in accuracy, efficiency, and applicability over previous methods. Different deep learning architectures, such as CNNs, RNNs, and LSTM, were used to extract features from raw data, such as EEG signals, video data, and simulation data. The models were evaluated on different datasets and achieved state-of-the-art results regarding recognition rate, mean average precision, and mean opinion score. Additionally, deep learning models were used to improve solar radiation estimation models in agriculture meteorology.
In conclusion, outlier detection is a valuable tool for identifying anomalies in data. However, its limitations must be carefully considered, such as the lack of a clear definition of what constitutes an outlier, the possibility of false positives or false negatives, and the challenge of choosing the appropriate method for a specific dataset or problem. EEG-based driver fatigue detection has shown great potential for early warning and accident prevention using various deep learning methods, achieving high accuracies and extracting more information from complex EEG data. Moreover, deep learning has significantly improved accuracy, efficiency, and applicability for various applications, such as affective EEG-based person identification, finger-vein biometrics, surgical workflow analysis, speech enhancement, and structural topology design. Further research is needed to identify the most efficient and effective approach for practical applications in outlier detection and deep learning.

METHODOLOGY
The research design is structured into four major components: data acquisition, pre-processing, classifiers, and evaluation metrics, as illustrated in Fig. 1. The successful implementation of independent component analysis (ICA) was critical in enhancing frequency resolution and achieving improved energy conservation results in the proposed study. To eliminate unwanted noise and artefacts, such as muscle movements and eye blinks, ICA was employed as a pre-processing step for the EEG signals. Additionally, ICA was utilised to obtain the EEG amplitude and correctly position the signal in the appropriate coordinate system for further analysis. The resulting clean EEG signals were then divided into alpha, delta, and theta sub-bands. This study [21] employed the continuous wavelet transform (CWT) as the preferred technique for time-frequency domain analysis in the feature extraction stage. The Morlet wavelet, a complex sinusoid modulated by a Gaussian envelope defined over time t, was chosen as the mother wavelet. Its scale is inversely related to frequency, so the analysed frequency increases as the scale decreases. The statistical features were then enhanced using the modified z-score in the feature extraction process to improve classification accuracy. In the final step, CNN, RNN, and ANN (including LSTM) classifiers, together with machine learning methods such as SVM, were applied with a 5-fold cross-validation strategy to distinguish between fatigue and normal states.
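The 5-fold cross-validation strategy mentioned above can be illustrated with a minimal Python sketch. The paper does not state whether the folds were shuffled or stratified, so the shuffling, the seed, the function name `k_fold_indices`, and the sample count of 24 are all assumptions made only for illustration.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: shuffling and the fixed seed are assumptions,
    not details taken from the paper.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)   # shuffle sample indices once
    folds = np.array_split(order, k)     # k folds of nearly equal size
    for i in range(k):
        test_idx = folds[i]
        # All remaining folds together form the training set
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Example: 24 hypothetical feature rows split into 5 folds
splits = list(k_fold_indices(n_samples=24, k=5))
for fold, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold}: {len(train_idx)} train rows, {len(test_idx)} test rows")
```

Each sample appears in exactly one test fold, so averaging a classifier's accuracy over the five folds uses every sample once for testing.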

Data Acquisition
The dataset used in this study was obtained from a previous researcher's online database [22]. The dataset consisted of EEG recordings from 12 healthy male participants (aged 19-24) who completed a driving simulator task for up to 2 hours. EEG data from eight specific channels (O1, O2, Fp1, Fp2, P3, P4, F3, and F4) were selected from a Neuroscan device that had 30 electrodes and operated at a sampling rate of 1000 Hz. The recording was divided into two phases: a 5-minute normal state and a 5-minute fatigued state. Fatigue was self-reported by the participants after 40-100 minutes of driving. The ZY-31D driving simulator, which features a wide-screen display consisting of three 24-inch screens, was utilised in the study. The driving environment was created using a Peking Ziguangjiye program ZG-601, resulting in a low traffic density scenario. This study's driver fatigue detection system focused on EEG channel O1, which exhibited the highest correlation among the selected channels. Electrodes O1, O2, and Fp1 were chosen based on their correlation with fatigue and drowsiness. Combining Fp1 and Fp2 achieved an accuracy of 85% in classifying fatigue driving, which was higher than using Fp1 alone (79%) or Fp2 alone (68%) [23]. The EEG electrode F4 was chosen for analysis due to its high performance in the classification process. In addition, electrode P4 was positively associated with drowsiness and poor driving performance. Other electrodes, such as O2, Fp2, F3, and P3, were also included for further research. Previous studies have used ten EEG channels, including Fp1, Fp2, F3, and F4, to obtain the best results [24]. According to that study, the electrodes Fp1 and P3 were the most effective for driver fatigue detection. Interestingly, these electrodes have also been successful in EEG emotion recognition in previous studies.

Pre-processing
MATLAB software was used to implement the processing programs. Raw EEG signals typically contain interference and noise that must be removed before processing. To accomplish this, the independent component analysis (ICA) method was employed during the pre-processing stage. This method can eliminate unwanted noise and artefacts and separate source signals from observed signals, all without prior knowledge of the mixture. The ICA mixture model can be represented in vector-matrix form as:

𝑥 = 𝐴𝑠 (1)
In Eq. (1), x is the vector of observed mixed signals, with one entry for each electrode or sensor used to measure the electrical activity in the brain; s is the vector of underlying independent components; and A is the mixing matrix, whose entries give the linear combination of independent components that contributes to each observed signal. When a whole recording is considered, each row of x corresponds to one electrode and each column to a time point, and each row of s likewise holds the time course of one independent component. Each column of A describes how one independent component projects onto the electrodes. In essence, A and s together describe the same EEG data from two complementary views: s provides the underlying independent components, while A indicates how these components combine to form the signal observed at each electrode or sensor.
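The mixing model in Eq. (1) can be illustrated numerically with a minimal Python sketch. It does not perform ICA estimation; the mixing matrix is assumed to be known so that the roles of x, A, and s are visible. The two sources, the values in `A`, and all variable names are hypothetical and are not taken from the paper.

```python
import numpy as np

fs = 1000                               # sampling rate in Hz, as in the dataset
t = np.arange(fs) / fs                  # one second of samples

# Two hypothetical sources: a 10 Hz rhythm and a blink-like pulse
s = np.vstack([
    np.sin(2 * np.pi * 10 * t),
    (np.abs(t - 0.5) < 0.05).astype(float),
])                                      # shape (2 components, 1000 samples)

# Hypothetical mixing matrix: rows are electrodes, columns are components
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])

x = A @ s                               # observed signals, Eq. (1): x = A s

# ICA would estimate an unmixing matrix from x alone; here A is known,
# so its inverse recovers the sources exactly.
W = np.linalg.inv(A)
s_hat = W @ x

# Artefact removal: rebuild the electrodes from the first component only,
# leaving out the blink-like component.
x_clean = A[:, [0]] @ s_hat[[0], :]
```

In practice A is unknown and an ICA algorithm estimates the unmixing matrix from the statistics of x; the artefact-removal step then works in the same way, by zeroing the components judged to be artefacts before projecting back to the electrodes.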
To reduce redundant data in a dataset, a feature extraction technique known as the continuous wavelet transform (CWT) creates a time-frequency distribution. To identify signs of fatigue, the signal was partitioned into distinct sub-bands that are recognised as significant in this regard, namely alpha (8-13 Hz), delta (0.5-4 Hz), and theta (4-8 Hz). The CWT is a highly effective method for these tasks because it has superior computing performance, provided that enough analytic wavelets are used. The CWT is defined in terms of a mother wavelet ψ that is scaled and shifted across the signal. Its output is a spatially dependent spectral decomposition that exhibits the spectral response for both spatial frequency (ω) and spatial area (x). The CWT output is typically represented as a two-dimensional plot, where the x-axis represents time or position and the y-axis represents frequency or scale. The plot shows the wavelet coefficients as a function of time and frequency, with higher coefficients indicating stronger signal content at that time and frequency. This plot is often referred to as a scalogram, the wavelet counterpart of a spectrogram.
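For reference, the textbook form of the CWT is given below. This is the standard definition and may differ in notation from the equation used in the paper: the signal is written here as f(t), a is the scale, b is the translation, and ψ* is the complex conjugate of the mother wavelet. The Morlet wavelet is shown in its common simplified form with an assumed centre frequency ω₀.

```latex
W(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} f(t)\,
          \psi^{*}\!\left(\frac{t - b}{a}\right) dt,
\qquad
\psi(t) = \pi^{-1/4}\, e^{i \omega_0 t}\, e^{-t^{2}/2}
```

Small scales a compress the wavelet and probe high frequencies, while large scales stretch it and probe low frequencies, which is the inverse scale-frequency relationship of the Morlet wavelet.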
The mother wavelet, denoted as ψ, plays a crucial role in the CWT process. It determines the shape and frequency characteristics of the wavelet used in the transformation and serves as the basis for analysing signals in various fields such as engineering, physics, and finance. The choice of the mother wavelet can significantly impact the results of the CWT analysis, as different wavelets are better suited for different types of signals or applications. In addition, the selection of the mother wavelet can also affect the computational efficiency and accuracy of the CWT algorithm. A wide range of mother wavelets is available, each with its own advantages and disadvantages, and researchers continue to develop new wavelets with improved performance in various applications. As such, the study of mother wavelets remains an active and important area of research in signal processing.
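As an illustration of how a Morlet mother wavelet produces a time-frequency map, the following Python sketch computes CWT magnitudes by direct convolution. It is not the authors' MATLAB implementation: the function name `morlet_cwt`, the `n_cycles` parameterisation of the Gaussian width, the amplitude normalisation, and the 250 Hz test signal are assumptions chosen for a compact example.

```python
import numpy as np

def morlet_cwt(signal, fs, freqs, n_cycles=6.0):
    """Return CWT magnitudes with shape (len(freqs), len(signal)).

    Hypothetical helper: each wavelet is a complex sinusoid at frequency f
    multiplied by a Gaussian envelope whose width shrinks as f grows.
    The signal must be longer than the longest wavelet.
    """
    signal = np.asarray(signal, dtype=float)
    out = np.empty((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2 * np.pi * f)        # envelope width in seconds
        half = int(np.ceil(4 * sigma_t * fs))       # truncate at +/- 4 sigma
        tw = np.arange(-half, half + 1) / fs
        wavelet = np.exp(2j * np.pi * f * tw) * np.exp(-tw**2 / (2 * sigma_t**2))
        wavelet /= np.sum(np.abs(wavelet))          # amplitude normalisation
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out

# Example: a 10 Hz (alpha-band) test tone, 4 s at an assumed 250 Hz
fs = 250
t = np.arange(4 * fs) / fs
test_signal = np.sin(2 * np.pi * 10 * t)
freqs = np.arange(2, 31)                            # 2 to 30 Hz in 1 Hz steps
scalogram = morlet_cwt(test_signal, fs, freqs)
peak_freq = freqs[np.argmax(scalogram[:, len(t) // 2])]
print("dominant frequency at the centre of the signal:", peak_freq, "Hz")
```

Lower frequencies use longer wavelets, which gives finer frequency resolution at the cost of time resolution; rows of the scalogram within 8-13 Hz, 0.5-4 Hz, or 4-8 Hz could then be summarised into alpha, delta, or theta features.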
In the context of EEG analysis, CWT can be used to analyse the spectral content of neural activity at different spatial locations on the scalp. By applying CWT to EEG data from multiple electrodes, researchers can generate a map of the spectral response for different spatial frequencies and locations, providing insights into the spatial distribution of neural activity at different frequency scales.
The paper proposes using a modified z-score to enhance statistical features in the classification process. Unlike the traditional z-score, which can be unreliable in the presence of outliers, the modified z-score uses the median and the median absolute deviation (MAD) instead of the mean and standard deviation to detect outliers in a dataset. The modified z-score is less affected by extreme values and is particularly useful for datasets with non-normal distributions or with extreme values. The formula for calculating the modified z-score is:

$$M_i = \frac{k\,(X_i - \mathrm{median})}{\mathrm{MAD}}$$

where k is a constant, X_i is the observation, median is the median of the dataset, and MAD is the median absolute deviation. The constant k was changed from the conventional 0.6745 to 0.33725 to account for the larger spread of the input data, with the aim of keeping the modified z-score comparable to the z-score for normally distributed data. Lowering k scales every score down, so for a fixed cut-off fewer data points are flagged. This means that the modified z-score becomes less sensitive to outliers or extreme values, which may be more likely to occur in data with a larger spread.
By decreasing the value of k, the modified z-score can be made more comparable to the z-score for normally distributed data. This can help ensure that the analysis is accurate and reliable, particularly when working with large or complex datasets. As a further enhancement, the modified z-score data are multiplied by a binary mask that assigns a value of 1 to rows corresponding to fatigue data and a value of -1 to rows corresponding to non-fatigue data. This allows the non-fatigue and fatigue data to be easily separated based on their modified z-scores, with the fatigue data having positive values and the non-fatigue data having negative values.
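The modified z-score and the masking step described above can be sketched in Python as follows. This is an illustrative reading of the text rather than the authors' code: the function name `modified_z_score`, the one-dimensional feature vector, and the label array are hypothetical, and only the two values of k come from the paper.

```python
import numpy as np

def modified_z_score(x, k=0.6745):
    """Compute k * (x - median) / MAD for a 1-D array (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))     # median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; the modified z-score is undefined")
    return k * (x - med) / mad

# Hypothetical feature values with one extreme observation
features = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
labels = np.array([1, 1, 0, 0, 1])       # assumed coding: 1 = fatigue, 0 = non-fatigue

m_conventional = modified_z_score(features)            # k = 0.6745
m_enhanced = modified_z_score(features, k=0.33725)     # k reported in the paper

# Mask from the text: +1 for fatigue rows, -1 for non-fatigue rows
mask = np.where(labels == 1, 1.0, -1.0)
masked = m_enhanced * mask

print(m_conventional)
print(m_enhanced)
print(masked)
```

Halving k halves every score, so a fixed cut-off flags fewer observations. Note that multiplying by the mask flips the sign of the non-fatigue rows but does not change magnitudes, so the sign of each masked value still depends on whether the original score lay above or below the median.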
The modified z-score is a statistical technique commonly used to identify dataset outliers. It is more robust than other outlier detection methods because it is less sensitive to extreme values. This is particularly important in data analysis and quality control applications, where outliers can significantly impact the overall analysis or conclusions drawn from the data. The modified z-score is considered a more reliable approach to identifying outliers because it uses the median absolute deviation (MAD), a robust measure of variability not affected by extreme values. In contrast, traditional z-score methods use the sample standard deviation, which can be heavily influenced by extreme values and may not accurately represent the variability of the dataset. Therefore, the modified z-score is often preferred over other outlier detection methods in situations where robustness and reliability are critical, such as analysing EEG signals for driver fatigue detection.
The modified z-score is a powerful tool that enables researchers and analysts to identify and handle outliers more reliably and consistently. Using a robust measure of variability not influenced by extreme values, the modified z-score can accurately detect and flag outliers in a dataset. This, in turn, allows analysts to handle outliers more effectively, either by excluding them from the analysis or by applying appropriate statistical techniques to account for them. As a result, using a modified z-score can lead to more accurate and reliable statistical analysis and better-informed decision-making. In addition, the ability to detect outliers reliably and consistently is particularly important in fields where the impact of outliers can be significant, such as in EEG signal analysis for driver fatigue detection. Using the modified z-score, researchers and analysts can ensure that their analyses are robust and reliable and that conclusions drawn from the data are based on accurate and representative information.
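The robustness argument can be made concrete with a small, purely illustrative comparison. The sketch below uses hypothetical data and the conventional constant 0.6745. The cut-offs of 3 for the z-score and 3.5 for the modified z-score are common conventions and are assumptions here, not values taken from this study.

```python
from statistics import mean, median, pstdev


def z_scores(values):
    """Classical z-score: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]


def modified_z_scores(values, k=0.6745):
    """Robust score: k * (x - median) / MAD."""
    med = median(values)
    mad = median(abs(x - med) for x in values)
    return [k * (x - med) / mad for x in values]


# Hypothetical sample: seven typical readings and one extreme value.
data = [10, 11, 9, 10, 12, 10, 11, 50]

z = z_scores(data)
mz = modified_z_scores(data)

# The extreme value inflates the mean and standard deviation, so its own
# z-score stays below 3, while its modified z-score is far above 3.5.
print(round(z[-1], 2), round(mz[-1], 2))
```

In this example the single extreme reading masks itself under the classical z-score, which is the behaviour the median and MAD are meant to avoid.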

Classifiers
On the other hand, RNNs, including LSTM, are better suited for sequential data, such as time series data, and have been used for fatigue detection based on physiological signals, such as EEG and ECG. These signals provide valuable insights into the driver's physiological state and can help detect the onset of fatigue before the driver becomes visibly drowsy or sleepy. Additionally, ANNs and SVMs have been used for fatigue detection based on physiological and behavioural data, such as steering wheel movements, vehicle speed, and lane deviations. Overall, the use of machine learning and deep learning methods for fatigue detection shows great promise for improving road safety. These automated systems can provide real-time feedback to drivers and alert them when they are at risk of falling asleep at the wheel, thereby preventing accidents and saving lives.
Sequential data, like time-series data, is best processed by RNNs because they can identify the temporal dependencies between consecutive frames of video or physiological signals. This is particularly important in detecting driver fatigue. Meanwhile, ANN is a versatile type of neural network that can be utilised for various tasks, such as classification. For the ANN classifier, our study divided the dataset into two subsets: 80% for training and 20% for testing. ANN is often used as a baseline for comparing with other deep-learning models and is an appropriate starting point for developing a fatigue detection system [25]. When dealing with datasets with fewer features, SVM is a popular machine learning algorithm that can be used for classification and regression analysis [11]. Lastly, LSTM is an RNN architecture that mitigates the vanishing gradient problem of traditional RNNs by using gated memory cells that control what information is carried from one time step to the next, allowing the network to identify sequential dependencies between different time steps in a time series.
Driver fatigue is a significant issue on the road that can lead to accidents and fatalities. Automated systems that detect and classify driver fatigue in real time are a potential solution to this problem. Deep learning methods like CNN, RNN, and LSTM and machine learning methods like ANN and SVM have shown promise in fatigue detection [26]. These classification algorithms were implemented using Python for its ease of use and flexibility. The classifiers were selected based on their suitability for analysing EEG signals and their proven success in related classification tasks. Specifically, CNNs effectively extract relevant features from multichannel EEG signals, RNNs and LSTMs are well-suited for handling sequential EEG data, and SVMs are known for their ability to handle high-dimensional data and perform well with limited sample sizes. ANNs are versatile and widely used classifiers that can be applied to a range of data types. We used multiple classifiers to compare their performance on the EEG dataset and evaluate their suitability for driver fatigue classification.

Evaluation Metrics
We aimed to evaluate the fatigue detection system's effectiveness using standard metrics typically employed to gauge classifier performance. Classification models in machine learning are commonly evaluated using metrics such as accuracy, sensitivity, and specificity. Accuracy indicates the model's ability to classify instances correctly and is calculated by determining the percentage of correctly classified cases. Sensitivity evaluates the true positive rate by determining the percentage of actual positives correctly identified by the model. Conversely, specificity measures the true negative rate by determining the proportion of real negatives correctly identified by the model. When false positives or negatives may have profound implications, combining sensitivity and specificity can provide a more comprehensive evaluation of the model's performance.
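The three metrics follow directly from the confusion-matrix counts. The sketch below is a minimal illustration, assuming binary labels with 1 for fatigue and 0 for non-fatigue; the function names and the sample labels are hypothetical. A metric whose denominator is zero is returned as None rather than as a number.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = fatigue)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_ratio(tp, tp + fn),  # true positive rate
        "specificity": safe_ratio(tn, tn + fp),  # true negative rate
    }


# Hypothetical labels and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
```

Returning None makes an undefined metric explicit, which corresponds to an "n/a" entry in a results table.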
In evaluating the performance of our fatigue detection system, we utilised receiver operating characteristic (ROC) curves and area under the curve (AUC) in addition to standard metrics. ROC curves plot the true positive rate against the false positive rate at varying classification thresholds, providing a comprehensive evaluation of the classifier's performance, particularly when the costs of false positives and false negatives differ. Meanwhile, the AUC yields a single value summarising the classifier's overall performance, with a score of 1 denoting perfect classification and 0.5 indicating random classification. By incorporating these metrics, the system's performance could be effectively evaluated and areas for improvement identified.
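As a minimal sketch of how an ROC curve and its AUC can be obtained from classifier scores, the example below sweeps the threshold over the observed scores and integrates with the trapezoidal rule. The function names and sample data are hypothetical, and the sketch assumes both classes are present in the labels.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = fatigue) and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

curve = roc_points(y_true, scores)
print(auc_trapezoid(curve))  # 8 of the 9 positive/negative pairs are ranked correctly
```

The resulting value equals the fraction of positive/negative pairs in which the positive instance receives the higher score, which is why 0.5 corresponds to random ranking.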
In order to increase the reliability of these findings, a 5-fold cross-validation approach was employed. This process randomly partitions the dataset into five equally sized subsets, with four subsets utilised for training and one subset reserved for testing. This procedure was repeated five times, with each subset used once as a test set, to ensure each fold received equal representation. Each fold's evaluation metrics were calculated, including accuracy, sensitivity, specificity, ROC curves, and AUC, to assess the classifier's performance. By computing these metrics for each fold, the classifier's performance was evaluated more accurately across the entire dataset. Finally, the evaluation metrics were averaged over the five folds for a final classifier performance estimate.
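The partitioning and averaging described above can be sketched as follows. This is an illustrative outline only: the function names, the fixed seed, and the placeholder evaluation function are assumptions, and a real evaluation function would train a classifier on the training indices and score it on the test indices.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random partition
    folds = [indices[i::k] for i in range(k)]   # k near-equal subsets
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_samples, evaluate, k=5, seed=0):
    """Average a per-fold metric; evaluate(train, test) returns one number."""
    fold_scores = [evaluate(train, test)
                   for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(fold_scores) / len(fold_scores), fold_scores


def dummy_evaluate(train, test):
    """Hypothetical placeholder metric: fraction of samples held out."""
    return len(test) / (len(train) + len(test))


mean_score, per_fold = cross_validate(20, dummy_evaluate)
print(mean_score, per_fold)
```

Every sample appears in exactly one test fold, so each observation contributes once to the averaged estimate.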
To compare the performance of various classifiers, the Wilcoxon signed-rank test was utilised; this non-parametric statistical test compares paired data sets to determine if there is a significant difference between them. The aim was to compare the performance of different classifiers on the same dataset, with a difference considered statistically significant if the p-value was less than 0.05. The Wilcoxon signed-rank test was combined with a cross-validation strategy to obtain unbiased and dependable results, allowing the performance of different classifiers to be evaluated and compared objectively.
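For small samples, the signed-rank test can be computed exactly by enumerating all sign assignments of the ranks. The sketch below is an illustrative standard-library implementation; the function name and the paired accuracy values are hypothetical and are not results from this study. It drops zero differences and uses average ranks for ties.

```python
from itertools import product


def wilcoxon_exact(a, b):
    """Exact two-sided Wilcoxon signed-rank test for small paired samples.

    Returns (W, p), where W = min(W+, W-). Zero differences are dropped.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:  # assign average ranks to tied absolute differences
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = average_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w = min(w_plus, total - w_plus)
    # Under the null hypothesis each rank is positive or negative with
    # probability 1/2, so enumerate all 2**n sign assignments.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        positive_sum = sum(r for r, s in zip(ranks, signs) if s)
        if min(positive_sum, total - positive_sum) <= w:
            extreme += 1
    return w, extreme / 2 ** n


# Hypothetical paired accuracies (%) of two classifiers on six evaluations.
clf_a = [99, 98, 97, 99, 98, 99]
clf_b = [96, 95, 95, 93, 93, 98]
print(wilcoxon_exact(clf_a, clf_b))
```

With n non-zero pairs, the smallest attainable exact two-sided p-value is 2/2^n. If the pairs were only the five cross-validation folds, this floor would be 0.0625, so more paired observations would be needed to reach the 0.05 level.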

RESULTS AND DISCUSSION
This study analysed the patterns of brainwave activity in a group of participants using electroencephalography (EEG). Specifically, the focus was on the alpha (8-13 Hz), delta (0.5-4 Hz), and theta (4-8 Hz) frequency bands, which have been linked to different cognitive and emotional processes. The data were analysed using statistical and visualisation techniques, including box and scatter plots. The results revealed the presence of outliers in the data, which are data points that fall far outside the expected range and can significantly impact the overall pattern of results. These outliers were particularly evident in the alpha and delta frequency bands, with fewer outliers observed in the theta band.
One approach to identifying outliers is to use z-scores or modified z-scores, which measure how far a given data point lies from the centre of the dataset: in standard deviations from the mean for the z-score, and in scaled median absolute deviations from the median for the modified z-score. By plotting these scores visually, it is possible to detect outliers in a dataset and evaluate their impact on the overall data pattern. This method allows the magnitude of the outliers to be identified and their potential influence on the analysis to be assessed.
Figures 2 and 3 illustrate the distribution of data points, where the blue dots correspond to data points that fall within the expected range, and the red dots represent data points that fall outside the expected range and are considered outliers. In addition to the potential impact of outliers on statistical analyses, it is important to note that outliers can also affect the interpretation and generalisation of research findings. Outliers can lead to overestimation or underestimation of effect sizes, which can have significant implications for the practical significance of research results. Thus, appropriate identification and treatment of outliers are essential to ensure research findings' integrity and applicability to the broader population or context of interest.
In Fig. 4, the proposed enhancement of the modified z-score was effective in ensuring that all data points fell within the expected range, indicating that it successfully addressed the issue of outliers and improved the overall data quality. The method adjusted the data points to achieve the desired outcome without discarding any data classified as outliers during the process.
This study evaluated the performance of five popular machine learning models, RNN, CNN, ANN, SVM, and LSTM, to determine the most effective for the classification task. The z-score is a method of standardisation that involves subtracting the mean from the data and dividing it by the standard deviation. However, datasets that include outliers, such as those used for driver fatigue detection, may not be well-suited for this method. The modified z-score (MZS) uses the median and median absolute deviation instead of the mean and standard deviation to standardise the data. The MZS is more robust to outliers and has been used in several studies for driver fatigue detection. In this study, we further enhanced the MZS, the so-called enhancement of modified z-score (EMZS), by applying a scaling factor to adjust the sensitivity and specificity of the classifier. This enhancement resulted in the highest accuracy across the five classifiers tested: RNN, CNN, ANN, SVM, and LSTM.
Based on the results in Table 1, the ANN performed the best among the five classifiers tested. The ANN demonstrated higher accuracy and specificity than the other classifiers, with values of 99.65% and 100.00%, respectively. However, its sensitivity value of 99.48% was slightly lower than that of some other classifiers. Nonetheless, the results indicate that the ANN was still proficient at detecting driver fatigue. This result outperformed the accuracy of 96% and sensitivity of 94% reported by a previous researcher in the field [27]. Therefore, our study demonstrates the effectiveness of the proposed approach for accurate driver fatigue detection. The sensitivity of the CNN and LSTM classifiers was not applicable (n/a) in this study due to the absence of negative instances in the test set. As a result, the denominator of the sensitivity calculation was zero, and the sensitivity value was undefined.
However, the high precision and specificity values achieved by the classifier suggest that it effectively identifies positive instances. Implementing the enhanced modified z-score approach resulted in an accuracy improvement of over 30% compared to the traditional modified z-score method. The SVM accuracy of 97.89% is high compared with that of a previous study that used the same classifier with eight electrode channels [13].
According to the results, the proposed enhanced modified z-score method was found to be more effective than the traditional modified z-score approach in identifying driver fatigue from EEG signals. The accuracy of the enhanced method was significantly better than that of the conventional method, as indicated by the study's findings. The proposed approach effectively minimises false positives and negatives, which is critical for detecting driver fatigue with high precision. Notably, the accuracy results obtained through the proposed method were competitive with the findings of similar studies that used EEG signals for driver fatigue detection. Furthermore, the proposed framework's multi-classifier approach could be used to improve the detection of other physiological signals related to driver fatigue, such as eye movements and heart rate variability.
ROC curves were utilised to plot and evaluate the performance of each classifier to establish the ideal threshold for classification. The results indicated that the CNN and ANN classifiers exhibited the highest true positive and lowest false positive rates, suggesting they could accurately distinguish between normal and fatigued states. Additionally, the SVM classifier displayed a high true positive rate and a low false positive rate, indicating its effectiveness in classifying driver fatigue. In contrast, the LSTM and RNN classifiers had slightly lower true positive rates and higher false positive rates than the other classifiers. This suggests that they were less effective in accurately classifying driver fatigue. Despite these limitations, the LSTM and RNN classifiers still represent promising approaches to detecting driver fatigue and should be further investigated to determine if their performance can be improved. The findings suggest that the CNN, ANN, and SVM classifiers have significant potential for accurately detecting driver fatigue from EEG signals. These results have important implications for reducing the risks associated with drowsy driving and improving overall road safety.
The study's findings indicate that the proposed enhanced modified z-score approach, combined with multiple classifiers, was highly effective in detecting driver fatigue from EEG signals with high accuracy. In particular, the ANN classifier outperformed the other classifiers, exhibiting the highest accuracy and specificity. On the other hand, the CNN and LSTM classifiers achieved perfect classification with the highest area under the curve (AUC) values. The ANN classifier is particularly well-suited for detecting driver fatigue from EEG signals due to its capacity to learn complex nonlinear relationships between input and output variables, flexibility in handling vast amounts of data, and ability to adapt to changing input patterns over time. Therefore, the ANN classifier represents a promising approach to accurately detecting driver fatigue and mitigating the risks associated with drowsy driving. The ANN classifier has several benefits but also limitations, such as the need for high-quality input data, significant training data, and customisation to specific contexts. Despite these challenges, the ANN classifier remains a promising approach for accurately detecting driver fatigue.
Artificial Neural Networks (ANNs) are inspired by the structure and function of the human brain, consisting of interconnected nodes that process information and transmit signals. By adjusting the weights between nodes based on training data, ANNs can learn to identify complex patterns and make highly accurate predictions. In the case of detecting driver fatigue from EEG signals, ANNs can analyse the intricate patterns of brainwave activity associated with fatigue and accurately distinguish them from normal brainwave patterns. The proposed approach has exhibited significant improvements in accuracy compared to the traditional modified z-score method, which is an important development in the field of driver fatigue detection. Moreover, the multi-classifier approach may also have potential in detecting other physiological signals related to driver fatigue, thereby enhancing the overall reliability of driver fatigue detection systems. These findings could have far-reaching implications in ensuring road safety by enabling the development of more accurate and reliable driver fatigue detection systems.

CONCLUSION
In conclusion, the proposed framework for driver fatigue detection using enhancement of modified z-score and multiple machine learning architectures was highly effective in accurately detecting driver fatigue from EEG signals. Our study demonstrated that the framework achieved high accuracy for all classifiers, with the ANN achieving the highest accuracy of 99.65%. The SVM and CNN also achieved high accuracy, with 97.89% and 97.19%, respectively. Although the RNN and LSTM achieved slightly lower accuracy, they still achieved over 90.00% accuracy. Our study's contribution lies in presenting a comprehensive and effective framework that can accurately detect driver fatigue from EEG signals, surpassing the performance of previous approaches. Furthermore, the proposed framework can potentially be applied to practical driver fatigue detection systems, improving driving safety and reducing the number of road accidents caused by driver fatigue. Future research could build on this framework by investigating its performance in real-world settings and exploring ways to improve its sensitivity to subtle changes in EEG signals.

Fig. 1: The framework of enhancement of modified z-score.

Fig. 4: Identification of outliers in the enhancement of modified z-score data.