BOTNET DETECTION USING INDEPENDENT COMPONENT ANALYSIS

Botnets are a significant cyber threat that continues to evolve. Botmasters continually improve the security strategies of their botnets so that they go undetected. New botnet source code is detected every second, and each attack demonstrates how difficult botnets are to monitor and how robust they have become. For conventional network botnet detection models that use signature analysis, botnet concealment strategies such as encryption and polymorphism, together with the shift from a centralized structure to a decentralized peer-to-peer structure, pose challenges. Behavior analysis is a promising approach for solving these problems because it does not rely on analyzing the network traffic payload. In addition, a detection model should be able to predict novel types of botnets. This study focuses on using flow-based behavior analysis to detect novel botnets, which is necessary because existing patterns are difficult to detect in botnets that keep modifying their signatures as part of their concealment strategy. This study also proposes applying data standardization and Independent Component Analysis (ICA) during pre-processing to increase data quality before classification. We compared the percentage of significant features with and without ICA. The experiments showed that ICA produced significant improvements. The highest F-score was 83%, for the Neris bot. The average F-score for novel botnet samples was 74%. In the feature importance test, the importance of the top-ranked feature increased from 22% to 27%, and the false positive rate of the training model decreased from 1.8% to 1.7%.

ABSTRAK: Botnets are a cyber threat that is constantly evolving. Bot owners constantly renew the security strategies of their botnets so that they cannot be detected. Every second, new botnet source code is detected, and each attack demonstrates the level of difficulty and resilience involved in detecting bots. Conventional botnet network detection models have used signature-based analysis to overcome major obstacles in detecting concealed botnet patterns, such as encryption and polymorphic techniques. The problem centers on the shift from a centralized structure to a decentralized structure such as a peer-to-peer (P2P) network. This behavior analysis


INTRODUCTION
A botnet is a collection of computers infected by malicious software (malware) and managed by a botmaster. Any Internet-of-Things (IoT) device, such as a closed-circuit television (CCTV) camera, web camera, computer, or mobile device, can become an infected device. The vulnerabilities and computing resources of these infected devices are exploited so that they operate remotely as servants following the instructions of their botmaster. The main aim of deploying a botnet is to launch an attack on a victim. The frequency of the attacks, however, depends on the number of bots. The number of bots in the botnet environment is therefore the most significant factor contributing to the frequency of the attacks [1], [2].
Because of the large number of bots, botnets can execute major attacks on victims, such as DDoS or email spam, rendering victims unable to function for hours or days. For example, in the Mirai incident of 2016, the vast number of bots produced a massive attack [3], which was traced to 600,000 Internet-of-Things (IoT) devices [4]. The Mirai attacks were noteworthy at the time because the bots were IoT devices, not just computers or laptops. Consider the result if 600,000 devices concurrently sent a ping to a specific website: the website would be overwhelmed, its services would slow down, and it would become inaccessible.

Botnet detection models have become a hot topic among researchers due to the history of botnet attacks and their impact on industry. The arms race between botmasters and researchers, each trying to beat the other, never ends. Each group continues to develop its abilities, as can be seen in the evolution of botnets. A botnet evolves or mutates every day once its source code has been released to the public [5], as the Mirai botnet and its variants show. Two months after the Mirai source code was released to the public, the number of bots more than doubled, from 213,000 to 493,000, with variants of varying complexity. Statistics on various botnet attacks are published on the Securelist website; based on the Kaspersky Lab Botnet Monitoring project, 39.35 percent of the botnets found in 2018 were new compared to the botnet attacks of 2017 [6].

Subsection 1.1 briefly explains why a botnet relies heavily on the Command & Control (C&C) server during the Rallying or C&C stage. The importance of preventing the security system from identifying this server also explains the evolution of the botnet structure from centralized to decentralized. Centralized botnets, such as IRC and HTTP botnets, communicate via a primary server called the Command & Control server.
Decentralized botnets, such as peer-to-peer (P2P) botnets, are more advanced because the bots themselves can act as servers. P2P is designed to hide the C&C server and is part of the concealment strategy [7], [8]. A botnet's strength lies in its capacity to elude security systems and carry out large-scale attacks using tactics such as packet data concealment and packet data encryption [9]. A botnet can hide from the protection system by imitating regular traffic flow, although normal traffic is usually more random [10]. It then waits for the next stage, which results in an imbalanced class distribution.
According to [8], the botnet is now becoming a profitable business in which the botmaster provides a service for any cyber-attack. To run such a service, however, the botmaster must be able to monitor the bots, direct subsequent attacks, lengthen the duration of the attacks, and prevent its own identity from being monitored.

Botnet Life-Cycle and Structure
It is necessary to understand the botnet life cycle when designing a behavior-based botnet detection model, since the choice of relevant features for the model depends on it. In addition, identification must be carried out before the attacks occur; afterwards it is too late [11]. The botnet life cycle can be divided into four main stages:
• Injection or replication. This stage can be achieved in several ways on the network, such as sharing folders, visiting malicious websites, or through email. The bot herder increases the number of bots at this stage.
• Command & Control, or 'Rallying'. At this stage the infected devices already behave like bots. The bots keep reporting the devices' status and, if necessary, the bot herder sends new source code [12]. The revised source code and the vulnerability reports are intended to keep the bots undetected and robust [13].
• Attack. All bots are directed at specific victims. The botmaster issues the order to launch an attack, and the bots launch it simultaneously on that command.
• Release. The botmaster removes fingerprints, replaces identified bots with new systems, and leaves no digital footprint behind. Often the botmaster distributes the source code to hinder government investigations. During this stage, the functionality of the bot system is also enhanced by learning from the previous attack [14].
Our study concentrates on botnet activity during the Command & Control or Rallying stage. The infected system continually attempts to connect to the C&C server to report on the infected devices. It also receives updated source code so that it stays hidden from protection systems [8]. According to a Kaspersky Lab study [15] on monitoring DDoS attacks, the right time to detect a botnet is when the commands from the Command & Control server can be intercepted.

Motivation and Contribution
Our motivation is the potential and the consequences of a botnet attack, and the need to discover the botnet before the attack is launched. Our main general incentive for building a model that can predict novel botnets is the continuous evolution of botnets. The technical motivation for using behavior-based patterns and flow-based features is the shortcomings of signature-based detection models in detecting new forms of botnets. We were also motivated by the research in [16], which combined Principal Component Analysis (PCA) with k-means clustering.

https://doi.org/10.31436/iiumej.v23i1.1789

In statistics, principal component analysis is a technique used to describe a data set in terms of new uncorrelated variables ("components"). The components are ordered by the amount of original variance they describe, so the technique helps reduce the dimensionality of a data set. In comparison, Independent Component Analysis (ICA) is a machine learning technique used to separate independent sources from a mixed input. Unlike principal component analysis, which focuses on maximizing the variance of the data points, independent component analysis emphasizes the independence of the components. Since we use aggregation when pre-processing the data, vital information might be lost, resulting in decreased performance, so we chose to use ICA. ICA is explained in the Independent Component Analysis subsection.

The significant contributions of this study are:
• This method can detect botnet network traffic even under concealment strategies such as obfuscation, code encryption, and oligomorphic, polymorphic, and metamorphic strategies.
• This method used the CTU-13 botnet benchmark dataset, which consists of centralized and decentralized structures, showing that our framework can detect both structures.
• The framework was evaluated on different types of botnets, showing that it can detect novel botnets.
• Our results show an average F-score of 74% when tested on five types of novel botnets.
• The performance of the framework is compared with that of other researchers who used the same data source.

RELATED WORKS
Recent developments in techniques for concealing packet data in network traffic make signature-based or content-based detection inefficient at detecting new forms of botnets. For example, Singh et al. [17] noted that botnets revamp and significantly modify their signatures from time to time. These changes cause the performance of signature-based analysis to drop on newly released botnets because signature-based analysis relies heavily on the bot's signature. In addition, many concealment tactics are used to mask packet data content in network traffic, including obfuscation, code encryption, and oligomorphic, polymorphic, and metamorphic strategies [18].
Patsakis et al. [8] raised many concerns about DNS queries being used to conceal the botnet on an encrypted channel. AsSadhan et al. [17] argued that packet contents should be protected to safeguard the identity and private information of individual users, so that only the packet headers can be released to the public. These authors also concentrated on analyzing traffic and packet exchanges and on providing a framework for lightweight security. However, their work considers only DNS and applies only to DGA botnets. They also developed a model based on the behavior of the botnet when interacting with others, but the time interval used in that study is 31-49 minutes.
Machine learning models for detecting network activity can be classified into two common approaches: 1) payload-based and 2) traffic-based. As the name implies, the payload-based approach trains models on characteristics derived from the payload (data portion) of the packets transmitted over the network. The disadvantages of such models are that they are resource-intensive (features need to be evaluated for each packet), raise privacy problems, and cannot extract features from encrypted content [18,19]. The traffic-based approach aims to mitigate some of these drawbacks by analyzing the communication packet headers or NetFlow information. While privacy remains an issue with this approach (for example, individual IP addresses appearing in features), it can be mitigated by aggregating records over time windows.

Behavior-based and Flow-based Features
What is behavior-based analysis, and how does it differ from signature-based analysis? The two stand in contrast to each other. In computing, all objects have attributes that can be used to develop a custom signature. Signature-based analysis refers to detecting attacks by searching for specific patterns, such as byte sequences in network traffic or known malicious instruction sequences used by malware [22]. The terminology derives from anti-virus software, which refers to these detected patterns as signatures.
Behavior-based analysis does not analyze the data directly as signature-based analysis does, and it has several advantages over signature-based analysis. For example, it is more effective in detecting new and novel forms of malware threat. It can detect a single instance of malware that targets a person or organization. It can also identify what the malware does when files are opened in a specific environment and obtain comprehensive information about the malware. However, according to Resende and Drummond [21], most research equates behavior-based analysis with anomaly-based detection, even though anomaly detection can also be done using signature-based analysis. Anomaly-based detection therefore cannot be treated as the definition of behavior-based analysis in malware detection.
The definition of behavior-based analysis by Resende and Drummond [21] is the closest to our own. They define behavior-based analysis in Network Intrusion Detection Systems as detection techniques that do not evaluate or refer directly to the source, destination, or payload of packets; instead, the analysis assesses the behavior of an object. Behavior-based detection can be performed using API call logs [24], network flows (NetFlow) [10], or a hybrid of API calls and NetFlow [25]. A flow is a collection of packets that share the same source and destination. Flow-based botnet detection techniques employ statistics of all packet headers in a flow (the flow record). Because the flow-based approach captures only packet header information, it reduces computational complexity [24][25][26] and can be processed very quickly.
Flow-based features are the characteristics chosen to describe the network flow pattern or connection in order to distinguish normal network traffic from botnet traffic. Flow-based features also relate to packet information, such as total packets per second, bytes per packet, total packet bytes, and the number of packets [29]. Table 1 summarizes flow-based features from published work, listing the features, the time window, and the data tools used. Table 1 therefore became our main guide when designing our features and time window.
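As a minimal sketch of how such flow-based features can be derived from packet headers alone, the Python example below groups packets by source and destination and computes the statistics named above. The packet tuples, addresses, and dictionary keys are hypothetical illustrations; they are not taken from Table 1 or from the dataset.

```python
from collections import defaultdict

# Hypothetical packet headers: (timestamp in seconds, source, destination, size in bytes).
packets = [
    (0.00, "10.0.0.5", "192.0.2.10", 60),
    (0.25, "10.0.0.5", "192.0.2.10", 1500),
    (0.50, "10.0.0.5", "192.0.2.10", 40),
    (0.10, "10.0.0.7", "10.0.0.1", 100),
    (2.10, "10.0.0.7", "10.0.0.1", 300),
]


def flow_records(packets):
    """Group packets by (source, destination) and compute header-only statistics."""
    flows = defaultdict(list)
    for timestamp, src, dst, size in packets:
        flows[(src, dst)].append((timestamp, size))

    records = {}
    for key, items in flows.items():
        times = [timestamp for timestamp, _ in items]
        total_bytes = sum(size for _, size in items)
        duration = max(times) - min(times)
        records[key] = {
            "packets": len(items),
            "total_bytes": total_bytes,
            "bytes_per_packet": total_bytes / len(items),
            # The rate is undefined for a zero-length flow, so None is stored instead.
            "packets_per_second": len(items) / duration if duration > 0 else None,
        }
    return records


records = flow_records(packets)
for key, record in records.items():
    print(key, record)
```

No payload is read at any point, which is why this style of feature remains usable when the packet contents are encrypted.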
We treated feature selection under the same concept as aggregation. As mentioned in Gezer et al. [40], feature selection is one of the challenges in machine learning because it requires a good understanding of the domain. Not all features in the data are relevant to building the model (Resende and Drummond [21]). Aggregation, on the other hand, is a process of transforming the data based on a specific theory, and it enables the extraction of sufficient data for analysis. According to [30], aggregation is a data mining technique used in machine learning for efficient knowledge discovery about network flows.

Independent Component Analysis
Independent Component Analysis (ICA) is a source separation technique from signal processing. According to the survey in [31], ICA and PCA are among the popular methods used to select essential network features. PCA extracts features and reduces their dimension, while ICA separates out the noise to enhance each feature's data pattern [32]. The authors in [33] describe PCA as a technique for reducing features by identifying the relevant feature set. The use of ICA and PCA with a clustering algorithm for feature selection has been reported in [16], a semi-supervised model in which the author combined unsupervised and supervised techniques.
In ICA, the mutual dependence between features is minimized by maximizing non-Gaussianity. The research by Palmieri et al. [34] is the most similar to our approach; the authors used ICA for network anomaly detection on network traffic from the University of Naples, Italy. We also searched for applications of ICA to botnet detection and found only the article by Mao et al. [35], in which the authors used ICA to detect spamming botnets.
In summary, behavior-based analysis using flow-based features can address the problem of botnet concealment, but it produces a high false alarm rate (false positive rate). High false alarm rates in machine learning arise from unclear separation between classes, which in turn comes from unclear patterns in the data. Although some attempts have been made to address this issue, applications of ICA to it remain limited.

PROPOSED FRAMEWORK AND PRE-EVALUATION RESULT
To reach this objective, this study proposes a new framework, as shown in Fig. . The proposed framework starts with selecting the network traffic dataset and pre-processing it. This study emphasizes the pre-processing phases, in which the data is prepared so that classification achieves high performance.

Input: Data Source and Data Distribution
For this study, the data were extracted from the CTU-13 botnet benchmark dataset [42], available from the website of the Stratosphere Research Laboratory. The dataset consists of 13 files containing several types of botnets with different protocols and different structures. Since this framework aims to detect novel botnets, the dataset is divided into two sets, one for building the model and one for evaluating it, as shown in Table 2. The model is evaluated with data that is kept separate from the data used to build it. This separation ensures that the model is derived and tested using different sets of data, as explained in Step 2: Dividing Dataset in Section 3.2. Table 2 summarizes the distribution of data by Data File No. The third column gives the names of the bots in the dataset. The bot name, bot category, and structure are explained in Table 3. It is essential for this research to have both structures (centralized and decentralized). Our data source selection therefore appears reasonable in terms of independent structure and bot reliability. Columns 5 and 6 in Table 2 show the separation of training and evaluating data for the novel bots.

Data Pre-processing
Pre-processing data is the phase in which the data is prepared before being incorporated into the algorithm to construct the prediction model. Since we used behavior-based analysis, the information needed to go through several steps. A behavior-based analysis is not a straightforward extraction process but rather a tool for analyzing the raw data. There are several vital components or measures that we have grouped into the pre-processing data module, such as Labeling, Cleaning, Dividing Dataset, Feature Selection, Aggregation, and Data Quality Process Implementation.

Step 1: Labeling and Cleaning
The first step was re-labeling the dataset. The CTU-13 dataset is labeled, but the labels are strings (text), not numbers. There are 74 types of descriptive labels in CTU-13, as shown in Appendix A, but they are all based on three basic labels: 'Normal', 'Botnet', and 'Background'. We therefore re-labeled CTU-13 as stated in Eq. (1). Once the labeling was completed, we removed the uncertain data with Label = 2 as the cleaning process.
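The re-labeling and cleaning step can be sketched in Python as follows. The numeric codes are an assumption: the text only states that Label = 2 marks the uncertain data that is removed, so mapping 'Normal' to 0, 'Botnet' to 1, and 'Background' to 2 is a guess, as are the example label strings and the `Label` and `SrcAddr` keys.

```python
# Assumed numeric codes; only "Label = 2 is removed" is stated in the text.
NORMAL, BOTNET, BACKGROUND = 0, 1, 2


def relabel(text_label):
    """Collapse a descriptive text label into one of the three basic classes."""
    lowered = text_label.lower()
    if "botnet" in lowered:
        return BOTNET
    if "normal" in lowered:
        return NORMAL
    return BACKGROUND


def relabel_and_clean(rows):
    """Replace text labels with numeric codes and drop the uncertain (Label = 2) rows."""
    cleaned = []
    for row in rows:
        code = relabel(row["Label"])
        if code != BACKGROUND:
            cleaned.append({**row, "Label": code})
    return cleaned


# Illustrative rows; the label strings only imitate the style of descriptive labels.
rows = [
    {"SrcAddr": "10.0.0.5", "Label": "flow=From-Botnet-V42-TCP-Established"},
    {"SrcAddr": "10.0.0.7", "Label": "flow=From-Normal-V42-Example"},
    {"SrcAddr": "10.0.0.9", "Label": "flow=Background-TCP-Established"},
]
cleaned = relabel_and_clean(rows)
print(cleaned)
```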

Step 2: Dividing Dataset
After the cleaning process, the data was split into two main sets: Building Model Data and Evaluating Data. This separation ensures that the constructed model is evaluated on novel botnets. The model was built from the Building Model Data, which was divided into training and testing data at a 70:30 ratio.
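A minimal sketch of this two-level division is shown below. It assumes that each record carries the number of the file it came from, and it assumes that the held-out files are numbers 1, 2, 6, 8, and 9, which are the file numbers named in the results discussion. The record layout, the function name, and the random seed are hypothetical.

```python
import random

# Assumed hold-out files, taken from the file numbers named in the results discussion.
EVALUATION_FILES = {1, 2, 6, 8, 9}


def divide_dataset(records, train_ratio=0.7, seed=0):
    """Split records into training, testing, and held-out evaluating sets.

    Each record is a tuple (file_no, features, label).
    """
    building = [r for r in records if r[0] not in EVALUATION_FILES]
    evaluating = [r for r in records if r[0] in EVALUATION_FILES]

    shuffled = building[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)  # 70:30 split of the building data
    return shuffled[:cut], shuffled[cut:], evaluating


# Synthetic records: 13 files with 10 records each.
records = [(file_no, [float(i)], i % 2) for file_no in range(1, 14) for i in range(10)]
train, test, evaluating = divide_dataset(records)
print(len(train), len(test), len(evaluating))
```

Holding out whole files, rather than random rows, is what keeps the bots in the evaluating set unseen during model building.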

Step 3: Features Selection and Aggregation
Once the dataset division was completed, the dataset was ready for the next process: selecting the features and aggregating them over a specific time interval. Because we aimed for the fastest possible detection, we chose a short time interval (1 second) for aggregation.
Feature selection is the process of selecting specific variables (features or attributes) in the data. Its purpose was to reduce complexity and processing time. We chose the features based on the theory of communication between the botmaster and its bots during the C&C stage of the botnet life cycle. As mentioned in subsection 1.1: Botnet Life-Cycle & Structure, bots and the botmaster keep communicating during the C&C phase. This communication pattern differs from the regular communication pattern: typical communication is usually more random, whereas a bot's communication is more uniform, transferring the same amount of data to multiple destinations. The features used in this research are shown in Table 4.
Since NetFlow data contains both continuous and categorical data, the two types are aggregated differently, as shown in Eq. (3). For continuous data, we applied statistical measures: minimum, maximum, median, standard deviation, and the number of values, n(x). For categorical data, we applied only the total distinct number, n(x), defined as the number of unique elements in the set, as shown in Eq. (4).
Time is used as the basis for aggregation; the aggregation (rounding) process is shown in Eq. (1). Time is also used to calculate the time difference (∆t) between the last time, tn, and the start time, t1, within each 1-second time interval. The equations for the time difference (∆t) are shown in Eqs. (5) and (6).
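The aggregation described in this step can be sketched as follows. This is only an illustration under stated assumptions: timestamps are taken to be seconds stored as floats and are rounded down to the 1-second window, the population standard deviation is used, and the column names `StartTime`, `TotBytes`, and `DstAddr` as well as the output key names are chosen for the example rather than taken from Table 4.

```python
import math
import statistics
from collections import defaultdict


def aggregate(flows, continuous=("TotBytes",), categorical=("DstAddr",)):
    """Aggregate flow records into 1-second windows."""
    windows = defaultdict(list)
    for flow in flows:
        # Rounding the start time down assigns the flow to its 1-second window.
        windows[math.floor(flow["StartTime"])].append(flow)

    rows = []
    for second in sorted(windows):
        group = windows[second]
        times = [f["StartTime"] for f in group]
        # Time difference between the last and the first flow in the window.
        row = {"window": second, "delta_t": max(times) - min(times)}
        for name in continuous:
            values = [f[name] for f in group]
            row[name + "_min"] = min(values)
            row[name + "_max"] = max(values)
            row[name + "_median"] = statistics.median(values)
            row[name + "_std"] = statistics.pstdev(values)
            row[name + "_n"] = len(values)
        for name in categorical:
            # Categorical data: only the number of distinct values is kept.
            row[name + "_distinct"] = len({f[name] for f in group})
        rows.append(row)
    return rows


flows = [
    {"StartTime": 10.1, "TotBytes": 100, "DstAddr": "a"},
    {"StartTime": 10.6, "TotBytes": 300, "DstAddr": "b"},
    {"StartTime": 10.9, "TotBytes": 200, "DstAddr": "a"},
    {"StartTime": 11.2, "TotBytes": 50, "DstAddr": "c"},
]
rows = aggregate(flows)
for row in rows:
    print(row)
```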

Step 4: Data Quality Process
This process is the core of the framework. It improves the quality of the data and features, which improves the performance of detecting novel botnets. We therefore call it the Data Quality Process, reflecting the objective of the combined process. It is a two-step approach consisting of standardization and Independent Component Analysis (ICA). We used this approach because the classifier we chose is distance-based.

A) Standardization
Standardization is a re-scaling of the distribution of the dataset so that the mean of the data equals 0 and the standard deviation equals 1. In other words, standardization is a process of centering and scaling the data. Standardizing a data set is a common requirement for a wide range of machine learning estimators, which may behave badly if the individual features do not look more or less like standard normally distributed data (e.g., Gaussian with zero mean and unit variance). The standard score of a sample x is calculated as:

$$ z = \frac{x - \mu}{s} \tag{7} $$

where µ is the mean of the training samples, or zero if with_mean=False, and s is the standard deviation of the training samples, or one if with_std=False.
Centering and scaling were applied to each feature independently by computing the relevant statistics on the samples in the training set. The mean and standard deviation were then stored so that the same transform could be applied to later data.
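The fit-on-training, apply-to-later-data behavior can be illustrated with a small pure-Python sketch. It is not the library code used in the study; the function names and the toy data are hypothetical, and the population standard deviation is assumed.

```python
import statistics


def fit_standardizer(train_rows):
    """Compute the per-feature mean and standard deviation on the training set only."""
    columns = list(zip(*train_rows))
    means = [statistics.mean(col) for col in columns]
    # A constant feature has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Apply the stored statistics: z = (x - mean) / std for every feature."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[4.0, 20.0]]

means, stds = fit_standardizer(train)
train_z = transform(train, means, stds)
test_z = transform(test, means, stds)  # reuses the training statistics
print(train_z)
print(test_z)
```

Reusing the training statistics on the test data keeps information about unseen samples out of the model.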

B) Independent Component Analysis
For this research, we used FastICA from the sklearn.decomposition package in Python, as shown in Algorithm 1. As the name implies, FastICA is a fast version of ICA. It rotates the data until the data looks non-Gaussian along every axis. Making the mean equal to zero and normalizing the variance in all directions allows the algorithm to rotate the data in any direction. The process of normalizing the variance is called whitening. As shown in Algorithm 1, the Python code implements ICA through the whitening process. Whitening decorrelates the features to ensure that all of them are treated equally before the ICA algorithm runs.
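The "center, whiten, then rotate until the axes look non-Gaussian" idea can be illustrated for two features with the pure-Python sketch below. It is not Algorithm 1 and not FastICA itself: instead of FastICA's fixed-point iteration it uses a brute-force search over rotation angles, and it scores non-Gaussianity with the absolute excess kurtosis. The synthetic uniform sources, the mixing weights, and all function names are assumptions made for the example.

```python
import math
import random


def excess_kurtosis(values):
    """Zero for a Gaussian; the absolute value is a simple non-Gaussianity score."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    fourth = sum((v - mean) ** 4 for v in values) / n
    return fourth / var ** 2 - 3.0


def whiten(xs, ys):
    """Center two features, decorrelate them, and scale each to unit variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    xc = [x - mx for x in xs]
    yc = [y - my for y in ys]
    a = sum(x * x for x in xc) / n              # var(x)
    c = sum(y * y for y in yc) / n              # var(y)
    b = sum(x * y for x, y in zip(xc, yc)) / n  # cov(x, y)
    # Eigen-decomposition of the 2x2 covariance matrix [[a, b], [b, c]].
    root = math.hypot((a - c) / 2, b)
    lam1, lam2 = (a + c) / 2 + root, (a + c) / 2 - root
    theta = 0.5 * math.atan2(2 * b, a - c)      # direction of the first principal axis
    ct, st = math.cos(theta), math.sin(theta)
    z1 = [(ct * x + st * y) / math.sqrt(lam1) for x, y in zip(xc, yc)]
    z2 = [(-st * x + ct * y) / math.sqrt(lam2) for x, y in zip(xc, yc)]
    return z1, z2


def rotate(z1, z2, angle):
    ca, sa = math.cos(angle), math.sin(angle)
    return ([ca * p + sa * q for p, q in zip(z1, z2)],
            [-sa * p + ca * q for p, q in zip(z1, z2)])


def unmix(xs, ys, steps=90):
    """Whiten, then keep the rotation whose axes are the most non-Gaussian."""
    z1, z2 = whiten(xs, ys)

    def score(angle):
        u1, u2 = rotate(z1, z2, angle)
        return abs(excess_kurtosis(u1)) + abs(excess_kurtosis(u2))

    best = max((math.pi / 2 * i / steps for i in range(steps)), key=score)
    return rotate(z1, z2, best)


def correlation(p, q):
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    num = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mq) ** 2 for b in q))
    return num / den


# Two independent, non-Gaussian sources mixed into two observed features.
rng = random.Random(0)
s1 = [rng.uniform(-1, 1) for _ in range(2000)]
s2 = [rng.uniform(-1, 1) for _ in range(2000)]
x = [2.0 * p + 1.0 * q for p, q in zip(s1, s2)]
y = [1.0 * p + 3.0 * q for p, q in zip(s1, s2)]

u1, u2 = unmix(x, y)
match1 = max(abs(correlation(u1, s1)), abs(correlation(u1, s2)))
match2 = max(abs(correlation(u2, s1)), abs(correlation(u2, s2)))
print(match1, match2)
```

Each recovered component should line up with one of the original sources, up to order and sign, which are ambiguities that ICA cannot resolve.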
After the centering process (mean equal to zero) and the whitening process (normalization of variance), the ICA algorithm was run on the data. The main goal of ICA is to find the unmixing matrix W, where W is the inverse of A, X is the input data, and A is the mixing matrix. The equations are shown in Eqs. (8) to (10).
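For orientation, the standard noiseless ICA model that matches this description is usually written as below. The symbol S for the independent source components is an assumption introduced here; X, A, and W follow the text. This is the textbook form and is not claimed to reproduce the exact notation of Eqs. (8) to (10).

```latex
\begin{align*}
X &= A\,S      && \text{observed data as a mixture of independent sources} \\
W &= A^{-1}    && \text{unmixing matrix as the inverse of the mixing matrix} \\
S &= W\,X      && \text{sources recovered from the observed data}
\end{align*}
```

In practice A is unknown, so W is estimated directly from X by maximizing the non-Gaussianity of the components of WX.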

C) Features Importance Ranking
We evaluated our feature selection through a Feature Importance Ranking calculated using the Extra Trees Classifier, as shown in Algorithm 2. Feature importance from the Extra Trees Classifier in the scikit-learn module is impurity-based: it is computed from the training data and does not necessarily reflect the predictive ability of a feature.
The feature importance ranking in Fig. 2 shows the percentage contribution of each feature. For example, the importance of the 1st feature, the highest contributor, increased from 22.75% to 27.74%, and that of the 9th feature, the lowest contributor, increased from 1.8% to 4.31%. Since no feature had a 0% contribution, no features were removed.

Building Model using Classification
Once pre-processing was complete, we moved to the classification process. The classifier we chose is K-Nearest Neighbor (K-NN). K-NN is one of the most straightforward classifiers: it stores all available cases and classifies new cases based on a similarity measure [2,41]. The idea behind K-NN is that if most of the k most similar samples in the feature space belong to a specific class, the sample also belongs to that class. Nearest-neighbor techniques thus classify samples based on their similarity to the population. K-NN belongs to the family of supervised learning algorithms. Informally, this means that a labeled data set of training observations (x, y) is provided, and the goal is to capture the relationship between x and y. More formally, our objective was to learn a function h: X→Y that, given an unseen observation x, confidently predicts the corresponding output y. To use K-NN, we first needed to determine the k-value, the number of groups (clusters). For this research, we used the Elbow Method to determine the k-value.
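A minimal sketch of nearest-neighbor classification is given below, assuming Euclidean distance and a simple majority vote; the study does not state which distance metric or tie-breaking rule was used. In this sketch k is the number of neighbors that vote. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, sample, k=4):
    """Label `sample` by majority vote among its k nearest training points."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], sample),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    # With an even k a tie is possible; the label seen first (the nearest one) wins.
    return votes.most_common(1)[0][0]


# Toy data: class 0 clusters near the origin, class 1 near (5, 5).
train_x = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
train_y = [0, 0, 0, 1, 1, 1]

near_origin = knn_predict(train_x, train_y, [0.05, 0.05])
near_five = knn_predict(train_x, train_y, [5.0, 5.1])
print(near_origin, near_five)
```

Because the prediction depends directly on distances, features on larger scales would dominate the vote, which is why the data is standardized before classification.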

Determine k-value (Elbow Method)
To find the optimal value of k, we used the Elbow method, which is derived from the Within-cluster Sum of Squares (WSS). The Elbow method ran the k-NN algorithm several times and calculated the WSS error for different values of k. It is a heuristic approach for determining the number of clusters for k-means or k-NN. WSS is described by Eq. (11):

$$ WSS = \sum_{k} \sum_{X_i \in C_k} \left( X_i - \mu_k \right)^2 \tag{11} $$

where C_k is cluster k, µ_k is the mean value of the data points in the cluster, and X_i is an observation in C_k. The optimal value of k is at the elbow of the curve, the point from which the distortion starts decreasing linearly, as shown in Fig. 6. Although the curve appears to bend at k = 2, the decrease from k = 2 to k = 4 is still significant and not linear. For this research, we therefore used k = 4.
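The WSS quantity can be computed for a given grouping as in the sketch below. How the groups were obtained for each value of k is not specified in the text, so the groupings here are supplied by hand and are purely illustrative, as is the function name.

```python
def wss(clusters):
    """Within-cluster sum of squares for a list of clusters of points."""
    total = 0.0
    for points in clusters:
        dims = len(points[0])
        # Mean of the points assigned to this cluster.
        centroid = [sum(p[d] for p in points) / len(points) for d in range(dims)]
        # Squared distance of every point to its cluster mean.
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dims)) for p in points
        )
    return total


# The same four points grouped as one cluster and as two clusters.
one_cluster = [[[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]]
two_clusters = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 0.0], [10.0, 2.0]]]

print(wss(one_cluster), wss(two_clusters))
```

Plotting this value against k gives the curve on which the elbow is read; the sharp drop between one and two clusters in the toy data is the kind of bend the heuristic looks for.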

Building the Model
Once we had determined the k-value for K-NN, we started training and building the machine learning model. The Building Model Data was divided into training data and testing data at a 70:30 ratio, as explained in Section 3.2. Once the model was built, prediction on the Evaluating Data could start. We tested it file by file to evaluate how well the model could predict a particular bot.

EVALUATION
We evaluated the performance of our technique using the confusion matrix, in terms of accuracy, precision, recall, F-score, false-negative rate (FNR), and false-positive rate (FPR). A confusion matrix is the most widely used method for evaluating the performance of a machine learning model, and it shows the distribution of the results clearly. The confusion matrix is a two-dimensional table in which one dimension is the "actual" class and the other is the "cluster/projection" (predicted) class. Only two (2) classes were evaluated, with "Botnet" rated as positive and "Human" as negative. The cases were thus classified into four groups: False Positive (FP), False Negative (FN), True Positive (TP), and True Negative (TN), as shown in Table 5. A "True" state, either TP or TN, means that the classifier predicted the correct class, while a "False" state means that the predicted class was incorrect. Positive and Negative indicate the Botnet and Normal classes, respectively. For example, a False Negative means that the classifier falsely predicted Negative (Normal) for data that was actually positive (Botnet).

The confusion matrix can generate several performance evaluation parameters, but for this research we focused on accuracy, precision, recall, F-score, FNR, and FPR. These parameters were chosen to allow comparison with the results of other researchers who used the same dataset. Overall performance is usually taken from the accuracy, but we preferred to compare overall performance using the F-score.

Accuracy
Accuracy is often used to measure the overall performance of a machine learning classifier because it measures how often the algorithm classifies a data point correctly. Accuracy is the fraction of all data points that are predicted correctly, as described in Eq. (12):

$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{12} $$

Precision
Precision is the fraction of the data classified as botnet (positive) that are genuinely botnet, as described in Eq. (13):

$$ \text{Precision} = \frac{TP}{TP + FP} \tag{13} $$

Recall
Recall, also known as sensitivity, is the fraction of actual positives that are identified correctly. It can also be described as the ability of a model to find the relevant cases.

F-score
The F1 score is the harmonic mean of recall and precision, with the two weighted equally.
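The six evaluation parameters can be derived from the four confusion-matrix counts as in the following sketch. The label coding (1 for botnet, 0 for normal), the function names, and the toy predictions are assumptions for illustration only and are not results from the study.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP, and FN with `positive` marking the botnet class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Standard definitions; assumes every denominator is non-zero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        # Harmonic mean of precision and recall, equally weighted.
        "f_score": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }


# Toy labels: 1 = botnet (positive), 0 = normal (negative).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(actual, predicted)
scores = metrics(*counts)
print(counts)
print(scores)
```

On imbalanced data, accuracy can stay high even when many botnet flows are missed, which is one reason to prefer the F-score for comparing overall performance.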

RESULTS AND DISCUSSION
The framework was evaluated in two parts: one during model building, and the other on the prediction of novel data using the Evaluating Data.

Building Model
This section summarizes the findings from model building. The performance of the framework with flow-based features and K-NN was compared to the performance of the same framework with ICA added during pre-processing. Both performance evaluations are shown in Table 7, which lists the parameters used in the evaluation. The results were evaluated on the testing data split from the Building Model Data illustrated in Fig. 3. Comparing the accuracy and F-score in Table 7 shows only tiny increases, but the False Positive Rate (FPR) decreased from 1.88% to 1.69%.

Prediction Novel Bots
CTU-13, the benchmark botnet dataset, has also been used by several researchers to evaluate the performance of their prediction models. We therefore compared our results with those of other articles that used the same data source and the same evaluation parameters. The Evaluating Data consisted of 5 files that were separated before the model building process. The data is described in Table 8.

Based on Table 9, the model with ICA showed a better F-score than the model without ICA, but the model without ICA showed the lowest FPR. The yellow boxes in Table 9 indicate the best values, either the highest F-score or the lowest FPR. Since we focused on the F-score for comparison, we extracted it into Table 10. Table 10 shows that 4 of the 5 novel botnet files had their highest F-score using the K-NN model with ICA. Only one file, File No 2, showed the opposite result. We therefore conclude that the K-NN model with ICA performed better than the K-NN model without ICA.

The results for the K-NN model with ICA were then compared directly with previously reported findings on novel botnet prediction models that used the same data source. Table 11 summarizes the comparison and lists the parameters used in each previous article. Based on Table 11, our technique achieved the best F-score for Data number 8 (red font). However, for three of the five novel botnet files, the highest F-score was obtained by Fernandez Maimo et al. [31]. To measure overall performance, we calculated the average of each parameter. The averages over the data (Files 1, 2, 6, 8, and 9) for each parameter (precision, recall, and F-score) from Table 11 are illustrated in Fig. 7. These plots show that our proposed method outperformed the other approaches except that of Fernandez Maimo et al. [31].
Since the technique of Fernandez Maimo et al. [31] achieved the best overall results, we examined how their approach differs from ours. Fernandez Maimo et al. [31] also used flow-based features, but they considered features from both directions, incoming and outgoing traffic. Their approach resembles our technique in that both methods focus on concealed network traffic and use statistical analysis to aggregate the data.

CONCLUSION
This paper proposes a framework for novel botnet detection that applies data standardization and Independent Component Analysis (ICA) during the pre-processing of flow-based features. The strength of our framework is its use of flow-based features, which can detect concealed network traffic. Flow-based features also reduce complexity and processing time compared to content-based features. In addition, we used the shortest time interval for aggregation to enable quick detection. Our approach can be applied to botnets that use concealment and to new or novel forms of botnets.
The use of data standardization and Independent Component Analysis (ICA) improves the feature importance ranking and the classification results. The overall result is still not the best relative to previous approaches, although it improved on the same framework without standardization and ICA. We can also assume that the behavior analysis introduced some noise into the pattern.
Future work will focus on enhancing the feature set. Further developments are expected to lead to a deeper understanding of feature selection.