Building an attack detection system based on machine learning

Security threat detection, generally referred to as intrusion detection, has become a significant and serious problem in network, information, and data security. An intrusion detection system (IDS) has therefore become an important element of computer and network security. Preventing such intrusions depends entirely on the detection ability of the IDS, which plays a necessary role in network security because it identifies the different kinds of attacks in a network. Data mining, meanwhile, has been playing an important role in many disciplines of science and technology. In computer security, data mining is used to help an IDS detect intruders accurately. One of the vital techniques of data mining is classification, so we propose an IDS that uses a data mining approach: the Support Vector Machine (SVM), one of the best-known classification techniques in the data mining field. In the proposed system, classification is performed by an SVM, and the efficiency of the system is assessed through a number of experiments on the KDD Cup'99 dataset. The experimental results show that suitable pre-processing of the dataset considerably reduces the time taken to build the SVM model. The False Positive Rate (FPR) decreases and the attack detection rate of the SVM increases, so the pre-processing combined with the classification algorithm gives the highest accuracy.

Implementation Environment
The intrusion detection system was implemented in MATLAB R2015a, and the experiments were run on the Windows 7 operating system with a Core i7-Duo CPU 2670 processor at 2.5 GHz and 8 GB of RAM.


Introduction
With the growing use of networked computers for critical systems and the widespread use of large, distributed computer networks, concern about computer network security has risen, and network intrusions have become a serious risk in recent years. Intrusion detection systems (IDSs) have been widely used as a second line of defense for networked computer systems, alongside other network security methods such as access control and firewalls. The major aim of an IDS is to detect illegal use, abuse, and misuse of computer systems by both system insiders and outside intruders. There are different methods for constructing an IDS. IDSs can be classified into two categories depending on the approach used to detect intrusions: misuse detection and anomaly detection [1,2,3]. Anomaly detection creates profiles of the usual behavior of users, system resources, operating systems, network services, and traffic, using the audit trails created by a network scanning program or a host operating system. It detects intrusions by identifying significant deviations from the normal behavior patterns in these profiles. Anomaly detection does not require prior knowledge of the security holes of the target systems, so it can detect not only known intrusions but also unknown ones. Moreover, it can identify intrusions that are carried out through the misuse of legitimate users' privileges or through masquerading, without violating the security policy [4,5,26]. The drawbacks of this method are its high false positive rate, the difficulty of handling gradual misbehavior, and its expensive computation [4,6,7]. Misuse detection, in contrast, identifies suspicious misuse signatures based on known system vulnerabilities and a security policy.
Misuse detection checks whether signatures of known attacks are present in the audit trails, and any matching behavior is treated as an attack. It identifies only previously known intrusion signatures. The benefit of this method is that it seldom fails to identify previously known intrusions, i.e., it has a lower false positive rate [5,8]. Its drawback is that it cannot identify new intrusions that have never been detected before, i.e., it has a higher false negative rate. It also has other disadvantages, such as the difficulty of building misuse signature bases and of creating and updating the signature rules for intrusions [4,8,9]. There are two types of IDS: the Network Intrusion Detection System (NIDS) and the Host-based Intrusion Detection System (HIDS) [9,10,27]. In our study we built an IDS based on data mining.
The search for evidence of attacks relies on information collected from known attacks; this approach is referred to as detection by appearance, or misuse detection. Anomaly detection searches for deviations from the pattern of normal behavior, based on monitoring the system in its ordinary state, and is referred to as anomaly detection or detection by behavior.

Soni and Sharma in 2014 [13]
suggested combining two techniques, an artificial neural network (ANN) and C5.0, with feature selection. Feature selection eliminates several irrelevant features, while C5.0 and the ANN act as classifiers that assign the input data either to the normal category or to an attack belonging to one of the five types. The KDD99 dataset is used to train and test the system; across different numbers of features, the C5.0 system produces improved results with nearly 100% accuracy. They also used the ANN approach to categorize intrusion data depending on partition size. A comparison shows that C5.0 performs better than the ANN and yields its best result with 36 features.

Zargar and Baghaie in 2012 [14]
proposed a category-based selection of effective parameters for intrusion detection using Principal Component Analysis (PCA). They use 32 basic features from the Transmission Control Protocol/Internet Protocol (TCP/IP) header, together with 116 derived features from TCP dump, selected from a network traffic dataset. Attacks are classified into four sets: User to Root (U2R), Denial of Service (DoS), Probing, and Remote to User (R2L) attacks. They used TCP dump data from the DARPA 1998 dataset as the selected dataset in their tests. PCA is used to determine an optimal feature set that speeds up the detection process. The experimental results show that feature reduction can improve the detection rate of the category-based detection method while keeping detection accuracy within an acceptable range. The KNN classification technique is used to classify the attacks. The results also show that feature reduction significantly speeds up training and testing for intrusion detection.

Mukkamala and Sung in 2003 [15]
studied feature selection for intrusion detection using two classes of learning machines for intrusion detection systems (IDSs): Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs). They show that SVMs are superior to ANNs in three critical respects for IDSs: SVMs train and run an order of magnitude faster; SVMs scale much better; and SVMs give higher classification accuracy. They also address the related issue of ranking the importance of input features, which is itself a problem of great interest. Since eliminating insignificant and/or useless inputs leads to a simplified problem and possibly faster and more accurate detection, feature selection is very important in intrusion detection. The experimental results show that SVM-based IDSs using a reduced number of features can deliver improved or comparable performance. Finally, they propose an SVM-based IDS for detecting a specific category of attack.

Zhu et al., 2005 [16]
proposed RICGA (ReliefF Immune Clonal Genetic Algorithm), a combined feature subset selection method based on the ReliefF algorithm, the immune clonal selection algorithm, and a genetic algorithm (GA), with a BP network as the classifier. RICGA achieves higher classification accuracy (86.47%) with a smaller feature subset than ReliefF-GA. The selected features are not listed in the paper. In the RICGA method, ReliefF is first used to remove irrelevant features, and an improved genetic algorithm then produces the final feature subset. The authors carefully analyze the Markov chain model of the RICGA algorithm and its convergence. Experimental results on the real KDD CUP'99 dataset show that RICGA is superior to ReliefF-GA and GA in classification accuracy and in the size of the input feature subset.

Ming-Yang Su (2011) [17]
proposed a feature selection approach for identifying DoS/DDoS attacks, aimed at designing a real-time anomaly-based Network Intrusion Detection System (NIDS). A genetic algorithm (GA) combined with KNN (k-nearest-neighbor) is used for feature weighting and selection. The result of KNN classification serves as the fitness function of the GA, which optimizes the feature weight vector. Initially, 35 features are weighted in the training phase. The top 19 features are considered for known attacks and the top 28 features for unknown attacks. The extracted features are not listed in the paper. An overall accuracy of 97.42% is obtained for known attacks and 78% for unknown attacks.

Data Mining and Intrusion Detection Systems
Intrusion detection systems (IDSs) have traditionally relied on a description of an attack's characteristics and on tracking system activity to check whether it matches that description. IDSs based on data mining are now appearing with greater capability, and data mining approaches are commonly employed in intrusion detection applications these days. The intrusion detection problem has been reduced to a data mining task of classifying data. In brief, given a set of data points belonging to various classes (normal activity and different attacks), the aim is to separate them as accurately as possible by means of a model. Many data mining approaches exist for intrusion detection classification. In this system, we use a Support Vector Machine (SVM) as the classification algorithm for attack detection. We also use feature extraction and dimensionality reduction algorithms (PCA, LDA, and SVD) on the KDD Cup'99 dataset.

Design and Implementation of the Proposed System
The proposed system, designed to detect and recognize intrusions efficiently, is described as follows:

Figure 1 Intrusion detection classification approach of the proposed system
The aim of the analysis is to improve the performance of the intrusion detection system. The input to the proposed system is the KDD Cup 99 dataset, which requires pre-processing that converts all data into a uniform format. Feature reduction is then performed to extract and reduce the features. Finally, the intrusion classification stage assigns records to the different kinds of intrusion using the Support Vector Machine (SVM) classification algorithm. Because the KDD Cup 99 dataset contains both symbolic and numeric attributes, two kinds of transformation technique are used for these attributes. The two machine learning procedures are trained on both kinds of transformed dataset, and their results are then compared with respect to the accuracy of intrusion detection. The proposed system consists essentially of two tasks: feature reduction and attack detection.
The steps of our proposed intrusion detection system are shown in Figure (1), which includes the main parts: the input KDD'99 dataset, dataset pre-processing, dimensionality reduction and feature selection, the classification algorithm, and performance measurement.

KDD'99 Input Dataset
In the first phase, the proposed intrusion detection system takes the KDD Cup 99 dataset, with its full set of records, as input. In our proposed system we used the entire KDD Cup 99 dataset. Each record has 42 features, and each record is labeled as either attack or normal.
The KDD CUP 1999 [18] benchmark dataset is used to evaluate various feature selection techniques for intrusion detection systems. The dataset contains 4,940,000 connection records. Each connection is labeled as either normal or an attack, with exactly one specific attack type, which falls into one of four attack categories [19]: User to Root (U2R) attacks, Remote to Local (R2L) attacks, Denial of Service (DoS) attacks, and Probing attacks.

Denial of Service (DoS) Attack:
Attacks of this category deprive the host or legitimate users of the use of a service or resources.

Probe Attack:
These attacks automatically scan a network of computers or a DNS server to find valid IP addresses.

Remote to Local (R2L) Attack:
In this attack category, an attacker who does not have an account on a victim machine gains local access to the machine and modifies the data.

KDD'99 Pre-processing
KDD'99 pre-processing is the second phase and one of the most important phases of the system. This stage prepares the data to be passed to the next phase for feature extraction and reduction. It consists of two steps: dataset labeling and normalization. The following subsections describe these steps in detail.

Dataset Labeling
Dataset labeling is the first step of the KDD'99 pre-processing phase, and it is an important one. Its output is used as the input to the next pre-processing step (normalization). Labeling is done using all the features of the KDD 10% corrected dataset, as displayed in the screenshot, where it is located in the second cell of the entire dataset. Figure (2) is a screenshot of the KDD 99 dataset taken from our MATLAB environment.

Figure 2
First row of the KDD Cup 10% corrected dataset (data sample)

Each dataset record includes 42 features (e.g., service, protocol type, and flag) and is labeled as either normal or an attack, with the specific attack type also indicated, as presented in Figure (2). As an example, consider the first row of the KDD 99 dataset before normalization. Figure (2) makes clear that the number of features is 42, including the specific attack category. There is another issue in this step: the dataset contains many nominal values, such as HTTP, SF, and ICMP. Consequently, this step first transforms all nominal values into numeric values. For instance, the protocol type "tcp" is mapped to 1, "udp" to 2, and "icmp" to 3; Table (3) shows the full transformation of the nominal feature values into numeric values. Figure (3) shows the original KDDCUP1999 dataset before and after the transformation.

Figure 3
Pre-processing: the original KDDCUP1999 dataset before and after transformation
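The nominal-to-numeric transformation described above can be sketched in Python as follows. This is only an illustration, not the authors' MATLAB code: the three protocol codes come from the text, while the helper names (`build_codes`, `encode_record`), the toy records, and the "number values in order of first appearance" rule for the service and flag columns are assumptions made for the example.

```python
# Protocol codes stated in the text; everything else here is illustrative.
PROTOCOL_CODES = {"tcp": 1, "udp": 2, "icmp": 3}


def build_codes(values):
    """Assign 1, 2, 3, ... to nominal values in order of first appearance."""
    codes = {}
    for value in values:
        if value not in codes:
            codes[value] = len(codes) + 1
    return codes


def encode_record(record, nominal_columns):
    """Return a copy of `record` with nominal columns replaced by integers.

    `nominal_columns` maps a column index to its value-to-code dictionary.
    """
    encoded = list(record)
    for index, codes in nominal_columns.items():
        encoded[index] = codes[record[index]]
    return encoded


# Toy records (duration, protocol_type, service, flag); not real KDD rows.
records = [
    (0, "tcp", "http", "SF"),
    (0, "udp", "domain_u", "SF"),
    (0, "icmp", "ecr_i", "SF"),
    (2, "tcp", "smtp", "REJ"),
]
nominal_columns = {
    1: PROTOCOL_CODES,
    2: build_codes(r[2] for r in records),  # hypothetical service codes
    3: build_codes(r[3] for r in records),  # hypothetical flag codes
}
encoded = [encode_record(r, nominal_columns) for r in records]
```

In practice the code tables would be fixed once from the training data (as in Table (3)) and reused for every record, so that the same nominal value always maps to the same number.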

Normalization
After labeling the whole feature space of the dataset, we normalize the dataset using the entire KDD 10% corrected dataset, as shown in the screenshot, where it is located in the second cell of the whole dataset.
As an input dataset, KDD'99 includes a number of features in different formats: some are numeric and others are character strings. In this stage, the mixed-format dataset is therefore transformed into a single format so that it can be passed to the next phases. Because some KDD CUP 99 features are continuous, normalization is applied to these features to make them more suitable for the data mining (DM) classification algorithms. Normalization is used to pre-process the data: the feature values are scaled to fall within a small specified range, for example 0.0 to 1.0 or -1.0 to 1.0. Normalizing the input values of every feature measured in the training patterns helps speed up the learning phase.
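A minimal sketch of this scaling step is given below, assuming min-max normalization (the text states only the target ranges, not the exact formula used). The function name `min_max_normalize` and the toy rows are illustrative.

```python
def min_max_normalize(rows, low=0.0, high=1.0):
    """Scale every column of `rows` linearly into the range [low, high].

    A column whose values are all equal carries no information and is
    mapped to `low` to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    minima = [min(column) for column in columns]
    maxima = [max(column) for column in columns]
    normalized = []
    for row in rows:
        scaled = []
        for value, col_min, col_max in zip(row, minima, maxima):
            if col_max == col_min:
                scaled.append(low)
            else:
                ratio = (value - col_min) / (col_max - col_min)
                scaled.append(low + ratio * (high - low))
        normalized.append(scaled)
    return normalized


# Toy numeric rows (already encoded); not real KDD records.
rows = [[0.0, 181.0, 1.0], [2.0, 239.0, 1.0], [4.0, 297.0, 1.0]]
scaled_01 = min_max_normalize(rows)                 # range 0.0 to 1.0
scaled_pm1 = min_max_normalize(rows, -1.0, 1.0)     # range -1.0 to 1.0
```

The minima and maxima should be computed on the training data and reused for the test data, so that both are scaled consistently.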

Feature Extraction and Dimensionality Reduction of the KDD99
Feature extraction and dimensionality reduction are done by eliminating redundant and irrelevant features. Irrelevant features are those that have little correlation with the class labels. Redundant features are those that are strongly correlated with features already selected. In the proposed system we use three different algorithms: Singular Value Decomposition (SVD), Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA). We use these techniques to extract appropriate features from the dataset and to reduce the dimensionality of the KDD data; the result is then given as input to the next step.

Principal Component Analysis (PCA)
PCA is a useful statistical technique that has found application in fields such as image compression and face recognition, and it is a common method for finding patterns in high-dimensional data. The whole subject of statistics is based on the idea that one has a large dataset and wants to analyze it in terms of the relationships between the individual points in that set [20]. The objective of PCA is to reduce the dimensionality of the data while preserving as much as possible of the variance present in the original dataset. It is a way of identifying patterns in data and of expressing the data so as to highlight their similarities and differences [21].
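To make the idea concrete, the sketch below computes the first principal component of a tiny two-feature dataset: it centers the data, builds the sample covariance matrix, and finds the direction of largest variance by power iteration. It is a toy illustration in plain Python, not the implementation used in the experiments; the function names and the data are assumptions, and a real system would use a numerical library and keep several components.

```python
import math


def covariance_matrix(rows):
    """Return (sample covariance matrix, column means) for a list of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, means


def first_principal_component(cov, iterations=100):
    """Leading unit eigenvector of `cov`, found by power iteration.

    Assumes the all-ones start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    vector = [1.0] * d
    for _ in range(iterations):
        product = [sum(cov[i][j] * vector[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    return vector


def project(rows, means, component):
    """Coordinate of each centered row along `component` (one value per row)."""
    return [
        sum((row[j] - means[j]) * component[j] for j in range(len(component)))
        for row in rows
    ]


# Toy data lying exactly on the line y = 2x, so one component explains it all.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
cov, means = covariance_matrix(rows)
component = first_principal_component(cov)
scores = project(rows, means, component)
```

Here the two original features collapse into a single score per record with no loss of variance, which is the dimensionality reduction that PCA aims for on a larger scale.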

Singular-Value Decomposition (SVD)
Another approach used in this system is Singular-Value Decomposition (SVD). Figure (4) shows the form of an SVD. Let X be an m × n matrix, and let the rank of X be r. The rank of a matrix is the largest number of rows (or, equivalently, columns) that can be chosen such that no nonzero linear combination of them is the all-zero vector 0; such a set of rows or columns is said to be independent. Figure (4) also shows the matrices U, Σ, and V, which have the following properties:

Figure 4 The form of a singular-value decomposition

1. U is an m × r column-orthonormal matrix; that is, each of its columns is a unit vector and the dot product of any two columns is 0.
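For reference, the standard (compact) form of the decomposition can be written as below, using the same symbols X, U, Σ, V, m, n, and r as the text. The second line shows one common way of using SVD for dimensionality reduction, namely keeping the k largest singular values; the symbols k and Z, and the assumption that the proposed system reduces features in this way, are introduced here for illustration only.

```latex
X = U \Sigma V^{\mathsf{T}}, \qquad
U \in \mathbb{R}^{m \times r},\quad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r),\quad
V \in \mathbb{R}^{n \times r},
\qquad
U^{\mathsf{T}} U = V^{\mathsf{T}} V = I_r, \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0

X_k = U_k \Sigma_k V_k^{\mathsf{T}}, \qquad
Z = X V_k = U_k \Sigma_k \in \mathbb{R}^{m \times k}, \qquad k \le r
```

Each row of Z is then a k-dimensional representation of the corresponding record in X.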

Linear Discriminant Analysis (LDA)
Linear discriminant analysis (LDA) is another technique used for dimensionality reduction and feature extraction. LDA seeks to reduce dimensionality while preserving as much of the class-discriminatory information as possible.
Step 1: Compute the between-class scatter using the complete feature samples.
Step 2: Calculate the total class scatter matrix.
Step 3: Compute the eigenvalues and eigenvectors using the eigen equation for LDA:

$S_T X = \lambda S_i X$

Step 4: Compute the eigenvectors X1, X2, X3, …, XN corresponding to the eigenvalues, where N is the dimensionality of the feature vector in our case.
Step 5: Evaluate the contribution of each feature vector.
Step 6: Sort the feature vectors in descending order of their impact or contribution.
Step 7: The dimensionality reduction phase based on the largest eigenvalues is skipped as the selection of optimum subsets of linear components.

Classification: Support Vector Machine (SVM) Algorithm
The Support Vector Machine (SVM) is a set of machine learning methods used for classification and regression. SVM is based on the idea of decision planes that define decision boundaries. A decision plane is one that separates sets of objects having different class memberships. Our proposed intrusion detection system relies on dimensionality reduction by the PCA, SVD, and LDA algorithms, each of which is paired with a single classification algorithm, the SVM.
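The decision-plane idea can be illustrated with a very small linear SVM trained by sub-gradient descent on the regularized hinge loss. This sketch is an assumption-laden toy: the paper's experiments were run in MATLAB and the kernel, multi-class scheme, and parameters are not stated, so the two-class setup (+1 for attack, -1 for normal), the function names, the hyperparameters `lam`, `learning_rate`, and `epochs`, and the data are all hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.01, learning_rate=0.1, epochs=200):
    """Fit weights and bias of a linear SVM by batch sub-gradient descent.

    Minimizes lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))).
    Labels must be +1 or -1.
    """
    n, d = len(samples), len(samples[0])
    weights = [0.0] * d
    bias = 0.0
    for _ in range(epochs):
        grad_w = [lam * w for w in weights]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            if margin < 1.0:  # sample lies inside the margin or is misclassified
                for j in range(d):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def predict(weights, bias, x):
    """Return +1 (attack) or -1 (normal) according to the side of the plane."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Toy, linearly separable data in a 2-D reduced feature space.
samples = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0)]
labels = [1, 1, -1, -1]
weights, bias = train_linear_svm(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

The learned plane w·x + b = 0 separates the two classes; classifying the five KDD classes would additionally need a multi-class scheme such as one-versus-rest, which is not shown here.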

Performance Evaluation
The efficiency of the intrusion detection system is evaluated by its ability to make accurate predictions. According to the real nature of a given event compared with the prediction of the IDS, four possible outcomes are presented in Table (4), known as the confusion matrix [4]. The True Positive Rate (TPR) or Detection Rate (DR), the True Negative Rate (TNR), the False Positive Rate (FPR) or False Alarm Rate (FAR), and the False Negative Rate (FNR) are measures that can be applied to quantify the performance of an IDS [4] based on the confusion matrix. We obtained the accuracy, or recognition rate, which is defined as the ratio of the number of correct recognition decisions to the total number of decisions.
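The measures above follow directly from the four confusion-matrix counts. The short sketch below uses the standard definitions, with "attack" as the positive class; the function names and the toy label lists are illustrative and do not reproduce the paper's results.

```python
def confusion_counts(actual, predicted, positive="attack"):
    """Count true/false positives and negatives for a binary labeling."""
    tp = tn = fp = fn = 0
    for truth, guess in zip(actual, predicted):
        if truth == positive and guess == positive:
            tp += 1
        elif truth == positive:
            fn += 1  # attack missed by the IDS
        elif guess == positive:
            fp += 1  # normal traffic flagged as an attack (false alarm)
        else:
            tn += 1
    return tp, tn, fp, fn


def ids_metrics(tp, tn, fp, fn):
    """Standard rates derived from the confusion matrix."""
    return {
        "TPR": tp / (tp + fn),  # detection rate (DR)
        "TNR": tn / (tn + fp),
        "FPR": fp / (fp + tn),  # false alarm rate (FAR)
        "FNR": fn / (fn + tp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Toy ground truth and predictions, not experimental data.
actual = ["attack", "attack", "attack", "normal",
          "normal", "normal", "normal", "attack"]
predicted = ["attack", "attack", "normal", "normal",
             "normal", "attack", "normal", "attack"]
counts = confusion_counts(actual, predicted)
metrics = ids_metrics(*counts)
```

Note that TPR + FNR = 1 and TNR + FPR = 1, so reporting DR and FPR together with accuracy is enough to recover the other two rates.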

Results and discussion
Tables (5) and (6) show the overall performance of the Support Vector Machine (SVM) on the KDD Cup 99 dataset for training and testing, using the three algorithms (PCA, LDA, and SVD) employed in our system.

Experimental Results Using the Whole Dataset
Big data analysis is a major challenge these days, since it involves dealing with a huge number of data samples (records). In our proposed system we designed an approach to deal with the intrusion detection classification problem using a large number of data samples (494,021) with the full set of features (42). To carry out the classification of the 10% KDD dataset, we propose segmenting the data into folds. In this part, we present the experimental results obtained with all 494,021 data samples. These are the results of the proposed intrusion detection classification system with the different feature reduction algorithms on the KDD Cup 99 dataset.
Each record in the dataset is labeled with one of five classes (normal or one of the four attack categories). Analyzing 494,021 records is a big data task, especially in our proposal, which uses three different algorithms for dimensionality reduction and feature selection. Each one is used to reduce the 42 features of the KDD dataset, and two classification algorithms are used to detect the four types of attack.
Our approach to this kind of big data analysis is to divide the entire dataset (494,021 samples) into k folds. Each fold contains n samples from the dataset, and samples are assigned so that no two folds share a sample.
In the experiments, we divided the dataset into 25 folds. The first 24 folds each contain 20,000 samples and the last one contains 14,021 samples, as shown in Table (7) below. The experimental results were examined and discussed to illustrate the proposed IDS. We describe three major parts. The first part covers the essential features selected by the three algorithms from the entire feature space of 42 features. The second part explains the results of the dimensionality reduction and feature selection algorithms with the selected feature space (7). The last part compares the experimental results of the proposed IDS with previous work.
In this implementation, we rely on the eigenvalue score to reorder the features from the highest score (the most significant) to the lowest.
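The fold segmentation can be sketched as a simple split into consecutive, non-overlapping index ranges. The fold sizes reported above (24 folds of 20,000 samples and one of 14,021) sum to 494,021 records, which is the total used in this example; the function name `split_into_folds` is illustrative, and whether the original folds were consecutive or randomly drawn is not stated, so consecutive ranges are an assumption.

```python
def split_into_folds(n_samples, fold_size):
    """Return (start, stop) index ranges of consecutive, disjoint folds.

    Every fold has `fold_size` samples except possibly the last one,
    which holds the remainder.
    """
    return [
        (start, min(start + fold_size, n_samples))
        for start in range(0, n_samples, fold_size)
    ]


folds = split_into_folds(494021, 20000)
sizes = [stop - start for start, stop in folds]
```

Because the ranges are disjoint and cover every index exactly once, no sample appears in more than one fold, which matches the requirement stated in the text.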

Classification Experimental Results
To classify the kinds of attack in the 10% KDD Cup 99 dataset, we used the Support Vector Machine (SVM) classification algorithm in this phase. The algorithm was applied to the reduced-dimension feature space.

Comparing our Classification Results
We compare the performance of the SVM classifier on the whole dataset with all three dimensionality reduction methods that we proposed. When the SVM classifier is used to categorize all data samples by attack type, our method with PCA, LDA, and SVD gives high accuracy in both training and testing.

Figure 7
Accuracy of Support Vector Machine (SVM) classification for attack detection with the three dimensionality reduction algorithms

Comparing our Classification Results with Other Studies
A great number of studies have classified attack types using the 10% KDD Cup 99 dataset. In this part, we compare our results, obtained with the reduction algorithms (PCA, LDA, and SVD), with other studies carried out on the same dataset. Table (9) gives a brief comparison between the results of our proposed system and those of other methods in terms of overall training and testing accuracy.

Conclusion
Today, a large number of threats attack network and information security. In this paper, we proposed an intrusion detection system that reduces the feature set and classifies attack types. Feature reduction is performed first, followed by classification, so the proposed algorithm combines feature selection with classification. Reducing the features of the intrusion detection system increased the attack detection rate of the SVM classification algorithm, which gave the highest accuracy. Attacks in the KDD Cup 1999 dataset are identified with a low error rate and high accuracy. Both feature selection and feature reduction affected the performance of the classification algorithm. In future work, a swarm optimization function could dynamically reduce the number of unused feature attributes in traffic data.