Efficient model intrusion detection system through feature selection based on decision tree algorithm


  • Johani Fauzi
  • Charles Lim
  • Eka Budiarto


In machine learning-based data analytics, feature selection is one of the most important data preparation tasks. This research combines feature selection techniques to identify the significant features of massive network traffic, which are then used to increase the accuracy of traffic anomaly detection while decreasing the time needed to complete the analysis. Filter-based and wrapper-based feature selection are the techniques most often used in intrusion detection system (IDS) research. This study combines the two by ranking the features according to their minimum weight values to select the relevant and significant ones, and then uses a decision tree classifier as the learning algorithm in experiments on two datasets, NSL-KDD and CIC-IDS2017, to determine the appropriate and powerful features. The experiments showed that the relevant and essential features obtained by combining Information Gain (IG) or Chi-Square (CS) ranking with forward selection (FS) substantially improve detection accuracy and yield a smaller feature set. On the NSL-KDD dataset, the combination of filter-based information gain, wrapper-based forward selection, and a decision tree (IG+FS+DT) selected nine of the 41 features (f2, f5, f30, f32, f33, f36, f37, f38, and f41) and achieved the highest accuracy, 98.98%. On the CIC-IDS2017 dataset, the CS+FS+DT combination selected five of the 80 features (f2, f17, f23, f65, and f68) and achieved the highest accuracy, 99.98%.
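The abstract gives no implementation details, so the following is only a minimal sketch of how a filter step followed by a wrapper step around a decision tree might look. It assumes categorical features, uses information gain for the filter ranking (a chi-square score could be substituted), and uses a tiny ID3-style tree as a stand-in for the decision tree classifier. The toy records, the labeling rule, and all function names (`information_gain`, `build_tree`, `filter_then_forward_select`, and so on) are hypothetical and are not taken from NSL-KDD, CIC-IDS2017, or the authors' code.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, col):
    """Entropy reduction obtained by splitting on categorical column `col`."""
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[row[col]].append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def build_tree(rows, labels, cols):
    """Tiny ID3-style tree that may only split on the columns in `cols`."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not cols:
        return majority  # leaf: a plain class label
    best = max(cols, key=lambda c: information_gain(rows, labels, c))
    rest = [c for c in cols if c != best]
    parts = defaultdict(lambda: ([], []))
    for row, y in zip(rows, labels):
        parts[row[best]][0].append(row)
        parts[row[best]][1].append(y)
    children = {v: build_tree(r, l, rest) for v, (r, l) in parts.items()}
    return (best, children, majority)  # internal node

def predict(tree, row):
    """Walk the tree; unseen feature values fall back to the node majority."""
    while isinstance(tree, tuple):
        col, children, majority = tree
        tree = children.get(row[col], majority)
    return tree

def accuracy(cols, train, valid):
    """Validation accuracy of a tree trained on the given columns only."""
    tree = build_tree(train[0], train[1], list(cols))
    hits = sum(predict(tree, r) == y for r, y in zip(*valid))
    return hits / len(valid[1])

def filter_then_forward_select(train, valid, top_k):
    """Filter: keep the top_k columns by information gain.
    Wrapper: greedily add the column that most improves tree accuracy."""
    rows, labels = train
    ranked = sorted(range(len(rows[0])),
                    key=lambda c: information_gain(rows, labels, c),
                    reverse=True)
    candidates, selected, best_acc = ranked[:top_k], [], 0.0
    while candidates:
        scored = [(accuracy(selected + [c], train, valid), c) for c in candidates]
        acc, col = max(scored, key=lambda s: s[0])  # ties go to the higher-ranked column
        if acc <= best_acc:
            break  # no remaining candidate improves accuracy
        selected.append(col)
        candidates.remove(col)
        best_acc = acc
    return ranked, selected, best_acc

# Made-up toy records: (protocol, flag, noise, constant).
records = [(p, f, n, "x")
           for p in ("tcp", "udp")
           for f in ("SF", "S0", "REJ")
           for n in ("a", "b", "c", "d")]

def toy_label(row):
    """Hypothetical labeling rule used only to generate this toy data."""
    protocol, flag = row[0], row[1]
    return "attack" if flag == "S0" or (flag == "REJ" and protocol == "udp") else "normal"

train_rows, valid_rows = records[0::2], records[1::2]
train = (train_rows, [toy_label(r) for r in train_rows])
valid = (valid_rows, [toy_label(r) for r in valid_rows])

ranked, selected, best_acc = filter_then_forward_select(train, valid, top_k=3)
print("filter ranking:", ranked)
print("selected columns:", selected, "validation accuracy:", best_acc)
```

In this sketch the filter step discards the lowest-ranked column before any tree is trained, and the wrapper step stops as soon as adding another column no longer improves validation accuracy, which is what keeps the final feature set small.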