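Abstract: |
Single classification methods perform poorly on minority (small-sample) network flows when training samples are insufficient.
To address this, the adaptive boosting (Adaptive Boosting, AdaBoost) algorithm is applied to traffic classification. The algorithm first uses the CFS (Correlation-based Feature Selection) method to extract a small number of effective classification features from a large set of network flow features.
On this basis, AdaBoost combines five single classification methods, including decision tree, association rule, and Bayesian methods, to classify the traffic. Tests on real network traffic data show that the AdaBoost-based ensemble method achieves the highest precision among the selected algorithms, reaching 98.92%, and that, compared with single classification algorithms, the ensemble method markedly improves the classification of minority network flows. |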
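Key words: network traffic; traffic classification; correlation-based feature selection; adaptive boosting algorithm; ensemble classifier |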
DOI: |
|
Foundation item: Key Project of the Natural Science Basic Research Plan of Shaanxi Province (2012JZ8005) |
|
Ensemble classification over network traffic based on AdaBoost |
ZHAO Xiao-huan,XIA Jing-bo,LIAN Xiang-lei,LI Qiao-li |
Abstract: |
Single classification algorithms perform poorly on minority flows when the training dataset is insufficient. To address this, the AdaBoost (Adaptive Boosting) algorithm is introduced to classify network traffic. First, the CFS (Correlation-based Feature Selection) method selects a small set of effective classification features from a large number of flow features. AdaBoost then combines five single classification algorithms, drawn from the Decision Tree, Rules, and Bayes families, to classify the traffic.
Experiments on real network traffic show that the AdaBoost algorithm achieves the highest precision among the selected classification algorithms, up to 98.92%. Moreover, compared with single classification algorithms, it greatly improves the classification of minority flows. |
Key words: network traffic; traffic classification; correlation-based feature selection; adaptive boosting algorithm; ensemble classifier |
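
The boosting step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary labels (+1/-1) rather than multiple traffic classes, swaps in one-feature decision stumps for the five base classifiers the paper combines, omits the CFS feature-selection stage, and runs on an invented toy dataset. The function names (`train_stump`, `adaboost_fit`, `adaboost_predict`) are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity

def adaboost_fit(X, y, rounds=10):
    """Discrete AdaBoost for labels in {+1, -1}; returns [(alpha, stump), ...]."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)         # avoid division by zero
        if err >= 0.5:                     # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Invented toy data: one feature, labels that no single threshold separates.
X = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

for rounds in (1, 3):
    model = adaboost_fit(X, y, rounds=rounds)
    correct = sum(adaboost_predict(model, row) == label for row, label in zip(X, y))
    print(rounds, "round(s):", correct, "of", len(X), "training points correct")
```

On this toy set a single stump cannot fit every point, while three boosting rounds do. The reweighting step is what shifts attention toward samples that earlier learners misclassify, which is the mechanism the abstract relies on for minority flows.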