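Abstract: |
To improve the classification accuracy of the naive Bayesian classifier, the
attribute selection algorithm for naive Bayes and the parameter settings of the
prior distribution assumed for the attribute probability values are analyzed. It
is proposed that setting the parameters of the attribute prior distributions be
incorporated into the attribute selection process, and the specific adjustment
steps are studied for priors that follow the Dirichlet distribution and the
generalized Dirichlet distribution. Simulation experiments on data sets from the
UCI repository show that the method improves classification accuracy when the
prior follows a generalized Dirichlet distribution; for example, on the
Parkinsons data set the improvement reaches 13.32%. |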
Key words: naive Bayesian classifier; prior distribution; attribute selection method; generalized Dirichlet distribution |
DOI: |
|
Foundation item: |
|
Performance improvement of naive Bayesian classifier based on feature selection |
JIAO Peng,WANG Xin-zheng,XIE Peng-yuan |
Abstract: |
In order to improve the accuracy of the naive Bayesian classifier (NBC), the
selective naive Bayesian (SNB) method and the prior distributions of the
attributes are studied. A method that combines prior distribution setting with
feature selection is proposed; it finds the best prior for each attribute after
all attributes have been determined by the SNB algorithm. Experimental results
on 10 data sets from the UCI data repository show that this method with the
generalized Dirichlet prior generally achieves higher classification accuracy;
for example, on the Parkinsons data set the improvement reaches 13.32%. |
Key words: naive Bayesian classifier; prior distribution; feature selection algorithm; generalized Dirichlet distribution |