Random naive Bayes

Random naive Bayes extends the naive Bayes classifier by adopting two random forest principles: random input selection (bagging, i.e. bootstrap aggregating) and random feature selection ([Breiman, 2001]).

Naive Bayes classifier

Naive Bayes is a probabilistic classifier simplifying Bayes' theorem by "naively" assuming class conditional independence. Although this assumption leads to biased posterior probabilities, the ordered probabilities of Naive Bayes result in a classification performance comparable to that of classification trees and neural networks ([Langley, Iba and Thompson, 1992]). Notwithstanding Naive Bayes' popularity due to its simplicity combined with high accuracy and speed, its conditional independence assumption rarely holds. There are two main approaches to alleviating this naivety:
1) Selecting attribute subsets in which attributes are conditionally independent (cf. the Selective Bayesian Classifier [Langley and Sage, 1994]).
2) Extending the structure of Naive Bayes to represent attribute dependencies (cf. AODE [Webb et al., 2005]).
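
Under the class conditional independence assumption, the naive Bayes posterior for a class c given attributes x_1, ..., x_p factorises into one term per attribute. The display below uses standard naive Bayes notation (it is not taken from any of the cited papers) and makes the "naive" simplification explicit:

P(c \mid x_1, \ldots, x_p) \;\propto\; P(c) \prod_{j=1}^{p} P(x_j \mid c)

The predicted class is the argmax of this product over c, which is why only the ordering of the posterior probabilities, not their calibration, matters for classification accuracy.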

Random naive Bayes' alleviation of the class conditional independence assumption

Random Naive Bayes adopts the first approach by randomly selecting a subset of attributes in which attributes are assumed to be conditionally independent. Naive Bayes' performance might benefit from this random feature selection. Analogous to AODE, Random Naive Bayes builds an ensemble, but unlike AODE, the ensemble combines zero-dependence classifiers.
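
As an illustration of this idea, each ensemble member can be written as an ordinary naive Bayes model restricted to its own randomly drawn attribute subset; here F_b denotes the subset of attributes assigned to the b-th member (notation introduced here for convenience, not taken from the cited papers):

P_b(c \mid \mathbf{x}) \;\propto\; P(c) \prod_{j \in F_b} P(x_j \mid c)

Each member thus only assumes conditional independence within its subset F_b, and the ensemble averages over many such subsets.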

Random naive Bayes and random forest

Generalizing Random Forest to Naive Bayes, Random Naive Bayes (Random NB) is a bagged classifier combining a forest of B Naive Bayes classifiers. The b-th Naive Bayes classifier is estimated on a bootstrap sample S_b with m randomly selected features. To classify an observation, the input vector is put down each of the B Naive Bayes classifiers in the forest, and each classifier generates posterior class probabilities. Unlike Random Forest, the predicted class of the ensemble is determined by adjusted majority voting rather than majority voting, because each Naive Bayes classifier delivers continuous posterior probabilities rather than a single vote. As in Random Forest, the importance of each feature is estimated on the out-of-bag (oob) data.
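
A minimal sketch of this procedure is given below, assuming numeric features and using scikit-learn's GaussianNB as the base learner; the function names (fit_random_nb, predict_random_nb) and defaults (e.g. m equal to the square root of the number of features, by analogy with random forests) are illustrative choices, not taken from the cited papers.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def fit_random_nb(X, y, n_estimators=100, m=None, random_state=0):
    # Fit B naive Bayes members, each on a bootstrap sample S_b of the rows
    # and a random subset of m features (illustrative default: sqrt(p)).
    rng = np.random.default_rng(random_state)
    n, p = X.shape
    m = m if m is not None else max(1, int(np.sqrt(p)))
    ensemble = []
    for _ in range(n_estimators):
        rows = rng.integers(0, n, size=n)              # bootstrap sample (with replacement)
        feats = rng.choice(p, size=m, replace=False)   # random feature subset
        nb = GaussianNB().fit(X[rows][:, feats], y[rows])
        ensemble.append((nb, feats))
    return ensemble

def predict_random_nb(ensemble, X, classes):
    # "Adjusted majority voting": average the members' posterior class
    # probabilities instead of counting hard votes, then take the argmax.
    proba = np.zeros((X.shape[0], len(classes)))
    col = {c: i for i, c in enumerate(classes)}
    for nb, feats in ensemble:
        member_proba = nb.predict_proba(X[:, feats])
        for j, c in enumerate(nb.classes_):            # align to the global class order
            proba[:, col[c]] += member_proba[:, j]
    proba /= len(ensemble)
    return classes[np.argmax(proba, axis=1)], proba

# Example usage on a toy data set:
# from sklearn.datasets import load_iris
# X, y = load_iris(return_X_y=True)
# ensemble = fit_random_nb(X, y, n_estimators=50, m=2)
# labels, proba = predict_random_nb(ensemble, X, np.unique(y))

Out-of-bag feature importance can be estimated in the same spirit as for random forests, by scoring each member on the rows that did not enter its bootstrap sample and measuring the drop in accuracy when a feature's values are permuted; this part is omitted from the sketch for brevity.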

References

* Breiman, L. (2001). Random Forests, Machine Learning, 45(1), 5–32.
* Langley, P., Iba, W. and Thompson, K. (1992). An Analysis of Bayesian Classifiers, Proceedings of the Tenth National Conference on Artificial Intelligence, AAAI Press, 223–228.
* Langley, P. and Sage, S. (1994). Induction of selective Bayesian Classifiers, Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, Seattle, WA: Morgan Kaufmann.
* Prinzie, A. and Van den Poel, D. (2007). Random Multiclass Classification: Generalizing Random Forests to Random MNL and Random NB, DEXA 2007, Lecture Notes in Computer Science, 4653, 349–358. http://dx.doi.org/10.1007/978-3-540-74469-6_35
* Video lecture on Random Naive Bayes and Random MultiNomial Logit: http://www.crm.ugent.be/Video_Lecture_UGent_Random_Multinomial_Logit_Generalizing_RF_to_other_methods_Prinzie_and_Van_den_Poel.htm

See also

* Naive Bayes classifier
* Random forest
* Random multinomial logit

