Voting in Ensemble Learning
Ensemble learning is a machine learning technique that works much like gathering feedback from a group of people to arrive at a fair review of an article you have just written: instead of relying on a single opinion, you combine many. The idea is to train multiple learners and combine their predictions.
A group of predictors is called an ensemble. The technique is called ensemble learning, and an ensemble learning algorithm is called an ensemble method.
The individual models that an ensemble combines are called base learners.
There are several ensemble methods:
1. Voting
2. Bagging
3. Boosting
4. Stacking
5. Clustering
In this article, I will go into detail about Voting.
Voting
In voting, the predictions of the base classifiers are aggregated, and the class with the most votes is selected as the final prediction. The idea behind Scikit-Learn's VotingClassifier is to combine conceptually different machine learning classifiers (such as Decision Trees, Random Forests, Logistic Regression, and SVMs) and use either a majority vote or the average of the predicted probabilities to predict the class labels. Such a classifier can be useful for a set of equally well-performing models, because it balances out their individual weaknesses.
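As a rough illustration of the idea above, here is a minimal sketch that builds a VotingClassifier from three different classifiers. The synthetic dataset from make_classification, the choice of estimators, and all hyperparameters are assumptions made for this example only; they are not taken from the article's own sample code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only so the sketch runs end to end (an assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Conceptually different base classifiers, each given a short name.
estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("dt", DecisionTreeClassifier(random_state=42)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
]

# voting="hard" takes a majority vote over predicted labels;
# voting="soft" averages the predicted class probabilities.
hard_clf = VotingClassifier(estimators=estimators, voting="hard")
soft_clf = VotingClassifier(estimators=estimators, voting="soft")

hard_clf.fit(X_train, y_train)
soft_clf.fit(X_train, y_train)

hard_accuracy = hard_clf.score(X_test, y_test)
soft_accuracy = soft_clf.score(X_test, y_test)
print("hard voting accuracy:", hard_accuracy)
print("soft voting accuracy:", soft_accuracy)
```

The accuracies depend on the data and the chosen models, so treat the printed numbers as a sanity check rather than a benchmark.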
There are two main types of voting:
1. Hard Voting Classifier
The hard voting classifier is a majority vote classifier. The predicted class label for a sample is the label predicted most often by the individual classifiers (the mode of their predictions). Even if each classifier is a weak learner (only slightly better than random guessing), the ensemble can be a strong learner with higher accuracy, provided there is a sufficient number of weak learners and they are sufficiently diverse. One way to get diverse classifiers is to train them with different algorithms: they then tend to make different types of errors, which the vote can cancel out, improving the overall accuracy.
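To see why many weak but diverse classifiers can form a strong ensemble, the sketch below computes the probability that a majority vote is correct. It assumes an idealized setting that real models rarely meet: a binary problem, an odd number of classifiers, and classifiers whose errors are fully independent, each with the same accuracy. The function name majority_vote_accuracy is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_classifiers: int, p_correct: float) -> float:
    """Probability that a strict majority of classifiers is correct.

    Assumes a binary task and fully independent classifiers, each correct
    with probability p_correct (an idealization, not a property of real models).
    """
    majority = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k)
        * p_correct**k
        * (1 - p_correct) ** (n_classifiers - k)
        for k in range(majority, n_classifiers + 1)
    )


# Each classifier is only slightly better than a coin flip (51% accurate).
for n in (1, 11, 101, 1001):
    print(n, round(majority_vote_accuracy(n, 0.51), 3))
```

Under these assumptions the ensemble accuracy grows with the number of classifiers. In practice, classifiers trained on the same data make correlated errors, so the real gain is smaller.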
In the case of a tie, the VotingClassifier selects the class based on ascending sort order. For example, in the following scenario
- classifier 1 -> class 2
- classifier 2 -> class 1
the class label 1 will be assigned to the sample.
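The majority vote and the tie-breaking rule described above can be sketched in a few lines of plain Python. The helper hard_vote is a hypothetical name used for illustration; it mirrors the described behavior and is not Scikit-Learn's actual implementation.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the most frequent label; break ties by the smallest label."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Among the labels with the highest vote count, pick the first one
    # in ascending sort order.
    return min(label for label, count in counts.items() if count == top)


print(hard_vote([2, 1]))     # tie between 1 and 2 -> 1
print(hard_vote([0, 1, 1]))  # clear majority -> 1
```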
2. Soft Voting Classifier
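If all the classifiers can estimate class probabilities (in Scikit-Learn, this means they have a predict_proba() method), the ensemble can predict the class with the highest class probability, averaged over all the individual classifiers. This is called soft voting. It often performs better than hard voting because averaging probabilities gives more weight to highly confident votes. The average can also be weighted so that some classifiers count more than others.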
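Here is a minimal sketch of soft voting in plain Python, showing how one highly confident classifier can outweigh two lukewarm ones. The function soft_vote and the probability values are made up for illustration; they do not come from a trained model.

```python
def soft_vote(probas, weights=None):
    """Average per-class probabilities and return (best_class, averages).

    probas: one list of class probabilities per classifier.
    weights: optional per-classifier weights (defaults to equal weights).
    """
    if weights is None:
        weights = [1.0] * len(probas)
    total = sum(weights)
    n_classes = len(probas[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: averaged[c])
    return best, averaged


# Two classifiers lean slightly toward class 0; one is very sure of class 1.
probas = [
    [0.55, 0.45],
    [0.60, 0.40],
    [0.05, 0.95],
]
label, averaged = soft_vote(probas)
print(label, averaged)
```

A hard vote on the same three classifiers would pick class 0 (two votes to one), while the averaged probabilities favor class 1. Passing weights such as [5, 5, 1] shifts the result back toward the first two classifiers.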
Please check out my GitHub page for sample code and a dataset for hard and soft voting.
Reference
https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1492032646