Building a very good classifier is hard, but building a weak one is easy (it only needs to be slightly better than random chance). So, can we combine several weak classifiers to build a better one?
Perhaps we can take a weighted average of their outputs and then apply an activation function?
Let's say we have classifiers $h_1, h_2, \dots, h_T$; then the weighted average becomes

$$f(x) = \sum_{t=1}^{T} \alpha_t h_t(x),$$

where $\alpha_t$ is the voting weight assigned to classifier $h_t$. Then the final prediction is $\hat{y} = \operatorname{sign}(f(x))$.
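As a concrete illustration, here is a minimal sketch of this weighted vote for binary labels in $\{-1, +1\}$, with $\operatorname{sign}$ as the activation. The decision stumps and the weights $\alpha_t$ are made up for the example; in practice the weights would be learned by a boosting algorithm.

```python
import numpy as np

# Hypothetical weak learners: each maps a feature vector to a label in {-1, +1}.
def stump_1(x):  # splits on feature 0
    return 1 if x[0] > 0.5 else -1

def stump_2(x):  # splits on feature 1
    return 1 if x[1] > 0.3 else -1

def stump_3(x):  # splits on feature 0 with a different threshold
    return 1 if x[0] > 0.8 else -1

def weighted_vote(classifiers, alphas, x):
    """Combine weak classifiers via a weighted sum f(x), then take the sign."""
    f = sum(alpha * h(x) for alpha, h in zip(alphas, classifiers))
    return int(np.sign(f))

classifiers = [stump_1, stump_2, stump_3]
alphas = [0.8, 0.5, 0.2]  # illustrative voting weights

x = np.array([0.6, 0.1])
# f(x) = 0.8*(+1) + 0.5*(-1) + 0.2*(-1) = 0.1, so the prediction is +1:
print(weighted_vote(classifiers, alphas, x))
```

Note that even though two of the three stumps vote $-1$ here, the heavily weighted stump wins, which is exactly the point of the weights $\alpha_t$.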
Some ensemble learning examples:
- AdaBoost
- XGBoost
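For instance, here is a quick sketch of running AdaBoost via scikit-learn (assuming it is installed); the toy dataset and hyperparameters are arbitrary choices for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Toy binary classification dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost's default weak learner is a depth-1 decision tree (a stump);
# it learns the voting weights alpha_t for 50 such stumps.
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```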