Boosted Random Forest and Transfer Learning
Random forest is a multi-class classification method that offers high classification capability and enables high-speed learning and classification. It is attracting attention in many fields such as computer vision, pattern recognition, and machine learning. Since a random forest, a type of ensemble learning, performs classification by majority vote over a number of decision trees, a large number of trees must be constructed. This creates a trade-off: performance improves as the number of decision trees increases, but a large amount of memory is required, whereas if the number of trees is reduced, the classification performance of the individual trees cannot compensate and overall performance drops. We therefore propose a boosted random forest, which introduces a boosting algorithm into the random forest, as a high-performance learning method that requires fewer decision trees. We also propose a transfer forest, which introduces transfer learning into the random forest.
Boosted Random Forest
A boosted random forest introduces weights on the learning samples and constructs decision trees sequentially with a boosting algorithm. Because each new tree is built to complement the preceding trees on the reweighted learning samples, generalization performance can be maintained with a small number of decision trees. This makes a boosted random forest well suited to implementing a random forest on embedded hardware with memory constraints. When evaluated on databases from the UCI Machine Learning Repository, the boosted random forest greatly reduces memory use compared with an ordinary random forest while achieving at least equivalent performance (see lower graph).
Transfer Forest
Since a random forest is a statistical learning method, its performance inevitably drops when the learning environment and the usage environment differ greatly, which makes it necessary to collect new learning samples in the usage environment and repeat the learning. To address this problem, transfer learning methods have been proposed that construct classifiers efficiently by “transferring” knowledge between samples from domains with differing distributions (the target and prior domains). We propose a transfer forest, which introduces transfer learning based on covariate shift into a random forest. By applying covariate-shift weights when constructing each decision tree, and by building the trees incrementally using only those prior-domain samples that are suitable for learning in the target domain, a transfer forest suppresses the drop in performance caused by a shortage of samples in the target domain.
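The covariate-shift idea can be sketched in code. This is a simplified illustration, not the authors' transfer forest: the density ratio p_target(x)/p_prior(x) is estimated with a logistic-regression domain classifier (one common technique), and the paper's incremental, sample-selecting tree construction is approximated by batch training of a weighted scikit-learn random forest.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

def covariate_shift_weights(X_prior, X_target):
    """Estimate p_target(x)/p_prior(x) via a probabilistic domain classifier."""
    X = np.vstack([X_prior, X_target])
    d = np.r_[np.zeros(len(X_prior)), np.ones(len(X_target))]  # 1 = target
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p = clf.predict_proba(X_prior)[:, 1]
    ratio = p / np.clip(1.0 - p, 1e-10, None)
    # correct for the differing numbers of samples in the two domains
    ratio *= len(X_prior) / len(X_target)
    return ratio

def fit_transfer_forest(X_prior, y_prior, X_target, y_target, n_trees=50):
    """Train a forest on both domains, down-weighting prior-domain samples
    that are unlikely under the target distribution (sketch)."""
    w_prior = covariate_shift_weights(X_prior, X_target)
    X = np.vstack([X_prior, X_target])
    y = np.r_[y_prior, y_target]
    w = np.r_[w_prior, np.ones(len(X_target))]  # target samples at full weight
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    forest.fit(X, y, sample_weight=w)
    return forest
```

Prior-domain samples that fall where the target distribution has little mass receive small weights and barely influence the trees, so the forest effectively learns from the few target samples plus only those prior samples that resemble them.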