Key Concepts
The core message of this article is that decision tree learning can be formulated as a Markov Decision Process (MDP) to achieve a favorable trade-off between the complexity and performance of the learned trees. By carefully controlling the action space of the MDP using a state-dependent tests generating function, the authors demonstrate that their algorithm, Dynamic Programming Decision Tree (DPDT), can find trees that are competitive with the current state-of-the-art Branch-and-Bound (BnB) solvers in terms of training accuracy, while offering a richer set of solutions along the complexity-performance Pareto front.
Abstract
The article presents a novel approach to decision tree learning by formulating it as a Markov Decision Process (MDP). The key aspects are:
MDP Formulation: The authors define an MDP where the states represent subsets of the training data and the actions correspond to binary tests (splits) to be applied. The reward function encodes a trade-off between the complexity (average number of tests performed) and the training accuracy of the learned tree.
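The MDP ingredients above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: a state is the subset of training rows reaching the current node (a boolean mask), an action is either a binary test (feature, threshold) or a terminal "classify" action, and the names `transition` and `reward` are illustrative.

```python
import numpy as np

def transition(X, mask, feature, threshold):
    """A binary test splits the current state into two child states."""
    passes = X[:, feature] <= threshold
    return mask & passes, mask & ~passes

def reward(y, mask, action, alpha):
    """-alpha for each test performed; majority-class training accuracy when
    classifying. alpha encodes the complexity/performance trade-off."""
    if action == "test":
        return -alpha
    labels = y[mask]
    if labels.size == 0:
        return 0.0
    return np.bincount(labels).max() / labels.size

# Toy data: a single threshold at 0.5 separates the classes perfectly.
X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.array([0, 0, 1, 1])
root = np.ones(len(y), dtype=bool)
left, right = transition(X, root, feature=0, threshold=0.5)
print(reward(y, left, "classify", alpha=0.01))  # 1.0: the left child is pure
```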
Tests Generating Function: To make the MDP tractable, the authors introduce a state-dependent tests generating function that dynamically limits the set of admissible tests at each state. They propose using the test nodes of a CART tree as the set of admissible tests, which empirically outperforms a baseline that returns the most informative splits.
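The CART-based tests generating function could look roughly as follows: fit a shallow CART tree on the data subset at the current state and return its internal-node splits as the admissible tests. This is a hedged sketch using scikit-learn; the function name `cart_tests` and the `max_depth` knob are assumptions for illustration, not the paper's code.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def cart_tests(X, y, mask, max_depth=2):
    """Return the (feature, threshold) tests used by a shallow CART tree
    fitted on the current state's data subset."""
    sub_X, sub_y = X[mask], y[mask]
    if np.unique(sub_y).size < 2:
        return []  # pure subset: no informative test exists
    cart = DecisionTreeClassifier(max_depth=max_depth).fit(sub_X, sub_y)
    tree = cart.tree_
    # Internal nodes have children; leaves are marked with child index -1.
    internal = tree.children_left != -1
    return list(zip(tree.feature[internal], tree.threshold[internal]))

# Toy usage: the label depends only on the second feature.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 0, 1])
mask = np.ones(len(y), dtype=bool)
print(cart_tests(X, y, mask))  # a single split on feature 1
```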
Dynamic Programming Solver: The authors use dynamic programming to solve the MDP and obtain the optimal policy, which can then be converted into a decision tree. Crucially, they can compute the optimal policies for a range of complexity-performance trade-offs (different values of the regularization parameter α) in a single backward pass.
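The "single backward pass over many trade-offs" idea can be illustrated by vectorizing the value function over a grid of α values with NumPy. The sketch below is an illustrative depth-limited recursion, not the paper's solver: at each state, the value of classifying (majority accuracy) competes with the value of each admissible test, which costs α plus the data-weighted values of the two children, and the maximization is carried out element-wise for all α at once.

```python
import numpy as np

def values(X, y, mask, tests, alphas, depth):
    """For every alpha in `alphas`, the best achievable value from this state:
    max over classifying now vs. paying -alpha for a test and recursing."""
    labels = y[mask]
    classify = np.full(len(alphas), np.bincount(labels).max() / labels.size)
    if depth == 0:
        return classify
    best = classify
    for feature, threshold in tests:
        passes = X[:, feature] <= threshold
        left, right = mask & passes, mask & ~passes
        if not left.any() or not right.any():
            continue  # degenerate split: skip
        p = left.sum() / mask.sum()  # fraction of data going left
        v = (-alphas
             + p * values(X, y, left, tests, alphas, depth - 1)
             + (1 - p) * values(X, y, right, tests, alphas, depth - 1))
        best = np.maximum(best, v)  # element-wise: all alphas in one pass
    return best

X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.array([0, 0, 1, 1])
root = np.ones(len(y), dtype=bool)
# At alpha=0 the split is worth taking (value 1.0); at alpha=1 it is cheaper
# to classify immediately (value 0.5).
print(values(X, y, root, [(0, 0.5)], np.array([0.0, 1.0]), depth=1))
```

Reading off, for each α, whether "test" or "classify" attained the maximum at each state recovers the optimal policy, and hence a tree, for every point on the complexity-performance front from one pass.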
Comparison to Baselines: The authors compare their DPDT algorithm to several baselines, including optimal BnB solvers and greedy approaches like CART. They show that DPDT can find trees that are competitive with the optimal BnB solutions in terms of training accuracy, while being orders of magnitude faster. DPDT also outperforms CART in terms of generalization performance on unseen data.
Interpretability Analysis: The authors demonstrate that DPDT can provide a rich set of trees along the complexity-performance Pareto front, allowing users to select the tree that best suits their interpretability needs. This is in contrast to BnB approaches, which only return a single tree.
Overall, the article presents a novel and effective approach to decision tree learning that achieves a favorable trade-off between scalability, optimality, and interpretability.