A method for using the techniques of dynamic programming to prune decision trees is presented. This method is particularly useful in pruning overtrained systems, and in cases where a tradeoff between accuracy and efficiency must be made.
A profit and a cost are assigned to each node in the initial tree. The profit measures the change in accuracy of treating a node as a leaf or as an internal node in the pruned tree, and the cost measures the increased computation time. By using a left-to-right evaluation order, it is possible to quickly search for a pruned tree that maximizes the profit (or accuracy) for a given pruned tree size. This is in contrast to Bohanec and Bratko’s earlier approach of bottom-up calculation [1], which required greater processing time and storage space to produce the same optimal tree. Additional work shows how a tradeoff can be made between computation time and storage requirements, thereby allowing larger trees to be pruned successfully on smaller machines.