The Department of Mathematical and Statistical Sciences will host a seminar titled “A generalized random forest framework for improved prediction and interpretations” on Friday, Nov. 15, at 1 p.m. in Cudahy Hall 401. The seminar will be presented by Dr. Tiffany Tang, Clare Boothe Luce assistant professor of applied and computational mathematics and statistics at the University of Notre Dame.
Tree-based models, including random forests, are among the most popular supervised learning algorithms, demonstrating state-of-the-art prediction performance over a wide class of learning problems. Motivated by this strong empirical performance, Tang’s work sheds light on a new interpretation of a decision tree as the best-fit linear model on a collection of engineered features associated with the tree’s splits.
This reinterpretation of decision trees as linear models opens the door to exciting opportunities for methodological innovations. In this talk, Tang will discuss three such examples, namely, (1) the development of RF+, a generalization of random forests (RFs), which yields superior prediction accuracy to traditional RFs; (2) extensions of RF+ that exploit complex data structures such as spatial or network data; and (3) a general framework called MDI+ for computing feature importances from RF+ models, which generalizes and improves upon the traditional mean decrease in impurity (MDI) importance. To demonstrate their efficacy and wide applicability, Tang’s group applies these novel methods to real data case studies involving drug response prediction, breast cancer subtype classification, and school conflict reduction strategies.
After the seminar, refreshments will be served in Cudahy 342 starting at 2 p.m. For more information, visit the colloquium page or contact Dr. Naveen Bansal at naveen.bansal@marquette.edu.