Selective Inference for Multivariate Regression Trees
Tuesday, Aug 5: 12:05 PM - 12:20 PM
2670
Contributed Papers
Music City Center
We consider post-selection inference for regression trees when the response is multivariate. In particular, we study how to properly test hypotheses suggested by the fitted tree, for example whether the populations represented by two sibling nodes have the same mean. As is already known for a univariate response, controlling the Type I error rate requires conditioning on the recursive data splits that lead to the hypothesis in question. With a univariate response, this conditioning truncates the null distribution of the test statistic, so p-values must be computed with respect to truncated normal distributions. With a multivariate response, we find that p-values must be computed with respect to truncated multivariate normal distributions, where the truncation set is defined by a list of quadratic constraints. We show that accept-reject Monte Carlo simulation can give reliable post-selection p-values with a bivariate response and a fairly small number of predictors. To accommodate more predictors, we must consider more efficient ways to obtain probabilities from truncated multivariate normal distributions.
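As a rough illustration of the accept-reject idea mentioned in the abstract, the Python sketch below draws from a standard bivariate normal, keeps only the draws that satisfy every quadratic constraint, and reports the share of kept draws whose statistic is at least as large as the observed one. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names (`quad_form`, `accept_reject_pvalue`), the constraint form z'Az + b'z + c <= 0, the standardized null distribution, and the toy constraint and statistic are all hypothetical and are not derived from an actual fitted tree.

```python
import random


def quad_form(A, b, c, z):
    """Evaluate z'Az + b'z + c for a small dense matrix A (list of lists)."""
    d = len(z)
    quad = sum(z[i] * A[i][j] * z[j] for i in range(d) for j in range(d))
    lin = sum(b[i] * z[i] for i in range(d))
    return quad + lin + c


def accept_reject_pvalue(observed_stat, stat, constraints, dim,
                         n_draws=20000, seed=0):
    """Estimate P(stat(Z) >= observed_stat | Z in truncation set).

    Z is standard normal in `dim` dimensions (an assumption of this sketch).
    The truncation set is the intersection of {z : z'Az + b'z + c <= 0}
    over the (A, b, c) triples in `constraints`.
    Returns (p_value_estimate, acceptance_rate).
    """
    rng = random.Random(seed)
    accepted = 0
    exceed = 0
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Reject any draw that violates at least one quadratic constraint.
        if all(quad_form(A, b, c, z) <= 0.0 for (A, b, c) in constraints):
            accepted += 1
            if stat(z) >= observed_stat:
                exceed += 1
    if accepted == 0:
        raise ValueError("no draws fell in the truncation set")
    return exceed / accepted, accepted / n_draws


# Toy example (hypothetical): truncate to the disc z'z <= 9 and use the
# squared norm as the test statistic, with an observed value of 4.
identity = [[1.0, 0.0], [0.0, 1.0]]
constraints = [(identity, [0.0, 0.0], -9.0)]
p_value, accept_rate = accept_reject_pvalue(
    4.0, lambda z: z[0] ** 2 + z[1] ** 2, constraints, dim=2
)
```

The acceptance rate returned alongside the p-value indicates how much of the simulation effort is wasted. When the truncation set has small probability, almost every draw is rejected, which is consistent with the abstract's remark that more efficient methods are needed as the number of predictors grows.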
post-selection inference
regression tree
MCMC
Main Sponsor
Statistics Without Borders