Beta-trees: Multivariate histograms with confidence statements

QIAN ZHAO Co-Author
University of Massachusetts
 
Guenther Walther First Author
Stanford University
 
Guenther Walther Presenting Author
Stanford University
 
Sunday, Aug 3: 5:05 PM - 5:20 PM
1221 
Contributed Papers 
Music City Center 
Multivariate histograms are difficult to construct due to the curse of dimensionality. Motivated by k-d trees in computer science, we show how to construct an efficient data-adaptive partition of Euclidean space that possesses the following two properties: With high confidence the distribution from which the data are generated is close to uniform on each rectangle of the partition; and despite the data-dependent construction we can give guaranteed finite sample simultaneous confidence intervals for the probabilities (and hence for the average densities) of each rectangle in the partition. This partition will automatically adapt to the sizes of the regions where the distribution is close to uniform. The methodology produces confidence intervals whose widths depend only on the probability content of the rectangles and not on the dimensionality of the space, thus avoiding the curse of dimensionality. Moreover, the widths essentially match the optimal widths in the univariate setting. The simultaneous validity of the confidence intervals allows to use this construction, which we call Beta-trees, for various data-analytic purposes, as will be illustrated.

Keywords

curse of dimensionality

simultaneous inference 

Main Sponsor

Section on Nonparametric Statistics