Alternative Approaches for Estimating Highest-Density Regions

Abstract

Among the variety of statistical intervals, highest-density regions (HDRs) stand out for their ability to summarise a distribution or sample effectively, revealing its distinctive and salient features. An HDR is the smallest set that attains a given probability coverage, and current methods for computing it require knowledge or estimation of the underlying probability distribution or density $f$. In this work, we illustrate a broader framework for computing HDRs, which generalises the classical density quantile method. The framework is based on neighbourhood measures, that is, measures that preserve the order induced in the sample by $f$ and that include the density as a special case. We explore a number of suitable distance-based measures, such as the $k$-nearest-neighbour distance, and some probabilistic variants based on copula models. An extensive comparison is provided, showing the advantages of the copula-based strategy, especially in scenarios that exhibit complex structures (e.g. multimodality or particular dependence patterns). Finally, we discuss the practical implications of our findings for estimating HDRs in real-world applications.
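To make the two ideas named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian kernel density estimate for the density quantile method and a plain Euclidean $k$-nearest-neighbour distance as the distance-based measure; the function names `hdr_mask_density` and `hdr_mask_knn`, the choice `k=10`, and the simulated bimodal sample are all hypothetical. The copula-based variants are not shown.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import gaussian_kde


def hdr_mask_density(sample, coverage=0.9):
    """Density quantile method: keep the points whose estimated density
    is at least the (1 - coverage) quantile of the densities evaluated
    at the sample points. `sample` has shape (n, d)."""
    kde = gaussian_kde(sample.T)  # SciPy expects shape (d, n)
    dens = kde(sample.T)
    threshold = np.quantile(dens, 1.0 - coverage)
    return dens >= threshold


def hdr_mask_knn(sample, coverage=0.9, k=10):
    """Distance-based alternative: a small distance to the k-th nearest
    neighbour indicates high density, so keep the points whose distance
    is at most the `coverage` quantile of those distances."""
    tree = cKDTree(sample)
    # Query k + 1 neighbours because each point is its own nearest neighbour.
    dist, _ = tree.query(sample, k=k + 1)
    kth = dist[:, -1]
    threshold = np.quantile(kth, coverage)
    return kth <= threshold


# Hypothetical bimodal sample in two dimensions.
rng = np.random.default_rng(0)
sample = np.vstack([
    rng.normal(-3.0, 1.0, size=(500, 2)),
    rng.normal(3.0, 1.0, size=(500, 2)),
])

mask_density = hdr_mask_density(sample, coverage=0.9)
mask_knn = hdr_mask_knn(sample, coverage=0.9, k=10)
agreement = np.mean(mask_density == mask_knn)
```

Both functions return a boolean mask over the sample, so the estimated HDR is the set of points where the mask is true. The $k$-nearest-neighbour version never estimates $f$ itself; it relies only on the distance ranking approximately reversing the ordering that $f$ induces on the sample.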

Publication
Article in International Statistical Review
