Hardin and Rocke investigated the distribution of the robust Mahalanobis squared
distance (RSD) computed using the minimum covariance determinant (MCD) estimator.
They showed that the distribution of RSDs for outlying observations not part of
the MCD subset is well-approximated by an $F$ distribution. They developed
a methodology to adjust an asymptotic formula for the degrees of freedom parameters
of this $F$ distribution to provide correct parameter values in small-to-moderate
samples. This methodology was developed for the maximum breakdown point
version of the MCD, which is based on approximately half of the observations.
Whether the approximation remains accurate for the MCD using larger subsets of
the data is an open question. We show that their approximation
works quite well for the more general MCD, but can be noticeably inaccurate for
sample sizes less than 250 and when the MCD estimate uses nearly all of the observations.
Motivated by the desire to apply RSD-based outlier detection tests to financial
asset return and factor exposure data sets whose typical sample sizes are smaller
than 250, we develop a more general correction procedure that is accurate across a wider
range of sample sizes and MCD subset sizes than the Hardin and Rocke approach.
We use our approach to extend Cerioli's IRMCD procedure for accurate
RSD-based outlier tests to arbitrary MCD subset sizes.