The increasing complexity of models in machine learning and statistics has led to the development of composite goodness-of-fit tests. These tests determine whether a dataset fits some distribution within a given parametric family, including families with unnormalised densities or generative models. This thesis examines two composite goodness-of-fit tests from the literature: the Maximum Mean Discrepancy (MMD) test for generative models and the Kernel Stein Discrepancy (KSD) test for unnormalised densities. We first recapitulate the proof that the MMD test statistic exhibits the appropriate asymptotic behaviour. We then rigorously derive the asymptotic distribution of the KSD test statistic, a result previously only conjectured in the literature. In addition, we investigate two implementation algorithms proposed in the literature, based on wild and parametric bootstrapping, and evaluate their performance on three case studies. The wild bootstrap test tends to be conservative, whereas the parametric bootstrap consistently attains the desired significance level for both the MMD and the KSD tests. For the MMD test, we present theoretical results from the literature that support these findings; for the KSD test, we conjecture the conditions needed for an analogous validation.
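To make the parametric bootstrap procedure concrete, the following is a minimal sketch of a composite MMD test in Python. It is illustrative only: the Gaussian location family, the Gaussian kernel with a fixed bandwidth, the biased V-statistic estimator of MMD², and the use of the sample mean as a stand-in for the minimum-MMD estimator are all assumptions made for this sketch, and the function names (`parametric_bootstrap_mmd_test`, `mmd2`) are hypothetical, not taken from the thesis or a library.

```python
# Sketch of a parametric-bootstrap composite MMD goodness-of-fit test.
# Assumed setting: H0 says the data come from N(theta, 1) for some theta.
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel matrix between 1-d samples x and y."""
    d = x[:, None] - y[None, :]
    return np.exp(-d**2 / (2 * bandwidth**2))

def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of MMD^2 between samples x and y."""
    kxx = gaussian_kernel(x, x, bandwidth).mean()
    kyy = gaussian_kernel(y, y, bandwidth).mean()
    kxy = gaussian_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2 * kxy

def parametric_bootstrap_mmd_test(data, n_model=500, n_boot=200,
                                  alpha=0.05, rng=None):
    """Test H0: data ~ N(theta, 1) for some theta.

    The location is estimated by the sample mean here, a simple
    stand-in for the minimum-MMD estimator used in the literature.
    """
    rng = np.random.default_rng(rng)
    n = len(data)
    theta_hat = data.mean()                       # fit on observed data
    model = rng.normal(theta_hat, 1.0, n_model)   # sample fitted model
    stat = mmd2(data, model)

    boot_stats = np.empty(n_boot)
    for b in range(n_boot):
        # Redraw data from the fitted model and re-estimate the parameter,
        # so the bootstrap mimics the estimation step under H0.
        boot_data = rng.normal(theta_hat, 1.0, n)
        boot_theta = boot_data.mean()
        boot_model = rng.normal(boot_theta, 1.0, n_model)
        boot_stats[b] = mmd2(boot_data, boot_model)

    p_value = (1 + np.sum(boot_stats >= stat)) / (1 + n_boot)
    return stat, p_value, p_value < alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(2.0, 1.0, 300)  # H0 holds: unit-variance Gaussian
    stat, p, reject = parametric_bootstrap_mmd_test(sample, rng=1)
    print(f"MMD^2 = {stat:.4g}, p = {p:.3f}, reject H0: {reject}")
```

Re-estimating the parameter inside each bootstrap iteration is the essential step: it propagates the estimation error into the null distribution of the statistic, which is why the parametric bootstrap can attain the nominal level where a naive bootstrap would not. A wild bootstrap variant would instead reweight the summands of the statistic rather than redraw data from the fitted model.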