Foundations of Model Selection

Title: Foundations of Model Selection

Research Question: How can we determine the best model for explaining a given set of data, especially when considering the complexity of the model?

Methodology: The authors base their approach to model selection on Kolmogorov's structure function, which captures the relationship between an individual data sample and its explanation (model). This relationship can be expressed as a two-part code consisting of a model description followed by a data-to-model code, which is essentially the MDL code. The authors also consider a one-part code consisting of just the data-to-model code, which is essentially the maximum likelihood estimator.
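The two-part and one-part codes can be illustrated with a small sketch. It is not the authors' construction: Kolmogorov complexity is uncomputable, so the model cost below is a simple computable stand-in, and the three candidate models and all function names are assumptions made for illustration. The data is a binary string whose length n is assumed to be known to the decoder.

```python
from math import comb, log2


def candidate_models(bits: str) -> dict[str, tuple[float, float]]:
    """Map model name -> (model_bits, data_to_model_bits).

    Each model is a finite set of length-n strings that contains `bits`.
    model_bits is a crude, computable proxy for the model's complexity.
    """
    n, k = len(bits), bits.count("1")
    return {
        # All 2**n strings: nothing to describe, indexing x costs n bits.
        "all_strings": (0.0, float(n)),
        # Strings with exactly k ones: name k in 0..n, then index x among C(n, k).
        "fixed_weight": (log2(n + 1), log2(comb(n, k))),
        # The singleton {x}: the model spells out x, so no further bits are needed.
        "singleton": (float(n), 0.0),
    }


def two_part_lengths(bits: str) -> dict[str, float]:
    """Two-part code length: model description plus data-to-model code."""
    return {name: m + d for name, (m, d) in candidate_models(bits).items()}


def one_part_lengths(bits: str) -> dict[str, float]:
    """One-part code length: the data-to-model code alone."""
    return {name: d for name, (_, d) in candidate_models(bits).items()}


def best_two_part(bits: str) -> str:
    """Name of the candidate model with the shortest two-part code."""
    lengths = two_part_lengths(bits)
    return min(lengths, key=lengths.get)


if __name__ == "__main__":
    sparse = "0" * 15 + "1"   # highly regular: a single 1 among 16 bits
    print(two_part_lengths(sparse))
    print(best_two_part(sparse))
```

For the sparse string, the fixed-weight model gives a two-part code of roughly 8 bits against 16 for the other two candidates. Without a bound on model complexity, the one-part code alone is trivially minimized by the singleton model, which is why the complexity constraint matters.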

Results: The main result of this study is that, for all data, minimizing either the two-part code or the one-part code subject to a given model-complexity constraint selects a model that is a "best explanation" of the data within that constraint. The best fit itself (minimal randomness deficiency under complexity constraints on the model) cannot be computationally monotonically approximated. However, the two-part code or the one-part code can be monotonically minimized, and doing so implicitly approximates the best-fitting model.
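For reference, the quantities involved are usually written as follows for finite-set models. The symbols are assumptions of this sketch, since the summary itself introduces no notation: x is the data, S is a finite set containing x, K is prefix Kolmogorov complexity, and α is the bound on model complexity.

```latex
% One-part code (structure function): shortest data-to-model code
% over models of complexity at most \alpha.
h_x(\alpha) = \min_{S} \{\, \log |S| : x \in S,\ K(S) \le \alpha \,\}

% Two-part code: model description plus data-to-model code.
\lambda_x(\alpha) = \min_{S} \{\, K(S) + \log |S| : x \in S,\ K(S) \le \alpha \,\}

% Randomness deficiency of x in S: how far x is from being a typical element of S.
\delta(x \mid S) = \log |S| - K(x \mid S)

% Best fit: minimal randomness deficiency under the same constraint.
\beta_x(\alpha) = \min_{S} \{\, \delta(x \mid S) : x \in S,\ K(S) \le \alpha \,\}
```

In this notation, the result says that a set S achieving the minimum in h_x(α) or λ_x(α) also approximately achieves the minimum in β_x(α), even though β_x cannot be monotonically approximated directly.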

Implications: The work connects the Kolmogorov structure function and its variants to central concerns of statistical theory, namely two-part description-length coding and maximum likelihood estimation. Its practical consequence is a justification for selecting the model that best explains a given set of data by minimizing code length under a model-complexity constraint. This can be particularly useful in complex video and sound analysis, where the part of the support of the probability density function that will ever be observed has about zero measure.

Link to Article: https://arxiv.org/abs/0204037v3

Authors:

arXiv ID: 0204037v3
