25.11.2024 16:30 Jago Silberbauer: No-Free-Lunch for Autoregressive Models
No-Free-Lunch theorems are important results in the mathematical foundations of statistical learning. They typically state that, in expectation w.r.t. a uniformly chosen target concept, no machine learning algorithm performs better on unseen data than random guessing. Put differently, one algorithm can only outperform another when supplied with sufficient a priori knowledge, by means of training data or design.
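As a toy illustration of this averaging argument (my own sketch, not part of the talk): on a Boolean domain of eight points, any fixed learner, averaged uniformly over all 2^8 target concepts, achieves exactly chance accuracy on the points it never saw. The particular learner below is arbitrary; swapping it for any other deterministic learner leaves the average unchanged.

```python
import itertools

# Tiny NFL demonstration (illustrative sketch, not the talk's theorem):
# domain X has 8 points; we enumerate all 2^8 = 256 target concepts
# f: X -> {0, 1} and average off-training-set accuracy over them.
X = list(range(8))
train_idx = X[:4]   # points the learner sees, with labels
test_idx = X[4:]    # unseen, off-training-set points

def memorize_then_guess_zero(train_pairs):
    """An arbitrary learner: memorize training labels, predict 0 elsewhere."""
    table = dict(train_pairs)
    return lambda x: table.get(x, 0)

total_acc = 0.0
for labels in itertools.product([0, 1], repeat=len(X)):  # all concepts
    f = dict(zip(X, labels))
    h = memorize_then_guess_zero([(x, f[x]) for x in train_idx])
    total_acc += sum(h(x) == f[x] for x in test_idx) / len(test_idx)

print(total_acc / 2 ** len(X))  # -> 0.5, i.e. random guessing off-sample
```

The reason is that, for every assignment of training labels, the unseen labels still range uniformly over all possibilities, so any fixed prediction is right exactly half the time on average.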
In this talk, I will present a new kind of No-Free-Lunch theorem, namely for so-called autoregressive models, most prominently used in the large language models powering, e.g., OpenAI's ChatGPT. These models can be represented by higher-order Markov chains whose kernels are learned during training. I will discuss the key points of the proof and relate the result to scenarios relevant to natural language processing.