This research explores the mechanisms of in-context learning (ICL) in Large Language Models, proposing that transformers learn by implicitly updating their internal weights during inference. The authors demonstrate that a transformer block effectively converts prompt examples into a rank-1 weight update of the model's MLP layer. This process allows the model to adapt to new patterns without permanently changing its weights, mathematically mirroring stochastic gradient descent as tokens are processed. Theoretical formulas are provided that compute these context-driven weight updates exactly, showing that MLP layers are naturally structured to absorb and store contextual information. Experimental results on linear regression tasks confirm that modifying model weights using these formulas produces predictions identical to those obtained from the original in-context prompt. The study ultimately unifies ICL with model editing and steering vectors, offering a principled framework for understanding how LLMs dynamically reorganize their internal representations.
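The core equivalence can be illustrated with a minimal numerical sketch. Assuming a simplified block in which the attention contribution `a` from the context enters the MLP additively through the residual stream (the names `W`, `x`, and `a` here are illustrative, not taken from the paper), the context's effect on a linear layer can be absorbed into a rank-1 update of the weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d))   # first linear layer of the MLP (simplified)
x = rng.normal(size=d)        # query-token activation entering the MLP
a = rng.normal(size=d)        # attention contribution computed from the context

# With the context present, the MLP sees the shifted input x + a.
with_context = W @ (x + a)

# Absorb the context into a rank-1 weight update:
# delta_W = (W a) x^T / ||x||^2, so (W + delta_W) x = W x + W a = W (x + a).
delta_W = np.outer(W @ a, x) / (x @ x)
without_context = (W + delta_W) @ x

# The edited weights reproduce the in-context prediction exactly.
print(np.allclose(with_context, without_context))  # True
```

This only demonstrates the linear-algebra identity behind the claim; the paper's full construction covers the complete transformer block rather than a single linear map.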
Information
- Podcast
- Frequency: Daily
- Published: March 7, 2026 at 22:46 UTC
- Duration: 24 min
- Rating: All ages
