October 26
14 min

Continual Learning via Sparse Memory Finetuning

This research paper proposes sparse memory finetuning, an approach to catastrophic forgetting in large language models (LLMs) during continual learning. The method builds on memory layer models, which are designed for sparse updates, and trains only the memory slots that new knowledge activates strongly relative to existing information, ranked with a TF-IDF score. The authors show that this technique acquires new knowledge on par with full finetuning and LoRA while degrading previously acquired capabilities substantially less on held-out question-answering benchmarks. The results suggest that sparsity in memory layers is a promising route to LLMs that accumulate knowledge continually over time.
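
The slot-selection idea described above can be illustrated with a small sketch. This is not the paper's implementation: the function name `rank_slots_tfidf`, the smoothed IDF formula, and the toy access data are all assumptions made for illustration. The sketch counts how often each memory slot is accessed on a new batch (term frequency), discounts slots that many background batches also access (inverse document frequency), and keeps the top-scoring slots as the only trainable ones.

```python
import math
from collections import Counter


def rank_slots_tfidf(batch_accesses, background_batches, top_t):
    """Return the top_t slot indices for a batch by a TF-IDF-style score.

    batch_accesses: list of slot indices accessed on the new batch
        (repeats allowed, one entry per access).
    background_batches: list of sets; each set holds the slot indices
        accessed by one batch of existing (background) data.
    The smoothed IDF below is an assumed, generic variant.
    """
    counts = Counter(batch_accesses)
    total = sum(counts.values())
    n_background = len(background_batches)

    scores = {}
    for slot, count in counts.items():
        tf = count / total  # share of this batch's accesses going to the slot
        df = sum(1 for seen in background_batches if slot in seen)
        idf = math.log((n_background + 1) / (df + 1))  # low if slot is common
        scores[slot] = tf * idf

    # Highest score first; ties broken by slot index for determinism.
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:top_t]


# Toy data (hypothetical): slot 1 is used heavily everywhere, slot 3 only here.
new_batch = [3, 3, 3, 7, 7, 1, 1, 1, 1]
background = [{1, 2}, {1, 4}, {1, 7}, {1, 5}]
trainable = rank_slots_tfidf(new_batch, background, top_t=2)
print(trainable)
```

In this toy example slot 1 is the most frequently accessed on the new batch, yet it scores zero because every background batch also uses it; slots 3 and 7 are selected instead. In a training loop, one would presumably mask gradients for every slot outside the selected set so that only those slots are updated.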

Show: Neural intel Pod
Frequency: Updated weekly
Published: October 26, 2025, 11:17 UTC
Length: 14 minutes
Rating: Suitable for children
