This research paper introduces sparse memory finetuning, a novel approach to catastrophic forgetting in large language models (LLMs) during continual learning. The method builds on memory layer models, which are designed for sparse updates, and selectively trains only the memory slots that are highly activated by new knowledge relative to existing knowledge, ranked with a TF-IDF-style score. The authors show that this technique matches full finetuning and LoRA in new knowledge acquisition while degrading previously acquired capabilities substantially less on held-out question-answering benchmarks. The results suggest that exploiting sparsity in memory layers is a promising strategy for enabling LLMs to continually accumulate knowledge over time.
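
The slot-selection idea can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch illustration, assuming per-slot access counts on the new data and on background data are already available; the function names, the smoothing in the score, the number of trainable slots, and the gradient-masking approach are assumptions for exposition, not the paper's exact implementation.

```python
# Minimal sketch, assuming PyTorch and precomputed per-slot access counts.
# All names and constants here are illustrative, not the paper's code.
import torch


def select_trainable_slots(
    new_data_counts: torch.Tensor,    # [num_slots] retrievals on the new knowledge
    background_counts: torch.Tensor,  # [num_slots] retrievals on background data
    k: int = 500,                     # assumed number of slots left trainable
) -> torch.Tensor:
    """Rank memory slots with a TF-IDF-style score and return the top-k indices."""
    tf = new_data_counts.float()                                # usage on new data
    idf = torch.log1p(1.0 / (background_counts.float() + 1.0))  # down-weight slots used broadly elsewhere
    scores = tf * idf
    return torch.topk(scores, k=min(k, scores.numel())).indices


def mask_memory_gradients(memory_values: torch.nn.Parameter, trainable_idx: torch.Tensor) -> None:
    """Zero the gradient of every non-selected slot so only chosen slots are updated."""
    if memory_values.grad is None:
        return
    keep = torch.zeros(memory_values.shape[0], dtype=torch.bool, device=memory_values.device)
    keep[trainable_idx] = True
    memory_values.grad[~keep] = 0.0
```

Calling `mask_memory_gradients` between `loss.backward()` and `optimizer.step()` would restrict updates to the selected slots; the paper's actual mechanism for restricting updates may differ.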
Information
- Program
- Frequency: Updated weekly
- Published: October 26, 2025, 11:17 AM UTC
- Length: 14 minutes
- Rating: All ages
