LightMem (ICLR 2026): Lightweight and Efficient Memory-Augmented Generation — 10×+ gains with 100× lower cost
Posted by zxlzr@reddit | LocalLLaMA | 15 comments
We’re excited to share that our work **LightMem** has been accepted to **ICLR 2026** 🎉
**Paper:** [https://arxiv.org/abs/2510.18866](https://arxiv.org/abs/2510.18866)
**Code:** [https://github.com/zjunlp/LightMem](https://github.com/zjunlp/LightMem)
LightMem is a lightweight, modular memory system for LLM agents that enables scalable long-context reasoning and structured memory management across tasks and environments.
# 🧩 Motivation
LLMs struggle in long, multi-turn interactions:
* context grows noisy and expensive
* models get “lost in the middle”
* memory layers add latency & token cost
Existing memory systems can be accurate, but they are often heavy on tokens, API calls, and runtime.
https://preview.redd.it/5zoz8i0wgvlg1.png?width=672&format=png&auto=webp&s=6bb278e942b4587a5e4c4271c57a077aa59f4136
# 💡 LightMem keeps memories compact, topical, and consistent:
**1️⃣ Pre-compress sensory memory**
Filter redundant / low-value tokens before storage.
**2️⃣ Topic-aware short-term memory**
Cluster turns by topic and summarize into precise memory units.
**3️⃣ Sleep-time long-term consolidation**
Incremental inserts at runtime + high-fidelity updates deferred to an offline pass (so consolidation adds no latency to the live interaction).
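To make the three stages concrete, here is a toy sketch of the data flow. It is not LightMem's code or API: every name (`LOW_VALUE`, `pre_compress`, `segment_by_topic`, `LongTermMemory`) is hypothetical. It also assumes a stopword list in place of a real token filter, lexical overlap in place of real topic detection, and a plain token set in place of a summarized memory unit.

```python
from collections import deque

# Assumption: a tiny stopword list stands in for a real token-importance filter.
LOW_VALUE = {"the", "a", "an", "um", "uh", "like", "so", "well", "just",
             "really", "is", "are", "to", "of", "i", "my", "in", "and"}

def pre_compress(turn: str) -> list[str]:
    """Stage 1: drop low-value tokens before anything is stored."""
    tokens = [t.strip(".,!?").lower() for t in turn.split()]
    return [t for t in tokens if t and t not in LOW_VALUE]

def jaccard(a: set[str], b: set[str]) -> float:
    """Lexical overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def segment_by_topic(turns: list[list[str]], threshold: float = 0.2) -> list[list[list[str]]]:
    """Stage 2: open a new segment when overlap with the current one drops."""
    segments: list[list[list[str]]] = []
    for tokens in turns:
        if segments:
            current_vocab = set().union(*segments[-1])
            if jaccard(current_vocab, set(tokens)) >= threshold:
                segments[-1].append(tokens)
                continue
        segments.append([tokens])
    return segments

class LongTermMemory:
    """Stage 3: cheap appends online, merging deferred to an offline pass."""

    def __init__(self) -> None:
        self.entries: list[set[str]] = []
        self.pending: deque[set[str]] = deque()

    def insert(self, unit: set[str]) -> None:
        # Online path: append only, no comparison against existing entries.
        self.entries.append(unit)
        self.pending.append(unit)

    def consolidate(self, threshold: float = 0.3) -> None:
        # Offline path: merge entries whose token sets overlap enough.
        merged: list[set[str]] = []
        for unit in self.entries:
            for existing in merged:
                if jaccard(existing, unit) >= threshold:
                    existing |= unit
                    break
            else:
                merged.append(set(unit))
        self.entries = merged
        self.pending.clear()

turns = [
    "I really like hiking in the Alps.",
    "The Alps hiking trails are so steep!",
    "My laptop battery is dying.",
    "I just replaced the laptop battery.",
]
memory = LongTermMemory()
for segment in segment_by_topic([pre_compress(t) for t in turns]):
    memory.insert(set().union(*segment))  # token set stands in for a summary
memory.consolidate()  # would run while the agent is idle
```

The point of the split in stage 3 is that `insert` stays constant-time during a conversation, while the quadratic comparison in `consolidate` only runs when nobody is waiting on a response.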
# 🔬 Results
On **LongMemEval**:
* Accuracy ↑ up to **\~10.9%**
* Tokens ↓ up to **117×**
* API calls ↓ up to **159×**
* Runtime ↓ **>12×**
So LightMem often improves reasoning **while dramatically cutting cost**.
# 🧪 Recent updates
* Baseline evaluation framework across memory systems (Mem0, A-MEM, LangMem) on LoCoMo & LongMemEval
* Demo video + tutorial notebooks (multiple scenarios)
* MCP Server integration → multi-tool memory invocation
* Full LoCoMo dataset support
* GLM-4.6 integration with reproducible scripts
* Local deployment via Ollama, vLLM, Transformers (auto-load)
# 🧱 Positioning
LightMem is designed as a **modular memory layer** that can sit inside agent stacks:
* long-context agents
* tool-using agents
* autonomous workflows
* conversational systems
Think: structured memory that scales without exploding tokens.
# 🙌 Feedback welcome
We’d love input from:
* agent framework devs
* memory / RAG researchers
* long-context model folks
* applied LLM teams
Issues & PRs welcome: [https://github.com/zjunlp/LightMem](https://github.com/zjunlp/LightMem)
Let’s make agent memory practical, scalable, and lightweight 🚀