AI Memory System by Resident Evil Actress: 56k Stars Behind a Route Debate

AI MemPalace Mem0 Agent Memory Open Source

发布于 2026-07-02 10:52:15 5 次浏览

AI Memory System by Resident Evil Actress: 56k Stars Behind a Route Debate

56,268 stars. 56,268 developers gave a star to an AI memory project.

It wasn't made by OpenAI. It wasn't made by Google.

It was made by a Hollywood actress.

Milla Jovovich — yes, the same Alice from Resident Evil who shoots more than she speaks — and her partner Ben Sigman spent a few months using Claude Code to write an open-source project called MemPalace.

Its manifesto is bold: The best-benchmarked open-source AI memory system. And it's free.

Translation: We benchmark the most comprehensively, and it costs nothing.

GitHub address: github.com/MemPalace/mempalace

Stars: 56,268 (as of June 24, 2026)
Forks: 7,284
Language: Python
License: MIT (free for commercial use)
Core dependencies: ChromaDB + PyYAML

But what makes this project truly interesting is not who built it or how many stars it has.

It's that between this and Mem0 (another project with 59k stars), there are two completely different technical routes.

Two Routes: Store Original Text vs Store Experience

The memory problem for AI agents boils down to one sentence: Models don't have a hard drive.

When you close a conversation, everything you said is thrown away. Your preferences, your project structure, the mistakes you corrected — all gone.

RAG can solve part of it, but it's more like an "external hard drive": you ask, it searches, answers, and that's it. It won't remember that you said "don't use async patterns" last week, nor will it remind you three days later that "you rejected this requirement before."

How to solve it? Currently, there are two routes.

Route A: MemPalace — Store original text, zero API calls

MemPalace's approach is hardcore: store the original conversation verbatim, and use regex and keyword scoring for classification and retrieval.

Its memory is divided into four layers:

Level	Content	Function
Working memory	Current conversation window context	Keep single-turn tasks on track
Episodic memory	Summary of recent N conversation turns	Cross-session continuity
Semantic memory	Deduplicated and compressed "facts"	RAG stores documents, this stores experience
Knowledge graph	Entity-relationship graph	Reasoning along relationship chains

Key detail: The memory infrastructure uses no LLM at all. Classification, chunking, and retrieval all rely on regex heuristics and keyword scoring. The LLM is only used for actual conversation.

Result: zero API cost, fully local, completely deterministic.

Official benchmark: 96.6% R@5 on LongMemEval, 100% in hybrid mode.

Sounds impressive.

Route B: Mem0 — Store experience, use LLM to distill

Mem0 (pronounced mem-zero) takes the opposite approach: have the LLM automatically extract key information from conversations and compress it into compact memory entries.

It doesn't care about the original text. It only stores "distilled knowledge points."

For example, after 20 turns of conversation, Mem0 might store only 3 entries:

"User prefers Python's dataclass"
"Project A uses FastAPI"
"Performance over readability"

Mem0 also has three layers: user memory, session memory, and agent memory. The underlying storage is a triple combination of vectors + graphs + key-values. Conflicts are automatically merged and deduplicated without duplication.

It also has a $24M Series A funding, incubated by Y Combinator, with a complete hosted platform.

GitHub: 59,331 stars / 6,849 forks.

Benchmark Controversy: Is 96.6% Real?

MemPalace officially claims 96.6% R@5 on LongMemEval.

The community immediately started stress testing.

An independent tester (GitHub Issue #39) ran the full pipeline and got 82.6% QA accuracy — good, but 14 percentage points away from 96.6%.

The reason is metric mismatch:

System	Metric	Score
MemPalace	Retrieval recall R@5	96.6%
OMEGA	QA accuracy	95.4%
Independent test MemPalace	QA accuracy	82.6%
agentmemory	R@5	95.2%

MemPalace uses "retrieval recall" — whether it can find relevant memory when asked. But finding doesn't mean answering correctly.

Meanwhile, agentmemory (another competitor) even surpasses MemPalace in R@10 with its BM25+Vector approach: 98.6% vs ~97.6%.

In other words, 96.6% is not fake, but it doesn't mean what you think it means.

Mem0 isn't doing much better either. On LongMemEval-S, Mem0 with GPT-4o scores 49.0%, while Zep/Graphiti reaches 63.8% — 15 points difference.

Benchmarks: whoever runs them wins.

Cold Water

Both projects are impressive, but both have serious flaws.

MemPalace's problems:

Incomplete memory garbage collection — Long-term memory can bloat, relying on manual cleanup. An agent that has been running for 3 months could have a memory store full of outdated information
Multi-agent shared memory not verified — If Claude Code and Cursor use it simultaneously, will memories conflict? The official documentation doesn't say
API still iterates rapidly — Code written today may not work next month
"Milla Jovovich wrote this code" — Many on Reddit suspect this is a marketing stunt. More likely, a CEO hired engineers, the engineers deleted their accounts, and Milla was put forward as the face. This doesn't affect code quality but affects trust

Mem0's problems:

Self-hosted version heavily neutered — Automatic conflict resolution, hosted graph memory, all missing; you have to build it yourself
Reliance on LLM calls — Each memory extraction requires an API call, incurring non-negligible costs. The free tier only gives 1000 memories per month
Weak recall for long-tail scenarios — Works well for user preferences and factual memory, but recall drops sharply for scenarios requiring reasoning (e.g., "why was this proposal rejected last time")
Lock-in risk from closed-source platform — The free tier is just bait; production usage must be paid for

There's also a problem neither side solves: memory poisoning.

Mem0 itself published an article on June 22 admitting that malicious input can contaminate the agent's memory, causing it to make wrong decisions in subsequent conversations. This is an OWASP ASI06-level security vulnerability with no good defense currently.

Getting Started in 5 Minutes

If you just want to try them, both are quick.

MemPalace:

pip install mempalace
mempalace serve --port 8000

Integration with Claude Code (MCP):

{
  "mcpServers": {
    "memory": {
      "command": "mempalace",
      "args": ["mcp"]
    }
  }
}

Mem0:

pip install mem0ai

from mem0 import Memory
m = Memory()
m.add("I like using Python's dataclass", user_id="lininn")
result = m.search("Python", user_id="lininn")

MemPalace is fully local and zero-cost. Mem0 has a cloud platform, but that also means your data goes to someone else's servers.

Which one to choose? Depends on whether you trust "original text can be traced back" or "AI extraction is more efficient."

The Essence of This War

In the 2026 agent space, there is consensus: The difference between agents that work and those that don't is not model capability but memory design.

When everyone can call GPT-5.5, Claude 4, DeepSeek V4, what determines whether an agent is useful is not "whether the model understands" but "whether the agent remembers."

MemPalace represents one philosophy: Memory is a filesystem problem. Store original text, retrieve with deterministic algorithms, don't rely on black boxes.

Mem0 represents another: Memory is a cognition problem. Let the LLM decide what's worth remembering, how to remember, and when to forget.

Both paths are racing. MemPalace went from 0 to 56k stars in two months; Mem0 accumulated 59k stars in three years and got $24M in funding.

But at the end of the day, they start from the same point — an absurd truth: In 2026, the world's smartest AI forgets everything the moment you close the window.

This is not normal.

And normal people don't think it's abnormal.

Project data as of June 24, 2026. MemPalace: github.com/MemPalace/mempalace | Mem0: github.com/mem0ai/mem0

AI Memory System by Resident Evil Actress: 56k Stars Behind a Route Debate

Two Routes: Store Original Text vs Store Experience

Benchmark Controversy: Is 96.6% Real?

Cold Water

Getting Started in 5 Minutes

The Essence of This War

评论