How MemPalace Hit #1 on GitHub in 48 Hours, And What Builders Actually Learned
The Milla Jovovich repo, 41,200 stars, a benchmark controversy, and what the viral moment reveals about the enormous unmet demand for local AI memory.
On April 5, 2026, actress Milla Jovovich pushed a Python repository called MemPalace to her personal GitHub. Within 48 hours it had 7,000+ stars. By April 8 it crossed 23,000 stars and nearly 3,000 forks, hitting #1 trending on GitHub across all languages. A community controversy followed within hours: the claimed 100% LongMemEval score was revised to 96.6% after developers identified benchmark overfitting. The tool still works. The core architecture is real. And the viral moment revealed something more important than the benchmark drama: the market for local, private AI memory is enormous, underserved, and ready to explode. This is the full breakdown of what happened, what was real, what was not, and what it means for the AI memory space.
The story: a Hollywood actress, a Bitcoin startup CEO, and a viral repo
Milla Jovovich, the Resident Evil and Fifth Element actress, had been using AI tools heavily since late 2025, logging thousands of conversations across Claude, ChatGPT, and Cursor. She hit the wall every power user hits: start a new session, the model has complete amnesia. Months of project context, decision history, and architectural preferences, gone.
She found Ben Sigman, CEO of Libre Labs, a Bitcoin lending company. They spent months building MemPalace together, using Claude Code as the primary development tool. Jovovich designed the organizational architecture, the "memory palace" metaphor from classical Greek rhetoric. Sigman handled the engineering. On April 5, 2026, the repo went live on Jovovich's personal GitHub.
What MemPalace actually builds
The core problem it solves. Most AI memory systems (Mem0, Zep, standard RAG) use an LLM to decide what information to keep and what to discard. A conversation becomes "user prefers Postgres" and the original reasoning is thrown away. MemPalace inverts this: store every word verbatim, then organize it so semantic search works across the full archive.
The palace metaphor as a data structure. The structure is inspired by the ancient method of loci used by Greek orators: a Wing is a top-level domain (a project, a person, a company), a Room is a sub-topic within a wing, and a Hall is a memory type that runs across every wing (facts, events, advice, decisions).
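As a rough illustration, the taxonomy can be modeled as a tagged record plus a filter. This is a minimal sketch, not MemPalace's actual schema: the names `Memory`, `Palace`, `add`, and `in_scope` are hypothetical, and the hall names are simply the four examples listed above.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hall names taken from the article's examples; the real set may differ.
HALLS = {"facts", "events", "advice", "decisions"}


@dataclass(frozen=True)
class Memory:
    wing: str   # top-level domain: a project, a person, a company
    room: str   # sub-topic within the wing
    hall: str   # memory type that cuts across every wing
    text: str   # verbatim text, stored without summarization


@dataclass
class Palace:
    memories: list = field(default_factory=list)

    def add(self, wing: str, room: str, hall: str, text: str) -> Memory:
        if hall not in HALLS:
            raise ValueError(f"unknown hall: {hall!r}")
        memory = Memory(wing, room, hall, text)
        self.memories.append(memory)
        return memory

    def in_scope(self, wing: Optional[str] = None,
                 room: Optional[str] = None,
                 hall: Optional[str] = None) -> list:
        """Return the memories that match every scope that was given."""
        return [
            m for m in self.memories
            if (wing is None or m.wing == wing)
            and (room is None or m.room == room)
            and (hall is None or m.hall == hall)
        ]


palace = Palace()
palace.add("billing-service", "database", "decisions",
           "We chose Postgres because the team already runs it in production.")
palace.add("billing-service", "auth", "facts", "Tokens expire after 24 hours.")
palace.add("website", "database", "decisions", "SQLite is enough for the blog.")

# A hall cuts across wings: every decision, regardless of project.
decisions = palace.in_scope(hall="decisions")
```

Here the wing and room narrow the scope to one project and topic, while the hall selects a kind of memory across all of them.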
The retrieval numbers. On LongMemEval, a 500-question benchmark covering information extraction, multi-session reasoning, temporal reasoning, knowledge updates, and abstention, MemPalace scores 96.6% R@5 in raw mode with zero API calls, running entirely on ChromaDB and SQLite locally. The system loads just 170 tokens at startup, a dramatic efficiency advantage over CLAUDE.md file approaches that balloon context on every launch.
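For readers unfamiliar with the metric, R@5 can be sketched as follows. This assumes the common "hit" definition, where a question counts as recalled if any of its gold evidence items appears in the top five results; LongMemEval's exact scoring may differ, and the session ids below are invented toy data.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of questions with at least one relevant id in the top k."""
    if len(retrieved) != len(relevant):
        raise ValueError("retrieved and relevant must align per question")
    if not retrieved:
        return 0.0
    hits = sum(
        1 for ranked, gold in zip(retrieved, relevant)
        if set(ranked[:k]) & set(gold)
    )
    return hits / len(retrieved)


# Toy data: one ranked result list and one gold list per question.
retrieved = [
    ["s3", "s9", "s1", "s4", "s7", "s2"],  # gold s1 at rank 3: hit
    ["s5", "s6", "s8", "s0", "s4", "s2"],  # gold s2 at rank 6: miss at k=5
    ["s7", "s1", "s3", "s9", "s0"],        # gold s7 at rank 1: hit
    ["s1", "s2", "s3", "s4", "s5"],        # gold s9 never retrieved: miss
]
relevant = [["s1"], ["s2"], ["s7"], ["s9"]]

score = recall_at_k(retrieved, relevant, k=5)  # 2 hits out of 4 questions
```

Under this definition, 96.6% on a 500-question set would correspond to 483 questions with a hit in the top five.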
The AAAK compression layer. A 30x compression system that packs entity names and relationships into shorthand readable by any LLM without a decoder. AAAK mode scores 84.2% on LongMemEval versus 96.6% in raw mode: you trade recall for token efficiency.
The benchmark controversy: what the community caught
Within 48 hours of launch, GitHub Issue #29 revealed that the initial 100% LongMemEval claim was achieved by identifying which specific questions the system got wrong, engineering targeted fixes for those exact questions, and retesting on the same set. This is benchmark overfitting, not improvement.
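The usual guard against this failure is to tune on one subset of questions and report on another that is never inspected. The sketch below shows a seeded split of that kind. It is a generic illustration, not how MemPalace or LongMemEval is run; `split_benchmark` and the question ids are hypothetical.

```python
import random


def split_benchmark(question_ids, holdout_fraction=0.5, seed=0):
    """Shuffle ids deterministically and split them into (dev, holdout)."""
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    ids = sorted(question_ids)          # stable starting order
    random.Random(seed).shuffle(ids)    # seeded, reproducible shuffle
    cut = int(len(ids) * (1.0 - holdout_fraction))
    return ids[:cut], ids[cut:]


question_ids = [f"q{i:03d}" for i in range(500)]
dev, holdout = split_benchmark(question_ids, holdout_fraction=0.5, seed=42)

# Inspect failures and engineer fixes against `dev` only.
# Report the final score on `holdout`, evaluated once.
```

Fixes derived from failures on `dev` can then be checked against `holdout`, where a real improvement should still show up and a question-specific patch should not.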
Within 72 hours, Issue #27 flagged multiple discrepancies between README claims and the actual codebase: "30x lossless compression" was lossy (AAAK at 84.2% vs raw at 96.6%), the "+34% palace boost" was metadata filtering rather than a novel architectural advantage, and contradiction detection existed as a utility but was not wired into main graph operations.
The team's response was correct. By April 8, the README was updated with a public "A Note from Milla and Ben" acknowledging each mistake. The corrected 96.6% figure was independently reproduced on an M2 Ultra. After the MemPalace incident, and a separate gstack benchmark controversy the same month, the developer community has zero tolerance for AI memory benchmark fabrication. (Cybernews)
Why it trended, beyond the celebrity hook
The celebrity co-author explains the initial spike. Reaching 41,200 stars required the tool to actually be useful. Three things kept developers engaged:
- 01 Local-first is underserved. Mem0 costs $19–$249/month. Zep starts at $25/month. MemPalace runs entirely locally, MIT-licensed, zero API costs. The r/LocalLLaMA community, hundreds of thousands of developers who self-host everything, had no serious local memory option before this.
- 02 Verbatim storage is architecturally distinct. Storing conversations in full rather than AI-extracting summaries is a meaningful trade-off. For projects where a decision from eight months ago might suddenly become critical, full-verbatim retrieval pays off. Mem0 and Zep's extraction approaches lose the original reasoning.
- 03 The palace structure as cognitive UI. The wing/room/hall taxonomy gives developers a mental model for organizing AI memory that mirrors how humans categorize knowledge. That structure enables scoped search (within a single project rather than across all memory) that flat vector stores cannot match.
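Scoped search, as described in the third point, amounts to filtering candidates by metadata before ranking them by similarity. The following is a toy sketch in plain Python with hand-written three-dimensional vectors; the `search` function and record layout are assumptions for illustration, not MemPalace's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query_vec, k=5, wing=None):
    """Rank records by cosine similarity, optionally restricted to one wing."""
    candidates = [r for r in records if wing is None or r["wing"] == wing]
    ranked = sorted(candidates,
                    key=lambda r: cosine(r["vec"], query_vec),
                    reverse=True)
    return [r["id"] for r in ranked[:k]]


records = [
    {"id": "a", "wing": "billing", "vec": [0.9, 0.1, 0.0]},
    {"id": "b", "wing": "website", "vec": [1.0, 0.0, 0.0]},
    {"id": "c", "wing": "billing", "vec": [0.0, 1.0, 0.0]},
    {"id": "d", "wing": "website", "vec": [0.8, 0.2, 0.0]},
]
query = [1.0, 0.0, 0.0]

flat = search(records, query, k=2)                         # all wings
scoped = search(records, query, k=2, wing="billing")       # one wing only
wrong_scope = search(records, query, k=2, wing="website")  # excludes "a"
```

The filter keeps similar-looking records from other projects out of the results, but if the answer lives in a different wing than the one selected, it is removed before ranking ever happens. In ChromaDB the analogous mechanism is the `where` metadata filter on `collection.query`; whether MemPalace uses it in exactly this way is not confirmed by this article.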
The loophole MemPalace exposed for the market
MemPalace's 41,200-star count represents demand. But the tool itself has a clear ceiling: it is personal, local, and unmanaged. There is no multi-agent coordination, no cross-session reasoning about change, no proactive surfacing of relevant context before the agent knows to ask, and no organizational memory shared across a team or company.
The gap MemPalace reveals is the production problem: multiple AI agents with shared, evolving organizational context, proactive recommendations, and temporal conflict resolution. For the architecture that addresses this, see Context Graph: The Future of AI Memory and Intelligence Layers in AI Agents.
MemPalace is excellent at individual developer memory on a local machine. The production problem requires a different architecture. GeniOS's Context Graph (Section A) stores and scores organizational facts with confidence, freshness, consistency, signal, and authority. Context Intelligence (Section B) runs continuously, monitors for change, and pushes proactive recommendations to agents before they know to ask. That is the organizational-scale version of MemPalace.
Key technical takeaways for builders
- Verbatim storage beats summary extraction when recall depth matters more than storage efficiency. Choose based on your use case.
- 96.6% on LongMemEval in raw mode is a real number. The initial 100% methodology was wrong; the corrected figure is reproducible.
- Palace structure improves search precision, not recall. Independent benchmarks found the architecture can regress raw recall vs flat ChromaDB when scope is not set correctly.
- MCP tools are the right integration primitive in 2026. MemPalace ships 29 MCP tools, which is the correct interface for LLM tool use.
- The stdout bug (Issue #225) corrupts Claude Desktop MCP integration. Claude Code works; Claude Desktop does not until the patch ships.
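The article does not describe the root cause of Issue #225. A common failure of this kind in stdio-based MCP servers is non-protocol text landing on stdout, where the client tries to parse it as a JSON-RPC message. Assuming that class of bug, the sketch below shows the usual mitigation of routing all diagnostics to stderr. The logger name and the `send_message` helper are hypothetical.

```python
import io
import json
import logging
import sys

# Diagnostics go to stderr so stdout carries protocol messages only.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("memory-server")  # hypothetical logger name


def send_message(payload, out=None):
    """Write one JSON message per line to the protocol stream."""
    out = sys.stdout if out is None else out
    out.write(json.dumps(payload) + "\n")
    out.flush()


# Demonstrate with an in-memory stream standing in for stdout.
stream = io.StringIO()
log.info("handling request 1")  # diagnostic: never touches the protocol stream
send_message({"jsonrpc": "2.0", "id": 1, "result": {"ok": True}}, out=stream)

# Every line on the protocol stream parses as JSON.
parsed = [json.loads(line) for line in stream.getvalue().splitlines()]
```

A stray `print()` in the same process would break this invariant, which is why stdio servers typically avoid writing anything but protocol messages to stdout.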
What is MemPalace?
MemPalace is a free, open-source, local-first AI memory system that stores conversations verbatim and retrieves them via semantic search. It uses the ancient method of loci as an organizational metaphor, with wings (top-level domains), rooms (sub-topics), and halls (memory types).
How many GitHub stars did MemPalace get?
MemPalace crossed 23,000 stars by April 8, 2026, three days after its April 5 launch, and passed 41,200 stars within two weeks, reaching #1 trending on GitHub across all languages.
What score does MemPalace get on LongMemEval?
96.6% R@5 in raw mode, with zero API calls. The initial 100% claim was revised after developers identified benchmark overfitting in GitHub Issue #29.
Who built MemPalace?
Actress Milla Jovovich designed the palace architecture and co-created the project. Ben Sigman, CEO of Libre Labs, handled the engineering. The project is MIT-licensed.
How is MemPalace different from Mem0 or Zep?
MemPalace stores conversations verbatim with zero extraction and runs entirely locally at no cost. It lacks organizational memory, multi-agent coordination, and proactive reasoning. Mem0 and Zep are managed services with graph layers, multi-tenant support, and production-grade uptime guarantees.
What is the AAAK compression layer in MemPalace?
A 30x compression system that packs entity names and relationships into shorthand readable by any LLM without a decoder. AAAK mode trades recall (84.2% on LongMemEval) for token efficiency vs raw mode (96.6%).