supermemory.ai · AI tool

SuperMemory.ai

Pricing plans

No detailed pricing plan is available for this tool yet.

Detailed overview

Research · #1 on MemoryBench: best in latency, quality, and cost

Your AI is only as good as what it remembers

Context infrastructure for your AI agents: user profiles, memory graph, retrieval, extractors, and connectors. Not just memory. Complete understanding.

$ npx skills add supermemoryai/skills

WHAT WE DO

Bring your data. We build understanding. Your agent just knows.
Context infrastructure for AI agents. One API, every capability.

FOR DEVELOPERS & TEAMS
The Supermemory API
We have built the state of the art in retrieval and memory. Whether you need RAG, memory, or extraction, it's all built in. One API, every capability.
<300ms: Always fast. Always.
100B+: Tokens monthly. We scale.
#1: Quality across metrics.
Self-hostable · SOC 2 · TypeScript & Python SDKs

FOR EVERYONE
Personal Supermemory
One single memory across everything you use. What you teach one AI, every AI remembers.
Supermemory App: Control panel for all your context
Claude Code · Cursor · OpenClaw · OpenCode: AI plugins
Chrome Extension: One-click save
10,000+ power users

THE CONTEXT STACK

Five layers. Complete context.
Most memory solutions give you one layer. We give you all five, working together. So instead of using five services, you just use Supermemory. Saving cost, effort, and getting better context to your agent.

01 User Profiles
An internal model that builds deep user profiles from behavior. Your AI doesn't just recall. It understands intent, preferences, and context.

02 Memory Graph
Custom vector graph engine with ontology-aware edges. Knowledge updates, merges, contradicts, and infers. It never just appends.

03 Retrieval
Hybrid vector + keyword search with sub-300ms latency. Context-aware reranking ensures the most relevant memories surface first.

04 Extractors
Understand any format: PDFs, web pages, images, audio. Smart chunking that preserves meaning across document boundaries.

05 Connectors
Pull from Notion, Slack, Google Drive, S3, Gmail, and custom sources. Data stays in sync automatically. No manual imports, no stale context.

UNDER THE HOOD

Knowledge that evolves, not just stores
Not another vector database. A custom engine built from scratch.

Vector Graph Engine
Maps real relationships between memories, not just similarity scores. Ontology-aware edges that understand how knowledge connects.

User Understanding Model
Builds deep profiles from behavior. Your AI doesn't just recall. It understands intent, preferences, and context.

100B+ tokens processed monthly · Every query <300ms

BENCHMARKS

We don't think benchmarks tell the full story. But we lead every major one anyway.
State-of-the-art on LongMemEval, LoCoMo, and ConvoMem. We also built MemoryBench, an open eval platform for memory systems.

LongMemEval: 85.2% (Long-term memory evaluation)
LoCoMo: #1 (Long conversation memory)
ConvoMem: #1 (Conversational memory)

5 context layers: Others offer 1-2
<300ms p95 latency: At any scale
100B+ tokens/month: Production proven

Feature | Supermemory | Mem0 | Zep
Memory Graph | ✓ | Partial | ✗
User Profiles | ✓ | ✗ | ✓
Document Retrieval | ✓ | ✗ | ✓
Connectors | ✓ | ✗ | Partial
Document Extractors | ✓ | ✗ | ✗
Sub-300ms Latency | ✓ | ✗ | ✓
Self-hostable | ✓ | ✓ | ✓
Consumer Plugins | ✓ | ✗ | ✗
Open Eval Platform | ✓ | ✗ | ✗

DEVELOPER EXPERIENCE

Setup in 5 minutes
The simplest SDKs for memory. Three lines to add, one API to rule them all. TypeScript, Python, REST: pick your weapon.
TypeScript

    import Supermemory from 'supermemory'

    const client = new Supermemory()

    // Add a memory
    await client.add({
      content: "User prefers dark mode and TypeScript",
      containerTags: ["user_123"]
    })

    // Search memories
    const results = await client.search.documents({
      q: "What are the user's preferences?",
      containerTags: ["user_123"]
    })

Python

    from supermemory import Supermemory

    client = Supermemory()

    # Add a memory
    client.add(
        content="User prefers dark mode and TypeScript",
        container_tags=["user_123"]
    )

    # Search memories
    results = client.search.documents(
        q="What are the user's preferences?",
        container_tags=["user_123"]
    )

REST API

    # Add a memory
    curl https://api.supermemory.ai/v3/add \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"content": "User prefers dark mode", "containerTags": ["user_123"]}'

    # Search memories
    curl https://api.supermemory.ai/v3/search \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"q": "user preferences", "containerTags": ["user_123"]}'

Works with everything
TypeScript, Python, REST API, Claude Code, OpenClaw, OpenCode, Vercel AI SDK, LangChain, LangGraph, CrewAI, OpenAI SDK, Mastra, Zapier, n8n, Pipecat

PERSONAL SUPERMEMORY

One memory. Every tool.
What you teach one AI, every AI remembers. Your context follows you everywhere.

Save memories from anywhere
Save links, chats, PDFs, images, videos and documents in one click using our Chrome extension, apps, or API.

Watch your memories come alive
Supermemory indexes your memories and automatically extracts meaning to find relevant connections.
"Summarize the key ideas from My Gita."
"Which memories connect design and AI?"
"What are the main themes across my memories?"

Talk to everything you've ever saved
Ask questions across all your memories. Get instant answers from your bookmarks, notes, documents and conversations.
Organize with projects
Group memories into focused spaces for research, work, ideas, or anything you care about.

10,000+ power users · 10+ integrations · <300ms recall

ENTERPRISE

Loved by teams of every scale.
Whether you're a startup, scaling, or a large enterprise, Supermemory's engine deploys in any stack and environment. State-of-the-art memory, wherever you need it.

Some of the best teams trust Supermemory for their agents

Self-host on your premises
Supermemory Enterprise deploys in your VPC, your cloud, your rules. Full control.

SOC 2, HIPAA, GDPR

You own your data
Export anytime. We don't train models on your data. Ever.

TESTIMONIALS

Builders love us. And they can't stop telling the world about it.

Dillon Mulroy @dillon_mulroy · Dec 31, 2025
bullish on @supermemory
^ he's a VP at Cloudflare!!!!

Armin Daryabegi @saasjesus · Feb 19, 2026
This is crazy... We just ditched RAG completely and went memory only through @supermemory. Reduced avg response time from 40s -> 12s Using about 40-50% fewer tokens Just memory & near realtime web-search even for volatile information which shouldn't read a 5 days old, cached https://t.co/5jmiY9cOq9
^ Switched to the enterprise plan after using it for a day. what an epic customer

zan @zanbuilds · Mar 18, 2026
vibe coding feels good but have you vibed with @supermemory yet integrated it today in my hackathon project & took 120 seconds https://t.co/48wk4fCcQc

Nicolas codet @NicolasCodet · Mar 9, 2026
@supermemory one of the best products ive used in a long time props to the team behind this

Lovenya Jain @lovenyajain · Mar 11, 2026
@signulll @supermemory is pretty great

Kyle @kyl3kan · Feb 23, 2026
@anshnanda @supermemory saved my ass last week

Nick Horob @NickHorob · Feb 27, 2026
@jakefromfargo I've had no issues! I have mine connected to @supermemory and set up with QMD. So far, so good

lucacadalora (e/aiccelerate.id) @lucaxyzz · Feb 23, 2026
This is beautiful, 4 days old openclaw @supermemory https://t.co/N6draHxamN

shaan @epistetechnic · Mar 7, 2026
@insider0x I really like @supermemory

Harshil Mathur @harshilmathur · Feb 16, 2026
@levelsio Tried almost everything mentioned here - structured memory files, qmd etc. the only thing that works reliably is @supermemory . Not facing any memory issues after setting it up.
^ CEO of a billion dollar company!!!!

Ry @heyhaigh · Dec 30, 2025
My AI agent memories just got supercharged ⚡ with @supermemory. Supermemory is a fantastic RAG tool that has provided much more sophisticated memory for my personal voice agent and for user sessions alike. Here's a full breakdown of how I've leveraged it. Hats off to https://t.co/Dh2m37Zrrs

Zimm @Dan_Zmann · Mar 17, 2026
@DeRonin_ Running an actual J.A.R.V.I.S. with read / write access, custom tools, live web and x search, 4 unique Google workspace auth, supermemory. Sub 2s response time, sub 1s tool call Will film a demo soon Using @zocomputer and @supermemory is cheat code

ChaosCrux @ChaosCruxFL · Feb 23, 2026
@corey_fransen @steeke7 @openclaw @supermemory Been running @supermemory across 4 machines and multiple repos.. watching the memory graph grow and start connecting context across projects has been one of the most satisfying things I've seen in a dev tool. The graph visualization is genuinely beautiful. https://t.co/AGVKU01jMO

Aditya Vellanki @aditya_vellanki · Mar 10, 2026
my entire life is pretty much stored on @supermemory now https://t.co/Lm8wgUI7kU

M3BIONIX @m3bionix · Feb 25, 2026
Really loved your documentation @supermemory

Prasanna @Mrmemehead · Mar 2, 2026
This is where @supermemory shines! A central repo tided to no single provider.

Light 🌾 @lightwaslost · Feb 20, 2026
@NickPlaysCrypto @openclaw try @supermemory, works well.

Micky @Rasmic · Feb 18, 2026
Let’s see if @supermemory cooks https://t.co/nybko6Uwy9
^ Devrel at our favorite company!!!!

Dominik Koch @dominikkoch · Mar 9, 2026
Been in love with @supermemory lately such a cool thing to add to your agent Also shoutout to their founder who sat down and helped me with my supermemory installation

Dewaldt Huysamen @dewaldt_h · Mar 14, 2026
@supermemory I just migrated my entire ChatGPT history into Supermemory using Claude Code, parsed 1,797 conversations, extracted the facts, and automatically routed them into the right containers.

Pikissou @pikissou · Feb 24, 2026
@ziwenxu_ I use @supermemory, it's life-changing

Michael Steeke 🌾 @steeke7 · Feb 21, 2026
Its been a little under a week with @openclaw .... The memory graph, from @supermemory buildout is so cool. The goal: Make connections faster between data in my database, then learn faster, fail faster, and learn faster again. https://t.co/a1oRweCIYO

Beau Johnson @BeauJohnson89 · Mar 7, 2026
one openclaw hack i wish i knew first was setting up and fixing this memory problem @supermemory fixed the issue, handed it over to openclaw. no more failing memory issues. remembers everything we talk about, also improves redundant tasks that we work on, creates a skill

Dhruval @DhruvalGolakiya · Feb 18, 2026
Memory powered by @supermemory for each workspace So your agent has context of your all channels chats

anmol @trex3x00 · Feb 25, 2026
@karpathy @supermemory doin just right

PRICING

Simple, transparent pricing.
One pricing structure covers everything: plugins, enterprise features, and API access. No separate bills, no per-product charges.
In every plan: unlimited storage, unlimited users, free multi-modal extraction.

FREE: $0
Get started with basic memory features
- 1M tokens/month
- 10K search queries/month
- Unlimited storage & users
- Free multi-modal extraction
- Email support

PRO: $19/mo.
For developers building with AI memory
- 3M tokens/month
- 100K search queries/month
- Unlimited storage & users
- Free multi-modal extraction
- Priority support
- All plugins (Claude Code, Cursor, OpenCode, OpenClaw)

SCALE: $399/mo.
For teams and production workloads
- 80M tokens/month
- 20M search queries/month
- Unlimited storage & users
- Free multi-modal extraction
- Dedicated support
- Gmail, S3, Web Crawler connectors

ENTERPRISE: Custom
Custom deployments with dedicated engineering
- Unlimited tokens
- Unlimited search queries
- Forward-deployed engineer
- Custom integrations & SSO

OVERAGE · PRO & SCALE
$0.01 per 1,000 tokens
$0.10 per 1,000 queries
Only charged when you exceed your plan limits. No surprises.

Startup Program
$1,000 in credits · Dedicated support · 6 months to build

Your Agent needs its supermemory.

---

Supermemory is the new State-of-the-Art in agent memory

A new memory architecture that solves long-term forgetting in LLMs, delivering state-of-the-art performance on LongMemEval by enabling reliable recall, temporal reasoning, and knowledge updates at scale.

LongMemEval-S Benchmark: Supermemory vs Zep vs Full context

Category | Supermemory | Zep | Full context
single-session-user | 97.14% | 92.9% | 81.4%
single-session-assistant | 96.43% | 80.4% | 94.6%
single-session-preference | 70.00% | 56.7% | 20.0%
knowledge-update | 88.46% | 83.3% | 78.2%
temporal-reasoning | 76.69% | 62.4% | 45.1%
multi-session | 71.43% | 57.9% | 44.3%
Overall | 81.6% | 71.2% | 60.2%

Introduction

Large Language Models (LLMs) fundamentally suffer from "forgetting". They treat every interaction as a new, discrete event, lacking the persistent continuity required for personalized user experiences. While context windows are growing, LLMs still tend to lose information placed in the middle of the context window, as shown by Liu et al. [1], and long contexts come with high latency.
In this report, we introduce Supermemory, a memory engine designed to solve the problem of long-term coherence. We demonstrate that Supermemory achieves State-of-the-Art (SOTA) results on LongMemEval_s [2], effectively solving the challenges of temporal reasoning and knowledge conflicts in high-noise environments (115k+ tokens).

Authors
Soham Daga, AI Researcher, Supermemory
Sreeram Sreedhar, AI Researcher, Supermemory
Dhravya Shah, CEO, Supermemory

The Evaluation Landscape: Why LongMemEval?

Current benchmarks in the LLM memory space often fail to capture the chaos of real-world production environments. Benchmarks like LoCoMo [3] are insufficient for modern models because of their limited context size and their lack of knowledge updates (tests of the ability to overwrite obsolete information with newer information).

We used LongMemEval [2] for validation because it is the most rigorous approximation of real-world chat history. It challenges the retrieval system not just on recall, but on reasoning over time and filtering out noise. Unlike other benchmarks, which present human-human interaction, LongMemEval tests human-assistant interactions, which are more representative of real-world usage, as also highlighted by Rasmussen et al. [4]

LongMemEval_s spans 500 questions split into 6 categories and evaluates five core memory capabilities:

- Information Extraction: Accurately extracting and storing factual information from conversations. Categories:
    - single-session-user: Retrieving literal context mentioned by the user within a single session.
    - single-session-assistant: Retrieving literal context mentioned by the assistant within a single session.
    - single-session-preference: Extracting implicit user preferences to inform personalized responses.
- Multi-Session Reasoning: Synthesizing information scattered across multiple conversation sessions. Categories: multi-session
- Knowledge Update: Handling scenarios where newer information contradicts or supersedes older facts. Categories: knowledge-update
- Temporal Reasoning: Understanding the sequence of events, calculating time intervals, and reasoning about relative timestamps. Categories: temporal-reasoning
- Abstaining on Unanswerable Questions: Recognizing when sufficient information is not available and appropriately declining to answer. Categories: all

These capabilities cover a broad variety of general real-world use-cases.

Methodology: Supermemory's Architecture

Supermemory outperforms existing solutions by minimizing semantic ambiguity, which is a major reason LLMs fail to use context effectively, as demonstrated by Keluskar et al. [5] We achieve this by coupling memories with temporal metadata, relations, and raw chunks.

1. Chunk-based Ingestion & Contextual Memories

Standard RAG (Retrieval Augmented Generation) [6] often fails because it retrieves raw chunks that lack context when isolated from the conversation. [7]

[Figure: Example of where standard RAG fails due to ambiguity.]

- Chunking: We decompose large sessions into manageable semantic blocks.
- Memory Generation: As we index the chunk, we also generate memories: single (atomic) pieces of information that resolve ambiguous references within the chunk, using a modified version of Contextual Retrieval. [8]

2. Relational Versioning & Knowledge Chains

Supermemory also defines semantic relationships between new and existing memories. This allows us to map the evolution of facts:

- updates (State Mutation): Handles contradictions or corrections (e.g., "My favorite color is now Green" updates "My favorite color is Blue"), creating a version history of sorts.
- extends (Refinement): Supplements existing nodes with new details without contradiction (e.g., adding a job title to an existing employment memory).
- derives (Inference): Captures second-order logic inferred from combining multiple distinct memories.

[Figure: Example of relational versioning in Supermemory.]

3. Temporal Grounding

A core differentiator in our architecture is the dual-layer time-stamping approach, which drove our high scores in the temporal-reasoning, knowledge-update, and multi-session categories. For every memory, we extract:

- documentDate: The time the conversation took place.
- eventDate: The extracted timestamp of when the event described in the conversation actually occurred.

[Figure: Example of temporal grounding in Supermemory.]

4. Hybrid Search Strategy

We perform semantic search on the memories to identify relevant concepts. Because each memory encapsulates a single piece of information (high signal, low noise), this is generally more accurate than searching the noisy chunks directly, as noted by several sources. [9][10]

Once a hit is found, we inject the original source chunk for the memory into the result output. This allows the LLM to access the "finer details" required for nuance while relying on the atomicity of the memory for high-precision retrieval. This addresses the concern about information loss raised in Section 5.2 of the LongMemEval [2] paper.

5. Session-Based Ingestion

Unlike the evaluation methodology described in the LongMemEval [2] paper, which processes conversation history round-by-round (one user message followed by one assistant response), we choose to ingest the dataset session-by-session.

Performance Results

Supermemory demonstrates superior performance across all categories on LongMemEval_s. The system shows particular strength in Multi Session (71.43%) and Temporal Reasoning (76.69%), areas where standard vector-store approaches historically struggle.
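The methodology above (atomic memories linked by an updates relation, dual timestamps, and search over memories with the source chunk injected into each hit) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not Supermemory's implementation: the `Memory` and `MemoryStore` classes, the `is_latest` flag, and the word-overlap scoring, which merely stands in for real semantic search.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Memory:
    """One atomic memory with the metadata described in the report."""
    id: str
    text: str                  # atomic, disambiguated fact
    source_chunk: str          # raw chunk the memory was generated from
    document_date: date        # documentDate: when the conversation took place
    event_dates: list = field(default_factory=list)  # eventDate: when the event occurred
    relation: Optional[str] = None    # "updates", "extends", or "derives"
    related_to: Optional[str] = None  # id of the memory this one points at
    is_latest: bool = True            # hypothetical flag, cleared when superseded


class MemoryStore:
    """Toy in-memory store; every name here is illustrative."""

    def __init__(self):
        self.memories = {}

    def add(self, memory: Memory) -> None:
        # "updates" supersedes the older memory but keeps it as version history.
        if memory.relation == "updates" and memory.related_to in self.memories:
            self.memories[memory.related_to].is_latest = False
        self.memories[memory.id] = memory

    def search(self, query: str) -> list:
        # Word overlap is a stand-in for semantic search over memories.
        terms = set(query.lower().split())
        scored = []
        for m in self.memories.values():
            overlap = len(terms & set(m.text.lower().split()))
            if m.is_latest and overlap > 0:
                scored.append((overlap, m))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Each hit carries its source chunk so the reader model sees the finer details.
        return [
            {
                "memory": m.text,
                "chunk": m.source_chunk,
                "documentDate": m.document_date.isoformat(),
                "eventDate": [d.isoformat() for d in m.event_dates],
            }
            for _, m in scored
        ]


store = MemoryStore()
store.add(Memory(
    id="m1",
    text="User's favorite color is Blue",
    source_chunk="user: My favorite color is Blue.",
    document_date=date(2024, 1, 10),
))
store.add(Memory(
    id="m2",
    text="User's favorite color is Green",
    source_chunk="user: Actually, as of last week my favorite color is now Green.",
    document_date=date(2024, 3, 9),
    event_dates=[date(2024, 3, 2)],
    relation="updates",
    related_to="m1",
))

results = store.search("favorite color")
```

In this sketch the superseded "Blue" memory stays in the store as history but is excluded from search, so only the "Green" memory is returned, together with its raw chunk and both timestamps.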
LLM-as-Judge Evaluation

Model | SSU | SSA | SSP | KU | TR | MS | Overall
Full-context (gpt-4o) | 81.4% | 94.6% | 20.0% | 78.2% | 45.1% | 44.3% | 60.2%
Zep (gpt-4o) | 92.9% | 80.4% | 56.7% | 83.3% | 62.4% | 57.9% | 71.2%
Supermemory (gpt-4o) | 97.14% | 96.43% | 70.00% | 88.46% | 76.69% | 71.43% | 81.6%
Supermemory (gpt-5) | 97.14% | 100% | 76.67% | 87.18% | 81.20% | 75.19% | 84.6%
Supermemory (gemini-3-pro) | 98.57% | 98.21% | 70.00% | 89.74% | 81.95% | 76.69% | 85.2%
Delta (gpt-4o) | ↑4.56% | ↑19.94% | ↑23.46% | ↑6.19% | ↑22.90% | ↑23.37% | ↑14.61%

Column key: SSU = single-session-user, SSA = single-session-assistant, SSP = single-session-preference, KU = knowledge-update, TR = temporal-reasoning, MS = multi-session. The Delta row is the relative improvement of Supermemory (gpt-4o) over Zep (gpt-4o); for example, Overall is (81.6 - 71.2) / 71.2 ≈ 14.61%.

How to reproduce these results

We believe in transparency and rigorous cross-validation.

- Data & Prompts: The full prompt used for answering is provided in our appendix. For answer evaluation, we used gpt-4o with the question-specific prompts provided in the LongMemEval paper. [2]
- Codebase: The ingestion pipeline, search logic, and evaluation scripts are available in our GitHub repository.

Conclusion

The ability to accurately recall user details, respect temporal sequences, and update knowledge over time is not a "feature"; it is a prerequisite for Agentic AI. By moving beyond simple vector similarity and implementing this form of disambiguation, Supermemory provides a robust backend for enterprise applications. It transforms the LLM from a stateless processor into a stateful assistant, capable of maintaining long-term, personalized user narratives with high fidelity.

Citations

[1] Liu, N. F., Lin, K., Hewitt, J., Paranjape, A., Bevilacqua, M., Petroni, F., & Liang, P. (2024). Lost in the middle: How language models use long contexts. Transactions of the Association for Computational Linguistics, 12, 157-173.
[2] Wu, D., Wang, H., Yu, W., Zhang, Y., Chang, K. W., & Yu, D. (2024). Longmemeval: Benchmarking chat assistants on long-term interactive memory. arXiv preprint arXiv:2410.10813.
[3] Maharana, A., Lee, D. H., Tulyakov, S., Bansal, M., Barbieri, F., & Fang, Y. (2024). Evaluating very long-term conversational memory of llm agents. arXiv preprint arXiv:2402.17753.
[4] Rasmussen, P., Paliychuk, P., Beauvais, T., Ryan, J., & Chalef, D. (2025). Zep: a temporal knowledge graph architecture for agent memory. arXiv preprint arXiv:2501.13956.
[5] Keluskar, A., Bhattacharjee, A., & Liu, H. (2024, December). Do llms understand ambiguity in text? A case study in open-world question answering. In 2024 IEEE International Conference on Big Data (BigData) (pp. 7485-7490). IEEE.
[6] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., ... & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in neural information processing systems, 33, 9459-9474.
[7] Barnett, S., Kurniawan, S., Thudumu, S., Brannelly, Z., & Abdelrazek, M. (2024, April). Seven failure points when engineering a retrieval augmented generation system. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering (pp. 194-199).
[8] Ford, D. (2024, September). Introducing Contextual Retrieval. In Anthropic Engineering Blog.
[9] Doval, Y., Vilares, J., & Gómez-Rodríguez, C. (2020). Towards robust word embeddings for noisy texts. Applied Sciences, 10(19), 6893.
[10] Shah P. (2024, August). The Effects of Data Noise on the Efficiency of Vector Search Algorithms. In LinkedIn Pulse.

Appendix

Answering Prompt

You are a question-answering system. Based on the retrieved context below, answer the question.

Question: ${question}
Question Date: ${questionDate}

Retrieved Context:
${retrievedContext}

Understanding the Context:
The context contains search results from a memory system. Each result has multiple components you can use:

Memory: A high-level summary/atomic fact (e.g., "Alex loves hiking in mountains", "John reports to Maria")
- This is the searchable title/summary of what was stored

Chunks: The actual detailed raw content where the memory was extracted from
- Contains conversations, documents, messages, or text excerpts
- This is your primary source for detailed information and facts
- Look here for specifics, context, quotes, and evidence

Temporal Context (if present):
- Question Date: The date when the question was asked (provided above). Use this to understand the temporal perspective of the question.
- documentDate: ISO date string for when the content was originally authored/written/said by the user (NOT the system createdAt timestamp).
- eventDate: Array of ISO date strings for when the event/fact being referenced actually occurred or will occur.

Profile Data (if present):
- Static Profile: Permanent user characteristics (name, preferences, core identity)
- Dynamic Profile: Contains a subset of the recently added memories

Version: Shows if a memory has been updated/extended over time

How to Answer:
- Start by scanning memory titles to find relevant results
- Read the chunks carefully - they contain the actual details you need
- Use temporal context to understand when things happened
- Use profile data for background about the user
- Synthesize information from multiple results if needed

Instructions:
- If the context contains enough information to answer the question, provide a clear, concise answer
- If the context does not contain enough information, respond with "I don't know" or explain what information is missing
- Base your answer ONLY on the provided context
- Prioritize information from chunks - they're the raw source material

Answer:

Results for Zep were taken from their paper. [4]

Intelligence without memory is just randomness.
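The answering prompt in the appendix takes three placeholders: `${question}`, `${questionDate}`, and `${retrievedContext}`. As a rough sketch of how such a template might be filled, the example below uses Python's standard `string.Template`, whose `${name}` syntax matches the placeholders. The prompt text is abbreviated, and the `format_result` layout and the `build_prompt` helper are assumptions for illustration; the report does not say how search results are serialized into the context.

```python
from string import Template

# Abbreviated from the appendix prompt; the explanatory sections are omitted here.
ANSWER_PROMPT = Template(
    "You are a question-answering system. Based on the retrieved context "
    "below, answer the question.\n\n"
    "Question: ${question}\n"
    "Question Date: ${questionDate}\n\n"
    "Retrieved Context:\n${retrievedContext}\n\n"
    "Answer:"
)


def format_result(result: dict) -> str:
    """Render one search result. This field layout is an assumption."""
    lines = [f"Memory: {result['memory']}", f"Chunk: {result['chunk']}"]
    if result.get("documentDate"):
        lines.append(f"documentDate: {result['documentDate']}")
    if result.get("eventDate"):
        lines.append("eventDate: " + ", ".join(result["eventDate"]))
    return "\n".join(lines)


def build_prompt(question: str, question_date: str, results: list) -> str:
    """Fill the three placeholders; results are separated by blank lines."""
    context = "\n\n".join(format_result(r) for r in results)
    return ANSWER_PROMPT.substitute(
        question=question,
        questionDate=question_date,
        retrievedContext=context,
    )


prompt = build_prompt(
    "What is the user's favorite color?",
    "2024-03-10",
    [{
        "memory": "User's favorite color is Green",
        "chunk": "user: My favorite color is now Green.",
        "documentDate": "2024-03-09",
        "eventDate": ["2024-03-02"],
    }],
)
```

`Template.substitute` raises a `KeyError` if any placeholder is left unfilled, which makes a missing question date fail loudly instead of silently producing a prompt without temporal context.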
---

Overview: What is Supermemory?

Supermemory is the long-term and short-term memory and context infrastructure for AI agents. It is the state of the art across multiple benchmarks, such as LongMemEval and LoCoMo.

With supermemory, developers can provide perfect recall about their users to build AI agents that are more intelligent, more personalized, and more consistent.

Additionally, supermemory has all the pieces of the context stack built in:
- Agent memory
- Content extraction
- Connectors and syncing
- Managed RAG platform

All this, coming together, makes supermemory the best abstraction to provide to agents.

How does it work? (at a glance)

- You send Supermemory text, files, and chats.
- Supermemory intelligently indexes them and builds a semantic understanding graph on top of an entity (e.g., a user, a document, a project, an organization).
- At query time, we fetch only the most relevant context and pass it to your models.

Supermemory is context engineering.

Ingestion and Extraction

Supermemory handles all the extraction, for any data type that you have:
- Text
- Conversations
- Files (PDF, Images, Docs)
- Even videos!

After that, we offer three ways to add context to your LLMs:

Memory API: Learned user context

Supermemory learns and builds the memory for the user. These are extracted facts about the user that:
- Evolve on top of existing context about the user, in real time
- Handle knowledge updates, temporal changes, forgetfulness
- Create a user profile as the default context provider for the LLM

This can then be provided to the LLM to give more contextual, personalized responses.

User profiles

Having the latest, evolving context about the user also allows us to create a User Profile. This is a combination of static and dynamic facts about the user that the agent should always know.

Developers can configure what counts as static and dynamic content in supermemory, depending on their use case.
- Static: Information that the agent should always know.
- Dynamic: Episodic information, such as the last few conversations.

This leads to a much better retrieval system and extremely personalized responses.

RAG: Advanced semantic search

Along with the user context, developers can also choose to search the raw context. We provide full RAG-as-a-service, along with:
- Full advanced metadata filtering
- Contextual chunking
- Integration with the memory engine

See the full API Reference tab for detailed endpoint documentation.

All three approaches share the same context pool when using the same user ID (containerTag). You can mix and match based on your needs.

Next steps

- Quickstart: Make your first API call in minutes
- How it Works: Understand the knowledge graph architecture

---

How we build supermemory - best memory engine on the planet.
Supermemory Blog - Memory infrastructure for LLMs

Read the follow up here!
https://x.com/DhravyaShah/status/2036243995500966260?s=20 TLDR: This was a big social experiment that we did to create a new standard for reporting memory system’s quality. It was a parody. A few months ago, we published our first research report showing Supermemory …

Details of the March 6, 2026 service degradation at Supermemory: root cause analysis, timeline, and steps taken to prevent future incidents.

We built a plugin for Claude Code and OpenCode that gives your coding agent persistent memory. It remembers your preferences, learns your codebase, and never loses context mid-conversation. The result is an agent you can run for months without starting over. Here's how it works, in the order …

TLDR: Today, we are releasing a new version of our openclaw plugin - https://github.com/supermemoryai/openclaw-supermemory. This post is going to be a bit technical, so bear with me (or bookmark for later!) In this post, I will talk about what we do about OpenClaw memory, and how …

Today, we are launching the Supermemory plugin for Claude Code! TLDR: You can use supermemory in claude code now. - https://github.com/supermemoryai/claude-supermemory Claude code has genuinely changed how I work. But there's this one thing that drives me crazy... Every day, I have to explain …

I'm the founder of supermemory. Clawd/Molt bot is blowing up right now, with many, many use cases. I set it up, too, and have been using it through telegram. TLDR: just go to https://supermemory.ai/docs/integrations/clawdbot to set up supermemory for your clawd bot.

You are probably thinking of AI memory in the wrong way. Over the last few years, we've all seen a lot of absolutely world-changing trends in AI. Things that totally changed the way we interact with computers today. The first one was data (models start getting smarter), then …

“Why would I use Supermemory when I can just build memory myself?” Fair question. It’s also the classic build vs buy argument, and if you’re an engineer, your default instinct is usually correct: If something is core to your product, you should consider building it. But here’s …

Over the last year, one belief has guided almost everything we’ve built at Supermemory: AI becomes meaningfully useful only when it remembers. Memory shouldn’t be something developers rebuild from scratch. It shouldn’t be fragile, expensive, or trapped inside a single tool. So during Unforgettable Launch Week, we …

If there’s one thing we’ve learned while building Supermemory, it’s that most startups don’t fail because they didn't build features; they fail when infrastructure slows them down, or they built too slow. Modern AI teams are forced to solve the same problem again and …

At Supermemory, we're building context engineering infrastructure for AI. A huge part of that is dealing with code: ingesting repos, understanding structure, and making it searchable. The problem is that most code chunking solutions are terrible. We built code-chunk to fix this. It's now the best …

Embeddings are the cornerstone of any retrieval system. And the larger the embeddings, the more information they can store. But large embeddings require a lot of memory, which leads to high computational costs and latency. To reduce this high cost, we can use models that produce embeddings with small dimensions, …

Details of the October 18, 2025 service degradation at Supermemory: root cause analysis, timeline, and steps taken to prevent future incidents.

Let’s get practical here: have you ever dropped a PDF into Cursor, then pasted the same content into Claude just to “remind it”? Or tried to follow up on a thread, only to realize the memory lives in a different tool? It’s annoying. It breaks your flow. And …

Today, I am excited to announce our first funding round to accelerate our mission of building an interoperable, scalable and reliable memory for LLMs and agents. Memory is one of the hardest challenges in AI right now. We have really intelligent models (Claude, GPT-5, etc) and tools (Cursor, and many …

“Mem0 was not great. Glad to have found supermemory” That’s how Zaid Mukaddam, founder of Scira AI, summed up his team’s first attempt at adding memory to their product. Latency was unbearable, indexing was unreliable, and it simply didn’t scale. Scira is an open-source Perplexity alternative, built …

Campbell Baron, the founder of Montra, has been making videos since he was twelve. By thirteen, he was already doing brand work. Today, he’s betting on a very different future for creators: a world where recording is the exception, and most videos are generated from scratch. Montra’s vision …

Supermemory has a fascinating open-source tool called OpenSearchAI. It's essentially a search assistant similar to Perplexity, but it remembers everything you've searched for and enriches future responses with that memory. I thought to myself, “This seems cool. But how complicated is it to build something like …

Hi everyone, I’m Dhravya, the founder of Supermemory. I want to start with a little story behind why this product means so much to me. You can also skip straight to what it is and how it works below. Anyways, when I started Supermemory one year ago, we were …

Contract compliance reviews are a serious drain on time and focus. It’s a repetitive process that takes away from actual legal thinking, and the workflow is absolutely broken. Files live in different places. You’re never sure if you’re reading the latest version. And no one has time …

What is Supermemory? Supermemory completes the missing part of the LLM puzzle: memory. Just as memory is crucial for human intelligence, it's essential for truly intelligent AI systems. We've built a portable memory engine that works seamlessly across different LLMs through multiple interfaces, including an API, …

If you’ve ever built a retrieval-augmented generation (RAG) system using embeddings and vector databases, you already know the drill: you turn your data into vectors, stuff them into a store like FAISS, and let your model retrieve similar chunks during inference. And it works, until it doesn’t. Why …

People are obsessed with prompts and prompt engineering. Sure, what you say is important, but what the model knows when you say it is the difference between a stateless text generator and an intelligent AI system. In short, context is the most crucial component. Karpathy’s viral tweet called it …

One API to rule them all, One spec to find them, One library to bring them all and in the TypeScript, bind them. When we were building the Infinite Chat API, initially, we only supported the OpenAI format. This was fine, until a lot of our customers started asking, …

Transformer-based large language models have become the poster boys of modern AI, yet they still share one stark limitation: a finite context window. Once that window overflows, performance drops like a rock or the model forgets key details. This guide walks through two complementary strategies that lift those limits: * Semantic …