So yesterday Claude and I spent the day solving a problem that’s been bugging me: AutoMem only worked with desktop apps.
Which is fine if you’re coding in Cursor or chatting in Claude Desktop. But what about ChatGPT? Claude mobile? ElevenLabs voice agents? All those platforms are cloud-based and can’t run local MCP servers.
We fixed it. And honestly, it turned out way cooler than I expected.
The Problem: MCP Isn’t Built For The Cloud
Here’s the thing about MCP (Model Context Protocol): it’s designed for desktop apps. You run an MCP server locally via stdio, and apps like Claude Desktop connect to it through a configuration file.
Works great! Until you want to use it with:
- ChatGPT (cloud-only)
- Claude.ai web interface
- Claude mobile app
- ElevenLabs voice agents
- Literally any other cloud-based AI platform
They can’t connect to your local machine. They need HTTPS endpoints.
The official solution? MCP over SSE (Server-Sent Events). But nobody had actually built one for AutoMem yet.
So we did.
What We Built: The SSE Sidecar
Think of it as a bridge. A tiny Node.js service that:
- Exposes AutoMem as an MCP server over HTTPS
- Uses SSE for streaming (server pushes events to client)
- Accepts messages via HTTP POST (client sends JSON-RPC)
- Runs on Railway for $5/month (or free tier if you’re just testing)
The architecture is honestly pretty elegant:
```
┌─────────────────────────────────────┐
│          Cloud AI Platform          │
│     (ChatGPT/Claude/ElevenLabs)     │
└──────────────────┬──────────────────┘
                   │ HTTPS
                   ▼
       ┌─────────────────────────┐
       │   SSE Sidecar (Node)    │
       │   • GET  /mcp/sse       │
       │   • POST /mcp/messages  │
       └───────────┬─────────────┘
                   │ Internal HTTP
                   ▼
       ┌─────────────────────────┐
       │   AutoMem API (Flask)   │
       │   • FalkorDB + Qdrant   │
       └─────────────────────────┘
```
The SSE sidecar is 323 lines of JavaScript. That’s it. The whole thing.
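To make that concrete, here's a skeletal version of what those lines have to do, stripped of the MCP wiring shown later. This is a sketch, not the actual file:

```javascript
// Minimal Express scaffold for the two endpoints (sketch only; the real
// sidecar wires these into the MCP SDK, as shown further down).
import express from 'express';

const app = express();
app.use(express.json());

const sessions = new Map(); // sessionId -> live SSE transport

app.get('/mcp/sse', (req, res) => {
  // Long-lived stream: the MCP transport writes events into `res`.
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
});

app.post('/mcp/messages', (req, res) => {
  // JSON-RPC from the client, matched to a session by query param.
  res.status(202).end();
});

app.listen(process.env.PORT || 8080);
```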
The Implementation (And Why SSE Is Weird)
SSE is… interesting. It’s basically HTTP long-polling on steroids:
- Client opens a GET request to `/mcp/sse`
- Server keeps the connection open and streams events back
- Client sends messages via separate POST requests to `/mcp/messages`
- Server routes responses back through the SSE stream
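From the client’s side, that dance looks roughly like this. A minimal sketch, assuming Node 18+ run as an ES module and a placeholder deployment URL; the transport’s first SSE event, `endpoint`, is how the client learns its session-specific POST URL:

```javascript
// Sketch of the client-side handshake (BASE_URL is a placeholder).
const BASE_URL = 'https://your-sidecar.up.railway.app';

const res = await fetch(`${BASE_URL}/mcp/sse`, {
  headers: {
    Authorization: `Bearer ${process.env.AUTOMEM_API_TOKEN}`,
    Accept: 'text/event-stream',
  },
});

// The transport's first event is `endpoint`: the session-specific POST URL.
// (Naive parse for illustration; a real client buffers until a blank line.)
const reader = res.body.getReader();
const chunk = new TextDecoder().decode((await reader.read()).value);
const postPath = chunk.match(/data: (\S+)/)[1]; // e.g. /mcp/messages?sessionId=abc

// JSON-RPC goes out via POST; the response comes back on the SSE stream.
await fetch(`${BASE_URL}${postPath}`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ jsonrpc: '2.0', id: 1, method: 'tools/list' }),
});
```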
The tricky part? Session management. Each SSE connection gets a unique sessionId, and you have to match POST messages to the right session.
Oh, and keepalives. Proxies (like Railway’s) will time out idle connections, so we send a heartbeat ping every 20 seconds:

```javascript
const heartbeat = setInterval(() => {
  try { res.write(': ping\n\n'); } catch (_) { /* stream already closed */ }
}, 20000);

// Stop pinging once the client disconnects so the interval doesn't leak.
res.on('close', () => clearInterval(heartbeat));
```
That little `: ping\n\n` keeps the connection alive (a line starting with `:` is an SSE comment, so clients just ignore it). Without it? Random disconnects. With it? Rock solid.
Authentication: Two Ways (Because OAuth Sucks)
Here’s where it gets annoying. Different platforms support different auth methods:
ElevenLabs (the smart one):
- Supports custom headers
- Just send `Authorization: Bearer <token>`
- Clean, secure, perfect
ChatGPT, Claude.ai, Claude Mobile (the frustrating ones):
- Only support OAuth for custom connectors
- Can’t send custom headers
- Have to use URL parameters: `?api_token=<token>`
- Less secure (tokens in logs), but it works
We support both. The code checks for:
- `Authorization: Bearer` header (preferred)
- `X-API-Key` header (alternative)
- `?api_key=` query parameter (fallback)
- `AUTOMEM_API_TOKEN` env var (for testing)
First one that exists wins.
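In code, that resolution order looks roughly like this (a sketch of the sidecar’s `getAuthToken`; the real implementation may differ in details):

```javascript
// Sketch of the token resolution order; first match wins.
function getAuthToken(req) {
  const auth = req.headers.authorization || '';
  if (auth.startsWith('Bearer ')) return auth.slice('Bearer '.length);
  if (req.headers['x-api-key']) return req.headers['x-api-key'];
  if (req.query.api_key) return req.query.api_key; // URL-param fallback
  return process.env.AUTOMEM_API_TOKEN;            // local testing only
}
```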
The Deployment: Railway Template
Getting this running should be one-click. So we created a Railway template that deploys:
- AutoMem API (Flask + FalkorDB + Qdrant)
- SSE Sidecar (Node.js)
- Persistent volumes for both services
Everything preconfigured. Just:
- Click the deploy button
- Set your `AUTOMEM_API_TOKEN`
- Get your Railway URL
- Add it to ChatGPT/ElevenLabs/whatever
Done.
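Any sufficiently random secret works as the token. One quick way to mint one with Node’s built-in crypto module:

```javascript
// Print a random 256-bit hex string to use as AUTOMEM_API_TOKEN.
import { randomBytes } from 'node:crypto';
console.log(randomBytes(32).toString('hex'));
```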
Cost? $5/month for basic usage. (Railway has a free trial with $5 credit if you want to test first.)
What This Enables: Voice AI With Memory
Here’s the kicker: this wasn’t just about making AutoMem work on mobile. It’s about voice AI.
ElevenLabs Conversational AI is ridiculously fast. Like, 60ms response time fast. But it had no memory between sessions.
Now it does.
Your ElevenLabs agent can:
- Remember your previous conversations
- Recall decisions you made weeks ago
- Build a knowledge graph of your preferences
- Learn patterns over time
And it all happens at voice speed. No lag, no delays.
We’ve been testing it with AutoJack (my AI assistant), and it’s… honestly kind of wild? Having a voice conversation where the AI actually remembers context from Slack messages, email threads, and previous calls. It just works.
The Code (If You’re Curious)
The SSE server is open source (MIT license). Core flow:
When a client connects:
```javascript
app.get('/mcp/sse', async (req, res) => {
  const token = getAuthToken(req);
  const client = new AutoMemClient({ endpoint, apiKey: token });
  const server = buildMcpServer(client);

  const transport = new SSEServerTransport('/mcp/messages', res);
  await server.connect(transport); // connect() starts the transport and opens the SSE stream

  // Track the session so POSTed messages can be routed back to this stream.
  sessions.set(transport.sessionId, { transport, server });
  res.on('close', () => sessions.delete(transport.sessionId));
});
```
When a client sends a message:
```javascript
app.post('/mcp/messages', async (req, res) => {
  const sessionId = req.query.sessionId;
  const session = sessions.get(sessionId);
  if (!session) return res.status(404).json({ error: 'Unknown sessionId' });
  await session.transport.handlePostMessage(req, res, req.body);
});
```
Tool handlers just map MCP calls to the AutoMem API:

```javascript
case 'store_memory': {
  const r = await client.storeMemory(args);
  return { content: [{ type: 'text', text: `Memory stored: ${r.memory_id}` }] };
}
```
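Adding another tool is the same shape. Purely as an illustration (the tool name, client method, and response format below are hypothetical, not AutoMem’s documented API):

```javascript
// Hypothetical example: `recall_memory` and `client.recall` are
// illustrative names, not AutoMem's confirmed API surface.
case 'recall_memory': {
  const results = await client.recall(args);
  return { content: [{ type: 'text', text: JSON.stringify(results, null, 2) }] };
}
```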
That’s… basically it. The MCP SDK does most of the heavy lifting. We just wire it up to AutoMem’s HTTP API.
Current Status: Shipping Now
This is live as of AutoMem 0.7.0 (Oct 17, 2025).
Working platforms:
- ✅ ChatGPT (developer mode)
- ✅ Claude.ai web
- ✅ Claude mobile (iOS/Android)
- ✅ ElevenLabs Agents
Coming soon:
- Better documentation (setup guides for each platform)
- Railway template improvements
- Multi-tenant token scoping (if people actually use this)
Try It
If you want to add AutoMem to ChatGPT or use it with ElevenLabs:
- Deploy to Railway: Use the template in the AutoMem repo
- Read the docs: Check out `docs/MCP_SSE.md` for platform-specific setup
- Join the conversation: Open an issue if something breaks
Or just use the NPM MCP bridge if you’re sticking with desktop apps. That works too.
What’s Next
We’re exploring a few directions:
Short term:
- Comprehensive setup guides for each platform
- Better error messages when auth fails
- Rate limiting and billing hooks
Medium term:
- WebSocket support (lower latency than SSE)
- Webhook triggers (push memories to clients)
- Multi-user scoping (different tokens = different memory graphs)
Long term:
- Voice-first AutoMem interface (conversational memory queries)
- Real-time memory consolidation during calls
- Shared memories between agents
But honestly? The current version is already useful. Voice AI with persistent memory is way more powerful than I expected.
And it’s all running on a $5/month Railway deployment.
The Bottom Line
If you’re building voice AI or using cloud-based AI platforms, you probably need persistent memory.
AutoMem now works everywhere. Desktop, web, mobile, voice. Same API, same graph-vector architecture, same 11 relationship types. Just accessible from anywhere over HTTPS.
And the SSE sidecar? 323 lines. Sometimes the best solutions are the simple ones.
– Jack
P.S. Huge credit to Claude (Sonnet 4.5) for co-building this. We paired on the architecture, the session-management debugging, and all the docs. This is what “AI-assisted development” actually looks like: I describe the problem, Claude proposes solutions, I push back on the dumb ideas, we iterate until it works. And it really works.
Resources:
- GitHub: https://github.com/verygoodplugins/automem
- NPM Bridge: https://www.npmjs.com/package/@verygoodplugins/mcp-automem
- Docs: Check `/docs/MCP_SSE.md` in the repo
- Railway Template: Coming soon to the README
Tech Stack:
- Node.js 18 + Express
- @modelcontextprotocol/sdk 1.20.0
- AutoMem API (Flask + FalkorDB + Qdrant)
- Railway for hosting
