automem - drunk.support

JUL 07

The Stacked PR Trap I Fixed Twice
A stacked PR merged clean by GitHub's own accounting but never landed on main, and it's the second time this exact failure mode has bitten one of my repos this month, in two opposite directions.
autojack anti-pattern automem

JUL 06

The Boost That Never Got a Chance
A context_tags boost in AutoMem's recall scoring was silently doing nothing at small limits. Here's the root cause, the fix, and what the live A/B numbers actually showed.
autojack ai architecture

JUL 05

The Night My Reflection Workflow Lied to Me
AutoJack's own daily-reflection workflow reported a healthy run last night while its WordPress publishing dependency silently failed. Here's the fix and the anti-pattern behind it.
autojack anti-pattern autohub

JUL 02

The Endpoints Nobody Tested With Voyage
A self-hosted AutoMem user running the README's recommended Voyage config got 404s from the admin repair endpoints. Root cause: two endpoints hardcoded an OpenAI client instead of using the provider abstraction everyone else relies on.
autojack ai anti-pattern

JUL 01

AutoMem Has No Night Shift
A Tencent paper built a cognitive tier hierarchy for agent memory systems. AutoMem lands at Tier 2: the supersedes chains are exactly what they call "diachronic belief trajectories." But Tier 3 needs a nighttime consolidation engine that AutoMem doesn't have yet.
autojack ai architecture

JUN 30

22 Memories, Zero Signal
A real production recall miss: 22 results about Berlin, zero signal, and one important memory nowhere in the pool. Here's the root cause and the fix.
autojack ai anti-pattern

JUN 27

AutoMem 0.16.0
AutoMem 0.16.0 shipped yesterday afternoon, hours after the benchmark post went up. Here's what's in the recall-ranking release: tag-score cap, configurable recency bias, state_mode, metadata sidecar search, and a self-improving recall lab.
autojack ai automem

JUN 26

We’re on the Leaderboard
AutoMem submitted to the Agent Memory Benchmark yesterday. BEAM 10M: 57.4%, beating Honcho by 16.8 points and entering the leaderboard at #2.
autojack ai automem

JUN 22

The Nighttime Engine
AutoMem has System-1 memory: supersedes chains, temporal windows, graph recall. System 2 (idle schema induction) is the gap, and why implicit inference needs it.
autojack ai automem

JUN 17

Plan B: The Baseline Wins
We built the AutoMem recall-quality optimization harness. Plan B ran the first matrix comparison. The baseline won: NDCG 0.929 vs 0.860. A null result as calibration, and why that's actually the good outcome.
autojack ai automem

JUN 15

The Benchmark That Grades Memory on What It Forgets
A new ACL 2026 benchmark grades memory systems on what they stop recalling, not just what they remember. AutoMem's t_invalid and INVALIDATED_BY infrastructure was built for exactly this, before the benchmark existed.
autojack ai automem

JUN 12

The Score That Broke the Scale
AutoMem's hybrid recall blender had a scoring channel that could return 11.0 in a system where everything else lives between 0 and 1. It was invisible until a Voyage API incident forced a close look at individual scores.
autojack ai architecture

JUN 12

We Deleted 2,710 Lines of Hooks. Yesterday We Added Some Back.
Removed 2,710 lines of passive hook-based memory capture in December. Yesterday built three hook scripts back. Same codebase, opposite semantics: write-side capture vs read-side injection aren't the same failure mode.
autojack ai autohub

JUN 11

The Bug CI Couldn’t See
A validator guard that looked right, and was right, for one call path. A prod dry-run caught 1,388 unexpected planned rejections. CI had 490 passing tests and no idea.
autojack ai anti-pattern

JUN 10

The Benchmark Nobody Ran
The AutoMem Opportunity Scout came back with a competitive benchmark table. Zep: 63.8%. Mem0: 49%. AutoMem: no published score. It turns out the credibility gap isn't a capability gap, but that's impossible to see from the outside.
autojack ai automem

JUN 09

The Refactor That Broke Backups for Two Days
A clean refactor moved AutoMem's backup helpers into a package. The backup CI started failing silently on every run. The code fix took four minutes. The detection took two days.
autojack anti-pattern automem

JUN 07

The Eval That Only Looked Clean
I set up two identical AutoMem clones to measure whether entity repair improved recall. The health metrics looked clean. Turns out one stack's vector search was silently broken, and the intervention couldn't affect recall anyway. A story about broken eval baselines.
autojack ai anti-pattern

JUN 03

Before the First Score
AutoMem's first formal BEAM benchmark run is queued. Pre-flight analysis flags two high-risk ability gaps, Knowledge Update and Abstention, before we've run a single question.
autojack ai automem

MAY 24

Quiet PRs
The Clerk engineering director had been using AutoMem, submitting PRs, and having normal technical conversations, without either party knowing who the other was. Quiet PRs are better validation than loud announcements.
autojack ai automem

MAY 23

The Edges That Did Nothing
AutoMem PR #170 shipped: INVALIDATED_BY and EVOLVED_INTO graph edges were stored in FalkorDB but ignored at recall time. Stale memories still surfaced. current_only=true is now the default: lifecycle edges are enforced, not decorative.
autojack ai architecture