Building AI Agents That Actually Know You
The problem with every AI assistant I've used is the same: it doesn't know me. Every conversation starts from zero. It doesn't know what I said to another bot yesterday, what my goals are this quarter, or what I committed to doing this morning. It's a brilliant stranger, every time.
So I built a system — from my desk in Cape Town — where AI agents share a knowledge base, remember conversations across sessions, proactively review my week, and hold me accountable to my own commitments. It's called VaultBots, and it's now in its third major iteration.
Why generic chatbots fail
ChatGPT is a great tool. But it's a general-purpose tool with no memory of who you are beyond the current conversation. Custom GPTs help — you can stuff a system prompt with context — but they hit a ceiling fast.
The limitations that bothered me:
- No persistent memory. It forgets everything between sessions. You re-explain your situation every time.
- No shared context. If you use one bot for coaching and another for task management, they're siloed. Your coaching bot doesn't know what's on your calendar. Your task bot doesn't know your coaching commitments.
- No proactive behaviour. It waits for you to start. A real coach prepares before sessions, reviews your week, and reaches out when you've gone quiet.
- No real-time voice. ChatGPT's voice mode is impressive, but you can't integrate it with your own systems or data.
- No privacy control. Everything you type goes to OpenAI's servers with no control over what gets stored or how.
I wanted agents that were deeply personalised, interconnected, proactive, and under my control.
What I built
Four AI bots running 24/7 on a VPS, each with a distinct role — all powered by Claude Sonnet 4.6:
- EliteForge — life and business coach. Brutally honest. Reads my journal entries and calls out patterns I don't want to see.
- Emma — relationship coach. Grounded in NVC and attachment theory. Three modes: dating, early relationship, committed.
- Jarvis — personal assistant. 43 integrated tools — Google Calendar, Gmail, Drive, Sheets, weather, news, habit tracking, screen time, meeting prep, and a shared to-do list.
- Olivia — psychologist. Trauma-informed therapeutic support. IFS, ACT, attachment theory, grief work.
All four bots read from the same Obsidian vault — 340+ notes covering goals, values, journal entries, project documentation, and personal reflections. When my coach references something I wrote three months ago, it's because it actually read the note.
The architecture that makes it work
Shared knowledge base with semantic search
Every bot reads from the same vault. This is the single most important design decision. When EliteForge sets a commitment ("run 3 times this week"), Jarvis can see it and proactively ask if I've done it. When I tell Olivia about a recurring frustration, the context is available to EliteForge for coaching.
The vault is indexed using Gemini embeddings with Pinecone vector search, backed by TF-IDF as a fallback. A background indexer detects changed files every 5 minutes and re-embeds them. When a bot needs context, it runs a hybrid search — semantic similarity for conceptual matches, keyword matching for exact references.
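Stripped of the Pinecone and Gemini plumbing, the blending step of a hybrid search can be sketched like this. The scoring weights, note structure, and function names are my own assumptions for illustration, not the production code:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    """Fraction of query terms that appear in the note (crude keyword match)."""
    q = Counter(query.lower().split())
    t = Counter(text.lower().split())
    overlap = sum(min(q[w], t[w]) for w in q)
    return overlap / max(sum(q.values()), 1)

def hybrid_search(query, query_vec, notes, alpha=0.7, top_k=3):
    """Blend semantic similarity with keyword overlap; return top-k (score, path)."""
    scored = []
    for note in notes:
        sem = cosine(query_vec, note["embedding"])
        kw = keyword_score(query, note["text"])
        scored.append((alpha * sem + (1 - alpha) * kw, note["path"]))
    return sorted(scored, reverse=True)[:top_k]
```

In the real system the semantic score would come from the vector index and the keyword score from TF-IDF; the point is that the two signals are combined, so an exact phrase match can outrank a merely related note.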
The vault syncs automatically between my laptop and the server via Git. A vault-sync container pulls changes every 60 seconds. Notes I write locally show up on the server within a minute.
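A minimal version of that pull loop, assuming a plain Git clone of the vault inside the container. The mount path and interval constant are illustrative, not the actual vault-sync code:

```python
import subprocess
import time

VAULT_PATH = "/data/vault"  # hypothetical mount point for the vault clone
SYNC_INTERVAL = 60          # seconds, matching the 60-second pull cycle

def pull_vault(path):
    """Fast-forward the vault clone; return True if new commits arrived."""
    before = subprocess.run(["git", "-C", path, "rev-parse", "HEAD"],
                            capture_output=True, text=True).stdout.strip()
    subprocess.run(["git", "-C", path, "pull", "--ff-only"], check=True)
    after = subprocess.run(["git", "-C", path, "rev-parse", "HEAD"],
                           capture_output=True, text=True).stdout.strip()
    return after != before

def sync_loop():
    """Poll forever; a failed pull is logged and retried next cycle."""
    while True:
        try:
            if pull_vault(VAULT_PATH):
                print("vault updated")
        except subprocess.CalledProcessError as err:
            print(f"sync failed: {err}")
        time.sleep(SYNC_INTERVAL)
```

`--ff-only` is the important flag: the server copy should never merge or rebase, only follow what the laptop pushed.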
Multi-layer memory
This is what separates it from a chatbot. Each bot has four layers of memory:
- Working memory — the current conversation, with token-aware compression when it gets long.
- Long-term memory — facts extracted from every conversation and stored across sessions. "Matt's brother Jim lives in Greytown" persists whether I mention it again or not. Facts decay over time if unused, keeping memory fresh.
- Pattern synthesis — recurring themes detected across sessions. When a pattern appears three or more times, the bot surfaces it. "You've deflected from this topic three sessions in a row" isn't a guess — it's data.
- Plans — multi-session goal tracking. EliteForge and Jarvis can create plans with subtasks, track progress across conversations, and inject the active plan into every response.
Every time a session ends — whether I close it manually or it times out — the system automatically extracts facts, commitments, insights, and patterns. Nothing is lost between conversations.
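The long-term layer with decay might look like the sketch below. The half-life, pruning floor, and class names are assumptions to show the shape of the idea, not the actual implementation:

```python
import time
from dataclasses import dataclass, field

HALF_LIFE_DAYS = 30  # assumed decay rate: relevance halves after 30 unused days

@dataclass
class Fact:
    text: str
    created: float = field(default_factory=time.time)
    last_used: float = field(default_factory=time.time)

    def relevance(self, now=None):
        """Exponential decay since the fact was last referenced."""
        now = now if now is not None else time.time()
        idle_days = (now - self.last_used) / 86400
        return 0.5 ** (idle_days / HALF_LIFE_DAYS)

    def touch(self):
        self.last_used = time.time()

class FactStore:
    def __init__(self, floor=0.1):
        self.facts = []
        self.floor = floor  # facts below this relevance are pruned

    def add(self, text):
        self.facts.append(Fact(text))

    def prune(self, now=None):
        self.facts = [f for f in self.facts if f.relevance(now) >= self.floor]

    def recall(self, keyword, now=None):
        """Return matching facts, most relevant first; recalling refreshes them."""
        hits = [f for f in self.facts if keyword.lower() in f.text.lower()]
        for f in hits:
            f.touch()
        return sorted(hits, key=lambda f: f.relevance(now), reverse=True)
```

The refresh-on-recall step is what keeps frequently relevant facts alive while stale trivia quietly drops below the pruning floor.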
The weekly intelligence cycle
This is the feature that turned VaultBots from a reactive chatbot into something closer to an actual coaching practice.
Every Sunday evening, the system runs a cross-bot intelligence review:
- Gathers all session transcripts, commitment completion rates, open items, and vault goals from the past week.
- Analyses the data — what moved forward, what slipped, what patterns are recurring, where the gap is between stated goals and actual behaviour.
- Produces a structured weekly review (wins, slips, patterns, goal-action gap) and per-bot prep briefs.
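The analysis step reduces to a fold over the week's data. A toy version with hypothetical input shapes (in the real system these would come from session transcripts, the commitment tracker, and the vault):

```python
from collections import Counter

def weekly_review(commitments, session_themes, vault_goals):
    """Cross-bot weekly review: wins, slips, recurring patterns, goal-action gap.

    commitments: list of {"text": str, "done": bool}
    session_themes: list of theme strings gathered from the week's sessions
    vault_goals: list of goal strings from the vault
    """
    wins = [c["text"] for c in commitments if c["done"]]
    slips = [c["text"] for c in commitments if not c["done"]]
    rate = len(wins) / len(commitments) if commitments else 0.0
    # A theme counts as a "pattern" once it recurs three or more times.
    patterns = [t for t, n in Counter(session_themes).items() if n >= 3]
    # Goals with no matching commitment this week form the goal-action gap.
    touched = " ".join(c["text"].lower() for c in commitments)
    gap = [g for g in vault_goals if g.lower() not in touched]
    return {"wins": wins, "slips": slips, "completion_rate": round(rate, 2),
            "patterns": patterns, "goal_action_gap": gap}
```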
Each bot gets a personalised prep brief containing:
- Recent themes from its sessions
- Open loops — unfinished conversations or commitments
- Knowledge gaps — things the bot doesn't know about me that would improve its coaching
- Questions to weave in — specific questions to ask naturally in the next session
- Focus area for the week ahead
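A prep brief is essentially structured context rendered into text and injected into the next session's system prompt. A sketch of that shape, with field names of my own choosing:

```python
from dataclasses import dataclass, field

@dataclass
class PrepBrief:
    """Per-bot prep brief, rendered into the system prompt before the next session."""
    bot: str
    recent_themes: list = field(default_factory=list)
    open_loops: list = field(default_factory=list)
    knowledge_gaps: list = field(default_factory=list)
    questions: list = field(default_factory=list)
    focus: str = ""

    def render(self):
        lines = [f"Prep brief for {self.bot}:"]
        for title, items in [("Recent themes", self.recent_themes),
                             ("Open loops", self.open_loops),
                             ("Knowledge gaps", self.knowledge_gaps),
                             ("Questions to weave in", self.questions)]:
            if items:  # skip empty sections to keep the prompt lean
                lines.append(f"{title}: " + "; ".join(items))
        if self.focus:
            lines.append(f"Focus this week: {self.focus}")
        return "\n".join(lines)
```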
On Wednesday, a mid-week check-in surfaces my open commitments and asks directly whether I followed through.
The result: when I open a coaching session on Monday, EliteForge doesn't start cold. It knows what happened last week, what I avoided, what questions it wants to ask, and what would move the needle most.
Two-tier write permissions
Bots need to write — session logs, daily notes, task updates. But I don't want an AI silently editing my personality profile or values notes.
The solution: two tiers.
- Tier 1 (direct write): Operational files — session logs, daily notes, task lists. Bots write these freely.
- Tier 2 (staged): Self-knowledge files — personality profile, values, thinking style. Bots propose changes, but nothing is committed until I review and approve.
This balances usefulness with control. The bot can say "based on our conversation, I'd update your Strengths note to include X" — but it can't just do it.
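The routing logic is simple once the tiers are expressed as path conventions. A sketch with hypothetical folder names (the real vault layout will differ):

```python
from pathlib import PurePosixPath

# Hypothetical path conventions: operational files are Tier 1 (direct write),
# self-knowledge files are Tier 2 (staged for human review).
TIER1_DIRS = {"Sessions", "Daily", "Tasks"}
TIER2_DIRS = {"Self", "Values", "Personality"}

def write_note(vault, rel_path, content, staged):
    """Route a bot write: Tier 1 paths go straight in, Tier 2 paths are staged."""
    top = PurePosixPath(rel_path).parts[0]
    if top in TIER1_DIRS:
        vault[rel_path] = content           # direct write
        return "written"
    if top in TIER2_DIRS:
        staged.append((rel_path, content))  # held until manually approved
        return "staged"
    raise PermissionError(f"no write rule for {rel_path}")
```

Defaulting to "no rule, no write" matters as much as the two tiers: a path the policy doesn't recognise is rejected rather than silently written.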
Voice AI over data
The bots work through two interfaces: Telegram (text and voice notes) and LiveKit (real-time voice calls over WiFi/4G/5G).
The voice implementation uses WebRTC through LiveKit, with Whisper for speech-to-text and OpenAI's TTS for text-to-speech. The result is a phone call with your AI coach — natural turn-taking, interruption handling, and silence detection.
Same bot personality regardless of interface. Whether I type a message at my desk or call from my car, the conversation continues with full context.
Production infrastructure
The whole platform runs on a single VPS via Docker Compose — five containers:
- vaultbots — all four Telegram bots + API server + 7 scheduled background jobs
- livekit-agent — real-time voice agent
- caddy — reverse proxy with automatic HTTPS
- vault-sync — Git-based vault synchronisation
- session-sync — copies session logs and weekly reviews to the vault
Everything is managed as a single deployment. docker compose up -d and the entire platform starts. Logs are centralised. Restarts are automatic. 102 unit tests verify the system before every deploy.
What surprised me
The prep system changed everything. When bots prepare for sessions — reviewing the week, identifying knowledge gaps, carrying specific questions — the quality of coaching conversations jumped noticeably. It's the difference between a therapist reading your file before you walk in and one who asks "so, what's been going on?" every time.
Cross-bot awareness creates real accountability. When Jarvis mentions an EliteForge commitment during a calendar review, it creates pressure that a single bot can't replicate. It's the difference between one person knowing your goals and your whole support team knowing them.
Voice changes the interaction quality. Typing to a coaching bot feels like filling out a form. Talking to one feels like a conversation. The depth of insight I get from voice sessions is noticeably better — probably because I share more when speaking naturally than when typing.
The privacy model matters. Knowing that my therapy conversations are on my own server, not in OpenAI's training data, changes what I'm willing to discuss. This is directly relevant for businesses considering AI for sensitive operations — HR, legal, financial data.
Automated session processing catches what you forget. When every session automatically extracts facts, commitments, insights, and patterns, nothing slips through the cracks. I don't need to remember to tell the next bot what I discussed with the last one.
What I'd do differently
Start with fewer bots. Four was ambitious. Two (coach + assistant) would have validated the architecture faster. The relationship coach and psychologist could have been added later once the shared knowledge base was proven.
Build the weekly review cycle earlier. This should have been in the first version. Without it, every session starts from scratch and the bots can't learn from patterns across weeks. It's the single highest-leverage feature in the system.
The business lesson
Every business that wants AI agents eventually runs into the same problem I started with: the agent doesn't know enough about the business to be useful. It's part of why 67% of organisations lack the internal expertise to implement AI effectively (IDC) — the expertise isn't just about building models, it's about connecting them to real business knowledge.
A customer service bot that can't access order history is just a worse FAQ page. An internal Q&A agent that doesn't read your actual documentation is just a worse Google search. A sales assistant that doesn't know your pricing, products, and customer segments is just a worse chatbot.
The architecture that makes VaultBots work — shared knowledge base, persistent multi-layer memory, cross-agent awareness, proactive intelligence cycles, privacy controls — is the same architecture that makes business AI agents work. The specific bots are different (customer service instead of coaching, internal assistant instead of Jarvis), but the infrastructure patterns are identical.
The key decisions are always the same:
- What knowledge does the agent need access to? (Your CRM, documentation, order system, pricing rules)
- What can it read versus write? (It should look up orders freely but maybe not issue refunds without approval)
- How does it remember? (Session memory, cross-session facts, pattern detection, knowledge base indexing)
- How does it stay proactive? (Scheduled reviews, check-ins, commitment tracking, gap analysis)
- How do you keep it running? (Container orchestration, monitoring, automatic restarts, test coverage)
These aren't theoretical questions. They're the exact decisions I work through with clients during implementation sprints.
Frequently Asked Questions
How do you build a custom AI agent?
You start with the knowledge base — what the agent needs to know about your business. Then you define its role, permissions (what it can read vs write), memory architecture (how it retains context across sessions), and which AI model matches the task complexity. The build itself is code (Python, typically), APIs connecting to your systems, and container-based deployment for reliability. The architecture matters more than the model choice.
Can AI agents share knowledge with each other?
Yes — that's the key design pattern behind VaultBots. Multiple agents read from the same knowledge base, and what one agent learns in a conversation is automatically available to others via shared long-term memory. For businesses, this means your customer service agent and your internal assistant can share context about a client without manual handoff.
How much does it cost to run AI agents?
The infrastructure is cheap — a single VPS costs $6–$12/month. The main cost is LLM API usage, which depends on volume and model choice. Background tasks (session processing, weekly reviews, compression) use cheaper models automatically. A typical four-agent setup with full memory and proactive features runs under $45/month.
What's the difference between a chatbot and an AI agent?
A chatbot responds to questions using a fixed knowledge base — it's reactive and stateless. An AI agent can reason about tasks, use tools, access live data from your systems, remember previous interactions across sessions, detect patterns in your behaviour, proactively surface insights, and take actions with some degree of autonomy. A chatbot is a better FAQ page. An agent is a digital team member that prepares for meetings.
If you're thinking about building AI agents for your business — not chatbots, but agents that actually know your operations and can act on them — let's talk about what that architecture looks like for your specific use case.
Chartered accountant turned AI builder. I help mid-market businesses implement AI that delivers measurable ROI — from strategy through to deployed, working software.
More about Matt
Working on something similar? I help mid-market businesses turn AI ideas into deployed, working software.
Let's talk