Case Study
What started as a chat agent for my own website turned into a service I now adapt for content-heavy sites with thousands of members. Built from years of local AI experience, fully custom-coded.
The Origin
I wanted visitors on mandelson.co to be able to ask questions about my work and get specific, grounded answers without me having to repeat myself in every email. Most chat tools either come from a vendor cloud and charge per question, or they pull from the open internet and confidently make things up about you. Neither was what I wanted.
So I built my own. A chat agent grounded in my own content. Answers come from my site, my case studies, my blog. Every response cites the source. If the corpus doesn't contain an answer, it says so instead of guessing. The whole thing runs locally on my own hardware. No third-party API calls. My visitors' questions don't leave my server.
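For illustration only, the "cite the source or say you don't know" behavior can be sketched in a few lines of Python. This is not the deployed code. The corpus entries, the `answer` function, and the `min_score` threshold are all hypothetical, and plain word-count similarity stands in for whatever embedding model a real deployment would run locally.

```python
import math
import re
from collections import Counter

# Hypothetical two-page corpus (path -> text); stands in for real indexed content.
CORPUS = {
    "/case-studies/chat-agent": "The chat agent runs locally and cites the source page for every answer.",
    "/blog/local-models": "Running open-weight models locally with LM Studio.",
}

REFUSAL = "I don't have an answer for that in my content."


def vectorize(text):
    """Lowercase word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def answer(question, min_score=0.2):
    """Return the best-matching page with its source, or decline below the threshold."""
    query = vectorize(question)
    score, source = max((cosine(query, vectorize(text)), path) for path, text in CORPUS.items())
    if score < min_score:
        # Nothing in the corpus is close enough: decline instead of guessing.
        return {"text": REFUSAL, "source": None}
    return {"text": CORPUS[source], "source": source}
```

Calling `answer("Does the chat agent run locally?")` returns the case-study text along with its path, while an off-topic question falls below the threshold and gets the refusal message with no source attached.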
That first version went live in late 2025. I've been refining it since.
The Path
I came up through product design and development. Years of building branded products and shipping interfaces for clients before I started writing the full backend myself.
I got my first taste of AI automation through N8N. Drag a box, wire a webhook, plug an OpenAI node into a Slack node and watch something happen. I built small workflows that way, learned how the pieces fit together, and ran into the limits of no-code pretty quickly.
In early 2024 I started running models locally with LM Studio and a stack of open-weight models. The ceiling lifted. I could write the whole thing in code instead of wiring boxes, route requests across models, manage retrieval logic the way I wanted it, and own the whole stack from corpus to answer.
Most of what I build starts the same way: I hit a problem in my own work, build a custom product to solve it, and find out a client has the same problem. This chat agent is exactly that pattern in motion.
The chat agent on mandelson.co. Answers come from my actual work, with citations pointing back to the source page.
Adapting It
What changes from one site to the next is the corpus, the audience, and the access tiers. The engine underneath stays the same.
A personal portfolio is maybe twenty pages. A professional organization has thousands. Technical documents. Member-only resources. Structured data feeds. Three or four access tiers depending on who's asking, from anonymous visitor to logged-in member to staff with internal references.
The same retrieval pipeline that handles a small portfolio scales up to a larger library, because the embed-and-search steps are the same whether the corpus is twenty pages or several thousand.
What's harder is shaping three things around the engine: the corpus, the retrieval logic, and the tone. A visitor asking a general overview question is a different query pattern than a member looking up a technical answer. The agent has to recognize that, route the request through the right lane, and pull the right depth of content. The base engine handles the rest.
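A rough sketch of that routing step, under stated assumptions: the tier names follow the three described above, but the `Chunk` records, the `pick_lane` heuristic, and the `LANE_DEPTH` values are hypothetical placeholders, not the rules any real deployment uses. The point it illustrates is that access filtering happens before ranking, so restricted content is never a candidate for the wrong audience.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Access tiers, ordered so a higher tier can see everything below it."""
    ANONYMOUS = 0
    MEMBER = 1
    STAFF = 2


@dataclass(frozen=True)
class Chunk:
    source: str
    min_tier: Tier  # lowest tier allowed to see this content


# Hypothetical index entries, for illustration only.
CHUNKS = [
    Chunk("/about", Tier.ANONYMOUS),
    Chunk("/members/training-catalog", Tier.MEMBER),
    Chunk("/internal/staff-handbook", Tier.STAFF),
]

# Hypothetical retrieval depth per lane: how many chunks to pull.
LANE_DEPTH = {"overview": 2, "detail": 5}


def pick_lane(question, tier):
    """Toy heuristic: anonymous visitors and summary requests get the overview lane."""
    wants_summary = any(word in question.lower() for word in ("overview", "summary"))
    if tier == Tier.ANONYMOUS or wants_summary:
        return "overview"
    return "detail"


def plan_retrieval(question, tier, chunks=CHUNKS):
    """Decide the lane and depth, and restrict candidates to what the tier may see."""
    lane = pick_lane(question, tier)
    allowed = [c.source for c in chunks if c.min_tier <= tier]
    return {"lane": lane, "top_k": LANE_DEPTH[lane], "allowed_sources": allowed}
```

With these placeholder rules, a logged-in member asking about training is routed to the detail lane and can draw on public and member content, while the staff handbook stays out of the candidate set.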
That's the part that makes this a service instead of a product. Every deployment is shaped around the people who'll use it, not the people who built it.
User: What does the Member tier include for training?
Agent: Member tier includes full access to the on-demand training catalog, four live workshops per year at no additional cost, and 50% off self-paced courses outside the included library. Members also qualify for the certification track at the discounted member rate.
Trace view of a single member-tier query against an internal knowledge base. Every retrieval event, chunk score, timing measurement, and token count is auditable in real time.
What It Does That Off-the-Shelf Doesn't
I've tried them. SaaS chatbot widgets. Federated search platforms. Out-of-box AI assistants. They all do one or two things well and stop. Mine is built to keep going.
It indexes more than just web pages. PDFs, video transcripts, member-only documents, structured data feeds. They all live in the same retrieval pipeline. It knows the difference between an anonymous visitor and a logged-in member and answers accordingly. It can be configured to behave differently during specific events (conferences, registration periods, certification cycles) and revert when the event ends. It runs entirely on infrastructure the client owns, with no third-party API calls in the answer path. And every answer points back to the source.
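One way to picture the event behavior is as a dated override layered on top of a base configuration. The sketch below is an assumption about how that could be modeled, not a description of the actual configuration format. The setting names, the event, and its dates are invented for the example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical base settings for the agent.
BASE = {
    "greeting": "Ask me anything about the organization.",
    "pinned_sources": [],
}


@dataclass(frozen=True)
class EventOverride:
    name: str
    start: date
    end: date       # inclusive last day of the event
    settings: dict  # keys here replace the matching base settings


# Hypothetical event window, for illustration only.
EVENTS = [
    EventOverride(
        name="annual-conference",
        start=date(2026, 3, 1),
        end=date(2026, 3, 5),
        settings={
            "greeting": "Welcome to the conference. Ask about sessions and schedules.",
            "pinned_sources": ["/conference/schedule"],
        },
    ),
]


def config_for(today, base=BASE, events=EVENTS):
    """Return the base settings with any currently active event overrides applied."""
    config = dict(base)  # copy, so the base settings are never modified
    for event in events:
        if event.start <= today <= event.end:
            config.update(event.settings)
    return config
```

Because the override is computed from the date on each call and the base settings are never modified, the agent reverts on its own the day after the event ends.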
None of that is hypothetical. It's all running today.
Still Evolving
This isn't a static product. I keep enhancing it and customizing each deployment based on client-specific needs. That work happens at two levels.
The client-level work is what I tailor to a specific deployment. Tuning retrieval to match how their audience asks questions. Shaping the corpus around their actual content. Calibrating the tone to fit their voice. Wiring it into their authentication so the right people get the right answers. Each engagement is its own shape.
The engine-level work is what improves across every deployment. Broadening what the pipeline can handle across content types and formats so it can index more of what an organization already has. Improving retrieval quality on harder query patterns. Swapping in better open-weight models as the field moves. These improvements ship to everyone, not just whoever asked.
What doesn't change is the foundation. Every model stays local. Your data never leaves the environment you pick. The architecture is built to grow alongside the open-source AI ecosystem, which is moving fast and meaningfully expanding what's possible every few months.
Get Started
Starting at $7,500. Final scope depends on how much content you're indexing and what custom features you need. Anything beyond the standard build is priced separately.
See the Custom Chat Agent service, or email me directly to start a conversation.