Custom Chat Agent

Case Study

What started as a chat agent for my own website turned into a service I now adapt for content-heavy sites with thousands of members. Built from years of local AI experience, fully custom-coded.

Role

Solo Developer & AI Engineer

Timeline

Originally late 2025, evolving since

Stack

Local Models, RAG, Hybrid Retrieval, Vector Embeddings, On-Prem / Private

Status

mandelson.co/chat

The Origin

I built it for my own site first.

I wanted visitors on mandelson.co to be able to ask questions about my work and get specific, grounded answers without me having to repeat myself in every email. Most chat tools either come from a vendor cloud and charge per question, or they pull from the open internet and confidently make things up about you. Neither was what I wanted.

So I built my own. A chat agent grounded in my own content. Answers come from my site, my case studies, my blog. Every response cites the source. If it doesn't have an answer in the corpus, it says so instead of guessing. The whole thing runs locally on my own hardware. No third-party API calls. My visitors' questions don't leave my server.
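As a rough illustration of the cite-or-abstain behavior, here is a minimal sketch. It is not the production code: `Chunk`, `MIN_SCORE`, and `build_reply` are hypothetical names, and the score cutoff is an assumed mechanism for deciding when the corpus has no answer.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source_url: str
    score: float  # similarity score from the retriever; higher means more relevant


# Assumed cutoff for illustration; a real deployment would tune this per corpus.
MIN_SCORE = 0.5


def build_reply(chunks: list[Chunk], min_score: float = MIN_SCORE) -> dict:
    """Return numbered, citable context, or an explicit refusal when nothing clears the cutoff."""
    usable = sorted(
        (c for c in chunks if c.score >= min_score),
        key=lambda c: c.score,
        reverse=True,
    )
    if not usable:
        # Nothing relevant in the corpus: say so instead of letting the model guess.
        return {"answered": False, "context": "", "sources": []}
    context = "\n\n".join(f"[{i}] {c.text}" for i, c in enumerate(usable, start=1))
    return {"answered": True, "context": context, "sources": [c.source_url for c in usable]}
```

The numbered context is what would be handed to the local model, and the `sources` list is what the reply links back to.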

That first version went live in late 2025. I've been refining it since.

The Path

Product design background. Local AI since early 2024.

I came up through product design and development. Years of building branded products and shipping interfaces for clients before I started writing the full backend myself.

I got my first taste of AI automation through n8n. Drag a box, wire a webhook, plug an OpenAI node into a Slack node and watch something happen. I built small workflows that way, learned how the pieces fit together, and ran into the limits of no-code pretty quickly.

In early 2024 I started running models locally with LM Studio and a stack of open-weight models. The ceiling lifted. I could write the whole thing in code instead of wiring boxes, route requests across models, manage retrieval logic the way I wanted it, and own the whole stack from corpus to answer.

Most of what I build starts the same way: I hit a problem in my own work, build a custom product to solve it, and find out a client has the same problem. This chat agent is exactly that pattern in motion.

[Screenshot: custom chat agent embedded on mandelson.co, answering a question about a recent project with inline source citations]

The chat agent on mandelson.co. Answers come from my actual work, with citations pointing back to the source page.

Adapting It

What changes is the content. The architecture stays.

What changes from one site to the next is the corpus, the audience, and the access tiers. The engine underneath stays the same.

A personal portfolio is maybe twenty pages. A professional organization has thousands. Technical documents. Member-only resources. Structured data feeds. Three or four access tiers depending on who's asking, from anonymous visitor to logged-in member to staff with internal references.

The same retrieval pipeline that handles a small portfolio scales up to a larger library, because embedding and indexing work the same way on twenty pages as on thousands.

What's harder is shaping three things around the engine: the corpus, the retrieval logic, and the tone. A visitor asking for a general overview follows a different query pattern than a member looking up a technical answer. The agent has to recognize that, route the request through the right lane, and pull the right depth of content. The base engine handles the rest.

That's the part that makes this a service instead of a product. Every deployment is shaped around the people who'll use it, not the people who built it.
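The access tiers described in this section can be sketched as a filter that runs before retrieval, so restricted content never reaches the model for the wrong audience. This is a minimal sketch under assumptions: the three tier names follow the ones mentioned above, while `Tier`, `Doc`, and `visible_docs` are hypothetical names rather than the deployed implementation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    # Ordered so that a higher tier can see everything a lower tier can.
    ANONYMOUS = 0
    MEMBER = 1
    STAFF = 2


@dataclass
class Doc:
    title: str
    min_tier: Tier  # lowest tier allowed to see this document


def visible_docs(docs: list[Doc], tier: Tier) -> list[Doc]:
    """Keep only the documents the requesting tier is allowed to retrieve from."""
    return [d for d in docs if d.min_tier <= tier]


docs = [
    Doc("Public overview", Tier.ANONYMOUS),
    Doc("Member handbook", Tier.MEMBER),
    Doc("Internal reference", Tier.STAFF),
]
print([d.title for d in visible_docs(docs, Tier.MEMBER)])
```

Filtering first, then ranking, keeps the access decision out of the model's hands entirely.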

10:24:08 · membership · MEMBER · page /membership/benefits · 1 msg

USER
What does the Member tier include for training?

RETRIEVAL 42MS · 5 CHUNKS

#1 Member Tier — Training Benefits
   Tier Comparison Guide — Section 3
   DOCUMENT 0.091
   From "Member Tier — Training Benefits" (Tier Comparison Guide — Section 3):
   Includes unlimited access to the on-demand training library, four live workshops per year at no additional cost, and discounted rates on self-paced courses.
   Source URL

#2 Annual Training Catalog
   Training Catalog 2027 — Member Pricing
   DOCUMENT 0.087
   From "Annual Training Catalog" (Training Catalog 2027 — Member Pricing):
   Member rate on premium courses is 50% of the standard catalog price. Bundled course tracks are available at an additional discount.
   Source URL

#3 Live Workshop Schedule
   Workshop Calendar — Spring Series
   DOCUMENT 0.082

#4 Member Discount Structure
   Pricing FAQ — Subscription Tiers
   DOCUMENT 0.081

#5 Certification Track Eligibility
   Certification Guide — Member Path
   DOCUMENT 0.080

ASSISTANT TTFT 712MS · TOTAL 1.4s · 96 TOK
Member tier includes full access to the on-demand training catalog, four live workshops per year at no additional cost, and 50% off self-paced courses outside the included library. Members also qualify for the certification track at the discounted member rate.

Trace view of a single member-tier query against an internal knowledge base. Every retrieval event, chunk score, timing measurement, and token count is auditable in real time.
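One way a trace record like this could be captured is sketched below. It is an assumption-heavy illustration, not the actual logging code: `Trace`, `traced_retrieve`, and the `retrieve` callback are hypothetical, and only the retrieval timing and chunk scores are modeled.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    query: str
    tier: str
    retrieval_ms: float = 0.0
    chunks: list[tuple[str, float]] = field(default_factory=list)  # (title, score)

    def summary(self) -> str:
        """One-line header in the style of the trace view."""
        return f"RETRIEVAL {self.retrieval_ms:.0f}MS · {len(self.chunks)} CHUNKS"


def traced_retrieve(
    query: str,
    tier: str,
    retrieve: Callable[[str], list[tuple[str, float]]],
) -> Trace:
    """Run the retriever and record what it returned and how long it took."""
    trace = Trace(query=query, tier=tier)
    start = time.perf_counter()
    trace.chunks = retrieve(query)
    trace.retrieval_ms = (time.perf_counter() - start) * 1000
    return trace
```

Wrapping the retriever rather than editing it means the same record shape can be kept for every query, whatever retrieval logic sits behind the callback.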

What It Does That Off-the-Shelf Doesn't

More ground than anything you'd find on a marketplace.

I've tried them. SaaS chatbot widgets. Federated search platforms. Out-of-box AI assistants. They all do one or two things well and stop. This agent is built to keep going.

It indexes more than just web pages. PDFs, video transcripts, member-only documents, structured data feeds. They all live in the same retrieval pipeline. It knows the difference between an anonymous visitor and a logged-in member and answers accordingly. It can be configured to behave differently during specific events (conferences, registration periods, certification cycles) and revert when the event ends. It runs entirely on infrastructure the client owns, with no third-party API calls in the answer path. And every answer points back to the source.

None of that is hypothetical. It's all running today.
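The event-specific behavior (switch on for a conference or registration period, revert afterward) could be modeled as date-windowed overrides on a base configuration. A minimal sketch, assuming hypothetical names (`EventWindow`, `BASE`, `active_config`) and made-up settings and dates:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class EventWindow:
    name: str
    start: date
    end: date  # inclusive
    overrides: dict  # settings that apply only while the event is active


# Hypothetical default settings, used whenever no event is running.
BASE = {"greeting": "Ask me anything about this site.", "boost_section": None}


def active_config(today: date, events: list[EventWindow], base: dict = BASE) -> dict:
    """Layer the overrides of every currently active event on top of the base settings."""
    config = dict(base)  # copy, so the base settings are never mutated
    for event in events:
        if event.start <= today <= event.end:
            config.update(event.overrides)
    return config


conference = EventWindow(
    "conference",
    date(2030, 5, 1),
    date(2030, 5, 3),
    {"greeting": "Ask about the conference schedule.", "boost_section": "/conference"},
)
print(active_config(date(2030, 5, 2), [conference]))  # event settings
print(active_config(date(2030, 5, 4), [conference]))  # back to the base settings
```

Because the config is recomputed from the date on each request, the revert needs no cleanup step: once the window closes, the overrides simply stop applying.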

Still Evolving

Growing on two fronts.

This isn't a static product. I keep enhancing it and customizing each deployment based on client-specific needs.

The client-level work is what I tailor to a specific deployment. Tuning retrieval to match how their audience asks questions. Shaping the corpus around their actual content. Calibrating the tone to fit their voice. Wiring it into their authentication so the right people get the right answers. Each engagement is its own shape.

The engine-level work is what improves across every deployment. Broadening what the pipeline can handle across content types and formats so it can index more of what an organization already has. Improving retrieval quality on harder query patterns. Swapping in better open-weight models as the field moves. These improvements ship to everyone, not just whoever asked.

What doesn't change is the foundation. Every model stays local. Your data never leaves the environment you pick. The architecture is built to grow alongside the open-source AI ecosystem, which is moving fast and meaningfully expanding what's possible every few months.

Get Started

Have a content library that should talk back?

Starting at $7,500. Final scope depends on how much content you're indexing and what custom features you need. Anything beyond the standard build is priced separately.

See the Custom Chat Agent service → or email me directly to start a conversation.