
HCOMS · April 2026 · 8 min read

The interesting question for a software studio in 2026 isn’t "should we add AI?" — the marketing department has already answered that. It’s "how do you embed an AI assistant inside production software for an institutional client without turning their compliance officer’s hair white?"

We shipped one inside our Diocese Management System in 2024 and have learned six things since. Here they are, ordered by how badly we wished we’d known them sooner.

1. The first thing to design is what the AI is not allowed to do

Before you write a single prompt, write the list of things the assistant must refuse. For us:

The system prompt enforces these as hard refusals. The model is instructed to say "I can’t help with that — please ask {named human}" rather than try and fail.
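
In code, the shape is roughly this. A minimal sketch only: the refusal topics, contacts and the `build_system_prompt` helper below are illustrative stand-ins, not our actual list or implementation.

```python
# Sketch: turning a written refusal list into hard instructions in the system prompt.
# The topics and contacts below are illustrative placeholders, not the real list.
REFUSAL_RULES = [
    {"topic": "safeguarding records", "contact": "the safeguarding officer"},
    {"topic": "payroll and personal financial data", "contact": "the finance office"},
]

REFUSAL_TEMPLATE = (
    "If the user asks about {topic}, refuse. Reply exactly: "
    '"I can\'t help with that - please ask {contact}." Do not attempt a partial answer.'
)

def build_system_prompt(base_instructions: str) -> str:
    """Append one hard-refusal rule per entry; the rules are not negotiable mid-conversation."""
    rules = "\n".join(REFUSAL_TEMPLATE.format(**rule) for rule in REFUSAL_RULES)
    return f"{base_instructions}\n\nHard refusals:\n{rules}"
```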

2. Role-scoping is non-negotiable

The data the AI assistant sees has to be exactly what the logged-in user is authorised to see. Not most. Not almost. Exactly.

The cleanest pattern: do the database query first, with the user’s normal permissions. Then hand the rows to the LLM with a prompt like "answer the user’s question using only the rows below." The model never sees data the user wasn’t entitled to. Cross-tenant leakage simply isn’t possible.
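
A minimal sketch of that pattern, assuming a SQL backend; the table and column names are illustrative, not the DMS's real schema, and the actual model call is omitted.

```python
import sqlite3

def fetch_scoped_rows(conn: sqlite3.Connection, parish_id: int) -> list[tuple]:
    # The ordinary application query, run under the logged-in user's own scope.
    # Illustrative schema: in the real system this is whatever query the page
    # would have run anyway, with the user's normal permissions applied.
    return conn.execute(
        "SELECT person, role, start_date FROM appointments WHERE parish_id = ?",
        (parish_id,),
    ).fetchall()

def build_prompt(question: str, rows: list[tuple]) -> str:
    # The model only ever sees the rows the user was entitled to.
    formatted = "\n".join(str(row) for row in rows)
    return (
        "Answer the user's question using only the rows below. "
        "If the rows don't contain the answer, say so.\n\n"
        f"Rows:\n{formatted}\n\nQuestion: {question}"
    )
```

The ordering is the whole point: permissions are enforced by the query, before the model is involved at all.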

The wrong pattern, which several big SaaS vendors are still shipping: dump the whole database into a vector store, embed the user’s query, and trust the retrieval to filter. It’ll be 99% right and 1% catastrophic.

3. Audit-trail every interaction

Every prompt, every response, and every tool call is written to an immutable log with the user ID, the timestamp, and the model version. Whether you can produce that record is the question every regulated buyer asks within the first thirty minutes of a demo. Not having an answer is the difference between a contract and a polite "we’ll think about it".

The log doesn’t need to be searchable in real time. It needs to exist and be exportable when the auditor asks.
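
A minimal version of that record, assuming an append-only JSONL file; in production it would go to locked-down storage, and the field names here are illustrative rather than a required schema.

```python
import json
from datetime import datetime, timezone

def append_audit_record(log_path: str, user_id: str, model_version: str,
                        prompt: str, response: str, tool_calls: list[dict]) -> None:
    # One immutable record per interaction; records are appended, never updated.
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "model_version": model_version,
        "prompt": prompt,
        "response": response,
        "tool_calls": tool_calls,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```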

4. Refuse confidently, often

The instinct of most LLMs is to be helpful. In a regulated context, that’s the wrong instinct. We tune ours to refuse aggressively when:

The first version of our assistant tried to be helpful in all cases. We retrained, tightened the prompts, and it now refuses more often, with much higher trust scores from users as a result.
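
One trigger that pairs naturally with the permission-scoped retrieval above, shown as a sketch rather than our actual rule set: if the scoped query comes back empty, refuse before the model gets a chance to improvise.

```python
REFUSAL_MESSAGE = "I can't answer that from the records you have access to."

def should_answer(rows: list[tuple]) -> tuple[bool, str]:
    # Decide before calling the model: nothing retrieved means nothing to ground
    # an answer on, so refuse up front instead of letting the model guess.
    if not rows:
        return False, REFUSAL_MESSAGE
    return True, ""
```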

5. The model is the cheap bit. The infrastructure is the work.

Per token, modern frontier models cost almost nothing. The work is everywhere else:

6. Don’t hide that it’s AI

UI signals matter. We mark every AI response with a small indicator and keep the underlying source rows expandable. Users learn to trust the assistant precisely because they can see what it based its answer on. The opposite design — making the assistant feel omniscient — backfires the first time it gets something subtly wrong.
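
One way to carry that through the stack, as an illustrative shape rather than the DMS's actual API: the answer never travels without the rows it was based on, so the on-screen indicator and the expandable sources come from the same object.

```python
from dataclasses import dataclass, field

@dataclass
class AssistantAnswer:
    # Illustrative response shape: the answer is always packaged with its sources.
    text: str
    source_rows: list[dict] = field(default_factory=list)  # what the UI lets users expand
    model_version: str = ""
    generated_by_ai: bool = True  # drives the small indicator in the UI
```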

What to take away if you’re considering this

If you’re thinking about embedding AI into a regulated product, the order of work matters:

  1. Write the refusal list before any prompts.
  2. Build the role-scoped retrieval layer next.
  3. Get audit-logging in before user testing.
  4. Then start on the prompts.
  5. The model choice is last, and it’ll change every six months anyway.

If your team is figuring out how to do this for a system you already run, we’ve been there. Half the value of getting it right is knowing which questions to ask up front.
