Chatbot Maintenance Guides

Keeping chatbots running well over time

Practical reference material for teams managing deployed chatbots — covering update cycles, intent drift, log audits, and when to escalate to a full retraining run.

6 guides Technical reference Piscandolu — since 2015

What gets missed in maintenance

Most chatbot problems are not model failures — they are process gaps. These guides address the specific decisions teams face after a chatbot has been live for more than a few weeks.

02
Model updates

When retraining is worth the effort

Retraining takes time and resets your confidence baselines. Before starting, confirm that the performance issue is not a single misconfigured entity or a single outdated response — those take minutes to fix without touching the model.

Retraining is justified when more than 12% of weekly conversations end in fallback, or when a product or policy change has made a significant portion of trained examples factually wrong.
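
The fallback threshold above can be checked mechanically from a week of logs. The following is a minimal sketch, assuming a hypothetical `Conversation` record with an `ended_in_fallback` flag; the field names are illustrative and not tied to any platform's log schema. The second condition (examples made wrong by a product or policy change) stays a human judgment and is passed in as a flag.

```python
from dataclasses import dataclass

# Threshold from the guide text: more than 12% of weekly conversations
# ending in fallback.
FALLBACK_THRESHOLD = 0.12


@dataclass
class Conversation:
    # Hypothetical log record; field names are assumptions.
    session_id: str
    ended_in_fallback: bool


def weekly_fallback_rate(conversations):
    """Share of the week's conversations that ended in fallback."""
    if not conversations:
        return 0.0
    fallbacks = sum(1 for c in conversations if c.ended_in_fallback)
    return fallbacks / len(conversations)


def retraining_justified(conversations, stale_examples_significant=False):
    """True if the fallback rate exceeds the threshold, or if a reviewer has
    flagged that a significant share of trained examples is now wrong."""
    over_threshold = weekly_fallback_rate(conversations) > FALLBACK_THRESHOLD
    return over_threshold or stale_examples_significant


# Illustrative week: 13 of 100 conversations ended in fallback.
week = [Conversation(f"s{i}", ended_in_fallback=(i < 13)) for i in range(100)]
print(retraining_justified(week))
```

Note that the comparison is strict: a week at exactly 12% does not trigger retraining on its own.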

03
Issue diagnosis

Reading confidence scores without misreading them

A low confidence score on a correct response is not the same problem as a high confidence score on a wrong response. The second is more disruptive to users and harder to catch without deliberate review.

Set separate review queues for these two failure types and prioritize the high-confidence errors — they are the ones users act on.
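
One way to build the two queues is sketched below. It assumes each bot turn has already been given a human correctness verdict, and that the cut-offs `LOW_CONFIDENCE` and `HIGH_CONFIDENCE` are illustrative values you would tune to your own score distribution; neither the record shape nor the numbers come from a specific platform.

```python
from dataclasses import dataclass

# Illustrative cut-offs (assumptions); tune to your platform's scores.
LOW_CONFIDENCE = 0.5
HIGH_CONFIDENCE = 0.8


@dataclass
class ReviewedTurn:
    # Hypothetical record: one bot response plus a reviewer's verdict.
    turn_id: str
    confidence: float
    is_correct: bool


def build_review_queues(turns):
    """Split reviewed turns into the two failure types described above."""
    high_conf_wrong = [
        t for t in turns if t.confidence >= HIGH_CONFIDENCE and not t.is_correct
    ]
    low_conf_right = [
        t for t in turns if t.confidence < LOW_CONFIDENCE and t.is_correct
    ]
    # Most confident errors first: these are the ones users tend to act on.
    high_conf_wrong.sort(key=lambda t: t.confidence, reverse=True)
    return {
        "high_confidence_errors": high_conf_wrong,
        "low_confidence_correct": low_conf_right,
    }
```

Working through `high_confidence_errors` before `low_confidence_correct` matches the prioritization in the text.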

Handling entity extraction failures in maintenance logs

Entity failures are quieter than intent failures — the chatbot may still respond, just with the wrong specifics filled in. A user asking about a return for a particular order number gets a generic return policy instead. Technically resolved; practically useless.

In your log review, filter for sessions where an entity slot was expected but left empty, or where the filled value does not match the expected format. Review these alongside the transcript to determine whether the issue is a pattern gap in training data or an edge case in the entity recognizer.
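
The log filter described above could look roughly like this. It is a sketch under stated assumptions: `EXPECTED_SLOTS` maps a hypothetical intent name to the slots it should fill and a format pattern for each, and the eight-digit order number format is invented for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical expectations per intent: slot name -> required format.
# The intent name and the 8-digit pattern are assumptions for illustration.
EXPECTED_SLOTS = {
    "return_request": {"order_number": re.compile(r"\d{8}")},
}


@dataclass
class Session:
    # Hypothetical log record with the slots the recognizer filled.
    session_id: str
    intent: str
    slots: dict = field(default_factory=dict)


def entity_failures(sessions):
    """Return (session_id, slot, reason) for each expected slot that was
    left empty or filled with a value that does not match its format."""
    failures = []
    for s in sessions:
        for slot, pattern in EXPECTED_SLOTS.get(s.intent, {}).items():
            value = s.slots.get(slot)
            if not value:
                failures.append((s.session_id, slot, "empty"))
            elif not pattern.fullmatch(value):
                failures.append((s.session_id, slot, "bad_format"))
    return failures
```

The flagged session IDs are the ones to read alongside their transcripts; the filter only narrows the review set and does not say whether the cause is a training-data gap or a recognizer edge case.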

Add synonym lists before adding new entity examples. This is often faster, and it is usually sufficient for coverage gaps that affect fewer than 6 sessions per week.

Scheduling updates without disrupting live conversations

Most chatbot platforms reload the model between conversations, not mid-session, so the risk of interrupting an active session is lower than people assume. The actual risk is deploying an update that introduces a regression — a response that worked before and no longer does.

Run a small regression test set before every deploy: a fixed list of 20–30 inputs covering your highest-volume intents, each with a confirmed expected output. If all pass, the deploy is low risk. If any fail, hold the update and trace the conflict in the training data before proceeding.
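
A pre-deploy regression check can be very small. The sketch below assumes the expected output for each input is an intent name and that the candidate model is reachable through a `classify(text)` callable; both are assumptions, and `keyword_classifier` is only a stand-in so the example runs on its own. A real set would hold the 20–30 inputs described above.

```python
# Fixed regression set: (input, expected intent). Entries are illustrative.
REGRESSION_SET = [
    ("where is my order", "order_status"),
    ("i want to return this", "return_request"),
    ("reset my password", "account_recovery"),
]


def run_regression(classify, cases=REGRESSION_SET):
    """Return failing cases as (input, expected, actual) tuples."""
    failures = []
    for text, expected in cases:
        actual = classify(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


def safe_to_deploy(classify, cases=REGRESSION_SET):
    """Deploy only if every case in the fixed set still passes."""
    return not run_regression(classify, cases)


def keyword_classifier(text):
    # Stand-in for the candidate model (hypothetical, for demonstration).
    if "order" in text:
        return "order_status"
    if "return" in text:
        return "return_request"
    if "password" in text:
        return "account_recovery"
    return "fallback"


if not safe_to_deploy(keyword_classifier):
    for text, expected, actual in run_regression(keyword_classifier):
        print(f"HOLD: {text!r} expected {expected}, got {actual}")
```

Returning the failing cases, rather than a bare pass or fail, gives you the starting point for tracing the conflict in training data.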

Off-peak windows (early morning, weekends) reduce exposure but do not substitute for a regression test. Run both.

Response versioning: tracking what changed and why

After a few months of updates, it becomes difficult to explain why a particular response was changed or what problem it was meant to fix. Without a simple change log, teams repeat past mistakes or undo changes that were made for a reason no one remembers.

A plain text file or shared document with date, intent name, what changed, and a one-line reason is enough. It does not need to be formal — just consistent. Teams that maintain this habit spend noticeably less time diagnosing regressions after updates.
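
If the log lives in a plain text file, a small helper keeps entries consistent. This is a sketch, not a prescribed format: the pipe-separated layout, the function names, and the example entry are all assumptions chosen for illustration.

```python
from datetime import date
from pathlib import Path


def format_entry(changed_on, intent, change, reason):
    """One log line: date, intent name, what changed, one-line reason."""
    fields = [changed_on.isoformat(), intent, change, reason]
    # Collapse whitespace so every entry stays on a single line.
    return " | ".join(" ".join(f.split()) for f in fields)


def append_entry(log_path, intent, change, reason, changed_on=None):
    """Append one entry to the change log, defaulting to today's date."""
    entry = format_entry(changed_on or date.today(), intent, change, reason)
    with Path(log_path).open("a", encoding="utf-8") as log:
        log.write(entry + "\n")
    return entry


# Illustrative entry; the intent name and wording are made up.
print(format_entry(
    date(2024, 3, 5),
    "return_request",
    "Updated return window from 14 to 30 days",
    "Policy change",
))
```

Because each entry is a single line that starts with an ISO date, the file sorts chronologically and can be searched by intent name with ordinary text tools.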