Building Sub-Second AI Patient Conversations for SimTutor

We’ve just shipped a major new capability for SimTutor: Realtime AI Patient Conversation, a native speech-to-speech engine that responds in real time and replaces the platform’s legacy turn-based voice pipeline.

The previous system required learners to click to record, click to send, then wait 2–3 seconds for a response. Every exchange felt like a walkie-talkie call, not a clinical encounter.

The new engine delivers sub-second response latency, natural interruption handling, and reactive audio visualisations driven by the patient’s voice. Sessions auto-terminate after inactivity to manage cost, then seamlessly resume with full conversation memory.

What we solved:

• Latency: Sub-second response time (down from 2–3 seconds)

• Interruptions: The AI patient stops speaking as soon as the learner starts talking

• Immersion: Reactive audio visualisations, clinical environments, diverse patient voices

• Cost control: Automatic session timeout with full context persistence on resume
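To make the cost-control idea concrete, here is a minimal Python sketch of an idle timeout that keeps conversation context for the next resume. It is an illustration only, not SimTutor's actual implementation: the `PatientSession` class, its method names, and the 60-second default are all assumptions made for this example.

```python
import time
from typing import Callable, List, Tuple


class PatientSession:
    """Hypothetical session wrapper (assumed design, for illustration only)."""

    def __init__(self, idle_timeout_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_timeout_s = idle_timeout_s  # assumed value, not from the case study
        self.clock = clock                    # injectable so the logic can be tested
        self.transcript: List[Tuple[str, str]] = []  # survives disconnects
        self.connected = False
        self.last_activity = 0.0
        self.reconnects = 0

    def _connect(self) -> None:
        # A real system would open the speech-to-speech stream here and
        # replay self.transcript so the patient "remembers" the conversation.
        if self.transcript:
            self.reconnects += 1
        self.connected = True
        self.last_activity = self.clock()

    def add_turn(self, speaker: str, text: str) -> None:
        """Record a turn, transparently resuming if the session had timed out."""
        if not self.connected:
            self._connect()
        self.transcript.append((speaker, text))
        self.last_activity = self.clock()

    def tick(self) -> None:
        """Call periodically; drops the live connection after inactivity."""
        idle_for = self.clock() - self.last_activity
        if self.connected and idle_for >= self.idle_timeout_s:
            self.connected = False  # stop paying for the stream; keep transcript
```

In this sketch, only the live connection is closed on timeout. The transcript stays in memory, so the next learner turn reconnects and carries the earlier context forward.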

Read the full case study →