Overview
The Session object on ctx.session is your interface to the live call. You can speak text, interrupt the agent’s current utterance, transfer the caller, pause or resume recording, and end the call, all from within your entrypoint function.
Speaking Text
say(text)
Immediately speak a string via the configured TTS provider:
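A minimal sketch of calling say() from an async entrypoint. Only ctx.session and say(text) come from this page; StubSession and the SimpleNamespace context are hypothetical stand-ins so the snippet runs without the SDK installed.

```python
import asyncio
from types import SimpleNamespace


class StubSession:
    """Hypothetical stand-in for the SDK's Session; records what was spoken."""

    def __init__(self):
        self.spoken = []

    async def say(self, text: str) -> None:
        # The real method hands the text to the TTS provider; here we only log it.
        self.spoken.append(text)


async def entrypoint(ctx):
    session = ctx.session
    await session.say("Hello! Thanks for calling. How can I help you today?")


# Stand-in context object; in a real deployment the SDK supplies ctx.
ctx = SimpleNamespace(session=StubSession())
asyncio.run(entrypoint(ctx))
print(ctx.session.spoken)
```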
say() dispatches the utterance and returns immediately. TTS synthesis and playback happen asynchronously on the call. Use run() to keep the call alive and process responses.
set_filler(text)
Set a filler phrase hint. The bridge plays it during brief processing silences to keep the call feeling natural:
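A short sketch of setting the filler hint once at the start of a call. The filler wording is made up, and StubSession is a hypothetical stand-in that only stores the hint; how the bridge decides when to play it is not modeled here.

```python
import asyncio


class StubSession:
    """Hypothetical stand-in for the SDK's Session."""

    def __init__(self):
        self.filler = None

    async def set_filler(self, text: str) -> None:
        # The real method passes the hint to the bridge; here we only store it.
        self.filler = text


async def configure(session) -> None:
    # Set the hint early so it is available for any processing pause.
    await session.set_filler("One moment while I look that up.")


session = StubSession()
asyncio.run(configure(session))
print(session.filler)
```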
Interrupting the Agent
interrupt()
Stop the current TTS utterance immediately:
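A minimal sketch of cutting a long utterance short and then speaking something more urgent. StubSession is a hypothetical stand-in that tracks a speaking flag. The real interrupt() acts on live audio, which this does not model.

```python
import asyncio


class StubSession:
    """Hypothetical stand-in for the SDK's Session; logs calls in order."""

    def __init__(self):
        self.speaking = False
        self.log = []

    async def say(self, text: str) -> None:
        self.speaking = True
        self.log.append(("say", text))

    async def interrupt(self) -> None:
        self.speaking = False
        self.log.append(("interrupt",))


async def cut_in(session) -> None:
    await session.say("Here are the full terms and conditions of your plan.")
    # Something more urgent came up: stop the current utterance first.
    await session.interrupt()
    await session.say("Sorry to cut in, a specialist is available now.")


session = StubSession()
asyncio.run(cut_in(session))
print(session.log)
```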
Transferring Calls
Transfer to a Human Queue
Cold-transfer the caller to a human agent queue with transfer_to_human(queue). The queue value is the queue identifier in your telephony setup.
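A minimal sketch of announcing and then performing a transfer to a human queue. The queue name "billing-support" is invented for illustration, and StubSession is a hypothetical stand-in for the real session object.

```python
import asyncio


class StubSession:
    """Hypothetical stand-in for the SDK's Session; records calls in order."""

    def __init__(self):
        self.calls = []

    async def say(self, text: str) -> None:
        self.calls.append(("say", text))

    async def transfer_to_human(self, queue: str) -> None:
        self.calls.append(("transfer_to_human", queue))


async def escalate(session) -> None:
    # Tell the caller what is about to happen before the cold transfer.
    await session.say("Let me connect you with a billing specialist.")
    await session.transfer_to_human("billing-support")  # made-up queue ID


session = StubSession()
asyncio.run(escalate(session))
print(session.calls)
```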
Transfer to Another Agent
Cold-transfer to a different Unpod AI agent by passing its ID to transfer_to_agent(agent_id).
Ending a Call
end(reason)
Gracefully hang up the call:
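A minimal sketch of two ways a call might end: a normal resolution, and a hand-off to another agent followed by end("transferred"). It assumes end() is still called after a transfer for logging purposes, as the "transferred" reason suggests. The agent ID and StubSession are hypothetical.

```python
import asyncio


class StubSession:
    """Hypothetical stand-in for the SDK's Session; records calls in order."""

    def __init__(self):
        self.calls = []

    async def say(self, text: str) -> None:
        self.calls.append(("say", text))

    async def transfer_to_agent(self, agent_id: str) -> None:
        self.calls.append(("transfer_to_agent", agent_id))

    async def end(self, reason: str) -> None:
        self.calls.append(("end", reason))


async def wrap_up(session, resolved: bool) -> None:
    if resolved:
        await session.say("Glad I could help. Goodbye!")
        await session.end("completed")
    else:
        # "agent-escalations" is a made-up agent ID for illustration.
        await session.transfer_to_agent("agent-escalations")
        await session.end("transferred")


resolved_call = StubSession()
asyncio.run(wrap_up(resolved_call, resolved=True))

escalated_call = StubSession()
asyncio.run(wrap_up(escalated_call, resolved=False))
print(resolved_call.calls, escalated_call.calls)
```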
| Reason | When to use |
|---|---|
| "completed" | Normal call resolution |
| "no_response" | User went silent for too long |
| "error" | Unrecoverable error state |
| "transferred" | After a transfer (for logging) |
| "max_duration" | Hard duration cap reached |
Recording Control
Pause Recording
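Pause the call recording with recording.pause(reason), e.g. before collecting sensitive information.
Resume Recording
Resume the recording with recording.resume() once the sensitive exchange is over.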
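A minimal sketch of pausing around a sensitive prompt and resuming afterwards, using the recording.pause(reason) and recording.resume() signatures from the API reference. The reason string and both stub classes are hypothetical. The try/finally is a design choice so that recording resumes even if the prompt raises.

```python
import asyncio


class StubRecording:
    """Hypothetical stand-in for session.recording."""

    def __init__(self):
        self.active = True
        self.history = []

    async def pause(self, reason: str) -> None:
        self.active = False
        self.history.append(("pause", reason))

    async def resume(self) -> None:
        self.active = True
        self.history.append(("resume",))


class StubSession:
    """Hypothetical stand-in for the SDK's Session."""

    def __init__(self):
        self.recording = StubRecording()
        self.spoken = []

    async def say(self, text: str) -> None:
        self.spoken.append(text)


async def collect_card_number(session) -> None:
    await session.recording.pause("collecting card number")  # made-up reason
    try:
        await session.say("Please read me your card number.")
    finally:
        # Resume even if the prompt fails, so the rest of the call is recorded.
        await session.recording.resume()


session = StubSession()
asyncio.run(collect_card_number(session))
print(session.recording.history)
```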
Per-Call Custom Data
session.data is a plain dict you can use to store anything scoped to the current call:
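A small sketch of using session.data as per-call scratch space. The keys and values are invented, and StubSession is a hypothetical stand-in exposing only the data dict.

```python
class StubSession:
    """Hypothetical stand-in for the SDK's Session; only models .data."""

    def __init__(self):
        self.data = {}


def record_attempt(session) -> int:
    # Count verification attempts for this call only.
    session.data["attempts"] = session.data.get("attempts", 0) + 1
    return session.data["attempts"]


session = StubSession()
session.data["customer_id"] = "cust_123"  # made-up identifier
record_attempt(session)
record_attempt(session)
print(session.data)
```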
The Main Loop - run()
session.run() is the event loop that keeps the call alive. It:
- Reads events from the bridge (user speech, interruptions, errors)
- Fires registered hooks
- Routes user text to the dialog adapter if one is set
- Streams the adapter’s reply back via TTS
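The four steps above can be pictured with a toy model. This is not the SDK's implementation: MiniSession, the event names "user_speech" and "hangup", and the adapter's reply() method are all assumptions made for illustration, and the reply is spoken in one piece rather than streamed.

```python
import asyncio


class MiniSession:
    """Toy model of the run() loop; not the SDK's implementation."""

    def __init__(self, events):
        self._events = list(events)  # stands in for the bridge's event stream
        self._hooks = {}
        self.dialog_machine = None
        self.spoken = []

    def on(self, event):
        def register(fn):
            self._hooks.setdefault(event, []).append(fn)
            return fn
        return register

    async def say(self, text):
        self.spoken.append(text)

    async def run(self):
        for kind, payload in self._events:          # read events
            for hook in self._hooks.get(kind, []):  # fire registered hooks
                await hook(payload)
            if kind == "hangup":
                break
            if kind == "user_speech" and self.dialog_machine is not None:
                reply = await self.dialog_machine.reply(payload)  # route text
                await self.say(reply)                             # speak reply


class EchoAdapter:
    """Hypothetical dialog adapter that echoes the caller."""

    async def reply(self, text):
        return f"You said: {text}"


async def entrypoint(session):
    heard = []

    @session.on("user_speech")
    async def remember(text):
        heard.append(text)

    session.dialog_machine = EchoAdapter()
    await session.run()
    # run() has returned, so anything from here on is post-call work.
    return heard


session = MiniSession([("user_speech", "hello"), ("hangup", None)])
heard = asyncio.run(entrypoint(session))
print(heard, session.spoken)
```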
After run() returns, the call is over. Any code after the await session.run() line runs as a post-call cleanup hook.
Live Metrics
Access per-call metrics through session.metrics, inside or after run().
CallMetrics fields
| Field | Type | Description |
|---|---|---|
| turn_count | int | Number of dialog turns |
| avg_stt_ms | float | Average STT latency per turn |
| avg_llm_ms | float | Average LLM latency per turn |
| avg_tts_ms | float | Average TTS latency per turn |
| tokens_in | int | Total input tokens consumed |
| tokens_out | int | Total output tokens generated |
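A sketch of turning the fields above into a post-call summary. The dataclass is a hypothetical stand-in with the documented field names. How these values are read from session.metrics is not shown on this page, so the function simply takes any object exposing them. Adding the three averages only approximates per-turn latency, because it ignores any overlap between stages.

```python
from dataclasses import dataclass


@dataclass
class CallMetrics:
    """Hypothetical stand-in carrying the documented field names."""

    turn_count: int
    avg_stt_ms: float
    avg_llm_ms: float
    avg_tts_ms: float
    tokens_in: int
    tokens_out: int


def summarize(m) -> dict:
    return {
        "turns": m.turn_count,
        # Rough per-turn pipeline latency: STT + LLM + TTS averages.
        "avg_turn_latency_ms": m.avg_stt_ms + m.avg_llm_ms + m.avg_tts_ms,
        "total_tokens": m.tokens_in + m.tokens_out,
    }


# Made-up sample values for illustration.
sample = CallMetrics(
    turn_count=4,
    avg_stt_ms=120.0,
    avg_llm_ms=450.0,
    avg_tts_ms=180.0,
    tokens_in=900,
    tokens_out=300,
)
summary = summarize(sample)
print(summary)
```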
Session API Reference
| Method | Signature | Description |
|---|---|---|
| say | async (text: str) → None | Speak text via TTS |
| interrupt | async () → None | Stop current utterance |
| set_filler | async (text: str) → None | Set filler phrase |
| transfer_to_human | async (queue: str) → None | Cold transfer to human |
| transfer_to_agent | async (agent_id: str) → None | Cold transfer to agent |
| end | async (reason: str) → None | End the call |
| run | async () → None | Main event loop |
| on | (event: str) → decorator | Register hook |
| recording.pause | async (reason: str) → None | Pause recording |
| recording.resume | async () → None | Resume recording |
| metrics | property → MetricsTracker | Per-call metrics |
| dialog_machine | property (get/set) | Dialog adapter |
| data | dict[str, Any] | Per-call scratch space |
Out-of-Band Session Control
The controls above run inside your entrypoint, on ctx.session. You can also act on a live session from outside it (for example, from your backend or an ops tool) using the management SDK. These calls target the session by ID and are useful for supervisor handoffs, bulk control, or reacting to events elsewhere in your stack.
These calls operate on the orchestrator that owns the live session, so the management client must be configured to reach it (see the SDK client orchestrator_base_url option). The in-call ctx.session.transfer_to_* helpers remain the simplest way to transfer from within your dialog logic.
Next Steps
Hooks & Events
React to every turn, interruption, silence, and lifecycle event.
SuperDialog Integration
Plug a dialog flow into session.dialog_machine.