AI agent patterns

Production AI agents on Catalyst combine a non-deterministic component (the LLM) with a deterministic durable runtime (Dapr workflows). The patterns below show how teams reconcile the two, keeping agent executions reliable, replay-safe, and operable at scale.

For framework-specific implementations, see the framework guides under Develop AI agents.

Replay-safe LLM calls

The LLM is non-deterministic, but workflow replay requires every step to return the same value it returned the first time. Wrap each LLM call in an activity: the workflow engine persists the activity's result on first execution, and replay reads the persisted response instead of calling the model again. Calling the LLM SDK directly from the orchestrator breaks determinism, because a replayed orchestrator would receive a different completion than the one the rest of the history was built on. Framework-specific helpers (the Dapr Agents @tool decorator, the LangGraph durable wrapper, etc.) handle this wrapping automatically.
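The mechanics can be seen in a toy replay engine. This is an illustrative sketch, not the Dapr implementation: `History`, `call_activity`, and `call_llm` are hypothetical names standing in for the workflow engine's persisted history, activity dispatch, and a non-deterministic model call.

```python
# Toy replay engine: why LLM calls must be activities. The real Dapr workflow
# runtime persists activity results the same way; names here are illustrative.
import random

class History:
    """Append-only log of completed activity results, in call order."""
    def __init__(self):
        self.results = []

def call_llm(prompt):
    # Stand-in for a non-deterministic model call: a new value every time.
    return f"completion-{random.random()}"

def call_activity(history, step, fn, *args):
    # On replay, return the persisted result instead of re-executing.
    if step < len(history.results):
        return history.results[step]
    result = fn(*args)
    history.results.append(result)  # persist on first execution
    return result

def orchestrator(history):
    # The LLM call goes through call_activity, so replay is deterministic.
    return call_activity(history, 0, call_llm, "plan the task")

history = History()
first = orchestrator(history)     # first execution: calls the model, persists
replayed = orchestrator(history)  # replay: reads history, no new model call
assert first == replayed
```

Had the orchestrator called `call_llm` directly, `replayed` would almost certainly differ from `first`, and every decision the workflow made after that point would diverge from its own history.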

Tool-call durability

Tools must run as activities, not as direct calls from the orchestrator. When a tool call is an activity, a timeout or crash resumes the workflow on the same agent step under the tool's retry policy, rather than restarting the agent loop from scratch. Register each tool as an activity, attach retry semantics to it, and on replay the persisted tool output flows back into the LLM context exactly as it did the first time.
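A minimal sketch of that resume-on-the-same-step behavior, assuming a persisted completion log and a simple retry policy (`run_tool`, `completed`, and `flaky_search` are hypothetical, not the Dapr tool-registration API):

```python
# Sketch of tool-call durability: completed tool results are persisted, so a
# resumed workflow re-reads the output instead of re-invoking the tool.
completed = {}  # persisted tool outputs, keyed by (instance_id, step)

def run_tool(instance_id, step, tool, arg, max_attempts=3):
    key = (instance_id, step)
    if key in completed:              # resume path: tool already completed
        return completed[key]
    last_err = None
    for attempt in range(max_attempts):
        try:
            result = tool(arg)
            completed[key] = result   # persist before returning
            return result
        except Exception as e:
            last_err = e              # transient failure: retry per policy
    raise last_err

calls = {"n": 0}
def flaky_search(query):
    # Fails once with a timeout, then succeeds.
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError("transient")
    return f"results for {query}"

out = run_tool("wf-123", 0, flaky_search, "dapr workflows")
again = run_tool("wf-123", 0, flaky_search, "dapr workflows")  # resume: no re-call
assert out == again and calls["n"] == 2
```

The second `run_tool` call models a process restart: the tool ran twice in total (one retry), but the resumed step reuses the persisted output rather than invoking the tool a third time.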

Idempotency for external side effects

Activities run at-least-once: if one is retried after a partial failure, its side effects (API calls, payments, emails) must not duplicate. Derive an idempotency key from the workflow instance ID plus the step or activity name, pass it to the downstream system, and have the receiver perform a conditional write that applies the effect only if the key has not been seen. This is the "at-least-once with idempotent receiver" pattern: the workflow may deliver a request twice, but the effect happens once.
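A sketch of the idempotent receiver, assuming an in-memory key set standing in for a conditional database write or a payment provider's idempotency-key support (`send_email_once` and the store are illustrative):

```python
# Idempotent receiver sketch: the same retry produces the same key, and the
# receiver applies the side effect at most once per key.
import hashlib

sent_emails = []
seen_keys = set()

def idempotency_key(instance_id, step):
    # Deterministic key from workflow identity: a retry of the same step
    # in the same instance always derives the same key.
    return hashlib.sha256(f"{instance_id}:{step}".encode()).hexdigest()

def send_email_once(key, to, body):
    if key in seen_keys:          # conditional write: effect already applied
        return "duplicate-suppressed"
    seen_keys.add(key)
    sent_emails.append((to, body))
    return "sent"

key = idempotency_key("wf-123", "notify-user")
assert send_email_once(key, "a@example.com", "done") == "sent"
# Activity retried after a crash: same instance, same step, same key.
assert send_email_once(key, "a@example.com", "done") == "duplicate-suppressed"
assert len(sent_emails) == 1
```

In production the key check and the write must be atomic (a conditional insert, not a read-then-write), otherwise two concurrent retries can both pass the check.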

Multi-agent coordination

Several coordination shapes recur: agent-as-activity, where an entire agent run is invoked as a single activity inside a parent workflow; agent handoff via pub/sub, where one agent publishes a task message that another agent's workflow consumes; and supervisor / sub-agent patterns, where a supervisor workflow fans work out to sub-agents and aggregates their results. Because every hop is a workflow or pub/sub interaction, the Catalyst App Graph reveals the resulting topology automatically. For a runnable example, see develop/agents/dapr-agents/multi-agent-orchestrator-quickstart.mdx.
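The supervisor / sub-agent shape reduces to a fan-out-and-aggregate loop. A plain-Python sketch (in a real deployment each sub-agent call would be a workflow activity or a pub/sub handoff; the agent functions here are hypothetical):

```python
# Supervisor / sub-agent sketch: the supervisor owns the plan, delegates each
# step to a registered sub-agent, and aggregates the results.
def research_agent(task):
    return f"notes on {task}"

def writing_agent(notes):
    return f"draft based on: {notes}"

SUB_AGENTS = {"research": research_agent, "write": writing_agent}

def supervisor(goal):
    # Each lookup-and-call below is one hop that would appear as an edge
    # in the resulting topology.
    notes = SUB_AGENTS["research"](goal)
    draft = SUB_AGENTS["write"](notes)
    return {"goal": goal, "notes": notes, "draft": draft}

result = supervisor("agent patterns")
assert result["draft"] == "draft based on: notes on agent patterns"
```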

Long-running sessions and human-in-the-loop

Agent sessions can span hours or days. Use wait_for_external_event to pause the workflow until a human approves or rejects a step, and race it against a durable timer to enforce an SLA: if the timer fires first, the agent escalates instead of waiting indefinitely. Session state lives in the workflow's persisted history, so a crash between the request and the approval loses nothing; the workflow resumes at the same wait. This is the same request-escalation example shown in Workflow patterns, framed for agents.
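The approval-versus-SLA race can be simulated with asyncio. This is an illustrative stand-in only: Dapr workflows express the same race with wait_for_external_event and a durable timer, and the durable version survives process crashes, which this simulation does not.

```python
# Simulation of the approval-vs-SLA race. An asyncio.Event stands in for the
# external approval event; the wait_for timeout stands in for a durable timer.
import asyncio

async def wait_for_approval(approval: asyncio.Event, sla_seconds: float):
    try:
        await asyncio.wait_for(approval.wait(), timeout=sla_seconds)
        return "approved"       # human responded within the SLA
    except asyncio.TimeoutError:
        return "escalated"      # timer won the race: escalate instead

async def main():
    fast = asyncio.Event()
    fast.set()                  # approval arrives before the deadline
    slow = asyncio.Event()      # approval never arrives
    return (await wait_for_approval(fast, 0.1),
            await wait_for_approval(slow, 0.05))

approved, escalated = asyncio.run(main())
assert approved == "approved" and escalated == "escalated"
```

The key design point carries over: the workflow models "waiting on a human" as just another durable step, so the SLA branch is ordinary control flow rather than an external watchdog.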

Cost implications of replay

Each replay re-walks the workflow history but does not re-call activities that already completed, so persisted LLM responses are not re-billed: replay cost is history processing, not tokens. Token costs do recur when a call is not persisted as an activity and therefore re-executes on replay, and when replay reaches a new branch that issues genuinely new LLM calls. Log token usage from each activity so it appears in API Logs, and keep a back-of-envelope cost model: total token cost is the sum over distinct completed LLM activities, not over replays.
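A back-of-envelope model makes the "sum over activities, not replays" point concrete. The per-token prices and activity token counts below are placeholder assumptions; substitute your model's actual rates and your logged usage.

```python
# Back-of-envelope token cost model. Prices are assumed placeholders
# ($/token); replace with your provider's actual rates.
PRICE_IN = 3.00 / 1_000_000    # assumed $ per input token
PRICE_OUT = 15.00 / 1_000_000  # assumed $ per output token

# (activity_id, input_tokens, output_tokens) for each *distinct* completed
# LLM activity in one workflow instance.
llm_activities = [
    ("plan", 1200, 300),
    ("tool-summary", 800, 150),
    ("final-answer", 2000, 600),
]

def run_cost(activities):
    return sum(i * PRICE_IN + o * PRICE_OUT for _, i, o in activities)

cost = run_cost(llm_activities)
# Replaying the workflow any number of times does not grow the activity set,
# so the billed total is unchanged.
assert run_cost(llm_activities) == cost
```

If an LLM call is made directly from the orchestrator instead of through an activity, it would re-execute on every replay and the model above undercounts; that is the cost-side argument for the replay-safe pattern above.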

See also