AI Support Agent

Build an all-in-one AI support agent that uses discli serve for full Discord control — interactive components, streaming responses, per-user conversation history, and slash commands.

The problem

Your Discord server has a growing community asking repetitive questions. You need a bot that can answer them 24/7 with accurate, context-aware responses — without hardcoded replies or decision trees. You also want rich interactions: buttons for feedback, slash commands for common tasks, and streaming responses so users see the answer as it is generated.

The solution

Use ai_serve_agent.py — an all-in-one AI agent that combines the Claude Agent SDK with discli serve to provide:

Full Discord control via CLI commands: send, reply, edit, delete, react, and manage threads, channels, roles, and more
Interactive components: buttons, select menus, and modal forms via component blocks
Streaming responses: users see the answer appear as Claude generates it
Per-user conversation history: each user gets a separate history, so the agent remembers their previous questions
Slash commands: /ask, /help, /summarize, and /clear are registered automatically on startup

Architecture

Discord Server
    |
    v
discli serve  (persistent bidirectional JSONL connection)
    |
    v
ai_serve_agent.py
    ├── Routes events (messages, slash commands, component interactions)
    ├── Maintains per-user conversation history
    ├── Sends to Claude Agent SDK for reasoning
    └── Writes actions to stdin (send, reply, stream, components, modals)
    |
    v
Discord Server (responses, embeds, buttons, modals)
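
The routing step in the diagram can be pictured as a small dispatcher over JSONL lines. This is only a sketch of the idea, not the code in ai_serve_agent.py: the "event" field, the "message" event name, the event's "channel_id" field, and the fields of the "send" action are all assumptions made for illustration, so check the discli serve protocol for the real shapes.

```python
import json
from typing import Callable, Dict, List

# A handler takes one decoded event and returns the actions to write back.
Handler = Callable[[dict], List[dict]]


def make_router(handlers: Dict[str, Handler]) -> Callable[[str], List[str]]:
    """Return a function that maps one JSONL event line to JSONL action lines."""

    def route(line: str) -> List[str]:
        line = line.strip()
        if not line:
            return []
        event = json.loads(line)
        # "event" as the type field is a hypothetical name, not confirmed by the docs.
        handler = handlers.get(event.get("event", ""))
        if handler is None:
            return []  # silently ignore event types we do not handle
        return [json.dumps(action) for action in handler(event)]

    return route


def on_message(event: dict) -> List[dict]:
    # The "send" action is listed in this guide; its field names here are assumed.
    return [{
        "action": "send",
        "channel_id": event["channel_id"],
        "content": "Thanks, looking into it.",
    }]


route = make_router({"message": on_message})
```

In a real agent, each string returned by route would be written to the stdin of the discli serve process, one line per action.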

Full working code

The complete agent is available at examples/ai_serve_agent.py. Here is how to get started:

Install dependencies

pip install discord-cli-agent claude-agent-sdk

Make sure your bot token is configured:

discli config set token YOUR_BOT_TOKEN

Run the agent

python examples/ai_serve_agent.py

The agent will connect to Discord, register slash commands, and start listening for events.

What users can do

Send messages

Users can mention the bot in any channel:

@bot How do I reset my password?
@bot What are the server rules?
@bot Summarize the last 10 messages in this channel

The agent fetches conversation context, sends it to Claude, and streams the response back in real time.

Use slash commands

Command	Description
`/ask question:How do I...`	Ask the bot a question directly
`/help`	Show available commands and capabilities
`/summarize`	Summarize recent conversation in the current channel
`/clear`	Clear your conversation history with the bot

Interact with components

The agent can send buttons for feedback, select menus for choices, and modal forms for structured input. When a user clicks a button or submits a form, the agent receives the interaction and responds accordingly.

Key features

Streaming responses

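When the agent generates a long answer, it streams the response so users see text appearing in real time rather than waiting for the full response. A stream_start action opens the stream, each stream_chunk appends text to it, and stream_end closes it: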

{"action": "stream_start", "channel_id": "123", "reply_to": "456"}
{"action": "stream_chunk", "stream_id": "abc", "content": "Here is "}
{"action": "stream_chunk", "stream_id": "abc", "content": "the answer..."}
{"action": "stream_end", "stream_id": "abc"}
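
A sketch of how these lines might be built in Python follows. The action names and fields are taken from the example above; the helper names start_line and chunk_lines are hypothetical, and the sketch assumes the caller already knows the stream_id, since how that ID is obtained is not shown here.

```python
import json
from typing import Iterable, Iterator


def start_line(channel_id: str, reply_to: str) -> str:
    """Build the JSONL line that opens a stream as a reply to a message."""
    return json.dumps(
        {"action": "stream_start", "channel_id": channel_id, "reply_to": reply_to}
    )


def chunk_lines(stream_id: str, chunks: Iterable[str]) -> Iterator[str]:
    """Yield one stream_chunk line per non-empty chunk, then a stream_end line."""
    for chunk in chunks:
        if chunk:  # skip empty chunks so no empty updates are sent
            yield json.dumps(
                {"action": "stream_chunk", "stream_id": stream_id, "content": chunk}
            )
    yield json.dumps({"action": "stream_end", "stream_id": stream_id})
```

Because chunk_lines is a generator, it can wrap a model's token iterator directly and emit each line as soon as the chunk arrives.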

Per-user conversation history

Each user’s questions and the agent’s responses are stored in memory, so the history does not survive a restart of the agent. While the agent is running, this lets it handle follow-up questions naturally:

User: What is discli?
Bot:  discli is a Discord CLI for AI agents and humans...
User: How do I install it?
Bot:  You can install it with pip: pip install discord-cli-agent

The /clear command resets a user’s history.
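
One way to picture an in-memory, per-user history is the sketch below. It is not the implementation in ai_serve_agent.py: the HistoryStore class, the (role, text) tuple layout, and the 20-turn cap are assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Turn = Tuple[str, str]  # (role, text), e.g. ("user", "What is discli?")


class HistoryStore:
    """Keep the most recent turns for each user, in memory only."""

    def __init__(self, max_turns: int = 20) -> None:
        # Each user gets a bounded deque; the oldest turns fall off automatically.
        self._turns: Dict[str, Deque[Turn]] = defaultdict(
            lambda: deque(maxlen=max_turns)
        )

    def add(self, user_id: str, role: str, text: str) -> None:
        self._turns[user_id].append((role, text))

    def get(self, user_id: str) -> List[Turn]:
        # .get avoids creating an empty entry for users we have never seen.
        return list(self._turns.get(user_id, ()))

    def clear(self, user_id: str) -> None:
        """What a /clear handler could call for the invoking user."""
        self._turns.pop(user_id, None)
```

Capping the number of turns keeps both memory use and prompt size bounded as conversations grow.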

Interactive components

The agent can send buttons, select menus, and modals as part of its responses. See the Components & Modals guide for the full protocol reference.

Edge cases and pitfalls

Warning

Discord’s 2,000-character message limit. Streaming mode handles this automatically: stream_end splits the content across multiple messages if it exceeds the limit.
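
This guide only describes the automatic split for streamed replies. If you assemble a long non-streamed message yourself, a manual splitter could look like the minimal sketch below; split_message is a hypothetical helper, not part of discli, and it prefers to break at newlines or spaces.

```python
from typing import List

DISCORD_LIMIT = 2000  # maximum characters in one regular Discord message


def split_message(text: str, limit: int = DISCORD_LIMIT) -> List[str]:
    """Split text into parts of at most `limit` characters each."""
    parts: List[str] = []
    while len(text) > limit:
        # Prefer the last newline that keeps the part within the limit.
        cut = text.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = text.rfind(" ", 0, limit + 1)  # otherwise the last space
        if cut <= 0:
            cut = limit  # no natural break point: hard cut
        parts.append(text[:cut])
        # Drop the separator(s) we split on before continuing.
        text = text[cut:].lstrip("\n ")
    if text:
        parts.append(text)
    return parts
```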

Warning

Rate limits. Discord rate-limits bots to roughly 5 messages per 5 seconds per channel. If your server is busy and the bot gets many mentions at once, responses will queue up. Consider adding a cooldown or debounce mechanism.
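
A client-side throttle can keep the agent under that budget instead of relying on Discord to reject requests. The sketch below is a sliding-window limiter per channel; the ChannelThrottle class is hypothetical, and its defaults of 5 messages per 5 seconds simply mirror the rough figure above rather than an exact Discord guarantee.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class ChannelThrottle:
    """Allow at most `max_messages` sends per `window` seconds per channel."""

    def __init__(
        self,
        max_messages: int = 5,
        window: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_messages = max_messages
        self.window = window
        self._clock = clock
        self._sent: Dict[str, Deque[float]] = defaultdict(deque)

    def try_send(self, channel_id: str) -> float:
        """Return 0.0 and record the send if allowed, else seconds to wait."""
        now = self._clock()
        sent = self._sent[channel_id]
        # Forget sends that have left the window.
        while sent and now - sent[0] >= self.window:
            sent.popleft()
        if len(sent) < self.max_messages:
            sent.append(now)
            return 0.0
        return self.window - (now - sent[0])
```

A caller that gets a non-zero value back can sleep for that long and retry, which turns bursts of mentions into an orderly queue.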

Warning

Claude API costs. Every @mention and slash command triggers a Claude API call. In a busy server, this can add up. Consider adding a per-user cooldown to prevent abuse.
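
A per-user cooldown can be as small as the sketch below. The Cooldown class and the idea of checking it before each model call are assumptions for illustration; the example agent may handle this differently or not at all.

```python
import time
from typing import Callable, Dict


class Cooldown:
    """Allow each user one request every `seconds` seconds."""

    def __init__(
        self, seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.seconds = seconds
        self._clock = clock
        self._last: Dict[str, float] = {}

    def allow(self, user_id: str) -> bool:
        """Return True and start a new cooldown, or False if still cooling down."""
        now = self._clock()
        last = self._last.get(user_id)
        if last is not None and now - last < self.seconds:
            return False  # rejected requests do not extend the cooldown
        self._last[user_id] = now
        return True
```

Calling allow(user_id) before each model request, and replying with a short "please wait" message when it returns False, avoids paying for rapid repeat questions.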

Warning

Error handling. The example agent includes basic error handling, but in production you should add fallback messages, retry logic, and monitoring.
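
Retry logic with a fallback message might look like the following sketch. The ask callable stands in for whatever function calls the model, and answer_with_retry, the FALLBACK text, and the exponential backoff schedule are all assumptions rather than behavior of the example agent.

```python
import time
from typing import Callable

FALLBACK = "Sorry, I could not answer that right now. Please try again shortly."


def answer_with_retry(
    ask: Callable[[str], str],
    question: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call ask(question), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return ask(question)
        except Exception:
            # Wait 1x, 2x, 4x ... base_delay, but not after the final attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return FALLBACK
```

In production you would likely log the caught exception and narrow the except clause to the errors your model client actually raises.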

Extending the agent

Components & Modals

Add buttons, select menus, and modal forms to your agent’s responses. See Components & Modals.

Memory and knowledge base

Store past Q&A pairs in a database and include relevant ones in Claude’s context. This lets the agent learn from previous answers without retraining.
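
As a rough sketch of this idea, the snippet below stores pairs in SQLite and retrieves them by simple word overlap. The table layout and the helper names open_store, save_pair, and find_relevant are hypothetical, and word overlap is a deliberately naive stand-in for whatever retrieval method you choose (full-text search or embeddings would likely work better).

```python
import re
import sqlite3
from typing import List, Set, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the Q&A database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS qa (question TEXT NOT NULL, answer TEXT NOT NULL)"
    )
    return conn


def save_pair(conn: sqlite3.Connection, question: str, answer: str) -> None:
    conn.execute("INSERT INTO qa (question, answer) VALUES (?, ?)", (question, answer))
    conn.commit()


def _words(text: str) -> Set[str]:
    # Lowercased words longer than two characters, to skip most filler words.
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 2}


def find_relevant(
    conn: sqlite3.Connection, question: str, limit: int = 3
) -> List[Tuple[str, str]]:
    """Return up to `limit` stored pairs sharing the most words with `question`."""
    query = _words(question)
    scored = []
    for stored_q, stored_a in conn.execute("SELECT question, answer FROM qa"):
        overlap = len(query & _words(stored_q))
        if overlap:
            scored.append((overlap, stored_q, stored_a))
    scored.sort(key=lambda row: row[0], reverse=True)
    return [(q, a) for _, q, a in scored[:limit]]
```

The pairs returned by find_relevant can then be prepended to the prompt as reference material before the user's new question.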

Multi-model routing

Use a smaller model (like Haiku) for simple questions and route complex ones to a larger model (like Opus). Question length or keyword patterns can serve as the routing signal.
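
A minimal routing heuristic along those lines is sketched below. The model labels are placeholders rather than real model IDs, and the keyword set and 25-word threshold are arbitrary assumptions you would tune against your own traffic.

```python
import re

SMALL_MODEL = "small-model"  # placeholder label, not a real model ID
LARGE_MODEL = "large-model"  # placeholder label, not a real model ID

# Hypothetical keywords that hint at a question needing deeper reasoning.
COMPLEX_KEYWORDS = frozenset(
    {"debug", "traceback", "error", "migrate", "architecture"}
)


def choose_model(question: str, max_simple_words: int = 25) -> str:
    """Pick a model label from question length and keyword hints."""
    words = re.findall(r"[a-z0-9']+", question.lower())
    if len(words) > max_simple_words:
        return LARGE_MODEL  # long questions go to the larger model
    if COMPLEX_KEYWORDS.intersection(words):
        return LARGE_MODEL  # so do questions containing a complex keyword
    return SMALL_MODEL
```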

Thread-based replies

Instead of replying inline, create a thread for each support request to keep the channel clean. See Thread-Based Support.

Last updated: March 22, 2026
