AI Support Agent
Build an all-in-one AI support agent that uses discli serve for full Discord control — interactive components, streaming responses, per-user conversation history, and slash commands.
The problem
Your Discord server has a growing community asking repetitive questions. You need a bot that can answer them 24/7 with accurate, context-aware responses — without hardcoded replies or decision trees. You also want rich interactions: buttons for feedback, slash commands for common tasks, and streaming responses so users see the answer as it is generated.
The solution
Use ai_serve_agent.py — an all-in-one AI agent that combines the Claude Agent SDK with discli serve to provide:
- Full Discord control via CLI commands — send, reply, edit, delete, react, manage threads, channels, roles, and more
- Interactive components — buttons, select menus, and modal forms via component blocks
- Streaming responses — users see the answer word-by-word as Claude generates it
- Per-user conversation history — each user gets their own context window so the agent remembers previous questions
- Slash commands (/ask, /help, /summarize, /clear) registered automatically on startup
Architecture

    Discord Server
          |
          v
    discli serve (persistent bidirectional JSONL connection)
          |
          v
    ai_serve_agent.py
      ├── Routes events (messages, slash commands, component interactions)
      ├── Maintains per-user conversation history
      ├── Sends to Claude Agent SDK for reasoning
      └── Writes actions to stdin (send, reply, stream, components, modals)
          |
          v
    Discord Server (responses, embeds, buttons, modals)

Full working code
The complete agent is available at examples/ai_serve_agent.py. Here is how to get started:
Install dependencies

    pip install discord-cli-agent claude-agent-sdk

Make sure your bot token is configured:

    discli config set token YOUR_BOT_TOKEN

Run the agent

    python examples/ai_serve_agent.py

The agent will connect to Discord, register slash commands, and start listening for events.
What users can do
Send messages
Users can mention the bot in any channel:

    @bot How do I reset my password?
    @bot What are the server rules?
    @bot Summarize the last 10 messages in this channel

The agent fetches conversation context, sends it to Claude, and streams the response back in real time.
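As a rough illustration of the streaming step, the helper below turns text fragments into the stream_start, stream_chunk, and stream_end actions shown under Streaming responses. The function names are hypothetical, and it assumes the caller already holds the stream_id for the open stream; how discli serve hands that ID back is not covered on this page.

```python
import json
from typing import Iterable, Iterator


def stream_start(channel_id: str, reply_to: str) -> str:
    """Build the JSONL line that opens a streamed reply."""
    return json.dumps(
        {"action": "stream_start", "channel_id": channel_id, "reply_to": reply_to}
    )


def stream_actions(stream_id: str, fragments: Iterable[str]) -> Iterator[str]:
    """Yield one stream_chunk line per non-empty fragment, then stream_end.

    stream_id is assumed to be known to the caller already.
    """
    for fragment in fragments:
        if fragment:  # skip empty fragments so no blank chunks are sent
            yield json.dumps(
                {"action": "stream_chunk", "stream_id": stream_id, "content": fragment}
            )
    yield json.dumps({"action": "stream_end", "stream_id": stream_id})
```

Each yielded string is one line to write to the stdin of discli serve, followed by a newline.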
Use slash commands
| Command | Description |
|---|---|
| /ask question:How do I... | Ask the bot a question directly |
| /help | Show available commands and capabilities |
| /summarize | Summarize recent conversation in the current channel |
| /clear | Clear your conversation history with the bot |
Interact with components
The agent can send buttons for feedback, select menus for choices, and modal forms for structured input. When a user clicks a button or submits a form, the agent receives the interaction and responds accordingly.
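The three kinds of input above (messages, slash commands, and component interactions) all arrive as JSON lines, so the agent needs some form of dispatch. The sketch below shows one possible shape for it. The event type names ("message", "slash_command", "component_interaction") and the field names are assumptions for illustration only; check the actual output of discli serve before relying on them.

```python
import json
from typing import Callable, Dict, Optional


def handle_message(event: dict) -> dict:
    # Hypothetical fields: author_id and content.
    return {"kind": "answer", "user_id": event.get("author_id"), "text": event.get("content", "")}


def handle_slash(event: dict) -> dict:
    # Hypothetical field: command (for example "ask" or "clear").
    return {"kind": "command", "name": event.get("command")}


def handle_component(event: dict) -> dict:
    # Hypothetical field: custom_id of the clicked button or submitted modal.
    return {"kind": "component", "custom_id": event.get("custom_id")}


HANDLERS: Dict[str, Callable[[dict], dict]] = {
    "message": handle_message,
    "slash_command": handle_slash,
    "component_interaction": handle_component,
}


def route(line: str) -> Optional[dict]:
    """Parse one JSONL line and dispatch it; return None for anything unknown."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None  # ignore malformed lines instead of crashing the loop
    if not isinstance(event, dict):
        return None
    handler = HANDLERS.get(event.get("event"))
    return handler(event) if handler else None
```

Keeping the handlers in a dictionary means a new event type only needs one new entry, and unknown events are dropped quietly rather than raising.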
Key features
Streaming responses
When the agent generates a long answer, it streams the response so users see text appearing in real time rather than waiting for the full response:

    {"action": "stream_start", "channel_id": "123", "reply_to": "456"}
    {"action": "stream_chunk", "stream_id": "abc", "content": "Here is "}
    {"action": "stream_chunk", "stream_id": "abc", "content": "the answer..."}
    {"action": "stream_end", "stream_id": "abc"}

Per-user conversation history
Each user’s questions and the agent’s responses are stored in memory. This lets the agent handle follow-up questions naturally:

    User: What is discli?
    Bot: discli is a Discord CLI for AI agents and humans...
    User: How do I install it?
    Bot: You can install it with pip: pip install discord-cli-agent

The /clear command resets a user’s history.
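A minimal sketch of an in-memory, per-user history is shown below. It is not taken from ai_serve_agent.py: the class name, the max_turns cap, and the role/content message shape are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class ConversationHistory:
    """Keep the most recent turns for each user, in memory only."""

    def __init__(self, max_turns: int = 20) -> None:
        # A bounded deque drops the oldest turn once max_turns is reached.
        self._turns: Dict[str, Deque[dict]] = defaultdict(
            lambda: deque(maxlen=max_turns)
        )

    def add(self, user_id: str, role: str, content: str) -> None:
        self._turns[user_id].append({"role": role, "content": content})

    def get(self, user_id: str) -> List[dict]:
        # Use .get so a lookup does not create an empty entry for the user.
        return list(self._turns.get(user_id, ()))

    def clear(self, user_id: str) -> None:
        # What a /clear handler could call.
        self._turns.pop(user_id, None)
```

Capping the number of turns keeps the context sent to the model bounded. Because everything lives in memory, history is lost when the process restarts.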
Interactive components
The agent can send buttons, select menus, and modals as part of its responses. See the Components & Modals guide for the full protocol reference.
Edge cases and pitfalls
Discord’s 2000-character limit. Streaming mode handles this automatically: stream_end splits content across multiple messages if it exceeds the limit.
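If the agent ever sends a long answer in a single non-streamed message, it may need to split the text itself. Whether discli's plain send and reply actions split long content on their own is not stated here, so the helper below is a defensive sketch with a hypothetical name, not part of discli.

```python
from typing import List


def split_message(text: str, limit: int = 2000) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to cut at a newline, then at a space, and only cuts mid-word
    when no whitespace is available. Whitespace at the cut point is dropped.
    """
    chunks: List[str] = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = text.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no whitespace found: hard cut at the limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks
```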
Rate limits. Discord rate-limits bots to roughly 5 messages per 5 seconds per channel. If your server is busy and the bot gets many mentions at once, responses will queue up. Consider adding a cooldown or debounce mechanism.
Claude API costs. Every @mention and slash command triggers a Claude API call. In a busy server, this can add up. Consider adding a per-user cooldown to prevent abuse.
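One way to implement the per-user cooldown is sketched below. The class name and the injectable clock are illustrative choices, not something the example agent is known to include.

```python
import time
from typing import Callable, Dict


class UserCooldown:
    """Allow each user at most one request every `seconds` seconds."""

    def __init__(self, seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.seconds = seconds
        self._clock = clock  # injectable so the class can be tested without sleeping
        self._last: Dict[str, float] = {}

    def allow(self, user_id: str) -> bool:
        now = self._clock()
        last = self._last.get(user_id)
        if last is not None and now - last < self.seconds:
            return False  # still cooling down; do not reset the timer
        self._last[user_id] = now
        return True
```

The agent would call allow(user_id) before each model call and reply with a short "please wait" message when it returns False.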
Error handling. The example agent includes basic error handling, but in production you should add fallback messages, retry logic, and monitoring.
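Retry logic with a fallback message could look roughly like the sketch below. The function name, the fallback wording, and the backoff values are placeholders; fn stands in for whatever call the agent makes to the model.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("support_agent")

FALLBACK = "Sorry, I couldn't generate an answer right now. Please try again shortly."


def call_with_retry(
    fn: Callable[[], str],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fn up to `attempts` times with exponential backoff.

    Returns FALLBACK instead of raising when every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            # Log for monitoring; a real agent might retry only transient errors.
            log.exception("attempt %d of %d failed", attempt + 1, attempts)
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return FALLBACK
```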
Extending the agent
Components & Modals
Add buttons, select menus, and modal forms to your agent’s responses. See Components & Modals.
Memory and knowledge base
Store past Q&A pairs in a database and include relevant ones in Claude’s context. This lets the agent learn from previous answers without retraining.
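A very small version of this idea, using SQLite and plain word overlap for retrieval, might look like the following. The table layout and the scoring are assumptions made for the sketch; a production setup would more likely use full-text search or embeddings.

```python
import re
import sqlite3
from typing import List, Tuple


def _words(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


class QAStore:
    """Store past question/answer pairs and fetch the most similar ones."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS qa (question TEXT, answer TEXT)")

    def add(self, question: str, answer: str) -> None:
        self.db.execute("INSERT INTO qa VALUES (?, ?)", (question, answer))
        self.db.commit()

    def relevant(self, query: str, k: int = 3) -> List[Tuple[str, str]]:
        """Return up to k stored pairs ranked by shared words with the query."""
        query_words = _words(query)
        scored = []
        for question, answer in self.db.execute("SELECT question, answer FROM qa"):
            overlap = len(query_words & _words(question))
            if overlap:
                scored.append((overlap, question, answer))
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(q, a) for _, q, a in scored[:k]]
```

The pairs returned by relevant() can be prepended to the prompt as extra context before the user's question.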
Multi-model routing
Use a smaller model (like Haiku) for simple questions and route complex ones to a larger model (like Opus). Question length or keyword patterns can serve as the routing signal.
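A simple length-and-keyword heuristic could be sketched as follows. The keyword list and threshold are arbitrary examples, and "haiku" and "opus" are placeholder labels, not real model identifiers; map them to whatever model names your SDK configuration expects.

```python
import re

# Hypothetical signals that a question needs deeper reasoning.
COMPLEX_KEYWORDS = {"debug", "error", "traceback", "architecture", "why"}


def pick_model(question: str, long_threshold: int = 200) -> str:
    """Return a placeholder model label based on length and keywords."""
    tokens = set(re.findall(r"\w+", question.lower()))
    if len(question) > long_threshold or tokens & COMPLEX_KEYWORDS:
        return "opus"
    return "haiku"
```

Matching on whole words avoids false positives from substrings, such as "why" inside another word.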
Thread-based replies
Instead of replying inline, create a thread for each support request to keep the channel clean. See Thread-Based Support.