SSE Streaming

Server-Sent Events (SSE) enable real-time streaming of AI responses, providing a better user experience by displaying text as it's generated rather than waiting for the complete response.

Overview

When you send a message with stream: true, the API returns a stream of events instead of a single JSON response. Each event contains incremental data about the agent's processing and response.

Enabling Streaming

To enable streaming, set stream: true in your message request:

curl -N -X POST "https://api.codeer.ai/api/v1/chats/12345/messages" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "message": "Hello!",
    "stream": true,
    "agent_id": "550e8400-e29b-41d4-a716-446655440000"
  }'

Tool Tag Visibility

include_tool_tags defaults to false.
When omitted or false, <tool ...>...</tool> blocks are removed from streaming text events.
Internal clients that need raw tool blocks must explicitly send include_tool_tags=true.

Event Format

Events follow the SSE specification:

event: <event_type>
data: <json_payload>

Each event includes common fields:

Field	Description
`type`	The event type
`response_id`	Unique identifier for this response
`chat_id`	The chat session ID

Event Types

response.created

Emitted when the LLM is initialized and ready to process.

{
  "type": "response.created",
  "response_id": "abc123",
  "chat_id": 12345,
  "agent_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "gpt-4",
  "conversation_group_id": "group_abc",
  "user_conversation_id": 1001
}

response.chat.title.updated

Emitted when the chat title is auto-generated (typically after the first message).

{
  "type": "response.chat.title.updated",
  "response_id": "abc123",
  "chat_id": 12345,
  "name": "Question about business hours"
}

response.reasoning_step.start

Emitted when the agent begins a reasoning step (e.g., searching knowledge base, calling a tool).

{
  "type": "response.reasoning_step.start",
  "response_id": "abc123",
  "chat_id": 12345,
  "step": {
    "id": "step_abc123",
    "content": "Searching knowledge base for business hours",
    "tool_name": "retrieve_context_objs",
    "args": {
      "query": "business hours"
    },
    "timestamp": "2024-01-15T10:30:00Z"
  }
}

`tool_name`	Description
`search_web`	Searching the web
`fetch_web_content`	Fetching content from a URL
`retrieve_context_objs`	Selecting relevant knowledge objects
`get_context_obj_lines`	Reading relevant lines from selected knowledge objects
`call_agent`	Calling another agent
`request_form`	Requesting form input
`memory`	Writing user memory
`http_request`	Calling a configured HTTP endpoint
`lookup_history` / `lookup_attachments`	Looking up prior history or attachments
`generate_image`	Generating images

response.reasoning_step.end

Emitted when a reasoning step completes.

{
  "type": "response.reasoning_step.end",
  "response_id": "abc123",
  "chat_id": 12345,
  "step": {
    "id": "step_abc123",
    "tool_name": "retrieve_context_objs",
    "result": {
      "success": true,
      "data": "Found 3 relevant documents..."
    },
    "timestamp": "2024-01-15T10:30:01Z",
    "token_usage": {
      "total_prompt_tokens": 150,
      "total_completion_tokens": 45,
      "total_tokens": 195,
      "total_calls": 1
    }
  }
}

response.received

Emitted when Codeer accepts the user's message but the current turn is intentionally handled without an AI response, such as a human-only handoff window.

{
  "type": "response.received",
  "response_id": "abc123",
  "chat_id": 12345,
  "conversation_group_id": "group_abc",
  "user_conversation_id": 1001,
  "user_meta": {
    "mode": "human_only"
  }
}

When you receive this event, show that the message was received and do not wait for response.output_text.completed.

response.output_text.delta

Emitted for each chunk of generated text. Concatenate these to build the full response.

{
  "type": "response.output_text.delta",
  "response_id": "abc123",
  "chat_id": 12345,
  "delta": "Our business hours are "
}

When include_tool_tags=false, chunks that only contain <tool ...>...</tool> are removed and empty deltas are not emitted.

response.output_text.completed

Emitted when the response is fully generated. Contains the complete text and token usage.

{
  "type": "response.output_text.completed",
  "response_id": "abc123",
  "chat_id": 12345,
  "final_text": "Our business hours are Monday to Friday, 9 AM to 6 PM EST.",
  "output_components": [],
  "usage": {
    "total_prompt_tokens": 250,
    "total_completion_tokens": 85,
    "total_tokens": 335,
    "total_calls": 1
  }
}

When include_tool_tags=false, final_text is returned with <tool ...>...</tool> blocks removed.

output_components contains structured visual presentation components when the agent generated them and the channel supports them. It is an empty array when no visual presentation was produced.

response.error

Emitted when an error occurs during processing.

{
  "type": "response.error",
  "response_id": "abc123",
  "chat_id": 12345,
  "message": "Failed to process request",
  "code": 10005
}

Error Code Values

The code field contains an error code (not HTTP status). Common values:

10005 (SYS_SERVER_ERROR) - Internal server error
10006 (SYS_BAD_REQUEST) - Invalid request

See Error Codes for the full list.

response.interaction_request

Emitted when the assistant run is intentionally interrupted and requires user interaction.

This event uses a discriminated union with interaction_type:

form: collect structured user input and resume with resume_form_request_id.
payment: initiate external checkout and track async transaction status.

Form interaction example

{
  "type": "response.interaction_request",
  "response_id": "abc123",
  "chat_id": 12345,
  "interaction_type": "form",
  "conversation_group_id": "cvg-xxx",
  "form_request_id": "3c8e...",
  "form_schema": {
    "id": "contact_form",
    "title": "Contact Info",
    "fields": [
      {
        "name": "email",
        "label": "Email",
        "type": "shortText",
        "required": true
      }
    ]
  }
}

Payment interaction example

{
  "type": "response.interaction_request",
  "response_id": "abc123",
  "chat_id": 12345,
  "interaction_type": "payment",
  "conversation_group_id": "cvg-xxx",
  "payment": {
    "payment_request_id": "23db...",
    "checkout_url": "https://api.example.com/api/v1/payments/checkout/<token>",
    "merchant_order_no": "CDR202603...",
    "amount_twd": 1200,
    "currency": "TWD",
    "status": "pending",
    "item_desc": "Consultation fee"
  }
}

Payment status lifecycle

Payment interaction status can move through these states:

pending: request created, checkout not started yet (cancellable).
processing: checkout has started (for example, user opened checkout); no longer cancellable.
succeeded / failed: terminal gateway result.
cancelled: request cancelled by user while still pending.
expired: request timed out before completion.

Additional behavior:

Cancel action is accepted only when current status is pending.
Stale pending/processing requests are periodically marked expired by backend jobs.

Why payment is a separate interaction type

Payment is intentionally not modeled as a normal form:

It has a different lifecycle (external checkout + webhook/polling), not one-shot form submit.
It has a different payload contract (checkout_url, transaction identifiers, status).
It has different UI actions (open checkout / refresh status) and continuation behavior.

We keep one event entrypoint (interaction_request) for orchestration consistency, and use interaction_type for type-safe branching.

Payment persistence behavior

For interaction_type="payment", the backend also persists a payment summary as a conversation record. This summary includes at least:

request number (merchant_order_no)
title (item_desc)
amount (amount_twd, currency)
status (pending/processing/succeeded/failed/cancelled/expired/...)

Payment callbacks (for example, gateway notify webhooks) update the same persisted record so reloading history or continuing chat keeps payment context available to both users and later model turns.

response.pending_interaction_resolved

Emitted when Codeer resolves one or more pending interactions before continuing the stream.

For example, when the caller resumes from a submitted form or a payment request reaches a terminal status, the stream can include a resolution event before the next text events.

{
  "type": "response.pending_interaction_resolved",
  "response_id": "abc123",
  "chat_id": 12345,
  "resolutions": [
    {
      "interaction_type": "form",
      "action": "submit",
      "status": "submitted",
      "conversation_group_id": "cvg-xxx",
      "display_message": "Form submitted",
      "form_request_id": "3c8e...",
      "payment_request_id": null,
      "resolved": true
    }
  ]
}

Use this event to update or dismiss any pending client-side UI before rendering the next assistant response.

Stream Termination

The stream ends with a special message:

data: [DONE]

Always handle this marker to properly close your connection.

Timeout Handling

The default stream timeout is 180 seconds. If no events are received within this period, the server emits response.error and closes the stream. For long-running operations, ensure your client handles timeout and reconnection appropriately.

Client Implementations

JavaScript (Browser)

Using the native EventSource API:

const url = 'https://api.codeer.ai/api/v1/chats/12345/messages';

// EventSource doesn't support POST, use fetch with streaming
const response = await fetch(url, {
  method: 'POST',
  headers: {
    'x-api-key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    message: 'Hello!',
    stream: true,
    agent_id: 'your-agent-id'
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = line.slice(6);
      if (data === '[DONE]') {
        console.log('Stream completed');
        return;
      }
      try {
        const event = JSON.parse(data);
        handleEvent(event);
      } catch (e) {
        // Not JSON, might be event name line
      }
    }
  }
}

function handleEvent(event) {
  switch (event.type) {
    case 'response.output_text.delta':
      // Append text to UI
      document.getElementById('output').textContent += event.delta;
      break;
    case 'response.interaction_request':
      // Render the requested form or other interaction
      openInteractionUi(event);
      break;
    case 'response.pending_interaction_resolved':
      // Update or dismiss pending form/payment UI
      resolvePendingInteractionUi(event.resolutions);
      break;
    case 'response.received':
      // The message was accepted, but no AI text will follow for this turn
      markMessageReceived(event);
      break;
    case 'response.output_text.completed':
      console.log('Final response:', event.final_text);
      break;
    case 'response.error':
      console.error('Error:', event.message);
      break;
  }
}

Python

Using the requests library:

import requests
import json

def stream_chat(chat_id: int, message: str, agent_id: str, api_key: str):
    url = f"https://api.codeer.ai/api/v1/chats/{chat_id}/messages"
    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
        "Accept": "text/event-stream"
    }
    payload = {
        "message": message,
        "stream": True,
        "agent_id": agent_id
    }

    with requests.post(url, headers=headers, json=payload, stream=True) as response:
        response.raise_for_status()

        for line in response.iter_lines():
            if not line:
                continue

            line = line.decode('utf-8')

            if line.startswith('data: '):
                data = line[6:]
                if data == '[DONE]':
                    print("\nStream completed")
                    break

                try:
                    event = json.loads(data)
                    handle_event(event)
                except json.JSONDecodeError:
                    pass

def handle_event(event: dict):
    event_type = event.get('type')

    if event_type == 'response.output_text.delta':
        print(event.get('delta', ''), end='', flush=True)
    elif event_type == 'response.interaction_request':
        print(f"\nInteraction required: {event}")
    elif event_type == 'response.output_text.completed':
        print(f"\n\nTokens used: {event.get('usage')}")
    elif event_type == 'response.error':
        print(f"\nError: {event.get('message')}")

# Usage
stream_chat(
    chat_id=12345,
    message="What are your business hours?",
    agent_id="your-agent-id",
    api_key="your-api-key"
)

Python (Async)

Using aiohttp:

import aiohttp
import asyncio
import json

async def stream_chat_async(chat_id: int, message: str, agent_id: str, api_key: str):
    url = f"https://api.codeer.ai/api/v1/chats/{chat_id}/messages"
    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json",
        "Accept": "text/event-stream"
    }
    payload = {
        "message": message,
        "stream": True,
        "agent_id": agent_id
    }

    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, json=payload) as response:
            async for line in response.content:
                line = line.decode('utf-8').strip()
                if not line:
                    continue

                if line.startswith('data: '):
                    data = line[6:]
                    if data == '[DONE]':
                        break

                    try:
                        event = json.loads(data)
                        if event.get('type') == 'response.output_text.delta':
                            print(event.get('delta', ''), end='', flush=True)
                    except json.JSONDecodeError:
                        pass

# Usage
asyncio.run(stream_chat_async(
    chat_id=12345,
    message="Hello!",
    agent_id="your-agent-id",
    api_key="your-api-key"
))

Best Practices

Buffer handling: SSE data may arrive in chunks that don't align with event boundaries. Always buffer incoming data and parse complete events.
Error recovery: Implement reconnection logic for network failures. Store the last received event to potentially resume processing.
UI updates: Batch UI updates to avoid excessive re-renders. Consider using requestAnimationFrame for smooth text display.
Cleanup: Always close connections properly when the user navigates away or cancels the request.
Timeout handling: Implement client-side timeout handling that matches or exceeds the server's 180-second timeout.