API3 Technical Documentation

OpenAI-Compatible API for Advanced Reasoning Models with Function Calling

Base URL: https://api3.exploit.bot

Default API Key: eric (for testing)

Table of Contents

  1. API Overview
  2. Authentication
  3. Available Models
  4. Endpoints
  5. Codex CLI Tools & Capabilities
  6. Server-Sent Events (SSE) Streaming
  7. Internationalization & Unicode Support
  8. Function Calling Workflow
  9. Advanced SSE Streaming Reference
  10. Code Examples

1. API Overview

API3 provides OpenAI-compatible endpoints for accessing advanced reasoning models with support for chat completions, function calling, custom reasoning efforts, image processing, and real-time streaming.

Quick Start: Use API key "eric" for immediate testing without setup.

2. Authentication

Bearer Token Authentication

The API uses Bearer token authentication. The default test key is "eric".

Authorization Header

Authorization: Bearer eric

Implementation

# Using curl with Bearer token
curl -H "Authorization: Bearer eric" \
     -H "Content-Type: application/json" \
     https://api3.exploit.bot/v1/chat/completions
Note: If no API key is configured on the server, authentication is disabled and requests proceed without token validation.
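
A minimal Python equivalent of the curl call above, assuming the requests library; the endpoint, key, and model name are the ones documented in this guide:

import requests

# Bearer token auth: same Authorization header as the curl example above
resp = requests.post(
    "https://api3.exploit.bot/v1/chat/completions",
    headers={
        "Authorization": "Bearer eric",
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-5.1-codex",
        "messages": [{"role": "user", "content": "ping"}],
    },
)
print(resp.status_code, resp.json())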

3. Available Models

πŸš€ 26 Model Variants Available

API3 provides access to a comprehensive model ecosystem with base models and reasoning variants for precise control over performance, speed, and cost.

🎯 Working Base Models

βœ… Confirmed Working Models

The following models have been tested and confirmed to work with the current API setup:

Model | Category | Default Reasoning | Response Time | Best For
gpt-5.1-codex-max | πŸ”₯ Premium Codex | High | 2-4s | Complex algorithms, production code, architecture
gpt-5.1-codex | πŸ”₯ Standard Codex | High | 2-3s | Balanced coding performance, daily development
gpt-5.1-codex-mini | ⚑ Fast Codex | Medium | 1-2s | Quick scripts, prototyping, cost-effective
gpt-5.1 | πŸ€– General Purpose | High | 2-3s | Conversations, analysis, mixed tasks
gpt-5 | πŸ€– Legacy General | Medium | 1-2s | Basic tasks, compatibility

πŸŽ›οΈ Dynamic Reasoning Control

βš™οΈ Use x_codex Parameter for Reasoning Control

Instead of using separate model variants, control reasoning effort dynamically via the x_codex parameter:

βœ… Working Reasoning Levels

Reasoning Level | API Parameter | Speed | Analysis Depth | Use Cases | Example
⚑ Low | "reasoning_effort": "low" | πŸƒ Fast (1-2s) | 🟒 Basic | Quick fixes, simple functions, rapid prototyping | Basic debugging, simple scripts
βš™οΈ Medium | "reasoning_effort": "medium" | ⚑ Balanced (2-3s) | 🟑 Moderate | Production code, standard development tasks | Most daily coding work
🧠 High | "reasoning_effort": "high" | 🐒 Slow (3-5s) | πŸ”΄ Deep | Complex algorithms, architecture, optimization | Advanced problem solving

⚠️ Currently Unavailable

The following reasoning levels and model variants are not supported with the current account type:

  • Extra High reasoning level
  • Specialized model variants (e.g., gpt-5.1-codex-max-low, gpt-5.1-codex-high, etc.)
  • Separate thinking levels (gpt-5.1-codex-max-think-1, etc.)

These models return 400 errors indicating they are "not supported when using Codex with a ChatGPT account".

πŸ“š Usage Examples

Basic Model Selection

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex-max",
    "messages": [{"role": "user", "content": "Write a Python function"}]
  }'

Reasoning Level Control (Recommended)

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [{"role": "user", "content": "Write a Python function"}],
    "x_codex": {
      "reasoning_effort": "high"
    }
  }'

Low Reasoning for Speed

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [{"role": "user", "content": "Quick fix for syntax error"}],
    "x_codex": {
      "reasoning_effort": "low"
    }
  }'

🎯 Model Selection Guide

Scenario | Recommended Model | Reasoning Level | Expected Performance
πŸš€ Production Applications | gpt-5.1-codex-max | Medium | Reliable, efficient, maintainable code
⚑ Quick Scripts | gpt-5.1-codex-mini | Low | Fast responses, cost-effective
🧠 Complex Algorithms | gpt-5.1-codex-max | High | Deep analysis, optimized solutions
πŸ—£οΈ General Conversations | gpt-5.1 | Medium | Natural dialogue, broad knowledge
πŸ”§ Rapid Debugging | gpt-5.1-codex | Low | Quick fixes, basic debugging
πŸ—οΈ System Architecture | gpt-5.1-codex-max | High | Comprehensive design, best practices

πŸ“Š Performance Characteristics

⚑ Speed Rankings

  1. gpt-5.1-codex-mini (Low reasoning) - Fastest
  2. gpt-5 - Very Fast
  3. gpt-5.1-codex (Low reasoning) - Fast
  4. gpt-5.1 - Balanced
  5. gpt-5.1-codex-max (High reasoning) - Slowest

⭐ Quality Rankings

  1. gpt-5.1-codex-max (High reasoning) - Exceptional
  2. gpt-5.1-codex (High reasoning) - Excellent
  3. gpt-5.1-codex-max (Medium reasoning) - Great
  4. gpt-5.1 (High reasoning) - Very Good
  5. gpt-5.1-codex-mini (Low reasoning) - Good

πŸ’° Cost Efficiency

  1. gpt-5.1-codex-mini (Low reasoning) - Most Efficient
  2. gpt-5 - Very Efficient
  3. gpt-5.1-codex (Low reasoning) - Efficient
  4. gpt-5.1 (Medium reasoning) - Moderate
  5. gpt-5.1-codex-max (High reasoning) - Most Costly

🎯 Best Practices

  • Use x_codex.reasoning_effort for dynamic control instead of separate model variants
  • Start with Medium reasoning for balanced performance
  • Use Low reasoning for quick tasks and debugging
  • Use High reasoning only for complex problems requiring deep analysis
  • Prefer gpt-5.1-codex-max for production-critical code
  • Use gpt-5.1-codex-mini for cost-sensitive applications
  • Avoid unsupported models to prevent 400 errors

4. Endpoints

GET /v1/models

Lists all available models including reasoning effort aliases.

Headers

Header | Value | Required
Content-Type | application/json | No
Authorization | Bearer eric | Yes

Response Schema

{ "data": [ { "id": "gpt-5", "object": "model", "created": null, "owned_by": null }, { "id": "codex-cli", "object": "model", "created": null, "owned_by": null } ] }

POST /v1/chat/completions

Creates a chat completion response with optional function calling. OpenAI-compatible endpoint.

Headers

Header | Value | Required
Content-Type | application/json | Yes
Authorization | Bearer eric | Yes

Request Body Schema

{ "model": "string", // Required: Model ID "messages": [ // Required: Array of messages { "role": "system|user|assistant|tool", "content": "string" | [ // String or array for multimodal { "type": "text", "text": "string" }, { "type": "image_url", "image_url": { "url": "string" // Base64 or URL } } ], "tool_calls": [ // Optional: For tool messages { "id": "string", "type": "function", "function": { "name": "string", "arguments": "string" } } ], "tool_call_id": "string" // Optional: For tool response messages } ], "stream": false, // Optional: Enable streaming "temperature": 1.0, // Optional: 0-2, controls randomness "max_tokens": 4096, // Optional: Max tokens to generate "top_p": 1.0, // Optional: Nucleus sampling "frequency_penalty": 0.0, // Optional: -2.0 to 2.0 "presence_penalty": 0.0, // Optional: -2.0 to 2.0 "tools": [ // Optional: Function definitions { "type": "function", "function": { "name": "function_name", "description": "string", "parameters": { "type": "object", "properties": {}, "required": [] } } } ], "tool_choice": "auto", // Optional: "auto", "required", "none", {"type": "function", "function": {"name": "my_function"}} "parallel_tool_calls": true, // Optional: Enable parallel tool execution "x_codex": { // Optional: Codex-specific options "sandbox": "read-only|danger-full-access", "reasoning_effort": "minimal|low|medium|high", "network_access": true, "hide_reasoning": false } }

Response Schema (Non-streaming)

{ "id": "chatcmpl-abc123", "object": "chat.completion", "created": 1704067200, "model": "codex-cli", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "string", // Text response (null if tool_calls only) "tool_calls": [ // Optional: Function calls { "id": "call_abc123", "type": "function", "function": { "name": "function_name", "arguments": "{\"arg\": \"value\"}" } } ], "refusal": null }, "finish_reason": "stop|tool_calls|length|content_filter" } ], "usage": { "prompt_tokens": 20, "completion_tokens": 30, "total_tokens": 50 } }

Streaming Response Format

When stream: true, responses are sent as Server-Sent Events:

data: {"choices":[{"delta":{"role":"assistant"},"index":0}]}
data: {"choices":[{"delta":{"content":"Hello"},"index":0}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_123","function":{"name":"weather","arguments":""}}]},"index":0}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"location\":\"New York\"}"}}]},"index":0}]}
data: {"choices":[{"delta":{},"index":0,"finish_reason":"tool_calls"}]}
data: [DONE]

POST /v1/responses

Alternative endpoint compatible with the OpenAI Responses API format.

Request Body

{ "model": "string", // Optional: Model ID "input": "string" | [ // Required: Input text or messages { "role": "user", "content": "string" } ], "stream": false, // Optional: Enable streaming "tools": [], // Optional: Function definitions "reasoning": { // Optional: Reasoning configuration "effort": "minimal|low|medium|high" } }

Response Schema (Non-streaming)

{ "id": "resp_abc123", "object": "response", "created": 1704067200, "model": "gpt-5.1-codex", "status": "completed", "output": [ { "id": "msg_def456", "type": "message", "role": "assistant", "content": [ { "type": "output_text", "text": "Response content" } ] } ], "usage": { "input_tokens": 0, "output_tokens": 0, "total_tokens": 0 } }

πŸš€ Async Job Mode (NEW)

For long-running operations, append ?async=true to any endpoint to submit as a background job.

Benefits: No timeouts, resume on disconnect, progress monitoring, better error handling.

How It Works

  1. Submit request with ?async=true parameter
  2. Receive job ID immediately
  3. Poll /v1/jobs/{job_id} for status
  4. Or stream progress via /v1/jobs/{job_id}/stream

Example: Async Chat Completion

curl -X POST "https://api3.exploit.bot/v1/chat/completions?async=true" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5.1-codex", "messages": [{"role": "user", "content": "Complex task..."}]}'

# Returns immediately:
{
  "id": "98b2945f-0be2-4ac6-b1b4-cfde9b85a8eb",
  "object": "job",
  "status": "queued",
  "created_at": 1764743637,
  "request_type": "chat_completion",
  "message": "Job submitted. Poll /v1/jobs/{id} for results..."
}

GET /v1/jobs/{job_id}

Get job status and results. Poll this endpoint to check if job is complete.

Response Statuses

  • queued - Job waiting to start
  • running - Job currently executing
  • completed - Job finished successfully (result available)
  • failed - Job failed (error available)
  • cancelled - Job was cancelled

Example

curl "https://api3.exploit.bot/v1/jobs/98b2945f-0be2-4ac6-b1b4-cfde9b85a8eb" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Completed job response:
{
  "id": "98b2945f-0be2-4ac6-b1b4-cfde9b85a8eb",
  "object": "job",
  "status": "completed",
  "created_at": 1764743637,
  "updated_at": 1764743644,
  "request_type": "chat_completion",
  "result": {
    "id": "98b2945f-0be2-4ac6-b1b4-cfde9b85a8eb",
    "object": "chat.completion",
    "choices": [{
      "index": 0,
      "message": {"role": "assistant", "content": "Response text"},
      "finish_reason": "stop"
    }],
    "usage": {}
  }
}
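
A submit-and-poll sketch in Python, assuming the requests library; the job fields match the responses shown above:

import time
import requests

BASE = "https://api3.exploit.bot/v1"
HEADERS = {"Authorization": "Bearer eric", "Content-Type": "application/json"}

# 1. Submit the request as a background job
job = requests.post(
    f"{BASE}/chat/completions?async=true",
    headers=HEADERS,
    json={"model": "gpt-5.1-codex",
          "messages": [{"role": "user", "content": "Complex task..."}]},
).json()

# 2. Poll until the job leaves the queued/running states
while True:
    status = requests.get(f"{BASE}/jobs/{job['id']}", headers=HEADERS).json()
    if status["status"] in ("completed", "failed", "cancelled"):
        break
    time.sleep(2)  # back off between polls

if status["status"] == "completed":
    print(status["result"]["choices"][0]["message"]["content"])
else:
    print("Job ended with status:", status["status"])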

GET /v1/jobs/{job_id}/stream

Stream job output in real-time via Server-Sent Events.

Example

curl -N "https://api3.exploit.bot/v1/jobs/98b2945f.../stream" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Event stream:
event: status
data: {"type":"status","job_id":"...","status":"running"}

event: chunk
data: {"type":"chunk","chunk_index":0,"chunk":"Hello"}

event: final
data: {"type":"final","status":"completed","result":{...}}

data: [DONE]
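
A Python reader for this job stream, assuming the requests library; it parses the event:/data: pairs shown above:

import json
import requests

job_id = "98b2945f-0be2-4ac6-b1b4-cfde9b85a8eb"  # example job ID from above
resp = requests.get(
    f"https://api3.exploit.bot/v1/jobs/{job_id}/stream",
    headers={"Authorization": "Bearer eric"},
    stream=True,
)

event_type = None
for raw in resp.iter_lines():
    line = raw.decode("utf-8")
    if line.startswith("event: "):
        event_type = line[7:]          # "status", "chunk", or "final"
    elif line.startswith("data: "):
        data = line[6:]
        if data == "[DONE]":
            break
        payload = json.loads(data)
        if event_type == "chunk":
            print(payload["chunk"], end="", flush=True)
        elif event_type == "final":
            print("\nJob finished:", payload["status"])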

GET /v1/jobs

List all jobs with pagination.

Query Parameters

Parameter | Default | Max
limit | 100 | 1000
offset | 0 | -

Example

curl "https://api3.exploit.bot/v1/jobs?limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY"

DELETE /v1/jobs/{job_id}

Cancel a running or queued job.

Example

curl -X DELETE "https://api3.exploit.bot/v1/jobs/98b2945f..." \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response:
{"status": "cancelled", "job_id": "98b2945f..."}

GET /health

Health check endpoint.

Response: {"status": "ok"}

5. Codex CLI Tools & Capabilities

πŸ› οΈ Available Tools

The Codex CLI provides powerful built-in tools that enable AI models to perform real-world actions beyond text generation. All tools are accessed through the x_codex parameter and work seamlessly with streaming.

Tool | Description | Use Cases | Response Time | Security
web_search | Real-time web search with citations | Current events, weather, news, research | 🌐 15-20s | πŸ”’ Requires network access
computer_use | File operations and script execution | File creation, editing, code generation | ⚑ 2-3s | πŸ”’ Workspace restricted
shell | System command execution | Directory operations, automation | ⚑ 2-4s | πŸ”’ Sandbox controlled

πŸ”§ Enabling Tools

Tool Activation via x_codex Parameter

Tools must be explicitly enabled in each request using the x_codex parameter. Multiple tools can be enabled simultaneously.

Basic Tool Request

{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "What's the current weather in Tokyo?"}
  ],
  "x_codex": {
    "tools": ["web_search"],
    "sandbox": "workspace-write",
    "reasoning_effort": "medium"
  }
}

Multiple Tools Combination

{
  "model": "gpt-5.1-codex-max",
  "messages": [
    {"role": "user", "content": "Research latest AI developments and create a summary markdown file"}
  ],
  "x_codex": {
    "tools": ["web_search", "computer_use"],
    "sandbox": "workspace-write",
    "reasoning_effort": "high",
    "network_access": true
  }
}

🌐 Web Search Tool

Real-time Information Access

The web search tool provides access to current information from the internet with proper citations and source attribution.

Usage Examples
Weather Query
curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [{"role": "user", "content": "Current weather in London?"}],
    "x_codex": {
      "tools": ["web_search"],
      "network_access": true
    }
  }'
News Research
curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [{"role": "user", "content": "Latest AI breakthroughs"}],
    "x_codex": {
      "tools": ["web_search"],
      "network_access": true,
      "reasoning_effort": "high"
    }
  }'
Expected Response
The current temperature in London is 8Β°C (46Β°F) with light rain.

Sources:
- BBC Weather (london.weather.bbc.co.uk)
- Met Office (metoffice.gov.uk)

πŸ’» Computer Use Tool

File System & Code Operations

Enables file creation, editing, reading, and code generation within the secure workspace environment.

Usage Examples
Create Python Script
{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "Create a Python script that calculates fibonacci numbers"}
  ],
  "x_codex": {
    "tools": ["computer_use"],
    "sandbox": "workspace-write"
  }
}
File Analysis
{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "Read and summarize the contents of report.txt"}
  ],
  "x_codex": {
    "tools": ["computer_use"],
    "sandbox": "workspace-write"
  }
}
Created Files Example
βœ… Created: /workspace/codex-api3/fibonacci.py
βœ… Created: /workspace/codex-api3/analysis_report.md
βœ… Created: /workspace/codex-api3/backup/2025-11-30_data.zip

🐚 Shell Tool

System Command Execution

Execute shell commands for system operations, automation, and file management tasks.

Usage Examples
Directory Listing
{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "List all Python files in the current directory"}
  ],
  "x_codex": {
    "tools": ["shell"],
    "sandbox": "workspace-write"
  }
}
System Information
{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "Show system uptime and disk usage"}
  ],
  "x_codex": {
    "tools": ["shell"],
    "sandbox": "workspace-write"
  }
}

πŸ”’ Security & Sandbox Controls

Workspace Isolation

All file operations are restricted to the /workspace/codex-api3 directory, ensuring secure containment.

Sandbox Modes

Mode | File Access | Network Access | System Access | Recommended For
read-only | ❌ No file writes | ❌ No network | ❌ No system | Secure read-only queries
workspace-write | βœ… Workspace files | βš™οΈ Per request | βš™οΈ Limited | Default balanced usage
danger-full-access | ⚠️ Full system | ⚠️ Full network | ⚠️ Full system | 🚫 Blocked by policy

⚠️ Security Notice

The danger-full-access mode is blocked by server policy for security reasons. Always use the least privileged mode necessary for your use case.

πŸ“Š Performance Characteristics

⚑ Response Times

  • Basic Chat: 2.5-3.0s
  • Web Search: 15-20s
  • Computer Use: 2.5-3.0s
  • Shell Operations: 2-4s
  • Streaming Init: <50ms

🎯 Optimization Tips

  • Use medium reasoning for balanced performance
  • Low reasoning for quick file operations
  • High reasoning for complex research tasks
  • Combine tools efficiently in single requests

πŸ“ˆ Best Practices

  • Enable streaming for better UX
  • Use gpt-5.1-codex-mini for cost efficiency
  • Use gpt-5.1-codex-max for complex tasks
  • Request network_access only when needed

πŸ”§ Advanced x_codex Configuration

Complete Parameter Reference

{
  "x_codex": {
    "tools": ["web_search", "computer_use", "shell"],
    "sandbox": "workspace-write",
    "reasoning_effort": "medium",
    "network_access": true,
    "hide_reasoning": false
  }
}

Parameter Details

Parameter | Type | Values | Default | Description
tools | Array | ["web_search"], ["computer_use"], ["shell"] | [] | Tools to enable for this request
sandbox | String | "read-only", "workspace-write", "danger-full-access" | "read-only" | File system access level
reasoning_effort | String | "minimal", "low", "medium", "high" | "medium" | Model reasoning depth
network_access | Boolean | true, false | false | Allow web search network access
hide_reasoning | Boolean | true, false | false | Hide thinking/reasoning blocks
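
Putting the table together, a Python request sketch using every x_codex field, with values taken from the reference above:

import requests

# Full x_codex configuration: tools, sandbox, reasoning depth, and network access
resp = requests.post(
    "https://api3.exploit.bot/v1/chat/completions",
    headers={"Authorization": "Bearer eric", "Content-Type": "application/json"},
    json={
        "model": "gpt-5.1-codex",
        "messages": [{"role": "user", "content": "Research a topic and save notes"}],
        "x_codex": {
            "tools": ["web_search", "computer_use"],
            "sandbox": "workspace-write",   # least privilege that still allows file writes
            "reasoning_effort": "medium",
            "network_access": True,         # required for web_search
            "hide_reasoning": False,
        },
    },
)
print(resp.json()["choices"][0]["message"]["content"])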

🎯 Tool Selection Guide

πŸ“° Research Tasks

Tools: ["web_search", "computer_use"]
Reasoning: Medium to High
Use: Current events, fact-checking, documentation

πŸ’» Code Generation

Tools: ["computer_use"]
Reasoning: Medium
Use: Scripts, files, automation

βš™οΈ System Admin

Tools: ["shell", "computer_use"]
Reasoning: Low to Medium
Use: File management, system info

πŸ”§ Automation

Tools: ["web_search", "computer_use", "shell"]
Reasoning: High
Use: Research β†’ Generate β†’ Execute workflows

🎯 Tools Integration Summary

The Codex CLI tools system provides powerful capabilities directly integrated into the AI model:

  • Web Search: Access current information from the internet
  • Computer Use: File operations and script execution within the workspace
  • Shell: Direct command-line execution and file operations
  • Automatic Selection: Model chooses appropriate tools based on context
  • Security Controls: Configurable sandbox modes for safe execution

Activation: Tools must be enabled server-side in config.toml and then requested per call. Use the appropriate x_codex parameters for security and performance control.

6. Server-Sent Events (SSE) Streaming

Overview

Streaming enables real-time delivery of response chunks as they're generated, providing immediate feedback and reducing perceived latency. The API implements OpenAI-compatible SSE streaming with token-by-token precision.

Streaming Implementation

This wrapper forwards Codex CLI output verbatim with character-by-character precision, ensuring natural streaming boundaries that match OpenAI's SSE format.

Streaming Usage Example

Request streaming with stream: true in your API call:

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "stream": true
  }'

Streaming Response Format

Responses follow OpenAI's SSE format with data: prefixes:

data: {"choices":[{"delta":{"content":"Quantum"},"index":0}]}

data: {"choices":[{"delta":{"content":" computing"},"index":0}]}

data: {"choices":[{"delta":{"content":" is"},"index":0}]}

data: {"choices":[{"delta":{"content":" a"},"index":0}]}

data: [DONE]

Client-Side Streaming Implementation

Python example for handling SSE streams:

import requests
import json

def stream_chat_completion(messages):
    response = requests.post(
        "https://api3.exploit.bot/v1/chat/completions",
        headers={
            "Authorization": "Bearer YOUR_API_KEY",
            "Content-Type": "application/json"
        },
        json={
            "model": "gpt-5.1-codex",
            "messages": messages,
            "stream": True
        },
        stream=True
    )

    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')
            if line.startswith('data: '):
                data = line[6:]  # Remove 'data: ' prefix
                if data == '[DONE]':
                    break
                try:
                    chunk = json.loads(data)
                    content = chunk['choices'][0]['delta'].get('content', '')
                    if content:
                        print(content, end='', flush=True)
                except json.JSONDecodeError:
                    continue

# Usage
messages = [{"role": "user", "content": "Tell me a story"}]
stream_chat_completion(messages)
πŸ’‘ Pro Tip: The streaming implementation uses character-by-character reading (1-byte chunks) instead of fixed-size chunks for the most natural response flow.

7. Internationalization & Unicode Support

Overview

This API provides complete Unicode support for international languages, including Korean, Japanese, Chinese, Arabic, Cyrillic, and other character sets. All text processing maintains proper UTF-8 encoding across both streaming and non-streaming responses.

🌍 Korean Language Support

Full Korean language support has been implemented and tested with proper UTF-8 encoding for both input and output, including real-time streaming responses.

UTF-8 Encoding Implementation

Request Processing

All incoming requests are processed with UTF-8 encoding.

Streaming Responses

Server-Sent Events (SSE) streaming includes proper UTF-8 charset specification:

Content-Type: text/event-stream; charset=utf-8

Korean Language Examples

Basic Korean Request (Non-streaming)

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [
      {"role": "user", "content": "μ•ˆλ…•ν•˜μ„Έμš”! ν•œκ΅­μ˜ μˆ˜λ„λŠ” μ–΄λ””μΈκ°€μš”?"}
    ]
  }'

Response:

{
  "choices": [{
    "message": {
      "content": "μ•ˆλ…•ν•˜μ„Έμš”! ν•œκ΅­μ˜ μˆ˜λ„λŠ” μ„œμšΈμž…λ‹ˆλ‹€."
    }
  }]
}

Korean Streaming Request

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [
      {"role": "user", "content": "κΉ€μΉ˜λž‘ λΉ„λΉ”λ°₯은 ν•œκ΅­μ˜ 전톡 μŒμ‹μž…λ‹ˆλ‹€."}
    ],
    "stream": true
  }'

Streaming Response:

data: {"choices": [{"delta": {"content": "λ„€, λ§žμŠ΅λ‹ˆλ‹€!"}}]}

data: {"choices": [{"delta": {"content": " κΉ€μΉ˜μ™€ λΉ„λΉ”λ°₯은"}}]}

data: {"choices": [{"delta": {"content": " 였랜 전톡을 κ°€μ§„"}}]}

data: {"choices": [{"delta": {"content": " λŒ€ν‘œμ μΈ ν•œκ΅­ μŒμ‹μž…λ‹ˆλ‹€."}}]}

data: [DONE]

Technical Implementation Details

UTF-8 Streaming Fix

The streaming implementation was specifically optimized for international character support:

# Simplified from the implementation in app/codex.py
import codecs

decoder = codecs.getincrementaldecoder('utf-8')(errors='ignore')
while True:
    chunk = await proc.stdout.read(8)      # Read small byte chunks from the CLI
    if not chunk:                          # EOF: the process closed stdout
        break
    decoded_chars = decoder.decode(chunk)  # Incremental decode keeps multi-byte characters intact across chunk boundaries
    if decoded_chars:
        yield decoded_chars                # Emit only complete characters
🌐 Global Ready: The API is production-ready for international applications with complete Unicode support across all languages and character sets.

8. Function Calling Workflow

Function calling is a four-step round trip: define tools in the request, receive the model's tool call, execute the tool yourself, and submit the result so the model can produce a final answer.

1. Define Tools in the Request

{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "What's the weather in Tokyo?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}

2. Model Responds with a Tool Call

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\":\"Tokyo\",\"units\":\"celsius\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}

3. Execute Tool and Submit Results

{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {
      "role": "assistant",
      "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\":\"Tokyo\",\"units\":\"celsius\"}"
        }
      }]
    },
    {
      "role": "tool",
      "tool_call_id": "call_1",
      "content": "{\"temperature\":22,\"condition\":\"sunny\",\"humidity\":65}"
    }
  ]
}

4. Final Response with Tool Results

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The weather in Tokyo is currently sunny with a temperature of 22Β°C and humidity at 65%."
    },
    "finish_reason": "stop"
  }]
}
Best Practice: Always include tool results as tool messages with matching tool_call_id to maintain conversation context.

9. Advanced SSE Streaming Reference

Overview

This section expands on the streaming basics in Section 6 with the full event type reference, client implementations in JavaScript, Python, and Go, and performance details.

Enabling Streaming

Set stream: true in your request:

{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "Write a short story"}
  ],
  "stream": true
}

Stream Event Types

Regular Content Chunks

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1704067200,"model":"gpt-5.1","choices":[{"index":0,"delta":{"content":"Once upon a time"}}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1704067200,"model":"gpt-5.1","choices":[{"index":0,"delta":{"content":" in a distant galaxy"}}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1704067200,"model":"gpt-5.1","choices":[{"index":0,"delta":{}}]}

Function Call Streaming

data: {"choices":[{"delta":{"role":"assistant"},"index":0}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_123","function":{"name":"get_weather","arguments":""}}]},"index":0}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"location\":\"NYC\""}}]},"index":0}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":",\"units\":\"celsius\"}"}}]},"index":0}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{}}]},"index":0,"finish_reason":"tool_calls"}]}

Stream Termination

data: [DONE]

Client-Side Implementation

JavaScript EventSource

class StreamingChat {
    constructor(apiKey = 'eric') {
        this.baseURL = 'https://api3.exploit.bot/v1';
        this.apiKey = apiKey;
    }

    async streamChat(messages, onChunk, onComplete, onError) {
        try {
            const response = await fetch(`${this.baseURL}/chat/completions`, {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                    'Authorization': `Bearer ${this.apiKey}`,
                    'Accept': 'text/event-stream',
                },
                body: JSON.stringify({
                    model: 'gpt-5.1',
                    messages: messages,
                    stream: true
                })
            });

            const reader = response.body.getReader();
            const decoder = new TextDecoder();
            let buffer = '';
            let fullContent = '';

            while (true) {
                const { done, value } = await reader.read();
                if (done) break;

                buffer += decoder.decode(value, { stream: true });
                const lines = buffer.split('\n');
                buffer = lines.pop(); // Keep incomplete line in buffer

                for (const line of lines) {
                    if (line.startsWith('data: ')) {
                        const data = line.slice(6);
                        if (data === '[DONE]') {
                            onComplete(fullContent);
                            return;
                        }

                        try {
                            const chunk = JSON.parse(data);
                            const delta = chunk.choices[0]?.delta;

                            if (delta?.content) {
                                fullContent += delta.content;
                                onChunk(delta.content, chunk);
                            }

                            // Handle tool calls
                            if (delta?.tool_calls) {
                                onChunk({ tool_calls: delta.tool_calls }, chunk);
                            }
                        } catch (e) {
                            console.error('Parse error:', e);
                        }
                    }
                }
            }
        } catch (error) {
            onError(error);
        }
    }
}

// Usage
const chat = new StreamingChat();

chat.streamChat(
    [{ role: 'user', content: 'Tell me a story' }],
    (chunk, fullData) => {
        // chunk is the content string (or a tool_calls object for tool deltas)
        if (typeof chunk === 'string') process.stdout.write(chunk);
    },
    (fullContent) => {
        console.log('\n\nComplete:', fullContent);
    },
    (error) => {
        console.error('Error:', error);
    }
);

Python with aiohttp

import asyncio
import aiohttp
import json

async def stream_chat_completion(messages, model="gpt-5.1"):
    url = "https://api3.exploit.bot/v1/chat/completions"
    headers = {
        "Authorization": "Bearer eric",
        "Content-Type": "application/json"
    }

    payload = {
        "model": model,
        "messages": messages,
        "stream": True
    }

    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, json=payload) as response:
            full_content = ""

            async for line in response.content:
                line = line.decode('utf-8').strip()

                if line.startswith('data: '):
                    data = line[6:]

                    if data == '[DONE]':
                        break

                    try:
                        chunk = json.loads(data)
                        delta = chunk.get('choices', [{}])[0].get('delta', {})

                        if 'content' in delta:
                            content = delta['content']
                            full_content += content
                            print(content, end='', flush=True)

                        # Handle tool calls
                        if 'tool_calls' in delta:
                            print(f"\nTool Call: {delta['tool_calls']}")

                    except json.JSONDecodeError:
                        continue

            return full_content

# Usage
async def main():
    messages = [{"role": "user", "content": "Explain quantum computing"}]
    result = await stream_chat_completion(messages)
    print(f"\n\nFull response: {result}")

asyncio.run(main())

Python with requests

import requests
import json

def stream_chat(messages, model="gpt-5.1"):
    response = requests.post(
        "https://api3.exploit.bot/v1/chat/completions",
        headers={
            "Authorization": "Bearer eric",
            "Content-Type": "application/json"
        },
        json={
            "model": model,
            "messages": messages,
            "stream": True
        },
        stream=True
    )

    full_content = ""

    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')

            if line.startswith('data: '):
                data = line[6:]

                if data == '[DONE]':
                    break

                try:
                    chunk = json.loads(data)
                    delta = chunk.get('choices', [{}])[0].get('delta', {})

                    if 'content' in delta:
                        content = delta['content']
                        full_content += content
                        print(content, end='', flush=True)

                except json.JSONDecodeError:
                    continue

    return full_content

# Usage
result = stream_chat([{"role": "user", "content": "Hello!"}])
print(f"\nComplete: {result}")

Go Implementation

package main

import (
    "bufio"
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
)

type StreamChunk struct {
    Choices []struct {
        Delta struct {
            Content   string `json:"content,omitempty"`
            Role      string `json:"role,omitempty"`
            ToolCalls []struct {
                Index    int `json:"index"`
                ID       string `json:"id"`
                Type     string `json:"type"`
                Function struct {
                    Name      string `json:"name,omitempty"`
                    Arguments string `json:"arguments,omitempty"`
                } `json:"function,omitempty"`
            } `json:"tool_calls,omitempty"`
        } `json:"delta"`
        FinishReason string `json:"finish_reason,omitempty"`
    } `json:"choices"`
}

func streamChat(messages []map[string]string, model string) error {
    reqBody := map[string]interface{}{
        "model":    model,
        "messages": messages,
        "stream":   true,
    }

    jsonData, err := json.Marshal(reqBody)
    if err != nil {
        return err
    }

    req, err := http.NewRequest("POST", "https://api3.exploit.bot/v1/chat/completions", bytes.NewBuffer(jsonData))
    if err != nil {
        return err
    }

    req.Header.Set("Content-Type", "application/json")
    req.Header.Set("Authorization", "Bearer eric")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        line := scanner.Text()
        if len(line) > 6 && line[:6] == "data: " {
            data := line[6:]
            if data == "[DONE]" {
                break
            }

            var chunk StreamChunk
            if err := json.Unmarshal([]byte(data), &chunk); err == nil {
                if len(chunk.Choices) > 0 {
                    content := chunk.Choices[0].Delta.Content
                    if content != "" {
                        fmt.Print(content)
                    }
                }
            }
        }
    }

    return scanner.Err()
}

func main() {
    messages := []map[string]string{
        {"role": "user", "content": "Explain Go programming"},
    }

    err := streamChat(messages, "gpt-5.1")
    if err != nil {
        fmt.Fprintf(os.Stderr, "Error: %v\n", err)
    }
}

πŸš€ Optimized Streaming Features (API3 v1.1)

⚑ Ultra-Low Latency Streaming

API3 features optimized real-time streaming with 32-byte chunk reading and immediate output for minimal latency:

  • Response Time: ~16ms (vs 2-3s before optimization)
  • Chunk Size: 16-32 bytes for instant feedback
  • Buffer Strategy: Smart buffering with partial content yielding
  • Throughput: Maintains high speed even with large outputs

🧠 Complete Thinking & Reasoning Inclusion

Unlike standard OpenAI streaming, API3 includes 100% of model thinking and reasoning in the stream:

Example stream showing reasoning:
data: {"choices":[{"delta":{"content":"πŸ€” Thinking: I need to analyze the requirements..."}}]}
data: {"choices":[{"delta":{"content":"πŸ“ First, let me create the file structure..."}}]}
data: {"choices":[{"delta":{"content":"πŸ’‘ I'll implement a recursive solution..."}}]}
data: {"choices":[{"delta":{"content":"βœ… File created: solution.py"}}]}
data: {"choices":[{"delta":{"content":"Here's the complete implementation..."}}]}

πŸ”§ Real-Time Tool Call Streaming

Watch tool calls execute in real-time with detailed progress information:

Example tool call stream:
data: {"choices":[{"delta":{"content":"πŸ” Creating test.py:1..."}}]}
data: {"choices":[{"delta":{"content":"βœ… Added fibonacci function with input validation"}}]}
data: {"choices":[{"delta":{"content":"πŸ“ Writing main execution block..."}}]}
data: {"choices":[{"delta":{"content":"πŸ§ͺ Adding test cases..."}}]}
data: {"choices":[{"delta":{"content":"✨ Complete! The file is ready."}}]}

βš™οΈ Advanced Configuration Options

Enhanced Request Parameters
{
  "model": "gpt-5.1-codex",
  "messages": [...],
  "stream": true,
  "x_codex": {
    "sandbox": "workspace-write",        // Enable file operations
    "reasoning_effort": "high",          // Control thinking depth
    "hide_reasoning": false,             // Show reasoning process
    "network_access": true,              // Allow internet access
    "expose_reasoning": true             // Include detailed thoughts
  }
}
Available Reasoning Efforts

Effort Level | Description | Use Case | Speed
minimal | Quick responses, minimal thinking | Simple tasks, basic Q&A | ⚑ Fastest
low | Light reasoning, quick solutions | Simple code, straightforward problems | πŸš€ Fast
medium | Balanced reasoning and speed (default) | General purpose, most tasks | βš™οΈ Balanced
high | Deep analysis, thorough reasoning | Complex problems, architecture | 🧠 Detailed
extra high | Maximum reasoning, comprehensive analysis | Research, complex algorithms | πŸ”¬ Thorough (not supported on the current account type; see Section 3)

πŸ› οΈ Tool Usage Examples

File Creation and Editing
// Request file creation
{
  "model": "gpt-5.1-codex",
  "messages": [
    {"role": "user", "content": "Create a Python web server with Flask that handles API endpoints"}
  ],
  "stream": true,
  "x_codex": {
    "sandbox": "workspace-write",
    "reasoning_effort": "high"
  }
}

// Stream shows real-time file operations
data: {"choices":[{"delta":{"content":"πŸ’­ Thinking: I'll create a Flask web server..."}}]}
data: {"choices":[{"delta":{"content":"πŸ“ Creating app.py..."}}]}
data: {"choices":[{"delta":{"content":"βœ… Added Flask imports and app initialization"}}]}
data: {"choices":[{"delta":{"content":"πŸ”— Adding /api/health endpoint..."}}]}
data: {"choices":[{"delta":{"content":"πŸ“Š Adding /api/data endpoint with JSON response..."}}]}
data: {"choices":[{"delta":{"content":"πŸ§ͺ Adding test routes..."}}]}
data: {"choices":[{"delta":{"content":"✨ Flask server complete!"}}]}
Code Analysis and Debugging
// Ask for code debugging
{
  "model": "gpt-5.1-codex-max",
  "messages": [
    {"role": "user", "content": "Debug this slow SQL query and optimize it"}
  ],
  "stream": true,
  "x_codex": {
    "reasoning_effort": "high"
  }
}

// Stream shows debugging process
data: {"choices":[{"delta":{"content":"πŸ” Analyzing query performance..."}}]}
data: {"choices":[{"delta":{"content":"πŸ“Š Found missing index on user_id column"}}]}
data: {"choices":[{"delta":{"content":"⚑ Optimizing JOIN operations..."}}]}
data: {"choices":[{"delta":{"content":"πŸ“ˆ Adding composite index for better performance"}}]}
data: {"choices":[{"delta":{"content":"βœ… Query optimized - 100x faster!"}}]}

🎯 Performance Tips

For Maximum Streaming Performance:
  • Use workspace-write sandbox for tool operations
  • Set appropriate reasoning effort - higher isn't always better
  • Enable network access only when needed for security
  • Stream large responses to avoid timeouts
  • Monitor chunk processing to prevent memory buildup

πŸš€ Latest Performance Enhancements (Session 5)

Connection Pooling Improvements

Issue Resolved: Parallel streaming from multiple models causing connection bottlenecks

Solution Implemented:

  • Increased parallel capacity: From 2 to 5 concurrent requests
  • Enhanced timeout handling: 120-second limit with proper cleanup
  • Improved connection reuse: Better HTTP connection management
  • Memory optimization: Stable ~34MB service footprint

Configuration Updates

# Connection Pooling
CODEX_MAX_PARALLEL_REQUESTS=5  # Increased from default 2
CODEX_TIMEOUT=120  # 2-minute timeout

# Service Environment
PATH=/usr/bin:/usr/local/bin:/home/eric/.nvm/versions/node/v25.2.1/bin
CODEX_PATH=/home/eric/.nvm/versions/node/v25.2.1/bin/codex
Performance Validation Results

Metric | Before | After | Improvement
Concurrent Requests | 2 | 5 | +150%
Timeout Handling | Basic | 120s with cleanup | More reliable
Service Startup | Failing | Fast initialization | Fixed
Memory Usage | Variable | Stable 34MB | Consistent
UTF-8 Streaming | Chunk-based | Token-by-token | Better for international

Real-World Performance Test
# Test: Korean Character Streaming (UTF-8 Validation)
curl -X POST https://api3.exploit.bot/v1/chat/completions \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [{"role": "user", "content": "Say hello in Korean: μ•ˆλ…•ν•˜μ„Έμš”"}],
    "stream": true
  }'

# Response: Proper UTF-8 encoding
data: {"choices": [{"delta": {"content": "\uc548\ub155\ud558\uc138\uc694\n"}}]}

# Results: βœ… Perfect Unicode handling
# βœ… No character corruption
# βœ… Natural token boundaries
# βœ… International character support

πŸ” Monitoring and Debugging

Stream Event Types in Detail
// Content chunks (most common)
data: {"choices":[{"delta":{"content":"Response text"}}]}

// Tool call progress
data: {"choices":[{"delta":{"content":"πŸ”§ Creating file.py..."}}]}

// Reasoning/thinking
data: {"choices":[{"delta":{"content":"🧠 Analyzing requirements..."}}]}

// Status updates
data: {"choices":[{"delta":{"content":"βœ… Operation completed"}}]}

// Error handling
data: {"choices":[{"delta":{"content":"⚠️ Warning: File already exists"}}]}
Client-Side Error Handling
// Enhanced JavaScript error handling
class EnhancedStreamingChat {
    async streamChat(messages, onChunk, onComplete, onError) {
        try {
            const response = await fetch(/* ... */);

            if (!response.ok) {
                throw new Error(`HTTP ${response.status}: ${response.statusText}`);
            }

            const reader = response.body.getReader();
            let buffer = '';
            let fullContent = '';
            let chunkCount = 0;

            while (true) {
                const { done, value } = await reader.read();
                if (done) break;

                buffer += new TextDecoder().decode(value);
                const lines = buffer.split('\n');
                buffer = lines.pop();

                for (const line of lines) {
                    if (line.startsWith('data: ')) {
                        const data = line.slice(6);

                        if (data === '[DONE]') {
                            onComplete(fullContent, chunkCount);
                            return;
                        }

                        try {
                            const chunk = JSON.parse(data);
                            chunkCount++;

                            // Handle different content types
                            const delta = chunk.choices[0]?.delta;
                            if (delta?.content) {
                                if (delta.content.includes('⚠️')) {
                                    console.warn('Warning:', delta.content);
                                } else if (delta.content.includes('❌')) {
                                    onError(new Error('Operation failed: ' + delta.content));
                                    return;
                                } else {
                                    fullContent += delta.content;
                                    onChunk(delta.content, chunk);
                                }
                            }
                        } catch (e) {
                            console.error('Parse error for chunk:', line);
                        }
                    }
                }
            }
        } catch (error) {
            onError(error);
        }
    }
}

10. Code Examples

Basic Chat Completion

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'

With Custom Reasoning Effort

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codex-cli high",
    "messages": [
      {"role": "user", "content": "Solve this complex algorithm problem"}
    ],
    "x_codex": {
      "reasoning_effort": "high"
    }
  }'

Function Calling Example

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string"},
              "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

Streaming Example

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [
      {"role": "user", "content": "Write a story about AI"}
    ],
    "stream": true
  }'

With Image Input

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Analyze this code and explain what it does"},
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..."
            }
          }
        ]
      }
    ]
  }'

Parallel Tool Calls

curl -X POST "https://api3.exploit.bot/v1/chat/completions" \
  -H "Authorization: Bearer eric" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "messages": [
      {"role": "user", "content": "Get weather for NYC, LA, and Chicago"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string"}
            },
            "required": ["location"]
          }
        }
      }
    ],
    "parallel_tool_calls": true,
    "tool_choice": "required"
  }'
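
When parallel_tool_calls is enabled, the response may contain several tool_calls. Below is a Python sketch of handling all of them; get_weather here is a hypothetical local stand-in for a real weather lookup, and the tool schema matches the curl request above:

import json
import requests

def get_weather(location):
    # Hypothetical mock implementation standing in for a real weather service
    return {"location": location, "temperature": 20, "condition": "clear"}

URL = "https://api3.exploit.bot/v1/chat/completions"
HEADERS = {"Authorization": "Bearer eric", "Content-Type": "application/json"}
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]
messages = [{"role": "user", "content": "Get weather for NYC, LA, and Chicago"}]

# First request: the model should emit one tool call per city
first = requests.post(URL, headers=HEADERS, json={
    "model": "gpt-5.1-codex",
    "messages": messages,
    "tools": tools,
    "parallel_tool_calls": True,
    "tool_choice": "required",
}).json()

assistant_msg = first["choices"][0]["message"]
messages.append(assistant_msg)

# Execute every tool call and append one tool message per call,
# matching each result to its tool_call_id
for call in assistant_msg.get("tool_calls", []):
    args = json.loads(call["function"]["arguments"])
    result = get_weather(args["location"])
    messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": json.dumps(result),
    })

# Second request: the model summarizes all tool results
final = requests.post(URL, headers=HEADERS, json={
    "model": "gpt-5.1-codex",
    "messages": messages,
}).json()
print(final["choices"][0]["message"]["content"])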

Basic Request with requests

import requests
import json

def chat_completion(message, model="gpt-5.1"):
    response = requests.post(
        "https://api3.exploit.bot/v1/chat/completions",
        headers={
            "Authorization": "Bearer eric",
            "Content-Type": "application/json"
        },
        json={
            "model": model,
            "messages": [{"role": "user", "content": message}]
        }
    )

    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    else:
        raise Exception(f"Error: {response.status_code} - {response.text}")

# Usage
result = chat_completion("Explain quantum computing")
print(result)

Async with aiohttp

import asyncio
import aiohttp
import json

class AsyncAPI3Client:
    def __init__(self, api_key="eric"):
        self.base_url = "https://api3.exploit.bot/v1"
        self.api_key = api_key
        self.session = None

    async def __aenter__(self):
        self.session = aiohttp.ClientSession(
            headers={"Authorization": f"Bearer {self.api_key}"}
        )
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        if self.session:
            await self.session.close()

    async def chat(self, messages, model="gpt-5.1", **kwargs):
        payload = {
            "model": model,
            "messages": messages,
            **kwargs
        }

        async with self.session.post(
            f"{self.base_url}/chat/completions",
            json=payload
        ) as response:
            if response.status == 200:
                data = await response.json()
                return data["choices"][0]["message"]
            else:
                text = await response.text()
                raise Exception(f"Error {response.status}: {text}")

# Usage
async def main():
    async with AsyncAPI3Client() as client:
        response = await client.chat([
            {"role": "user", "content": "Write a Python function"}
        ])
        print(response["content"])

asyncio.run(main())

Function Calling

import requests
import json

def call_function_with_tools():
    tools = [
        {
            "type": "function",
            "function": {
                "name": "calculate",
                "description": "Perform mathematical calculations",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "Mathematical expression to evaluate"
                        }
                    },
                    "required": ["expression"]
                }
            }
        }
    ]

    messages = [
        {"role": "user", "content": "What is 123 * 456?"}
    ]

    # First request - get function call
    response = requests.post(
        "https://api3.exploit.bot/v1/chat/completions",
        headers={
            "Authorization": "Bearer eric",
            "Content-Type": "application/json"
        },
        json={
            "model": "gpt-5.1-codex",
            "messages": messages,
            "tools": tools,
            "tool_choice": "required"
        }
    )

    data = response.json()
    tool_call = data["choices"][0]["message"]["tool_calls"][0]

    # Execute the function
    function_name = tool_call["function"]["name"]
    arguments = json.loads(tool_call["function"]["arguments"])

    if function_name == "calculate":
        result = eval(arguments["expression"])  # Be careful with eval in production!

        # Send result back
        messages.append(data["choices"][0]["message"])
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call["id"],
            "content": json.dumps({"result": result})
        })

        # Second request - get final response
        final_response = requests.post(
            "https://api3.exploit.bot/v1/chat/completions",
            headers={
                "Authorization": "Bearer eric",
                "Content-Type": "application/json"
            },
            json={
                "model": "gpt-5.1-codex",
                "messages": messages
            }
        )

        return final_response.json()["choices"][0]["message"]["content"]

# Usage
result = call_function_with_tools()
print(result)

Streaming with requests

import requests
import json

def stream_chat(message, on_chunk=None):
    response = requests.post(
        "https://api3.exploit.bot/v1/chat/completions",
        headers={
            "Authorization": "Bearer eric",
            "Content-Type": "application/json"
        },
        json={
            "model": "gpt-5.1-codex",
            "messages": [{"role": "user", "content": message}],
            "stream": True
        },
        stream=True
    )

    full_content = ""

    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')
            if line.startswith('data: '):
                data = line[6:]
                if data == '[DONE]':
                    break

                try:
                    chunk = json.loads(data)
                    delta = chunk.get('choices', [{}])[0].get('delta', {})

                    if 'content' in delta:
                        content = delta['content']
                        full_content += content
                        if on_chunk:
                            on_chunk(content)
                        print(content, end='', flush=True)
                except json.JSONDecodeError:
                    continue

    return full_content

# Usage
result = stream_chat("Tell me a joke")
print(f"\n\nComplete: {result}")

With OpenAI SDK

from openai import OpenAI

client = OpenAI(
    api_key="eric",
    base_url="https://api3.exploit.bot/v1"
)

def chat_with_sdk(message):
    response = client.chat.completions.create(
        model="gpt-5.1",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": message}
        ],
        temperature=0.7,
        max_tokens=1000
    )

    return response.choices[0].message.content

def stream_with_sdk(message):
    stream = client.chat.completions.create(
        model="gpt-5.1",
        messages=[{"role": "user", "content": message}],
        stream=True
    )

    full_response = ""
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            full_response += content
            print(content, end='', flush=True)

    return full_response

def function_calling_with_sdk():
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_stock_price",
                "description": "Get current stock price",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "symbol": {"type": "string"},
                        "currency": {"type": "string", "enum": ["USD", "EUR"]}
                    },
                    "required": ["symbol"]
                }
            }
        }
    ]

    response = client.chat.completions.create(
        model="gpt-5.1",
        messages=[{"role": "user", "content": "What's AAPL stock price?"}],
        tools=tools,
        tool_choice="auto"
    )

    return response.choices[0].message

Basic Fetch API

class API3Client {
    constructor(apiKey = 'eric') {
        this.baseURL = 'https://api3.exploit.bot/v1';
        this.apiKey = apiKey;
    }

    async chat(messages, options = {}) {
        const response = await fetch(`${this.baseURL}/chat/completions`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': `Bearer ${this.apiKey}`
            },
            body: JSON.stringify({
                model: options.model || 'gpt-5.1',
                messages,
                stream: false,
                ...options
            })
        });

        if (!response.ok) {
            const error = await response.json();
            throw new Error(`Error: ${response.status} - ${error.detail?.message || 'Unknown error'}`);
        }

        const data = await response.json();
        return data.choices[0].message;
    }

    async stream(messages, onChunk, options = {}) {
        const response = await fetch(`${this.baseURL}/chat/completions`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': `Bearer ${this.apiKey}`
            },
            body: JSON.stringify({
                model: options.model || 'gpt-5.1',
                messages,
                stream: true,
                ...options
            })
        });

        if (!response.ok) {
            throw new Error(`Error: ${response.status}`);
        }

        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        let buffer = '';

        while (true) {
            const { done, value } = await reader.read();
            if (done) break;

            buffer += decoder.decode(value);
            const lines = buffer.split('\n');
            buffer = lines.pop();

            for (const line of lines) {
                if (line.startsWith('data: ')) {
                    const data = line.slice(6);
                    if (data === '[DONE]') return;

                    try {
                        const chunk = JSON.parse(data);
                        const delta = chunk.choices[0]?.delta;

                        if (delta?.content) {
                            onChunk(delta.content, chunk);
                        }
                    } catch (e) {
                        console.error('Parse error:', e);
                    }
                }
            }
        }
    }
}

// Usage
const client = new API3Client();

// Basic chat
client.chat([
    { role: 'user', content: 'Hello!' }
]).then(response => {
    console.log(response.content);
});

// Streaming
client.stream(
    [{ role: 'user', content: 'Write a poem' }],
    (chunk) => process.stdout.write(chunk)
).then(() => {
    console.log('\nStream complete');
});

Function Calling

async function functionCallingExample() {
    const tools = [
        {
            type: 'function',
            function: {
                name: 'send_email',
                description: 'Send an email to a recipient',
                parameters: {
                    type: 'object',
                    properties: {
                        to: { type: 'string', description: 'Email address' },
                        subject: { type: 'string', description: 'Email subject' },
                        body: { type: 'string', description: 'Email body' }
                    },
                    required: ['to', 'subject', 'body']
                }
            }
        }
    ];

    const messages = [
        { role: 'user', content: 'Send an email to john@example.com about the meeting' }
    ];

    // Get function call
    const response = await fetch('https://api3.exploit.bot/v1/chat/completions', {
        method: 'POST',
        headers: {
            'Authorization': 'Bearer eric',
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            model: 'gpt-5.1',
            messages,
            tools,
            tool_choice: 'required'
        })
    });

    const data = await response.json();
    const toolCall = data.choices[0].message.tool_calls[0];

    // Execute function (mock implementation)
    const args = JSON.parse(toolCall.function.arguments);
    const emailResult = await sendEmail(args.to, args.subject, args.body);

    // Continue conversation with result
    messages.push(data.choices[0].message);
    messages.push({
        role: 'tool',
        tool_call_id: toolCall.id,
        content: JSON.stringify({ success: true, messageId: emailResult.id })
    });

    const finalResponse = await fetch('https://api3.exploit.bot/v1/chat/completions', {
        method: 'POST',
        headers: {
            'Authorization': 'Bearer eric',
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            model: 'gpt-5.1',
            messages
        })
    });

    const finalData = await finalResponse.json();
    return finalData.choices[0].message.content;
}

Streaming with EventSource

function streamingWithEventSource(messages) {
    // Note: EventSource only supports GET requests, so it cannot call the
    // chat endpoint directly; route through a GET proxy or use fetch streaming.

    const eventSource = new EventSource('/stream-endpoint'); // If you have a proxy

    eventSource.onmessage = (event) => {
        const data = JSON.parse(event.data);
        console.log('Received:', data);
    };

    // Better approach: Use fetch with streaming as shown above
}

Error Handling

async function robustChat(message, retries = 3) {
    for (let attempt = 1; attempt <= retries; attempt++) {
        try {
            const response = await fetch('https://api3.exploit.bot/v1/chat/completions', {
                method: 'POST',
                headers: {
                    'Authorization': 'Bearer eric',
                    'Content-Type': 'application/json'
                },
                body: JSON.stringify({
                    model: 'gpt-5.1',
                    messages: [{ role: 'user', content: message }]
                })
            });

            if (response.ok) {
                const data = await response.json();
                return data.choices[0].message.content;
            } else if (response.status === 429) {
                // Rate limited - wait and retry
                const retryAfter = parseInt(response.headers.get('Retry-After') || '1', 10);
                await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
                continue;
            } else {
                const error = await response.json();
                throw new Error(error.detail?.message || `HTTP ${response.status}`);
            }
        } catch (error) {
            console.error(`Attempt ${attempt} failed:`, error);

            if (attempt === retries) {
                throw error;
            }

            // Exponential backoff
            await new Promise(resolve => setTimeout(resolve, Math.pow(2, attempt) * 1000));
        }
    }
}

TypeScript Client

interface Message {
    role: 'system' | 'user' | 'assistant' | 'tool';
    content?: string;
    tool_calls?: ToolCall[];
    tool_call_id?: string;
}

interface ToolCall {
    id: string;
    type: 'function';
    function: {
        name: string;
        arguments: string;
    };
}

interface Tool {
    type: 'function';
    function: {
        name: string;
        description: string;
        parameters: Record<string, unknown>;
    };
}

interface ChatCompletionRequest {
    model: string;
    messages: Message[];
    stream?: boolean;
    temperature?: number;
    max_tokens?: number;
    tools?: Tool[];
    tool_choice?: 'auto' | 'required' | 'none' | { type: 'function'; function: { name: string } };
    parallel_tool_calls?: boolean;
    x_codex?: {
        reasoning_effort?: 'minimal' | 'low' | 'medium' | 'high';
        sandbox?: 'read-only' | 'danger-full-access';
        network_access?: boolean;
        hide_reasoning?: boolean;
    };
}

interface ChatCompletionResponse {
    id: string;
    object: 'chat.completion';
    created: number;
    model: string;
    choices: Array<{
        index: number;
        message: Message;
        finish_reason: 'stop' | 'length' | 'tool_calls' | 'content_filter';
    }>;
    usage: {
        prompt_tokens: number;
        completion_tokens: number;
        total_tokens: number;
    };
}

class API3Client {
    private readonly baseURL: string;
    private readonly apiKey: string;

    constructor(apiKey: string = 'eric', baseURL: string = 'https://api3.exploit.bot/v1') {
        this.apiKey = apiKey;
        this.baseURL = baseURL;
    }

    async chat(request: ChatCompletionRequest): Promise<ChatCompletionResponse> {
        const response = await fetch(`${this.baseURL}/chat/completions`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': `Bearer ${this.apiKey}`
            },
            body: JSON.stringify({
                model: 'gpt-5.1',
                ...request
            })
        });

        if (!response.ok) {
            const error = await response.json() as any;
            throw new Error(`API Error: ${response.status} - ${error.detail?.message || 'Unknown'}`);
        }

        return response.json();
    }

    async stream(request: ChatCompletionRequest): Promise<AsyncGenerator<ChatCompletionResponse>> {
        const response = await fetch(`${this.baseURL}/chat/completions`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': `Bearer ${this.apiKey}`
            },
            body: JSON.stringify({
                ...request,
                stream: true
            })
        });

        if (!response.ok) {
            throw new Error(`Stream error: ${response.status}`);
        }

        const reader = response.body?.getReader();
        const decoder = new TextDecoder();

        if (!reader) {
            throw new Error('No response body');
        }

        return this.parseStream(reader, decoder);
    }

    private async *parseStream(reader: ReadableStreamDefaultReader<Uint8Array>, decoder: TextDecoder): AsyncGenerator<ChatCompletionResponse> {
        let buffer = '';

        while (true) {
            const { done, value } = await reader.read();
            if (done) break;

            buffer += decoder.decode(value);
            const lines = buffer.split('\n');
            buffer = lines.pop() || '';

            for (const line of lines) {
                if (line.startsWith('data: ')) {
                    const data = line.slice(6);
                    if (data === '[DONE]') return;

                    try {
                        const chunk = JSON.parse(data) as ChatCompletionResponse;
                        yield chunk;
                    } catch (e) {
                        console.error('Parse error:', e);
                    }
                }
            }
        }
    }
}

// Usage Example
const client = new API3Client();

// Basic chat with type safety
async function example() {
    const messages: Message[] = [
        { role: 'system', content: 'You are a TypeScript expert.' },
        { role: 'user', content: 'Explain generics in TypeScript' }
    ];

    const response = await client.chat({
        model: 'gpt-5.1',
        messages,
        temperature: 0.7
    });

    console.log(response.choices[0].message.content);
}

// Function calling with types
async function functionCallingExample() {
    const tools: Tool[] = [
        {
            type: 'function',
            function: {
                name: 'create_file',
                description: 'Create a file with content',
                parameters: {
                    type: 'object',
                    properties: {
                        path: { type: 'string' },
                        content: { type: 'string' }
                    },
                    required: ['path', 'content']
                }
            }
        }
    ];

    const messages: Message[] = [
        { role: 'user', content: 'Create a TypeScript file with a hello world function' }
    ];

    const response = await client.chat({
        model: 'gpt-5.1-codex',
        messages,
        tools,
        tool_choice: 'auto'
    });

    // Type-safe function call handling
    const toolCalls = response.choices[0].message.tool_calls;
    if (toolCalls) {
        for (const toolCall of toolCalls) {
            console.log(`Calling ${toolCall.function.name} with args:`, toolCall.function.arguments);
        }
    }
}

React Hook

import { useState, useCallback } from 'react';

interface UseChatOptions {
    model?: string;
    temperature?: number;
    onStream?: (chunk: string) => void;
    onError?: (error: Error) => void;
}

export function useChat(options: UseChatOptions = {}) {
    const [loading, setLoading] = useState(false);
    const [messages, setMessages] = useState<Message[]>([]);
    const [error, setError] = useState<Error | null>(null);

    const sendMessage = useCallback(async (content: string) => {
        setLoading(true);
        setError(null);

        const newMessage: Message = { role: 'user', content };
        setMessages(prev => [...prev, newMessage]);

        try {
            const response = await fetch('https://api3.exploit.bot/v1/chat/completions', {
                method: 'POST',
                headers: {
                    'Authorization': 'Bearer eric',
                    'Content-Type': 'application/json'
                },
                body: JSON.stringify({
                    model: options.model || 'gpt-5.1',
                    messages: [...messages, newMessage],
                    temperature: options.temperature || 0.7
                })
            });

            if (!response.ok) {
                throw new Error(`Error: ${response.status}`);
            }

            const data = await response.json();
            const assistantMessage = data.choices[0].message;

            setMessages(prev => [...prev, assistantMessage]);
            return assistantMessage;
        } catch (err) {
            const error = err as Error;
            setError(error);
            options.onError?.(error);
            throw error;
        } finally {
            setLoading(false);
        }
    }, [messages, options]);

    const streamMessage = useCallback(async (content: string) => {
        setLoading(true);
        setError(null);

        const newMessage: Message = { role: 'user', content };
        setMessages(prev => [...prev, newMessage]);

        try {
            const response = await fetch('https://api3.exploit.bot/v1/chat/completions', {
                method: 'POST',
                headers: {
                    'Authorization': 'Bearer eric',
                    'Content-Type': 'application/json'
                },
                body: JSON.stringify({
                    model: options.model || 'gpt-5.1',
                    messages: [...messages, newMessage],
                    stream: true
                })
            });

            const reader = response.body?.getReader();
            const decoder = new TextDecoder();
            let assistantContent = '';

            if (reader) {
                while (true) {
                    const { done, value } = await reader.read();
                    if (done) break;

                    const chunk = decoder.decode(value);
                    const lines = chunk.split('\n');

                    for (const line of lines) {
                        if (line.startsWith('data: ')) {
                            const data = line.slice(6);
                            // The stream closes after [DONE]; keep going so the
                            // assembled assistant message is saved below
                            if (data === '[DONE]') continue;

                            try {
                                const parsed = JSON.parse(data);
                                const delta = parsed.choices[0]?.delta;

                                if (delta?.content) {
                                    assistantContent += delta.content;
                                    options.onStream?.(delta.content);
                                }
                            } catch (e) {
                                // Ignore parse errors
                            }
                        }
                    }
                }
            }

            const assistantMessage: Message = {
                role: 'assistant',
                content: assistantContent
            };

            setMessages(prev => [...prev, assistantMessage]);
            return assistantMessage;
        } catch (err) {
            const error = err as Error;
            setError(error);
            options.onError?.(error);
            throw error;
        } finally {
            setLoading(false);
        }
    }, [messages, options]);

    return {
        messages,
        loading,
        error,
        sendMessage,
        streamMessage,
        clearMessages: () => setMessages([])
    };
}

// Usage in component
function ChatComponent() {
    const { messages, loading, sendMessage, streamMessage } = useChat({
        model: 'gpt-5.1',
        onStream: (chunk) => console.log('Received:', chunk)
    });

    const handleSubmit = async (e: React.FormEvent) => {
        e.preventDefault();
        const formData = new FormData(e.target as HTMLFormElement);
        const message = formData.get('message') as string;

        await streamMessage(message);
    };

    return (
        <form onSubmit={handleSubmit}>
            {messages.map((msg, i) => (
                <div key={i}>
                    {msg.role}: {msg.content}
                </div>
            ))}
            {loading && <div>Typing...</div>}
            <input name="message" placeholder="Type a message..." />
        </form>
    );
}

Python OpenAI SDK

from openai import OpenAI

# Initialize client with API3
client = OpenAI(
    api_key="eric",
    base_url="https://api3.exploit.bot/v1"
)

# Basic chat
def basic_chat():
    response = client.chat.completions.create(
        model="gpt-5.1",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain the theory of relativity"}
        ],
        temperature=0.7,
        max_tokens=500
    )

    return response.choices[0].message.content

# Streaming
def streaming_example():
    stream = client.chat.completions.create(
        model="gpt-5.1",
        messages=[{"role": "user", "content": "Write a story"}],
        stream=True,
        temperature=0.8
    )

    full_response = ""
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            full_response += content
            print(content, end='', flush=True)

    return full_response

# Function calling
def function_calling_example():
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City and state, e.g. San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "calculate_tip",
                "description": "Calculate tip amount",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "bill_amount": {"type": "number"},
                        "tip_percentage": {"type": "number"}
                    },
                    "required": ["bill_amount"]
                }
            }
        }
    ]

    response = client.chat.completions.create(
        model="gpt-5.1",
        messages=[
            {"role": "user", "content": "What's the weather in Boston and calculate a 20% tip on a $50 bill?"}
        ],
        tools=tools,
        parallel_tool_calls=True,
        tool_choice="auto"
    )

    return response.choices[0].message

# Using codex-cli with reasoning effort
def codex_reasoning_example():
    response = client.chat.completions.create(
        model="codex-cli",
        messages=[
            {"role": "user", "content": "Implement a binary search tree in Python with all operations"}
        ],
        # Non-standard fields must be sent via extra_body with the OpenAI SDK
        extra_body={
            "x_codex": {
                "reasoning_effort": "high",
                "sandbox": "read-only",
                "hide_reasoning": False
            }
        }
    )

    return response.choices[0].message.content

JavaScript/TypeScript OpenAI SDK

import OpenAI from 'openai';

// Initialize client
const openai = new OpenAI({
    apiKey: 'eric',
    baseURL: 'https://api3.exploit.bot/v1',
    dangerouslyAllowBrowser: true // Only for browser usage
});

// Basic chat completion
async function basicChat() {
    const completion = await openai.chat.completions.create({
        model: 'gpt-5.1',
        messages: [
            { role: 'system', content: 'You are a helpful assistant.' },
            { role: 'user', content: 'Explain JavaScript closures' }
        ],
        temperature: 0.7,
        max_tokens: 500
    });

    return completion.choices[0].message.content;
}

// Streaming with SDK
async function streamingChat() {
    const stream = await openai.chat.completions.create({
        model: 'gpt-5.1',
        messages: [
            { role: 'user', content: 'Write a mystery story' }
        ],
        stream: true
    });

    let fullResponse = '';

    for await (const chunk of stream) {
        const content = chunk.choices[0]?.delta?.content;
        if (content) {
            fullResponse += content;
            process.stdout.write(content);
        }
    }

    return fullResponse;
}

// Function calling with SDK
async function functionCalling() {
    const tools = [
        {
            type: 'function',
            function: {
                name: 'search_web',
                description: 'Search the web for information',
                parameters: {
                    type: 'object',
                    properties: {
                        query: {
                            type: 'string',
                            description: 'Search query'
                        },
                        num_results: {
                            type: 'number',
                            description: 'Number of results to return'
                        }
                    },
                    required: ['query']
                }
            }
        }
    ];

    const response = await openai.chat.completions.create({
        model: 'gpt-5.1',
        messages: [
            { role: 'user', content: 'Search for recent AI breakthroughs' }
        ],
        tools,
        tool_choice: 'auto'
    });

    const toolCalls = response.choices[0].message.tool_calls;
    if (toolCalls) {
        for (const toolCall of toolCalls) {
            console.log(`Function: ${toolCall.function.name}`);
            console.log(`Args: ${toolCall.function.arguments}`);
        }
    }

    return response.choices[0].message;
}

// Advanced usage with custom logic
class ChatAssistant {
    private client: OpenAI;
    private model: string;
    private conversation: Array<OpenAI.Chat.Completions.ChatCompletionMessageParam> = [];

    constructor(model: string = 'gpt-5.1') {
        this.model = model;
        this.client = new OpenAI({
            apiKey: 'eric',
            baseURL: 'https://api3.exploit.bot/v1'
        });
    }

    async addMessage(role: 'user' | 'assistant', content: string) {
        this.conversation.push({ role, content });
    }

    async respond(useTools = false) {
        const response = await this.client.chat.completions.create({
            model: this.model,
            messages: this.conversation,
            tools: useTools ? this.getTools() : undefined,
            tool_choice: useTools ? 'auto' : undefined
        });

        const message = response.choices[0].message;
        this.conversation.push(message as any);

        // Handle tool calls
        if (message.tool_calls) {
            const results = await this.executeToolCalls(message.tool_calls);

            // Add tool results and get final response
            for (const result of results) {
                this.conversation.push({
                    role: 'tool',
                    tool_call_id: result.tool_call_id,
                    content: result.content
                });
            }

            return this.respond(false); // Get final response without tools
        }

        return message.content;
    }

    private getTools() {
        return [
            {
                type: 'function' as const,
                function: {
                    name: 'get_current_time',
                    description: 'Get current time',
                    parameters: { type: 'object', properties: {} }
                }
            }
        ];
    }

    private async executeToolCalls(toolCalls: OpenAI.Chat.Completions.ChatCompletionMessageToolCall[]) {
        const results = [];

        for (const toolCall of toolCalls) {
            if (toolCall.function.name === 'get_current_time') {
                results.push({
                    tool_call_id: toolCall.id,
                    content: new Date().toISOString()
                });
            }
        }

        return results;
    }
}

// Usage
const assistant = new ChatAssistant();
await assistant.addMessage('user', 'What time is it?');
const response = await assistant.respond(true);
console.log(response);

Go HTTP Client

package main

import (
    "bufio"
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
    "strings"
    "time"
)

// Structs for API requests and responses
type Message struct {
    Role       string `json:"role"`
    Content    string `json:"content,omitempty"`
    ToolCalls  []ToolCall `json:"tool_calls,omitempty"`
    ToolCallID string `json:"tool_call_id,omitempty"`
}

type ToolCall struct {
    ID       string `json:"id"`
    Type     string `json:"type"`
    Function struct {
        Name      string `json:"name"`
        Arguments string `json:"arguments"`
    } `json:"function"`
}

type Tool struct {
    Type     string `json:"type"`
    Function struct {
        Name        string                 `json:"name"`
        Description string                 `json:"description"`
        Parameters  map[string]interface{} `json:"parameters"`
    } `json:"function"`
}

type ChatRequest struct {
    Model              string    `json:"model"`
    Messages           []Message `json:"messages"`
    Stream             bool      `json:"stream,omitempty"`
    Temperature        float64   `json:"temperature,omitempty"`
    MaxTokens          int       `json:"max_tokens,omitempty"`
    Tools              []Tool    `json:"tools,omitempty"`
    ToolChoice         string    `json:"tool_choice,omitempty"`
    ParallelToolCalls  bool      `json:"parallel_tool_calls,omitempty"`
    XCodex             *XCodex   `json:"x_codex,omitempty"`
}

type XCodex struct {
    ReasoningEffort string `json:"reasoning_effort,omitempty"`
    Sandbox         string `json:"sandbox,omitempty"`
    NetworkAccess   bool   `json:"network_access,omitempty"`
    HideReasoning   bool   `json:"hide_reasoning,omitempty"`
}

type ChatResponse struct {
    ID      string `json:"id"`
    Object  string `json:"object"`
    Created int64  `json:"created"`
    Model   string `json:"model"`
    Choices []struct {
        Index        int     `json:"index"`
        Message      Message `json:"message"`
        Delta        Message `json:"delta"` // populated on streaming chunks
        FinishReason string  `json:"finish_reason"`
    } `json:"choices"`
    Usage struct {
        PromptTokens     int `json:"prompt_tokens"`
        CompletionTokens int `json:"completion_tokens"`
        TotalTokens      int `json:"total_tokens"`
    } `json:"usage"`
}

// Client struct
type API3Client struct {
    BaseURL string
    APIKey  string
    Client  *http.Client
}

func NewClient(apiKey string) *API3Client {
    return &API3Client{
        BaseURL: "https://api3.exploit.bot/v1",
        APIKey:  apiKey,
        Client: &http.Client{
            Timeout: 120 * time.Second,
        },
    }
}

// Basic chat completion
func (c *API3Client) Chat(ctx context.Context, req ChatRequest) (*ChatResponse, error) {
    req.Stream = false // Ensure non-streaming

    jsonData, err := json.Marshal(req)
    if err != nil {
        return nil, err
    }

    httpReq, err := http.NewRequestWithContext(ctx, "POST", c.BaseURL+"/chat/completions", bytes.NewBuffer(jsonData))
    if err != nil {
        return nil, err
    }

    httpReq.Header.Set("Content-Type", "application/json")
    httpReq.Header.Set("Authorization", "Bearer "+c.APIKey)

    resp, err := c.Client.Do(httpReq)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        body, _ := io.ReadAll(resp.Body)
        return nil, fmt.Errorf("API error: %d - %s", resp.StatusCode, string(body))
    }

    var result ChatResponse
    err = json.NewDecoder(resp.Body).Decode(&result)
    return &result, err
}

// Streaming chat
func (c *API3Client) ChatStream(ctx context.Context, req ChatRequest, callback func(*ChatResponse) error) error {
    req.Stream = true

    jsonData, err := json.Marshal(req)
    if err != nil {
        return err
    }

    httpReq, err := http.NewRequestWithContext(ctx, "POST", c.BaseURL+"/chat/completions", bytes.NewBuffer(jsonData))
    if err != nil {
        return err
    }

    httpReq.Header.Set("Content-Type", "application/json")
    httpReq.Header.Set("Authorization", "Bearer "+c.APIKey)
    httpReq.Header.Set("Accept", "text/event-stream")

    resp, err := c.Client.Do(httpReq)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        body, _ := io.ReadAll(resp.Body)
        return fmt.Errorf("API error: %d - %s", resp.StatusCode, string(body))
    }

    // fmt.Fscanf cannot parse SSE payloads containing spaces, so read
    // the body line by line with a buffered scanner instead.
    scanner := bufio.NewScanner(resp.Body)
    scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)

    for scanner.Scan() {
        line := scanner.Text()
        if !strings.HasPrefix(line, "data: ") {
            continue
        }

        data := strings.TrimPrefix(line, "data: ")
        if data == "[DONE]" {
            break
        }

        var chunk ChatResponse
        if err := json.Unmarshal([]byte(data), &chunk); err == nil {
            if callback != nil {
                if err := callback(&chunk); err != nil {
                    return err
                }
            }
        }
    }

    return scanner.Err()
}

// Function calling example
func functionCallingExample() {
    client := NewClient("eric")

    tools := []Tool{
        {
            Type: "function",
            Function: struct {
                Name        string                 `json:"name"`
                Description string                 `json:"description"`
                Parameters  map[string]interface{} `json:"parameters"`
            }{
                Name:        "calculate",
                Description: "Perform mathematical calculations",
                Parameters: map[string]interface{}{
                    "type": "object",
                    "properties": map[string]interface{}{
                        "expression": map[string]interface{}{
                            "type":        "string",
                            "description": "Mathematical expression",
                        },
                    },
                    "required": []string{"expression"},
                },
            },
        },
    }

    req := ChatRequest{
        Model: "gpt-5.1",
        Messages: []Message{
            {Role: "user", Content: "What is 15 * 23?"},
        },
        Tools:     tools,
        ToolChoice: "auto",
    }

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    resp, err := client.Chat(ctx, req)
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    // Check for tool calls
    if len(resp.Choices) > 0 && len(resp.Choices[0].Message.ToolCalls) > 0 {
        toolCall := resp.Choices[0].Message.ToolCalls[0]
        fmt.Printf("Function called: %s\n", toolCall.Function.Name)
        fmt.Printf("Arguments: %s\n", toolCall.Function.Arguments)

        // Execute function and continue conversation...
    } else {
        fmt.Printf("Response: %s\n", resp.Choices[0].Message.Content)
    }
}

// Streaming example
func streamingExample() {
    client := NewClient("eric")

    req := ChatRequest{
        Model: "gpt-5.1",
        Messages: []Message{
            {Role: "user", Content: "Write a story about a robot"},
        },
    }

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    fmt.Print("Streaming: ")
    err := client.ChatStream(ctx, req, func(chunk *ChatResponse) error {
        if len(chunk.Choices) > 0 {
            // Streaming chunks carry incremental text in the delta field
            content := chunk.Choices[0].Delta.Content
            if content != "" {
                fmt.Print(content)
            }
        }
        return nil
    })

    if err != nil {
        fmt.Printf("\nError: %v\n", err)
    } else {
        fmt.Println("\n\nStream complete")
    }
}

func main() {
    // Basic example
    client := NewClient("eric")

    req := ChatRequest{
        Model: "gpt-5.1",
        Messages: []Message{
            {Role: "system", Content: "You are a helpful assistant."},
            {Role: "user", Content: "Explain Go concurrency"},
        },
        Temperature: 0.7,
    }

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    resp, err := client.Chat(ctx, req)
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        os.Exit(1)
    }

    fmt.Printf("Response: %s\n", resp.Choices[0].Message.Content)

    // Run other examples
    functionCallingExample()
    streamingExample()
}
Rust Client

use futures::StreamExt;
use reqwest;
use serde::{Deserialize, Serialize};
use serde_json::json;

// API structs
#[derive(Debug, Serialize, Deserialize)]
struct Message {
    role: String,
    content: Option<String>,
    tool_calls: Option<Vec<ToolCall>>,
    tool_call_id: Option<String>,
}

#[derive(Debug, Serialize, Deserialize)]
struct ToolCall {
    id: String,
    r#type: String,
    function: FunctionCall,
}

#[derive(Debug, Serialize, Deserialize)]
struct FunctionCall {
    name: String,
    arguments: String,
}

#[derive(Debug, Serialize)]
struct Tool {
    r#type: String,
    function: FunctionDef,
}

#[derive(Debug, Serialize)]
struct FunctionDef {
    name: String,
    description: String,
    parameters: serde_json::Value,
}

#[derive(Debug, Serialize)]
struct ChatRequest {
    model: String,
    messages: Vec<Message>,
    stream: Option<bool>,
    temperature: Option<f64>,
    max_tokens: Option<u32>,
    tools: Option<Vec<Tool>>,
    tool_choice: Option<serde_json::Value>,
    parallel_tool_calls: Option<bool>,
    x_codex: Option<XCodex>,
}

#[derive(Debug, Serialize)]
struct XCodex {
    reasoning_effort: Option<String>,
    sandbox: Option<String>,
    network_access: Option<bool>,
    hide_reasoning: Option<bool>,
}

#[derive(Debug, Deserialize)]
struct ChatResponse {
    id: String,
    object: String,
    created: u64,
    model: String,
    choices: Vec<Choice>,
    // Absent on streaming chunks
    usage: Option<Usage>,
}

#[derive(Debug, Deserialize)]
struct Choice {
    index: u32,
    // Non-streaming responses populate message; streaming chunks populate delta
    #[serde(default)]
    message: Message,
    #[serde(default)]
    delta: Message,
    finish_reason: Option<String>,
}

#[derive(Debug, Deserialize)]
struct Usage {
    prompt_tokens: u32,
    completion_tokens: u32,
    total_tokens: u32,
}

// Client struct
struct API3Client {
    base_url: String,
    api_key: String,
    client: reqwest::Client,
}

impl API3Client {
    fn new(api_key: &str) -> Self {
        Self {
            base_url: "https://api3.exploit.bot/v1".to_string(),
            api_key: api_key.to_string(),
            client: reqwest::Client::new(),
        }
    }

    async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, Box<dyn std::error::Error>> {
        let url = format!("{}/chat/completions", self.base_url);

        let mut req_body = request;
        req_body.stream = Some(false);

        let response = self.client
            .post(&url)
            .header("Authorization", format!("Bearer {}", self.api_key))
            .header("Content-Type", "application/json")
            .json(&req_body)
            .send()
            .await?;

        if !response.status().is_success() {
            let error_text = response.text().await?;
            return Err(format!("API error: {}", error_text).into());
        }

        let chat_response: ChatResponse = response.json().await?;
        Ok(chat_response)
    }

    // Streaming via a callback: read the SSE body incrementally and
    // invoke on_chunk for each parsed "data:" line
    async fn chat_stream(
        &self,
        request: ChatRequest,
        mut on_chunk: impl FnMut(ChatResponse),
    ) -> Result<(), Box<dyn std::error::Error>> {
        let url = format!("{}/chat/completions", self.base_url);

        let mut req_body = request;
        req_body.stream = Some(true);

        let response = self.client
            .post(&url)
            .header("Authorization", format!("Bearer {}", self.api_key))
            .header("Content-Type", "application/json")
            .json(&req_body)
            .send()
            .await?;

        if !response.status().is_success() {
            let error_text = response.text().await?;
            return Err(format!("API error: {}", error_text).into());
        }

        let mut byte_stream = response.bytes_stream();
        let mut buffer: Vec<u8> = Vec::new();

        while let Some(item) = byte_stream.next().await {
            buffer.extend_from_slice(&item?);

            // Split off complete lines; keep any trailing partial line buffered
            while let Some(pos) = buffer.iter().position(|&b| b == b'\n') {
                let line: Vec<u8> = buffer.drain(..=pos).collect();
                let line = String::from_utf8_lossy(&line);
                let line = line.trim();

                if let Some(data) = line.strip_prefix("data: ") {
                    if data == "[DONE]" {
                        return Ok(());
                    }
                    if let Ok(chunk) = serde_json::from_str::<ChatResponse>(data) {
                        on_chunk(chunk);
                    }
                }
            }
        }

        Ok(())
    }
}

// Basic usage example
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = API3Client::new("eric");

    // Basic chat
    let request = ChatRequest {
        model: "gpt-5.1".to_string(),
        messages: vec![
            Message {
                role: "system".to_string(),
                content: Some("You are a helpful Rust assistant.".to_string()),
                tool_calls: None,
                tool_call_id: None,
            },
            Message {
                role: "user".to_string(),
                content: Some("Explain Rust's ownership system".to_string()),
                tool_calls: None,
                tool_call_id: None,
            },
        ],
        stream: None,
        temperature: Some(0.7),
        max_tokens: Some(500),
        tools: None,
        tool_choice: None,
        parallel_tool_calls: None,
        x_codex: None,
    };

    let response = client.chat(request).await?;
    println!("Response: {}", response.choices[0].message.content.as_ref().unwrap());

    // Function calling example
    let tools = vec![
        Tool {
            r#type: "function".to_string(),
            function: FunctionDef {
                name: "execute_code".to_string(),
                description: "Execute code and return result".to_string(),
                parameters: json!({
                    "type": "object",
                    "properties": {
                        "code": {
                            "type": "string",
                            "description": "Code to execute"
                        },
                        "language": {
                            "type": "string",
                            "enum": ["python", "javascript", "rust"],
                            "description": "Programming language"
                        }
                    },
                    "required": ["code", "language"]
                }),
            },
        },
    ];

    let tool_request = ChatRequest {
        model: "gpt-5.1".to_string(),
        messages: vec![
            Message {
                role: "user".to_string(),
                content: Some("Calculate the factorial of 10".to_string()),
                tool_calls: None,
                tool_call_id: None,
            },
        ],
        tools: Some(tools),
        tool_choice: Some(json!("auto")),
        parallel_tool_calls: Some(true),
        ..Default::default()
    };

    let tool_response = client.chat(tool_request).await?;

    if let Some(tool_calls) = &tool_response.choices[0].message.tool_calls {
        for tool_call in tool_calls {
            println!("Function: {}", tool_call.function.name);
            println!("Args: {}", tool_call.function.arguments);

            // Execute function and continue conversation...
        }
    } else {
        println!("No function calls made");
    }

    // Streaming example
    println!("\nStreaming:");
    let stream_request = ChatRequest {
        model: "gpt-5.1".to_string(),
        messages: vec![
            Message {
                role: "user".to_string(),
                content: Some("Write a poem about programming".to_string()),
                tool_calls: None,
                tool_call_id: None,
            },
        ],
        ..Default::default()
    };

    client
        .chat_stream(stream_request, |chunk| {
            // Streaming chunks carry incremental text in the delta field
            if let Some(content) = chunk.choices.get(0).and_then(|c| c.delta.content.as_ref()) {
                print!("{}", content);
            }
        })
        .await?;

    println!("\n\nStream complete");

    Ok(())
}

// Manual Default implementations let the examples above use struct-update syntax
impl Default for ChatRequest {
    fn default() -> Self {
        Self {
            model: String::new(),
            messages: Vec::new(),
            stream: None,
            temperature: None,
            max_tokens: None,
            tools: None,
            tool_choice: None,
            parallel_tool_calls: None,
            x_codex: None,
        }
    }
}

impl Default for Message {
    fn default() -> Self {
        Self {
            role: String::new(),
            content: None,
            tool_calls: None,
            tool_call_id: None,
        }
    }
}

8. Advanced Features

Multi-modal Capabilities

Models support image analysis through vision capabilities. Include images as base64 data or URLs:

{
  "model": "gpt-5.1-codex",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Analyze this screenshot of code"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
            "detail": "high"  // "low" or "high" resolution
          }
        }
      ]
    }
  ]
}
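
The same payload can be sent through the Python SDK. A minimal sketch using the client configured earlier and the content-part format shown above; the file path is illustrative:

import base64

# Hypothetical local file; replace with your own screenshot
with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-5.1-codex",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this screenshot of code"},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/png;base64,{image_b64}",
                    "detail": "high"
                }
            }
        ]
    }]
)
print(response.choices[0].message.content)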

Temperature and Sampling Parameters
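
The API accepts the standard OpenAI sampling parameters. A minimal sketch using the Python client configured earlier; top_p support is an assumption based on OpenAI compatibility:

# Lower temperature -> more deterministic output; higher -> more varied
response = client.chat.completions.create(
    model="gpt-5.1",
    messages=[{"role": "user", "content": "Name three sorting algorithms"}],
    temperature=0.2,   # 0.0-2.0
    top_p=0.9,         # nucleus sampling; assumed supported via OpenAI compatibility
    max_tokens=200
)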

Response Format Controls

{
  "model": "gpt-5.1-codex",
  "messages": [...],
  "response_format": {
    "type": "json_object"  // or "text"
  }
}
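
With json_object mode the returned content should be parseable JSON. A minimal sketch, assuming (as is common in OpenAI-compatible APIs) that the prompt also instructs the model to emit JSON:

import json

response = client.chat.completions.create(
    model="gpt-5.1-codex",
    messages=[{"role": "user", "content": "Return a JSON object with keys 'name' and 'version'"}],
    response_format={"type": "json_object"}
)

# Parse the structured output
data = json.loads(response.choices[0].message.content)
print(data)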

System Message Optimization

Best practices for system messages:

Codex-Specific Features

Sandbox Modes

Network Access

{
  "x_codex": {
    "network_access": true,  // Allow internet access
    "sandbox": "read-only",
    "reasoning_effort": "high",
    "hide_reasoning": false  // Show reasoning process
  }
}
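
Because x_codex is not a standard OpenAI SDK parameter, the Python SDK rejects it as an unknown keyword; pass it through extra_body instead. A minimal sketch:

response = client.chat.completions.create(
    model="gpt-5.1-codex",
    messages=[{"role": "user", "content": "Refactor this module"}],
    # extra_body merges arbitrary fields into the request JSON
    extra_body={
        "x_codex": {
            "network_access": True,
            "sandbox": "read-only",
            "reasoning_effort": "high",
            "hide_reasoning": False
        }
    }
)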

Batch Processing

Process multiple requests concurrently using the parallel processing capability:

import asyncio
import aiohttp

async def call_api(session, prompt):
    """Send a single chat completion request and return the text."""
    async with session.post(
        "https://api3.exploit.bot/v1/chat/completions",
        headers={"Authorization": "Bearer eric"},
        json={
            "model": "gpt-5.1",
            "messages": [{"role": "user", "content": prompt}]
        }
    ) as resp:
        data = await resp.json()
        return data["choices"][0]["message"]["content"]

async def batch_process(prompts, max_concurrent=5):
    semaphore = asyncio.Semaphore(max_concurrent)

    async with aiohttp.ClientSession() as session:
        async def process(prompt):
            async with semaphore:
                return await call_api(session, prompt)

        tasks = [process(p) for p in prompts]
        return await asyncio.gather(*tasks)

Token Optimization
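
A common first step is estimating prompt size before sending. Exact tokenizers for these models aren't documented here, so the sketch below uses the rough "~4 characters per token" heuristic; treat it as an estimate only:

def estimate_tokens(text: str) -> int:
    """Rough token estimate (~4 chars per token for English text)."""
    return max(1, len(text) // 4)

def fits_budget(messages, max_prompt_tokens=4000):
    """Check whether a message list stays under a prompt-token budget."""
    total = sum(estimate_tokens(m.get("content", "") or "") for m in messages)
    return total <= max_prompt_tokens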

Caching Strategies

import hashlib
import json
from functools import lru_cache

def cache_key(model, messages, **kwargs):
    """Generate cache key for requests"""
    content = json.dumps({
        'model': model,
        'messages': messages,
        **kwargs
    }, sort_keys=True)
    return hashlib.md5(content.encode()).hexdigest()

@lru_cache(maxsize=100)
def cached_response(key):
    """Cache responses to identical requests (key built via cache_key())."""
    # Make the actual API call here and return the result
    ...
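
In practice a plain dictionary keyed by cache_key() is simpler than lru_cache here, since the response has to come from the API call itself. A minimal sketch, using the client configured earlier:

_cache = {}

def chat_cached(model, messages, **kwargs):
    key = cache_key(model, messages, **kwargs)
    if key not in _cache:
        # Only hit the API on a cache miss
        _cache[key] = client.chat.completions.create(
            model=model, messages=messages, **kwargs
        )
    return _cache[key]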

9. Error Handling

HTTP Status Codes

| Status Code | Meaning | When It Occurs |
|-------------|---------|----------------|
| 200 | OK | Successful request |
| 400 | Bad Request | Invalid request body or parameters |
| 401 | Unauthorized | Missing or invalid API key |
| 404 | Not Found | Model not found or invalid endpoint |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error during processing |

Error Response Format

{ "detail": { "message": "Model 'invalid-model' is not available. Choose one of: gpt-5.1, codex-cli", "type": "invalid_request_error", "code": "model_not_found" } }

Common Error Scenarios

Model Not Found

{
  "detail": {
    "message": "Model 'gpt-4' is not available. Choose one of: gpt-5.1, codex-cli, ...",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}

Tool Execution Error

{
  "detail": {
    "message": "Tool 'invalid_tool' not found in tools list",
    "type": "invalid_request_error",
    "code": "tool_not_found"
  }
}

Stream Interrupted

{
  "detail": {
    "message": "Stream connection interrupted",
    "type": "stream_error",
    "code": "connection_lost"
  }
}

Retry Strategy

import time
import random
import asyncio

async def resilient_api_call(request_func, max_retries=5):
    """Resilient API calling with exponential backoff"""

    for attempt in range(max_retries):
        try:
            return await request_func()

        except Exception as e:
            # Don't retry on client errors (4xx)
            if hasattr(e, 'response') and 400 <= e.response.status_code < 500:
                raise

            # Calculate backoff with jitter
            base_delay = 2 ** attempt
            jitter = random.uniform(0, 1)
            delay = min(base_delay + jitter, 30)  # Cap at 30 seconds

            if attempt < max_retries - 1:
                print(f"Attempt {attempt + 1} failed, retrying in {delay:.2f}s...")
                await asyncio.sleep(delay)
            else:
                raise

# Usage
async def safe_chat_request():
    async def make_request():
        # Your API call here
        pass

    return await resilient_api_call(make_request)

Error Monitoring

import logging
from dataclasses import dataclass
from typing import Optional
from datetime import datetime

@dataclass
class APIError:
    timestamp: datetime
    error_type: str
    status_code: Optional[int]
    message: str
    request_id: Optional[str]
    retry_count: int

class ErrorMonitor:
    def __init__(self):
        self.errors = []
        self.logger = logging.getLogger('api_errors')

    def record_error(self, error: Exception, status_code: Optional[int] = None):
        error_info = APIError(
            timestamp=datetime.now(),
            error_type=type(error).__name__,
            status_code=status_code,
            message=str(error),
            request_id=getattr(error, 'request_id', None),
            retry_count=getattr(error, 'retry_count', 0)
        )

        self.errors.append(error_info)
        self.logger.error(f"API Error: {error_info}")

        # Check for patterns
        self.check_error_patterns()

    def check_error_patterns(self):
        """Analyze error patterns for insights"""
        recent_errors = [e for e in self.errors
                        if (datetime.now() - e.timestamp).seconds < 300]

        # High error rate?
        if len(recent_errors) > 10:
            self.logger.warning("High error rate detected!")

        # Rate limiting?
        rate_limits = [e for e in recent_errors if e.status_code == 429]
        if len(rate_limits) > 5:
            self.logger.warning("Frequent rate limiting - consider throttling")

10. Rate Limiting

Implementation Details

Rate Limit Headers

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1704067200
Retry-After: 60
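
Clients can read these headers to pace themselves. A minimal sketch with the requests library, assuming the headers above are present on the response:

import requests

resp = requests.post(
    "https://api3.exploit.bot/v1/chat/completions",
    headers={"Authorization": "Bearer eric"},
    json={"model": "gpt-5.1", "messages": [{"role": "user", "content": "Hi"}]}
)

remaining = int(resp.headers.get("X-RateLimit-Remaining", 1))
if resp.status_code == 429 or remaining == 0:
    wait = int(resp.headers.get("Retry-After", 60))
    print(f"Backing off for {wait}s")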

Client-Side Rate Limiting

import time
from collections import deque
import asyncio

class RateLimiter:
    def __init__(self, max_requests=60, time_window=60):
        self.max_requests = max_requests
        self.time_window = time_window
        self.requests = deque()

    async def acquire(self):
        """Wait if rate limit would be exceeded"""
        now = time.time()

        # Remove old requests
        while self.requests and self.requests[0] < now - self.time_window:
            self.requests.popleft()

        # Check if we can make a request
        if len(self.requests) >= self.max_requests:
            sleep_time = self.time_window - (now - self.requests[0])
            await asyncio.sleep(sleep_time)
            return await self.acquire()

        self.requests.append(now)

# Usage
limiter = RateLimiter(max_requests=30, time_window=60)

async def make_api_call():
    await limiter.acquire()
    # Make API call here
    pass

Adaptive Rate Limiting

class AdaptiveRateLimiter(RateLimiter):
    def __init__(self):
        super().__init__(max_requests=60, time_window=60)
        self.base_limit = 60
        self.current_limit = 60
        self.error_count = 0
        self.last_adjustment = time.time()

    async def acquire(self):
        now = time.time()

        # Adjust limit based on recent errors
        if now - self.last_adjustment > 60:
            if self.error_count > 5:
                self.current_limit = max(10, self.current_limit // 2)
                print(f"Rate limit reduced to {self.current_limit}")
            elif self.error_count == 0:
                self.current_limit = min(self.base_limit, self.current_limit + 10)

            # Apply the adjusted limit to the underlying limiter
            self.max_requests = self.current_limit
            self.error_count = 0
            self.last_adjustment = now

        await super().acquire()

    def record_error(self):
        self.error_count += 1
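
For example, feeding failures back into the limiter so repeated 429s shrink the window (make_api_call is the request function defined above):

adaptive = AdaptiveRateLimiter()

async def guarded_call():
    await adaptive.acquire()
    try:
        return await make_api_call()
    except Exception:
        adaptive.record_error()
        raise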

Best Practices

11. Integration Guides

Next.js Integration

// app/api/chat/route.ts
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
    try {
        const body = await request.json();

        const response = await fetch('https://api3.exploit.bot/v1/chat/completions', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': 'Bearer eric'
            },
            body: JSON.stringify({
                model: 'gpt-5.1',
                ...body
            })
        });

        if (!response.ok) {
            const error = await response.json();
            return NextResponse.json(
                { error: error.detail?.message || 'API Error' },
                { status: response.status }
            );
        }

        if (body.stream) {
            // Handle streaming
            const reader = response.body?.getReader();
            const encoder = new TextEncoder();

            const readable = new ReadableStream({
                async start(controller) {
                    while (true) {
                        const { done, value } = await reader!.read();
                        if (done) break;
                        controller.enqueue(value);
                    }
                    controller.close();
                }
            });

            return new NextResponse(readable, {
                headers: {
                    'Content-Type': 'text/event-stream',
                    'Cache-Control': 'no-cache',
                    'Connection': 'keep-alive'
                }
            });
        }

        const data = await response.json();
        return NextResponse.json(data);

    } catch (error) {
        console.error('Chat API error:', error);
        return NextResponse.json(
            { error: 'Internal server error' },
            { status: 500 }
        );
    }
}

// components/ChatComponent.tsx
'use client';

import { useState } from 'react';

export default function ChatComponent() {
    const [messages, setMessages] = useState<{ role: string; content: string }[]>([]);
    const [loading, setLoading] = useState(false);
    const [input, setInput] = useState('');

    const sendMessage = async () => {
        if (!input.trim() || loading) return;

        setLoading(true);
        const userMessage = { role: 'user', content: input };
        setMessages(prev => [...prev, userMessage]);

        try {
            const response = await fetch('/api/chat', {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({
                    messages: [...messages, userMessage],
                    stream: true
                })
            });

            const reader = response.body?.getReader();
            const decoder = new TextDecoder();
            let assistantMessage = { role: 'assistant', content: '' };

            setMessages(prev => [...prev, assistantMessage]);

            while (true) {
                const { done, value } = await reader!.read();
                if (done) break;

                const chunk = decoder.decode(value);
                const lines = chunk.split('\n');

                for (const line of lines) {
                    if (line.startsWith('data: ')) {
                        const data = line.slice(6);
                        if (data === '[DONE]') return;

                        try {
                            const parsed = JSON.parse(data);
                            const content = parsed.choices[0]?.delta?.content;

                            if (content) {
                                assistantMessage.content += content;
                                setMessages(prev => {
                                    const newMessages = [...prev];
                                    newMessages[newMessages.length - 1] = assistantMessage;
                                    return newMessages;
                                });
                            }
                        } catch (e) {
                            // Ignore parse errors
                        }
                    }
                }
            }
        } catch (error) {
            console.error('Error:', error);
        } finally {
            setLoading(false);
            setInput('');
        }
    };

    return (
        <div>
            {messages.map((msg, i) => (
                <div key={i}>{msg.content}</div>
            ))}
            <input
                value={input}
                onChange={(e) => setInput(e.target.value)}
                onKeyPress={(e) => e.key === 'Enter' && sendMessage()}
                placeholder="Type a message..."
                disabled={loading}
            />
        </div>
    );
}

Python FastAPI Integration

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse
import httpx

app = FastAPI(title="API3 Proxy")

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

class API3Client:
    def __init__(self):
        self.base_url = "https://api3.exploit.bot/v1"
        self.api_key = "eric"

    async def chat_completion(self, request_data: dict):
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.base_url}/chat/completions",
                headers={
                    "Authorization": f"Bearer {self.api_key}",
                    "Content-Type": "application/json"
                },
                json=request_data,
                timeout=60.0
            )

            if response.status_code != 200:
                raise HTTPException(
                    status_code=response.status_code,
                    detail=response.json()
                )

            return response.json()

    async def stream_chat(self, request_data: dict):
        """Proxy the upstream SSE stream line by line."""
        async with httpx.AsyncClient() as client:
            async with client.stream(
                "POST",
                f"{self.base_url}/chat/completions",
                headers={
                    "Authorization": f"Bearer {self.api_key}",
                    "Content-Type": "application/json"
                },
                json=request_data,
                timeout=60.0
            ) as response:
                async for line in response.aiter_lines():
                    if line:
                        yield line + "\n\n"

api3 = API3Client()

@app.post("/chat/completions")
async def chat_completions(request: dict):
    try:
        # Add default model if not specified
        if "model" not in request:
            request["model"] = "gpt-5.1"

        # Handle streaming via FastAPI's StreamingResponse
        if request.get("stream", False):
            return StreamingResponse(
                api3.stream_chat(request),
                media_type="text/event-stream"
            )

        return await api3.chat_completion(request)

    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


# Add authentication, logging, monitoring, etc.
@app.middleware("http")
async def add_process_time_header(request, call_next):
    import time
    start_time = time.time()
    response = await call_next(request)
    process_time = time.time() - start_time
    response.headers["X-Process-Time"] = str(process_time)
    return response

Spring Boot Integration

// API3Service.java
@Service
public class API3Service {
    private final RestTemplate restTemplate;
    private final String BASE_URL = "https://api3.exploit.bot/v1";
    private final String API_KEY = "eric";

    public API3Service(RestTemplateBuilder builder) {
        this.restTemplate = builder
            .defaultHeader("Authorization", "Bearer " + API_KEY)
            .defaultHeader("Content-Type", "application/json")
            .build();
    }

    public ChatResponse chatCompletion(ChatRequest request) {
        if (request.getModel() == null) {
            request.setModel("gpt-5.1");
        }

        try {
            return restTemplate.postForObject(
                BASE_URL + "/chat/completions",
                request,
                ChatResponse.class
            );
        } catch (HttpClientErrorException e) {
            throw new API3Exception(e.getResponseBodyAsString());
        }
    }

    public Flux<String> streamChat(ChatRequest request) {
        if (request.getModel() == null) {
            request.setModel("gpt-5.1");
        }

        request.setStream(true);

        return WebClient.builder()
            .baseUrl(BASE_URL)
            .defaultHeader("Authorization", "Bearer " + API_KEY)
            .build()
            .post()
            .uri("/chat/completions")
            .bodyValue(request)
            .retrieve()
            .bodyToFlux(String.class)
            .filter(line -> line.startsWith("data: "))
            .map(line -> line.substring(6))
            .takeWhile(data -> !data.equals("[DONE]"));
    }
}

// ChatController.java
@RestController
@RequestMapping("/api/chat")
@CrossOrigin(origins = "*")
public class ChatController {
    private final API3Service api3Service;

    public ChatController(API3Service api3Service) {
        this.api3Service = api3Service;
    }

    @PostMapping
    public ResponseEntity<ChatResponse> chat(@RequestBody ChatRequest request) {
        try {
            ChatResponse response = api3Service.chatCompletion(request);
            return ResponseEntity.ok(response);
        } catch (API3Exception e) {
            return ResponseEntity.status(HttpStatus.BAD_REQUEST).build();
        }
    }

    @PostMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent<String>> streamChat(@RequestBody ChatRequest request) {
        return api3Service.streamChat(request)
            .map(data -> ServerSentEvent.builder(data).build());
    }
}

Docker Integration

# Dockerfile
FROM node:18-alpine

WORKDIR /app

COPY package*.json ./
RUN npm ci --omit=dev

COPY . .

EXPOSE 3000

CMD ["npm", "start"]

# docker-compose.yml
version: '3.8'

services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - API3_API_KEY=eric
      - API3_BASE_URL=https://api3.exploit.bot/v1
      - NODE_ENV=production
    depends_on:
      - redis
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - app
    restart: unless-stopped

volumes:
  redis_data:

12. SDK Compatibility

OpenAI SDK Compatibility

API3 is compatible with OpenAI's official SDKs. Simply change the base URL and API key:

Python SDK

# Full compatibility with version 1.0+
from openai import OpenAI

client = OpenAI(
    api_key="eric",
    base_url="https://api3.exploit.bot/v1"
)

# All standard methods work:
completion = client.chat.completions.create(
    model="gpt-5.1",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Streaming
stream = client.chat.completions.create(
    model="gpt-5.1",
    messages=[{"role": "user", "content": "Stream me"}],
    stream=True
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

# Function calling
response = client.chat.completions.create(
    model="gpt-5.1",
    messages=[{"role": "user", "content": "Use tools"}],
    tools=[...],
    tool_choice="auto"
)
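
The tools list above is left elided; here is one concrete sketch using the standard OpenAI tool schema. The get_weather tool is purely illustrative, not an API3 built-in:

# Hypothetical tool definition, for illustration only
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
        }
    }
}]

response = client.chat.completions.create(
    model="gpt-5.1",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)

# If the model decided to call the tool, the call appears on the message
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)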

JavaScript/TypeScript SDK

import OpenAI from 'openai';

const openai = new OpenAI({
    apiKey: 'eric',
    baseURL: 'https://api3.exploit.bot/v1',
    dangerouslyAllowBrowser: true // For browser usage
});

// Compatible with all standard methods
const completion = await openai.chat.completions.create({
    model: 'gpt-5.1',
    messages: [{ role: 'user', content: 'Hello!' }]
});

// Streaming
const stream = await openai.chat.completions.create({
    model: 'gpt-5.1',
    messages: [{ role: 'user', content: 'Stream please' }],
    stream: true
});

for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Compatibility Notes

Important Differences from OpenAI API

  • Authentication: Uses "eric" as default test key instead of sk-*
  • Model Selection: Different model names (gpt-5.1, codex-cli, etc.)
  • Extended Features: x_codex options provide additional functionality (see the sketch after this list)
  • Reasoning Levels: Built-in reasoning effort control for codex models
  • Alternative Endpoint: /v1/responses provides Anthropic-style API
  • Sandbox Options: Additional security modes available
  • Usage Tracking: Token counts may not be accurately tracked
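
The OpenAI SDKs have no named x_codex argument, but the Python SDK's extra_body parameter merges arbitrary extra fields into the request JSON, which is one way to pass it. A minimal sketch:

from openai import OpenAI

client = OpenAI(api_key="eric", base_url="https://api3.exploit.bot/v1")

# extra_body fields are merged into the outgoing request payload,
# letting us attach the API3-specific x_codex options
completion = client.chat.completions.create(
    model="gpt-5.1-codex",
    messages=[{"role": "user", "content": "Refactor this loop"}],
    extra_body={"x_codex": {"reasoning_effort": "high"}}
)
print(completion.choices[0].message.content)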

Migration Guide

From OpenAI API

// Old OpenAI code:
// import OpenAI from 'openai';
// const openai = new OpenAI({ apiKey: 'sk-...' });

// Migrated to API3 - just change the baseURL and API key!
import OpenAI from 'openai';

const openai = new OpenAI({
    baseURL: 'https://api3.exploit.bot/v1',
    apiKey: 'eric'
});

// Everything else stays the same!
const response = await openai.chat.completions.create({
    model: 'gpt-5.1',  // Use available model
    messages: messages
});

From Anthropic API

// Use the /v1/responses endpoint for Anthropic-style requests
const response = await fetch('https://api3.exploit.bot/v1/responses', {
    method: 'POST',
    headers: {
        'Authorization': 'Bearer eric',
        'Content-Type': 'application/json'
    },
    body: JSON.stringify({
        model: 'gpt-5.1',
        input: [
            { role: 'user', content: 'Hello!' }
        ]
    })
});

Third-Party Libraries Support

Most libraries that support custom base URLs will work with API3:

LangChain

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-5.1",
    openai_api_base="https://api3.exploit.bot/v1",
    openai_api_key="eric"
)

# Chat models take messages, not bare strings; predict() wraps a single prompt
result = llm.predict("What is RAG?")
print(result)

LlamaIndex

from llama_index import LLMPredictor, ServiceContext
from langchain.chat_models import ChatOpenAI

llm_predictor = LLMPredictor(
    llm=ChatOpenAI(
        model="gpt-5.1",
        openai_api_base="https://api3.exploit.bot/v1",
        openai_api_key="eric"
    )
)

service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

Auto-GPT

# In .env file
OPENAI_API_BASE=https://api3.exploit.bot/v1
OPENAI_API_KEY=eric
OPENAI_API_MODEL=gpt-5.1