OpenAI Compatible API

TokenCode is fully compatible with the OpenAI API protocol. You can use the OpenAI SDK or any OpenAI-compatible client to connect directly.

Base URL

text
https://tokencode.dev/v1

Authentication

All endpoints require authentication. Send your API key in either of the following headers:

  • Bearer Token: Authorization: Bearer <your-api-key>
  • API Key Header: x-api-key: <your-api-key>
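As a minimal sketch of the two header styles listed above, the helper below builds the header dictionary for either scheme. The function name `auth_headers` and the `scheme` argument are hypothetical names chosen for illustration, not part of the TokenCode API.

```python
def auth_headers(api_key: str, scheme: str = "bearer") -> dict[str, str]:
    """Build the header for one of the two documented authentication styles.

    `scheme` is a hypothetical selector for this sketch:
    "bearer" -> Authorization: Bearer <key>, "x-api-key" -> x-api-key: <key>.
    """
    if not api_key:
        raise ValueError("api_key must not be empty")
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "x-api-key":
        return {"x-api-key": api_key}
    raise ValueError(f"unknown scheme: {scheme!r}")


bearer = auth_headers("sk-your-api-key")
key_header = auth_headers("sk-your-api-key", scheme="x-api-key")
```

Either dictionary can be merged into the headers of any HTTP client request.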

List Models

Returns the list of currently available models.

bash
curl https://tokencode.dev/v1/models \
  -H "Authorization: Bearer sk-your-api-key"

Response example:

json
{
  "object": "list",
  "data": [
    { "id": "gpt-5.5", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "owned_by": "anthropic" },
    { "id": "gemini-2.5-pro", "object": "model", "owned_by": "google" }
  ]
}
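A response of this shape can be reduced to a lookup of model IDs per provider. The sketch below assumes the body matches the example above; `models_by_owner` is a hypothetical helper name, and the fallback owner "unknown" is an assumption for entries that lack `owned_by`.

```python
import json
from collections import defaultdict


def models_by_owner(body: str) -> dict[str, list[str]]:
    """Group model IDs from a /v1/models response body by their owned_by field."""
    payload = json.loads(body)
    grouped: dict[str, list[str]] = defaultdict(list)
    for model in payload.get("data", []):
        # "unknown" is a fallback chosen for this sketch only.
        grouped[model.get("owned_by", "unknown")].append(model["id"])
    return dict(grouped)


sample_body = """
{
  "object": "list",
  "data": [
    { "id": "gpt-5.5", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-6", "object": "model", "owned_by": "anthropic" },
    { "id": "gemini-2.5-pro", "object": "model", "owned_by": "google" }
  ]
}
"""
grouped_models = models_by_owner(sample_body)
```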

Chat Completions

OpenAI-compatible chat completion endpoint. Supports all upstream models, not just OpenAI models.

bash
curl https://tokencode.dev/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Request parameters:

Parameter          Type           Required  Description
model              string         Yes       Model ID (e.g., gpt-5.5, claude-sonnet-4-6)
messages           array          Yes       Array of message objects
stream             boolean        No        Enable streaming response
temperature        number         No        Sampling temperature (0-2)
max_tokens         integer        No        Maximum tokens to generate
top_p              number         No        Nucleus sampling probability
n                  integer        No        Number of completion candidates
stop               string/array   No        Stop sequences
presence_penalty   number         No        Presence penalty (-2 to 2)
frequency_penalty  number         No        Frequency penalty (-2 to 2)
tools              array          No        Function Calling tool definitions
tool_choice        string/object  No        Tool calling strategy
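The parameter list above can be checked on the client before a request is sent. The following is a sketch only: `build_chat_request` is a hypothetical helper, and treating the documented ranges as inclusive is an assumption; the server remains the authority on what it accepts.

```python
from typing import Any

# Optional parameters taken from the request parameter list above.
OPTIONAL_PARAMS = {
    "stream", "temperature", "max_tokens", "top_p", "n", "stop",
    "presence_penalty", "frequency_penalty", "tools", "tool_choice",
}

# Documented numeric ranges; inclusive bounds are an assumption of this sketch.
RANGES = {
    "temperature": (0, 2),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
}


def build_chat_request(model: str, messages: list[dict[str, Any]], **options: Any) -> dict[str, Any]:
    """Return a chat completions request body after basic client-side checks."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    for name, (low, high) in RANGES.items():
        if name in options and not low <= options[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return {"model": model, "messages": messages, **options}


request_body = build_chat_request(
    "gpt-5.5",
    [{"role": "user", "content": "Explain quantum computing"}],
    temperature=0.7,
    max_tokens=1024,
)
```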

Response example:

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is a type of computing that leverages quantum mechanics..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 128,
    "total_tokens": 153
  }
}
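To show which fields a client typically reads from this response, here is a small sketch that pulls out the assistant text, the finish reason, and the token total. It assumes the response has already been decoded into a dictionary shaped like the example above; `summarize_completion` is a hypothetical helper name.

```python
from typing import Any


def summarize_completion(response: dict[str, Any]) -> dict[str, Any]:
    """Extract the first choice's text, its finish reason, and the token total."""
    first_choice = response["choices"][0]
    return {
        "text": first_choice["message"]["content"],
        "finish_reason": first_choice["finish_reason"],
        "total_tokens": response["usage"]["total_tokens"],
    }


sample_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-5.5",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Quantum computing is a type of computing that leverages quantum mechanics...",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 25, "completion_tokens": 128, "total_tokens": 153},
}
summary = summarize_completion(sample_response)
```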

Streaming Response

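When "stream": true is set, the response is delivered as Server-Sent Events (SSE). Each event carries a chat.completion.chunk object, and the stream ends with data: [DONE]: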

text
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Qu"},"index":0}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"an"},"index":0}]}

data: [DONE]
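A client can rebuild the full message by concatenating the `delta.content` fragments until it sees `[DONE]`. The sketch below works on an iterable of already-decoded text lines, so it is independent of any HTTP library; `iter_deltas` is a hypothetical helper name, and the sample lines are copied from the example above.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from SSE lines, stopping at the [DONE] marker."""
    for raw_line in lines:
        line = raw_line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


sample_stream = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Qu"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"an"},"index":0}]}',
    "",
    "data: [DONE]",
]
streamed_text = "".join(iter_deltas(sample_stream))
```

Because the function is a generator, fragments can be printed as they arrive instead of being joined at the end.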

Embeddings

Generates vector embeddings for the input text.

bash
curl https://tokencode.dev/v1/embeddings \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello world"
  }'

Request parameters:

Parameter        Type          Required  Description
model            string        Yes       Embedding model ID
input            string/array  Yes       Text to embed
encoding_format  string        No        Encoding format (float or base64)
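For clients that do not use an SDK, the same request can be assembled with Python's standard library. This is a sketch under the parameter list above: `build_embeddings_request` is a hypothetical helper, and the request object is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Optional, Union


def build_embeddings_request(
    api_key: str,
    model: str,
    text: Union[str, list[str]],
    encoding_format: Optional[str] = None,
) -> urllib.request.Request:
    """Construct (but do not send) a POST request for the embeddings endpoint."""
    body: dict = {"model": model, "input": text}
    if encoding_format is not None:
        if encoding_format not in ("float", "base64"):
            raise ValueError("encoding_format must be 'float' or 'base64'")
        body["encoding_format"] = encoding_format
    return urllib.request.Request(
        "https://tokencode.dev/v1/embeddings",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


embeddings_request = build_embeddings_request(
    "sk-your-api-key", "text-embedding-3-small", "Hello world"
)
# To send it: urllib.request.urlopen(embeddings_request)
```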

Automatic Protocol Conversion

One of TokenCode's core advantages is automatic protocol conversion. When you call a non-OpenAI model (such as Claude or Gemini) through the OpenAI-compatible endpoint, the platform automatically:

  1. Request conversion: Converts the OpenAI-format request body to the target model's native format
  2. Response conversion: Converts the target model's native response to OpenAI format
  3. Streaming adaptation: Converts streaming response chunks to the OpenAI streaming format

This means you can use the same OpenAI SDK code to call all models without worrying about underlying protocol differences.
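To give a rough idea of what request conversion (step 1) can involve, the sketch below maps an OpenAI-format body to an Anthropic-style one, where the system prompt is a top-level field rather than a message. This is an illustration only and not TokenCode's actual implementation: the function name, the default for `max_tokens`, and the choice to copy only these fields are all assumptions.

```python
from typing import Any


def to_anthropic_style(openai_body: dict[str, Any], default_max_tokens: int = 1024) -> dict[str, Any]:
    """Illustrative conversion of an OpenAI-format chat request (hypothetical)."""
    messages = openai_body["messages"]
    # System messages move out of the message list into a top-level field.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    turns = [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m["role"] != "system"
    ]
    converted: dict[str, Any] = {
        "model": openai_body["model"],
        "messages": turns,
        # The default value here is an assumption made for this sketch.
        "max_tokens": openai_body.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        converted["system"] = "\n".join(system_parts)
    return converted


converted_body = to_anthropic_style({
    "model": "claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing"},
    ],
    "max_tokens": 1024,
})
```

A real gateway would also need to handle tools, stop sequences, and differing parameter ranges, which this sketch leaves out.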

Using the OpenAI SDK

python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://tokencode.dev/v1"
)

# Call an OpenAI model
response = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Hello"}]
)

# Call a Claude model — same code, automatic protocol conversion
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}]
)

# Call a Gemini model — same approach
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Hello"}]
)

Error Responses

All errors follow the OpenAI error format:

json
{
  "error": {
    "message": "Invalid API key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

HTTP Status Code  Meaning
400               Invalid request parameters
401               Authentication failed: invalid API key
403               Insufficient permissions: model unavailable
429               Rate limit or insufficient balance
500               Internal server error
502               Upstream service error
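Client code usually needs to decide whether a failed call is worth retrying. The sketch below parses the error body and flags 429, 500, and 502 as retryable; that classification is an assumption of this example and not something the platform specifies, and `parse_error` is a hypothetical helper name.

```python
import json
from typing import Any

# Treating these statuses as retryable is an assumption of this sketch.
RETRYABLE_STATUSES = {429, 500, 502}


def parse_error(status: int, body: str) -> dict[str, Any]:
    """Summarize an error response; tolerate bodies that are not valid JSON."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    error = payload.get("error", {}) if isinstance(payload, dict) else {}
    if not isinstance(error, dict):
        error = {}
    return {
        "status": status,
        "code": error.get("code"),
        "message": error.get("message"),
        "retryable": status in RETRYABLE_STATUSES,
    }


auth_failure = parse_error(
    401,
    '{"error": {"message": "Invalid API key", "type": "invalid_request_error", "code": "invalid_api_key"}}',
)
```

Note that a 429 caused by insufficient balance will not succeed on retry, so a caller may want to inspect the message before backing off and retrying.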