
API Reference

Complete API documentation for Jan Server services.

Available APIs

1. LLM API (Port 8080)

OpenAI-compatible API for chat completions, conversations, and models.

What it does:

  • Generate AI responses to user messages
  • Manage conversations and chat history
  • Organize conversations in projects
  • List available AI models
  • Handle user authentication
  • Support images via jan_* IDs
  • Generate images from text prompts

Documentation:

2. Response API (Port 8082)

Executes tools and generates AI responses for complex tasks.

What it does:

  • Run multiple tools in sequence (up to 8 steps)
  • Chain tool outputs together
  • Generate final answers using LLM
  • Track execution time and status
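A minimal request sketch follows; the /v1/responses path and the input and max_steps fields are illustrative assumptions rather than the confirmed contract, so consult the Response API reference below:

# Hypothetical sketch: the endpoint path and body fields are assumptions
curl -X POST http://localhost:8082/v1/responses \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -d '{
 "model": "jan-v1-4b",
 "input": "Find recent AI news and summarize it",
 "max_steps": 8
 }'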

Documentation:

3. Media API (Port 8285)

Handles image uploads and storage.

What it does:

  • Upload images from URLs or base64 data
  • Store images in S3 cloud storage
  • Generate jan_* IDs for images
  • Create temporary download links
  • Prevent duplicate uploads
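A hedged upload sketch follows; the path and field names below (/v1/media, url) are assumptions for illustration, so check the Media API reference for the actual request shape:

# Hypothetical sketch: the path and request fields are assumptions
curl -X POST http://localhost:8285/v1/media \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -d '{"url": "https://example.com/cat.png"}'

# The service responds with a jan_* ID that chat requests can then reference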

Documentation:

4. MCP Tools API (Port 8091)

Provides Model Context Protocol tools for search, scraping, lightweight vector search, and sandboxed execution.

Available Tools:

  • google_search - Serper/SearXNG-backed web search with filters and location hints
  • scrape - Fetch and parse a web page (optional Markdown output)
  • file_search_index / file_search_query - Index custom text into the bundled vector store and run similarity queries
  • python_exec - Run trusted code via SandboxFusion, returning stdout/stderr/artifacts
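For example, a scrape call uses the same JSON-RPC envelope shown in the quick examples later in this page; the url argument name is an assumption, so check the MCP Tools reference for the exact input schema:

curl -X POST http://localhost:8000/v1/mcp \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -d '{
 "jsonrpc": "2.0",
 "method": "tools/call",
 "params": {
 "name": "scrape",
 "arguments": {"url": "https://example.com"}
 }
 }'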

Documentation:

API Guides

Quick Reference

Base URLs

Environment | LLM API | Response API | Media API | MCP Tools | Gateway
----------- | ------- | ------------ | --------- | --------- | -------
Local | http://localhost:8080 | http://localhost:8082 | http://localhost:8285 | http://localhost:8091 | http://localhost:8000
Docker | http://llm-api:8080 | http://response-api:8082 | http://media-api:8285 | http://mcp-tools:8091 | http://kong:8000

Recommended: Point all public clients at the Kong gateway (port 8000) so authentication, rate limiting, and routing stay consistent. Direct service ports remain available for internal tests, but they still require a valid Keycloak JWT (API keys are validated only by Kong).

Authentication

All API endpoints require authentication. The Kong gateway (port 8000) validates your credentials and forwards requests to backend services.

Authentication Methods

1. Bearer Token (Recommended for Development)

Get a guest token from Keycloak and use it in the Authorization header:

# Request a guest token
curl -X POST http://localhost:8000/llm/auth/guest-login

# Response
{
 "access_token": "eyJhbGci...",
 "refresh_token": "eyJhbGci...",
 "expires_in": 300,
 "token_type": "Bearer"
}

# Use the token in requests
curl -H "Authorization: Bearer eyJhbGci..." \
 http://localhost:8000/v1/chat/completions
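A minimal sketch that chains the two steps together, assuming jq is installed:

# Grab a guest token and use it in the next request
TOKEN=$(curl -s -X POST http://localhost:8000/llm/auth/guest-login | jq -r .access_token)

curl -H "Authorization: Bearer $TOKEN" \
 http://localhost:8000/v1/models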

2. API Key (For Production Clients)

Use the X-API-Key header with your API key:

curl -H "X-API-Key: sk_your_api_key_here" \
 http://localhost:8000/v1/chat/completions

Token Management

Refresh Tokens:

curl -X POST http://localhost:8000/llm/auth/refresh \
 -H "Content-Type: application/json" \
 -d '{"refresh_token": "eyJhbGci..."}'

Revoke Tokens:

curl -X POST http://localhost:8000/llm/auth/revoke \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -d '{"token": "eyJhbGci..."}'

Direct Service Access

When calling services directly (ports 8080/8082/8285/8091) instead of through Kong:

  • You still need a valid Keycloak JWT
  • Use the same Authorization: Bearer <token> header
  • API key authentication is NOT available (Kong-only feature)

Example direct call:

# Still requires JWT token from Keycloak
curl -H "Authorization: Bearer <token>" \
 http://localhost:8080/v1/chat/completions

Authentication Flow

  1. Client requests guest login or uses API key
  2. Kong validates credentials (JWT signature + expiry, or API key lookup)
  3. Kong forwards request to backend service with JWT in header
  4. Backend service validates JWT signature and claims
  5. Request is processed and response returned

Best Practice: Always use the Kong gateway (port 8000) for client applications. Direct service ports are for internal communication and debugging only.

Quick Examples

Chat Completion

curl -X POST http://localhost:8000/v1/chat/completions \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -d '{
 "model": "jan-v1-4b",
 "messages": [
 {"role": "user", "content": "Hello!"}
 ]
 }'

Google Search (MCP)

curl -X POST http://localhost:8000/v1/mcp \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -d '{
 "jsonrpc": "2.0",
 "method": "tools/call",
 "params": {
 "name": "google_search",
 "arguments": {"q": "AI news"}
 }
 }'

Calling MCP Tools directly (e.g., http://localhost:8091/v1/mcp) is supported for internal testing and still requires a valid Keycloak JWT; API keys are only accepted when the request goes through the Kong gateway.

List Models

curl -H "Authorization: Bearer <token>" \
 http://localhost:8000/v1/models

API Conventions

Response Format

All successful responses return JSON:

{
 "data": {...},
 "meta": {...}
}

Error Format

All errors follow this structure:

{
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "message": "Parameter 'model' is required",
    "param": "model",
    "request_id": "req_123xyz"
  }
}

Error Types

Type | Description | HTTP Status
---- | ----------- | -----------
invalid_request_error | Invalid request parameters | 400
auth_error | Authentication failed | 401
permission_error | Insufficient permissions | 403
not_found_error | Resource not found | 404
rate_limit_error | Too many requests | 429
internal_error | Server error | 500

Headers

Request Headers:

  • Authorization: Bearer <token> - Required for authenticated endpoints
  • Content-Type: application/json - For POST/PUT requests
  • Idempotency-Key: <uuid> - Optional, for idempotent POST requests
  • X-Request-Id: <uuid> - Optional, for request tracing

Response Headers:

  • X-Request-Id - Request identifier for tracing
  • X-Auth-Method - Authentication method used (jwt or api_key)
  • Content-Type: application/json - JSON response
  • Content-Type: text/event-stream - SSE streaming response
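A sketch showing the optional request headers and inspecting the response headers with curl -i (the UUID values are placeholders):

curl -i -X POST http://localhost:8000/v1/chat/completions \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -H "Idempotency-Key: 11111111-2222-3333-4444-555555555555" \
 -H "X-Request-Id: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" \
 -d '{"model":"jan-v1-4b","messages":[{"role":"user","content":"Hello!"}]}'

# -i prints the response headers, including X-Request-Id and X-Auth-Method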

Pagination

List endpoints support pagination:

curl "http://localhost:8000/v1/conversations?limit=10&after=conv_123"

Response:

{
 "data": [...],
 "next_after": "conv_456"
}
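A sketch of paging through all conversations with jq, assuming each item carries an id field and that an empty or missing next_after marks the last page:

AFTER=""
while :; do
 PAGE=$(curl -s -H "Authorization: Bearer <token>" \
  "http://localhost:8000/v1/conversations?limit=10&after=$AFTER")
 echo "$PAGE" | jq -r '.data[].id'   # print each conversation ID on this page
 AFTER=$(echo "$PAGE" | jq -r '.next_after // empty')
 [ -z "$AFTER" ] && break
done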

Streaming

Chat completions support Server-Sent Events (SSE) streaming:

curl -X POST http://localhost:8000/v1/chat/completions \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -d '{"model":"jan-v1-4b","messages":[...],"stream":true}'

Response:

data: {"id":"chat-123","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chat-123","choices":[{"delta":{"content":"!"}}]}

data: [DONE]
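A small sketch for consuming the stream from the shell; curl's -N flag disables output buffering, and the loop strips the data: prefix and stops at [DONE]:

curl -N -s -X POST http://localhost:8000/v1/chat/completions \
 -H "Authorization: Bearer <token>" \
 -H "Content-Type: application/json" \
 -d '{"model":"jan-v1-4b","messages":[{"role":"user","content":"Hello!"}],"stream":true}' \
| while read -r line; do
 data="${line#data: }"
 [ "$data" = "[DONE]" ] && break
 # Each remaining line is a JSON chunk; print the delta text if present
 [ -n "$data" ] && echo "$data" | jq -r '.choices[0].delta.content // empty'
done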

Interactive API Documentation

Access the interactive Swagger UI:

Local: http://localhost:8000/api/swagger/index.html

Try API calls directly from your browser with built-in authentication.

SDK & Client Libraries

Official SDKs

Official SDKs are coming soon. In the meantime, use OpenAI-compatible clients with the Jan Server base URL.

Community SDKs

Contributions welcome! Jan Server is OpenAI-compatible, so most OpenAI client libraries work with minor configuration changes.

JavaScript Example (OpenAI SDK)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8000/v1",
  apiKey: "your_guest_token_here",
});

const response = await client.chat.completions.create({
  model: "jan-v1-4b",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);

Rate Limits

Currently, Jan Server does not enforce rate limits in development mode.

Production deployments should configure rate limiting via Kong Gateway.

API Versioning

All APIs are versioned using URL path versioning:

  • Current version: /v1/
  • Future versions will be: /v2/, /v3/, etc.

Breaking changes will only occur in new major versions.

Support


Explore APIs: LLM API -> | MCP Tools -> | Interactive Docs: Swagger UI ->