Skip to content

feat: add Avian as LLM service provider#1015

Open
avianion wants to merge 2 commits intodatalab-to:masterfrom
avianion:feat/add-avian-service
Open

feat: add Avian as LLM service provider#1015
avianion wants to merge 2 commits intodatalab-to:masterfrom
avianion:feat/add-avian-service

Conversation

@avianion
Copy link
Copy Markdown

@avianion avianion commented Apr 4, 2026

Summary

  • Adds AvianService as a new LLM service option for the Avian inference API (https://api.avian.io/v1), which is OpenAI-compatible
  • Follows the same patterns as the existing OpenAIService and AzureOpenAIService implementations
  • Adds corresponding test in tests/services/test_service_init.py

Available Models

Model Context Window
DeepSeek V3.2 (deepseek-v3.2) 164K
Kimi K2.5 (kimi-k2.5) 128K
GLM-5 (glm-5) 128K
MiniMax M2.5 (minimax-m2.5) 1M

Usage

marker_single input.pdf \
  --llm_service marker.services.avian.AvianService \
  --avian_api_key YOUR_API_KEY \
  --avian_model deepseek-v3.2

Configuration

Option Description Default
avian_api_key API key for the Avian service (required)
avian_model Model to use deepseek-v3.2
avian_image_format Image format for multimodal requests png

Test plan

  • Added test_llm_avian to tests/services/test_service_init.py verifying service initialization
  • Verify structured output (JSON schema) works with Avian's OpenAI-compatible endpoint
  • Verify image processing works for multimodal PDF pages

Add AvianService for the Avian inference API (https://api.avian.io/v1),
an OpenAI-compatible endpoint offering DeepSeek V3.2, Kimi K2.5, GLM-5,
and MiniMax M2.5 models.

Usage:
  marker_single input.pdf --llm_service marker.services.avian.AvianService --avian_api_key <key>

Configuration options:
  - avian_api_key: API key (required)
  - avian_model: model name (default: deepseek-v3.2)
  - avian_image_format: image format (default: png)
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@avianion
Copy link
Copy Markdown
Author

avianion commented Apr 4, 2026

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Apr 4, 2026
@avianion
Copy link
Copy Markdown
Author

Addressed feedback: corrected minimax-m2.5 token limits in docstring — changed from "1M context" to "200K context, 16K max output" to accurately reflect MiniMax M2.5 (MiniMax-Text-01) capabilities.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant