Conversation

@1fanwang
Contributor

@1fanwang 1fanwang commented Dec 3, 2025

closes #1282

Testing Done

  1. Precommit Checks
check for case conflicts.................................................Passed
check for merge conflicts................................................Passed
check for broken symlinks............................(no files to check)Skipped
detect private key.......................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
maturin develop..........................................................Passed
generate CLI documentation...........................(no files to check)Skipped
cargo fmt................................................................Passed
cargo test...............................................................Passed
mypy type check..........................................................Passed
ruff format..............................................................Passed
pytest...................................................................Passed
2. End-to-End Testing with Azure OpenAI

Test Setup

  • Azure OpenAI endpoint: https://cocoindex.openai.azure.com
  • Deployment: gpt-4o
  • API version: 2024-08-01-preview
  • Test data: 2 text files with geography questions

Flow Execution
$ cocoindex update main
AzureOpenAIE2ETest.questions (batch update): 2/2 source rows: (+) 2 added

Verification - Structured Data Extracted by Azure OpenAI
SELECT filename, question, capital, country
FROM azure_openai_test_answers
ORDER BY filename;

Results:

   filename    |            question            | capital | country 
---------------+--------------------------------+---------+---------
 question1.txt | What is the capital of France? | Paris   | France
 question2.txt | What is the capital of Japan?  | Tokyo   | Japan
(2 rows)

Azure OpenAI successfully:

  • Processed 2 documents through CocoIndex
  • Extracted structured data using ExtractByLlm
  • Returned correct answers (Paris/France, Tokyo/Japan)
  • Stored results in Postgres database

Member

@georgeh0 georgeh0 left a comment


Thanks for creating this PR!

Member

Just wondering: is it possible to reuse crate::llm::openai::Client and just wrap it with separate initialization logic? (We're doing this for a bunch of other clients, like LiteLLM, OpenRouter, vLLM, etc.)

@1fanwang 1fanwang changed the title Support Azure OpenAI as LLM provider feat: Support Azure OpenAI as LLM provider Dec 3, 2025
@1fanwang 1fanwang force-pushed the 1fanwang/support-azure-openai branch from 863ab8d to b91c846 Compare December 3, 2025 23:08
@1fanwang
Contributor Author

1fanwang commented Dec 3, 2025

Thanks for creating this PR!

Apologies, I must have messed up while squashing a few commits; now it's all squashed into the initial commit instead of showing up separately as a commit addressing the comments.

Comment on lines +47 to +49
Ok(Self {
client: OpenAIClient::with_config(azure_config),
})
Member

Would it be possible to directly use super::openai::Client::from_parts(OpenAIClient::with_config(azure_config)) here? Then we wouldn't need to define another Client struct and implement LlmGenerationClient and LlmEmbeddingClient below.

Just similar to LiteLLM.

Contributor Author

Yeah, I actually attempted this approach, but hit a type mismatch error.

The openai::Client::from_parts() function expects:
pub(crate) fn from_parts(client: async_openai::Client<OpenAIConfig>) -> Self
But we're trying to pass:
async_openai::Client<AzureConfig>
The async_openai::Client is generic over the config type C, but AzureConfig and OpenAIConfig are distinct types.
LiteLLM works because it uses OpenAIConfig with a different base URL, whereas Azure OpenAI needs its own Client struct with AzureConfig.

https://docs.rs/async-openai/latest/async_openai/config/struct.AzureConfig.html
https://docs.rs/async-openai/latest/async_openai/config/struct.OpenAIConfig.html

  --> rust/cocoindex/src/llm/azureopenai.rs:33:57
   |
33 |         Ok(Client::from_parts(OpenAIClient::with_config(azure_config)))
   |                               ------------------------- ^^^^^^^^^^^^ expected `OpenAIConfig`, found `AzureConfig`
   |                               |
   |                               arguments to this function are incorrect
   |
note: associated function defined here
  --> /Users/stewang/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/async-openai-0.30.1/src/client.rs:51:12
   |
51 |     pub fn with_config(config: C) -> Self {
   |            ^^^^^^^^^^^

For more information about this error, try `rustc --explain E0308`.
error: could not compile `cocoindex` (lib) due to 1 previous error
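The mismatch can be illustrated with plain-Rust stand-ins (the `Client`, `from_parts`, and config types below are simplified hypothetical types, not the real async_openai or cocoindex API): a generic `Client<C>` monomorphizes to a distinct concrete type per config, so a `from_parts` pinned to `Client<OpenAIConfig>` cannot accept a `Client<AzureConfig>`. One possible way out, sketched here, is to make the wrapper generic over the config type as well.

```rust
// Hypothetical stand-ins for async_openai's generic client; these are
// simplified illustrations, not the real crate API.
struct OpenAIConfig;
struct AzureConfig;

// The client is generic over its config type, so Client<OpenAIConfig>
// and Client<AzureConfig> are two distinct concrete types.
struct Client<C> {
    #[allow(dead_code)]
    config: C,
}

impl<C> Client<C> {
    fn with_config(config: C) -> Self {
        Client { config }
    }
}

// A from_parts pinned to one concrete config rejects the other:
fn from_parts(_client: Client<OpenAIConfig>) -> &'static str {
    "openai"
}

// A sketch of one alternative (not what the PR does): make the wrapper
// generic over the config type as well.
fn from_parts_generic<C>(_client: Client<C>, label: &'static str) -> &'static str {
    label
}

fn main() {
    // from_parts(Client::with_config(AzureConfig));
    // ^ error[E0308]: expected `OpenAIConfig`, found `AzureConfig`
    println!("{}", from_parts(Client::with_config(OpenAIConfig)));
    println!("{}", from_parts_generic(Client::with_config(AzureConfig), "azure"));
}
```

This mirrors why LiteLLM's trick of reusing OpenAIConfig with a different base URL fits from_parts, while AzureConfig does not.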

Member

I see. They have two different type parameters. Thanks for trying it out!

We can merge this PR now. I can do some refactoring for openai.rs later.

@georgeh0
Member

georgeh0 commented Dec 4, 2025

Thanks for creating this PR!

Apologies, I must have messed up while squashing a few commits; now it's all squashed into the initial commit instead of showing up separately as a commit addressing the comments.

No worries!

@georgeh0 georgeh0 marked this pull request as ready for review December 4, 2025 04:48
@badmonster0
Member

Great PR, thanks a lot @1fanwang for the contribution!

@1fanwang
Contributor Author

1fanwang commented Dec 4, 2025

Updated the PR with 28b04d8 and added end-to-end testing results to the PR description.

@1fanwang 1fanwang requested a review from georgeh0 December 4, 2025 08:59

@georgeh0 georgeh0 merged commit 96ea4f0 into cocoindex-io:main Dec 4, 2025
9 checks passed
@badmonster0
Member

Great feature, thank you so much @1fanwang Stephan! Welcome to the community and we are excited to learn from you down the road!


Development

Successfully merging this pull request may close these issues.

[FEATURE] Support Azure OpenAI as LLM provider