Add gateway to Known Model Names #3593
there's a flaky outlines test breaking CI :(
openai_names = [f'openai:{n}' for n in get_model_names(OpenAIModelName)]
bedrock_names = [f'bedrock:{n}' for n in get_model_names(BedrockModelName)]
deepseek_names = ['deepseek:deepseek-chat', 'deepseek:deepseek-reasoner']
gateway_names = [
I realize you couldn't see this comment in a private Slack channel, but I responded to Samuel (and he agreed):
Agreed, but if we add just a few per provider, people are definitely gonna ask us “but why not this other one?”, so maybe just include them all?
So we should NOT hard-code this, but dynamically build this based on the known model names of the providers that are known to work with gateway/{provider}
this is now done
…/dsfaccini/pydantic-ai into add-gateway-to-known-model-names
tests/models/test_model_names.py
Outdated
openai_names = [f'openai:{n}' for n in get_model_names(OpenAIModelName)]
bedrock_names = [f'bedrock:{n}' for n in get_model_names(BedrockModelName)]
deepseek_names = ['deepseek:deepseek-chat', 'deepseek:deepseek-reasoner']
gateway_names = (
Could we use the pydantic_ai.providers.gateway.UpstreamProvider type (possibly needs splitting up into 2 so that we don't include the API types here), and then use those names as keys into a dict of things like {'openai': OpenAIModelName}? That way, when we update it in the gateway file, this list will update too.
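A minimal sketch of that suggestion, assuming the provider type is split into two `Literal` aliases. The names `ModelProviders` and `ApiFlavors` and all values below are illustrative placeholders, not the actual definitions in `pydantic_ai.providers.gateway`:

```python
from typing import Literal, get_args

# Hypothetical split of UpstreamProvider: real providers vs. API flavors.
# The actual members in pydantic_ai may differ.
ModelProviders = Literal['openai', 'groq']
ApiFlavors = Literal['openai-chat']

# Placeholder model-name types standing in for OpenAIModelName, GroqModelName, etc.
OpenAIModelName = Literal['model-a', 'model-b']
GroqModelName = Literal['model-c']

# Keys are the provider names taken from the Literal, so adding a provider
# to ModelProviders without a mapping entry fails loudly with a KeyError.
provider_model_names = {
    'openai': OpenAIModelName,
    'groq': GroqModelName,
}

gateway_names = [
    f'gateway/{provider}:{name}'
    for provider in get_args(ModelProviders)
    for name in get_args(provider_model_names[provider])
]
```

Because the list iterates over `get_args(ModelProviders)`, the API flavors never end up in the known model names, and the list follows the gateway type as it changes.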
will check it out
# These are only API formats, but we still support them for convenience.
ApiFormatProviders = Literal[
ApiFormats? Or do we call them API Flavors now?
I just copied the comment as was, but yeah flavor's definitely correcter (correcter than the word correcter)
]
def gateway_provider_to_model_names() -> Mapping[ModelProviders, object]:
This shouldn't be a public method :)
from pydantic_ai.models.groq import GroqModelName
from pydantic_ai.models.openai import OpenAIModelName
return {
Let's move this mapping to test_model_names.py so that the get_model_names used there can also take a provider name.
Then [f'grok:{n}' for n in get_model_names(GrokModelName)] could become [f'grok:{n}' for n in get_model_names('grok')], or prefixed_model_names('grok'). And then if we have the complete dict, we can basically just do model_names = [prefixed_model_names(provider) for provider in provider_model_names.keys()] or something like that.
Basically clean up all this duplication:
anthropic_names = [f'anthropic:{n}' for n in get_model_names(AnthropicModelName)]
cohere_names = [f'cohere:{n}' for n in get_model_names(CohereModelName)]
google_names = [f'google-gla:{n}' for n in get_model_names(GoogleModelName)] + [
f'google-vertex:{n}' for n in get_model_names(GoogleModelName)
]
grok_names = [f'grok:{n}' for n in get_model_names(GrokModelName)]
groq_names = [f'groq:{n}' for n in get_model_names(GroqModelName)]
moonshotai_names = [f'moonshotai:{n}' for n in get_model_names(MoonshotAIModelName)]
mistral_names = [f'mistral:{n}' for n in get_model_names(MistralModelName)]
openai_names = [f'openai:{n}' for n in get_model_names(OpenAIModelName)]
bedrock_names = [f'bedrock:{n}' for n in get_model_names(BedrockModelName)]
deepseek_names = ['deepseek:deepseek-chat', 'deepseek:deepseek-reasoner']
gateway_names = [
f'gateway/{provider}:{model_name}'
for provider, model_names in gateway_provider_to_model_names().items()
for model_name in get_model_names(model_names)
]
huggingface_names = [f'huggingface:{n}' for n in get_model_names(HuggingFaceModelName)]
By having just a dict of provider names to model name types (or functions, as in the case of get_heroku_model_names).
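One way the cleanup could look, as a rough sketch rather than the final implementation. The provider names, model names, and `get_gamma_model_names` are placeholders; `prefixed_model_names` and `provider_model_names` follow the names floated above, and this `get_model_names` is a simplified stand-in for the helper in test_model_names.py:

```python
from typing import Any, Literal, get_args, get_origin

# Placeholder model-name types standing in for AnthropicModelName, GroqModelName, etc.
AlphaModelName = Literal['alpha-small', 'alpha-large']
BetaModelName = Literal['beta-1']


def get_gamma_model_names() -> list[str]:
    """Placeholder for a function-based source such as get_heroku_model_names."""
    return ['gamma-x']


# One dict replaces the per-provider `*_names` variables.
provider_model_names: dict[str, Any] = {
    'alpha': AlphaModelName,
    'beta': BetaModelName,
    'gamma': get_gamma_model_names,
}


def get_model_names(provider: str) -> list[str]:
    source = provider_model_names[provider]
    # Check for Literal explicitly: typing aliases also pass callable().
    if get_origin(source) is Literal:
        return list(get_args(source))
    return list(source())


def prefixed_model_names(provider: str) -> list[str]:
    return [f'{provider}:{name}' for name in get_model_names(provider)]


# Flattened, so the result is a single list of strings, not a list of lists.
model_names = [
    name
    for provider in provider_model_names
    for name in prefixed_model_names(provider)
]
```

Providers that share one model-name type under several prefixes (like google-gla and google-vertex) would simply get two dict entries pointing at the same type.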
No description provided.