updates RetryLLMHandler to reuse LiteLLM retry mechanism #502
base: master
Conversation
I'm inclined to say we shouldn't have standalone handlers in the library that are just passing a particular argument through to the provider.
@eb8680 that makes sense.
@eb8680, @datvo06 LiteLLM's retry only seems to be intended for network errors that arise during LLM calls; validation errors appear to be out of scope. The following fails:

```python
import enum
from typing import Self

import pydantic
from pydantic import BaseModel

# Template, NotHandled, handler, ReplayLiteLLMProvider, and requires_openai
# come from the library and its test fixtures.


class EngineState(enum.Enum):
    OFF = "off"
    WARMING_UP = "warming_up"
    READY = "ready"
    SHUTTING_DOWN = "shutting_down"


class EngineConfig(BaseModel):
    description: str
    state: EngineState

    @pydantic.model_validator(mode='after')
    def verify_self(self) -> Self:
        if self.state != EngineState.WARMING_UP:
            raise ValueError("The infinity engine is never ready, and always in a warming up state.")
        return self


@Template.define
def predict_engine_config(description: str) -> EngineConfig:
    """Given the description \"{description}\" of things I did,
    predict the configuration of the engine after I perform those
    tasks."""
    raise NotHandled


@requires_openai
def test_num_retries_allowed_for_provider(request):
    """Test that LiteLLMProvider works with `num_retries`."""
    description = "I insert my keys into the car, turn it. The car revs. I drive off into the distance."
    with handler(ReplayLiteLLMProvider(request, model_name="gpt-5-nano", num_retries=3)):
        config = predict_engine_config(description)
        print(config)
```

Given this, it might be worthwhile keeping RetryLLMHandler.
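As a side note, here is a minimal, hypothetical sketch of what retrying on validation errors outside LiteLLM could look like. This is not the library's RetryLLMHandler, only an illustration of the failure mode that `num_retries` does not cover; the helper name is made up.

```python
import pydantic


def retry_on_validation(fn, *args, max_attempts=3, **kwargs):
    """Hypothetical helper: re-invoke `fn` when its structured output fails
    pydantic validation. LiteLLM's `num_retries` only covers transient
    network/API errors, so a ValidationError like the one above is never
    retried by LiteLLM itself."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return fn(*args, **kwargs)
        except pydantic.ValidationError as exc:
            last_error = exc
    raise last_error


# Usage with the names from the test above:
# config = retry_on_validation(predict_engine_config, description)
```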
@eb8680 yep, both points make sense. Moving this PR and the issue to blocked.
This PR simplifies the implementation of RetryLLMHandler to instead use LiteLLM's built-in retry mechanism. We do not need specific handling for tool calls, as this is already handled by call_tool_with_json_args.

Closes #494
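For reference, the built-in retry being delegated to is LiteLLM's `num_retries` argument on the completion call. A rough, illustrative sketch, with the model name and message as placeholders only:

```python
import litellm

# `num_retries` asks LiteLLM itself to retry transient network/API failures;
# it does not re-prompt when the parsed output fails validation.
response = litellm.completion(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Predict the engine configuration."}],
    num_retries=3,
)
print(response.choices[0].message.content)
```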