feat: Support for allocating GPU memory based on the selected profile #108
Signed-off-by: Anmol Gupta <[email protected]>
Signed-off-by: Anmol Gupta <[email protected]>
For kUSER_MANAGED, the user (in this case the Triton server) would need to allocate a block of device memory itself and pass it to the execution context. Do you support this behavior? If not, I suggest adding only kSTATIC and kON_PROFILE_CHANGE.
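To illustrate the suggestion, here is a minimal sketch of how a backend could accept only the two strategies that need no caller-owned device memory. The enum is a local stand-in that mirrors the strategy names from this thread, and the helper name `ParseAllocationStrategy` and the string values `"static"` and `"on_profile_change"` are assumptions made for the example, not the names used in this PR.

```cpp
#include <optional>
#include <string>

// Local stand-in for the allocation strategies named in this thread.
// A real backend would use the enum from the TensorRT headers instead.
enum class AllocationStrategy { kSTATIC, kON_PROFILE_CHANGE };

// Hypothetical helper: maps a configuration string to a strategy.
// "user_managed" is deliberately not accepted, because supporting it would
// require the server to allocate device memory and hand it to the context.
std::optional<AllocationStrategy>
ParseAllocationStrategy(const std::string& value)
{
  if (value == "static") {
    return AllocationStrategy::kSTATIC;
  }
  if (value == "on_profile_change") {
    return AllocationStrategy::kON_PROFILE_CHANGE;
  }
  // Unknown or unsupported value: let the caller report a config error.
  return std::nullopt;
}
```

Returning `std::optional` (C++17) keeps the decision about how to report an unsupported value with the caller, which can turn it into a model-load error.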
removed user_managed option
src/instance_state.cc
Outdated
// the first context creation. As currently triton supports one
// context per engine, in order to set the specified profile_index,
// another context is created and the previous context is destroyed.
As currently triton supports one context per engine, in order to set the specified profile_index, another context is created and the previous context is destroyed.
Is the comment still valid? From the code, each profile_index holds a context.
if (profile_index == 0) {
  res.first->second.context_ = std::move(default_trt_context);
} else {
  res.first->second.context_.reset(engine_->createExecutionContext());
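The pattern in the excerpt above can be sketched in isolation as follows. `FakeEngine`, `FakeContext`, `ProfileSlot`, and `BuildProfileContexts` are hypothetical stand-ins written for this example, not types from the backend or from TensorRT; the sketch only shows that profile 0 takes ownership of the already-created default context while every other profile gets a context of its own.

```cpp
#include <map>
#include <memory>
#include <utility>

// Stand-in for an execution context; `id` records the creation order.
struct FakeContext {
  int id;
};

// Stand-in for an engine that hands out raw context pointers.
struct FakeEngine {
  int created = 0;
  FakeContext* createExecutionContext() { return new FakeContext{created++}; }
};

// Per-profile state; the member name follows the excerpt above.
struct ProfileSlot {
  std::unique_ptr<FakeContext> context_;
};

// Builds one slot per profile index. Profile 0 reuses the default context,
// every other profile creates a fresh context from the engine.
std::map<int, ProfileSlot>
BuildProfileContexts(
    FakeEngine& engine, std::unique_ptr<FakeContext> default_context,
    int profile_count)
{
  std::map<int, ProfileSlot> slots;
  for (int profile_index = 0; profile_index < profile_count; ++profile_index) {
    auto res = slots.emplace(profile_index, ProfileSlot{});
    if (profile_index == 0) {
      res.first->second.context_ = std::move(default_context);
    } else {
      res.first->second.context_.reset(engine.createExecutionContext());
    }
  }
  return slots;
}
```

In this sketch no context is destroyed and recreated: each profile index keeps its own context for the lifetime of the map, which is the behavior the review question describes.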
I tested this on my end, and the changes work for me.
Signed-off-by: Anmol Gupta <[email protected]>
Co-authored-by: Yingge He <[email protected]>
Co-authored-by: Yingge He <[email protected]>
@yinggeh: New changes look good to me; I got the expected results on the models with these updates.
Updated README.md
LGTM. Thanks for your contribution.
The changes in the PR support 2 main items: