
[Runtime] GLM 5.2 FP8 multi node runtimes #634

Open: YouNeedCryDear wants to merge 1 commit into main from feat/glm-5.2-multi-node


@YouNeedCryDear

Collaborator

What this PR does

Adds GLM 5.2 FP8 multi-node runtime configuration:

  • Adds the vllm-glm-5-2-fp8-multi ClusterServingRuntime with an SMG router and vLLM leader/worker engine configuration.
  • Registers the runtime in config/runtimes/kustomization.yaml.
  • Adds a matching InferenceService sample for glm-5-2-fp8-multi.
  • Adds the GLM-5.2-FP8 base model configuration.
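
For orientation, an InferenceService that targets the new runtime might look roughly like the sketch below. This is an illustration, not the sample shipped in this PR: the service name `glm-5-2-fp8-multi` and runtime name `vllm-glm-5-2-fp8-multi` come from the description above, while the base model name `glm-5-2-fp8` and the `ome.io/v1beta1` field layout are assumptions that should be checked against the files in the diff.

```yaml
# Hypothetical sketch; verify every field against the sample added in this PR.
apiVersion: ome.io/v1beta1          # assumed API group/version
kind: InferenceService
metadata:
  name: glm-5-2-fp8-multi           # sample name from the PR description
spec:
  model:
    name: glm-5-2-fp8               # assumed name of the GLM-5.2-FP8 base model resource
  runtime:
    name: vllm-glm-5-2-fp8-multi    # ClusterServingRuntime added by this PR
```

Pinning `spec.runtime.name` explicitly, rather than relying on automatic runtime selection, keeps the sample tied to the multi-node runtime even if other runtimes also support the same model.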

Why we need it

Enables OME users to deploy GLM 5.2 FP8 on a multi-node topology with vLLM.

Fixes #

How to test

Not run locally; configuration-only PR submission.

Checklist

  • Tests added/updated (if applicable)
  • Docs updated (if applicable)
  • make test passes locally

@github-actions (bot) added the runtime (Runtime configuration changes), models (Model configuration changes), and config (Configuration changes) labels on Jun 23, 2026
