Skip to content

feat: LLM driven kernel agent for testing#10

Merged
vgene merged 4 commits intomainfrom
feat/kernel-agent
Feb 27, 2026
Merged

feat: LLM driven kernel agent for testing#10
vgene merged 4 commits intomainfrom
feat/kernel-agent

Conversation

@vgene
Copy link
Contributor

@vgene vgene commented Feb 4, 2026

Issue #, if available:

Description of changes:

Add new nkipy.tools.kernel_agent tool providing:

  • CLI for discovering NumPy op support status on NKI hardware
  • Multi-stage kernel execution (numpy, compile, hardware)
  • LLM-based kernel generation via AWS Bedrock
  • Op testing framework with dtype and shape validation
  • Sweep test with a given set of simple prompts

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@vgene vgene force-pushed the feat/kernel-agent branch from 822467b to 588500e Compare February 14, 2026 08:04
@vgene vgene changed the title Feature: LLM driven kernel agent for testing feat: LLM driven kernel agent for testing Feb 24, 2026
Add new nkipy.tools.kernel_agent tool providing:
- CLI for discovering NumPy op support status on NKI hardware
- Multi-stage kernel execution (numpy, compile, hardware)
- LLM-based kernel generation via AWS Bedrock
- Op testing framework with dtype and shape validation
@vgene vgene force-pushed the feat/kernel-agent branch from 588500e to 0aff2b0 Compare February 25, 2026 12:13
…te functions

Extract `_compile_kernel` and `_execute_neff` helper functions from `baremetal_run_traced_kernel`.
- Implement sweep functionality for continuous generation-and-test loops
- Include 50+ kernel prompt templates for common operations
- Add CLI commands: generate, sweep, list-prompts
- Configure package data to include prompt text files
- Update documentation with new commands and architecture overview
@vgene vgene marked this pull request as ready for review February 26, 2026 12:30
@vgene vgene requested a review from a team February 26, 2026 12:30
3. No loops or conditionals based on array values
4. The function must be self-contained — only use numpy (imported as np)

Output ONLY a JSON object such as:
Copy link

@shaowz-aws shaowz-aws Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this output and then some manual parser script may still be unstable: there might be edge cases like improperly escaped characters within the embedded source code, incorrect indentations, etc. We can consider using anthropic SDK which has bedrock and structured output support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change the output to structured. But the numpy code is still hard to enforce. The code will be executed as numpy first in the test so if it's invalid, it will show in that stage.

@@ -0,0 +1,62 @@
# ML activations / norms
softmax along the last axis
Copy link

@shaowz-aws shaowz-aws Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, I would also include things like input/output shapes/datatypes when asking the agent to write kernels, or the LLM can produce anything that are mostly reasonable but sometimes unexpected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to let the prompt as underspecified as possible so the LLM can generate "random" numpy code to test the lowering. The goal is to find corner cases that fail in NKIPy.

input_specs = data.get("inputs", {})

inputs = {}
for inp_name, spec in input_specs.items():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic feels fragile. Maybe it's ok as a starting point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to tool use to make sure no fallthrough shape and dtype

from nkipy.tools.kernel_agent.executor import ExecutionResult, run_kernel

# Target operations to test
TARGET_OPS = {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this classification of ops!

@vgene vgene force-pushed the feat/kernel-agent branch 2 times, most recently from 716a1fe to 81816ae Compare February 27, 2026 13:51
@vgene vgene force-pushed the feat/kernel-agent branch from 81816ae to fbb62c9 Compare February 27, 2026 13:52
@vgene vgene merged commit c4cdc7b into main Feb 27, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants