Add cutlass python dsl executor for quack-kernels
#2719
Open
crcrpar wants to merge 23 commits into main from crpa/quack
+856
−0
23 commits
7fab90c Add cutlass-python-dsl executor. (crcrpar)
2f42fa2 [no ci] add crossentropy (crcrpar)
b95c9e4 [no ci] add layer norm forward (crcrpar)
a9eaf4d [no ci] add rmsnorm (crcrpar)
0b51a12 fix (crcrpar)
d769096 fix backward of crossentropy (crcrpar)
4a0ced4 fix checkers (crcrpar)
1a2a868 [no ci] add test (crcrpar)
d6efb9a DRY: dtypes & their ids (crcrpar)
c7bbf34 comment out backward for now (crcrpar)
4daff6e upcast inputs to fp32 for reference (crcrpar)
3711497 fix how softmax is called (crcrpar)
85a8e65 upcast and downcast for reference layernorm (crcrpar)
1b633d5 fix typo of rmsnorm (crcrpar)
27ff159 fix meta (crcrpar)
0958dba add cutlass_dsl_ex to all_executors (crcrpar)
6680878 Only forward, no backward support for now (crcrpar)
b9c876d call non-augmented forward in execution transform (crcrpar)
340f14e quack bench (crcrpar)
968420c [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
b4408d1 fix quack availability check (crcrpar)
1e5f3b2 mandate weight in layer|rms norm (crcrpar)
0e95dbd Merge branch 'main' into crpa/quack (crcrpar)
This class does not call Benchmark.__init__ during initialization. (BaseBenchmarkForQuack.__init__ may be missing a call to a base class __init__)
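To illustrate the warning above, here is a minimal sketch of what a skipped base-class initializer can look like and how `super().__init__()` addresses it. The class names `Benchmark` and `BaseBenchmarkForQuack` come from the comment, but their bodies here are hypothetical stand-ins, as are the `devices` and `dtype` attributes and the `BrokenQuackBenchmark` class. None of this is the code under review.

```python
class Benchmark:
    """Hypothetical stand-in for the benchmark base class (not the project's real one)."""

    def __init__(self):
        # State that subclasses expect the base initializer to set up.
        self.devices = []


class BrokenQuackBenchmark(Benchmark):
    """Hypothetical subclass reproducing the pattern the warning describes."""

    def __init__(self, dtype):
        # Benchmark.__init__ is never called, so self.devices is never created.
        self.dtype = dtype


class BaseBenchmarkForQuack(Benchmark):
    """Hypothetical fixed variant: the base initializer runs first."""

    def __init__(self, dtype):
        super().__init__()
        self.dtype = dtype


broken = BrokenQuackBenchmark("bfloat16")
fixed = BaseBenchmarkForQuack("bfloat16")

print(hasattr(broken, "devices"))
print(hasattr(fixed, "devices"))
```

Whether the omission matters in the actual PR depends on what the real `Benchmark.__init__` sets up. If the base class defines no initializer state, the warning may be benign, though calling `super().__init__()` is usually the safer default.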