cuda_core: add "10 minutes to cuda.core" tutorial to docs #2289
Open
sri-koundinyan wants to merge 1 commit into
Add a beginner-friendly "10 minutes to cuda.core" guide that walks through the core workflow (select a device, compile a kernel, allocate memory, copy, launch, time with events, use multiple streams, capture a CUDA graph, and interoperate with CuPy/PyTorch), then wire it into the cuda.core docs table of contents between the installation and examples sections.

Signed-off-by: Sri Koundinyan <skoundinyan@nvidia.com>
976cc4d to 046eac2
This adds a beginner-friendly "10 minutes to cuda.core" guide to the cuda.core docs. It walks through the core workflow with small, runnable snippets: selecting a device, compiling a CUDA C++ kernel, allocating memory, copying, launching, timing with events, using multiple streams, capturing a CUDA graph, and interoperating with CuPy and PyTorch. The new page lives at cuda_core/docs/source/10_minutes_to_cuda_core.rst and is added to the docs table of contents between the Installation and Examples sections. All code snippets were verified to run end-to-end on CUDA 13.
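For reviewers who want a feel for the workflow without opening the new page, here is a minimal sketch of the first few steps (select a device, compile a kernel, launch, synchronize). It is not taken from the tutorial in this PR. It assumes the `Device`, `Program`, `ProgramOptions`, `LaunchConfig`, and `launch` names are importable from `cuda.core` (or from `cuda.core.experimental` on older releases), and it uses CuPy for device arrays rather than whatever allocation path the tutorial shows. The `saxpy` kernel and the `grid_size` and `run_saxpy` helpers are hypothetical names chosen for illustration.

```python
# Illustrative kernel source; not the kernel used in the tutorial.
KERNEL_SOURCE = r"""
extern "C" __global__
void saxpy(const float a, const float* x, const float* y, float* out, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a * x[i] + y[i];
    }
}
"""


def grid_size(n, block):
    """Number of blocks needed to cover n elements (ceiling division)."""
    return (n + block - 1) // block


def run_saxpy(n=1 << 16, block=256):
    """Compile and launch the kernel; needs a CUDA GPU, cuda.core, NumPy, and CuPy."""
    import cupy as cp
    import numpy as np

    # Assumption: newer releases export these from cuda.core, older ones
    # from cuda.core.experimental.
    try:
        from cuda.core import Device, LaunchConfig, Program, ProgramOptions, launch
    except ImportError:
        from cuda.core.experimental import (
            Device, LaunchConfig, Program, ProgramOptions, launch,
        )

    # Select the default device and create a stream to launch on.
    dev = Device()
    dev.set_current()
    stream = dev.create_stream()

    # Compile the CUDA C++ source for this device's architecture.
    arch = "".join(str(i) for i in dev.compute_capability)
    program = Program(
        KERNEL_SOURCE, code_type="c++", options=ProgramOptions(arch=f"sm_{arch}")
    )
    kernel = program.compile("cubin").get_kernel("saxpy")

    # Prepare inputs with CuPy; scalars are NumPy types so their widths
    # match the kernel signature (float, size_t).
    a = np.float32(2.0)
    x = cp.arange(n, dtype=cp.float32)
    y = cp.ones(n, dtype=cp.float32)
    out = cp.empty_like(x)
    dev.sync()  # CuPy works on its own stream, so wait before launching on ours.

    # Launch one thread per element, then wait for the kernel to finish.
    config = LaunchConfig(grid=grid_size(n, block), block=block)
    launch(stream, config, kernel, a, x.data.ptr, y.data.ptr, out.data.ptr, np.uint64(n))
    stream.sync()

    return bool(cp.allclose(out, a * x + y))
```

On a machine with a CUDA GPU, calling `run_saxpy()` should return `True` if the assumptions above hold. The imports are kept inside the function so the module can be loaded on a machine without a GPU.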