Fix JaggedTensor single-element constructor unconditionally initializing CUDA via pinned_memory #468
+30
−6
Summary
- Makes `pinned_memory` conditional on the tensor device being CUDA in two locations where single-element `JaggedTensor` construction unconditionally allocated pinned (page-locked) memory via `cudaHostAlloc`, which forced CUDA runtime initialization even for CPU-only tensors.
- The unconditional allocation caused problems in `DataLoader` worker processes (where re-initializing CUDA after `fork()` is forbidden) and added unnecessary overhead for CPU-only workloads.
- Adds a regression test verifying that CPU `JaggedTensor` offsets are not pinned.

Fixes #467
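The conditional-pinning pattern can be sketched in Python as a rough analogue of the C++ change; it is not the code in this PR. The helper names `should_pin` and `make_single_element_offsets` are hypothetical, and PyTorch is imported lazily only so that the pure-Python predicate stays usable without it.

```python
def should_pin(device_type: str) -> bool:
    """Request pinned host memory only when the data lives on a CUDA device."""
    return device_type == "cuda"


def make_single_element_offsets(data):
    """Build [0, N] offsets for a single-element batch (hypothetical helper).

    `data` is assumed to be a torch.Tensor. Pinning is requested only for
    CUDA data, so CPU-only callers never ask for page-locked memory.
    """
    import torch  # lazy import: keeps should_pin usable without PyTorch

    return torch.tensor(
        [0, data.shape[0]],
        dtype=torch.int64,
        pin_memory=should_pin(data.device.type),
    )
```

With a CPU input such as `torch.zeros(5, 3)`, the helper requests `pin_memory=False`, mirroring the intent of `.pinned_memory(mData.device().is_cuda())`.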
Changes
- `src/fvdb/JaggedTensor.cpp`: `.pinned_memory(true)` → `.pinned_memory(mData.device().is_cuda())` in the `JaggedTensor(const std::vector<torch::Tensor>&)` single-element branch.
- `src/fvdb/detail/ops/JOffsetsFromJIdx.cu`: `.pinned_memory(true)` → `.pinned_memory(jdata.device().is_cuda())` in `joffsetsFromJIdx()`, which is the shared implementation called by the CPU, CUDA, and PrivateUse1 dispatch paths.
- `tests/unit/test_jagged_tensor.py`: New `test_cpu_single_element_no_cuda_init` verifying both constructor paths produce non-pinned offsets for CPU tensors.

Test plan
- `python -m pytest tests/unit/test_jagged_tensor.py::TestJaggedTensor::test_cpu_single_element_no_cuda_init -v` passes
- `python -m pytest tests/unit/test_jagged_tensor.py::TestJaggedTensor::test_batch_size_one_cpu_float32 -v` passes (existing single-element test)
- `python -m pytest tests/unit/test_jagged_tensor.py -v` full test suite passes
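
For manual verification outside pytest, one possible approach is to run a probe in a fresh interpreter, so that CUDA state left behind by other code in the same process cannot mask a regression. This is a minimal sketch and not part of the PR; `run_isolated` and `CUDA_PROBE` are hypothetical names, and the probe body is a placeholder for the CPU code path under test.

```python
import subprocess
import sys


def run_isolated(snippet: str) -> str:
    """Run `snippet` in a fresh Python interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the snippet fails
    )
    return result.stdout.strip()


# Hypothetical probe: add the CPU-only construction under test where indicated.
CUDA_PROBE = """
import torch
# construct the CPU object under test here
print(torch.cuda.is_initialized())
"""
```

Calling `run_isolated(CUDA_PROBE)` in an environment with PyTorch installed reports whether the probed code initialized CUDA; for a CPU-only path the expectation is that it did not.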