Hi everyone, I'm trying to run the Chandra model on Kaggle and am running into an issue.
Environment
Platform: Kaggle
GPU: Tesla T4 x2 (15GB VRAM per GPU)
CUDA: 12.x
PyTorch: (please fill version)
Model: Chandra OCR 9B
Mode: Inference only
Batch size: 1
Description
When running OCR inference on a single image/page using Chandra OCR 9B on Kaggle (T4 x2), the process fails with a CUDA out-of-memory error: a single allocation of 8.45 GiB is attempted, which is more than the 6.70 GiB reported free on GPU 0.
The model loads successfully onto GPU(s), but fails during the actual inference step.
Error Log
[1/1] Processing: page_011.png
Loaded 1 page(s)
Processing pages 1-1...
Error processing page_011.png:
CUDA out of memory. Tried to allocate 8.45 GiB.
GPU 0 has a total capacity of 14.74 GiB of which 6.70 GiB is free.
Process 13242 has 8.04 GiB memory in use.
Of the allocated memory:
- 7.87 GiB is allocated by PyTorch
- 41.04 MiB is reserved by PyTorch but unallocated
If reserved but unallocated memory is large try setting:
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
See documentation:
https://pytorch.org/docs/stable/notes/cuda.html#environment-variables
Processing complete. Results saved to: output
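In case it helps anyone reproducing this: the workaround suggested in the log itself can be applied by setting the allocator environment variable before launching the inference script (a sketch; how you invoke Chandra may differ):

```shell
# Let PyTorch's CUDA caching allocator use expandable segments,
# which can reduce fragmentation-related OOM failures.
# Must be set before the process that imports torch starts.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
```

Note the log shows only ~41 MiB "reserved but unallocated", so fragmentation may not be the root cause here; this setting is cheap to try but may not be sufficient.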
Input Details
Input type: PNG image / PDF page
Image size after render: 596 × 843 px
Pages processed: 1
No batching
The input image is small (596 × 843 px after rendering), so a single inference step should not require an 8.45 GiB allocation.
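Looking at the log's numbers again, the failure is at least arithmetically consistent: the attempted allocation exceeds what GPU 0 has free. A quick sanity check (all values copied from the log above):

```python
# Values reported in the OOM message for GPU 0.
total_gib = 14.74      # total capacity of GPU 0
in_use_gib = 8.04      # memory already held by process 13242
requested_gib = 8.45   # size of the single failed allocation

free_gib = total_gib - in_use_gib
print(f"free: {free_gib:.2f} GiB")   # matches the 6.70 GiB in the log
print(requested_gib > free_gib)      # True: the request cannot fit on GPU 0
```

So the real question is why inference on one small page triggers a single ~8.45 GiB allocation on top of the ~8 GiB model weights, and whether the model can be made to use the second T4 or a lower-memory configuration.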