Add production deployment scripts and documentation #19
Conversation
This commit adds comprehensive deployment infrastructure for EdgeTAM:

1. Model Export Scripts:
   - export_to_onnx.py: Export the PyTorch model to ONNX format
   - convert_to_tensorrt.py: Convert ONNX to TensorRT engines

2. Inference Examples:
   - deploy/pytorch_inference.py: Reference PyTorch implementation
   - deploy/onnx_inference.py: Production-ready ONNX inference
   - deploy/tensorrt_inference.py: High-performance TensorRT inference

3. Documentation:
   - DEPLOYMENT.md: Comprehensive deployment guide (Turkish)
   - requirements-deploy.txt: Deployment dependencies

Features:
- Support for ONNX and TensorRT deployment
- Simulation mode for performance benchmarking
- Real-world integration examples
- Docker deployment instructions
- Performance optimization tips

The deployment pipeline: PyTorch Model -> ONNX -> TensorRT (FP32/FP16/INT8)
Hi @sahinemreaslan! Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA Signed. If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
This commit fixes the ONNX export error by properly handling the high-resolution features that EdgeTAM uses.

Changes:

1. export_to_onnx.py:
   - EdgeTAMImageEncoder now exports high-res features (256x256, 128x128)
   - EdgeTAMMaskDecoder accepts optional high-res feature inputs
   - Auto-detect use_high_res_features from the model config
   - Update opset version to 18 (recommended for PyTorch 2.3+)

2. deploy/onnx_inference.py:
   - Support high-res features in inference
   - Auto-detect model capabilities from ONNX outputs
   - Handle both single-output and multi-output encoders

3. deploy/tensorrt_inference.py:
   - Allocate additional GPU buffers for high-res features
   - Support high-res features in the encode/decode pipeline
   - Auto-detect engine capabilities

The exported ONNX models now properly utilize EdgeTAM's high-resolution feature pyramid for better segmentation accuracy.
The forward_image() method already applies conv_s0 and conv_s1 to the high-resolution features, so we should not apply them again in the export wrapper. This was causing a channel mismatch error: 'expected input to have 256 channels, but got 32 channels instead'.
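The mismatch can be reproduced in isolation. In SAM2-style models conv_s0 is a 1x1 convolution projecting 256-channel backbone features down to 32 channels; the spatial sizes below are illustrative, but the channel arithmetic matches the error message.

```python
import torch
import torch.nn as nn

# conv_s0 projects 256-channel backbone features to 32 channels
# (spatial sizes here are illustrative).
conv_s0 = nn.Conv2d(256, 32, kernel_size=1)
raw = torch.randn(1, 256, 64, 64)
projected = conv_s0(raw)  # forward_image() already does this step
assert projected.shape[1] == 32
# Applying conv_s0 a second time, i.e. conv_s0(projected), raises:
# "expected input ... to have 256 channels, but got 32 channels instead"
```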
The new torch.export/dynamo exporter has compatibility issues with EdgeTAM's complex architecture. Switch to the legacy ONNX exporter by setting dynamo=False, which is more stable and widely tested. This resolves torch.export tracing errors with the model's forward_image and high-resolution feature handling.
Display the encoder output and decoder input counts to help debug model configuration issues. This shows whether high-res features are correctly detected.
Dynamic axes were causing broadcasting errors in ONNX Runtime with certain operations (like index_put). Use a fixed batch size (1) and a fixed number of points (1) for a more stable ONNX export.

This is acceptable for production deployment as:
- Batch size 1 is typical for real-time inference
- Multiple points can be added in sequence if needed
- Fixed shapes have better runtime performance

Fixes: ONNXRuntimeError with the Where/index_put_2 node
- Fix the default config path in export_to_onnx.py (configs/edgetam.yaml -> sam2/configs/edgetam.yaml)
- Update DEPLOYMENT.md with the correct config paths
- Fix the help text in the export script to reference the correct inference script
- Add HIZLI_BASLANGIC.md (Turkish quick-start guide) with step-by-step instructions
- Improve the user experience with clear setup and usage instructions