
Add production deployment scripts and documentation#19

Open
sahinemreaslan wants to merge 7 commits into facebookresearch:main from
sahinemreaslan:claude/clean-up-article-code-011CUyrVpQCTCvJ7jn17iZwA

Conversation

@sahinemreaslan

This commit adds comprehensive deployment infrastructure for EdgeTAM:

1. Model Export Scripts:
   - export_to_onnx.py: Export PyTorch model to ONNX format
   - convert_to_tensorrt.py: Convert ONNX to TensorRT engines

2. Inference Examples:
   - deploy/pytorch_inference.py: Reference PyTorch implementation
   - deploy/onnx_inference.py: Production-ready ONNX inference
   - deploy/tensorrt_inference.py: High-performance TensorRT inference

3. Documentation:
   - DEPLOYMENT.md: Comprehensive deployment guide (Turkish)
   - requirements-deploy.txt: Deployment dependencies

Features:
- Support for ONNX and TensorRT deployment
- Simulation mode for performance benchmarking
- Real-world integration examples
- Docker deployment instructions
- Performance optimization tips

The deployment pipeline:
PyTorch Model -> ONNX -> TensorRT (FP32/FP16/INT8)

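The ONNX -> TensorRT stage of the pipeline above might look like the following minimal sketch. This is not the PR's convert_to_tensorrt.py; the precision-to-flag mapping (including enabling FP16 as a fallback alongside INT8) is an illustrative assumption.

```python
# Minimal sketch of an ONNX -> TensorRT conversion with selectable precision.
# The mapping below is an illustrative assumption, not the PR's script.

PRECISION_FLAGS = {
    "fp32": [],                # full precision: no extra builder flags
    "fp16": ["FP16"],
    "int8": ["INT8", "FP16"],  # FP16 fallback for layers without INT8 kernels
}

def builder_flag_names(precision):
    """Map a precision string to TensorRT BuilderFlag names."""
    if precision not in PRECISION_FLAGS:
        raise ValueError("precision must be one of: %s" % sorted(PRECISION_FLAGS))
    return PRECISION_FLAGS[precision]

def build_engine(onnx_path, engine_path, precision="fp16"):
    import tensorrt as trt  # imported lazily: heavy, GPU-only dependency

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # Explicit-batch network, as required for ONNX models on TensorRT 8.x
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("ONNX parse failed: %s" % parser.get_error(0))

    config = builder.create_builder_config()
    for name in builder_flag_names(precision):
        config.set_flag(getattr(trt.BuilderFlag, name))
    # Note: INT8 additionally needs a calibrator or precomputed dynamic
    # ranges, which this sketch omits.

    engine = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(bytearray(engine))
```

Separate engines would be built per precision, trading accuracy for throughput as the precision drops.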

meta-cla bot commented Nov 10, 2025

Hi @sahinemreaslan!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

This commit fixes the ONNX export error by properly handling the
high-resolution features that EdgeTAM uses.

Changes:
1. export_to_onnx.py:
   - EdgeTAMImageEncoder now exports high-res features (256x256, 128x128)
   - EdgeTAMMaskDecoder accepts optional high-res feature inputs
   - Auto-detect use_high_res_features from model config
   - Update opset version to 18 (recommended for PyTorch 2.3+)

2. deploy/onnx_inference.py:
   - Support high-res features in inference
   - Auto-detect model capabilities from ONNX outputs
   - Handle both single-output and multi-output encoders

3. deploy/tensorrt_inference.py:
   - Allocate additional GPU buffers for high-res features
   - Support high-res features in encode/decode pipeline
   - Auto-detect engine capabilities

The exported ONNX models now properly utilize EdgeTAM's high-resolution
feature pyramid for better segmentation accuracy.
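The auto-detection described in (2) can be as simple as counting the encoder's ONNX outputs. A hedged sketch, where the output names are hypothetical and the three-output convention (image embedding plus the two high-res feature levels) mirrors the commit message:

```python
def uses_high_res_features(encoder_output_names):
    """Heuristic per the commit message: a single output is a plain image
    embedding; three outputs add the 256x256 and 128x128 feature levels."""
    return len(encoder_output_names) >= 3

def load_encoder(onnx_path):
    # onnxruntime imported lazily so the pure helper above has no dependencies
    import onnxruntime as ort

    sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    names = [out.name for out in sess.get_outputs()]
    return sess, uses_high_res_features(names)
```

The same count-based check extends naturally to the TensorRT side, where the number of engine bindings plays the role of the ONNX output list.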

The forward_image() method already applies conv_s0 and conv_s1 to the
high-resolution features, so we should not apply them again in the
export wrapper. This was causing a channel mismatch error:
'expected input to have 256 channels, but got 32 channels instead'.

The new torch.export/dynamo exporter has compatibility issues with
EdgeTAM's complex architecture. Switch to the legacy ONNX exporter
by setting dynamo=False, which is more stable and widely tested.

This resolves torch.export tracing errors with the model's forward_image
and high-resolution feature handling.
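Selecting the legacy exporter is a one-argument change. A minimal sketch (the wrapper, tensor names, and output path are placeholders; the `dynamo` keyword is available on recent PyTorch releases, roughly 2.5 and later, while older versions use the legacy path by default):

```python
# Keyword arguments for torch.onnx.export, kept in one place so the
# exporter choice is explicit. All names here are illustrative.
EXPORT_KWARGS = dict(
    input_names=["image"],
    output_names=["image_embed"],
    opset_version=18,   # recommended for PyTorch 2.3+
    dynamo=False,       # use the legacy TorchScript-based exporter
)

def export_encoder(wrapper, dummy_image, out_path="encoder.onnx"):
    import torch  # imported lazily; the export itself needs the full model

    wrapper.eval()
    torch.onnx.export(wrapper, (dummy_image,), out_path, **EXPORT_KWARGS)
```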
Display encoder outputs and decoder inputs count to help debug
model configuration issues. This will show whether high-res features
are correctly detected.

Dynamic axes were causing broadcasting errors in ONNX Runtime with certain
operations (like index_put). Use fixed batch size (1) and fixed number of
points (1) for more stable ONNX export.

This is acceptable for production deployment as:
- Batch size 1 is typical for real-time inference
- Multiple points can be added in sequence if needed
- Fixed shapes have better runtime performance

Fixes: ONNXRuntimeError with Where/index_put_2 node

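With dynamic axes removed, prompts have to be packed into the fixed shapes baked into the export. A sketch of that packing, following the commit message's batch size 1 and single point (the float32 label dtype and the helper name are assumptions):

```python
import numpy as np

def format_point_prompt(x, y, label=1):
    """Pack one click into fixed-shape inputs for the exported decoder:
    coords of shape (1, 1, 2) and labels of shape (1, 1), both float32.
    label=1 marks a foreground click, label=0 a background click."""
    coords = np.array([[[x, y]]], dtype=np.float32)
    labels = np.array([[label]], dtype=np.float32)
    return coords, labels
```

Multiple clicks would then be fed one at a time, as the commit message suggests, rather than batched along a dynamic points axis.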
- Fix default config path in export_to_onnx.py (configs/edgetam.yaml -> sam2/configs/edgetam.yaml)
- Update DEPLOYMENT.md with correct config paths
- Fix help text in export script to reference correct inference script
- Add HIZLI_BASLANGIC.md (Turkish quick start guide) with step-by-step instructions
- Improve user experience with clear setup and usage instructions
