Problem
I've been experimenting with fine-tuning MapAnything on custom multi-view datasets, and I'm hitting exactly the problem described in issue #104: the global scale factor regresses poorly on scenes larger than ~10 m (a ScanNet-like failure mode), and metric depths come out off by 10%+.
For anyone who has dealt with this: which pose/depth cleaning strategies worked best (RANSAC reprojection filtering, learned uncertainty encoding, something else)?
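For context, this is roughly the cross-view reprojection filtering I'm considering for the sparse depth. It's a minimal sketch under my own assumptions (4x4 cam-to-world poses, a shared 3x3 intrinsic `K`, depth maps with 0 = invalid); `filter_depth_by_reprojection` is a hypothetical helper, not anything from the MapAnything codebase:

```python
import numpy as np

def filter_depth_by_reprojection(depth_a, depth_b, pose_a, pose_b, K,
                                 rel_thresh=0.05):
    """Keep pixels of depth_a whose 3D points reproject into view B with a
    depth consistent to within rel_thresh relative error. Poses are 4x4
    cam-to-world; invalid depth is encoded as 0."""
    h, w = depth_a.shape
    v, u = np.mgrid[0:h, 0:w]
    valid = depth_a > 0
    # Unproject view-A pixels to camera-A coordinates.
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    cam_a = (np.linalg.inv(K) @ pix[..., None])[..., 0] * depth_a[..., None]
    # Camera A -> world -> camera B.
    cam_a_h = np.concatenate([cam_a, np.ones((h, w, 1))], axis=-1)
    world = (pose_a @ cam_a_h[..., None])[..., 0]
    cam_b = (np.linalg.inv(pose_b) @ world[..., None])[..., 0][..., :3]
    z_b = cam_b[..., 2]
    # Project into view B and look up the observed depth there.
    proj = (K @ cam_b[..., None])[..., 0]
    ub = np.round(proj[..., 0] / np.clip(z_b, 1e-6, None)).astype(int)
    vb = np.round(proj[..., 1] / np.clip(z_b, 1e-6, None)).astype(int)
    in_img = (ub >= 0) & (ub < w) & (vb >= 0) & (vb < h) & (z_b > 0)
    keep = valid & in_img
    obs = np.zeros_like(depth_a)
    obs[keep] = depth_b[vb[keep], ub[keep]]
    # Reject points whose reprojected depth disagrees with view B.
    rel_err = np.abs(z_b - obs) / np.clip(obs, 1e-6, None)
    keep &= (obs > 0) & (rel_err < rel_thresh)
    return np.where(keep, depth_a, 0.0)
```

With noisy poses this also rejects good points near depth edges, so I'd probably only trust it as a coarse outlier pass, not as dense supervision.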
Environment:
- PyTorch 2.4.1+cu124, Ubuntu 24, RTX 4090 (24GB)
- Input: 5-8 images per scene (1920x1080), noisy COLMAP poses (residual RANSAC outliers after bundle adjustment), sparse LiDAR depth
- Dataset: 500 indoor scenes, similar scale to ScanNet, with roughly 2-5 cm pose noise and ~3% depth outliers
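To quantify the scale drift, I've been comparing predictions against the sparse LiDAR with a median-of-ratios estimate, which tolerates the ~3% gross outliers. This is a hypothetical diagnostic of my own, not a repo API:

```python
import numpy as np

def robust_scale(pred_depth, gt_depth, min_depth=0.1):
    """Per-scene scale as the median of gt/pred over mutually valid pixels.
    The median is robust to a small fraction of gross depth outliers."""
    mask = (gt_depth > min_depth) & (pred_depth > min_depth)
    ratios = gt_depth[mask] / pred_depth[mask]
    return float(np.median(ratios))
```

A per-scene scale that grows with scene extent (rather than staying near 1.0) is how I'm distinguishing genuine scale drift from random depth error.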
Thanks again for making this open-source; the repo is a genuinely great resource to learn from.