This directory contains tool scripts for the DepthAnything-AC project.
The inference script supports processing of both images and videos, with flexible batch processing capabilities.
# Basic usage
python tools/infer.py --input image.jpg --output depth.png
# Specify model and encoder
python tools/infer.py --input image.jpg --output depth.png --model checkpoints/depth_anything_AC_vits.pth --encoder vits
# Use different colormap
python tools/infer.py --input image.jpg --output depth.png --colormap inferno

# Basic video processing
python tools/infer.py --input video.mp4 --output depth_video.mp4
# Specify output FPS
python tools/infer.py --input video.mp4 --output depth_video.mp4 --fps 30
# Use different colormap for video
python tools/infer.py --input video.mp4 --output depth_video.mp4 --colormap spectral

# Process images only in a directory
python tools/infer.py --input images/ --output output/ --mode images
# Process videos only in a directory
python tools/infer.py --input videos/ --output output/ --mode videos
# Process both images and videos (default mode)
python tools/infer.py --input media/ --output output/ --mode mixed
# Recursive processing of subdirectories
python tools/infer.py --input dataset/ --output results/ --recursive

| Parameter | Short | Type | Default | Description |
|---|---|---|---|---|
| --input | -i | str | - | Input image/video path or directory (required) |
| --output | -o | str | - | Output path (file or directory) (required) |
| --model | -m | str | checkpoints/depth_anything_v2_vits.pth | Model weight path |
| --encoder | - | str | vits | Encoder type (vits, vitb, vitl) |
| --colormap | - | str | spectral | Colormap (inferno, spectral, gray) |
| --depth_cap | - | float | 80 | Maximum depth value for capping |
| --fps | - | float | None | Output video FPS (defaults to input FPS) |
| --recursive | -r | flag | False | Search recursively in subdirectories |
| --mode | - | str | mixed | Processing mode (images, videos, mixed) |
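
As a rough illustration of how the parameters in the table fit together, the sketch below declares them with `argparse`. It is reconstructed only from the table; `build_parser` is a hypothetical name, and the actual argument handling in tools/infer.py may differ (for example, it may not restrict values with `choices`).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the parameter table; not the real tools/infer.py source."""
    parser = argparse.ArgumentParser(description="Depth inference for images and videos")
    parser.add_argument("--input", "-i", type=str, required=True,
                        help="Input image/video path or directory")
    parser.add_argument("--output", "-o", type=str, required=True,
                        help="Output path (file or directory)")
    parser.add_argument("--model", "-m", type=str,
                        default="checkpoints/depth_anything_v2_vits.pth",
                        help="Model weight path")
    # Restricting values with `choices` is an assumption based on the listed options.
    parser.add_argument("--encoder", type=str, default="vits",
                        choices=["vits", "vitb", "vitl"], help="Encoder type")
    parser.add_argument("--colormap", type=str, default="spectral",
                        choices=["inferno", "spectral", "gray"], help="Colormap")
    parser.add_argument("--depth_cap", type=float, default=80.0,
                        help="Maximum depth value for capping")
    parser.add_argument("--fps", type=float, default=None,
                        help="Output video FPS (defaults to input FPS)")
    parser.add_argument("--recursive", "-r", action="store_true",
                        help="Search recursively in subdirectories")
    parser.add_argument("--mode", type=str, default="mixed",
                        choices=["images", "videos", "mixed"], help="Processing mode")
    return parser


# Parse a sample command line instead of sys.argv so the sketch runs standalone.
args = build_parser().parse_args(["-i", "media/", "-o", "output/", "--mode", "images", "-r"])
print(args.mode, args.recursive, args.fps)
```

Only `--input` and `--output` are required; every other option falls back to the default listed in the table.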
Images: .jpg, .jpeg, .png, .bmp, .tiff, .tif
Videos: .mp4, .avi, .mov, .mkv, .flv, .wmv, .webm, .m4v
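
The interaction between the supported extensions, `--mode`, and `--recursive` can be sketched as follows. The helper names `media_kind` and `collect_media` are hypothetical, and case-insensitive extension matching is an assumption; the script's own file discovery may behave differently.

```python
from pathlib import Path

# Extension sets copied from the supported-format lists above.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".tif"}
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv", ".flv", ".wmv", ".webm", ".m4v"}


def media_kind(path):
    """Return "image", "video", or None, matching extensions case-insensitively (an assumption)."""
    ext = Path(path).suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    return None


def collect_media(root, mode="mixed", recursive=False):
    """Hypothetical helper: list the files under root that the given mode would select."""
    wanted = {"images": {"image"}, "videos": {"video"}, "mixed": {"image", "video"}}[mode]
    root = Path(root)
    # rglob descends into subdirectories; glob only looks at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(p for p in candidates if p.is_file() and media_kind(p) in wanted)


print(media_kind("clip.MOV"))   # video
print(media_kind("notes.txt"))  # None
```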
python tools/infer.py -i test_image.jpg -o result.png

Output:
- result.png - Colored depth map
- result_raw.npy - Raw depth data

python tools/infer.py -i test_video.mp4 -o depth_video.mp4 --fps 30

Output:
- depth_video.mp4 - Video with depth visualization

python tools/infer.py -i photos/ -o depth_results/ --mode images --recursive

Recursively processes all images in the photos/ directory and subdirectories, generating corresponding depth maps in the depth_results/ directory.

python tools/infer.py -i videos/ -o depth_videos/ --mode videos --fps 25

Processes all videos in the videos/ directory, generating depth videos with 25 FPS output.

python tools/infer.py -i media/ -o output/ --mode mixed --colormap inferno

Processes both images and videos in the media/ directory, using inferno colormap for visualization.

When processing directories, the output maintains the same folder structure as the input:
- Images: input_name_depth.png + input_name_depth_raw.npy
- Videos: input_name_depth.mp4
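
The naming convention above can be sketched as a small path-mapping function. `output_paths` is a hypothetical helper written for illustration; it assumes the relative subdirectory of each input file is reproduced under the output directory, which is how the folder-structure statement reads, but the script's actual implementation may differ.

```python
from pathlib import Path


def output_paths(input_file, input_root, output_root, is_video):
    """Hypothetical helper: mirror input_file's location under input_root into output_root."""
    rel = Path(input_file).relative_to(input_root)
    target_dir = Path(output_root) / rel.parent
    stem = rel.stem
    if is_video:
        # Videos produce a single visualization file.
        return [target_dir / f"{stem}_depth.mp4"]
    # Images produce a colored depth map plus the raw depth array.
    return [target_dir / f"{stem}_depth.png", target_dir / f"{stem}_depth_raw.npy"]


for p in output_paths("photos/trip/day1/beach.jpg", "photos", "depth_results", is_video=False):
    print(p.as_posix())
# depth_results/trip/day1/beach_depth.png
# depth_results/trip/day1/beach_depth_raw.npy
```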
Script for launching distributed training.

bash tools/train.sh [NUM_GPUS] [PORT]

Script for launching distributed evaluation.

bash tools/val.sh [NUM_GPUS] [PORT] [DATASET]

For more detailed information, please refer to the main project README documentation.