Install uv on your machine, then prepare the project with:
uv run
If you have a GPU, uninstall torch/torchvision and reinstall them with CUDA support:
uv pip uninstall torch torchvision
uv pip install torch torchvision --index-url https://download.pytorch.org/whl/cu130
uv pip show torch
python -m lerobot.scripts.lerobot_train --policy.type=smolvla --policy.repo_id=lerobot/smolvla_base --dataset.repo_id=fracapuano/behavior1k-task0000 --batch_size=64 --steps=1
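To double-check the reinstall before training, it may help to confirm from Python that the imported torch build was compiled against CUDA and can see the GPU. This is a minimal sketch that uses only standard PyTorch attributes; the helper name describe_torch is hypothetical and not part of lerobot.

```python
import importlib.util


def describe_torch():
    """Summarize the installed torch build (hypothetical helper, not a lerobot API)."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    info = {
        "installed": True,
        "version": torch.__version__,
        # None for CPU-only wheels, a CUDA version string for CUDA wheels.
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["device"] = props.name
        info["vram_gib"] = round(props.total_memory / 2**30, 1)
    return info


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

If cuda_build is None or cuda_available is False after the reinstall, the CPU wheel is most likely still the one being imported.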
- The dataset behavior-1k/2025-challenge-demos is in version v2.1, but lerobot supports only v3.0. There is a script to convert it, but it requires a local download, and the 50 tasks are very heavy. lerobot/behavior1k-task0000 and fracapuano/behavior1k-task0000 contain individual tasks and are already in v3.0.
- All downloaded and cached behavior-1k datasets are missing the following features in meta/stats.json:
  - observation.images.rgb.left_wrist
  - observation.images.rgb.right_wrist
  - observation.images.rgb.head
  - observation.images.depth.left_wrist
  - observation.images.depth.right_wrist
  - observation.images.depth.head
  - observation.images.seg_instance_id.left_wrist
  - observation.images.seg_instance_id.right_wrist
  - observation.images.seg_instance_id.head
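To see which of these keys are absent from a given local copy, a short check like the one below could be used. It is only a sketch: the function name missing_stats is hypothetical, and the location of meta/stats.json inside the local dataset cache has to be supplied by the caller.

```python
import json
from pathlib import Path

# The nine image features listed above: three modalities x three cameras.
EXPECTED_IMAGE_KEYS = [
    f"observation.images.{modality}.{camera}"
    for modality in ("rgb", "depth", "seg_instance_id")
    for camera in ("left_wrist", "right_wrist", "head")
]


def missing_stats(stats_path):
    """Return the expected image keys that have no entry in stats.json."""
    stats = json.loads(Path(stats_path).read_text())
    return [key for key in EXPECTED_IMAGE_KEYS if key not in stats]
```

An empty list would mean the file already has an entry for every image feature.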
A current fix is to add fake values for those stats, but it seems like a temporary solution:
"observation.images.seg_instance_id.head": {
"mean": [0.5, 0.5, 0.5],
"std": [0.5, 0.5, 0.5],
"min": [0.0, 0.0, 0.0],
"max": [1.0, 1.0, 1.0]
}
- When running the lerobot_train script, an error reports that the state dimension in behavior1k is 256, while smolvla defaults to max_state_dim=64. A workaround is to pass --policy.max_state_dim=256 in the CLI command, but doing so increases the number of parameters in the model, and my 8GB of VRAM can't support it. Maybe reducing the observation dimensions could help?
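The placeholder-stats workaround from the second item could be scripted instead of editing meta/stats.json by hand. The sketch below reuses the fake values shown above for every missing image key; applying the same three-channel values to the depth and seg_instance_id features is an assumption, and add_placeholder_stats is a hypothetical helper, not a lerobot function. It keeps a .bak copy of the original file so the change is easy to revert.

```python
import json
import shutil
from pathlib import Path

# Fake values copied from the snippet above (assumed valid for all nine features).
PLACEHOLDER = {
    "mean": [0.5, 0.5, 0.5],
    "std": [0.5, 0.5, 0.5],
    "min": [0.0, 0.0, 0.0],
    "max": [1.0, 1.0, 1.0],
}

IMAGE_KEYS = [
    f"observation.images.{modality}.{camera}"
    for modality in ("rgb", "depth", "seg_instance_id")
    for camera in ("left_wrist", "right_wrist", "head")
]


def add_placeholder_stats(stats_path, keys):
    """Add placeholder stats for any missing key; return the keys that were added."""
    path = Path(stats_path)
    stats = json.loads(path.read_text())
    added = []
    for key in keys:
        if key not in stats:
            # Fresh lists per key so entries do not share mutable state.
            stats[key] = {name: list(values) for name, values in PLACEHOLDER.items()}
            added.append(key)
    if added:
        # Keep the untouched original next to the patched file.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        path.write_text(json.dumps(stats, indent=4))
    return added
```

Calling add_placeholder_stats with the path to a cached dataset's meta/stats.json and IMAGE_KEYS leaves existing entries untouched, so running it twice is harmless.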