Training and Evaluating a Policy
Warning
We have introdced a new change in how the values are saved in the dataset. The values are now saved in the dataset as radians for all joints and no scaling is applied for the gripper. If you are using a previous version of the dataset, the values for joints 0-5 will be in degrees and a scaling of 10000 will be applied to gripper. Check Trossen v1.0 Dataset Format before using datasets from previous versions.
Training a Policy
To train a policy to control your robot, use the train.py script. A few arguments are required. Here is an example command:
python lerobot/scripts/train.py \
--dataset.repo_id=${HF_USER}/trossen_ai_stationary_test \
--policy.type=act \
--output_dir=outputs/train/act_trossen_ai_stationary_test \
--job_name=act_trossen_ai_stationary_test \
--device=cuda \
--wandb.enable=true
python lerobot/scripts/train.py \
--dataset.repo_id=${HF_USER}/trossen_ai_mobile_test \
--policy.type=act \
--output_dir=outputs/train/act_trossen_ai_mobile_test \
--job_name=act_trossen_ai_mobile_test \
--device=cuda \
--wandb.enable=true
python lerobot/scripts/train.py \
--dataset.repo_id=${HF_USER}/trossen_ai_solo_test \
--policy.type=act \
--output_dir=outputs/train/act_trossen_ai_solo_test \
--job_name=act_trossen_ai_solo_test \
--device=cuda \
--wandb.enable=true
Explanation of the Command
We provided the dataset using
--dataset.repo_id=${HF_USER}/trossen_ai_xxxxxxx_test
.We specified the policy with
--policy.type=act
, which loads configurations from configuration_act.py. This policy will automatically adapt to the number of motor states, motor actions, and cameras recorded in your dataset.We set
--device=cuda
to train on an NVIDIA GPU. If using Apple Silicon, you can replace it with--device=mps
.We enabled Weights & Biases logging using
--wandb.enable=true
for visualizing training plots. This is optional, but if used, ensure you’re logged in by running:wandb login
Note
Training will take several hours. Checkpoints will be saved in: outputs/train/act_trossen_ai_xxxxx_test/checkpoints.
Training Pipeline Configuration
The training pipeline can be configured using the following parameters:
--dataset
: Configuration for the dataset.--env
: Configuration for the environment. Can beNone
.--policy
: Configuration for the pre-trained policy. Can beNone
.--output_dir
: Directory to save all run outputs. If another training session is run with the same value, its contents will be overwritten unlessresume
is set to true.--job_name
: Name of the job. Can beNone
.--resume
: Set to true to resume a previous run. Ensureoutput_dir
is the directory of an existing run with at least one checkpoint.--device
: Device to use for training (e.g.,cuda
,cpu
,mps
).--use_amp
: Determines whether to use Automatic Mixed Precision (AMP) for training and evaluation.--seed
: Seed for training and evaluation environments.--num_workers
: Number of workers for the dataloader.--batch_size
: Batch size for training.--eval_freq
: Frequency of evaluation during training.--log_freq
: Frequency of logging during training.--save_checkpoint
: Whether to save checkpoints during training.--save_freq
: Frequency of saving checkpoints.--offline
: Configuration for offline training.--online
: Configuration for online training.--use_policy_training_preset
: Whether to use policy training preset.--optimizer
: Configuration for the optimizer. Can beNone
.--scheduler
: Configuration for the learning rate scheduler. Can beNone
.--eval
: Configuration for evaluation.--wandb
: Configuration for Weights & Biases logging.
Evaluating Your Policy
You can use the record
function from lerobot/scripts/control_robot.py but with a policy checkpoint as input.
Run the following command to record 10 evaluation episodes:
python lerobot/scripts/control_robot.py \
--robot.type=trossen_ai_stationary \
--control.type=record \
--control.fps=30 \
--control.single_task="Recording evaluation episode using Trossen AI Stationary." \
--control.repo_id=${HF_USER}/eval_act_trossen_ai_stationary_test \
--control.tags='["tutorial"]' \
--control.warmup_time_s=5 \
--control.episode_time_s=30 \
--control.reset_time_s=30 \
--control.num_episodes=10 \
--control.push_to_hub=true \
--control.policy.path=outputs/train/act_trossen_ai_stationary_test/checkpoints/last/pretrained_model \
--control.num_image_writer_processes=1
python lerobot/scripts/control_robot.py \
--robot.type=trossen_ai_mobile \
--control.type=record \
--control.fps=30 \
--control.single_task="Recording evaluation episode using Trossen AI Mobile." \
--control.repo_id=${HF_USER}/eval_act_trossen_ai_mobile_test \
--control.tags='["tutorial"]' \
--control.warmup_time_s=5 \
--control.episode_time_s=30 \
--control.reset_time_s=30 \
--control.num_episodes=10 \
--control.push_to_hub=true \
--control.policy.path=outputs/train/act_trossen_ai_mobile_test/checkpoints/last/pretrained_model \
--control.num_image_writer_processes=1 \
--robot.enable_motor_torque=true
python lerobot/scripts/control_robot.py \
--robot.type=trossen_ai_solo \
--control.type=record \
--control.fps=30 \
--control.single_task="Recording evaluation episode using Trossen AI Solo." \
--control.repo_id=${HF_USER}/eval_act_trossen_ai_solo_test \
--control.tags='["tutorial"]' \
--control.warmup_time_s=5 \
--control.episode_time_s=30 \
--control.reset_time_s=30 \
--control.num_episodes=10 \
--control.push_to_hub=true \
--control.policy.path=outputs/train/act_trossen_ai_solo_test/checkpoints/last/pretrained_model \
--control.num_image_writer_processes=1
Note
You can change the camera interface to use for recording by adding the following command line argument:
--robot.camera_interface='opencv'
.
This is useful if you have configured multiple camera interfaces as explained in Camera Serial Number.
Key Differences from Training Data Recording
Policy Checkpoint:
The command includes
--control.policy.path
, which specifies the path to the trained policy checkpoint (e.g., outputs/train/act_trossen_ai_xxxxx_test/checkpoints/last/pretrained_model).If you uploaded the model checkpoint to Hugging Face Hub, you can also specify it as: –control.policy.path=${HF_USER}/act_trossen_ai_xxxxx_test.
Dataset Naming Convention:
The dataset name now begins with
eval_
(e.g.,${HF_USER}/eval_act_trossen_ai_xxxxx_test
) to indicate that this is an evaluation dataset.
Image Writing Process:
We set
--control.num_image_writer_processes=1
instead of the default0
.On some systems, using a dedicated process for writing images (from multiple cameras) allows achieving a consistent 30 FPS during inference.
You can experiment with different values of
--control.num_image_writer_processes
to optimize performance.