Skating
The goal of this training environment is to teach the agent to ride a skateboard efficiently and perform basic maneuvers without falling. Skating is a highly dynamic task that combines balance, coordination, and precision. This environment builds on the agent's foundational locomotion skills and adds the challenge of staying stable on a moving platform, the skateboard.
The skating agent is a bipedal humanoid with a torso, legs, and arms, paired with a simulated skateboard. The agent must learn to shift its weight, control its limbs, and manage the skateboard's motion to maintain balance and move forward.
Rewards
The total reward encourages effective skating technique and penalizes unsafe or inefficient behavior. It is computed as:
reward = healthy_reward + forward_reward - ctrl_cost - contact_cost - balance_penalty - trick_penalty
healthy_reward: A fixed reward for each timestep the agent remains on the skateboard without falling.
forward_reward: A positive reward proportional to the skateboard's forward velocity, encouraging efficient movement.
ctrl_cost: A penalty for excessive or wasteful limb movements while controlling the skateboard.
contact_cost: A penalty for excessive force during foot or hand contact with the ground or skateboard.
balance_penalty: A penalty for tipping or wobbling excessively while skating.
trick_penalty: A penalty for failed tricks or unnecessary risky maneuvers.
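The reward terms above can be combined as in the following minimal sketch. It is an illustration under stated assumptions, not the environment's actual implementation: the default weights are taken from the Arguments section, but the quadratic form of ctrl_cost and contact_cost, and the inputs wobble and failed_tricks, are hypothetical choices made here so the example runs on its own.

```python
import numpy as np


def skating_reward(
    forward_velocity,        # skateboard forward velocity (m/s)
    action,                  # control vector applied this timestep
    contact_forces,          # external contact forces on the body
    wobble=0.0,              # hypothetical instability measure (>= 0)
    failed_tricks=0,         # hypothetical count of failed tricks this step
    is_healthy=True,         # True while the agent is still on the board
    forward_reward_weight=2.0,
    ctrl_cost_weight=0.05,
    contact_cost_weight=1e-6,
    contact_cost_range=(-np.inf, 15.0),
    healthy_reward=10.0,
    balance_penalty_weight=0.1,
    trick_penalty_weight=0.05,
):
    """Return (total_reward, per_term_breakdown) for one timestep."""
    action = np.asarray(action, dtype=float)
    contact_forces = np.asarray(contact_forces, dtype=float)

    # Positive terms.
    healthy = healthy_reward if is_healthy else 0.0
    forward = forward_reward_weight * forward_velocity

    # Assumed quadratic costs; contact_cost is clamped to contact_cost_range.
    ctrl_cost = ctrl_cost_weight * np.sum(np.square(action))
    raw_contact = contact_cost_weight * np.sum(np.square(contact_forces))
    contact_cost = np.clip(raw_contact, contact_cost_range[0], contact_cost_range[1])

    # Assumed linear penalties on the hypothetical wobble and trick inputs.
    balance_penalty = balance_penalty_weight * wobble
    trick_penalty = trick_penalty_weight * failed_tricks

    total = (healthy + forward - ctrl_cost - contact_cost
             - balance_penalty - trick_penalty)
    terms = {
        "healthy_reward": float(healthy),
        "forward_reward": float(forward),
        "ctrl_cost": float(ctrl_cost),
        "contact_cost": float(contact_cost),
        "balance_penalty": float(balance_penalty),
        "trick_penalty": float(trick_penalty),
    }
    return float(total), terms
```

With the default weights, an upright agent moving at 1.0 m/s with zero action and zero contact force receives 10.0 + 2.0 = 12.0. Returning the per-term breakdown alongside the total makes it easier to see which penalty dominates during training.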
Challenges
Dynamic Balance: The agent must manage its center of gravity to stay upright on a moving skateboard.
Weight Shifting: The agent must learn to shift weight between its legs and adjust its posture to steer and accelerate.
Obstacle Navigation: The agent must avoid or maneuver around obstacles while maintaining speed and balance.
Arguments
| Parameter | Default | Description |
| --- | --- | --- |
| learning_rate | 3e-4 | Determines how quickly the agent updates its policy during skating training. |
| clip_range | 0.2 | Limits the magnitude of policy updates to ensure stable learning. |
| entropy_coefficient | 0.02 | Encourages exploration of new skating techniques and movements. |
| forward_reward_weight | 2.0 | Weight for the forward_reward, incentivizing faster skating speeds. |
| ctrl_cost_weight | 0.05 | Penalizes inefficient or erratic limb movements during skating. |
| contact_cost_weight | 1e-6 | Penalizes abrupt or heavy contact with the ground or skateboard. |
| contact_cost_range | (-np.inf, 15.0) | Clamps the contact_cost term to prevent runaway penalties. |
| healthy_reward | 10.0 | Fixed reward for maintaining balance and avoiding falls. |
| balance_penalty_weight | 0.1 | Penalizes excessive wobbling or instability while riding the skateboard. |
| trick_penalty_weight | 0.05 | Penalizes failed or unnecessary tricks during the learning phase. |
| skateboard_friction | 0.7 | Coefficient of friction between the skateboard and ground, affecting speed and stability. |
| obstacle_density | low | Determines the frequency of obstacles in the environment (e.g., low, medium, high). |
| target_speed | 3.0 m/s | Goal speed for the agent to reach and sustain during skating. |
| termination_penalty | -30.0 | Penalty for falling off the skateboard or losing forward motion. |
| terrain_type | flat | Defines the type of surface the skateboard rolls on (e.g., flat, inclined, uneven). |
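The arguments listed above can be gathered into a single configuration object, as in the sketch below. The class name SkatingConfig is hypothetical and not part of any documented API. The sketch also assumes that the example values given for obstacle_density and terrain_type are the only valid ones, which the descriptions above do not confirm.

```python
import math
from dataclasses import dataclass, replace
from typing import Tuple

# Assumption: the "e.g." values in the argument descriptions are exhaustive.
OBSTACLE_DENSITIES = ("low", "medium", "high")
TERRAIN_TYPES = ("flat", "inclined", "uneven")


@dataclass(frozen=True)
class SkatingConfig:
    """Hypothetical container for the documented defaults."""
    learning_rate: float = 3e-4
    clip_range: float = 0.2
    entropy_coefficient: float = 0.02
    forward_reward_weight: float = 2.0
    ctrl_cost_weight: float = 0.05
    contact_cost_weight: float = 1e-6
    contact_cost_range: Tuple[float, float] = (-math.inf, 15.0)
    healthy_reward: float = 10.0
    balance_penalty_weight: float = 0.1
    trick_penalty_weight: float = 0.05
    skateboard_friction: float = 0.7
    obstacle_density: str = "low"
    target_speed: float = 3.0            # m/s
    termination_penalty: float = -30.0   # sign kept as documented
    terrain_type: str = "flat"

    def __post_init__(self):
        # Reject values outside the assumed option sets.
        if self.obstacle_density not in OBSTACLE_DENSITIES:
            raise ValueError(f"unknown obstacle_density: {self.obstacle_density!r}")
        if self.terrain_type not in TERRAIN_TYPES:
            raise ValueError(f"unknown terrain_type: {self.terrain_type!r}")
        low, high = self.contact_cost_range
        if low > high:
            raise ValueError("contact_cost_range must be (low, high) with low <= high")


# Start from the defaults and override only what an experiment changes.
default_cfg = SkatingConfig()
harder_cfg = replace(default_cfg, obstacle_density="medium", terrain_type="inclined")
```

Making the dataclass frozen keeps one experiment's settings from being mutated by accident, and dataclasses.replace gives a cheap way to derive variants from the defaults.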