Commit 3e538bc7 authored by sbl1996@126.com's avatar sbl1996@126.com

Change default hyperparameters

parent 81d80f7f
...@@ -110,11 +110,11 @@ class Args: ...@@ -110,11 +110,11 @@ class Args:
"""whether to use the PPO clipping to replace V-Trace surrogate clipping""" """whether to use the PPO clipping to replace V-Trace surrogate clipping"""
clip_coef: float = 0.25 clip_coef: float = 0.25
"""the PPO surrogate clipping coefficient""" """the PPO surrogate clipping coefficient"""
dual_clip_coef: Optional[float] = None dual_clip_coef: Optional[float] = 3.0
"""the dual surrogate clipping coefficient""" """the dual surrogate clipping coefficient, typically 3.0"""
ent_coef: float = 0.01 ent_coef: float = 0.01
"""coefficient of the entropy""" """coefficient of the entropy"""
vf_coef: float = 0.5 vf_coef: float = 1.0
"""coefficient of the value function""" """coefficient of the value function"""
max_grad_norm: float = 1.0 max_grad_norm: float = 1.0
"""the maximum norm for the gradient clipping""" """the maximum norm for the gradient clipping"""
......
...@@ -105,7 +105,7 @@ class Args: ...@@ -105,7 +105,7 @@ class Args:
"""Toggles advantages normalization""" """Toggles advantages normalization"""
clip_coef: float = 0.25 clip_coef: float = 0.25
"""the surrogate clipping coefficient""" """the surrogate clipping coefficient"""
dual_clip_coef: Optional[float] = None dual_clip_coef: Optional[float] = 3.0
"""the dual surrogate clipping coefficient, typically 3.0""" """the dual surrogate clipping coefficient, typically 3.0"""
spo_kld_max: Optional[float] = None spo_kld_max: Optional[float] = None
"""the maximum KLD for the SPO policy, typically 0.02""" """the maximum KLD for the SPO policy, typically 0.02"""
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment