From: Adversarial attack and defense in reinforcement learning-from AI security view
RL algorithm | Approach type | Learning type | Application scenarios |
---|---|---|---|
Q-Learning (Watkins and Dayan 1992) | Value-based | Shallow Learning | Motion Control, Control Systems, Robot Applications, etc. |
DQN (Mnih et al. 2013) | Value-based | Deep Learning | Motion Control, Neutralization Reaction Control, Robot Path Planning, etc. |
VIN (Tamar et al. 2016) | Value-based | Deep Learning | Path Planning, Motion Control, etc. |
A3C (Mnih et al. 2016) | Combined | Deep Learning | Motion Control, Game Playing, Self-Driving, Path Planning, etc. |
TRPO (Schulman et al. 2015) | Policy-based | Deep Learning | Motion Control, Game Playing, etc. |
UNREAL (Jaderberg et al. 2016) | Combined | Deep Learning | Motion Control, Game Playing, etc. |