Fig. 6 | Cybersecurity

Fig. 6

From: Adversarial attack and defense in reinforcement learning-from AI security view

The exploitation cycle of the policy induction attack (Behzadan and Munir 2017). In the first phase, the adversary observes the current state and the transitions in the environment. The adversary then estimates the optimal action to select based on the adversarial policy. In the next phase, the adversary applies a perturbation to the target’s input. Finally, the adversary waits for the action selected by the agent
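
The four phases in the caption can be read as one iteration of a loop. The following is a minimal sketch of that loop, not the method of Behzadan and Munir: the linear target agent, the fixed adversarial policy, the bound `EPSILON`, and the random-search crafting step are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import random

N_ACTIONS = 3
EPSILON = 0.5  # assumed bound on each component of the perturbation

# Hypothetical target agent: one row of linear weights per action.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]


def target_action(obs):
    """Return the action the target agent picks for an observation."""
    scores = [sum(w * x for w, x in zip(row, obs)) for row in WEIGHTS]
    return scores.index(max(scores))


def adversarial_policy(state):
    """Hypothetical adversarial policy: always steer toward action 2."""
    return 2


def craft_perturbation(state, desired, rng, tries=200):
    """Stand-in crafting step: random search inside the EPSILON bound.

    Returns a zero perturbation if no candidate induces the desired action.
    """
    for _ in range(tries):
        delta = [rng.uniform(-EPSILON, EPSILON) for _ in state]
        perturbed = [s + d for s, d in zip(state, delta)]
        if target_action(perturbed) == desired:
            return delta
    return [0.0] * len(state)


def attack_step(state, rng):
    """Run one exploitation cycle on an observed state."""
    # Phases 1-2: observe the state, then pick the action to induce.
    desired = adversarial_policy(state)
    # Phase 3: craft a perturbation and apply it to the target's input.
    delta = craft_perturbation(state, desired, rng)
    perturbed = [s + d for s, d in zip(state, delta)]
    # Phase 4: wait for (here, query) the action the agent selects.
    chosen = target_action(perturbed)
    return desired, delta, chosen


if __name__ == "__main__":
    rng = random.Random(0)
    state = [0.1, 0.05]
    desired, delta, chosen = attack_step(state, rng)
    print("clean action:", target_action(state))
    print("desired:", desired, "chosen after perturbation:", chosen)
```

In this toy setting the adversary queries the target directly while searching for a perturbation; a real attacker would more plausibly craft the perturbation against its own replica of the target's model.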
