Policy Gradient Methods: Variance Reduction and Stochastic Convergence

Greensmith, Evan

doi:10.25911/5d7a2a01dcebe

Description

In a reinforcement learning task, an agent must learn a policy for choosing actions so as to perform well in a given environment. Policy gradient methods consider a parameterized class of policies. Using a policy from the class, and a trajectory through the environment taken by the agent under this policy, they estimate the gradient of the policy's performance with respect to the parameters. Policy gradient methods avoid some of the problems of value function methods, such as policy degradation, where inaccuracy in the value function leads to the choice of a poor policy. However, the estimates produced by policy gradient methods can have high variance. ...
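As a rough illustration of the estimation problem described above, the following minimal sketch draws single-sample gradient estimates for a two-action problem with a one-parameter logistic policy. Everything in it is an assumption made for illustration and is not taken from the thesis: the one-step setting (a trajectory of length one), the reward values, the function names, and the use of a constant baseline, which is one standard variance reduction device.

```python
import math
import random
import statistics


def prob_action_1(theta):
    """Probability of choosing action 1 under a logistic policy (hypothetical)."""
    return 1.0 / (1.0 + math.exp(-theta))


def gradient_samples(theta, rewards, baseline, n, rng):
    """Draw n one-step score-function gradient estimates.

    Each sample is (reward - baseline) * d/dtheta log pi(action | theta).
    """
    p = prob_action_1(theta)
    samples = []
    for _ in range(n):
        action = 1 if rng.random() < p else 0
        # d/dtheta log pi(action): (1 - p) for action 1, -p for action 0
        score = (1.0 - p) if action == 1 else -p
        samples.append((rewards[action] - baseline) * score)
    return samples


rng = random.Random(0)
theta = 0.0
rewards = (1.0, 2.0)  # assumed deterministic rewards for actions 0 and 1

no_baseline = gradient_samples(theta, rewards, 0.0, 10000, rng)
with_baseline = gradient_samples(theta, rewards, 1.5, 10000, rng)

print("mean, variance without baseline:",
      statistics.mean(no_baseline), statistics.pvariance(no_baseline))
print("mean, variance with baseline:   ",
      statistics.mean(with_baseline), statistics.pvariance(with_baseline))
```

In this toy setting the true gradient is p(1 - p)(2 - 1) = 0.25 at theta = 0. Both estimators target that value, but subtracting the baseline 1.5 makes every sample equal to 0.25, so the variance vanishes. Realistic problems do not collapse this neatly; the sketch only shows why the choice of baseline matters.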

Collections	Open Access Theses
Date published:	2005
Type:	Thesis (PhD)
URI:	http://hdl.handle.net/1885/47105
DOI:	10.25911/5d7a2a01dcebe

Download

File	Description	Size	Format	Image
02whole.pdf		1.19 MB	Adobe PDF
01front.pdf		69.02 kB	Adobe PDF
