Sridhar Thiagarajan

Reinforcement Learning

Stochastic Policy Gradient Methods

Deterministic Policy Gradient Methods

Some research directions worth looking at

Deep Q-Networks and Double DQN

Fourier Basis for value function approximation

Off Policy Eligibility Traces

Monte Carlo Exploring starts for Blackjack

Comparison of Q-learning and SARSA

Comparison of Types of Eligibility traces on Maze

Python Code for Q-learning
