Value Iteration

Value iteration is an algorithm that iteratively improves the value function by updating it with the Bellman optimality equation. Value iteration starts with an arbitrary value function and repeatedly applies the Bellman optimality equation until the value function converges. Value iteration is guaranteed to converge to the optimal value function, if the MDP is finite and the value function is bounded.