Recursive Least-Squares Policy Iteration (PI) Based on Geodesic Gaussian Basis Functions
Building on this, an online adaptive policy iteration (PI) algorithm based on a single sample path is presented, and its convergence is proved.
The policy iteration (PI) method is used to obtain the solution.
The principle and method of policy iteration (PI) are discussed, and applications of the method are explored.
The optimal inventory allocation policy of the system can be obtained using policy iteration (PI) or value iteration.
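As an illustration of the policy iteration (PI) method these sentences refer to, here is a minimal sketch on a finite Markov decision process. It alternates exact policy evaluation (a linear solve) with greedy policy improvement until the policy stops changing. The function name `policy_iteration`, the array layout of `P` and `R`, the discount factor, and the two-state toy problem are all assumptions made for this example; they are not taken from any of the quoted papers and do not model the inventory system mentioned above.

```python
import numpy as np


def policy_iteration(P, R, gamma=0.9):
    """Policy iteration for a finite MDP (illustrative sketch).

    P: array of shape (A, S, S), P[a, s, t] = probability of moving s -> t under action a.
    R: array of shape (S, A), R[s, a] = expected immediate reward.
    Returns the final policy (one action index per state) and its value function.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)  # arbitrary starting policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[policy, states]            # (S, S) transitions under the current policy
        R_pi = R[states, policy]            # (S,) rewards under the current policy
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to the action values Q.
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy


# Hypothetical toy problem: action 0 stays in place, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
R = np.array([
    [0.0, 1.0],  # state 0: staying pays 0, switching pays 1
    [2.0, 0.0],  # state 1: staying pays 2, switching pays 0
])
policy, V = policy_iteration(P, R, gamma=0.9)
```

In this toy problem the loop settles on switching out of state 0 and staying in state 1. Value iteration would replace the linear solve with repeated Bellman backups; it is not shown here.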