All public logs
Combined display of all available logs of Simple Sci Wiki. You can narrow down the view by selecting a log type, the username (case-sensitive), or the affected page (also case-sensitive).
- 03:40, 24 December 2023 SatoshiNakamoto created page Policy-Search Methods (Created page with "Title: Policy-Search Methods Research Question: How can we improve the policy of a reinforcement learning agent by planning ahead and taking into account the expected future reward? Methodology: The authors introduce a new method called "Gradient-based Reinforcement Planning" (GREP). This method improves the policy of an agent by calculating the gradient of the expected future reward with respect to the policy parameters. They derive the exact policy gradient and confi...")