- 书名
Reinforcement Learning and Optimal Control
- 作者Dimitri Bertsekas
- 格式PDF
- ISBN书号9781886529397
- 出版年2019-7
- 出版社Athena Scientific
- 页数388
- 装帧Hardcover
内容简介
This book considers large and challenging multistage decision problems, which can be solved in principle by dynamic programming, but their exact solution is computationally intractable. It can be used as a textbook or for self-study in conjunction with instructional videos and slides, and other supporting material, which are available from the author's website. The book discusses solution methods that rely on approximations to produce suboptimal policies with adequate performance. These methods are known by several essentially equivalent names: reinforcement learning, approximate dynamic programming, and neuro-dynamic programming. They underlie, among others, the recent impressive successes of self-learning in the context of games such as chess and Go. One of the aims of the book is to explore the common boundary between artificial intelligence and optimal control, and to form a bridge that is accessible by workers with background in either field. Another aim is to organize coherently the broad mosaic of methods that have proved successful in practice while having a solid theoretical and/or logical foundation. This may help researchers and practitioners to find their way through the maze of competing ideas that constitute the current state of the art. The mathematical style of this book is somewhat different than other books by the same author. While we provide a rigorous, albeit short, mathematical account of the theory of finite and infinite horizon dynamic programming, and some fundamental approximation methods, we rely more on intuitive explanations and less on proof-based insights. We also illustrate the methodology with many example algorithms and applications.
Dimitri Bertsekas is McAffee Professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, and a member of the National Academy of Engineering. He has researched a broad variety of subjects from optimization theory, control theory, parallel and distributed computation, systems analysis, and data communication networks. He has written num...