
Mastering Reinforcement Learning with Python

Build next-generation, self-learning models using reinforcement learning techniques and best practices

Ebook list price: 29,000 KRW
Sale price: 29,000 KRW
Publication: ebook released 2020.12.18
Listening: TTS (text-to-speech) supported
File details: PDF, 544 pages, 15.3 MB
Supported devices: PC viewer, PAPER
ISBN: 9781838648497
ECN: -

About This Book

Get hands-on experience in creating state-of-the-art reinforcement learning agents using TensorFlow and RLlib to solve complex real-world business and industry problems with the help of expert tips and best practices

▶What You Will Learn
- Model and solve complex sequential decision-making problems using RL
- Develop a solid understanding of how state-of-the-art RL methods work
- Use Python and TensorFlow to code RL algorithms from scratch
- Parallelize and scale up your RL implementations using Ray's RLlib package
- Get in-depth knowledge of a wide variety of RL topics
- Understand the trade-offs between different RL approaches
- Discover and address the challenges of implementing RL in the real world

▶Key Features
- Understand how large-scale state-of-the-art RL algorithms and approaches work
- Apply RL to solve complex problems in marketing, robotics, supply chain, finance, cybersecurity, and more
- Explore tips and best practices from experts that will enable you to overcome real-world RL challenges

▶Who This Book Is For
This book is for expert machine learning practitioners and researchers looking to focus on hands-on reinforcement learning with Python by implementing advanced deep reinforcement learning concepts in real-world projects. Reinforcement learning experts who want to advance their knowledge to tackle large-scale and complex sequential decision-making problems will also find this book useful. Working knowledge of Python programming and deep learning along with prior experience in reinforcement learning is required.

▶What this book covers
- Chapter 1, Introduction to Reinforcement Learning, provides an introduction to RL, presents motivating examples and success stories, and looks at RL applications in industry. It then gives some fundamental definitions to refresh your mind on RL concepts and concludes with a section on software and hardware setup.

- Chapter 2, Multi-Armed Bandits, covers a rather simple RL setting, bandit problems without context, which nonetheless has tremendous applications in industry as an alternative to traditional A/B testing. The chapter also describes a fundamental RL trade-off: exploration versus exploitation. It then presents three approaches to tackling this trade-off and compares them against A/B testing.
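
As a flavor of what the chapter covers, here is a minimal sketch of epsilon-greedy action selection, one common way to balance exploration and exploitation; the ad variants and click probabilities below are made up for illustration and are not from the book.

import numpy as np

# Epsilon-greedy multi-armed bandit sketch. The three "ad" variants
# and their click probabilities are hypothetical, for illustration.
true_click_probs = [0.03, 0.05, 0.04]   # unknown to the agent
n_arms = len(true_click_probs)
counts = np.zeros(n_arms)               # pulls per arm
values = np.zeros(n_arms)               # estimated mean reward per arm
epsilon = 0.1                           # exploration rate

rng = np.random.default_rng(0)
for t in range(10_000):
    if rng.random() < epsilon:
        arm = int(rng.integers(n_arms))     # explore: random arm
    else:
        arm = int(np.argmax(values))        # exploit: best arm so far
    reward = float(rng.random() < true_click_probs[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # running mean

print("Estimated click rates:", values.round(4))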

- Chapter 3, Contextual Bandits, takes the discussion on multi-armed bandits to an advanced level by adding context to the decision-making process and involving deep neural networks in decision making. We adapt a real dataset from the U.S. Census to an online advertising problem. We conclude the chapter with a section on the applications of bandit problems in industry and business.
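
To illustrate the idea of conditioning decisions on context, here is a toy sketch of a linear contextual bandit with epsilon-greedy exploration; the chapter itself uses deep neural networks and real Census data, whereas the features and click model below are entirely synthetic.

import numpy as np

# Toy linear contextual bandit: one weight vector per ad, updated by
# SGD on observed click feedback. Context and clicks are synthetic.
rng = np.random.default_rng(1)
n_ads, d, epsilon, lr = 3, 4, 0.1, 0.05
W = np.zeros((n_ads, d))                    # per-ad reward model

for t in range(5_000):
    x = rng.normal(size=d)                  # user context (synthetic)
    if rng.random() < epsilon:
        a = int(rng.integers(n_ads))        # explore
    else:
        a = int(np.argmax(W @ x))           # exploit current estimates
    # synthetic click signal: ad a "matches" context feature a
    click = float(rng.random() < 1.0 / (1.0 + np.exp(-x[a])))
    W[a] += lr * (click - W[a] @ x) * x     # SGD on squared error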

- Chapter 4, Makings of a Markov Decision Process, builds the mathematical theory behind sequential decision processes that are solved using RL. We start with Markov chains, where we describe types of states, ergodicity, and transient and steady-state behavior. Then we go into Markov reward and decision processes. Along the way, we introduce return, discount, policy, value functions, and Bellman optimality, key concepts in RL theory that will be referred to frequently in later chapters. We conclude the chapter with a discussion on partially observable Markov decision processes. Throughout the chapter, we use a grid world example to illustrate the concepts.
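
For reference, the Bellman optimality equation for the state-value function, a centerpiece of this chapter, reads as follows in standard notation, with p denoting the transition dynamics and \gamma the discount factor:

V^*(s) = \max_{a} \sum_{s', r} p(s', r \mid s, a)\,\bigl[ r + \gamma V^*(s') \bigr]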

- Chapter 5, Solving the Reinforcement Learning Problem, presents and compares dynamic programming, Monte Carlo, and temporal-difference methods, which are fundamental to understanding how to solve a Markov decision process. Key approaches such as policy evaluation, policy iteration, and value iteration are introduced and illustrated. Throughout the chapter, we solve an example inventory replenishment problem, and along the way we motivate the need for deep RL methods. We conclude the chapter with a discussion on the importance of simulation in reinforcement learning.
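
To make the idea concrete, the following toy sketch runs value iteration on a hypothetical four-state chain MDP (not the book's inventory example), sweeping Bellman backups until the value estimates converge.

import numpy as np

# Toy value iteration on a hypothetical four-state chain MDP:
# actions 0 = left, 1 = right, deterministic moves, reward 1 for
# entering the rightmost (terminal) state.
n_states, gamma, theta = 4, 0.9, 1e-6

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

V = np.zeros(n_states)
while True:
    delta = 0.0
    for s in range(n_states - 1):               # last state is terminal
        q_values = []
        for a in (0, 1):
            s_next, r = step(s, a)
            q_values.append(r + gamma * V[s_next])   # Bellman backup
        v_new = max(q_values)
        delta = max(delta, abs(v_new - V[s]))
        V[s] = v_new
    if delta < theta:                            # converged
        break

print("Optimal state values:", V.round(3))      # -> [0.81, 0.9, 1.0, 0.0]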

- Chapter 6, Deep Q-Learning at Scale, starts with a discussion on why it is challenging to use deep neural networks in reinforcement learning and how modern deep Q-learning addresses those challenges. After a thorough coverage of scalable deep Q-learning methods, we introduce Ray, a distributed computing framework, with which we implement a parallelized deep Q-learning variant. We finish the chapter by introducing RLlib, Ray's own scalable RL library.
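
As a taste of what training with RLlib looks like, here is a minimal sketch using the Trainer API from around the book's 2020 release (newer Ray versions have since moved to AlgorithmConfig builders); CartPole is a stand-in environment, not the book's example.

import ray
from ray.rllib.agents.dqn import DQNTrainer   # pre-2022 RLlib API

ray.init()
trainer = DQNTrainer(
    env="CartPole-v0",              # stand-in environment
    config={
        "num_workers": 4,           # parallel rollout workers via Ray
        "framework": "tf",
    },
)
for i in range(10):
    result = trainer.train()        # one training iteration
    print(i, result["episode_reward_mean"])
ray.shutdown()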

- Chapter 7, Policy-Based Methods, introduces another important class of RL approaches: policy-based methods. You will first learn how they differ from Q-learning and why they are needed. As we build the theory behind contemporary policy-based methods, we also show how you can use RLlib to apply them to a sample problem.
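
The core policy-gradient idea can be sketched in a few lines of TensorFlow 2; the REINFORCE-style update below, with a placeholder two-layer network and a random fake batch, is illustrative only and is not the book's exact code.

import numpy as np
import tensorflow as tf

# REINFORCE-style policy-gradient update sketch. Real states, actions,
# and returns would come from rolled-out episodes.
policy = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(2),                   # logits over 2 actions
])
optimizer = tf.keras.optimizers.Adam(1e-3)

def reinforce_step(states, actions, returns):
    with tf.GradientTape() as tape:
        logits = policy(states)
        neg_log_probs = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=actions, logits=logits)      # = -log pi(a|s)
        loss = tf.reduce_mean(neg_log_probs * returns)  # maximize E[G log pi]
    grads = tape.gradient(loss, policy.trainable_variables)
    optimizer.apply_gradients(zip(grads, policy.trainable_variables))

# one update on a random fake batch, just to show the call
states = np.random.randn(16, 4).astype("float32")
actions = np.random.randint(0, 2, size=16)
returns = np.random.randn(16).astype("float32")
reinforce_step(states, actions, returns)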

- Chapter 8, Model-Based Methods, presents how learning a model of the environment can help an RL agent to plan its actions efficiently. In the chapter, we implement and use variants of cross-entropy methods and present Dyna, an RL framework that combines model-free and model-based approaches.
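
To give a feel for the cross-entropy method, the sketch below optimizes a stand-in quadratic objective (think of it as episode return under a parameterized policy): sample candidates from a Gaussian, keep the elites, and refit.

import numpy as np

# Cross-entropy method (CEM) sketch: sample candidate solutions from a
# Gaussian, keep the top "elite" fraction, refit the Gaussian, repeat.
# The quadratic objective is a stand-in, for illustration only.
def objective(x):
    return -np.sum((x - 3.0) ** 2, axis=1)      # maximized at x = 3

rng = np.random.default_rng(0)
dim, pop_size, n_elite = 5, 100, 10
mean, std = np.zeros(dim), np.full(dim, 5.0)

for it in range(30):
    samples = rng.normal(mean, std, size=(pop_size, dim))
    elite_idx = np.argsort(objective(samples))[-n_elite:]
    elites = samples[elite_idx]
    mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3

print("CEM solution:", mean.round(2))           # close to [3, 3, 3, 3, 3]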

- Chapter 9, Multi-Agent Reinforcement Learning, shifts gears, goes into multi-agent settings, and presents the challenges that come with them. In the chapter, we train tic-tac-toe agents through self-play, which you can also play against for fun.

- Chapter 10, Introducing Machine Teaching, introduces an emerging concept in RL that focuses on leveraging the subject matter expertise of a human "teacher" to make learning easy for RL agents. We present how reward function engineering, curriculum learning, demonstration learning, and action masking can help with training autonomous agents effectively.
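
Action masking, one of the techniques mentioned here, can be sketched as follows: invalid actions' logits are pushed to a large negative value before the softmax, so the policy can never select them. The mask below is hypothetical; in practice the environment reports which actions are currently legal.

import numpy as np

# Action-masking sketch: invalid actions get effectively zero
# probability after the masked softmax.
def masked_sample(logits, valid_mask, rng):
    masked = np.where(valid_mask, logits, -1e9)
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

rng = np.random.default_rng(0)
logits = np.array([1.2, 0.3, -0.5, 2.0])
valid = np.array([True, True, False, False])    # last two actions illegal
print(masked_sample(logits, valid, rng))        # always returns 0 or 1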

- Chapter 11, Achieving Generalization and Overcoming Partial Observability, discusses why the generalization capabilities of trained RL policies matter for successful real-world implementations. To this end, the chapter focuses on the simulation-to-real gap, connects generalization and partial observability, and introduces domain randomization and memory mechanisms. We also present the CoinRun environment and results showing how traditional regularization methods can also help with generalization in RL.
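
Domain randomization can be sketched as a thin environment wrapper that perturbs a physics parameter on every reset, so the learned policy cannot overfit to a single setting. The example below assumes the classic gym API (reset() returns only the observation) and randomizes CartPole's pole length purely for illustration.

import gym
import numpy as np

# Domain-randomization sketch: perturb a physics parameter per reset.
class RandomizedCartPole(gym.Wrapper):
    def reset(self, **kwargs):
        env = self.env.unwrapped
        env.length = np.random.uniform(0.4, 0.7)          # half-pole length
        env.polemass_length = env.masspole * env.length   # keep consistent
        return self.env.reset(**kwargs)

env = RandomizedCartPole(gym.make("CartPole-v1"))
obs = env.reset()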

- Chapter 12, Meta-Reinforcement Learning, introduces approaches that allow an RL agent to adapt to a new environment once it is deployed for its task. This is one of the most important research directions towards achieving resilient autonomy through RL.

- Chapter 13, Exploring Advanced Topics, brings you up to speed with some of the most recent developments in RL, including state-of-the-art distributed RL (SEED RL), an approach that cracked all the Atari benchmarks (Agent57), and RL without simulation (offline RL).

- Chapter 14, Solving Robot Learning, puts the methods covered in the earlier chapters into practice by training a robot hand to grasp objects using manual and automated curriculum learning in PyBullet, a popular physics simulation library for Python.

- Chapter 15, Supply Chain Management, gives you hands-on experience in modeling and solving an inventory replenishment problem. Along the way, we perform hyperparameter tuning for our RL agent. The chapter concludes with a discussion on how RL can be applied to vehicle routing problems.
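
Hyperparameter tuning of this sort is typically done with Ray Tune; a minimal sketch using the tune.run API from the book's Ray era might look like the following, with CartPole standing in for the chapter's inventory environment.

from ray import tune

# Hyperparameter sweep sketch with Ray Tune (pre-2022 API): grid-search
# the learning rate for a PPO agent on a stand-in environment.
analysis = tune.run(
    "PPO",
    config={
        "env": "CartPole-v0",
        "lr": tune.grid_search([1e-4, 5e-4, 1e-3]),
        "num_workers": 2,
    },
    stop={"training_iteration": 20},
)
print(analysis.get_best_config(metric="episode_reward_mean", mode="max"))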

- Chapter 16, Personalization, Marketing, and Finance, goes beyond bandit models for personalization and discusses a news recommendation problem, introducing dueling bandit gradient descent and action embeddings along the way. The chapter also discusses marketing and finance applications of RL and introduces the TensorTrade library for the latter.

- Chapter 17, Smart City and Cybersecurity, starts by solving a traffic light control scenario as a multi-agent RL problem using the Flow framework. It then describes how RL can be applied to two other problems: providing ancillary services to a power grid and discovering cyberattacks in it.

- Chapter 18, Challenges and Future Directions in Reinforcement Learning, wraps up the book by recapping the challenges in RL and connecting them to recent developments and research in the field. Finally, we present practical suggestions for readers who want to further deepen their RL expertise.


▶About the Author
- Enes Bilgin
Enes Bilgin works as a senior AI engineer and a tech lead in Microsoft's Autonomous Systems division. He is a machine learning and operations research practitioner and researcher with experience in building production systems and models for top tech companies using Python, TensorFlow, and Ray/RLlib. He holds an M.S. and a Ph.D. in systems engineering from Boston University and a B.S. in industrial engineering from Bilkent University. In the past, he has worked as a research scientist at Amazon and as an operations research scientist at AMD. He also held adjunct faculty positions at the McCombs School of Business at the University of Texas at Austin and at the Ingram School of Engineering at Texas State University.


Development/Programming Bestsellers

  • 핸즈온 LLM (제이 알아마르, 마르턴 흐루턴도르스트)
  • 도커로 구축한 랩에서 혼자 실습하며 배우는 네트워크 프로토콜 입문 (미야타 히로시, 이민성)
  • LLM과 RAG로 구현하는 AI 애플리케이션 (에디유, 대니얼김)
  • 나만의 MCP 서버 만들기 with 커서 AI (서지영)
  • 개정판 | 밑바닥부터 시작하는 딥러닝 1 (사이토 고키, 이복연)
  • 생성형 AI 인 액션 (아미트 바리, 이준)
  • 데이터 삽질 끝에 UX가 보였다 (이미진(란란))
  • 지식그래프 (이광배, 이채원)
  • 생성형 AI를 위한 프롬프트 엔지니어링 (제임스 피닉스, 마이크 테일러)
  • 테디노트의 랭체인을 활용한 RAG 비법노트 심화편 (이경록)
  • 지속적 배포 (발렌티나 세르빌, 이일웅)
  • LLM 인 프로덕션 (크리스토퍼 브루소, 매슈 샤프)
  • 실전! 스프링 부트 3 & 리액트로 시작하는 모던 웹 애플리케이션 개발 (주하 힌쿨라, 변영인)
  • 혼자 공부하는 네트워크 (강민철)
  • 혼자 공부하는 컴퓨터 구조+운영체제 (강민철)
  • 객체지향의 사실과 오해 (조영호)
  • 그림으로 이해하는 알고리즘 (이시다 모리테루, 미야자키 쇼이치)
  • 코드 밖 커뮤니케이션 (재퀴 리드, 곽지원)
  • LLM을 활용한 실전 AI 애플리케이션 개발 (허정준, 정진호)
  • LLM 엔지니어링 (막심 라본, 폴 이우수틴)
