

Deep Reinforcement Learning with Python Second Edition

Master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow

Purchase (own)
E-book list price
27,000 KRW
Sale price
27,000 KRW
Publication
  • E-book released 2020.09.30
Listening feature
TTS (text-to-speech) supported
File information
  • PDF
  • 761 pages
  • 27.4 MB
Supported environments
  • PC viewer
  • PAPER
ISBN
9781839215599
ECN
-

Book Information

An example-rich guide for beginners to start their reinforcement learning and deep reinforcement learning journey with state-of-the-art algorithms

▶What You Will Learn
⦁Understand core RL concepts including the methodologies, math, and code
⦁Train an agent to solve Blackjack, FrozenLake, and many other problems using OpenAI Gym
⦁Train an agent to play Ms Pac-Man using a Deep Q Network
⦁Learn policy-based, value-based, and actor-critic methods
⦁Master the math behind DDPG, TD3, TRPO, PPO, and many others
⦁Explore new avenues such as distributional RL, meta RL, and inverse RL
⦁Use Stable Baselines to train an agent to walk and play Atari games

▶Key Features
⦁Covers a vast spectrum of basic-to-advanced RL algorithms with mathematical explanations of each algorithm
⦁Learn how to implement algorithms with code by following examples with line-by-line explanations
⦁Explore the latest RL methodologies such as DDPG, PPO, and the use of expert demonstrations

▶Who This Book Is For
If you're a machine learning developer with little or no experience with neural networks, who is interested in artificial intelligence and wants to learn about reinforcement learning from scratch, this book is for you.

Basic familiarity with linear algebra, calculus, and the Python programming language is required. Some experience with TensorFlow would be a plus.

▶What this book covers
⦁ Chapter 1, Fundamentals of Reinforcement Learning, helps you build a strong foundation on RL concepts. We will learn about the key elements of RL, the Markov decision process, and several important fundamental concepts such as action spaces, policies, episodes, the value function, and the Q function. At the end of the chapter, we will learn about some of the interesting applications of RL and we will also look into the key terms and terminologies frequently used in RL.

⦁ Chapter 2, A Guide to the Gym Toolkit, provides a complete guide to OpenAI's Gym toolkit. We will understand several interesting environments provided by Gym in detail by implementing them. We will begin our hands-on RL journey from this chapter by implementing several fundamental RL concepts using Gym.
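
To make this hands-on start concrete, here is a minimal warm-up sketch in the spirit of the chapter: a random agent interacting with a Gym environment. It assumes the pre-2021 gym API (reset() returns only the observation and step() returns four values); the CartPole environment and the episode count are illustrative choices, not the book's exact code.

    import gym

    # Random agent on CartPole using the classic (pre-2021) gym API.
    env = gym.make('CartPole-v0')

    for episode in range(3):
        state = env.reset()
        total_reward, done = 0.0, False
        while not done:
            action = env.action_space.sample()            # act randomly for now
            state, reward, done, info = env.step(action)
            total_reward += reward
        print(f"Episode {episode}: return = {total_reward}")

    env.close()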

⦁ Chapter 3, The Bellman Equation and Dynamic Programming, will help us understand the Bellman equation in detail with extensive math. Next, we will learn two interesting classic RL algorithms called the value and policy iteration methods, which we can use to find the optimal policy. We will also see how to implement value and policy iteration methods for solving the Frozen Lake problem.
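
As a rough illustration of the value iteration method discussed in this chapter, the sketch below solves FrozenLake by sweeping over the environment's transition model. It assumes the pre-2021 gym API, where env.P[s][a] exposes (probability, next state, reward, done) tuples; the discount factor and convergence threshold are illustrative values.

    import gym
    import numpy as np

    env = gym.make('FrozenLake-v0')
    gamma, theta = 0.99, 1e-8
    V = np.zeros(env.observation_space.n)

    # Value iteration: back up V(s) = max_a sum_s' p(s'|s,a) * [r + gamma * V(s')]
    while True:
        delta = 0.0
        for s in range(env.observation_space.n):
            q_values = [sum(p * (r + gamma * V[s_next])
                            for p, s_next, r, _ in env.P[s][a])
                        for a in range(env.action_space.n)]
            best = max(q_values)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break

    # Extract the greedy policy from the converged value function
    policy = np.array([np.argmax([sum(p * (r + gamma * V[s_next])
                                      for p, s_next, r, _ in env.P[s][a])
                                  for a in range(env.action_space.n)])
                       for s in range(env.observation_space.n)])
    print(policy.reshape(4, 4))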

⦁ Chapter 4, Monte Carlo Methods, explains the model-free method, Monte Carlo. We will learn what prediction and control tasks are, and then we will look into Monte Carlo prediction and Monte Carlo control methods in detail. Next, we will implement the Monte Carlo method to solve the blackjack game using the Gym toolkit.
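
For a feel of how Monte Carlo prediction looks in code, here is a first-visit Monte Carlo sketch for the Blackjack environment under a simple fixed policy (stick on 19 or more). The policy, the episode count, and the pre-2021 gym API are assumptions for illustration, not the book's exact implementation.

    import gym
    from collections import defaultdict

    env = gym.make('Blackjack-v0')
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    V = defaultdict(float)

    def policy(state):
        player_sum, dealer_card, usable_ace = state
        return 0 if player_sum >= 19 else 1      # 0 = stick, 1 = hit

    for _ in range(50000):
        # Generate one episode under the fixed policy
        trajectory, done = [], False
        state = env.reset()
        while not done:
            action = policy(state)
            next_state, reward, done, _ = env.step(action)
            trajectory.append((state, reward))
            state = next_state

        # First-visit update with gamma = 1: walk backwards and keep the
        # return attached to the earliest visit of each state
        G, first_visit_return = 0.0, {}
        for state, reward in reversed(trajectory):
            G += reward
            first_visit_return[state] = G
        for state, G in first_visit_return.items():
            returns_sum[state] += G
            returns_count[state] += 1
            V[state] = returns_sum[state] / returns_count[state]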

⦁ Chapter 5, Understanding Temporal Difference Learning, deals with one of the most popular and widely used model-free methods called Temporal Difference (TD) learning. First, we will learn how the TD prediction method works in detail, and then we will explore the on-policy TD control method called SARSA and the off-policy TD control method called Q learning in detail. We will also implement TD control methods to solve the Frozen Lake problem using Gym.
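
The following is a compact tabular Q-learning sketch for FrozenLake in the spirit of this chapter. The learning rate, discount factor, epsilon, and episode count are illustrative hyperparameters, and the pre-2021 gym API is assumed.

    import gym
    import numpy as np

    env = gym.make('FrozenLake-v0')
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    alpha, gamma, epsilon = 0.1, 0.99, 0.1

    for episode in range(20000):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy behavior policy
            if np.random.rand() < epsilon:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, done, _ = env.step(action)
            # Off-policy TD target bootstraps from the best next action
            Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state])
                                         - Q[state, action])
            state = next_state

    print("Greedy policy:", np.argmax(Q, axis=1).reshape(4, 4))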

⦁ Chapter 6, Case Study – The MAB Problem, explains one of the classic problems in RL called the multi-armed bandit (MAB) problem. We will start the chapter by understanding what the MAB problem is and then we will learn about several exploration strategies such as epsilon-greedy, softmax exploration, upper confidence bound, and Thompson sampling methods for solving the MAB problem in detail.
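
As a small illustration of the epsilon-greedy strategy covered here, the sketch below runs it on a made-up five-armed Bernoulli bandit. The arm probabilities, epsilon, and step count are invented for the example.

    import numpy as np

    rng = np.random.default_rng(0)
    true_probs = np.array([0.1, 0.3, 0.5, 0.7, 0.2])   # hidden win rates (assumed)
    Q = np.zeros(5)          # estimated value of each arm
    counts = np.zeros(5)
    epsilon = 0.1

    for step in range(10000):
        if rng.random() < epsilon:
            arm = int(rng.integers(5))         # explore a random arm
        else:
            arm = int(np.argmax(Q))            # exploit the best-looking arm
        reward = float(rng.random() < true_probs[arm])
        counts[arm] += 1
        Q[arm] += (reward - Q[arm]) / counts[arm]   # incremental mean update

    print("Estimated arm values:", np.round(Q, 2))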

⦁ Chapter 7, Deep Learning Foundations, helps us to build a strong foundation on deep learning. We will start the chapter by understanding how artificial neural networks work. Then we will learn several interesting deep learning algorithms, such as recurrent neural networks, LSTM networks, convolutional neural networks, and generative adversarial networks.

⦁ Chapter 8, A Primer on TensorFlow, deals with one of the most popular deep learning libraries called TensorFlow. We will understand how to use TensorFlow by implementing a neural network to recognize handwritten digits. Next, we will learn to perform several math operations using TensorFlow. Later, we will learn about TensorFlow 2.0 and see how it differs from the previous TensorFlow versions.
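
A couple of the basic operations this chapter works through look like the following, shown with TensorFlow 2.x eager execution; the example tensors are arbitrary.

    import tensorflow as tf

    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]])

    print(tf.add(a, b))        # element-wise addition
    print(tf.matmul(a, b))     # matrix multiplication
    print(tf.reduce_mean(a))   # mean of all elements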

⦁ Chapter 9, Deep Q Network and Its Variants, enables us to kick-start our deep RL journey. We will learn about one of the most popular deep RL algorithms called the Deep Q Network (DQN). We will understand how DQN works step by step along with the extensive math. We will also implement a DQN to play Atari games. Next, we will explore several interesting variants of DQN, called Double DQN, Dueling DQN, DQN with prioritized experience replay, and DRQN.
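
To give a sense of the core DQN update, here is a minimal sketch of the target computation and training step, not the book's full Atari agent. The small dense networks, the 4-dimensional state, the two actions, and the use of tf.keras are illustrative assumptions.

    import numpy as np
    import tensorflow as tf

    def build_q_network(state_dim=4, n_actions=2):
        return tf.keras.Sequential([
            tf.keras.layers.Dense(32, activation='relu', input_shape=(state_dim,)),
            tf.keras.layers.Dense(n_actions),
        ])

    q_net = build_q_network()
    target_net = build_q_network()
    target_net.set_weights(q_net.get_weights())
    q_net.compile(optimizer='adam', loss='mse')

    def dqn_update(states, actions, rewards, next_states, dones, gamma=0.99):
        # Bootstrap from the frozen target network: r + gamma * max_a' Q_target(s', a')
        next_q = target_net.predict(next_states, verbose=0)
        targets = rewards + gamma * next_q.max(axis=1) * (1.0 - dones)
        # Only the taken action's Q-value is pushed toward the target
        q_values = q_net.predict(states, verbose=0)
        q_values[np.arange(len(actions)), actions] = targets
        q_net.fit(states, q_values, verbose=0)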

⦁ Chapter 10, Policy Gradient Method, covers policy gradient methods. We will understand how the policy gradient method works along with the detailed derivation. Next, we will learn several variance reduction methods such as policy gradient with reward-to-go and policy gradient with baseline. We will also understand how to train an agent for the Cart Pole balancing task using policy gradient.
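
The reward-to-go trick mentioned above is easy to show in isolation: each log-probability is weighted only by the return that follows it, which reduces the variance of the REINFORCE gradient estimate. A small sketch with an arbitrary discount factor and reward sequence:

    import numpy as np

    def rewards_to_go(rewards, gamma=0.99):
        """Discounted reward-to-go: G_t = sum over k >= t of gamma**(k - t) * r_k."""
        returns = np.zeros(len(rewards))
        running = 0.0
        for t in reversed(range(len(rewards))):
            running = rewards[t] + gamma * running
            returns[t] = running
        return returns

    print(rewards_to_go([1.0, 1.0, 1.0], gamma=0.9))   # [2.71, 1.9, 1.0]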

⦁ Chapter 11, Actor-Critic Methods – A2C and A3C, deals with several interesting actor-critic methods such as advantage actor-critic and asynchronous advantage actor-critic. We will learn how these actor-critic methods work in detail, and then we will implement them for a mountain car climbing task using OpenAI Gym.

⦁ Chapter 12, Learning DDPG, TD3, and SAC, covers state-of-the-art deep RL algorithms such as deep deterministic policy gradient, twin delayed DDPG, and soft actor-critic, along with their step-by-step derivations. We will also learn how to implement the DDPG algorithm for performing the inverted pendulum swing-up task using Gym.
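
One small but central piece shared by DDPG, TD3, and SAC is the soft (Polyak) target network update. Here is a sketch with plain NumPy arrays standing in for network weights; the value of tau is an illustrative choice.

    import numpy as np

    def soft_update(online_weights, target_weights, tau=0.005):
        # Slowly track the online network: target <- tau * online + (1 - tau) * target
        return [tau * w + (1.0 - tau) * w_t
                for w, w_t in zip(online_weights, target_weights)]

    online = [np.ones((2, 2))]
    target = [np.zeros((2, 2))]
    target = soft_update(online, target)   # target creeps toward the online weights
    print(target[0])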

⦁ Chapter 13, TRPO, PPO, and ACKTR Methods, deals with several popular policy gradient methods such as TRPO and PPO. We will dive into the math behind TRPO and PPO step by step and understand how TRPO and PPO help an agent find the optimal policy. Next, we will learn to implement PPO for performing the inverted pendulum swing-up task. At the end, we will learn in detail about the actor-critic method called actor-critic using Kronecker-factored trust region (ACKTR).
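
The heart of PPO-Clip can be sketched in a few lines: the probability ratio between the new and old policies is clipped so that a single update cannot move the policy too far. The numbers below are made up purely to exercise the function.

    import numpy as np

    def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
        """Per-sample PPO-Clip surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A)."""
        ratio = np.exp(logp_new - logp_old)
        clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
        return np.minimum(ratio * advantages, clipped * advantages).mean()

    # A large ratio on a positive advantage gets clipped at 1 + eps
    print(ppo_clipped_objective(np.log([1.5, 0.9]), np.log([1.0, 1.0]),
                                np.array([2.0, -1.0])))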

⦁ Chapter 14, Distributional Reinforcement Learning, covers distributional RL algorithms. We will begin the chapter by understanding what distributional RL is. Then we will explore several interesting distributional RL algorithms such as categorical DQN, quantile regression DQN, and distributed distributional DDPG.

⦁ Chapter 15, Imitation Learning and Inverse RL, explains imitation and inverse RL algorithms. First, we will understand how supervised imitation learning, DAgger, and deep Q learning from demonstrations work in detail. Next, we will learn about maximum entropy inverse RL. At the end of the chapter, we will learn about generative adversarial imitation learning.
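
Supervised imitation learning can be shown in miniature as behavioral cloning: fit a policy network to expert (state, action) pairs with ordinary supervised learning. The random "demonstrations", the 4-dimensional states, the two discrete actions, and the tf.keras network are illustrative stand-ins, not the book's data or code.

    import numpy as np
    import tensorflow as tf

    # Hypothetical expert demonstrations
    demo_states = np.random.randn(1000, 4).astype(np.float32)
    demo_actions = np.random.randint(0, 2, size=1000)

    # Behavioral cloning reduces imitation to supervised classification
    policy = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu', input_shape=(4,)),
        tf.keras.layers.Dense(2, activation='softmax'),
    ])
    policy.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    policy.fit(demo_states, demo_actions, epochs=5, verbose=0)

    # The cloned policy now predicts an action distribution for a new state
    print(policy.predict(demo_states[:1], verbose=0))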

⦁ Chapter 16, Deep Reinforcement Learning with Stable Baselines, helps us to understand how to implement deep RL algorithms using a library called Stable Baselines. We will learn what Stable Baselines is and how to use it in detail by implementing several interesting deep RL algorithms such as DQN, A2C, DDPG, TRPO, and PPO.
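
Training an agent with Stable Baselines typically takes only a few lines. The sketch below assumes the TensorFlow 1.x-based stable_baselines package this edition targets (not stable-baselines3), with CartPole and the timestep budget chosen arbitrarily.

    import gym
    from stable_baselines import DQN

    env = gym.make('CartPole-v0')
    model = DQN('MlpPolicy', env, verbose=1)
    model.learn(total_timesteps=10000)       # train the agent

    # Run one episode with the trained policy
    obs = env.reset()
    done = False
    while not done:
        action, _states = model.predict(obs)
        obs, reward, done, info = env.step(action)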

⦁ Chapter 17, Reinforcement Learning Frontiers, covers several interesting avenues in RL, such as meta RL, hierarchical RL, and imagination augmented agents in detail.

Author Information

▶About the Author
- Sudharsan Ravichandiran
Sudharsan Ravichandiran is a data scientist, researcher, best-selling author, and YouTuber (search for "Sudharsan reinforcement learning"). He completed his Bachelor's in Information Technology at Anna University. His research focuses on practical implementations of deep learning and reinforcement learning, including natural language processing and computer vision. He is an open-source contributor and loves answering questions on Stack Overflow. He also authored the best-seller Hands-On Reinforcement Learning with Python, published by Packt Publishing.


