

Deep Reinforcement Learning with Python Second Edition

Master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow

Purchase (own)
E-book list price
27,000 KRW
Sale price
27,000 KRW
Publication
  • E-book released 2020.09.30
Listening feature
TTS (text-to-speech) supported
File information
  • PDF
  • 761 pages
  • 27.4 MB
Supported environments
  • PC viewer
  • PAPER
ISBN
9781839215599
ECN
-

Book Information

An example-rich guide for beginners to start their reinforcement learning and deep reinforcement learning journey with state-of-the-art algorithms

▶What You Will Learn
⦁Understand core RL concepts including the methodologies, math, and code
⦁Train an agent to solve Blackjack, FrozenLake, and many other problems using OpenAI Gym
⦁Train an agent to play Ms Pac-Man using a Deep Q Network
⦁Learn policy-based, value-based, and actor-critic methods
⦁Master the math behind DDPG, TD3, TRPO, PPO, and many others
⦁Explore new avenues such as distributional RL, meta RL, and inverse RL
⦁Use Stable Baselines to train an agent to walk and play Atari games

▶Key Features
⦁Covers a vast spectrum of basic-to-advanced RL algorithms with mathematical explanations of each algorithm
⦁Learn how to implement algorithms with code by following examples with line-by-line explanations
⦁Explore the latest RL methodologies such as DDPG, PPO, and the use of expert demonstrations

▶Who This Book Is For
If you're a machine learning developer with little or no experience with neural networks, who is interested in artificial intelligence and wants to learn about reinforcement learning from scratch, this book is for you.

Basic familiarity with linear algebra, calculus, and the Python programming language is required. Some experience with TensorFlow would be a plus.

▶What this book covers
⦁ Chapter 1, Fundamentals of Reinforcement Learning, helps you build a strong foundation on RL concepts. We will learn about the key elements of RL, the Markov decision process, and several important fundamental concepts such as action spaces, policies, episodes, the value function, and the Q function. At the end of the chapter, we will learn about some of the interesting applications of RL and we will also look into the key terms and terminologies frequently used in RL.

⦁ Chapter 2, A Guide to the Gym Toolkit, provides a complete guide to OpenAI's Gym toolkit. We will understand several interesting environments provided by Gym in detail by implementing them. We will begin our hands-on RL journey from this chapter by implementing several fundamental RL concepts using Gym.
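
To make this hands-on start concrete, here is a minimal warm-up sketch in the spirit of the chapter: a random agent interacting with a Gym environment. It assumes the pre-2021 gym API (reset() returns only the observation and step() returns four values); the CartPole environment and the episode count are illustrative choices, not the book's exact code.

    import gym

    # Random agent on CartPole using the classic (pre-2021) gym API.
    env = gym.make('CartPole-v0')

    for episode in range(3):
        state = env.reset()
        total_reward, done = 0.0, False
        while not done:
            action = env.action_space.sample()            # act randomly for now
            state, reward, done, info = env.step(action)
            total_reward += reward
        print(f"Episode {episode}: return = {total_reward}")

    env.close()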

⦁ Chapter 3, The Bellman Equation and Dynamic Programming, will help us understand the Bellman equation in detail with extensive math. Next, we will learn two interesting classic RL algorithms called the value and policy iteration methods, which we can use to find the optimal policy. We will also see how to implement value and policy iteration methods for solving the Frozen Lake problem.
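
As a rough illustration of the value iteration method discussed in this chapter, the sketch below solves FrozenLake by sweeping over the environment's transition model. It assumes the pre-2021 gym API, where env.P[s][a] exposes (probability, next state, reward, done) tuples; the discount factor and convergence threshold are illustrative values.

    import gym
    import numpy as np

    env = gym.make('FrozenLake-v0')
    gamma, theta = 0.99, 1e-8
    V = np.zeros(env.observation_space.n)

    # Value iteration: back up V(s) = max_a sum_s' p(s'|s,a) * [r + gamma * V(s')]
    while True:
        delta = 0.0
        for s in range(env.observation_space.n):
            q_values = [sum(p * (r + gamma * V[s_next])
                            for p, s_next, r, _ in env.P[s][a])
                        for a in range(env.action_space.n)]
            best = max(q_values)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break

    # Extract the greedy policy from the converged value function
    policy = np.array([np.argmax([sum(p * (r + gamma * V[s_next])
                                      for p, s_next, r, _ in env.P[s][a])
                                  for a in range(env.action_space.n)])
                       for s in range(env.observation_space.n)])
    print(policy.reshape(4, 4))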

⦁ Chapter 4, Monte Carlo Methods, explains the model-free method, Monte Carlo. We will learn what prediction and control tasks are, and then we will look into Monte Carlo prediction and Monte Carlo control methods in detail. Next, we will implement the Monte Carlo method to solve the blackjack game using the Gym toolkit.
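
For a feel of how Monte Carlo prediction looks in code, here is a first-visit Monte Carlo sketch for the Blackjack environment under a simple fixed policy (stick on 19 or more). The policy, the episode count, and the pre-2021 gym API are assumptions for illustration, not the book's exact implementation.

    import gym
    from collections import defaultdict

    env = gym.make('Blackjack-v0')
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    V = defaultdict(float)

    def policy(state):
        player_sum, dealer_card, usable_ace = state
        return 0 if player_sum >= 19 else 1      # 0 = stick, 1 = hit

    for _ in range(50000):
        # Generate one episode under the fixed policy
        trajectory, done = [], False
        state = env.reset()
        while not done:
            action = policy(state)
            next_state, reward, done, _ = env.step(action)
            trajectory.append((state, reward))
            state = next_state

        # First-visit update with gamma = 1: walk backwards and keep the
        # return attached to the earliest visit of each state
        G, first_visit_return = 0.0, {}
        for state, reward in reversed(trajectory):
            G += reward
            first_visit_return[state] = G
        for state, G in first_visit_return.items():
            returns_sum[state] += G
            returns_count[state] += 1
            V[state] = returns_sum[state] / returns_count[state]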

⦁ Chapter 5, Understanding Temporal Difference Learning, deals with one of the most popular and widely used model-free methods called Temporal Difference (TD) learning. First, we will learn how the TD prediction method works in detail, and then we will explore the on-policy TD control method called SARSA and the off-policy TD control method called Q learning in detail. We will also implement TD control methods to solve the Frozen Lake problem using Gym.
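
The following is a compact tabular Q-learning sketch for FrozenLake in the spirit of this chapter. The learning rate, discount factor, epsilon, and episode count are illustrative hyperparameters, and the pre-2021 gym API is assumed.

    import gym
    import numpy as np

    env = gym.make('FrozenLake-v0')
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    alpha, gamma, epsilon = 0.1, 0.99, 0.1

    for episode in range(20000):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy behavior policy
            if np.random.rand() < epsilon:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, done, _ = env.step(action)
            # Off-policy TD target bootstraps from the best next action
            Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state])
                                         - Q[state, action])
            state = next_state

    print("Greedy policy:", np.argmax(Q, axis=1).reshape(4, 4))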

⦁ Chapter 6, Case Study – The MAB Problem, explains one of the classic problems in RL called the multi-armed bandit (MAB) problem. We will start the chapter by understanding what the MAB problem is and then we will learn about several exploration strategies such as epsilon-greedy, softmax exploration, upper confidence bound, and Thompson sampling methods for solving the MAB problem in detail.
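
As a small illustration of the epsilon-greedy strategy covered here, the sketch below runs it on a made-up five-armed Bernoulli bandit. The arm probabilities, epsilon, and step count are invented for the example.

    import numpy as np

    rng = np.random.default_rng(0)
    true_probs = np.array([0.1, 0.3, 0.5, 0.7, 0.2])   # hidden win rates (assumed)
    Q = np.zeros(5)          # estimated value of each arm
    counts = np.zeros(5)
    epsilon = 0.1

    for step in range(10000):
        if rng.random() < epsilon:
            arm = int(rng.integers(5))         # explore a random arm
        else:
            arm = int(np.argmax(Q))            # exploit the best-looking arm
        reward = float(rng.random() < true_probs[arm])
        counts[arm] += 1
        Q[arm] += (reward - Q[arm]) / counts[arm]   # incremental mean update

    print("Estimated arm values:", np.round(Q, 2))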

⦁ Chapter 7, Deep Learning Foundations, helps us to build a strong foundation on deep learning. We will start the chapter by understanding how artificial neural networks work. Then we will learn several interesting deep learning algorithms, such as recurrent neural networks, LSTM networks, convolutional neural networks, and generative adversarial networks.

⦁ Chapter 8, A Primer on TensorFlow, deals with one of the most popular deep learning libraries called TensorFlow. We will understand how to use TensorFlow by implementing a neural network to recognize handwritten digits. Next, we will learn to perform several math operations using TensorFlow. Later, we will learn about TensorFlow 2.0 and see how it differs from the previous TensorFlow versions.
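
A couple of the basic operations this chapter works through look like the following, shown with TensorFlow 2.x eager execution; the example tensors are arbitrary.

    import tensorflow as tf

    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]])

    print(tf.add(a, b))        # element-wise addition
    print(tf.matmul(a, b))     # matrix multiplication
    print(tf.reduce_mean(a))   # mean of all elements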

⦁ Chapter 9, Deep Q Network and Its Variants, enables us to kick-start our deep RL journey. We will learn about one of the most popular deep RL algorithms called the Deep Q Network (DQN). We will understand how DQN works step by step along with the extensive math. We will also implement a DQN to play Atari games. Next, we will explore several interesting variants of DQN, called Double DQN, Dueling DQN, DQN with prioritized experience replay, and DRQN.
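
To give a sense of the core DQN update, here is a minimal sketch of the target computation and training step, not the book's full Atari agent. The small dense networks, the 4-dimensional state, the two actions, and the use of tf.keras are illustrative assumptions.

    import numpy as np
    import tensorflow as tf

    def build_q_network(state_dim=4, n_actions=2):
        return tf.keras.Sequential([
            tf.keras.layers.Dense(32, activation='relu', input_shape=(state_dim,)),
            tf.keras.layers.Dense(n_actions),
        ])

    q_net = build_q_network()
    target_net = build_q_network()
    target_net.set_weights(q_net.get_weights())
    q_net.compile(optimizer='adam', loss='mse')

    def dqn_update(states, actions, rewards, next_states, dones, gamma=0.99):
        # Bootstrap from the frozen target network: r + gamma * max_a' Q_target(s', a')
        next_q = target_net.predict(next_states, verbose=0)
        targets = rewards + gamma * next_q.max(axis=1) * (1.0 - dones)
        # Only the taken action's Q-value is pushed toward the target
        q_values = q_net.predict(states, verbose=0)
        q_values[np.arange(len(actions)), actions] = targets
        q_net.fit(states, q_values, verbose=0)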

⦁ Chapter 10, Policy Gradient Method, covers policy gradient methods. We will understand how the policy gradient method works along with the detailed derivation. Next, we will learn several variance reduction methods such as policy gradient with reward-to-go and policy gradient with baseline. We will also understand how to train an agent for the Cart Pole balancing task using policy gradient.
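
The reward-to-go trick mentioned above is easy to show in isolation: each log-probability is weighted only by the return that follows it, which reduces the variance of the REINFORCE gradient estimate. A small sketch with an arbitrary discount factor and reward sequence:

    import numpy as np

    def rewards_to_go(rewards, gamma=0.99):
        """Discounted reward-to-go: G_t = sum over k >= t of gamma**(k - t) * r_k."""
        returns = np.zeros(len(rewards))
        running = 0.0
        for t in reversed(range(len(rewards))):
            running = rewards[t] + gamma * running
            returns[t] = running
        return returns

    print(rewards_to_go([1.0, 1.0, 1.0], gamma=0.9))   # [2.71, 1.9, 1.0]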

⦁ Chapter 11, Actor-Critic Methods – A2C and A3C, deals with several interesting actor-critic methods such as advantage actor-critic and asynchronous advantage actor-critic. We will learn how these actor-critic methods work in detail, and then we will implement them for a mountain car climbing task using OpenAI Gym.

⦁ Chapter 12, Learning DDPG, TD3, and SAC, covers state-of-the-art deep RL algorithms such as deep deterministic policy gradient, twin delayed DDPG, and soft actor-critic, along with their step-by-step derivations. We will also learn how to implement the DDPG algorithm for performing the inverted pendulum swing-up task using Gym.
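
One small but central piece shared by DDPG, TD3, and SAC is the soft (Polyak) target network update. Here is a sketch with plain NumPy arrays standing in for network weights; the value of tau is an illustrative choice.

    import numpy as np

    def soft_update(online_weights, target_weights, tau=0.005):
        # Slowly track the online network: target <- tau * online + (1 - tau) * target
        return [tau * w + (1.0 - tau) * w_t
                for w, w_t in zip(online_weights, target_weights)]

    online = [np.ones((2, 2))]
    target = [np.zeros((2, 2))]
    target = soft_update(online, target)   # target creeps toward the online weights
    print(target[0])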

⦁ Chapter 13, TRPO, PPO, and ACKTR Methods, deals with several popular policy gradient methods such as TRPO and PPO. We will dive into the math behind TRPO and PPO step by step and understand how TRPO and PPO help an agent find the optimal policy. Next, we will learn to implement PPO for performing the inverted pendulum swing-up task. At the end, we will learn in detail about the actor-critic method called actor-critic using Kronecker-factored trust region (ACKTR).
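
The heart of PPO-Clip can be sketched in a few lines: the probability ratio between the new and old policies is clipped so that a single update cannot move the policy too far. The numbers below are made up purely to exercise the function.

    import numpy as np

    def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
        """Per-sample PPO-Clip surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A)."""
        ratio = np.exp(logp_new - logp_old)
        clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
        return np.minimum(ratio * advantages, clipped * advantages).mean()

    # A large ratio on a positive advantage gets clipped at 1 + eps
    print(ppo_clipped_objective(np.log([1.5, 0.9]), np.log([1.0, 1.0]),
                                np.array([2.0, -1.0])))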

⦁ Chapter 14, Distributional Reinforcement Learning, covers distributional RL algorithms. We will begin the chapter by understanding what distributional RL is. Then we will explore several interesting distributional RL algorithms such as categorical DQN, quantile regression DQN, and distributed distributional DDPG.

⦁ Chapter 15, Imitation Learning and Inverse RL, explains imitation and inverse RL algorithms. First, we will understand how supervised imitation learning, DAgger, and deep Q learning from demonstrations work in detail. Next, we will learn about maximum entropy inverse RL. At the end of the chapter, we will learn about generative adversarial imitation learning.
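
Supervised imitation learning can be shown in miniature as behavioral cloning: fit a policy network to expert (state, action) pairs with ordinary supervised learning. The random "demonstrations", the 4-dimensional states, the two discrete actions, and the tf.keras network are illustrative stand-ins, not the book's data or code.

    import numpy as np
    import tensorflow as tf

    # Hypothetical expert demonstrations
    demo_states = np.random.randn(1000, 4).astype(np.float32)
    demo_actions = np.random.randint(0, 2, size=1000)

    # Behavioral cloning reduces imitation to supervised classification
    policy = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu', input_shape=(4,)),
        tf.keras.layers.Dense(2, activation='softmax'),
    ])
    policy.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    policy.fit(demo_states, demo_actions, epochs=5, verbose=0)

    # The cloned policy now predicts an action distribution for a new state
    print(policy.predict(demo_states[:1], verbose=0))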

⦁ Chapter 16, Deep Reinforcement Learning with Stable Baselines, helps us to understand how to implement deep RL algorithms using a library called Stable Baselines. We will learn what Stable Baselines is and how to use it in detail by implementing several interesting deep RL algorithms such as DQN, A2C, DDPG, TRPO, and PPO.
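
Training an agent with Stable Baselines typically takes only a few lines. The sketch below assumes the TensorFlow 1.x-based stable_baselines package this edition targets (not stable-baselines3), with CartPole and the timestep budget chosen arbitrarily.

    import gym
    from stable_baselines import DQN

    env = gym.make('CartPole-v0')
    model = DQN('MlpPolicy', env, verbose=1)
    model.learn(total_timesteps=10000)       # train the agent

    # Run one episode with the trained policy
    obs = env.reset()
    done = False
    while not done:
        action, _states = model.predict(obs)
        obs, reward, done, info = env.step(action)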

⦁ Chapter 17, Reinforcement Learning Frontiers, covers several interesting avenues in RL, such as meta RL, hierarchical RL, and imagination augmented agents in detail.

Author Information

▶About the Author
- Sudharsan Ravichandiran
Sudharsan Ravichandiran is a data scientist, researcher, best-selling author, and YouTuber (search for "Sudharsan reinforcement learning"). He completed his Bachelor's in Information Technology at Anna University. His research focuses on practical implementations of deep learning and reinforcement learning, including natural language processing and computer vision. He is an open-source contributor and loves answering questions on Stack Overflow. He also authored the best-seller Hands-On Reinforcement Learning with Python, published by Packt Publishing.


