
Mastering Reinforcement Learning with Python

Build next-generation, self-learning models using reinforcement learning techniques and best practices

Ebook list price: 29,000 KRW
Sale price: 29,000 KRW
Publication: ebook released 2020.12.18
Listening: TTS (text-to-speech) supported
File details: PDF, 544 pages, 15.3 MB
Supported devices: PC viewer, PAPER
ISBN: 9781838648497
ECN: -

About This Book

Get hands-on experience in creating state-of-the-art reinforcement learning agents using TensorFlow and RLlib to solve complex real-world business and industry problems with the help of expert tips and best practices

▶What You Will Learn
- Model and solve complex sequential decision-making problems using RL
- Develop a solid understanding of how state-of-the-art RL methods work
- Use Python and TensorFlow to code RL algorithms from scratch
- Parallelize and scale up your RL implementations using Ray's RLlib package
- Get in-depth knowledge of a wide variety of RL topics
- Understand the trade-offs between different RL approaches
- Discover and address the challenges of implementing RL in the real world

▶Key Features
- Understand how large-scale state-of-the-art RL algorithms and approaches work
- Apply RL to solve complex problems in marketing, robotics, supply chain, finance, cybersecurity, and more
- Explore tips and best practices from experts that will enable you to overcome real-world RL challenges

▶Who This Book Is For
This book is for expert machine learning practitioners and researchers looking to focus on hands-on reinforcement learning with Python by implementing advanced deep reinforcement learning concepts in real-world projects. Reinforcement learning experts who want to advance their knowledge to tackle large-scale and complex sequential decision-making problems will also find this book useful. Working knowledge of Python programming and deep learning along with prior experience in reinforcement learning is required.

▶What this book covers
- Chapter 1, Introduction to Reinforcement Learning, provides an introduction to RL, presents motivating examples and success stories, and looks at RL applications in industry. It then gives some fundamental definitions to refresh your mind on RL concepts and concludes with a section on software and hardware setup.

- Chapter 2, Multi-Armed Bandits, covers a rather simple RL setting, bandit problems without context, which nonetheless has tremendous applications in industry as an alternative to traditional A/B testing. The chapter also describes a fundamental RL trade-off: exploration versus exploitation. It then presents three approaches to tackling this trade-off and compares them against A/B testing.
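
As a flavor of what the chapter covers, here is a minimal sketch of epsilon-greedy action selection, one common way to balance exploration and exploitation; the ad variants and click probabilities below are made up for illustration and are not from the book.

import numpy as np

# Epsilon-greedy multi-armed bandit sketch. The three "ad" variants
# and their click probabilities are hypothetical, for illustration.
true_click_probs = [0.03, 0.05, 0.04]   # unknown to the agent
n_arms = len(true_click_probs)
counts = np.zeros(n_arms)               # pulls per arm
values = np.zeros(n_arms)               # estimated mean reward per arm
epsilon = 0.1                           # exploration rate

rng = np.random.default_rng(0)
for t in range(10_000):
    if rng.random() < epsilon:
        arm = int(rng.integers(n_arms))     # explore: random arm
    else:
        arm = int(np.argmax(values))        # exploit: best arm so far
    reward = float(rng.random() < true_click_probs[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # running mean

print("Estimated click rates:", values.round(4))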

- Chapter 3, Contextual Bandits, takes the discussion on multi-armed bandits to an advanced level by adding context to the decision-making process and involving deep neural networks in decision making. We adapt a real dataset from the U.S. Census to an online advertising problem. We conclude the chapter with a section on the applications of bandit problems in industry and business.
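
To illustrate the idea of conditioning decisions on context, here is a toy sketch of a linear contextual bandit with epsilon-greedy exploration; the chapter itself uses deep neural networks and real Census data, whereas the features and click model below are entirely synthetic.

import numpy as np

# Toy linear contextual bandit: one weight vector per ad, updated by
# SGD on observed click feedback. Context and clicks are synthetic.
rng = np.random.default_rng(1)
n_ads, d, epsilon, lr = 3, 4, 0.1, 0.05
W = np.zeros((n_ads, d))                    # per-ad reward model

for t in range(5_000):
    x = rng.normal(size=d)                  # user context (synthetic)
    if rng.random() < epsilon:
        a = int(rng.integers(n_ads))        # explore
    else:
        a = int(np.argmax(W @ x))           # exploit current estimates
    # synthetic click signal: ad a "matches" context feature a
    click = float(rng.random() < 1.0 / (1.0 + np.exp(-x[a])))
    W[a] += lr * (click - W[a] @ x) * x     # SGD on squared error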

- Chapter 4, Makings of a Markov Decision Process, builds the mathematical theory behind sequential decision processes that are solved using RL. We start with Markov chains, where we describe types of states, ergodicity, and transient and steady-state behavior. Then we go into Markov reward and decision processes. Along the way, we introduce return, discount, policy, value functions, and Bellman optimality, key concepts in RL theory that will be referred to frequently in later chapters. We conclude the chapter with a discussion on partially observable Markov decision processes. Throughout the chapter, we use a grid world example to illustrate the concepts.
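
For reference, the Bellman optimality equation for the state-value function, a centerpiece of this chapter, reads as follows in standard notation, with p denoting the transition dynamics and \gamma the discount factor:

V^*(s) = \max_{a} \sum_{s', r} p(s', r \mid s, a)\,\bigl[ r + \gamma V^*(s') \bigr]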

- Chapter 5, Solving the Reinforcement Learning Problem, presents and compares dynamic programming, Monte Carlo, and temporal-difference methods, which are fundamental to understanding how to solve a Markov decision process. Key approaches such as policy evaluation, policy iteration, and value iteration are introduced and illustrated. Throughout the chapter, we solve an example inventory replenishment problem, and along the way we motivate the need for deep RL methods. We conclude the chapter with a discussion on the importance of simulation in reinforcement learning.
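
To make the idea concrete, the following toy sketch runs value iteration on a hypothetical four-state chain MDP (not the book's inventory example), sweeping Bellman backups until the value estimates converge.

import numpy as np

# Toy value iteration on a hypothetical four-state chain MDP:
# actions 0 = left, 1 = right, deterministic moves, reward 1 for
# entering the rightmost (terminal) state.
n_states, gamma, theta = 4, 0.9, 1e-6

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

V = np.zeros(n_states)
while True:
    delta = 0.0
    for s in range(n_states - 1):               # last state is terminal
        q_values = []
        for a in (0, 1):
            s_next, r = step(s, a)
            q_values.append(r + gamma * V[s_next])   # Bellman backup
        v_new = max(q_values)
        delta = max(delta, abs(v_new - V[s]))
        V[s] = v_new
    if delta < theta:                            # converged
        break

print("Optimal state values:", V.round(3))      # -> [0.81, 0.9, 1.0, 0.0]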

- Chapter 6, Deep Q-Learning at Scale, starts with a discussion on why it is challenging to use deep neural networks in reinforcement learning and how modern deep Q-learning addresses those challenges. After a thorough coverage of scalable deep Q-learning methods, we introduce Ray, a distributed computing framework, with which we implement a parallelized deep Q-learning variant. We finish the chapter by introducing RLlib, Ray's own scalable RL library.
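
As a taste of what training with RLlib looks like, here is a minimal sketch using the Trainer API from around the book's 2020 release (newer Ray versions have since moved to AlgorithmConfig builders); CartPole is a stand-in environment, not the book's example.

import ray
from ray.rllib.agents.dqn import DQNTrainer   # pre-2022 RLlib API

ray.init()
trainer = DQNTrainer(
    env="CartPole-v0",              # stand-in environment
    config={
        "num_workers": 4,           # parallel rollout workers via Ray
        "framework": "tf",
    },
)
for i in range(10):
    result = trainer.train()        # one training iteration
    print(i, result["episode_reward_mean"])
ray.shutdown()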

- Chapter 7, Policy-Based Methods, introduces another important class of RL approaches: policy-based methods. You will first learn how they differ from Q-learning and why they are needed. As we build the theory behind contemporary policy-based methods, we also show how you can use RLlib to apply them to a sample problem.
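
The core policy-gradient idea can be sketched in a few lines of TensorFlow 2; the REINFORCE-style update below, with a placeholder two-layer network and a random fake batch, is illustrative only and is not the book's exact code.

import numpy as np
import tensorflow as tf

# REINFORCE-style policy-gradient update sketch. Real states, actions,
# and returns would come from rolled-out episodes.
policy = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(2),                   # logits over 2 actions
])
optimizer = tf.keras.optimizers.Adam(1e-3)

def reinforce_step(states, actions, returns):
    with tf.GradientTape() as tape:
        logits = policy(states)
        neg_log_probs = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=actions, logits=logits)      # = -log pi(a|s)
        loss = tf.reduce_mean(neg_log_probs * returns)  # maximize E[G log pi]
    grads = tape.gradient(loss, policy.trainable_variables)
    optimizer.apply_gradients(zip(grads, policy.trainable_variables))

# one update on a random fake batch, just to show the call
states = np.random.randn(16, 4).astype("float32")
actions = np.random.randint(0, 2, size=16)
returns = np.random.randn(16).astype("float32")
reinforce_step(states, actions, returns)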

- Chapter 8, Model-Based Methods, presents how learning a model of the environment can help an RL agent to plan its actions efficiently. In the chapter, we implement and use variants of cross-entropy methods and present Dyna, an RL framework that combines model-free and model-based approaches.
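
To give a feel for the cross-entropy method, the sketch below optimizes a stand-in quadratic objective (think of it as episode return under a parameterized policy): sample candidates from a Gaussian, keep the elites, and refit.

import numpy as np

# Cross-entropy method (CEM) sketch: sample candidate solutions from a
# Gaussian, keep the top "elite" fraction, refit the Gaussian, repeat.
# The quadratic objective is a stand-in, for illustration only.
def objective(x):
    return -np.sum((x - 3.0) ** 2, axis=1)      # maximized at x = 3

rng = np.random.default_rng(0)
dim, pop_size, n_elite = 5, 100, 10
mean, std = np.zeros(dim), np.full(dim, 5.0)

for it in range(30):
    samples = rng.normal(mean, std, size=(pop_size, dim))
    elite_idx = np.argsort(objective(samples))[-n_elite:]
    elites = samples[elite_idx]
    mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3

print("CEM solution:", mean.round(2))           # close to [3, 3, 3, 3, 3]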

- Chapter 9, Multi-Agent Reinforcement Learning, shifts gears, goes into multi-agent settings, and presents the challenges that come with them. In the chapter, we train tic-tac-toe agents through self-play, which you can also play against for fun.

- Chapter 10, Introducing Machine Teaching, introduces an emerging concept in RL that focuses on leveraging the subject matter expertise of a human "teacher" to make learning easy for RL agents. We present how reward function engineering, curriculum learning, demonstration learning, and action masking can help with training autonomous agents effectively.
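
Action masking, one of the techniques mentioned here, can be sketched as follows: invalid actions' logits are pushed to a large negative value before the softmax, so the policy can never select them. The mask below is hypothetical; in practice the environment reports which actions are currently legal.

import numpy as np

# Action-masking sketch: invalid actions get effectively zero
# probability after the masked softmax.
def masked_sample(logits, valid_mask, rng):
    masked = np.where(valid_mask, logits, -1e9)
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

rng = np.random.default_rng(0)
logits = np.array([1.2, 0.3, -0.5, 2.0])
valid = np.array([True, True, False, False])    # last two actions illegal
print(masked_sample(logits, valid, rng))        # always returns 0 or 1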

- Chapter 11, Achieving Generalization and Overcoming Partial Observability, discusses why the generalization capabilities of trained RL policies matter for successful real-world implementations. To this end, the chapter focuses on the simulation-to-real gap, connects generalization and partial observability, and introduces domain randomization and memory mechanisms. We also present the CoinRun environment and results showing how traditional regularization methods can also help with generalization in RL.
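
Domain randomization can be sketched as a thin environment wrapper that perturbs a physics parameter on every reset, so the learned policy cannot overfit to a single setting. The example below assumes the classic gym API (reset() returns only the observation) and randomizes CartPole's pole length purely for illustration.

import gym
import numpy as np

# Domain-randomization sketch: perturb a physics parameter per reset.
class RandomizedCartPole(gym.Wrapper):
    def reset(self, **kwargs):
        env = self.env.unwrapped
        env.length = np.random.uniform(0.4, 0.7)          # half-pole length
        env.polemass_length = env.masspole * env.length   # keep consistent
        return self.env.reset(**kwargs)

env = RandomizedCartPole(gym.make("CartPole-v1"))
obs = env.reset()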

- Chapter 12, Meta-Reinforcement Learning, introduces approaches that allow an RL agent to adapt to a new environment once it is deployed for its task. This is one of the most important research directions towards achieving resilient autonomy through RL.

- Chapter 13, Exploring Advanced Topics, brings you up to speed with some of the most recent developments in RL, including state-of-the-art distributed RL (SEED RL), an approach that cracked all the Atari benchmarks (Agent57), and RL without simulation (offline RL).

- Chapter 14, Solving Robot Learning, puts the methods covered in the earlier chapters into practice by training a robot hand to grasp objects using manual and automated curriculum learning in PyBullet, a popular physics simulation library for Python.

- Chapter 15, Supply Chain Management, gives you hands-on experience in modeling and solving an inventory replenishment problem. Along the way, we perform hyperparameter tuning for our RL agent. The chapter concludes with a discussion on how RL can be applied to vehicle routing problems.
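
Hyperparameter tuning of this sort is typically done with Ray Tune; a minimal sketch using the tune.run API from the book's Ray era might look like the following, with CartPole standing in for the chapter's inventory environment.

from ray import tune

# Hyperparameter sweep sketch with Ray Tune (pre-2022 API): grid-search
# the learning rate for a PPO agent on a stand-in environment.
analysis = tune.run(
    "PPO",
    config={
        "env": "CartPole-v0",
        "lr": tune.grid_search([1e-4, 5e-4, 1e-3]),
        "num_workers": 2,
    },
    stop={"training_iteration": 20},
)
print(analysis.get_best_config(metric="episode_reward_mean", mode="max"))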

- Chapter 16, Personalization, Marketing, and Finance, goes beyond bandit models for personalization and discusses a news recommendation problem, introducing dueling bandit gradient descent and action embeddings along the way. The chapter also discusses marketing and finance applications of RL and introduces the TensorTrade library for the latter.

- Chapter 17, Smart City and Cybersecurity, starts by solving a traffic light control scenario as a multi-agent RL problem using the Flow framework. It then describes how RL can be applied to two other problems: providing ancillary services to a power grid and discovering cyberattacks in it.

- Chapter 18, Challenges and Future Directions in Reinforcement Learning, wraps up the book by recapping the challenges in RL and connecting them to recent developments and research in the field. Finally, we present practical suggestions for readers who want to further deepen their RL expertise.


▶About the Author
- Enes Bilgin
Enes Bilgin works as a senior AI engineer and a tech lead in Microsoft's Autonomous Systems division. He is a machine learning and operations research practitioner and researcher with experience in building production systems and models for top tech companies using Python, TensorFlow, and Ray/RLlib. He holds an M.S. and a Ph.D. in systems engineering from Boston University and a B.S. in industrial engineering from Bilkent University. In the past, he has worked as a research scientist at Amazon and as an operations research scientist at AMD. He also held adjunct faculty positions at the McCombs School of Business at the University of Texas at Austin and at the Ingram School of Engineering at Texas State University.


Development/Programming Bestsellers

  • 핸즈온 LLM (제이 알아마르, 마르턴 흐루턴도르스트)
  • 도커로 구축한 랩에서 혼자 실습하며 배우는 네트워크 프로토콜 입문 (미야타 히로시, 이민성)
  • LLM과 RAG로 구현하는 AI 애플리케이션 (에디유, 대니얼김)
  • 나만의 MCP 서버 만들기 with 커서 AI (서지영)
  • 개정판 | 밑바닥부터 시작하는 딥러닝 1 (사이토 고키, 이복연)
  • 생성형 AI 인 액션 (아미트 바리, 이준)
  • 데이터 삽질 끝에 UX가 보였다 (이미진(란란))
  • 지식그래프 (이광배, 이채원)
  • 생성형 AI를 위한 프롬프트 엔지니어링 (제임스 피닉스, 마이크 테일러)
  • 테디노트의 랭체인을 활용한 RAG 비법노트 심화편 (이경록)
  • 지속적 배포 (발렌티나 세르빌, 이일웅)
  • LLM 인 프로덕션 (크리스토퍼 브루소, 매슈 샤프)
  • 실전! 스프링 부트 3 & 리액트로 시작하는 모던 웹 애플리케이션 개발 (주하 힌쿨라, 변영인)
  • 혼자 공부하는 네트워크 (강민철)
  • 혼자 공부하는 컴퓨터 구조+운영체제 (강민철)
  • 객체지향의 사실과 오해 (조영호)
  • 그림으로 이해하는 알고리즘 (이시다 모리테루, 미야자키 쇼이치)
  • 코드 밖 커뮤니케이션 (재퀴 리드, 곽지원)
  • LLM을 활용한 실전 AI 애플리케이션 개발 (허정준, 정진호)
  • LLM 엔지니어링 (막심 라본, 폴 이우수틴)
