
Python Feature Engineering Cookbook

Over 70 recipes for creating, engineering, and transforming features to build machine learning models

Purchase type
  • Own (permanent purchase)
E-book list price
  • 21,000 KRW
Sale price
  • 21,000 KRW
Publication
  • E-book released 2020.01.22
Listening
  • TTS (text-to-speech) supported
File information
  • PDF
  • 364 pages
  • 8.5 MB
Supported environments
  • PC viewer
  • PAPER
ISBN
  • 9781789807820
UCI
  • -

Book Information

Extract accurate information from data to train and improve machine learning models using NumPy, SciPy, pandas, and scikit-learn libraries

▶Book Description
Feature engineering is invaluable for developing and enriching your machine learning models. In this cookbook, you will work with the best tools to streamline your feature engineering pipelines and techniques, and to simplify and improve the quality of your code.

Using Python libraries such as pandas, scikit-learn, Featuretools, and Feature-engine, you'll learn how to work with both continuous and discrete datasets and be able to transform features from unstructured datasets. You will develop the skills necessary to select the best features as well as the most suitable extraction techniques. This book will cover Python recipes that will help you automate feature engineering to simplify complex processes. You'll also get to grips with different feature engineering strategies, such as the Box-Cox transform, power transform, and log transform, across machine learning, reinforcement learning, and natural language processing (NLP) domains.

By the end of this book, you'll have discovered tips and practical solutions to all of your feature engineering problems.

▶What You Will Learn
⦁ Simplify your feature engineering pipelines with powerful Python packages
⦁ Get to grips with imputing missing values
⦁ Encode categorical variables with a wide set of techniques
⦁ Extract insights from text quickly and effortlessly
⦁ Develop features from transactional data and time series data
⦁ Derive new features by combining existing variables
⦁ Understand how to transform, discretize, and scale your variables
⦁ Create informative variables from date and time

▶Key Features
⦁ Discover solutions for feature generation, feature extraction, and feature selection
⦁ Uncover the end-to-end feature engineering process across continuous, discrete, and unstructured datasets
⦁ Implement modern feature extraction techniques using Python's pandas, scikit-learn, SciPy, and NumPy libraries

▶Who This Book Is For
This book is for machine learning professionals, AI engineers, data scientists, and NLP and reinforcement learning engineers who want to optimize and enrich their machine learning models with the best features. Knowledge of machine learning and Python coding will assist you with understanding the concepts covered in this book.

▶What this book covers
⦁ Chapter 1, Foreseeing Variable Problems in Building ML Models, covers how to identify the different problems that variables may present and that challenge machine learning algorithm performance. We'll learn how to identify missing data in variables, quantify the cardinality of the variable, and much more besides.
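
As a rough sketch of the kind of diagnostics this chapter is concerned with, the snippet below uses pandas to measure missing data and cardinality on a made-up DataFrame; the column names are hypothetical and the code is not taken from the book.

    import pandas as pd

    # Hypothetical dataset; the book works with its own example datasets.
    df = pd.DataFrame({
        "age": [25, None, 40, 31],
        "city": ["London", "Paris", "London", None],
    })

    # Fraction of missing values per variable.
    print(df.isnull().mean())

    # Cardinality: number of distinct values per variable.
    print(df.nunique())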

⦁ Chapter 2, Imputing Missing Data, explains how to engineer variables that show missing information for some observations. In a typical dataset, variables will display values for certain observations, while values will be missing for other observations. We'll introduce various techniques to fill in those missing values, along with the code to execute each technique.
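
By way of illustration only, one common way to perform this kind of imputation is scikit-learn's SimpleImputer; the small array below is made up, and the snippet is a sketch of the general idea rather than one of the book's recipes.

    import numpy as np
    from sklearn.impute import SimpleImputer

    # Hypothetical feature matrix with missing entries.
    X = np.array([[25.0, 50000.0],
                  [np.nan, 62000.0],
                  [40.0, np.nan]])

    # Replace missing entries with the median of each column.
    imputer = SimpleImputer(strategy="median")
    X_imputed = imputer.fit_transform(X)
    print(X_imputed)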

⦁ Chapter 3, Encoding Categorical Variables, introduces various classical and widely used techniques to transform categorical variables into numerical variables, demonstrates a technique for reducing the dimensionality of variables with high cardinality, and shows how to tackle infrequent values. This chapter also includes more complex techniques for encoding categorical variables, as described and used in the 2009 KDD competition.
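
As a minimal sketch of one of the simpler encoding techniques mentioned here, one-hot encoding with pandas could look like this; the variable and category names are invented for illustration and this is not the book's code.

    import pandas as pd

    # Hypothetical categorical variable.
    df = pd.DataFrame({"colour": ["blue", "red", "blue", "green"]})

    # One-hot encode the variable into binary indicator columns.
    encoded = pd.get_dummies(df["colour"], prefix="colour")
    print(encoded)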

⦁ Chapter 4, Transforming Numerical Variables, uses various recipes to transform numerical variables, typically non-Gaussian, into variables that follow a more Gaussian-like distribution by applying multiple mathematical functions.
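
For instance, the log and Box-Cox transforms named in the book description can be applied with NumPy and SciPy roughly as follows; the data is made up and this is only an illustrative sketch, not the book's code.

    import numpy as np
    from scipy import stats

    # Skewed, strictly positive data (hypothetical example).
    x = np.array([1.0, 2.0, 2.5, 3.0, 50.0])

    # Log transform.
    x_log = np.log(x)

    # Box-Cox transform; SciPy also returns the fitted lambda parameter.
    x_boxcox, fitted_lambda = stats.boxcox(x)
    print(x_log, x_boxcox, fitted_lambda)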

⦁ Chapter 5, Performing Variable Discretization, covers how to create bins and distribute the values of the variables across them. The aim of this technique is to improve the spread of values across a range. It includes well-established and frequently used techniques like equal-width and equal-frequency discretization, as well as more complex processes like discretization with decision trees, among others.
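
A rough sketch of equal-width versus equal-frequency discretization with pandas, on invented data (not one of the book's recipes), follows.

    import pandas as pd

    # Hypothetical numerical variable.
    s = pd.Series([1, 7, 5, 4, 6, 3, 12, 8])

    # Equal-width discretization: bins spanning ranges of equal size.
    equal_width = pd.cut(s, bins=3)

    # Equal-frequency discretization: bins holding roughly equal counts.
    equal_freq = pd.qcut(s, q=3)
    print(equal_width.value_counts(), equal_freq.value_counts())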

⦁ Chapter 6, Working with Outliers, teaches a few mainstream techniques to remove outliers from the variables in the dataset. We'll also learn how to cap outliers at a given arbitrary minimum/maximum value.
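
One simple way to cap values at arbitrary minimum and maximum bounds is pandas' clip, sketched below on made-up data; this is an illustration of the idea, not the book's code.

    import pandas as pd

    # Hypothetical variable containing outliers.
    s = pd.Series([3, 5, 250, 7, -40, 6])

    # Cap values at arbitrary lower and upper bounds.
    capped = s.clip(lower=0, upper=100)
    print(capped)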

⦁ Chapter 7, Deriving Features from Dates and Time Variables, describes how to create features from dates and time variables. Date variables can't be used as they are to build machine learning models, for multiple reasons. We'll learn how to combine information from multiple time variables, such as calculating the time elapsed between them, and, importantly, how to work with variables in different time zones.
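
As an illustration of the kind of features that can be pulled out of datetime variables with pandas (the column names are hypothetical; this is not the book's code):

    import pandas as pd

    # Hypothetical datetime variables.
    df = pd.DataFrame({
        "signup": pd.to_datetime(["2019-01-15 08:00", "2019-03-02 17:30"]),
        "first_purchase": pd.to_datetime(["2019-01-20 10:00", "2019-03-05 09:00"]),
    })

    # Extract calendar parts from a datetime variable.
    df["signup_month"] = df["signup"].dt.month
    df["signup_dayofweek"] = df["signup"].dt.dayofweek

    # Time elapsed between two datetime variables, in days.
    df["days_to_purchase"] = (df["first_purchase"] - df["signup"]).dt.days
    print(df)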

⦁ Chapter 8, Performing Feature Scaling, covers the methods that we can use to put variables on the same scale. We'll learn how to standardize variables, how to scale them to their minimum and maximum values, and how to perform mean normalization or scale to the vector norm, among other techniques.
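
A minimal sketch of two of these scaling techniques with scikit-learn, on made-up data (not one of the book's recipes):

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler, StandardScaler

    # Hypothetical feature matrix with columns on very different scales.
    X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])

    # Standardization: zero mean, unit variance per column.
    X_std = StandardScaler().fit_transform(X)

    # Min-max scaling: rescale each column to the [0, 1] range.
    X_minmax = MinMaxScaler().fit_transform(X)
    print(X_std, X_minmax)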

⦁ Chapter 9, Applying Mathematical Computations to Features, explains how to create new variables from existing ones by utilizing different mathematical computations. We'll learn how to create new features through the addition/difference/multiplication/division of existing variables and more. We will also learn how to expand the feature space with polynomial expansion and how to combine features using decision trees.
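
For example, deriving a ratio feature and expanding the feature space with polynomial terms might look roughly like this with pandas and scikit-learn; the column names are invented and this is only a sketch, not the book's code.

    import pandas as pd
    from sklearn.preprocessing import PolynomialFeatures

    # Hypothetical variables.
    df = pd.DataFrame({"income": [30000, 45000], "debt": [5000, 20000]})

    # New feature from the ratio of two existing variables.
    df["debt_to_income"] = df["debt"] / df["income"]

    # Polynomial expansion: adds squares and pairwise products of the inputs.
    poly = PolynomialFeatures(degree=2, include_bias=False)
    expanded = poly.fit_transform(df[["income", "debt"]])
    print(df, expanded.shape)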

⦁ Chapter 10, Creating Features with Transactional and Time Series Data, covers how to create static features from transactional information, so that we obtain a static view of a customer, or client, at any point in time. We'll learn how to aggregate features using math operations across transactions in specific time windows, and how to capture the time between transactions. We'll also discuss how to determine the time between special events. We'll briefly dive into signal processing and learn how to determine and quantify local maxima and local minima.
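
As a rough sketch of collapsing transactions into static, per-customer features with pandas (hypothetical column names; not the book's code):

    import pandas as pd

    # Hypothetical transaction log.
    transactions = pd.DataFrame({
        "customer_id": [1, 1, 1, 2, 2],
        "timestamp": pd.to_datetime([
            "2020-01-01", "2020-01-05", "2020-01-20", "2020-01-02", "2020-01-03",
        ]),
        "amount": [20.0, 35.0, 10.0, 100.0, 80.0],
    })

    # Static view of each customer: aggregate transactions with math operations.
    static = transactions.groupby("customer_id")["amount"].agg(["mean", "max", "sum"])

    # Time between consecutive transactions, per customer.
    transactions = transactions.sort_values(["customer_id", "timestamp"])
    transactions["days_since_prev"] = (
        transactions.groupby("customer_id")["timestamp"].diff().dt.days
    )
    print(static, transactions)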

⦁ Chapter 11, Extracting Features from Text Variables, explains how to derive features from text variables. We'll learn how to capture the complexity of the text by counting the number of characters, words, and sentences, and by measuring the vocabulary and lexical variety. We'll also learn how to create a bag of words and how to implement TF-IDF, with and without n-grams.
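
A minimal sketch of some of these text features, using pandas and scikit-learn on two toy sentences (not one of the book's recipes):

    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    # Hypothetical text variable.
    texts = pd.Series([
        "Feature engineering improves machine learning models",
        "Text variables need their own feature engineering",
    ])

    # Simple complexity features: counts of characters and words.
    n_chars = texts.str.len()
    n_words = texts.str.split().str.len()

    # Bag of words and TF-IDF representations (here with unigrams and bigrams).
    bow = CountVectorizer().fit_transform(texts)
    tfidf = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(texts)
    print(n_chars.tolist(), n_words.tolist(), bow.shape, tfidf.shape)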

Author Information

▶About the Author
- Soledad Galli
Soledad Galli is a lead data scientist with more than 10 years of experience in world-class academic institutions and renowned businesses. She has researched, developed, and put into production machine learning models for insurance claims, credit risk assessment, and fraud prevention. Soledad received a Data Science Leaders' award in 2018 and was named one of LinkedIn's voices in data science and analytics in 2019. She is passionate about enabling people to step into and excel in data science, which is why she mentors data scientists and speaks regularly at data science meetings. She also teaches online machine learning courses on a prestigious Massive Open Online Course platform, which have reached more than 10,000 students worldwide.
