▶Book Description
Need to turn your programming skills into effective data science skills? Principles of Data Science is designed to help you connect the dots between mathematics, programming, and business analysis. With this book, you'll feel confident about asking and answering complex and sophisticated questions of your data to move from abstract and raw statistics to actionable ideas.
With a unique approach that bridges the gap between mathematics and computer science, this book takes you through the entire data science pipeline. Beginning with cleaning and preparing data, and effective data mining strategies and techniques, you'll move on to build a comprehensive picture of how every piece of the data science puzzle fits together. Learn the fundamentals of computational mathematics and statistics, as well as some of the pseudocode used today by data scientists and analysts. You'll get to grips with machine learning, discover the statistical models that help you take control and navigate even the densest datasets, and find out how to create powerful visualizations that communicate what your data means.
▶What You Will Learn
- Get to know the five most important steps of data science
- Use your data intelligently and learn how to handle it with care
- Bridge the gap between mathematics and programming
- Learn about probability, calculus, and how to use statistical models to control and clean your data and drive actionable results
- Build and evaluate baseline machine learning models
- Explore the most effective metrics to determine the success of your machine learning models
- Create data visualizations that communicate actionable insights
- Read and apply machine learning concepts to your problems and make actual predictions
▶Key Features
- Enhance your knowledge of coding with data science theory for practical insight into data science and analysis
- Go beyond a math class and learn how to perform real-world data science tasks with R and Python
- Create actionable insights and transform raw data into tangible value
▶Who This Book Is For
This book is for people who are looking to understand and utilize the basic practices of data science for any domain.
The reader should be fairly well acquainted with basic mathematics (algebra, perhaps probabilities) and should feel comfortable reading snippets of R/Python as well as pseudocode. The reader is not expected to have worked in a data field; however, they should have the urge to learn and apply the techniques put forth in this book to either their own datasets or those provided to them.
▶What This Book Covers
- Chapter 1, How to Sound Like a Data Scientist, gives an introduction to the basic terminology used by data scientists and a look at the types of problem we will be solving throughout this book.
- Chapter 2, Types of Data, looks at the different levels and types of data out there and how to manipulate each type. This chapter will begin to deal with the mathematics needed for data science.
- Chapter 3, The Five Steps of Data Science, uncovers the five basic steps of performing data science, including data manipulation and cleaning, and walks through examples of each step in detail.
- Chapter 4, Basic Mathematics, helps us discover the basic mathematical principles that guide the actions of data scientists by seeing and solving examples in calculus, linear algebra, and more.
- Chapter 5, Impossible or Improbable – A Gentle Introduction to Probability, is a beginner's look into probability theory and how it is used to gain an understanding of our random universe.
- Chapter 6, Advanced Probability, uses principles from the previous chapter and introduces and applies theorems, such as Bayes' theorem, in the hope of uncovering the hidden meaning in our world.
- Chapter 7, Basic Statistics, deals with the types of problem that statistical inference attempts to explain, using the basics of experimentation, normalization, and random sampling.
- Chapter 8, Advanced Statistics, uses hypothesis testing and confidence intervals to gain insight from our experiments. It also covers how to pick the appropriate test and how to interpret p-values and other results.
- Chapter 9, Communicating Data, explains how correlation and causation affect our interpretation of data. We will also be using visualizations in order to share our results with the world.
- Chapter 10, How to Tell If Your Toaster Is Learning – Machine Learning Essentials, focuses on the definition of machine learning and looks at real-life examples of how and when machine learning is applied. It also introduces the basics of why model evaluation matters.
- Chapter 11, Predictions Don't Grow on Trees, or Do They?, looks at more complicated machine learning models, such as decision trees and Bayesian-based predictions, in order to solve more complex data-related tasks.
- Chapter 12, Beyond the Essentials, introduces some of the mysterious forces guiding data science, including bias and variance. Neural networks are introduced as a modern deep learning technique.
- Chapter 13, Case Studies, uses an array of case studies in order to solidify the ideas of data science. We will be following the entire data science workflow from start to finish multiple times for different examples, including stock price prediction and handwriting detection.