▶ Book Description
R is a high-level statistical language that is widely used among statisticians and data miners to develop analytical applications. Often, data analysts with great analytical skills lack solid programming knowledge and are unfamiliar with the correct ways to use R. Based on R version 3.4, this book will help you develop strong fundamentals when working with R by taking you through a series of full representative examples, giving you a holistic view of R.
We begin with the basic installation and configuration of the R environment. As you progress through the exercises, you'll become thoroughly acquainted with R's features and its packages. With this book, you will learn about the basic concepts of R programming, work efficiently with graphs, create publication-ready and interactive 3D graphs, and gain a better understanding of the data at hand. The detailed step-by-step instructions will enable you to get a clean set of data, produce good visualizations, and create reports for the results. The book also teaches you various methods for profiling code and improving performance through good programming practices, delegation, and parallelization.
By the end of this book, you will know how to efficiently work with data, create quality visualizations and reports, and develop code that is modular, expressive, and maintainable.
▶ What You Will Learn
- Discover techniques to leverage R’s features, and work with packages
- Perform a descriptive analysis and work with statistical models using R
- Work efficiently with objects without using loops
- Create diverse visualizations to gain a better understanding of the data
- Understand ways to produce good visualizations and create reports for the results
- Read and write data from relational databases and REST APIs, both packaged and unpackaged
- Improve performance by writing better code, delegating that code to a more efficient programming language, or making it parallel
▶ Key Features
- Get a firm hold on the fundamentals of R through practical hands-on examples
- Get started with good R programming fundamentals for data science
- Exploit R's different libraries to build interesting applications
▶ Who This Book Is For
This book is for those who wish to develop software in R. You don't need to be an expert or professional programmer to follow this book, but you do need to be interested in learning how R works. My hope is that this book is useful for people ranging from beginners to advanced users by providing hands-on examples that may help you understand R in ways you previously did not.
▶ What This Book Covers
- Chapter 1, Introduction to R, covers the R basics you need to understand the rest of the examples. It is not meant to be a thorough introduction to R. Rather, it's meant to give you the very basic concepts and techniques you need to quickly get started with the three examples contained in the book, which I introduce next.
(This book uses three examples to showcase R's wide range of functionality. The first example shows how to analyze votes with descriptive statistics and linear models, and it is presented in Chapter 2, Understanding Votes with Descriptive Statistics and Chapter 3, Predicting Votes with Linear Models.)
- Chapter 2, Understanding Votes with Descriptive Statistics, shows how to programmatically create hundreds of graphs to identify relations within data visually. It shows how to create histograms, scatter plots, and correlation matrices, and how to perform Principal Component Analysis (PCA).
- Chapter 3, Predicting Votes with Linear Models, shows how to programmatically find the best predictive linear model for a set of data according to different success metrics. It also shows how to check model assumptions and how to use cross-validation to increase confidence in your results.
(The second example shows how to simulate data, visualize it, analyze its text components, and create automatic presentations with it.)
- Chapter 4, Simulating Sales Data and Working with Databases, shows how to design a data schema and simulate the various types of data. It also shows how to integrate real text data with simulated data, and how to use a SQL database to access it more efficiently.
- Chapter 5, Communicating Sales with Visualization, shows how to produce graphs ranging from basic to advanced and highly customized. It also shows how to create dynamic 3D graphs and interactive maps.
- Chapter 6, Understanding Reviews with Text Analysis, shows how to perform text analysis step by step using Natural Language Processing (NLP) techniques, as well as sentiment analysis.
- Chapter 7, Developing Automatic Presentations, shows how to put together the results of previous chapters to create presentations that can be automatically updated with the latest data using tools such as knitr and R Markdown.
(Finally, the third example shows how to design and develop complex object-oriented systems that retrieve real-time data from cryptocurrency markets, as well as how to optimize implementations and how to build web applications around such systems.)
- Chapter 8, Object-Oriented System to Track Cryptocurrencies, introduces basic object-oriented techniques that produce complex systems when combined. Furthermore, it shows how to work with three of R’s most used object models, which are S3, S4, and R6, as well as how to make them work together.
- Chapter 9, Implementing an Efficient Simple Moving Average, shows how to iteratively improve an implementation of a Simple Moving Average (SMA), starting with what is considered to be bad code and progressing all the way to advanced optimization techniques using parallelization and delegation to the Fortran and C++ languages.
- Chapter 10, Adding Interactivity with Dashboards, shows how to wrap what was built during the previous two chapters to produce a modern web application using reactive programming through the Shiny package.
- Appendix, Required Packages, shows how to install the internal and external software necessary to replicate the examples in the book. Specifically, it will walk through the installation processes for Linux and macOS, but Windows follows similar principles and should not cause any problems.