Get well versed in state-of-the-art machine learning and deep learning techniques for tailoring training processes and boosting the performance of computer vision models
▶Book Description
Computer vision is a scientific field that enables machines to identify and process digital images and videos. This book focuses on independent recipes to help you perform various computer vision tasks using TensorFlow.
The book begins by taking you through the basics of deep learning for computer vision, along with covering TensorFlow 2.x's key features, such as the Keras and tf.data.Dataset APIs. You'll then learn about the ins and outs of common computer vision tasks, such as image classification, transfer learning, image enhancing and styling, and object detection. The book also covers autoencoders in domains such as inverse image search indexes and image denoising, while offering insights into various architectures used in the recipes, such as convolutional neural networks (CNNs), region-based CNNs (R-CNNs), VGGNet, and You Only Look Once (YOLO).
Moving on, you'll discover tips and tricks to solve any problems faced while building various computer vision applications. Finally, you'll delve into more advanced topics such as Generative Adversarial Networks (GANs), video processing, and AutoML, concluding with a section focused on techniques to help you boost the performance of your networks.
By the end of this TensorFlow book, you'll be able to confidently tackle a wide range of computer vision problems using TensorFlow 2.x.
▶What You Will Learn
-Understand how to detect objects using state-of-the-art models such as YOLOv3
-Use AutoML to predict gender and age from images
-Segment images using different approaches such as FCNs and generative models
-Learn how to improve your network's performance using rank-N accuracy, label smoothing, and test-time augmentation
-Enable machines to recognize people's emotions in videos and real-time streams
-Access and reuse advanced TensorFlow Hub models to perform image classification and object detection
-Generate captions for images using CNNs and RNNs
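Two of the techniques named in the list above, rank-N accuracy and label smoothing, are small enough to illustrate directly. The following is a minimal sketch in plain Python, not code from the book (whose recipes use TensorFlow 2.x). The function names `rank_n_accuracy` and `smooth_labels` and the toy numbers are assumptions made for illustration only.

```python
def rank_n_accuracy(probabilities, labels, n=5):
    """Fraction of samples whose true label is among the n highest-scoring classes."""
    hits = 0
    for scores, label in zip(probabilities, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:n]:
            hits += 1
    return hits / len(labels)


def smooth_labels(one_hot, factor=0.1):
    """Spread `factor` of the probability mass uniformly over all classes."""
    num_classes = len(one_hot)
    return [value * (1.0 - factor) + factor / num_classes for value in one_hot]


# Toy predictions for three samples over four classes (made-up numbers).
probs = [
    [0.10, 0.60, 0.20, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.30, 0.50],
]
labels = [1, 1, 0]

rank_1 = rank_n_accuracy(probs, labels, n=1)  # only the first sample is a hit
rank_2 = rank_n_accuracy(probs, labels, n=2)  # the second sample now counts too
smoothed = smooth_labels([0.0, 1.0, 0.0, 0.0], factor=0.1)
```

Rank-N accuracy is a more forgiving metric than plain accuracy, which helps when classes are visually similar. Label smoothing softens the one-hot targets so that the network is penalized for becoming overconfident.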
▶Key Features
-Develop, train, and use deep learning algorithms for computer vision tasks using TensorFlow 2.x
-Discover practical recipes to overcome various challenges faced while building computer vision models
-Enable machines to gain a human-level understanding of digital images and videos so that they can recognize and analyze them
▶Who This Book Is For
This book is for computer vision developers and engineers, as well as deep learning practitioners looking for go-to solutions to various problems that commonly arise in computer vision. You will discover how to employ modern machine learning (ML) techniques and deep learning architectures to perform a plethora of computer vision tasks. Basic knowledge of Python programming and computer vision is required.
▶What This Book Covers
- Chapter 1, Getting Started with TensorFlow 2.x for Computer Vision, serves as an overview of basic deep learning concepts, as well as being a first look at some important TensorFlow 2.x features, such as the Keras and tf.data.Dataset APIs. It also teaches you about common and necessary tasks such as saving and loading a model and visualizing a network architecture. It ends with the implementation of a simple image classifier.
- Chapter 2, Performing Image Classification, goes in-depth about the most common application of deep neural networks to computer vision: image classification. It explores the common varieties of classification, such as binary and multiclass classification, and then transitions to examples of multilabel classification and out-of-the-box solutions using transfer learning and TensorFlow Hub.
- Chapter 3, Harnessing the Power of Pre-Trained Networks with Transfer Learning, focuses on transfer learning, a powerful technique for reusing networks pre-trained on massive datasets to increase development productivity and the performance of deep learning-powered computer vision applications. The chapter starts by using pre-trained networks as feature extractors. Then, you will learn how to combine deep learning with traditional machine learning algorithms through a procedure called incremental learning. Finally, the chapter closes with two examples of fine-tuning: the first using the Keras API and the second relying on TensorFlow Hub.
- Chapter 4, Enhancing and Styling Images with DeepDream, Neural Style Transfer, and Image Super-Resolution, focuses on fun and less conventional applications of deep neural networks in computer vision, namely DeepDream, neural style transfer, and image super-resolution.
- Chapter 5, Reducing Noise with Autoencoders, goes over autoencoders, a composite architecture used in domains such as image restoration, inverse image search indexes, and image denoising. It starts by introducing the dense and convolutional variants of autoencoders and then explains several applications, such as inverse image search engines and outlier detection.
- Chapter 6, Generative Models and Adversarial Attacks, introduces you to many examples and applications of Generative Adversarial Networks (GANs). The chapter ends with an example of how to perform an adversarial attack on convolutional neural networks.
- Chapter 7, Captioning Images with CNNs and RNNs, focuses on how to combine both convolutional and recurrent neural networks to generate textual descriptions of images.
- Chapter 8, Fine-Grained Understanding of Images through Segmentation, focuses on image segmentation, a fine-grained form of image classification performed at the pixel level. It covers seminal segmentation architectures, such as U-Net and Mask R-CNN.
- Chapter 9, Localizing Elements in Images with Object Detection, covers the complex yet common task of object detection. It goes over both traditional approaches, based on image pyramids and sliding windows, and more modern solutions, such as YOLO. It includes a thorough explanation of how to leverage the TensorFlow Object Detection API to train state-of-the-art models on custom datasets.
- Chapter 10, Applying the Power of Deep Learning to Videos, expands the application of deep neural networks to videos. Here, you will find examples of how to detect emotions, recognize actions, and generate frames in a video.
- Chapter 11, Streamlining Network Implementation with AutoML, explores the exciting subfield of AutoML using AutoKeras, an experimental library built on top of TensorFlow 2.x, which uses Neural Architecture Search (NAS) to search for a well-performing model for a given problem. The chapter starts by exploring the basic features of AutoKeras and closes by using AutoML to create an age and gender prediction tool.
- Chapter 12, Boosting Performance, explains in detail many different techniques that can be used to boost the performance of a network, from simple but powerful methods, such as using ensembles, to more advanced ones, such as using GradientTape to tailor the training process to the specific needs of a project.
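As a rough illustration of the ensembling idea mentioned for Chapter 12, the sketch below averages the class probabilities produced by several models and picks the highest-scoring class. It is plain Python rather than the book's TensorFlow code, and the helper names (`ensemble_average`, `predicted_classes`) and the prediction values are hypothetical.

```python
def ensemble_average(member_predictions):
    """Average class probabilities across models, sample by sample.

    `member_predictions` is a list with one entry per model; each entry is a
    list of per-sample probability vectors of equal length.
    """
    n_models = len(member_predictions)
    averaged = []
    for per_sample in zip(*member_predictions):
        # per_sample holds one probability vector per model for the same input.
        averaged.append([sum(column) / n_models for column in zip(*per_sample)])
    return averaged


def predicted_classes(probabilities):
    """Index of the highest-probability class for each sample."""
    return [max(range(len(p)), key=lambda i: p[i]) for p in probabilities]


# Made-up outputs of three models for two samples and two classes.
model_a = [[0.6, 0.4], [0.3, 0.7]]
model_b = [[0.4, 0.6], [0.2, 0.8]]
model_c = [[0.7, 0.3], [0.6, 0.4]]

avg = ensemble_average([model_a, model_b, model_c])
classes = predicted_classes(avg)
```

Averaging tends to cancel out the uncorrelated errors of individual models, which is why even this simple scheme can improve on any single member of the ensemble.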