▶Book Description
Prometheus is an open source monitoring system. It provides a modern time series database, a robust query language, several metric visualization possibilities, and a reliable alerting solution for traditional and cloud-native infrastructure.
This book covers the fundamental concepts of monitoring and explores Prometheus architecture, its data model, and how metric aggregation works. Multiple test environments are included to help explore different configuration scenarios, such as the use of various exporters and integrations. You’ll delve into PromQL, supported by several examples, and then apply that knowledge to writing and testing alerting and recording rules. After that, the book thoroughly covers alert routing with Alertmanager and creating visualizations with Grafana. In addition, it covers several service discovery mechanisms and provides an example of how to create your own. Finally, you’ll learn about Prometheus federation, cross-shard aggregation, and long-term storage with the help of Thanos.
By the end of this book, you’ll be able to implement and scale Prometheus as a full monitoring system on-premises, in cloud environments, in standalone instances, or using container orchestration with Kubernetes.
▶What You Will Learn
- Grasp monitoring fundamentals and implement them using Prometheus
- Discover how to extract metrics from common infrastructure services
- Find out how to take full advantage of PromQL
- Design a highly available, resilient, and scalable Prometheus stack
- Explore the power of the Kubernetes Prometheus Operator
- Understand concepts such as federation and cross-shard aggregation
- Unlock seamless global views and long-term retention in cloud-native apps with Thanos
▶Key Features
- Integrate Prometheus with Alertmanager and Grafana for building a complete monitoring system
- Explore PromQL, Prometheus' functional query language, with easy-to-follow examples
- Learn how to deploy Prometheus components using Kubernetes and traditional instances
▶Who This Book Is For
If you’re a software developer, cloud administrator, site reliability engineer, DevOps enthusiast, or system admin looking to set up a fail-safe monitoring and alerting system for sustaining infrastructure security and performance, this book is for you. Basic networking and infrastructure monitoring knowledge will help you understand the concepts covered in this book.
▶What This Book Covers
- Chapter 1, Monitoring Fundamentals, lays the foundations of several key concepts that are used throughout the book. This chapter also explores the approach Prometheus takes to metric collection and why some controversial decisions are vital for the design and architecture of its stack.
- Chapter 2, An Overview of the Prometheus Ecosystem, provides a high-level overview of the entire Prometheus ecosystem: which components perform which jobs, and how they interoperate.
- Chapter 3, Setting Up a Test Environment, presents the fundamentals of how to use the test environments provided throughout the book, and how to tinker with them to validate different configurations.
- Chapter 4, Prometheus Metrics Fundamentals, explores metrics, the core resource of Prometheus. Understanding them correctly is essential to fully utilize, manage, or even extend the Prometheus stack.
- Chapter 5, Running a Prometheus Server, focuses on the Prometheus server, providing common patterns of usage and full setup process scenarios for virtual machines and containers.
- Chapter 6, Exporters and Integrations, introduces some of the most useful exporters available and provides examples of how to use them.
- Chapter 7, Prometheus Query Language – PromQL, dives into the powerful and flexible Prometheus query language to leverage its multi-dimensional data model, which allows ad hoc aggregation and the combination of time series.
- Chapter 8, Troubleshooting and Validation, provides useful guidelines on how to quickly detect and fix problems. It also presents useful endpoints that expose critical information and explores promtool, the Prometheus command-line interface and validation tool.
- Chapter 9, Defining Alerting and Recording Rules, covers the usage and testing of recording and alerting rules, providing examples along the way.
- Chapter 10, Discovering and Creating Grafana Dashboards, delves into the visualization components of the Prometheus stack, covering not only the built-in console functionality but also exploring Grafana and how to build, share, and reuse dashboards.
- Chapter 11, Understanding and Extending Alertmanager, introduces the alerting component of the stack, showing how to integrate it with several different alerting providers and how to correctly set up clustering to enable high availability with alert deduplication.
- Chapter 12, Choosing the Right Service Discovery, explores multiple service discovery integrations and gives you the requirements and knowledge to build your own integration if required.
- Chapter 13, Scaling and Federating Prometheus, tackles the scaling of a Prometheus stack, introducing and explaining concepts such as sharding and global views.
- Chapter 14, Integrating Long-Term Storage with Prometheus, covers the concepts of the Prometheus read and write endpoints. Then, it deep-dives into considerations for external and long-term metric storage. Finally, it introduces an end-to-end example using Thanos.