Why Your Monitoring Dashboard May Be Feeding You Phantom Metrics
Too Long; Didn't Read
Metrics are a powerful way to monitor our applications, but they do not necessarily represent the actual state of the system. Making them useful for answering the questions we care about takes an understanding of the math and nature of metrics, as well as careful design. It is always good to have access to the raw data alongside the metrics, since the raw data is ultimately the source of truth.
Determine your questions, then design your metrics accordingly. Understand the different aggregation functions and their characteristics, and set a sampling interval that gives you the right granularity and balances detection latency against storage volume. Use varying resolutions for different time periods to balance observability and cost, and consider downsampling where possible (note that not all aggregation functions are compatible with such calculations).
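To make the last caveat concrete, here is a minimal sketch of what happens when two already-aggregated windows are rolled up into one. The latency values and window names are made up for illustration. It suggests that max can be recombined safely, that a mean can be recovered only if sums and counts are stored, and that a median (like other percentiles) generally cannot be recomputed from per-window medians.

```python
from statistics import mean, median

# Hypothetical raw latencies (ms) from two adjacent windows of unequal size.
window_a = [10, 10, 10, 10]
window_b = [100, 200]
raw = window_a + window_b

# max composes: the max of per-window maxima equals the max of the raw data.
max_of_max = max(max(window_a), max(window_b))

# mean does not compose naively: averaging the averages ignores window sizes.
mean_of_means = mean([mean(window_a), mean(window_b)])
true_mean = mean(raw)

# Storing sum and count per window makes the mean recoverable.
recombined_mean = (sum(window_a) + sum(window_b)) / (len(window_a) + len(window_b))

# median does not compose: the per-window medians lose the distribution.
median_of_medians = median([median(window_a), median(window_b)])
true_median = median(raw)

print(f"max:    rolled-up={max_of_max}, raw={max(raw)}")
print(f"mean:   mean-of-means={mean_of_means}, raw={true_mean:.2f}, "
      f"from sum/count={recombined_mean:.2f}")
print(f"median: median-of-medians={median_of_medians}, raw={true_median}")
```

With this data the mean of means comes out at 80 while the mean of the raw values is about 56.67, and the median of medians is 80 while the raw median is 10. If rollups are planned, one reasonable design choice is to store composable aggregates (sum, count, min, max) and to keep raw data or histogram buckets for percentile questions.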