Observability - Istio Service Mesh

In the sidecar deployment model, Envoy proxies run next to application instances and intercept all of their traffic. Because every request flows through them, the proxies are also in an ideal position to collect metrics. 

The metrics Envoy proxies collect help us gain visibility into the state of our systems. This visibility is critical because we need to understand what's happening and empower operators to troubleshoot, maintain, and optimize applications. 

Istio generates three types of telemetry to provide observability to services in the mesh: 

• Metrics 

• Distributed traces 

• Access logs 


Metrics 

Istio generates metrics based on the four golden signals: latency, traffic, errors, and saturation. 

Latency represents the time it takes to service a request. These metrics should be broken down into latency of successful requests (e.g., HTTP 200) and failed requests (e.g., HTTP 500). 

Traffic measures how much demand is placed on the system, expressed in system-specific units: for example, HTTP requests per second, concurrent sessions, or retrievals per second. 

Errors measures the rate of failed requests (e.g., HTTP 500s). 

Saturation measures how full the most constrained resources of a service are, for example, the utilization of a thread pool. 
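
To make these signals concrete, here is a minimal sketch of Prometheus recording rules that derive traffic, errors, and latency from Istio's standard metrics (istio_requests_total and istio_request_duration_milliseconds). The rule names are made up for illustration, and saturation is usually taken from platform metrics such as CPU and memory utilization rather than from Istio itself.

```yaml
groups:
- name: istio-golden-signals    # illustrative group name
  rules:
  # Traffic: requests per second, per destination service
  - record: service:istio_requests:rate5m
    expr: |
      sum(rate(istio_requests_total{reporter="destination"}[5m]))
      by (destination_service)
  # Errors: rate of requests that returned an HTTP 5xx
  - record: service:istio_request_errors:rate5m
    expr: |
      sum(rate(istio_requests_total{reporter="destination", response_code=~"5.."}[5m]))
      by (destination_service)
  # Latency: 99th percentile request duration in milliseconds
  - record: service:istio_request_duration_ms:p99
    expr: |
      histogram_quantile(0.99,
        sum(rate(istio_request_duration_milliseconds_bucket{reporter="destination"}[5m]))
        by (le, destination_service))
```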

The metrics are collected at different levels, starting with the most granular, the Envoy proxy level, followed by the service level and the control plane. 


Proxy-level metrics 

Envoy plays a crucial role in generating metrics: it produces a rich set of metrics about all traffic passing through it. Using the metrics generated by Envoy, we can monitor the mesh at the lowest granularity, for example, metrics for individual listeners and clusters in the Envoy proxy. 

As mesh operators, we can control which Envoy metrics get generated and collected at each workload instance. 
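
For example, here is a minimal sketch of a per-workload override using the proxy.istio.io/config annotation and proxyStatsMatcher (available in recent Istio releases). The workload name and the inclusion patterns are illustrative; the same proxyStatsMatcher block can also be applied mesh-wide under meshConfig.defaultConfig.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin                  # hypothetical workload
spec:
  selector:
    matchLabels:
      app: httpbin
  template:
    metadata:
      labels:
        app: httpbin
      annotations:
        # Ask the sidecar to report additional Envoy statistics
        proxy.istio.io/config: |-
          proxyStatsMatcher:
            inclusionRegexps:
            - ".*outlier_detection.*"
            inclusionPrefixes:
            - "cluster.outbound"
    spec:
      containers:
      - name: httpbin
        image: docker.io/kennethreitz/httpbin
        ports:
        - containerPort: 80
```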


Service-level metrics 

The service-level metrics cover the four golden signals we mentioned earlier. These metrics allow us to monitor service-to-service communication. Additionally, Istio comes with dashboards to monitor service behavior based on these metrics. 

Just like with the proxy-level metrics, the operator can customize which service-level metrics Istio collects. 
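
As a sketch, recent Istio releases expose this through the Telemetry API. The example below (the resource name is arbitrary, and the istio-system namespace makes it mesh-wide) adds a request_host dimension to the standard request count metric and disables the request size metric entirely.

```yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default             # arbitrary name; istio-system scope = mesh-wide
  namespace: istio-system
spec:
  metrics:
  - providers:
    - name: prometheus
    overrides:
    # Add a request_host dimension to the standard request count metric
    - match:
        metric: REQUEST_COUNT
        mode: CLIENT_AND_SERVER
      tagOverrides:
        request_host:
          value: "request.host"
    # Stop collecting the request size metric altogether
    - match:
        metric: REQUEST_SIZE
        mode: CLIENT_AND_SERVER
      disabled: true
```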

Istio exports the standard set of metrics to Prometheus by default. 
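
Assuming Istio's default metrics merging, which annotates injected pods with prometheus.io/scrape, prometheus.io/port (15020), and prometheus.io/path (/stats/prometheus), a minimal sketch of a Prometheus scrape job that discovers the sidecars through those annotations could look like this (the job name is illustrative):

```yaml
scrape_configs:
- job_name: istio-sidecars
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Keep only pods that ask to be scraped
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
  # Use the metrics path advertised by the pod annotation
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    regex: (.+)
    target_label: __metrics_path__
  # Use the port advertised by the pod annotation
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
```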


Control plane metrics 

Istio also emits control plane metrics that help monitor the behavior of Istio's control plane itself, rather than the user services in the mesh. 

The control plane metrics include the number of conflicting inbound/outbound listeners, the number of clusters without instances, rejected or ignored configurations, and so on. 
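
As an illustration, two of those metrics, pilot_conflict_inbound_listener and pilot_total_xds_rejects, lend themselves to simple Prometheus alerting rules; the rule names and thresholds below are made up for the sketch.

```yaml
groups:
- name: istio-control-plane      # illustrative group name
  rules:
  # Fires when istiod starts rejecting xDS configuration pushed to proxies
  - alert: IstioXdsConfigRejected
    expr: sum(rate(pilot_total_xds_rejects[5m])) > 0
    for: 5m
    annotations:
      summary: "istiod is rejecting xDS configuration"
  # Fires when workloads end up with conflicting inbound listeners
  - alert: IstioConflictingInboundListeners
    expr: sum(pilot_conflict_inbound_listener) > 0
    for: 5m
    annotations:
      summary: "Conflicting inbound listener configuration detected"
```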

