
How Performance Manager determines the performance impact for an incident

Performance Manager uses the deviation in activity, utilization, write throughput, cluster component usage, or I/O response time for a workload to determine the level of impact to workload performance. This information determines the role of each workload in the incident and how they are ranked on the Incident Details page.

Performance Manager compares the last analyzed values for a workload to the expected range of values. The difference between the values last analyzed and the expected range of values identifies the workloads whose performance was most impacted by the incident.

For example, suppose a cluster contains two workloads: Workload A and Workload B. The expected range for Workload A is 5-10 milliseconds per operation (ms/op) and its actual response time is usually around 7 ms/op. The expected range for Workload B is 10-20 ms/op and its actual response time is usually around 15 ms/op. Both workloads are well within their expected range for response time. Due to contention on the cluster, the response time of both workloads increases to 40 ms/op, crossing the performance threshold, which is the upper bound of the expected range, and triggering incidents. The deviation in response time, from the expected values (the usual response times) to the values above the performance threshold, is around 33 ms/op for Workload A (40 - 7) and around 25 ms/op for Workload B (40 - 15). The response time of both workloads spikes to 40 ms/op, but Workload A had the bigger performance impact because it had the higher response time deviation at 33 ms/op.
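The arithmetic in this example can be reproduced with a short Python sketch. It is an illustration only, not Performance Manager's actual algorithm, which this topic does not describe. The Workload class, its field names, and the choice to measure deviation from the usual response time are assumptions taken from the numbers in the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical record; field names are illustrative, not product fields.
    name: str
    expected_low: float   # lower bound of the expected range, in ms/op
    expected_high: float  # upper bound, treated as the performance threshold
    usual: float          # usual response time, in ms/op
    observed: float       # last analyzed response time, in ms/op

    def crossed_threshold(self) -> bool:
        # An incident is triggered when the observed value exceeds the upper bound.
        return self.observed > self.expected_high

    def deviation(self) -> float:
        # Assumption: deviation is measured from the usual value, as in the example.
        return self.observed - self.usual


workloads = [
    Workload("Workload A", 5, 10, 7, 40),
    Workload("Workload B", 10, 20, 15, 40),
]

# Rank the workloads that crossed their threshold, largest deviation first.
ranked = sorted(
    (w for w in workloads if w.crossed_threshold()),
    key=lambda w: w.deviation(),
    reverse=True,
)

for w in ranked:
    print(f"{w.name}: deviation {w.deviation():.0f} ms/op")
```

With these inputs, Workload A is listed first with a deviation of 33 ms/op, followed by Workload B at 25 ms/op, even though both workloads reached the same response time of 40 ms/op.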

On the Incident Details page, in the Workload Details table, you can sort workloads by their deviation in activity, utilization, or throughput for a cluster component. You can also sort workloads by response time. When you select a sort option, Performance Manager analyzes how far the activity, utilization, throughput, or response time has deviated from the expected values since the incident was detected, and uses that deviation to determine the workload sort order. For the response time, the red dots (Performance Manager incident icon) indicate a performance threshold crossing by a victim workload, and the subsequent impact to the response time. Each red dot indicates a higher level of deviation in response time, which helps you identify the victim workloads whose response time was impacted the most by an incident.