Last9 uses a method that only considers windows with available data within the compliance period. This method sets a higher performance expectation: it increases the impact of bad windows and leads to more alerts, which results in better quality control. Given that the objective of setting an SLO is to ensure a certain level of quality, it makes sense to keep a higher bar.
To understand this better, let’s consider an example where an SLO setup is as follows:
SLO name | P90 latency |
---|---|
SLO definition | P90 < 900000 ms must be true 99% of the time over 1 day |
Compliance target | 99% good windows |
Compliance duration | 1 day |
Put simply, the P90 latency should be below 900 seconds (15 minutes) for 99% of the time within the 1-day compliance duration, whenever the SLO is checked.
For 1 day, the compliance window is divided into 1440 rolling windows* of one minute each.
<aside> 💡 1 day = 24 hours x 60 mins = 1440 mins
*Rolling window: A rolling window is expressed relative to the current time and automatically shifts with the passage of time. At any point in time, the last 1440 windows represent the rolling window for a 1-day duration.
</aside>
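To make the rolling-window idea concrete, here is a minimal Python sketch. It is only an illustration and not Last9's implementation; the class name `RollingCompliance` and its methods are hypothetical. It keeps the most recent 1440 one-minute windows and lets the oldest one fall off as each new minute arrives.

```python
from collections import deque

WINDOWS_PER_DAY = 24 * 60  # 1440 one-minute windows in 1 day


class RollingCompliance:
    """Hypothetical helper: holds the latest windows of a rolling period."""

    def __init__(self, size=WINDOWS_PER_DAY):
        # A deque with maxlen drops the oldest window automatically
        # whenever a new one is appended to a full deque.
        self.windows = deque(maxlen=size)

    def add(self, is_good):
        """Record the newest one-minute window (True = good, False = bad)."""
        self.windows.append(is_good)

    def bad_count(self):
        """Number of bad windows currently inside the rolling period."""
        return sum(1 for w in self.windows if not w)
```

With `size=3`, for example, adding one bad window followed by three good ones leaves no bad windows in the period, because the bad window has already shifted out.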
SLO computation:
In a day, we have 1440 windows over which to calculate the ‘P90 latency’.
Only 1% of the windows (14.4 windows) can have P90 values above 900000 ms (900 seconds, or 15 minutes). As soon as 15 windows have P90 values above 900 seconds, this SLO is reported as broken for that day.
More than 15 windows have P90 values exceeding 900s
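The arithmetic above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Last9's actual code: the function name `evaluate_slo`, the use of `None` for a window with no data, and the returned dictionary are all hypothetical. Following the SLO definition, a window counts as good when its P90 is below the threshold, and only windows with data enter the denominator.

```python
THRESHOLD_MS = 900_000  # P90 must stay below 900000 ms (15 minutes)
TARGET = 0.99           # 99% of windows must be good


def evaluate_slo(p90_values, threshold_ms=THRESHOLD_MS, target=TARGET):
    """Evaluate a window-based SLO over one compliance period.

    p90_values: per-minute P90 latencies in ms; None marks a window
    with no data (an assumption made for this sketch).
    """
    # Only windows with available data are considered.
    with_data = [v for v in p90_values if v is not None]
    if not with_data:
        return None  # nothing to evaluate

    # A window is bad when P90 < threshold does not hold.
    bad = sum(1 for v in with_data if v >= threshold_ms)
    compliance = (len(with_data) - bad) / len(with_data)
    return {
        "windows": len(with_data),
        "bad": bad,
        "compliance": compliance,
        "met": compliance >= target,
    }


# A full day of data: 14 bad windows fit the 14.4-window budget, 15 do not.
ok_day = [100_000] * 1426 + [950_000] * 14
broken_day = [100_000] * 1425 + [950_000] * 15

# Half a day of data: the budget shrinks to 1% of 720 = 7.2 windows,
# so 8 bad windows already break the SLO.
sparse_day = [100_000] * 712 + [950_000] * 8 + [None] * 720
```

The `sparse_day` case shows why counting only windows with data sets a higher bar: if the 720 empty windows were instead counted as good, the same 8 bad windows would give 1432/1440, or about 99.4%, and the SLO would appear to be met.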
You can follow the steps below to set up window-based SLOs on the Last9 dashboard:
Navigate to the ‘Service Level Objectives’ section on the left panel