azure - How to prevent the Application Insights Availability feature from sending alerts every 5 minutes?
Problem description
I use the Application Insights "Availability" feature to check a website's availability and send an alert if it is down. Application Insights currently sends an alert every 5 minutes, even though the "alert failure time window" is 15 minutes. The test frequency is 5 minutes.
So I get an alert after 5 minutes, then after 10 minutes, then after 15 minutes! I get 3 alerts when I need only 1 alert after 15 minutes. It looks like a bug to me.
How can I prevent the Application Insights Availability feature from sending alerts every 5 minutes?
Solution
The email (notification) is sent the moment the alert condition is satisfied. It does not wait for the alert failure time window to elapse.
Example: if the alerting rule sends a notification when 3 locations out of 5 turn red, and 3 locations turn red within the first second, the notification is sent during that same second. It will not wait for 5 (or 15) minutes.
This is by design, with the goal of reducing TTD (time to detect).
There are two ways to handle noise:
- Configure retries (the test will retry 2 times during a red => green state switch)
- Increase the number of locations required to trigger an alert (for instance, 14 out of 16)
Either way, only one notification is supposed to be sent, not one every 5/15 minutes. Multiple notifications suggest either a bug in tracking the current state of the alert (a bug in the product) or an application that fails intermittently (so the alerting rule constantly changes state, green => red => green => ..., and an email is sent on every transition). Do you get an alert every 5 minutes when the tests are red all the time?
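To make the "one email per transition" behavior concrete, here is a minimal Python sketch. It is only a model of the behavior described above, not Application Insights code; the function name `notifications` and the assumption that the rule starts in the green state are mine.

```python
def notifications(condition_by_run):
    """Return (run_index, new_state) for every run where the rule changes state.

    condition_by_run: list of booleans, one per evaluation of the rule,
    True meaning the alert condition is satisfied (red).
    Assumption: the rule starts green, and an email is sent on each transition.
    """
    sent = []
    previous = False  # hypothetical starting state: green
    for i, red in enumerate(condition_by_run):
        if red != previous:
            sent.append((i, "red" if red else "green"))
        previous = red
    return sent


# Site stays down: a single notification when the rule first turns red.
print(notifications([True, True, True]))
# Site fails intermittently: a notification on every flip.
print(notifications([True, False, True]))
```

If the tests are red the whole time and emails still arrive every 5 minutes, this model does not explain it, which points toward the state-tracking bug mentioned above.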
The alert failure time window defines what a failed location means. A 5-minute test interval with a 5-minute alert failure window means that the last 1 result determines whether a location has failed. A 5-minute test interval with a 15-minute alert failure window means that the last 3 results determine it. So if any one of those 3 test runs failed, the location is considered failed (even though the 2 results after it might have been successes).
Increasing the alert failure time window makes the alerting rule more aggressive (and noisier for intermittently failing apps).
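The window semantics can be sketched in a few lines of Python. This is an illustration of the rule as described in this answer, not the product's implementation; the function names, the boolean result lists, and the location names are all hypothetical.

```python
def location_failed(results, test_interval_min, failure_window_min):
    """results: list of booleans, oldest first, True = test run passed.

    A location counts as failed if ANY run inside the failure window failed.
    """
    runs_in_window = max(1, failure_window_min // test_interval_min)
    recent = results[-runs_in_window:]
    return any(not ok for ok in recent)


def alert_condition_met(results_by_location, threshold,
                        test_interval_min, failure_window_min):
    """True when at least `threshold` locations are considered failed."""
    failed = sum(
        location_failed(r, test_interval_min, failure_window_min)
        for r in results_by_location.values()
    )
    return failed >= threshold


# One failure followed by two successes at a single location.
history = [False, True, True]
print(location_failed(history, 5, 5))   # False: only the last run counts
print(location_failed(history, 5, 15))  # True: the old failure is still in the window
```

With the 15-minute window, a single failed run keeps the location red for three consecutive evaluations, which is why a wider window is more aggressive rather than more tolerant.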