[ Datadog ] Metric Monitor

Configure a monitor that alerts on a metric.

  1. Choose the detection method

    Select the type of Metric Monitor.
    - Threshold Alert: Compares the metric value with a static threshold.
    - Change Alert: Compares the absolute or relative (%) change in value between N minutes ago and now with a given threshold.
    - Anomaly Detection: Detects when the metric behaves abnormally based on past behavior.
    - Outliers Alert: Detects when a group member (host, availability zone, partition, etc.) behaves abnormally compared to the rest.
    - Forecast Alert: Predicts future metric behavior and compares it with a static threshold.
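
To make the difference between the first two detection methods concrete, the sketch below evaluates a Threshold Alert and a Change Alert on plain Python numbers. It only illustrates the comparison each method performs; it is not Datadog's implementation. The function names, the sample values, and the handling of a zero baseline are assumptions made for this example.

```python
from statistics import mean


def threshold_alert(values, threshold):
    """Threshold Alert: compare the window average with a static threshold."""
    return mean(values) > threshold


def change_alert(value_then, value_now, threshold, relative=False):
    """Change Alert: compare the change between N minutes ago and now.

    relative=False uses the absolute change, relative=True the change in %.
    """
    change = value_now - value_then
    if relative:
        if value_then == 0:
            # Assumption for this sketch: a % change from 0 is undefined,
            # so no alert is raised.
            return False
        change = change / abs(value_then) * 100
    return change > threshold


cpu_window = [82.0, 91.0, 95.0, 97.0, 93.0]              # average is 91.6
print(threshold_alert(cpu_window, 90))                    # True
print(change_alert(40.0, 70.0, threshold=25))             # True  (+30 absolute)
print(change_alert(40.0, 70.0, threshold=80, relative=True))  # False (+75 %)
```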

  2. Define the metric
    Select the metric to monitor.

    In the FROM clause, scope the monitoring target with tag:value pairs. After choosing the aggregation (avg, min, max, or sum) in the "by" field, you can set a tag to group by (for example, grouping by host triggers a separate alert for each host).
    You can also monitor a value calculated from multiple metrics by adding more metrics and combining them.
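
The selections in this step end up as a single query string. The sketch below assembles one in the general shape Datadog uses for metric monitor queries (time aggregation and window, space aggregation, metric, tag scope, optional grouping, comparison). The helper name `build_metric_query`, the metric `system.cpu.user`, the tag `env:prod`, and the threshold are example values chosen for illustration; verify the resulting string in the monitor editor before relying on it.

```python
def build_metric_query(metric, scope, group_by=None, space_aggr="avg",
                       time_aggr="avg", window="last_5m",
                       comparator=">", threshold=90):
    """Assemble a metric monitor query string from the step 2 selections."""
    # FROM clause: tag:value pairs, or "*" when no scope is given.
    tags = ",".join(f"{key}:{value}" for key, value in scope.items()) or "*"
    query = f"{time_aggr}({window}):{space_aggr}:{metric}{{{tags}}}"
    # Grouping tag(s): one alert per distinct value, e.g. per host.
    if group_by:
        query += " by {" + ",".join(group_by) + "}"
    return f"{query} {comparator} {threshold}"


print(build_metric_query("system.cpu.user", {"env": "prod"}, group_by=["host"]))
# avg(last_5m):avg:system.cpu.user{env:prod} by {host} > 90
```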

  3. Set alert conditions
    The options under Set alert conditions depend on the Metric Monitor type selected in step 1.

    • Set the condition and the evaluation time window that trigger an alert. (The fields vary by monitor type, but each type presents them in a structured form.)

    • Set the threshold. (If Advanced Recovery is available, you can also set the recovery thresholds at which the alert and warning states are cleared.)

    • Under Advanced Alert, you can configure automatic resolution of alerts, the waiting time before alerting starts for newly detected groups, and a delay applied to the evaluation.
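
The effect of a separate recovery threshold is easiest to see on sample data. The sketch below is a simplified state machine for an "above" condition, not Datadog's evaluator: without a recovery threshold the state flips every time the value crosses the alert threshold, while with one the alert is held until the value falls to the recovery level. The function names, the sample values, and the exact boundary handling (`>` to trigger, `<=` to recover) are assumptions made for this example.

```python
def next_state(state, value, critical, critical_recovery=None):
    """Return the next state ("OK" or "ALERT") for an 'above' condition."""
    # Without a recovery threshold, recovery happens at the alert threshold.
    recover_at = critical if critical_recovery is None else critical_recovery
    if state == "OK":
        return "ALERT" if value > critical else "OK"
    # Already alerting: clear only once the value reaches the recovery level.
    return "OK" if value <= recover_at else "ALERT"


def run(values, critical, critical_recovery=None):
    """Evaluate a series of values and return the state after each one."""
    state = "OK"
    states = []
    for value in values:
        state = next_state(state, value, critical, critical_recovery)
        states.append(state)
    return states


samples = [85, 92, 88, 91, 79, 85]
print(run(samples, critical=90))
# ['OK', 'ALERT', 'OK', 'ALERT', 'OK', 'OK']
print(run(samples, critical=90, critical_recovery=80))
# ['OK', 'ALERT', 'ALERT', 'ALERT', 'OK', 'OK']
```

With the recovery threshold set to 80, the brief dip to 88 no longer clears the alert, so only one alert and one recovery are produced.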

  4. Notify your team
    Select the recipients (email addresses) for alert notifications, or choose a channel that has already been configured under Integrations.

    • Alert Title: The title of the message when an alert is triggered.
      - Example: [Warning] {{host.name}} server CPU usage is high.

    • Alert Message
      - The content of the message when an alert is triggered.
      - Example

      {{#is_alert}}  
      
       Occurrence Time (KST): {{local_time 'last_triggered_at' 'Asia/Seoul'}} 
        
      ## {{host.name}} ({{host.ip}}) server CPU usage has averaged {{value}} over the past 5 minutes. Please check.
      
      {{/is_alert}}  
      
      
      {{#is_alert_recovery}}
      
       Occurrence Time (KST): {{local_time 'last_triggered_at' 'Asia/Seoul'}} 
        
      ## [Recovered] {{host.name}} ({{host.ip}}) server CPU usage has dropped below {{threshold}}.
      
      {{/is_alert_recovery}}
    • Use Message Template Variables
      Shows how to use the template syntax and the variables available in the alert title and message body.
      Reference for available variables: https://docs.datadoghq.com/monitors/notify/variables/?tab=is_alert

    • Notify your services and your team members settings
      Integrated notification channels such as Opsgenie, Slack, Microsoft Teams, and webhooks are listed here, along with email.
      Select the channels or email addresses that should receive the alert.

    • Content displayed settings (Message composition settings)
      Set whether to include automatically added content such as query/snapshot in the Message.

    • Include Triggering tags in notification title settings
      When enabled, the tags of the affected target are shown in the title of the notification sent when an alert occurs.

    • Aggregation settings
      If a grouping is selected in Set alert conditions, Multi Alert is selected automatically, so each group is alerted on separately.

    • Renotification settings
      If the monitor stays in the Alert, Warning, or No Data state, the notification is sent again at the selected interval.

    • Tags settings
      Set tags on the monitor. They can be used to search in Manage Monitors or to select monitors when setting a Downtime schedule.

    • Priority settings
      Set the severity (importance) of alerts from P1 to P5.
      Priority settings (standardized settings based on the criteria below)
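
For teams that manage monitors as code, the notification settings above can be collected into one definition. The sketch below builds a plain dictionary whose field names (`name`, `type`, `query`, `message`, `tags`, `priority`, and `options` with `renotify_interval` and `include_tags`) follow the monitor object of the Datadog API as commonly documented; treat them as assumptions and check them against the current API reference. The helper name, the query, the recipient handles, and the tag values are hypothetical examples, and nothing here sends a request.

```python
import json


def build_monitor_definition(title, message, recipients, tags, priority,
                             renotify_minutes=None, include_tags=True):
    """Collect the step 4 settings into a monitor definition dictionary."""
    if not 1 <= priority <= 5:
        raise ValueError("priority must be between 1 (P1) and 5 (P5)")
    # Recipients are referenced in the message body as @-mentions.
    mentions = " ".join(f"@{recipient}" for recipient in recipients)
    options = {"include_tags": include_tags}  # triggering tags in the title
    if renotify_minutes is not None:
        options["renotify_interval"] = renotify_minutes  # re-send interval
    return {
        "name": title,
        "type": "metric alert",
        # Example query only; replace with the query defined in step 2.
        "query": "avg(last_5m):avg:system.cpu.user{*} by {host} > 90",
        "message": f"{message}\n{mentions}",
        "tags": tags,
        "priority": priority,
        "options": options,
    }


definition = build_monitor_definition(
    title="[Warning] {{host.name}} server CPU usage is high.",
    message="{{#is_alert}}CPU usage has averaged {{value}}.{{/is_alert}}",
    recipients=["slack-ops-alerts", "oncall@example.com"],  # hypothetical
    tags=["team:infra", "service:web"],                      # hypothetical
    priority=2,
    renotify_minutes=30,
)
print(json.dumps(definition, indent=2))
```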

  5. Define permissions and audit notifications
    Set the edit permissions for the monitor and configure notifications for modifications.
