[ Datadog ] SSH Check Integration

SSH Check Integration setting

Linux

  • Configuration file path: /etc/datadog-agent/conf.d/ssh_check.d/conf.yaml

  • Example conf.yaml configuration

    init_config:

    instances:
      - host: "<SOME_REMOTE_HOST>"                  # required
        username: "<SOME_USERNAME>"                 # required
        password: "<SOME_PASSWORD>"                 # or use private_key_file
        add_missing_keys: true                      # default is False
        min_collection_interval: 15                 # collection interval in seconds; default is 15, set only if you need to change it
        # private_key_file: <PATH_TO_PRIVATE_KEY>
        # private_key_type:                         # rsa or ecdsa; default is rsa
        # port: 22                                  # default is port 22
        # sftp_check: false                         # set False to disable SFTP check; default is True

    • The private_key_file must be at a path, and have permissions, that the Datadog Agent account (dd-agent on Linux, ddagentuser on Windows) can read. Copy the existing id_rsa file used for SSH access to such a path and reference it there.

      instances:
        - host: "<SOME_REMOTE_HOST>"
          ......
          private_key_file: /opt/datadog-agent/id_rsa

    • Check-type integrations are executed by the Agent's check runners (check_runners, default: 4). Registering too many instances can cause Agent overhead.
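
The comments in the example mark host and username as required, and either password or private_key_file must be set. As a minimal sketch of that rule, the helper below checks one instance before it is written to conf.yaml. The function name validate_ssh_instance is hypothetical and is not part of the Datadog Agent; the instance is passed as a plain dict so that no YAML library is needed.

```python
# Hypothetical pre-flight helper; not part of the Datadog Agent.
REQUIRED_KEYS = ("host", "username")


def validate_ssh_instance(instance):
    """Return a list of problems found in one ssh_check instance (empty if none)."""
    problems = [
        "missing required key: " + key
        for key in REQUIRED_KEYS
        if not instance.get(key)
    ]
    # The sample config allows either a password or a private key file.
    if not (instance.get("password") or instance.get("private_key_file")):
        problems.append("set either password or private_key_file")
    return problems


if __name__ == "__main__":
    candidate = {
        "host": "<SOME_REMOTE_HOST>",
        "username": "<SOME_USERNAME>",
        "private_key_file": "/opt/datadog-agent/id_rsa",
    }
    print(validate_ssh_instance(candidate))
```

An empty list means the instance satisfies the rules stated in the example comments. It does not prove that the Agent can actually read the key file or reach the host.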

  • Restart the Agent after changing the configuration

    sudo systemctl restart datadog-agent.service

*Reference link: SSH Integration

Windows

  • Configuration file path: C:\ProgramData\Datadog\conf.d\ssh_check.d\conf.yaml

  • When making changes in Datadog Agent Manager:
    In the Checks > Manage Checks menu, select Edit Enabled Checks in the top select box.
    Select ssh_check.d/conf.yaml from the list.
    If it is not in the list, select Add a check in the top select box, then select ssh_check.
    Selecting Add Check on the right adds ssh_check.d/conf.yaml to the Edit Enabled Checks list.

  • Example conf.yaml configuration

    init_config:

    instances:
      - host: "<SOME_REMOTE_HOST>"                  # required
        username: "<SOME_USERNAME>"                 # required
        password: "<SOME_PASSWORD>"                 # or use private_key_file
        add_missing_keys: true                      # default is False
        min_collection_interval: 15                 # collection interval in seconds; default is 15, set only if you need to change it
        # private_key_file: <PATH_TO_PRIVATE_KEY>
        # private_key_type:                         # rsa or ecdsa; default is rsa
        # port: 22                                  # default is port 22
        # sftp_check: false                         # set False to disable SFTP check; default is True

    • The private_key_file must be at a path, and have permissions, that the Datadog Agent account (dd-agent on Linux, ddagentuser on Windows) can read. Copy the existing id_rsa file used for SSH access to such a path and reference it there.

      instances:
        - host: "<SOME_REMOTE_HOST>"
          ......
          private_key_file: C:\ProgramData\Datadog\id_rsa

    • Check-type integrations are executed by the Agent's check runners (check_runners, default: 4). Registering too many instances can cause Agent overhead.

  • Restart the Agent after changing the configuration
    Open the Windows Command Prompt as administrator, then run the command below.

    "%ProgramFiles%\Datadog\Datadog Agent\bin\agent.exe" restart-service

*Reference link: SSH Integration

Kubernetes

  • The recommended approach is to register the check configuration through a ConfigMap: add it to confd in values.yaml, then deploy it with Helm.

    Before configuring confd, you need to set 'useConfigMap: true' in values.yaml.

  • Example of confd settings in values.yaml file

    confd:
      ssh_check.yaml: |-
        init_config:
        instances:
          - host: "<SOME_REMOTE_HOST>"          # required
            username: "<SOME_USERNAME>"         # required
            password: "<SOME_PASSWORD>"         # or use private_key_file
            add_missing_keys: true              # default is False

    • Check-type integrations are executed by the Agent's check runners (check_runners, default: 4). Registering too many instances can cause Agent overhead.

  • Restart the Agent after changing the configuration
    Redeploy the updated values.yaml.

    helm upgrade -f values.yaml <RELEASE_NAME> datadog/datadog

*Reference link: SSH Integration, Kubernetes Integrations Autodiscovery

SSH Check Monitor setting

Set up a Service Check alert as follows.

Menu path: Monitors > New Monitor > Service Check

  • Service Check Monitor composition

    ① Pick a Service Check
    - Select a Service Check target that can be monitored. In this case, select ssh.can_connect.

    ② Pick monitor scope: Tag-based scope selection

    - Setting the monitoring scope: Select hosts by tag from all hosts that report the same Service Check. When several scope conditions are selected, they are combined with AND. To target all hosts, select 'All Monitored Hosts'.
    - Applying exclusion conditions: Exclude hosts by tag. When several exclusion conditions are selected, they are combined with OR.
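
The AND/OR behaviour described above can be sketched with plain sets. This is only an illustration of the selection logic, not Datadog code; the function name in_scope and the sample tags are hypothetical.

```python
def in_scope(host_tags, scope, exclude):
    """Decide whether a host is monitored.

    scope   -- tags that must ALL be present on the host (AND)
    exclude -- tags of which ANY one removes the host (OR)
    """
    # An empty scope matches every host, like 'All Monitored Hosts'.
    matches_scope = set(scope) <= set(host_tags)
    is_excluded = bool(set(exclude) & set(host_tags))
    return matches_scope and not is_excluded


if __name__ == "__main__":
    web_prod = {"env:prod", "role:web"}
    print(in_scope(web_prod, {"env:prod", "role:web"}, set()))   # both scope tags present
    print(in_scope(web_prod, {"env:prod"}, {"role:web", "role:db"}))  # one exclusion tag matches
```

The first call keeps the host because every scope tag is present; the second drops it because a single matching exclusion tag is enough.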
    ③ Set alert conditions: Setting conditions for alert occurrence

    ▶ Alert occurrence condition setting
    For the SSH Check Monitor, select Check Alert.
    a. Check Alert: Triggers per individual service check, based on the number of consecutive failures.
    b. Cluster Alert: Triggers based on the service check failure rate within a cluster group.
    ▶ Alert grouping (group by) setting
    - Select Host.
    ▶ Setting alert occurrence and resolution conditions
    - Set the conditions for the Critical and OK statuses.
    - Select the number of consecutive failures for Critical and the number of consecutive successes for OK.
    - Select 5 consecutive failures for Critical (an alert is generated after 5 consecutive failures).
    ▶ Do not notify / Notify setting
    - Controls whether a notification is sent when no data is being collected.
    - The default is 'Do not notify'. When set to Notify, a No Data alert is raised if no data arrives for the configured time.
    ▶ Alert Auto Resolve Settings
    - If an alert stays in the Alert state because it was never resolved, even though the situation has cleared, this feature resolves it automatically after a set time.
    - The default is 'Never', which does not resolve automatically. To resolve automatically, select a time.
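
The Check Alert condition (Critical after 5 consecutive failures, OK after a chosen number of successes) can be sketched as a simple counter. This is a simplified model for illustration, not Datadog's actual evaluation code. The function name evaluate and the default ok_after=1 are assumptions, since the article only fixes the Critical count at 5.

```python
def evaluate(statuses, critical_after=5, ok_after=1):
    """Walk check results in order and return the monitor state after each one."""
    state = "OK"
    failures = successes = 0
    states = []
    for status in statuses:
        if status == "OK":
            successes += 1
            failures = 0          # a success breaks the failure streak
        else:
            failures += 1
            successes = 0         # a failure breaks the success streak
        if failures >= critical_after:
            state = "CRITICAL"
        elif successes >= ok_after:
            state = "OK"
        states.append(state)
    return states


if __name__ == "__main__":
    # Four failures, one success, then five failures in a row, then recovery.
    results = ["CRITICAL"] * 4 + ["OK"] + ["CRITICAL"] * 5 + ["OK"]
    print(evaluate(results))
```

In this run the first four failures do not alert because the streak is interrupted; only the fifth failure of the second streak switches the state to CRITICAL, and the next success resolves it.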


    ④ Notify your team: Notification settings
    ▶ Alert Title
    - The title of the message sent when an alert occurs.
    - Example: [SSH Check][Critical] No response from {{host.name}} ssh.
    ▶ Alert Message
    - The body of the message sent when an alert occurs.
    - Example:

    {{#is_alert}}

    Triggered at (KST): {{local_time 'last_triggered_at' 'Asia/Seoul'}}

    [Critical] {{host.name}} ssh has not responded 5 consecutive times. Please check.

    {{/is_alert}}

    {{#is_alert_recovery}}

    Triggered at (KST): {{local_time 'last_triggered_at' 'Asia/Seoul'}}

    [Critical resolved] {{host.name}} ssh is responding normally again.

    {{/is_alert_recovery}}

▶ Use Message Template Variables
- You can check the usage of templates and variables available for Alert title and Message body.

▶ Notify your services and your team members setting
- Integrated channels such as Opsgenie, Slack, Teams, and webhooks, as well as notification channels such as email, are listed. Select the channel or email address to which the alert will be sent.

▶ Content displayed setting (Message Configuration settings)
- Set whether to include automatically added content such as query/snapshot in the message.

▶ Include Triggering tags in notification title setting
- When an alert occurs, the subject of the message includes the tags of the alert target.
▶ Aggregation setting
- Select Multi Alert - Host, because an alert must be generated for each SSH check target host.
▶ Renotification setting
- If the Alert (Warning) or No Data state persists, the notification is re-sent at the selected interval.
▶ Tags setting
- Set a Tag for the monitor that can be used when setting the Downtime schedule in Manage Monitors.
▶ Priority setting
- Set the severity (importance) of the alarm from P1 to P5.
⑤ Define permission and audit notifications

▶ Restrict editing setting
- Set the permission to edit the alert. When you select a role, all users with that role can edit it.
▶ Test Notifications
- Clicking the button sends a test alert with the configured contents to the selected channel.

▶ Create
- Clicking the button will save the settings.
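
For reference, the choices made in steps ① to ⑤ can be collected into a single monitor definition. The sketch below only builds a Python dict and prints it as JSON; it does not call Datadog. The field names (type, query, options.thresholds, notify_no_data, renotify_interval, include_tags, priority) and the query syntax are assumptions modelled on the Datadog Monitors API and should be verified against the API reference before use. The function name build_ssh_monitor, the tag check:ssh, the priority value, and the <TARGET_EMAIL> placeholder are hypothetical.

```python
import json


def build_ssh_monitor(critical_after=5, ok_after=1, priority=3,
                      notify_targets=("@<TARGET_EMAIL>",)):
    """Return a dict describing the SSH service check monitor from the steps above."""
    message = "\n".join([
        "{{#is_alert}}",
        "[Critical] {{host.name}} ssh has not responded "
        + str(critical_after) + " consecutive times. Please check.",
        "{{/is_alert}}",
        "{{#is_alert_recovery}}",
        "[Critical resolved] {{host.name}} ssh is responding normally again.",
        "{{/is_alert_recovery}}",
        " ".join(notify_targets),
    ])
    # Assumption: the evaluation window is one check longer than the largest threshold.
    window = max(critical_after, ok_after) + 1
    query = '"ssh.can_connect".over("*").by("host").last(%d).count_by_status()' % window
    return {
        "name": "[SSH Check][Critical] No response from {{host.name}} ssh.",
        "type": "service check",                 # step 1: Service Check monitor
        "query": query,                          # step 2: all hosts, grouped by host
        "message": message,                      # step 4: alert message and targets
        "tags": ["check:ssh"],                   # hypothetical monitor tag
        "priority": priority,                    # P1 to P5
        "options": {
            "thresholds": {"critical": critical_after, "ok": ok_after},  # step 3
            "notify_no_data": False,             # 'Do not notify' default
            "renotify_interval": 0,              # 0 = no renotification in this sketch
            "include_tags": True,                # triggering tags in the title
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_ssh_monitor(), indent=2))
```

Keeping the definition in code like this makes it easier to review threshold changes, but the UI steps described above remain the documented way to create the monitor.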
