Deadman checks - group-by functionality?

Hi, all

I’m experimenting with deadman checks in InfluxDB 2 and have a question about how they are currently implemented.

In InfluxDB v1, we can use the following in a TICKscript to query with a group-by:

var data = stream
    |from()
        .retentionPolicy(rp)
        .measurement('uptime')
        .groupBy(['host'])

Is there something similar in Influx v2?

My deadman checks don’t seem to trigger when one of the N instances covered by a check's query has gone “offline” (e.g. has not submitted data in the last 60 seconds). The check seems to trigger only when the entire measurement/field has stopped reporting data.

@genux That should be possible using the group function in Flux.

My guess is you need to do something like this:

import "influxdata/influxdb/monitor"
import "experimental"
from(bucket: "telegraf")
  |> range(start: -2m)
  |> filter(fn: (r) => r._measurement == "system" and r._field == "uptime")
  |> group(columns:["host"])
  |> monitor.deadman(t: experimental.subDuration(from:now(), d:1m))

Hope that helps!

Hi,

I tried using the group functionality in my query when writing my check, and it still only seems to trigger when all N instances are not reporting data. My goal here is also to trigger a deadman check if any one of the N instances is not reporting data. Is there any way to create the same deadman check above using the API/UI? If not, what are the steps to set up the deadman check you’ve described above via CLI/Flux? I’ve copied the query and request body down below:

from(bucket: "site_metrics")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "ping_metrics")
  |> filter(fn: (r) => r._field == "ping_time")
  |> filter(fn: (r) => r.site_name == "host1" or r.site_name == "host2")
  |> group(columns: ["_field", "_measurement"])
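
One possible reason a query like the one above fires only when every site is silent: group(columns: ["_field", "_measurement"]) merges host1 and host2 into a single table, and monitor.deadman() compares only the newest timestamp in each table against the cutoff. A minimal sketch that keeps one table per site instead, assuming site_name is the tag that identifies each instance. The fixed -10m range and the 90s cutoff are illustrative values, not settings confirmed to work in the check editor:

```flux
import "influxdata/influxdb/monitor"
import "experimental"

from(bucket: "site_metrics")
  // The range must reach further back than the deadman cutoff (assumed: 10m vs. 90s)
  |> range(start: -10m)
  |> filter(fn: (r) => r._measurement == "ping_metrics")
  |> filter(fn: (r) => r._field == "ping_time")
  |> filter(fn: (r) => r.site_name == "host1" or r.site_name == "host2")
  // One table per site, so each site is evaluated on its own
  |> group(columns: ["site_name"])
  // Adds a boolean "dead" column: true when the site's newest point is older than the cutoff
  |> monitor.deadman(t: experimental.subDuration(from: now(), d: 90s))
```

Note that a site with no points at all inside the range produces no table, and therefore no row with dead set to true, which is why the range should be longer than the cutoff.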

{
    "name": "test check",
    "ownerID": "ownerid",
    "orgID": "orgid",
    "query": 
            {
                "text": "from(bucket: \"site_metrics\")\n  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n  |> filter(fn: (r) => r._measurement == \"ping_metrics\")\n  |> filter(fn: (r) => r._field == \"ping_time\")\n  |> filter(fn: (r) => r.site_name == \"host2\" or r.site_name == \"host1\")\n  |> group(columns: [\"_field\", \"_measurement\"])",
                "editMode": "advanced",
                "name": "",
                "builderConfig": {
                    "buckets": [
                        "site_metrics"
                    ],
                    "tags": [
                        {
                            "key": "_measurement",
                            "values": [
                                "ping_metrics"
                            ],
                            "aggregateFunctionType": "filter"
                        },
                        {
                            "key": "_field",
                            "values": [
                                "ping_time"
                            ],
                            "aggregateFunctionType": "filter"
                        },
                        {
                            "key": "site_name",
                            "values": [
                                "host1",
                                "host2"
                            ],
                            "aggregateFunctionType": "filter"
                        },
                        {
                            "key": "dest_addr",
                            "values": [],
                            "aggregateFunctionType": "filter"
                        }
                    ],
                    "functions": [],
                    "aggregateWindow": {
                        "period": "auto"
                    }
                }
            },
    "statusMessageTemplate": "${ r._check_name } ${r._type} check is ${ r._level }",
    "every": "1m",
    "offset": "0s",
    "tags": [],
    "timeSince": "90s",
    "staleTime": "10m",
    "reportZero": false,
    "level": "CRIT",
    "type": "deadman",
    "status": "active"
}

Hi, I’m trying to understand how to get a deadman check to work in InfluxDB 2 and then alert on it. When I try to create an alert, I don’t have the option to use the Query Builder, and there is no Group By option to select in the query generator.

I put the above query by @nathaniel in the Explore window, and I do see data come back on the screen, but I don’t think I can set up an alert from the Explore screen. Any help/direction would be greatly appreciated. Thank you.
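
For anyone who wants to bypass the check editor, a deadman check can probably also be written as an ordinary Flux task. The following is an untested sketch modeled on the tasks the UI generates for checks. The task name, check ID and check name are placeholders, and the bucket, measurement and tag names are borrowed from the earlier post in this thread:

```flux
import "influxdata/influxdb/monitor"
import "influxdata/influxdb/v1"
import "experimental"

// Placeholder task settings
option task = {name: "ping deadman per site", every: 1m}

// Placeholder check metadata; the ID below is not a real check ID
check = {_check_id: "0000000000000001", _check_name: "ping deadman", _type: "deadman", tags: {}}

from(bucket: "site_metrics")
  |> range(start: -10m)
  |> filter(fn: (r) => r._measurement == "ping_metrics" and r._field == "ping_time")
  // Pivot fields into columns, as UI-generated check tasks do
  |> v1.fieldsAsCols()
  // One table per site; _measurement stays in the group key (assumption, mirrors generated tasks)
  |> group(columns: ["_measurement", "site_name"])
  // Mark a site as dead when its newest point is older than 90s
  |> monitor.deadman(t: experimental.subDuration(from: now(), d: 90s))
  // Write one status per site to the _monitoring bucket; CRIT when dead is true
  |> monitor.check(
      data: check,
      messageFn: (r) => "${r._check_name} is ${r._level} for ${r.site_name}",
      crit: (r) => r.dead,
  )
```

If this behaves as intended, each site gets its own status row, so a notification rule that matches on CRIT statuses should be able to alert per site rather than only when all sites go quiet.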