Complex queries

Karthik_Pai · July 19, 2024, 6:39pm

latest_anomaly = from(bucket: "bucket_anomaly")
  |> range(start: 0)  // Adjust the time range as necessary
  |> filter(fn: (r) => r["_field"] == "anomaly" and r["_value"] == 1)
  |> sort(columns: ["_time"], desc: true)
  |> limit(n: 1)
  |> findRecord(fn: (key) => true, idx: 0)


from(bucket: "bucket_logs")
  |> range(start: 0,stop: latest_anomaly._time)  // Adjust the time range as necessary
  |> filter(fn: (r) => r["_field"] == "prediction")
  |> filter(fn: (r) => r["_value"] == 1)
  |> sort(columns: ["_time"], desc: true)
  |>limit(n:5)
  |> keep(columns: ["_time"])

i have two buckets bucket_anomaly and bucket_logs. The above code returns the latest anomaly and the last 5 anomalous log timestamps that might be causing the issue. The bucket_logs has another field called log-content. Is there any way we can return that along with the timestamps

scott · July 22, 2024, 9:27pm

@Karthik_Pai Some suggestions and notes:

There’s a better way to get the latest anomaly that doesn’t require resorting your data and should be more performant. Instead of sort() |> limit(), just use the last() function:

latest_anomaly = from(bucket: "bucket_anomaly")
  |> range(start: 0)  // Adjust the time range as necessary
  |> filter(fn: (r) => r["_field"] == "anomaly" and r["_value"] == 1)
  |> last()
  |> findRecord(fn: (key) => true, idx: 0)

When querying your bucket_logs bucket, in range(), you set the stop parameter to the _time of the last reported anomaly. It’s important to note that the stop time is exclusive, meaning it only keeps data with time stamps less than the specified time, not less than or equal to (noted in the docs). So you’re actually filtering out data with the timestamp that you want.

I’d suggest using date.add() to add a nanosecond to the stop time:
```
import "date"

from(bucket: "bucket_logs")
  |> range(start: 0, stop: date.add(d: 1ns, to: latest_anomaly._time))
  // ...
```
Karthik_Pai:

The bucket_logs has another field called log-content.

Why not just query the log-content field instead of the prediction field?
When querying the logs, instead of using sort() |> limit(), just use tail().
If you want to include the log content in the results, add _value to keep()

With these suggestions, the query would look something like this:

import "date"

latest_anomaly = from(bucket: "bucket_anomaly")
    |> range(start: 0)
    |> filter(fn: (r) => r["_field"] == "anomaly" and r["_value"] == 1)
    |> last()
    |> findRecord(fn: (key) => true, idx: 0)

from(bucket: "bucket_logs")
    |> range(start: 0,stop: date.add(d: 1ns, to: latest_anomaly._time))
    |> filter(fn: (r) => r["_field"] == "log-content")
    |> tail(n: 5)
    |> keep(columns: ["_time", "_value"])

Topic		Replies	Views
Could someone help me with migrating InfluxQL query to Flux? Please :) Fluxlang	1	367	August 18, 2021
Uncomplete flux query result? InfluxDB 2 flux	3	395	May 1, 2022
Help with query needed InfluxDB 2	2	546	April 30, 2021
Query Latest Entries Fluxlang	2	446	May 31, 2022
Please help me change this influxql query to flux InfluxDB 2 influxdb , influxql , query	1	397	December 1, 2022

Complex queries

Related Topics