Hello,
Apologies for the long read, if I could buy you a beer for getting to the bottom, I would.
I’m collecting telemetry from devices. At scale, there will be 10000’s devices reporting many metrics (error and warn in my example data, but many more in practice) along with some tags, like version. My goal is to try and spot regressions, the delta between the average of some_field for version1 and the average of some_field for version2. The ultimate goal would be to alert when “the average of some_field for v2 is x% higher than v1” or basically, it looks like v2 is worse than v1 for some_field.
Unfortunately, I have no control over the verison tag and when I use a pivot, it is creating column names I cannot use.
Here is some sample data that works ok:
insert test,version=v1,macaddress=aa errors=4,warnings=6
insert test,version=v2,macaddress=bb errors=5,warnings=7
insert test,version=v2,macaddress=cc errors=14,warnings=16
I’m using pivot because I read that its more efficient that join and the data is in the same measurement.
from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "test"
and (r._field == "errors"))
|> drop(columns: ["macaddress"])
|> aggregateWindow(
every: 5m,
fn: mean,
createEmpty: false
)
|> pivot(rowKey:["_time"], columnKey: ["version"], valueColumn: "_value")
|> map(fn: (r) => ({
_time: r._time,
_field: "delta",
_value: r.v2 - r.v1
})
)
This creates the expected result, apologies for formatting:
table
_field
_time
_value
0
delta
2020-08-25T17:10:00Z
5.5
If I use some more realistic data:
insert test,version=v1_2019-a,macaddress=aa errors=4,warnings=6
insert test,version=v2_2020-b,macaddress=bb errors=5,warnings=7
insert test,version=v2_2020-b,macaddress=cc errors=14,warnings=16
With a modified map function:
|> map(fn: (r) => ({
_time: r._time,
_field: "delta",
_value: r.v2_2020-b - r.v1_2019-a
})
)
The map throws up an error, which I’m pretty sure is due to the “-”:
Error: type error 15.20-15:21: unidentified identifier "b"
I’ve tried looking at ways of rewriting the tags/column name using regex/replaceto remove the special characters but can’t figure it out.
I can workaround this problem using a join, but at scale this might be very inefficient, especially since I ultimately want to perfom the delta on all fields, which might run into the 100’s.
v1 = from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "test"
and (r._field == "errors")
and (r.version=="v1_2019-a"))
|> truncateTimeColumn(unit: 5m)
v2 = from(bucket: "telegraf/autogen")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "test"
and (r._field == "errors")
and (r.version=="v2_2020-b"))
|> truncateTimeColumn(unit: 5m)
join(tables: {v1: v1, v2: v2}, on: ["_time"])
|> map(fn: (r) => ({
_time: r._time,
_field: "delta",
_value: r._value_v2 - r._value_v1
})
)
And finally, is there a clever way I can compare all the fields between versions programatically? Or if I have x fields, do I have to write x flux scritps to make the comparison?
Many thanks for reading!
I’m new to flux, so maybe there is a very simple answer to all of this…
Tom