Flux pivot creating columns with special characters, issues with map

thopewell · August 25, 2020, 5:44pm

Hello,

Apologies for the long read, if I could buy you a beer for getting to the bottom, I would.

I’m collecting telemetry from devices. At scale, there will be 10000’s devices reporting many metrics (error and warn in my example data, but many more in practice) along with some tags, like version. My goal is to try and spot regressions, the delta between the average of some_field for version1 and the average of some_field for version2. The ultimate goal would be to alert when “the average of some_field for v2 is x% higher than v1” or basically, it looks like v2 is worse than v1 for some_field.
Unfortunately, I have no control over the verison tag and when I use a pivot, it is creating column names I cannot use.

Here is some sample data that works ok:

insert test,version=v1,macaddress=aa errors=4,warnings=6
insert test,version=v2,macaddress=bb errors=5,warnings=7
insert test,version=v2,macaddress=cc errors=14,warnings=16

I’m using pivot because I read that its more efficient that join and the data is in the same measurement.

from(bucket: "telegraf/autogen")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "test" 
  and (r._field == "errors"))
  |> drop(columns: ["macaddress"])
  |> aggregateWindow(
     every: 5m,
     fn: mean,
     createEmpty: false
  )
  |> pivot(rowKey:["_time"], columnKey: ["version"], valueColumn: "_value")  
  |> map(fn: (r) => ({
      _time: r._time,
      _field: "delta",
      _value: r.v2 - r.v1
    })
  )

This creates the expected result, apologies for formatting:

table 
_field  
_time  
_value
0       
delta 
2020-08-25T17:10:00Z
5.5

If I use some more realistic data:

insert test,version=v1_2019-a,macaddress=aa errors=4,warnings=6
insert test,version=v2_2020-b,macaddress=bb errors=5,warnings=7
insert test,version=v2_2020-b,macaddress=cc errors=14,warnings=16

With a modified map function:

  |> map(fn: (r) => ({
      _time: r._time,
      _field: "delta",
      _value: r.v2_2020-b - r.v1_2019-a
    })
  )

The map throws up an error, which I’m pretty sure is due to the “-”:

Error: type error 15.20-15:21: unidentified identifier "b"

I’ve tried looking at ways of rewriting the tags/column name using regex/replaceto remove the special characters but can’t figure it out.

I can workaround this problem using a join, but at scale this might be very inefficient, especially since I ultimately want to perfom the delta on all fields, which might run into the 100’s.

v1 = from(bucket: "telegraf/autogen")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "test" 
  and (r._field == "errors")
  and (r.version=="v1_2019-a"))
  |> truncateTimeColumn(unit: 5m)

  
v2 = from(bucket: "telegraf/autogen")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "test" 
  and (r._field == "errors")
  and (r.version=="v2_2020-b"))
  |> truncateTimeColumn(unit: 5m)


join(tables: {v1: v1, v2: v2}, on: ["_time"])
  |> map(fn: (r) => ({
      _time: r._time,
      _field: "delta",
      _value: r._value_v2 - r._value_v1
    })
  )

And finally, is there a clever way I can compare all the fields between versions programatically? Or if I have x fields, do I have to write x flux scritps to make the comparison?

Many thanks for reading!
I’m new to flux, so maybe there is a very simple answer to all of this…
Tom

Anaisdg · August 31, 2020, 4:16pm

Hello @thopewell,
Thanks for writing such a detailed question!
Have you tried using the rename function with any luck?

thopewell · September 1, 2020, 4:04pm

thopewell:

insert test,version=v1_2019-a,macaddress=aa errors=4,warnings=6
insert test,version=v2_2020-b,macaddress=bb errors=5,warnings=7
insert test,version=v2_2020-b,macaddress=cc errors=14,warnings=16

Hi Anaisdg,
Thanks for the reply!
Great - it works, once I quote the original column names, so with the above data, this gets me the expected result:

from(bucket: "telegraf/autogen")
  |> range(start: -1d)
  |> filter(fn: (r) => r._measurement == "test" 
  and (r._field == "errors"))
  |> drop(columns: ["macaddress"])
  |> aggregateWindow(
     every: 5m,
     fn: mean,
     createEmpty: false
  )
  |> pivot(rowKey:["_time"], columnKey: ["version"], valueColumn: "_value")  
  |> rename(columns: {"v1_2019-a": "v1","v2_2020-b": "v2"})
  |> map(fn: (r) => ({
      _time: r._time,
      _field: "delta",
      _value: r.v2 - r.v1
    })
  )

Thanks for the help!

jonathan · September 1, 2020, 5:35pm

Another thing you can do in this situation is use the [] syntax to access the data.

  |> map(fn: (r) => ({
      _time: r._time,
      _field: "delta",
      _value: r["v2_2020-b"] - r["v1_2019-a"]
    })
  )

Using ["key"] works for non-standard fields and is meant for data like this where the column name is something not representable with an identifier.

This only works with string literals though so you cannot use a variable to access records in this way.

thopewell · September 2, 2020, 9:55pm

Thanks Jonatahan, I double checked and that also works. Good to know, for other use cases too.

Topic		Replies	Views
Using Pivot in Flux on InfluxDB data - what am I doing wrong? Fluxlang influxdb , flux	6	1721	November 20, 2022
How to pivot average aggregate window when some columns are in error Fluxlang influxdb , query , flux	1	34	October 4, 2024
How can I name fields when I create a new table using map? InfluxDB 2	0	23	October 31, 2024
Create a new column adding _measurement as suffix Fluxlang influxdb , schema , query , flux	0	451	September 12, 2022
Flux query to create table from field content (fields vs tags) Fluxlang query , flux	3	2105	February 15, 2022

Flux pivot creating columns with special characters, issues with map

Related topics