AggregateWindow with mixed types

Hi,
I’m trying to create an aggregateWindow (for downsampling) that contains data points with fields that have multiple types (string, int & float).

Here’s an incredibly simplified version of what’s happening:

First, we’ll create an empty InfluxDB v2 container:

docker run --rm -p 8086:8086 \
  -e DOCKER_INFLUXDB_INIT_MODE=setup \
  -e DOCKER_INFLUXDB_INIT_USERNAME=admin \
  -e DOCKER_INFLUXDB_INIT_PASSWORD=password12345 \
  -e DOCKER_INFLUXDB_INIT_ORG=scrutiny \
  -e DOCKER_INFLUXDB_INIT_BUCKET=metrics \
  -e DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=my-super-secret-auth-token \
  influxdb:2.0

After that, we’ll populate InfluxDB with 4 data points: 2 points for each “device_wwn”:

curl --request POST \
"http://localhost:8086/api/v2/write?org=scrutiny&bucket=metrics&precision=ns" \
  --header "Authorization: Token my-super-secret-auth-token" \
  --header "Content-Type: text/plain; charset=utf-8" \
  --header "Accept: application/json" \
  --data-binary '
    smart,device_wwn=diskdeviceid01,protocol="NVMe" temperature=70.00,attr.power_cycles.attribute_id="power_cycles",attr.power_cycles.value=100,attr.host_reads.attribute_id="host_reads",attr.host_reads.value=1000 1642291849000000000
    smart,device_wwn=diskdeviceid01,protocol="NVMe" temperature=80.00,attr.power_cycles.attribute_id="power_cycles",attr.power_cycles.value=110,attr.host_reads.attribute_id="host_reads",attr.host_reads.value=2000 1642291909000000000
    smart,device_wwn=diskdeviceid02,protocol="ATA" temperature=70.00,attr.1.attribute_id="1",attr.1.value=100,attr.2.attribute_id="2",attr.2.value=1000 1642291649000000000
    smart,device_wwn=diskdeviceid02,protocol="ATA" temperature=80.00,attr.1.attribute_id="1",attr.1.value=110,attr.2.attribute_id="2",attr.2.value=2000 1642291709000000000
    '

Finally, we’ll attempt to aggregate/downsample the data we just wrote (down to 1 data point per unique “device_wwn”).

Ideally, the two aggregated data points would be:

smart,device_wwn=diskdeviceid01,protocol="NVMe" temperature=75.00,attr.power_cycles.attribute_id="power_cycles",attr.power_cycles.value=105,attr.host_reads.attribute_id="host_reads",attr.host_reads.value=1500 1642291909000000000

smart,device_wwn=diskdeviceid02,protocol="ATA" temperature=75.00,attr.1.attribute_id="1",attr.1.value=105,attr.2.attribute_id="2",attr.2.value=1500 1642291709000000000

This aggregateWindow query fails:

curl -vvv --request POST "http://localhost:8086/api/v2/query?org=scrutiny" \
  --header 'Authorization: Token my-super-secret-auth-token' \
  --header 'Accept: application/csv' \
  --header 'Content-type: application/vnd.flux' \
  --data 'import "influxdata/influxdb/schema"

smart_data = from(bucket: "metrics")
|> range(start: -2y, stop: now())
|> filter(fn: (r) => r["_measurement"] == "smart" )
|> filter(fn: (r) => r["_field"] !~ /(_measurement|protocol|device_wwn)/)

smart_data
|> aggregateWindow(fn: mean, every: 1w)
|> group(columns: ["device_wwn"])
|> schema.fieldsAsCols()'


{"code":"invalid","message":"unsupported input type for mean aggregate: string"}

But if we filter out the “attribute_id” fields (which are of type string), everything works:

curl -vvv --request POST "http://localhost:8086/api/v2/query?org=scrutiny" \
  --header 'Authorization: Token my-super-secret-auth-token' \
  --header 'Accept: application/csv' \
  --header 'Content-type: application/vnd.flux' \
  --data 'import "influxdata/influxdb/schema"

smart_data = from(bucket: "metrics")
|> range(start: -2y, stop: now())
|> filter(fn: (r) => r["_measurement"] == "smart" )
|> filter(fn: (r) => r["_field"] !~ /(_measurement|protocol|device_wwn|attribute_id)/)

smart_data
|> aggregateWindow(fn: mean, every: 1w)
|> group(columns: ["device_wwn"])
|> schema.fieldsAsCols()'

As I mentioned above, this is an incredibly simplified version of my dataset: each point has dozens of fields, about a third of which are string values (and those strings are constants). I need to find a way to have them copied into the aggregated data.

Hello @AnalogJ,
You can use this package:

But I’m having trouble thinking of a way to programmatically filter out specific string fields and match them back to their aggregated values/series.
You could perhaps filter the string fields out of the aggregation and convert them to tags with a map() function before performing the aggregateWindow().
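Another rough sketch, using the simplified dataset from the original post (untested; it assumes the string fields all match /attribute_id/ and are constant within each window, so carrying the last value is lossless): aggregate the numeric fields with mean and the string fields with last (which does accept strings), then union the two streams before pivoting:

```flux
import "influxdata/influxdb/schema"

data = from(bucket: "metrics")
    |> range(start: -2y, stop: now())
    |> filter(fn: (r) => r["_measurement"] == "smart")

// Numeric fields: safe to average.
numeric = data
    |> filter(fn: (r) => r["_field"] !~ /attribute_id/)
    |> aggregateWindow(fn: mean, every: 1w, createEmpty: false)

// String fields: carry the last value through each window instead of averaging.
strings = data
    |> filter(fn: (r) => r["_field"] =~ /attribute_id/)
    |> aggregateWindow(fn: last, every: 1w, createEmpty: false)

union(tables: [numeric, strings])
    |> group(columns: ["device_wwn"])
    |> schema.fieldsAsCols()
```

With matching window boundaries and createEmpty: false on both branches, each string field should line up with the numeric aggregates at the same _time before fieldsAsCols() pivots them into one row per window.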