Hi,
I’m trying to create an aggregateWindow query (for downsampling) over data points whose fields have multiple types (string, int & float).
Here’s an incredibly simplified version of what’s happening.
First, we’ll create an empty InfluxDB v2 container:
docker run --rm -p 8086:8086 \
-e DOCKER_INFLUXDB_INIT_MODE=setup \
-e DOCKER_INFLUXDB_INIT_USERNAME=admin \
-e DOCKER_INFLUXDB_INIT_PASSWORD=password12345 \
-e DOCKER_INFLUXDB_INIT_ORG=scrutiny \
-e DOCKER_INFLUXDB_INIT_BUCKET=metrics \
-e DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=my-super-secret-auth-token \
influxdb:2.0
After that, we’ll populate InfluxDB with 4 data points: 2 points for each “device_wwn”:
curl --request POST \
"http://localhost:8086/api/v2/write?org=scrutiny&bucket=metrics&precision=ns" \
--header "Authorization: Token my-super-secret-auth-token" \
--header "Content-Type: text/plain; charset=utf-8" \
--header "Accept: application/json" \
--data-binary '
smart,device_wwn=diskdeviceid01,protocol="NVMe" temperature=70.00,attr.power_cycles.attribute_id="power_cycles",attr.power_cycles.value=100,attr.host_reads.attribute_id="host_reads",attr.host_reads.value=1000 1642291849000000000
smart,device_wwn=diskdeviceid01,protocol="NVMe" temperature=80.00,attr.power_cycles.attribute_id="power_cycles",attr.power_cycles.value=110,attr.host_reads.attribute_id="host_reads",attr.host_reads.value=2000 1642291909000000000
smart,device_wwn=diskdeviceid02,protocol="ATA" temperature=70.00,attr.1.attribute_id="1",attr.1.value=100,attr.2.attribute_id="2",attr.2.value=1000 1642291649000000000
smart,device_wwn=diskdeviceid02,protocol="ATA" temperature=80.00,attr.1.attribute_id="1",attr.1.value=110,attr.2.attribute_id="2",attr.2.value=2000 1642291709000000000
'
Finally, we’ll attempt to aggregate/downsample the data we just wrote (down to one data point per unique “device_wwn”).
Ideally the two resulting data points would be:
smart,device_wwn=diskdeviceid01,protocol="NVMe" temperature=75.00,attr.power_cycles.attribute_id="power_cycles",attr.power_cycles.value=105,attr.host_reads.attribute_id="host_reads",attr.host_reads.value=1500 1642291909000000000
smart,device_wwn=diskdeviceid02,protocol="ATA" temperature=75.00,attr.1.attribute_id="1",attr.1.value=105,attr.2.attribute_id="2",attr.2.value=1500 1642291709000000000
This aggregateWindow query fails:
curl -vvv --request POST "http://localhost:8086/api/v2/query?org=scrutiny" \
--header 'Authorization: Token my-super-secret-auth-token' \
--header 'Accept: application/csv' \
--header 'Content-type: application/vnd.flux' \
--data 'import "influxdata/influxdb/schema"
smart_data = from(bucket: "metrics")
|> range(start: -2y, stop: now())
|> filter(fn: (r) => r["_measurement"] == "smart" )
|> filter(fn: (r) => r["_field"] !~ /(_measurement|protocol|device_wwn)/)
smart_data
|> aggregateWindow(fn: mean, every: 1w)
|> group(columns: ["device_wwn"])
|> schema.fieldsAsCols()'
{"code":"invalid","message":"unsupported input type for mean aggregate: string"}
But if we filter out the “attribute_id” fields (which are strings, so mean() can’t aggregate them), everything works:
curl -vvv --request POST "http://localhost:8086/api/v2/query?org=scrutiny" \
--header 'Authorization: Token my-super-secret-auth-token' \
--header 'Accept: application/csv' \
--header 'Content-type: application/vnd.flux' \
--data 'import "influxdata/influxdb/schema"
smart_data = from(bucket: "metrics")
|> range(start: -2y, stop: now())
|> filter(fn: (r) => r["_measurement"] == "smart" )
|> filter(fn: (r) => r["_field"] !~ /(_measurement|protocol|device_wwn|attribute_id)/)
smart_data
|> aggregateWindow(fn: mean, every: 1w)
|> group(columns: ["device_wwn"])
|> schema.fieldsAsCols()'
As I mentioned above, this is an incredibly simplified version of my dataset: each point has dozens of fields, roughly a third of which are string values (which are constants). I need to find a way to have them copied into the aggregated data.
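For context, here’s the kind of split-and-union workaround I’ve been sketching (using the simplified field names above — in the real dataset the regex matching string fields would be different). The idea is to route numeric fields through mean() and string fields through last(), then union the two streams before pivoting:

```flux
import "influxdata/influxdb/schema"

base = from(bucket: "metrics")
    |> range(start: -2y, stop: now())
    |> filter(fn: (r) => r["_measurement"] == "smart")

// Numeric fields: safe to average
numeric = base
    |> filter(fn: (r) => r["_field"] !~ /attribute_id/)
    |> aggregateWindow(fn: mean, every: 1w)

// String fields (constants): carry forward the last value in each window
strings = base
    |> filter(fn: (r) => r["_field"] =~ /attribute_id/)
    |> aggregateWindow(fn: last, every: 1w)

union(tables: [numeric, strings])
    |> group(columns: ["device_wwn"])
    |> schema.fieldsAsCols()
```

This avoids the “unsupported input type for mean aggregate: string” error because mean() never sees a string-typed table, but it requires me to maintain a regex that distinguishes string fields from numeric ones, which is fragile with dozens of fields.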