Task to store daily min/max/mean of prices

Hello,

I have pricing records from fuel stations that I would like to copy into another bucket with min/max/mean transformations.

The records look like this and are stored in a bucket with a 40-day retention period.

_value,_field,_measurement,location,type
1.899,station1,Tankstelle,locationx,type1
1.959,station2,Tankstelle,locationy,type2
2.019,station3,Tankstelle,locationz,type3

Now I would like to have a task that runs every 24h and creates min/max/mean records grouped by location and type. Which station (_field) the price comes from doesn't matter.

option task = {name: "Tankstelle_24h", every: 24h}

data =
    from(bucket: "40days")
        |> range(start: -task.every)
        |> filter(fn: (r) => r["_measurement"] == "Tankstelle")
        |> group(columns: ["location", "type"])

price_min =
    data
        |> min()
        |> set(key: "_field", value: "price_min")

price_max =
    data
        |> max()
        |> set(key: "_field", value: "price_max")

price_mean =
    data
        |> mean()
        |> set(key: "_field", value: "price_mean")

union(tables: [price_min, price_max, price_mean])
    |> set(key: "_measurement", value: "Tankstelle_24h")
    |> to(org: "myorg", bucket: "ages")

The problem is that the min() and max() records are written to the ages bucket, but the mean() records are not.
If I test the same query in the Data Explorer (without the |> to() call), the raw view shows the mean records, but with "NaN" in the _time column.

What is wrong in my query, or what did I misunderstand?

Thanks

Hi @Bond246, I am going through the InfluxDB University lessons and I remember they covered this in one of them. min() and max() are selector functions and mean() is an aggregate function. I think it has to do with the group key in your data. I assume you read Flux function types and categories | Flux 0.x Documentation, specifically:

Each output table from an aggregate function will:

  • Contain a single record.
  • Have the same group key as the input table.
  • Contain an aggregated column. The column label will be the same as in the input table. The column data type depends on the specific aggregate operation. The value of the column will be null if the input table is empty or the input column has only null values.
  • Drop all columns that are:
    • not in the group key
    • not the aggregated column
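
That last point is exactly the symptom in the question: after `group(columns: ["location", "type"])`, the group key is just `{location, type}`, so `_time` is neither in the group key nor the aggregated column, and `mean()` drops it. `to()` needs a `_time` value for every point it writes. A sketch of the table shapes (the prices and timestamps below are made up for illustration):

```
// Input table, group key = {location, type}:
//   _time                 _value  location   type
//   2024-01-01T06:00:00Z  1.899   locationx  type1
//   2024-01-01T18:00:00Z  1.959   locationx  type1
//
// min() / max() are selectors: each returns one of the input rows
// unmodified, so the row keeps its original _time.
//
// mean() is an aggregate: it emits one new row per table and drops
// every column that is neither in the group key nor the aggregated
// column -- including _time:
//   _value  location   type
//   1.929   locationx  type1
//
// With no _time column, to() has no timestamp to write, so the
// price_mean points never reach the "ages" bucket.
```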

Hi @grant1,
thanks for your answer.

I don't really understand the group key logic, but I always thought it was something like a "specific combination of _field and tag values".

What I do understand is the difference between selectors and aggregate functions. But the docs also say about selectors:

  • Contain one or more unmodified records.
  • Have the same group key as the input table.

So I don't know if it really has something to do with the group key, because the documentation says the same thing about group keys for aggregate functions and for selectors.
Maybe the problem is the difference between selection and aggregation. I think |> to() takes the original timestamp from the min/max rows, but there is no original timestamp for the mean because it is an aggregation. So I would need to create new (equal) timestamps for all three of them: min/max/mean.
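
That reading matches the docs quoted above: `mean()` emits a new row with no `_time`, and `to()` needs a timestamp for every point. Besides switching to `aggregateWindow()`, one possible fix (a sketch against the query above, not tested on your data) is to keep `_stop` in the group key so it survives the aggregate, then copy it into `_time`:

```
price_mean =
    from(bucket: "40days")
        |> range(start: -task.every)
        |> filter(fn: (r) => r["_measurement"] == "Tankstelle")
        // keep _stop in the group key so mean() does not drop it
        |> group(columns: ["location", "type", "_stop"])
        |> mean()
        // mean() dropped _time; reuse the query's stop time as the
        // timestamp so to() has something to write
        |> duplicate(column: "_stop", as: "_time")
        |> set(key: "_field", value: "price_mean")
```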

I have now changed my task and replaced mean() with aggregateWindow().

So now it looks like this, and it seems to work:

option task = {name: "Tankstelle_24h", every: 24h}

data =
    from(bucket: "40days")
        |> range(start: -task.every)
        |> filter(fn: (r) => r["_measurement"] == "Tankstelle")
        |> group(columns: ["location", "type"])

price_min =
    data
        |> min()
        |> set(key: "_field", value: "price_min")

price_max =
    data
        |> max()
        |> set(key: "_field", value: "price_max")

price_mean =
    data
        |> aggregateWindow(every: task.every, fn: mean, createEmpty: false)
        |> set(key: "_field", value: "price_mean")

union(tables: [price_min, price_max, price_mean])
    |> set(key: "_measurement", value: "Tankstelle_24h")
    |> to(org: "myorg", bucket: "ages")
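
For reference, the reason `aggregateWindow()` fixes it: it is essentially a convenience wrapper that windows the data, applies the aggregate, and then restores a `_time` column from each window's stop time (its default `timeSrc: "_stop"`). Roughly, per the Flux docs (simplified):

```
// aggregateWindow(every: task.every, fn: mean, createEmpty: false)
// behaves roughly like:
data
    |> window(every: task.every, createEmpty: false)
    |> mean()
    // window() put _start/_stop in the group key, so they survive
    // mean(); copy each window's stop time back into _time
    |> duplicate(column: "_stop", as: "_time")
    |> window(every: inf) // merge the windows back together
```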