Out of memory with aggregateWindow and spread function

Hi,

I have created a task that runs after midnight and calculates the previous day's power consumption for several power meters => spread function. No problem so far…

But now I want to calculate this daily spread value initially for the last ~2 years with the following code:

from(bucket: "power_full")
    |> range(start: 2021-01-01T00:00:00Z, stop: today())
    |> filter(fn: (r) => r["_measurement"] == "Energy")
    |> aggregateWindow(every: 24h, fn: spread, createEmpty: false)
    |> to(bucket: "power_daily")

With this code I always run out of memory. If I change the function to mean, it finishes in about 2 minutes. I think this is because the spread function is not supported as a pushdown? I already tried the min and max functions => also out of memory.

Time is not the problem here, because it is only an initial run to prepare the bucket. Or is it possible to calculate it month by month, or something like that, in a loop?

Hi,
I tried to calculate it month by month with the following code, but the task still runs out of memory :cry:
Is there something like a dispose that could be called between the aggregateFunctions calls?

import "timezone"
import "strings"
import "date"

// "every" is ~100 years (876456h), so in practice this rebuild task runs only once
option task = {name: "power_daily_rebuild", every: 876456h0m0s, offset: 20m}

option location = timezone.location(name: "Europe/Berlin")

targetBucket = "power_daily"
sourceBucket = "power_full"

aggregateFunctions = (month) => {
    startTime = date.truncate(t: month, unit: 1mo)
    endTime = date.add(d: 1mo, to: startTime)
    data =
        from(bucket: "Unimoc")
            |> range(start: startTime, stop: endTime)
            |> filter(fn: (r) => r["_measurement"] == "Energy")

    //    data
    //        |> aggregateWindow(every: 24h, fn: mean, createEmpty: false)
    //        |> set(key: "fn", value: "mean")
    //        |> to(bucket: targetBucket)
    //    data
    //        |> aggregateWindow(every: 24h, fn: min, createEmpty: false)
    //        |> set(key: "fn", value: "min")
    //        |> to(bucket: targetBucket)
    //    data
    //        |> aggregateWindow(every: 24h, fn: max, createEmpty: false)
    //        |> set(key: "fn", value: "max")
    //        |> to(bucket: targetBucket)
    data
        |> aggregateWindow(every: 24h, fn: spread, createEmpty: false)
        |> set(key: "fn", value: "spread")
        |> to(bucket: targetBucket)

    return 0
}

aggregateYear = (year) => {
    aggregateFunctions(month: date.add(d: year, to: 2020-01-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-02-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-03-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-04-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-05-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-06-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-07-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-08-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-09-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-10-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-11-01T00:00:00Z))
    aggregateFunctions(month: date.add(d: year, to: 2020-12-01T00:00:00Z))

    return 0
}

aggregateYear(year: 0y)
aggregateYear(year: 1y)
aggregateYear(year: 2y)
aggregateYear(year: 3y)

Hello @Gonzo4,
Unfortunately, Flux is notorious for running out of memory. You might be better off using the Python client library and doing the analysis that way. These types of issues with Flux are largely why the team rewrote the storage engine in v3. You can learn more about the performance benefits here if you're curious:
InfluxDB 3.0 is up to 45x Faster for Recent Data Compared to InfluxDB Open Source | InfluxData.

Hi,
yes, in the meantime I have written a simple Python script that issues one request per month.
=> Rebuild is done in a few minutes without memory issues :grinning:
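
For reference, here is a simplified sketch of that approach (assuming the official influxdb-client package; the url, token, and org values are placeholders, and the bucket names are the ones from above):

from datetime import datetime, timezone

from influxdb_client import InfluxDBClient

# Yield first-of-month timestamps from `first` until just past `stop`.
def month_starts(first, stop):
    y, m = first.year, first.month
    while True:
        t = datetime(y, m, 1, tzinfo=timezone.utc)
        yield t
        if t >= stop:
            return
        m += 1
        if m > 12:
            y, m = y + 1, 1

# Same Flux as before, but limited to one month per request. The trailing
# to() writes the result server-side, so the client never holds the data.
FLUX = """
from(bucket: "power_full")
    |> range(start: {start}, stop: {stop})
    |> filter(fn: (r) => r["_measurement"] == "Energy")
    |> aggregateWindow(every: 24h, fn: spread, createEmpty: false)
    |> set(key: "fn", value: "spread")
    |> to(bucket: "power_daily")
"""

client = InfluxDBClient(url="http://localhost:8086", token="MY_TOKEN", org="my-org")
query_api = client.query_api()

bounds = list(month_starts(datetime(2021, 1, 1, tzinfo=timezone.utc), datetime.now(timezone.utc)))
fmt = "%Y-%m-%dT%H:%M:%SZ"
for start, stop in zip(bounds, bounds[1:]):
    print(f"aggregating {start:%Y-%m} ...")
    query_api.query(FLUX.format(start=start.strftime(fmt), stop=stop.strftime(fmt)))

client.close()

Because each query ends in to(), the aggregation and the write happen entirely server-side; the Python side only drives the loop, so no single query ever has to process more than one month of raw data.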

Just out of interest: why was the pushdown not implemented for the spread/min/max functions? OK, with 3.0 coming up this is no longer relevant, but anyhow :yum: