Troubles with linear regression

Using InfluxDB 2.1.1 on Linux

Hi,
I am trying to use the statsmodels.linearRegression() function. The problem I am struggling with is that it does not respect the grouping of the input tables.

Here is the script:

import "contrib/anaisdg/statsmodels"
original = from(bucket: "my_bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "aht")
  |> filter(fn: (r) => r["_field"] == "mean" and r._value > 0)
  |> group(columns: ["application"]) // <-- group by application tag
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)

original |> yield(name: "original")

original
  |> statsmodels.linearRegression()
  |> yield(name: "lr")

The problem is that the y_hat values do not take the groups into account. The linear regression seems to be calculated only for the first group (helpdesk), and the values for the other groups (ivr, selfservice) seem to be ignored.

Maybe I’m doing something wrong?

I am currently seeing the same error. I talked with the creator of the function yesterday (3/21) and they are working on it. The function is supposed to evaluate each table individually.

Thanks, is there a corresponding bug ticket? I’ve found nothing…

Hello @vladi and @wood2944, I’m poking some Flux engineers again. Hopefully they have some time. I’ll let you know. For now you’ll have to apply the function to each series individually.
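A minimal sketch of that per-series workaround, assuming the bucket, measurement, field, and tag names from the original script. The helper name regressionFor is hypothetical, and the application values have to be listed by hand:

```flux
import "contrib/anaisdg/statsmodels"

// Hypothetical helper: build the query for one application only,
// so linearRegression() receives a single table.
regressionFor = (app) =>
    from(bucket: "my_bucket")
        |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
        |> filter(fn: (r) => r["_measurement"] == "aht")
        |> filter(fn: (r) => r["_field"] == "mean" and r._value > 0)
        |> filter(fn: (r) => r["application"] == app)
        |> group(columns: ["application"])
        |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
        |> statsmodels.linearRegression()

// One yield per series; repeat for each remaining application value.
regressionFor(app: "helpdesk") |> yield(name: "lr_helpdesk")
regressionFor(app: "ivr") |> yield(name: "lr_ivr")
```

This runs one query per application, so it is only practical when the set of tag values is small and known in advance.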

@vladi and @wood2944,
It should be working now:

Hi @vladi and @wood2944, did the linked patch fix your issue?

I have upgraded to InfluxDB v2.3.0, which should include the fix, but I am still seeing the linear regression function evaluate only a single table.

Unfortunately not. I have tested since the “patch” and am still facing the same issue. The only workaround I found was to display each series individually. I have since filed another issue, which has not received a response yet: Influxdb 2.0.9 Linear Regression Error · Issue #23544 · influxdata/influxdb · GitHub

Thanks for your response and for creating an issue. I have subscribed to it and added a reaction to show that more than one person is affected.
Fingers crossed this gets solved, as it would be a really useful tool for predicting when a threshold will be reached in alerting.