Continuous query downsampling issue

Hi all,
I’m seeing strange behaviour when downsampling with a continuous query (CQ).
I don’t get the same results when the query runs as a CQ as when I run it from the command line.
I have 10-minute data:

|2020-09-22T18:30:00|0,00|
|2020-09-22T18:40:00|0,00|
|2020-09-22T18:50:00|0,00|
|2020-09-22T19:00:00|2,40|
|2020-09-22T19:10:00|1,80|
|2020-09-22T19:20:00|3,00|
|2020-09-22T19:30:00|2,40|
|2020-09-22T19:40:00|0,60|

which I have to average into quarter-hour (15-minute) data with a CQ.
So I tried with this CQ:

  • cq_my_eau_mean_15min CREATE CONTINUOUS QUERY cq_my_eau_mean_15min ON eau
    BEGIN SELECT mean(*) INTO eau."15m".:MEASUREMENT FROM eau."1s"./.*/ GROUP BY
    time(15m) END

and with the query:

  • SELECT mean("precipitation") FROM "eau"."1s"."meteorologie" GROUP BY time(15m)

And the results are different.

In the following data, the first value column is the CQ result and the second is the result of the mean query:

|2020-09-22T18:30:00|0,00|0,00|
|2020-09-22T18:45:00|0,00|0,00|
|2020-09-22T19:00:00|1,20|2,10|
|2020-09-22T19:15:00|2,40|3,00|
|2020-09-22T19:30:00|3,00|1,50|
|2020-09-22T19:45:00|0,60|0,00|
|2020-09-22T20:00:00|0,00|0,00|

The mean query gives the expected result, but the CQ does not.

at 19:15 I get the data from 19:00 and 19:10 => (2.4 + 1.8) / 2 = 2.1, saved at 19:00
at 19:30 I get only the data from 19:20 => 3.0, saved at 19:15

It looks like the CQ did:
at 19:15, get data from 18:50 and 19:00 => (0.0 + 2.4) / 2 = 1.2, saved at 19:00
at 19:30, get data from 19:10 and 19:20 => (1.8 + 3.0) / 2 = 2.4, saved at 19:15

Any idea why there is this difference?

@ben_sinergy - This is quite strange. What version of InfluxDB are you using? Please review this documentation for v1.8 if you haven’t already. There are a few tricky notes about what timestamp is used and the time interval used for grouping.

I agree with you that the regular query is producing the correct results. That query should group starting on :00, :15, :30, :45 with the timestamp of the mean() being the start of the grouping interval. So the mean at 19:00 includes 19:00 through 19:14.9999.
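To make that expected bucketing concrete, here is a minimal Python sketch that recomputes the 15-minute means from the raw points posted above and compares them with the reported CQ column. It does not talk to InfluxDB; the names `bucket_start` and `mean_by_bucket` are made up for illustration, and it assumes buckets are aligned to the Unix epoch (which puts them on :00, :15, :30, :45) and stamped with the start of the interval.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Raw 10-minute points copied from the first post (decimal commas written as dots).
RAW_POINTS = [
    ("2020-09-22T18:30:00", 0.0),
    ("2020-09-22T18:40:00", 0.0),
    ("2020-09-22T18:50:00", 0.0),
    ("2020-09-22T19:00:00", 2.4),
    ("2020-09-22T19:10:00", 1.8),
    ("2020-09-22T19:20:00", 3.0),
    ("2020-09-22T19:30:00", 2.4),
    ("2020-09-22T19:40:00", 0.6),
]

# CQ column as reported in the thread, for the buckets that contain raw points.
CQ_REPORTED = {
    "2020-09-22T18:30:00": 0.0,
    "2020-09-22T18:45:00": 0.0,
    "2020-09-22T19:00:00": 1.2,
    "2020-09-22T19:15:00": 2.4,
    "2020-09-22T19:30:00": 3.0,
}

EPOCH = datetime(1970, 1, 1)


def bucket_start(ts, width):
    """Floor ts to the start of its epoch-aligned bucket of the given width."""
    return ts - (ts - EPOCH) % width


def mean_by_bucket(points, width=timedelta(minutes=15)):
    """Group points by bucket start and return the mean per non-empty bucket."""
    groups = defaultdict(list)
    for stamp, value in points:
        start = bucket_start(datetime.fromisoformat(stamp), width)
        groups[start.isoformat()].append(value)
    # Round to two decimals to match the precision shown in the thread.
    return {s: round(sum(v) / len(v), 2) for s, v in sorted(groups.items())}


expected = mean_by_bucket(RAW_POINTS)
for start, mean in expected.items():
    reported = CQ_REPORTED[start]
    flag = "" if reported == mean else "  <-- differs"
    print(f"{start}  expected={mean:.2f}  cq={reported:.2f}{flag}")
```

Under those assumptions the sketch gives 2.10, 3.00 and 1.50 for the 19:00, 19:15 and 19:30 buckets, which matches the regular query column, and it flags those same three buckets as differing from the CQ column. Buckets with no raw points (19:45 onward) are simply omitted here.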

The continuous query results are odd. It appears to have used the value at 19:20 (3,0) twice: once for the interval reported at 19:15 and once for the interval reported at 19:30. I would expect it to have been included only in the interval reported at 19:15 (which executed at 19:30). With CQs, the reported timestamp is the beginning of the interval and should also fall on natural boundaries (:00, :15, :30, :45), but execution occurs at the end of the interval. I think you know all this.

Could you double check the output results again - maybe the combined results are not aligned correctly? Check your server clock time to make sure that’s accurate. Check that you only have the one continuous query running (SHOW CONTINUOUS QUERIES). There can be a delay between when data arrives and when it is ready to be queried but, for you, that delay would need to be more than 5 minutes which seems very unlikely.

If you discover more information, please share it, especially the InfluxDB version.


@philjb
thanks for your answer.

I use version 1.8.2.

I rechecked with recent values and the issue is the same. Below is a Chronograf screenshot:

used with the following values:

The CQ runs on multiple measurements and values, but I didn’t see any problems on the others. The only difference I see is that the other measurements have data every 1 or 2 seconds, while this specific measurement (meteorologie) has data every 10 minutes.

Can you share the result of SHOW CONTINUOUS QUERIES?

This feels like we’re missing something simple or it is a bug.


cq_my_eau_mean_1min CREATE CONTINUOUS QUERY cq_my_eau_mean_1min ON eau BEGIN SELECT mean(*) INTO eau."1m".:MEASUREMENT FROM eau."1s"./.*/ GROUP BY time(1m) END
cq_my_eau_mean_15min CREATE CONTINUOUS QUERY cq_my_eau_mean_15min ON eau BEGIN SELECT mean(*) INTO eau."15m".:MEASUREMENT FROM eau."1s"./.*/ GROUP BY time(15m) END

Would you create a new CQ that doesn’t use the backreference :MEASUREMENT, specifically targets "eau"."1s"."meteorologie", runs every 15 minutes, and writes INTO a different measurement? I want to see if it is something to do with the CQ running on all measurements. If this new CQ produces the correct mean results, then there is something odd with CQs that use backreferences. In that case, would you file a bug in the influxdata/influxdb repository on GitHub? You should be able to copy and edit down what we already have in this thread.

Do you know if you had this problem on v1.7?

Don’t know with v1.7.

It was a good idea to test with a new CQ, but I got the same result: both CQs give the same mean, and there is still a difference from the “live” mean.

Do you think it could be a bug in the CQ implementation?

Hi @ben_sinergy - Yeah, I was concerned the behavior would be the same! To me, this behavior seems odd at best, and at worst it doesn’t match what the documentation says about how CQs process time. When I looked at the values before, I couldn’t see any reasonable way it arrived at the mean it did. So this seems like a bug. Unfortunately, I cannot research it further from here.

If you are willing, please file an issue in the influxdata/influxdb repository on GitHub. You can copy out the relevant parts from this thread. If you are able, it would be helpful to include in the issue the minimum set of steps and input data needed to reproduce this behavior.

Sorry I can’t do more!


Hi @philjb,
thanks a lot for your help. I will let you know if I find out anything more.
