I’ve run into an issue managing cumulative data, meaning counters that only increment and occasionally reset (usually due to a restart).
I’ve always had performance issues querying this data, so I decided to aggregate it into snapshot data using a Continuous Query, which works almost perfectly… almost, because it loses data.
By performance issues I mean that querying the pre-aggregated data takes 74 ms, while querying the raw cumulative counters can take over 6 s.
The data
The data describe IO usage: 100% cumulative counters, with 9 tags and 8 fields (the exact schema doesn’t matter here), sampled every 15 seconds.
Sample Data
time,company,database_name,file_type,host,logical_filename,measurement_db_type,physical_filename,read_bytes,read_latency_ms,reads,rg_read_stall_ms,rg_write_stall_ms,sql_instance,volume_mount_point,write_bytes,write_latency_ms,writes
25/08/2022 12:27:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4554805248,204589,71212,0,0,SQLCSRV04:SQL2017,F:,20368084992,641401,356320
25/08/2022 12:28:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:29:00,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:30:00,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:31:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:32:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:33:00,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:34:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:35:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
Queries
Here is the query used to turn the data from cumulative to snapshot (kept as concise as possible):
SELECT
    non_negative_difference(last(read_bytes)) AS read_bytes
FROM temp.sqlserver_database_io
WHERE
    time > now() - 5m
GROUP BY
    time(1m)
    ,*
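For context, wrapped in a Continuous Query this looks roughly like the sketch below. The database, retention policy, and target measurement names are placeholders of my own, and the WHERE time filter is dropped because the CQ scheduler supplies the time range:

CREATE CONTINUOUS QUERY "cq_sqlserver_database_io" ON "mydb"
BEGIN
    SELECT
        non_negative_difference(last(read_bytes)) AS read_bytes
    INTO "mydb"."autogen"."sqlserver_database_io_snapshot"
    FROM temp.sqlserver_database_io
    GROUP BY
        time(1m)
        ,*
END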
The issue with this query is that the grouping intervals are not inclusive on both sides of the range, which translates to lost data since these are cumulative counters.
What happens when grouping by 1m is that part of the data goes unused/lost, because the time buckets created run from hh:mm:00.0000 to hh:mm:59.9999.
In my case this means that the difference (between last and first) is computed on the following sets (a worked numeric example follows the list):
00:00:00 | interval 1
00:00:15 | interval 1
00:00:30 | interval 1
00:00:45 | interval 1
00:01:00 | interval 2 -- this point should belong to both: the end of interval 1 and the start of interval 2
00:01:15 | interval 2
00:01:30 | interval 2
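To make the loss concrete, here is a made-up numeric example for a cumulative reads counter sampled every 15 seconds:

00:00:00 | 1000  <- first point of interval 1
00:00:45 | 1030  <- last point of interval 1
00:01:00 | 1040  <- first point of interval 2
00:01:30 | 1060  <- last point of interval 2

Interval 1 yields 1030 - 1000 = 30 and interval 2 yields 1060 - 1040 = 20, for a total of 50, yet the counter actually advanced by 60 over the whole span: the 10 reads that occurred between 00:00:45 and 00:01:00 are credited to neither bucket.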
In order to solve this, I came up with the following query, in which the difference calculation is performed point by point (so no data is lost) and the result is then aggregated into 1m intervals.
The problem is that it takes 2-3x longer than the previous query.
SELECT
    sum("reads") AS "reads"
FROM (
    SELECT
        non_negative_difference("reads") AS "reads"
    FROM "$InfluxDB_RetentionPolicy"."sqlserver_database_io_cumulative"
    WHERE
        time > now() - 5m
    GROUP BY *
)
GROUP BY
    *
    ,time(1m)
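As an aside, on InfluxDB 1.7+ EXPLAIN ANALYZE can be prepended to either query to inspect where the extra time goes, e.g.

EXPLAIN ANALYZE
SELECT
    sum("reads") AS "reads"
FROM (
    SELECT non_negative_difference("reads") AS "reads"
    FROM "$InfluxDB_RetentionPolicy"."sqlserver_database_io_cumulative"
    WHERE time > now() - 5m
    GROUP BY *
)
GROUP BY *, time(1m)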
Questions
- Is there a way of having inclusive grouping intervals? (I suspect not.)
- How can this kind of issue be solved? Are there other approaches I’m missing?