Cumulative Data Management - Converting to Snapshot Data

I’ve found an issue in managing cumulative data, meaning counters that only increment and sometimes reset (usually due to a restart).

I’ve always had performance issues when querying this data, so I decided to aggregate it and turn it into snapshot data using a Continuous Query, which works (almost) perfectly… almost, because it loses data.
By performance issues I mean that querying the pre-aggregated data takes 74 ms, while the same query on the raw cumulative counters can take over 6 s.
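
For reference, here is a minimal sketch of what such a Continuous Query could look like. The database name, CQ name, and target measurement below are assumptions (the actual CQ definition isn’t shown in this post), and note that a CQ adds its own WHERE time clause at each run, which is exactly where the boundary behavior described later comes into play:

-- Hypothetical sketch: "mydb", the CQ name, and the target measurement
-- are assumptions; the SELECT mirrors the query shown below.
CREATE CONTINUOUS QUERY "cq_io_snapshot" ON "mydb"
BEGIN
  SELECT non_negative_difference(last("read_bytes")) AS "read_bytes"
  INTO "sqlserver_database_io_snapshot"
  FROM "temp"."sqlserver_database_io"
  GROUP BY time(1m), *
END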

The data
The data is about IO usage: 100% cumulative counters, with 9 tags and 6 fields (the exact schema doesn’t matter logically), sampled every 15 seconds.

Sample Data
time,company,database_name,file_type,host,logical_filename,measurement_db_type,physical_filename,read_bytes,read_latency_ms,reads,rg_read_stall_ms,rg_write_stall_ms,sql_instance,volume_mount_point,write_bytes,write_latency_ms,writes
25/08/2022 12:27:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4554805248,204589,71212,0,0,SQLCSRV04:SQL2017,F:,20368084992,641401,356320
25/08/2022 12:28:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:29:00,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:30:00,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:31:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:32:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:33:00,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:34:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105
25/08/2022 12:35:01,quantumdatis_demo,AdventureWorks2016,LOG,QDSRVMONITOR,AdventureWorks2016_Log,SQLServer,F:\MSSQL14.SQL2017\MSSQL\DATA\AdventureWorks2016_Log.ldf,4562132992,204898,71325,0,0,SQLCSRV04:SQL2017,F:,20412518400,642159,357105

Queries

Here is the query used to turn the data from cumulative to snapshot (kept as concise as possible):

SELECT
	non_negative_difference(last(read_bytes)) AS read_bytes
FROM temp.sqlserver_database_io
WHERE
	time > now() - 5m
GROUP BY
	time(1m)
	,*

The issue with this query is that the grouping intervals are not inclusive on both sides of the range, which translates into lost data, given that this is a cumulative counter.
Given the following chart:
[chart: Explanation]

What happens when grouping by 1m is that part of the data is unused/lost, as the time buckets created will run from hh:mm:00.0000 to hh:mm:59.9999; visual representation below:
[image: LostData]

In my case this means that the difference (between last and first) is computed on the following sets:

00:00:00 | interval 1
00:00:15 | interval 1
00:00:30 | interval 1
00:00:45 | interval 1
00:01:00 | interval 2 -- this should belong to both: the end of interval 1 and the beginning of interval 2
00:01:15 | interval 2
00:01:30 | interval 2

In order to solve this, I came up with the following query, in which the difference calculation is performed point by point (therefore no data is lost) and the result is then aggregated into 1-minute intervals.
The problem is that it takes 2-3x more time than the previous query, presumably because the difference has to be computed on every raw point of every series before the outer query aggregates.

SELECT
	sum("reads") AS "reads"
FROM (
	SELECT
		non_negative_difference("reads") AS "reads"
	FROM "$InfluxDB_RetentionPolicy"."sqlserver_database_io_cumulative"
	WHERE
		time > now() - 5m
	GROUP BY *
)
GROUP BY
	*
	,time(1m)

Questions

  • Is there a way to make the grouping intervals inclusive? (I think not.)
  • How can this kind of issue be solved? Are there other approaches I’m missing?

Hello @Giovanni_Luisotto,
You’re correct, there is no way to make grouping intervals inclusive.
With Flux you might be able to use the timeShift() function to achieve something like this.
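For example, one way to read that suggestion is the following rough, untested sketch (the bucket name is an assumption; the measurement and field come from your sample above):

// Untested sketch: take the last value of each 1m window, compute the
// non-negative delta between consecutive windows (so the samples between
// windows are not skipped), then shift timestamps back so each delta is
// labeled with the start of the interval it covers.
from(bucket: "mydb")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "sqlserver_database_io" and r._field == "read_bytes")
  |> aggregateWindow(every: 1m, fn: last, createEmpty: false)
  |> difference(nonNegative: true)
  |> timeShift(duration: -1m)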

However, I created a feature request for what you’re looking for (but for Flux):

Feel free to comment to help it gain traction.

Thanks!