InfluxQL: Impact of functions on query perf

DB: Influx2.7
Language: InfluxQL

Hi Team,

Can you please help me understand from a technical point of view, what the impact of using EXP() and CUMULATIVE_SUM() is in InfluxQL

  1. Query without functions:
    select field from measurement where time > now()-7d

  2. Query with functions:
    select EXP(CUMULATIVE_SUM(field)) from measurement where time > now()-7d

What happens under the hood when using EXP() and CUMULATIVE_SUM() here and what’s the impact on query performance?

cc: @Anaisdg, @Jay_Clifford, @scott tagging for attention please :pray:

@ajetsharwin While I can’t give you specific execution time differences between the two, but anytime you apply a function to data, it will slightly increase query execution time (in some cases, drastically). Generally speaking though, the larger the queried data set, the longer it takes to process functions on that data.

From a technical perspective, InfluxQL processes the inner most query/function first. So in the case of this query, InfluxQL will first calculate the cumulative sum of returned results. It then applies the EXP function to the cumulative sums. So essentially the query engine makes two passes over the data. This could just be a matter of milliseconds, but it depends on the number of points returned from the queried time frame.

@scott Thank you Scott, quick follow-up on the technical point of view of using InfluxQL:

What are the impacts on performance when querying a field called fieldA from a measurement with 1000 fields and querying a field called fieldA from a measurement with 10 fields?

How will the impacts change if the where clause is filtering on tags?

What are the impacts on performance when querying a field called fieldA from a measurement with 1000 fields and querying a field called fieldA from a measurement with 10 fields?

Shouldn’t change at all.

How will the impacts change if the where clause is filtering on tags?

In v2 (the TSM storage engine), it may affect it a little, but not much. In v3 (the IOx storage engine), querying by a specific tag value does reduce query performance. This is a known issue with IOx and is being addressed.