Custom 'from' functions?

ryantxu · December 20, 2018, 4:47pm

I am working on the standard problem of needing to pick the correct rollup for the selected time interval.

We have lots of data stored every 1ms, and build grafana dashboards to show it. We currently have a very custom grafana datasource that picks a different rollup (retention policy) based on the query interval.

Is there any way to do this directly with flux? Something like:
from( my_func_that_somehow_knows_the_range() )
|> range(start: -1h)

Any ideas? The real goal is:

github.com/influxdata/influxdb

Intelligent Rollups and Querying of Aggregated Data

opened 12:17AM - 24 Aug 16 UTC

jsternberg

area/queries kind/feature-request 1.x

### Feature Request The database should support more intelligent rollups and qu…erying of aggregated data. Currently, the only way to rollup data is through manually setting up continuous queries and then manually modifying the select statements to query that data which requires the user to know which retention policies exist rather than it being discovered automatically. **Proposal:** It should be simple for an administrator to setup rollups for an entire database and users should not need knowledge of the rollups for them to automatically start using them. Using rollups should be automatic and performant. **Current behavior:** Rollups require an administrator to create a retention policy and a continuous query like this: ``` CREATE RETENTION POLICY "5m" ON mydb DURATION 0s REPLICATION 1; CREATE CONTINUOUS QUERY mycq ON mydb BEGIN SELECT mean(*) INTO mydb."5m".:MEASUREMENT FROM /.*/ GROUP BY time(5m) END; ``` It then requires the user to query the mean of a measurement like this: ``` SELECT mean_value FROM mydb."5m".cpu ``` If the server is not running when an interval should be calculated, that interval will never be run and the user needs to run that query manually. There is no way to automatically reuse the information in continuous queries to backfill data either. Also, if data is written after the last calculation, it will never enter the aggregation. It is possible to obtain partial data, but this involves telling the continuous query to resample more frequently than the default time. ``` CREATE CONTINUOUS QUERY mycq ON mydb RESAMPLE EVERY 1m BEGIN SELECT mean(*) INTO mydb."5m".:MEASUREMENT FROM /.*/ GROUP BY time(5m) END; ``` This will obtain partial data every minute, but it will not be an active result of all of the data that is available. **Desired behavior:** Administrators should have easier commands to create rollups (optional since the commands above are fairly easy to write). Users should not need to care about retention policies when trying to get the data they want. The above query the user should write is: ``` SELECT mean(value) FROM cpu GROUP BY time(5m) ``` This should use the rollup automatically if one is available and would return the same value as querying the raw data. Along with using the rollup automatically, we would also include syntax to automatically select the appropriate precision interval to be used based on the time range or number of points requested. So if we have raw data that is retained for 1 week, 1 minute aggregated data for 2 weeks, and 5 minute aggregated data for 4 weeks and we said this: ``` SELECT mean(value) FROM cpu WHERE time >= now() - 1w GROUP BY time(auto) ``` This would automatically select the 1 minute precision interval because that is the lowest precision. If we scaled this query for the past 3 weeks, we would return the 5 minute precision level. **Use case:** Downsampling long term data into new retention policies and greater performance by precalculating certain aggregates for certain intervals. This was the original use case for continuous queries, but the current continuous queries are too cumbersome for this currently. ### Documentation - [Requirements Doc](https://github.com/influxdata/influxdb/pull/7208)

but that solves the general case. I wonder if flux would let me put custom logic into a single function

Topic		Replies	Views
Automatically selection of retention policy based on range(), and requirement for now() Fluxlang influxdb , grafana , flux	0	640	September 11, 2020
Flux query over multiple downsampled buckets with different retention policies InfluxDB 2	11	7400	January 20, 2023
Query with time range calculated on basis of time returned from previous query Welcome & Getting Started influxdb	2	625	May 19, 2021
Continuous Queries without GROUP BY time Clause Fluxlang influxdb , influxql	1	1494	August 4, 2020
Compute daily return with flux Fluxlang grafana , derivative , aggregate	3	770	May 24, 2021

Custom 'from' functions?

Related topics