I recently began looking at the influxdb v2 roadmap and was just introduced to the flux language. Since you’ve solicited feedback, I wanted to offer my initial thoughts. My perspective is that of a longtime influxdb user who primarily uses the database via grafana. My underlying application saves a wide variety of data to influx, and I use influx + grafana to explore the generated data. The visualizations I’m using change frequently, as I may be focusing on any given measurement (and what it represents) for a week or two at a time. (I also frequently change what data I’m saving to influx.)
As you might imagine from this use-case, influxdb’s flexibility is a crucial aspect of why it works well. Something I also place a lot of importance on is being able to quickly write new queries. To that end, brevity is crucial. I’m someone who has considered hacking on the influxdb source so I might be able to substitute just the letter “s” for “select” in a query, for example. I also manually delete data with a scheduled task rather than use retention policies because I find the idea of specifying which retention policy I am referring to in every query very unpalatable (and the default can’t be set at the level of a measurement).
From this perspective, my first reaction to the flux syntax was dismay, because it appears to require significantly more boilerplate to cover the same ground as an influxql query.
To start, consider the most basic possible select query. In influxql:
select * from my_measurement
The flux equivalent:
from(bucket:"dbname/autogen") |> filter(fn: (r) => r._measurement == "my_measurement")
Several questions arise from this basic example:
- do I really need to specify db (aka bucket) every query? no “use dbname”?
- do I really need to quote the identifiers? That adds a lot of extra keystrokes over time.
- why do you have to prefix the function with “fn:”? Isn’t the arrow syntax sufficient to ascertain it’s a function?
- why is it “_measurement”, instead of “measurement”?
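The first of these questions can at least be worked around on the client side. Here is a minimal sketch in Python, assuming the query is sent as plain text. The names `DEFAULT_BUCKET` and `flux_select` are hypothetical and not part of any influxdb client. It only mirrors the query shape shown above, so a real query may also need a time range step.

```python
# Hypothetical helper: expands a terse call into the full Flux pipeline text.
# DEFAULT_BUCKET stands in for the "use dbname" that flux does not seem to offer.
DEFAULT_BUCKET = "dbname/autogen"


def flux_select(measurement, bucket=DEFAULT_BUCKET, **tags):
    """Return Flux source selecting one measurement, optionally filtered by tag equality.

    Values are inserted as-is, so names containing double quotes are not handled.
    """
    conditions = [f'r._measurement == "{measurement}"']
    # Each keyword argument becomes an equality test on the tag of the same name.
    conditions += [f'r.{key} == "{value}"' for key, value in tags.items()]
    predicate = " and ".join(conditions)
    return f'from(bucket:"{bucket}") |> filter(fn: (r) => {predicate})'


print(flux_select("my_measurement"))
print(flux_select("measurement", c="tagval"))
```

This keeps the typing short, but it only hides the boilerplate rather than removing it, and it does nothing for queries typed directly into grafana.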
There are several alternatives that would improve this substantially.
What if the argument names passed to the inline function were the column names? For example, the influxql query
select a, b / 10 from measurement where c = "tagval"
could be written as
from((dbname) => measurement) |> filter((c) => c == "tagval") |> map((a, b) => (a, b / 10))
Another way of condensing bucket/measurement selection might be
There are probably a bunch of problems with these suggested syntaxes, but they are examples to serve a point. As someone who primarily uses influxdb to explore data, I’m dreading the extra verbosity of flux. Anything you can do to streamline the syntax would be much appreciated.