Unnecessary verbosity in flux syntax

#1

I recently began looking at the influxdb v2 roadmap and was just introduced to the flux language. Since you’ve solicited feedback I wanted to offer my initial thoughts. My perspective is of a longtime influxdb user who primarily uses the database via grafana. My underlying application saves a wide variety of data to influx and I use influx + grafana to explore the generated data. The visualizations I’m using change frequently as I may be focusing on any given measurement (and what it represents) for a week or two at a time. (I also change what data I’m saving to influx frequently).

As you might imagine from this use-case, influxdb’s flexibility is a crucial aspect of why it works well. Something I also place a lot of importance on is being able to quickly write new queries. To that end, brevity is crucial. I’m someone who has considered hacking on the influxdb source so I might be able to substitute just the letter “s” for “select” in a query, for example. I also manually delete data with a scheduled task rather than use retention policies because I find the idea of specifying which retention policy I am referring to in every query very unpalatable (and the default can’t be set at the level of a measurement).

From this perspective, my first reaction to the flux syntax was dismay, because it appears to require significantly more boilerplate to cover the same ground as an influxql query.

To start, consider the most basic possible select query:

select * from my_measurement

this becomes

from(bucket:"dbname/autogen")
  |> filter(fn: (r) => r._measurement == "my_measurement")

Several questions arise from this basic example:

  • do I really need to specify db (aka bucket) every query? no “use dbname”?
  • do I really need to quote the identifiers? That adds a lot of extra keystrokes over time.
  • why do you have to prefix the function with “fn:”? Isn’t the arrow syntax sufficient to ascertain it’s a function?
  • why is it “_measurement”, instead of “measurement”?

There are several alternatives that would improve this substantially.

What if the argument names passed to the inline function were the column names? e.g.

select a, b / 10 from measurement where c = "tagval"

becomes

from((dbname) => my_measurement)
    |> filter((c) => c == "tagval")
    |> map((a, b) => (a, b / 10))

Another way of condensing bucket/measurement selection might be

bucket(dbname).select(my_measurement)

There are probably a bunch of problems with these suggested syntaxes, but they are only examples to illustrate the point. As someone who primarily uses influxdb to explore data, I’m dreading the extra verbosity of flux. Anything you can do to streamline the syntax would be much appreciated.

#2

Hi! Thank you for your feedback, we really do appreciate it. While I am an employee of InfluxData, I am not on the Flux team. I will answer these questions as best I can, and perhaps someone from the Flux team may wish to inject their own thoughts too.

I’ll start with your initial questions:

  • do I really need to specify db (aka bucket) every query? no “use dbname”?

Flux is a standalone language. While our current implementation runs within InfluxDB, its use cases cast a much wider net. So yes, specifying a DB is required.

  • do I really need to quote the identifiers? That adds a lot of extra keystrokes over time.

Yes. As in most programming languages, the quotes mark a string rather than a language construct. You could define variables whose names match field names, which would make unquoted identifiers ambiguous. So strings it is.
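
As a rough illustration of that ambiguity, here is a sketch in which a variable happens to share its name with a measurement. The variable, its "cpu" value, and the one-hour range are assumptions made up for this example.

```flux
// Hypothetical variable that shares its name with a measurement.
my_measurement = "cpu"

from(bucket: "dbname/autogen")
    |> range(start: -1h) // assumed time window
    // Quoted: matches the measurement literally named "my_measurement".
    // Unquoted, r._measurement == my_measurement would compare against "cpu".
    |> filter(fn: (r) => r._measurement == "my_measurement")
```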

  • why do you have to prefix the function with “fn:”? Isn’t the arrow syntax sufficient to ascertain it’s a function?

Flux is in its early iterations. fn is required for now, but there’s no reason this couldn’t adopt JavaScript short syntax in time.

  • why is it “_measurement”, instead of “measurement”?

Underscores are often used in computing to mark private or system-defined fields, as opposed to user-defined ones. That is what the prefix means here.


With regard to your final point, there’s no reason you couldn’t define your own function that reduces the boilerplate by handling the from and filter steps, so you don’t have to spell them out each time.
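
A minimal sketch of what such a helper might look like. The name m, its parameter names, and the default bucket and time range are all hypothetical choices for illustration, not anything built into Flux.

```flux
// Hypothetical helper: wraps from + range + filter behind one short call.
// The defaults below are assumptions; adjust them to your own setup.
m = (name, bucket="dbname/autogen", start=-1h) =>
    from(bucket: bucket)
        |> range(start: start)
        |> filter(fn: (r) => r._measurement == name)

// Usage: the defaults cover the bucket and time range.
m(name: "my_measurement")
```

Because the helper returns a stream of tables, further steps can still be chained onto the call with |>.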

I hope this helps! Please continue to provide your feedback and we’ll take it on board and hope to make Flux your default language for querying data :smile:

#3

To clarify a little more on the fn question: fn is a parameter of the filter() function. It expects a function that takes a single object, r, which represents each row or record in the input stream. Flux uses keyword arguments, not positional (more info here). So you aren’t prefixing the function with fn; you’re saying fn (the predicate function) is this …
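
One way to see this is to give the predicate a name and pass it by keyword. The sketch below assumes a bucket called "dbname/autogen" and a one-hour range; the name isMyMeasurement is made up for the example.

```flux
// A predicate is an ordinary function value that takes a record r.
isMyMeasurement = (r) => r._measurement == "my_measurement"

from(bucket: "dbname/autogen")
    |> range(start: -1h) // assumed time window
    // fn is the name of filter()'s parameter; the predicate is bound to it.
    |> filter(fn: isMyMeasurement)
```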

#4

Thanks Scott! That completely slipped my mind. Thanks for bringing that up :+1:

#5

I’ll add a few things here. As for use <db>, that’s actually just part of the CLI, not a part of InfluxQL. So it might be possible to do something like that for the Flux REPL, eliminating the from part of the query, but it would look odd to have filter or range as the start of the query.

We’ve given some thought to creating shorthand syntax for the most common query tasks (range, filter), and I wrote up an idea late last year here: https://github.com/influxdata/flux/issues/264

I’d be curious to hear your thoughts on that proposal.