Add column to data if it does not exist

Hello,

Is it possible to add a column with default value to data if it does not already exist?

I have tried the following approaches

fill(column: "columnName", value: 0.0)
fill column not found: missingColumn

map(fn: (r) => ({r with columnName: if exists r.columnName then r.columnName else 0.0}))
missing object properties (columnName)

@tkupari What version of InfluxDB/Flux are you using?

InfluxDB 2.0.0 Alpha 18
Flux 0.49.0

Thanks @tkupari. I can reproduce the error. From what I can tell, this should work but I may be missing something. You should submit an issue to the Flux repo.

Thanks for the verification @scott. Should they both work or is fill() meant only for existing columns?

@tkupari fill() only works with existing columns.

I talked with the Flux engineers and this appears to, in fact, be a bug… specifically with the typing system. The good news is that they’re in the process of refactoring the typing engine to be smarter and much more robust. The bad news is that it’s not a quick fix.

There is a workaround, although it’s not ideal. You could create two streams, one that filters data with the column and one that filters data without. Then create a union of the two streams:

missingColumnName = from(bucket: "example-bucket")
  |> range(start: -1h)
  |> filter(fn: (r) => not exists r.columnName)
  |> map(fn: (r) => ({ r with columnName: 0.0 }))

hasColumnName = from(bucket: "example-bucket")
  |> range(start: -1h)
  |> filter(fn: (r) => exists r.columnName)

union(tables: [missingColumnName, hasColumnName])

It’s not the most performant of queries, but it’s a temporary workaround for what you’re trying to do. Be sure to filter each stream to the specific dataset you want to add the column to. This will help to optimize the query.

Thanks @scott! I’m afraid this might not be feasible in our use case as we are doing calculations with 5 different values which could all be missing, and if I understood correctly it would require 10 streams. I think we can filter out the missing data for now.

Should I still submit an issue to the Flux repo?

No need to. It’s related to this existing issue: https://github.com/influxdata/flux/issues/2036