How can I pass null as literal?

juliopereirab · May 4, 2020, 6:07pm

I intend to use fill() with usePrevious on previously manipulated data. What I would like to do is pass some nulls into the table under a certain condition, and then run the fill() function.

How do I pass a null literal?

Anaisdg · May 4, 2020, 8:53pm

Hello @juliopereirab,
Welcome! Can you please provide an example input and desired output?

juliopereirab · May 5, 2020, 11:55am

Hello @Anaisdg,

Here’s a sample of the code:

from(bucket: "00000_15cb1129-1137-4a92-b761-9fac2a5d0be9")
    |> range(start: 2020-04-26T12:00:00Z, stop: 2020-04-26T15:00:00Z)
    |> drop(columns: ["_start", "_stop", "ChunkEnd", "ProviderId"])
    |> filter(fn: (r) => r["SourceId"] == "bookie1" or r["SourceId"] == "bookie2")
    |> filter(fn: (r) => r["_field"] == "price-000875918750000000")
    |> group()
    |> sort(columns: ["_time"])
    |> map(fn: (r) => {
        bookie1_value = if r.SourceId == "bookie1" then r._value else 0
        bookie2_value = if r.SourceId == "bookie2" then r._value else 0
        return {_time: r._time, bookie1: bookie1_value, bookie2: bookie2_value}
    })
    |> fill(column: "bookie1", usePrevious: true)
    |> fill(column: "bookie2", usePrevious: true)

As you can see, I’m producing two columns based on values with different tags. As I assign values to the new columns, I need a fallback value that fill() with usePrevious would later replace. Currently I’m using 0 as the fallback value, which doesn’t work because fill() expects null. Is there any way to pass null, or to solve this in some other way?

Thanks,

Anaisdg · May 6, 2020, 4:28pm

I’m thinking there might be some other way to solve this. I’m still having trouble understanding what you’re trying to do.
I don’t understand why you’re grouping only to then filter by those tags.
Can you help me understand why you can’t apply a usePrevious before grouping? It seems to me that way you wouldn’t have to “undo the group” with bookie1_value = if r.SourceId == "bookie1" and bookie2_value = if r.SourceId == "bookie2".
To help me help you, can you provide me with an exported annotated CSV of your data so that I can play with it? Can you also take a step back and describe the goal more generally, please?

juliopereirab · May 6, 2020, 8:58pm

I know the query might not be the most precise, but the main objective is to compare the results of the two bookies for each data point that gets into the measurement/field. Since a data point covers the data of a single bookie, and I want to compare the two bookmakers each time a point gets into the database, I need a way of replicating the past values for comparison. Here is an example:

In the screenshot I’m passing there are three mini-tables, the first with the original data structure. I want to go from this state to having two columns to compare both values on each data entry. To do that I map through the values and generate the two columns, but for each row I have only one value, and I would need a sort of “fallback” value that would be detected by fill() to cover the gaps; that fallback value happens to be null, but I don’t know how to pass it.

I know there may be other ways of producing this result, but being able to write null as a literal seemed like a basic feature. I also know I could write a script that fills the gaps automatically as the data points arrive from the source, but my first, lazy approach was to see whether this could be done in a plain query with what is already there.
Is there any suggestion?

Thanks,

jonathan · July 30, 2020, 7:22pm

I think you might be able to use pivot for this.

from(bucket: "test2")
    |> range(start: 0, stop: 20)
    |> filter(fn: (r) => r._measurement == "m0" and r._field == "price")
    |> pivot(columnKey: ["SourceId"], rowKey: ["_time"], valueColumn: "_value")
    |> fill(column: "bookie1", usePrevious: true)
    |> fill(column: "bookie2", usePrevious: true)

This is because pivot performs what is essentially an outer join from a single stream of data by splitting that stream into multiple streams and joining them.
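
To see that outer-join behaviour without access to the original bucket, here is a minimal sketch that builds a few rows with array.from(). The rows, timestamps, and prices are made up for illustration; only the column names SourceId, bookie1, and bookie2 come from this thread.

```flux
import "array"

// Hypothetical sample rows: each point carries a price from only one bookie.
data =
    array.from(
        rows: [
            {_time: 2020-04-26T12:00:00Z, SourceId: "bookie1", _value: 20},
            {_time: 2020-04-26T12:05:00Z, SourceId: "bookie2", _value: 18},
            {_time: 2020-04-26T12:10:00Z, SourceId: "bookie1", _value: 21},
            {_time: 2020-04-26T12:15:00Z, SourceId: "bookie2", _value: 19},
        ],
    )

data
    // One output row per _time and one column per distinct SourceId.
    // A bookie with no point at a given _time gets null in its column.
    |> pivot(rowKey: ["_time"], columnKey: ["SourceId"], valueColumn: "_value")
    // fill(usePrevious: true) depends on row order, so sort by time first.
    |> sort(columns: ["_time"])
    // Carry the last non-null value forward, one column at a time.
    |> fill(column: "bookie1", usePrevious: true)
    |> fill(column: "bookie2", usePrevious: true)
```

Note that the first row’s bookie2 value stays null, since there is no earlier value to carry forward.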

The ideal way to do this would likely be to use a join function. We have a better join function we have worked on, but I don’t recommend it for this because it only supports inner join right now and you really need outer join. I also think this could benefit from allowing fill to operate on multiple columns or take a function to filter which rows. I’ve created an issue to address this so we can potentially make this easier.

github.com/influxdata/flux

Allow fill to accept multiple columns

opened 07:21PM - 30 Jul 20 UTC

jsternberg

enhancement community team/query

This was prompted by this community issue: https://community.influxdata.com/t/ho…w-can-i-pass-null-as-literal/14174/5

I think it would make it easier to use join or pivot if fill was able to operate on multiple columns. At the current moment, you have to write a separate fill function for each column you want filled. It might be useful to allow fill to operate on multiple columns or to accept a schema such as these:

```
|> fill(columns: ["bookie1", "bookie2"])
|> fill(fn: (col) => col.label =~ /bookie/)
```

This would prevent the need to write a new fill function for each column that you want to fill.

There’s also this existing issue to have an outer join.

github.com/influxdata/flux

add outer join

opened 10:25PM - 17 Apr 20 UTC

jacobmarble

team/query Lighthouse

I need to join against a constant, possibly incomplete data set. If incomplete, …then query results should indicate that. In this example, if `x` returns any `env` value that is unknown to `y`, then the row in `x` is omitted from the join output. `method: "outer"` would fix this by returning all data in `x`, with `null` or similar for the missing value in `y`.

```
import "csv"

alertThresholdByEnv = "
#datatype,string,long,string,long
#group,false,false,false,false
#default,,,,
,result,table,env,_value
,,0,acc,64
,,0,prod01-eu-central-1,24
,,0,prod01-us-central-1,24
,,0,prod01-us-west-2,24
,,0,stag01-us-east-4,64
,,0,stag02-us-east-1,64
,,0,stag03-us-east-1,64
,,0,toolsus1,43
"

y = csv.from(csv: alertThresholdByEnv)
x = from(bucket: "apps") ...

join(tables: {left: x, right: y}, on: ["env"])
```

Anaisdg · August 20, 2022, 1:06am

Hello @juliopereirab,
You can now perform full outer joins with the join.full() function from the join package.

Your query would look like this:

import "array"
import "join"

data =
    array.from(
        rows: [
              {_time: 2022-01-01T00:00:00Z, Price: 20, Bookmaker: "Bookie1"},
              {_time: 2022-02-01T00:00:00Z, Price: 18, Bookmaker: "Bookie2"},
              {_time: 2022-03-01T00:00:00Z, Price: 21, Bookmaker: "Bookie1"},
              {_time: 2022-04-01T00:00:00Z, Price: 20, Bookmaker: "Bookie2"},
        ],
    )


data
    |> pivot(rowKey: ["_time"], columnKey: ["Bookmaker"], valueColumn: "Price")
    // pivot names the new columns after the Bookmaker values
    |> fill(column: "Bookie1", usePrevious: true)
    |> fill(column: "Bookie2", usePrevious: true)
    |> yield(name: "solution before joins")

left =
    data
        |> filter(fn: (r) => r.Bookmaker == "Bookie1")

right =
    data
        |> filter(fn: (r) => r.Bookmaker == "Bookie2")

join.full(
    left: left,
    right: right,
    on: (l, r) => l._time == r._time,
    as: (l, r) => {
        time = if exists l._time then l._time else r._time
        return {_time: time, Bookie1: l.Price, Bookie2: r.Price}
    },
)
  |> fill(column: "Bookie1", usePrevious: true)
  |> fill(column: "Bookie2", usePrevious: true)
  |> yield(name: "solution after joins")

scott · August 26, 2022, 4:06pm

@juliopereirab Flux 0.179.0 introduced debug.null() which returns a null value of a specified type. It’s currently available in InfluxDB Cloud, InfluxDB 2.4, or InfluxDB nightly.

import "internal/debug"

debug.null(type: "string")
// Returns a null string

Anaisdg · August 30, 2022, 9:04pm

Hello @scott,
How would you suggest using it here? I don’t see it.
Thank you.

scott · August 30, 2022, 10:53pm

@Anaisdg Using the original query posted in this thread:

import "internal/debug"

from(bucket: "00000_15cb1129-1137-4a92-b761-9fac2a5d0be9")
    |> range(start: 2020-04-26T12:00:00Z, stop: 2020-04-26T15:00:00Z)
    |> filter(fn: (r) => r["SourceId"] == "bookie1" or r["SourceId"] == "bookie2")
    |> filter(fn: (r) => r["_field"] == "price-000875918750000000")
    |> group()
    |> sort(columns: ["_time"])
    |> drop(columns: ["_start", "_stop", "ChunkEnd", "ProviderId"])
    |> map(
        fn: (r) => {
            bookie1_value = if r.SourceId == "bookie1" then r._value else debug.null(type: "int")
            bookie2_value = if r.SourceId == "bookie2" then r._value else debug.null(type: "int")

            return {_time: r._time, bookie1: bookie1_value, bookie2: bookie2_value}
        },
    )
    |> fill(column: "bookie1", usePrevious: true)
    |> fill(column: "bookie2", usePrevious: true)
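
One thing to watch with this approach: debug.null() returns a null of a specific type, so the `type` argument should presumably match the type of r._value, otherwise the mapped column could end up mixing two types. The query above uses "int"; if the price field is stored as a float, "float" would likely be the right choice. Here is a self-contained sketch of that variant, with made-up rows and float prices (the data is invented for illustration and is not from the original bucket):

```flux
import "array"
import "internal/debug"

// Hypothetical sample rows with float prices, one bookie per point.
array.from(
    rows: [
        {_time: 2020-04-26T12:00:00Z, SourceId: "bookie1", _value: 2.10},
        {_time: 2020-04-26T12:05:00Z, SourceId: "bookie2", _value: 2.25},
        {_time: 2020-04-26T12:10:00Z, SourceId: "bookie1", _value: 2.05},
    ],
)
    |> map(
        fn: (r) =>
            ({
                _time: r._time,
                // Typed null as the fallback so fill() can detect the gap.
                bookie1: if r.SourceId == "bookie1" then r._value else debug.null(type: "float"),
                bookie2: if r.SourceId == "bookie2" then r._value else debug.null(type: "float"),
            }),
    )
    |> fill(column: "bookie1", usePrevious: true)
    |> fill(column: "bookie2", usePrevious: true)
```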

juliopereirab · September 1, 2022, 3:46pm

That’s great, many thanks for sharing a solution. It’s nice to have a way to produce a null that can be combined with other steps to fill the gaps.
