Tag Variable based on time

I am trying to get influx to filter tag values based on time, but the query I am using doesn’t seem to work:

  from(bucket: "my_bucket")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> keep(columns: ["mac_address"])
    |> group()
    |> distinct(column: "mac_address")

The plan was then to use this query in Grafana as a variable

Can anyone help as regardless of what time value I set it shows the same count of mac addresses

@oneofthemany Flux will only return tag values that exist in the queried data set, so it can’t guarantee that the count of mac addresses will always be the same because there may be a different number of mac addresses depending on the queried time range. With that said, I’d recommend using schema.tagValues() to get values of the mac_address tag.

import "influxdata/influxdb/schema"

schema.tagValues(
  bucket: "my_bucket",
  tag: "mac_address",
  start: -30d
)

@scott - thanks for the reply, but that also does not work

re your comment

Flux will only return tag values that exist in the queried data set, so it can’t guarantee that the count of mac addresses will always be the same

the issue that presents itself is that the list is not affected by the start: -Xd function within the schema.tagValues query

Question on the above:

Is this a known bug?

ideally the solution needs to be flexible based on the range within a given time period to offer better variable granularity

@oneofthemany I just tested, and schema.tagValues() does appear to be respecting the time range defined in the function. Here’s what I did to test:

  1. Created a test InfluxDB bucket.

  2. Used the following query to write to points to InfluxDB with different tag values at different times.

    import "array"
    import "experimental"
    
    array.from(rows: [
        {_time: now(), _measurement: "test", testTag: "t1", _field: "foo", _value: 1.0},
        {_time: experimental.subDuration(d: 40d, from: now()), _measurement: "test", testTag: "t2", _field: "foo", _value: 2.0},
    ])
        |> to(bucket: "test")
    
  3. I then queried two different time ranges – one to include one tag value, the other to return both. The tests behaved as expected:

    import "influxdata/influxdb/schema"
    
    schema.tagValues(bucket: "test", tag: "testTag", start: -30d)
    

    Returned:

    _value
    t1
    import "influxdata/influxdb/schema"
    
    schema.tagValues(bucket: "test", tag: "testTag", start: -60d)
    

    Returned:

    _value
    t1
    t2

One thing to note here is that schema.tagValues does not accept a stop parameter, so the right time boundary is always now(). However, you can create your own custom function to accept the stop time. Here’s the source definition for schema.tagValues: flux/schema.flux at master · influxdata/flux · GitHub

You can create your own using that definition:

customTagValues = (bucket, tag, predicate=(r) => true, start=-30d, stop=now()) => from(bucket: bucket)
    |> range(start: start, stop: stop)
    |> filter(fn: predicate)
    |> keep(columns: [tag])
    |> group()
    |> distinct(column: tag)

customTagValues(
    bucket: "my_bucket",
    start: v.timeRangeStart,
    stop: v.timeRangeStop,
)

Apologies for the late reply, as we have been running into several difficulties trying to auto generate the MAC addresses based on the values caught within the DB.

The easiest was for us to create a CSV variable and add that variable to the query.

It is just a shame that you can’t have a multiple MAC query within the drop down and assign each of those the "or function like you do in the script query builder:

|> filter(fn: (r) => r["mac_addr"] == 00:11:22:33:44:55 or r["mac_addr"] == aa:bb:cc:dd:ee:ff)

Are you looking to have multiple option with the v.function drop down in the UI, as I would have thought that just appending the “or” function when more than one item is selected would be accomplishable.

@oneofthemany You could instead extract the list of MAC addresses as an array and use contains() to see if it exists in the array. Also, sidenote, since you originally created this thread, the stop parameter was added to schema.tagValues(). You just need to use InfluxDB 2.2 or newer.

import "influxdata/influxdb/schema"

mac_addresses = 
    schema.tagValues(
        bucket: "my_bucket",
        tag: "mac_address",
        start: v.timeRangeStart,
        stop: v.timeRangeStop
    )
        |> findColumn(fn: (key) => true, column: "_value")

from(bucket: "example-bucket")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => contains(set: mac_addresses, value: r.mac_addr))

@scott

Will look at testing array this week - thanks for the update :slight_smile:

Can you confirm whether multi value select function from variable dropdown is being worked based on the “or” assignment to each value selected?

@oneofthemany I don’t know if multi-select variables are planned. It’d certainly be a valuable addition. You should submit a feature request on the InfluxDB UI repo.

If multi-select variables were available, it wouldn’t extend the query with OR statements in a filter() call. The way that the current dashboard variable implementation works is that each variable represents a simple assignment (x = y). The way a multi-select variable would works is that it would return an array of selected values (x = [y, z]). So you would still need to use contains() to check if a value is in the “selected” array.

For example, if you had a multi-select variable named mac_addresses, that variable would return an array of selected mac addresses and would be stored in the v record. To reference the array, you’d use v.mac_addresses:

from(bucket: "example-bucket")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => contains(set: v.mac_addresses, value: r.mac_addr))

Hey @scott : apologies for the late reply, but I have been shuffling around other things.

I have tested the above and it yields a zero value.

The query itself uses other value which should limit the value to a single mac which can be accomplished in the standard dashboard query.

The need is for something like the following:

import "influxdata/influxdb/schema"

mac_addr = 
    schema.tagValues(
        bucket: v.bucket,
        tag: "mac_addr",
        start: v.timeRangeStart,
        stop: v.timeRangeStop
    )
        |> findColumn(fn: (key) => true, column: "_value")

from(bucket: v.bucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == v.measurement)
  |> filter(fn: (r) => r["filter_1"] == v.filter_1)
  |> filter(fn: (r) => r["filter_2"] == v.filter_2)
  |> filter(fn: (r) => contains(set: mac_addr, value: r.mac))

filter_1 and filter_2 are being used to only provide a value of a single MAC address

Is this achievable based on my query above?

Hi @scott,

I was using the tagValues() function before I saw your post and figured out the reason the stop time range wasn’t working for me, so first of all, thanks.

I need to retrieve the “_time” value for each distinct tag in my database, but I cannot find the correct solution. Can you help me?

Here’s my current query:

from(bucket: "${Bucket}")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => r["tag_2"] == "${Tag2}")
    |> keep(columns: ["tag", "_time"])
    |> group()
    |> distinct(column: "tag")

If I remove the distinct, I have the “_time” in my result, but the tags repeat themselves. I tried using unique without success.

Please, if this requires a new thread or post, let me know.

Kind regards,

@maguri With the nature of tags in the InfluxDB data structure, I would expect them to repeat themselves. Do you have tags that you expect to be unique and unrepeated? If so, that data may be better suited to be stored as a field.

Do you have some sample data you could share?

The main purpose of this tag is to filter all measurements received with the specified tag_value in a query. I believe it is currently effective as it is.

This tag serves as a production batch ID, providing a reference ID for each job or batch.

I have been considering that perhaps what I need to do is execute a query over measurements and retrieve the first measurement for each distinct or unique tag.