Filter Based on Last Known Value of Another Point

balsa · August 6, 2022, 8:31am

I have a number of users (organizations), each with their own devices (buckets), each with a number of components (points) writing into InfluxDB. In principle, I’d like to ask the question “Give me all readings of type ____ within geographic bounding box _____ from within the last ______ hours that I have permission to view.” I’m trying to learn whether this is possible within Flux, or whether I’ll need to implement a Postgres cache that periodically updates.

Suppose for example I had data that looks like:

{"point": "GPS", "field": "latitude", "value": 1, "time": "2022-01-01T01:00:00"}
{"point": "GPS", "field": "longitude", "value": 2, "time": "2022-01-01T01:00:00"}
<...snip...>
{"point": "THERMOMETER", "field": "temperature_C", "value": 3, "time": "2022-01-01T02:00:00"}

Is it possible to write a Flux query that would allow me to retrieve temperature_C where the last-known latitude and longitude was within some range?

Anaisdg · August 15, 2022, 9:40pm

Hello @balsa,
Welcome!
Yes that should be possible.
You would use a combination of the flux geo package:

With conditional filters:

And the range function:

Here’s the Flux, the query and scripting language for InfluxDB, docs:

For example I might use a query that looks something like this:

import "experimental/geo"

from(bucket: "example-bucket")
    |> range(start: -1d)
    |> filter(fn: (r) => r._measurement == "example-measurement")
    |> geo.gridFilter(region: {lat: 40.69335938, lon: -73.30078125, radius: 20.0})
    |> filter(fn: (r) => types.isType(v: r._value, type: "string") or types.isType(v: r._value, type: "bool"))
    |> yield(name: "_results")

I hope that helps! Let me know if you need some more clarity.

I used the geo.gridFilter instead of conditional filtering, but you could also use conditional flitering to specify lat and lons in between specified ranges.

balsa · August 16, 2022, 2:54am

Thank you! I will get some sample data loaded so that I can confirm that this works, and will let you know the results.

balsa · August 16, 2022, 8:18am

I tried the geo package, and what’s not clear is how I might use this result to get other measurements with the last-known latitude and longitude being within the bounding box. Here’s what I tried:

First, I used this sample data:

gps,type=gps lat=1.0,lon=1.0 1660600000000000000
measurement,type=measurement value=1 1660600000000010000
measurement,type=measurement value=2 1660600000000020000
measurement,type=measurement value=3 1660600000000030000
gps,type=gps lat=4.0,lon=4.0 1660600000000040000
measurement,type=measurement value=5 1660600000000050000
measurement,type=measurement value=6 1660600000000060000
measurement,type=measurement value=7 1660600000000070000
gps,type=gps lat=8.0,lon=8.0 1660600000000080000
measurement,type=measurement value=9 1660600000000090000

My goal is to retrieve the measurements with value=5, 6 and 7 by searching for points with the last-known GPS coordinate near 4,4: I can’t rely on the GPS coordinates being always available, nor can I rely on the GPS reading occurring at the same time as my measurements.

I tried, e.g.:

import "experimental/geo"

from(bucket: "Test Bucket")
    |> range(start: -30d)
    |> filter(fn: (r) => r._measurement == "gps")
    |> geo.shapeData(latField: "lat", lonField: "lon", level: 10)
    |> geo.gridFilter(region: {lat: 4.0, lon: 4.0, radius: 0.5})
    |> yield(name: "_results")

But, this only gives me the GPS coordinate itself, not the measurements taken when the last-known GPS coordinate was 4,4.

Maybe the thing to do is to nest queries, find the GPS coordinate’s timestamp as well as the next one following it, then return measurements within that window? Or is there a way to ask this question directly within a single Flux query?

Anaisdg · August 16, 2022, 5:24pm

Hello @balsa,

For that I would do something like:

import "experimental/geo"

time = (from(bucket: "test")
    |> range(start: -30d)
    |> filter(fn: (r) => r._measurement == "gps")
    |> geo.shapeData(latField: "lat", lonField: "lon", level: 10)
    |> geo.gridFilter(region: {lat: 4.0, lon: 4.0, radius: 0.5})
    |> findRecord(idx: 0 , fn: (key) => true))._time

from(bucket: "test")
    |> range(start: time)
    |> filter(fn: (r) => r._measurement == "measurement")

Or maybe

from(bucket: "test")
    |> range(start: -30d)
    |> group()
    |> sort(columns: ["_time"], desc: false)    
    |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")    
    |> fill(column: "lat", usePrevious: true)   
    |> fill(column: "lon", usePrevious: true)    
    |> filter(fn: (r) =>  r.lat >= 3.0 and  r.lat <= 5.0 or r.lon >= 3.0 and  r.lon <= 5.0)

balsa · August 16, 2022, 7:06pm

Thank you, @Anaisdg! That is exactly what I needed. I appreciate the help.

One final question: do you have an idea of whether there would be a substantial scaling or performance difference between these two approaches?

Anaisdg · August 16, 2022, 9:12pm

Hello @balsa,
Hmm to be honest I rally don’t know which would be more performant. I don’t have much experience with the geopackage to have an intuitive feel.
I can suggest checking out the flux profiler though:

Topic		Replies	Views
Getting data with last values of different fields Fluxlang flux	1	690	November 17, 2019
Geomap with flux InfluxDB 2 grafana	0	491	July 3, 2023
Flux: Get value by _time InfluxDB 2 influxdb , query , flux	5	1180	July 20, 2023
Filter basend on condition InfluxDB 2	1	15	March 21, 2025
Please explain If statments in Flux to me Fluxlang influxdb , flux	3	1447	August 13, 2021

Filter Based on Last Known Value of Another Point

Related topics