I’m trying to effectively join a timeseries table with some reference data via a http get request.
Is it possible to either:
Join the tables? I tried joining them but I get no results. I think it’s because the table groups are different even though I specify buildingId for both tables?
import "http/requests"
import "experimental/json"
import "array"
import "join"
import "dict"
response = requests.get(url: "...")
data = json.parse(data: response.body)
statusFrequencies = array.from(rows: data)
|> map(fn: (r) => ({r with key: string(v: r.buildingId), value: r.statusFrequencySeconds}))
|> group(columns: ["buildingId"])
a = from(bucket: "live")
|> range(start: 2022-03-23T00:00:00Z, stop: 2022-03-24T00:00:01Z)
|> filter(fn: (r) => r["_measurement"] == "CONNECTION_QUALITY")
|> group(columns: ["buildingId"])
Create a dictionary out of the http response? Something like this:
import "http/requests"
import "experimental/json"
import "array"
import "join"
import "dict"
response = requests.get(url: "...")
data = json.parse(data: response.body)
statusFrequencies = data |> array.map(fn: (x) => ({key: x.buildingId, value: x.statusFrequencySeconds}))
dict.fromList(statusFrequencies)
scott
January 11, 2023, 8:08pm
2
@Rod_McCutcheon Do you have an example of the returned response body?
[
{
"buildingId":1,
"statusFrequencySeconds":60
},
{
"buildingId":4,
"statusFrequencySeconds":300
},
{
"buildingId":5,
"statusFrequencySeconds":60
},
{
"buildingId":2,
"statusFrequencySeconds":300
},
{
"buildingId":3,
"statusFrequencySeconds":300
}
]
scott
January 11, 2023, 10:03pm
4
You definitely have the right idea and could go either route (join or dictionary). There are pros and cons to either route (mainly performance, but I’m not sure which one will perform better). Joining the data is probably the simpler route. Your query above didn’t include the actual join, but it should look something like this:
import "http/requests"
import "experimental/json"
import "array"
import "join"
response = requests.get(url: "...")
data = json.parse(data: response.body)
statusFrequencies = array.from(rows: data)
|> group(columns: ["buildingId"])
a = from(bucket: "live")
|> range(start: 2022-03-23T00:00:00Z, stop: 2022-03-24T00:00:01Z)
|> filter(fn: (r) => r["_measurement"] == "CONNECTION_QUALITY")
|> group(columns: ["buildingId"])
join.inner(
left: a,
right: statusFrequencies,
on: (l, r) => l.buildingId == r.buildingId,
as: (l, r) => ({l with statusFrequency: r.statusFrequencySeconds}),
)
Thanks Scott, I’m still not seeing any results for the join. Is it because the _start and _stop fields are also part of the group? How do I fix that?
I ended up solving this by converting the array to a dictionary, and then doing a lookup.
import "experimental"
import "date"
import "http/requests"
import "experimental"
import "experimental/json"
import "experimental/array"
import "array"
import "dict"
response = requests.get(url: "...")
data = json.parse(data: response.body)
a = array.from(rows: data)
b = array.map(arr: data, fn: (x) => ({ key: float(v: x.buildingId), value: x.statusFrequencySeconds }))
statusFrequencies = dict.fromList(pairs: b)
from(bucket: "live")
|> range(start: -5m)
|> filter(fn: (r) => r["_measurement"] == "LIGHT_LEVEL")
|> filter(fn: (r) => r["_field"] == "value")
|> aggregateWindow(every: 5m, fn: count, createEmpty: true)
|> toFloat()
|> map(fn: (r) => ({r with _value: r._value / dict.get(dict: statusFrequencies, key: float(v: r.buildingId), default: 1.0) / 60.0 * 255.0}))
|> map(fn: (r) => ({r with _time: experimental.addDuration(d: -5m, to: r._time)}))
|> map(fn: (r) => ({r with _measurement: "CONNECTION_QUALITY"}))
|> filter(fn: (r) => r._value > 0.0)
|> to(bucket: "five_minutes")
I had the same issue. |> drop(columns: [“_start”, “_stop”]) worked for me. So, yes, the problem is that _start and _stop are part of the group.