I’ve tried to find a solution to the join() performance issue (see the topic linked above) and tried:
import "experimental"
offmst = from(bucket: "dca")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "TfcMonitor" and
r["host"] == "cbmin00y" and
r["_field"] == "offabs")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
|> keep(columns: ["_time","_value"])
offend = from(bucket: "dca")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "TfcMonitor" and
r["host"] != "cbmin00y" and
r["_field"] == "offabs")
|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
|> keep(columns: ["_time","_value", "host", "oid"])
|> group()
|> sort(columns: ["_time"])
experimental.join(left:offmst, right:offend,
fn: (left, right) => ({
left with
host: right.host,
oid: right.oid,
_value: right._value - left._value
}))
|> group(columns: ["host", "oid"], mode:"by")
|> sort(columns: ["_time"])
|> yield()
which should be equivalent to the conventional join() version:
offmst = from(bucket: "dca")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => r["_measurement"] == "TfcMonitor" and
                         r["host"] == "cbmin00y" and
                         r["_field"] == "offabs")
    |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
    |> keep(columns: ["_time", "_value"])
    |> rename(columns: {_value: "off_mst"})

offend = from(bucket: "dca")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => r["_measurement"] == "TfcMonitor" and
                         r["host"] != "cbmin00y" and
                         r["_field"] == "offabs")
    |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
    |> keep(columns: ["_time", "_value", "host", "oid"])
    |> group()
    |> sort(columns: ["_time"])
    |> rename(columns: {_value: "off_end"})

join(tables: {mst: offmst, end: offend}, on: ["_time"])
    |> map(fn: (r) => ({ r with _value: r.off_end - r.off_mst }))
    |> drop(columns: ["off_mst", "off_end"])
    |> group(columns: ["host", "oid"], mode: "by")
    |> sort(columns: ["_time"])
    |> yield()
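As an aside: newer Flux versions (0.172 and later, i.e. recent InfluxDB 2.x) ship a standalone join package with join.time(), which joins on _time only and, as far as I understand, supersedes experimental.join(). A minimal sketch, reusing the offmst and offend definitions from the experimental.join() example above (before the rename); I have not benchmarked this variant on my data:

import "join"

// join rows with equal _time; the "as" function builds the output row,
// playing the same role as the fn of experimental.join above
join.time(left: offmst, right: offend, method: "inner",
    as: (l, r) => ({
        l with
        host: r.host,
        oid: r.oid,
        _value: r._value - l._value
    }))
    |> group(columns: ["host", "oid"], mode: "by")
    |> sort(columns: ["_time"])
    |> yield()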
For modest window sizes experimental.join() seems to give the correct result and shows good performance:
Window  #rec  cpu_tot  mem_tot  cpu_join
10m     2017  0.423    983808   0.253
20m     1009  0.244    475904   0.122
60m      337  0.159    182848   0.065
So CPU time and memory consumption now grow linearly with the row count, no longer quadratically.
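For anyone who wants to reproduce such measurements, one way to get per-query CPU and memory statistics is Flux's profiler package, which adds its results as extra tables to the query output (this is just a pointer, not necessarily how the numbers above were obtained):

import "profiler"

// emit query-level statistics (durations, allocated memory) and
// per-operator statistics as additional tables in the result
option profiler.enabledProfilers = ["query", "operator"]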
But for larger window sizes I get a panic: unknown type invalid; the full message from syslog is attached.
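If the panic turns out to be triggered by the null rows that createEmpty: true produces (that is only a guess on my part), one thing I still want to try is dropping the empty windows again before the join, e.g. for offmst:

offmst = from(bucket: "dca")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => r["_measurement"] == "TfcMonitor" and
                         r["host"] == "cbmin00y" and
                         r["_field"] == "offabs")
    |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: true)
    // remove windows without data so no null _value reaches the join
    |> filter(fn: (r) => exists r._value)
    |> keep(columns: ["_time", "_value"])

(or simply createEmpty: false, at the cost of losing the empty windows). If that makes the panic go away, it would at least narrow down the cause.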
Is that a known bug?
Have others run into it?
Or should I file a GitHub issue for this?
exp_join_panic.txt (4.4 KB)