I’ve got a more or less constant stream of data coming into a measurement (“loadbalancing_member_events”) that I’m picking and choosing from, grouping by a couple of dimensions, averaging across a particular dimension, and storing back into Influx as a new measurement. I’m frequently getting gaps in the output, up to a dozen times an hour. Here’s the relevant part of the TICKscript:
var flaps = stream
    |from()
        .database('environment')
        .retentionPolicy('autogen')
        .measurement('loadbalancing_member_events')
        // use only down events (going down then up is a full flap; don't count halves)
        .where(lambda: "event" == 'readiness change' AND "transition" == 'down')
    |delete()
        .field('text')
        .tag('transition')
    |window()
        .period(5m)
        .every(10s)
    |groupBy(['pool', 'reporting_lb'])
    |count('event')
    |groupBy(['pool'])
    |mean('count')

// flaps|httpOut('flaps')

flaps
    |influxDBOut()
        .create()
        .database('environment')
        .retentionPolicy('autogen')
        .measurement('loadbalancing_flap_count_mean_across_lbs')
And here’s a visualization showing the gaps:
And here’s the output from the influx CLI, looking at a particular pool:
> select * from loadbalancing_flap_count_mean_across_lbs where pool = '/Common/pool_one' and time >= '2017-11-08 08:19:30' and time < '2017-11-08 08:20:30'
name: loadbalancing_flap_count_mean_across_lbs
time                 mean pool
----                 ---- ----
2017-11-08T08:19:39Z 18.8 /Common/pool_one
2017-11-08T08:19:49Z 18.6 /Common/pool_one
2017-11-08T08:19:59Z 16.8 /Common/pool_one
2017-11-08T08:20:10Z 16   /Common/pool_one
2017-11-08T08:20:20Z 14.8 /Common/pool_one
Oh, wait… Now I see it. Ha.
Looks like there’s some kind of slowly accreting lag going on that just pushed us over a threshold. Viewing without -precision rfc3339 on the CLI, I can see the points are roughly as evenly spaced as always, but the timestamp has just rolled over into the next unit (ms -> s), so one point lands at :10 instead of :09 and the visualization acts wonky.
Would align() solve this?
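For reference, here’s a minimal sketch of what I have in mind, showing only the window part of the script above. As far as I understand it, the window node’s align() property truncates window edges to multiples of the every() interval instead of starting from the first point received. I’m assuming (not verified) that the timestamps written out by influxDBOut() follow those aligned window edges, which would put every output point on a clean 10s boundary:

```
var flaps = stream
    |from()
        .database('environment')
        .retentionPolicy('autogen')
        .measurement('loadbalancing_member_events')
        .where(lambda: "event" == 'readiness change' AND "transition" == 'down')
    |window()
        .period(5m)
        .every(10s)
        // Assumption: aligning window edges to the 10s `every` interval
        // keeps emitted timestamps from drifting across second boundaries.
        .align()
    |groupBy(['pool', 'reporting_lb'])
    |count('event')
    |groupBy(['pool'])
    |mean('count')
```

If that assumption holds, the output times would read :00, :10, :20 and so on rather than :39, :49, :59, :10.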