## Overview
I'm trying to leverage Kapacitor 1.5.7 on Linux/amd64 for context-aware traffic alerting in our multi-tenant commerce system: watching our ingress points for traffic spikes and the like. Typically this means collecting data on two (optionally three; in this example the third field is sent but not grouped on) fields:
1. The IP of the requester
2. The ID of the store for which the request is destined
3. (optionally) The URI
We collect and report on the data in one- to two-minute windows and don't care about any data outside that window; if a point is more than two minutes old, it should be expired and expunged. Kapacitor runs as a standalone Influx product for this use case: there is no InfluxDB instance for this data, no retention policies, etc. Data is transmitted to Kapacitor via the UDP listener.
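For completeness, ingestion uses Kapacitor's stock UDP input. A minimal sketch of the relevant `kapacitor.conf` section (the bind address below is a placeholder, not our real port):
```
[[udp]]
  enabled = true
  bind-address = ":9100"   # placeholder port
  database = "toptraffic"
  retention-policy = "autogen"
```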
The format of the message is the following:
```
combined,uri=/path/to/some/product,id=123456789,ip=127.0.0.1,role=ingresstype count=1
```
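Each request emits exactly one such point. For illustration, the producer side amounts to little more than something like this (a hypothetical Go sketch; the address is a placeholder, not our actual producer):
```
package main

import (
	"fmt"
	"net"
)

func main() {
	// Placeholder address for Kapacitor's UDP listener.
	conn, err := net.Dial("udp", "127.0.0.1:9100")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// One line-protocol point per request, always carrying count=1.
	point := fmt.Sprintf("combined,uri=%s,id=%s,ip=%s,role=%s count=1",
		"/path/to/some/product", "123456789", "127.0.0.1", "ingresstype")
	if _, err := conn.Write([]byte(point)); err != nil {
		panic(err)
	}
}
```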
As the singled-out stream stats later in this issue show, the cardinality of each of the aforementioned fields is roughly:
- `uri` - unknown, medium
- `id` - 20-30k within a minute window, avg
- `ip` - 40-60k within a minute window, avg
- `role` - at most 3
The `count` field exists so that we can run a sum over the data in the pipe. It is arguably redundant, though, because each request always generates exactly one message, one point.
We found Kapacitor struggling with unbounded memory growth in our Production systems, something we did not observe in other (non-live-traffic) environments. Our initial response to these uncontrollable runaway memory situations was to examine and reduce the cardinality of our sets, particularly the group-by operations on streams. We initially reported on the IP address, the store ID, and the URI together; these are all relatively high-cardinality fields, and putting them all in an ordered group-by wasn't helping memory stay bounded. So we pared things back to the following TICKscript, where `uri` is dropped from the equation:
```
dbrp "toptraffic"."autogen"
var streamCounts = stream
|from()
.groupBy('ip', 'id')
.measurement('combined')
|barrier()
.period(1m)
.delete(TRUE)
|window()
.period(1m)
.every(5s)
.align()
|sum('count')
.as('totalCount')
streamCounts
|alert()
.flapping(0.25, 0.5)
.history(21)
.warn(lambda: "totalCount" > 17500)
.crit(lambda: "totalCount" > 22500)
.message('''Observed {{ index .Fields "totalCount" }} requests to Production Store ID {{ index .Tags "id" }} for IP {{ index .Tags "ip" }} within the last minute.''')
.noRecoveries()
.stateChangesOnly(5m)
.slack()
.channel('#ops-noise')
streamCounts
|alert()
.flapping(0.25, 0.5)
.history(21)
.warn(lambda: "totalCount" > 17500)
.crit(lambda: "totalCount" > 22500)
.message('''Observed {{ index .Fields "totalCount" }} requests to Production Store ID {{ index .Tags "id" }} for IP {{ index .Tags "ip" }} within the last minute.''')
.stateChangesOnly(5m)
.exec('/usr/bin/kapacitor_pubsub_stdin_invoker.sh')
.log('/var/log/kapacitor/alerts.log')
```
The script is straightforward enough: we group the stream by `ip`, then `id`, from the `combined` measurement, and a barrier deletes each group's data after one minute. These operations are assigned to a stream variable that is used by two alerts to do different things (at the same thresholds). Note that with `period(1m)` and `every(5s)`, each emitted window covers the trailing minute, so any given point participates in up to 12 successive emissions.
The DOT graph and sample stats of that TICKscript while running render as:
```
DOT:
digraph top_combined {
graph [throughput="14278.31 points/s"];
stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="518885257"];
from1 [avg_exec_time_ns="43.931µs" errors="0" working_cardinality="0" ];
from1 -> barrier2 [processed="518885257"];
barrier2 [avg_exec_time_ns="65.018µs" errors="0" working_cardinality="93843" ];
barrier2 -> window3 [processed="518865870"];
window3 [avg_exec_time_ns="107.889µs" errors="0" working_cardinality="93843" ];
window3 -> sum4 [processed="145863177"];
sum4 [avg_exec_time_ns="148.928µs" errors="0" working_cardinality="33327" ];
sum4 -> alert6 [processed="145863177"];
sum4 -> alert5 [processed="145863177"];
alert6 [alerts_inhibited="0" alerts_triggered="0" avg_exec_time_ns="53.506µs" crits_triggered="0" errors="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" working_cardinality="33327" ];
alert5 [alerts_inhibited="0" alerts_triggered="0" avg_exec_time_ns="63.915µs" crits_triggered="0" errors="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" working_cardinality="33327" ];
}
```
As previously mentioned, what we saw with this is that over time (pretty quickly) we ran out of memory. As the following graph shows, various tweaks to the script, changing things like the window and barrier periods, didn't seem to make any difference to how fast the script/pipeline/Kapacitor consumed memory.
<img width="1662" alt="Screen Shot 2021-01-27 at 3 25 00 PM" src="https://user-images.githubusercontent.com/532881/106067584-df7f3380-60b3-11eb-896b-cdc702dfdd3f.png">
The various spikes in memory correspond to me altering the TICKscript: removing the window, removing the barrier, changing the barrier from idle to period, changing the barrier/window durations, etc. During these iterations I collected data. The data below is from the TICKscript above, with only the window and barrier periods changed.
Heap dumps show the following for `in use objects`:
```
go tool pprof -inuse_objects --text kapacitord top_combined/heap\?debug=1
File: kapacitord
Type: inuse_objects
Showing nodes accounting for 255642703, 97.39% of 262504092 total
Dropped 98 nodes (cum <= 1312520)
flat flat% sum% cum cum%
47093992 17.94% 17.94% 47094003 17.94% time.NewTicker
35101126 13.37% 31.31% 35101126 13.37% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.Tags.Map
34180645 13.02% 44.33% 34180645 13.02% github.com/influxdata/kapacitor/edge.(*pointMessage).GroupInfo
31006029 11.81% 56.14% 78231108 29.80% github.com/influxdata/kapacitor.newPeriodicBarrier
17670196 6.73% 62.88% 17670196 6.73% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).unmarshalBinary
15859954 6.04% 68.92% 94091062 35.84% github.com/influxdata/kapacitor.(*BarrierNode).newBarrier
15840535 6.03% 74.95% 15840535 6.03% github.com/influxdata/kapacitor.(*periodicBarrier).emitBarrier
15281270 5.82% 80.77% 22687063 8.64% github.com/influxdata/kapacitor/models.ToGroupID
11141290 4.24% 85.02% 11141290 4.24% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Name
7405793 2.82% 87.84% 7405793 2.82% strings.(*Builder).WriteRune
5028186 1.92% 89.75% 5028186 1.92% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/influxql.encodeTags
3452307 1.32% 91.07% 8480493 3.23% github.com/influxdata/kapacitor.convertFloatPoint
1977093 0.75% 91.82% 3763058 1.43% net.(*UDPConn).readFrom
1835036 0.7% 92.52% 4403938 1.68% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePointsWithPrecision
1818726 0.69% 93.21% 5581784 2.13% github.com/influxdata/kapacitor/services/udp.(*Service).serve
1785965 0.68% 93.89% 1785965 0.68% syscall.anyToSockaddr
1766316 0.67% 94.57% 2568902 0.98% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parsePoint
1758629 0.67% 95.24% 1758629 0.67% github.com/influxdata/kapacitor/edge.BatchPointFromPoint
1729659 0.66% 95.90% 1942655 0.74% github.com/influxdata/kapacitor/edge.NewPointMessage
1682284 0.64% 96.54% 1682284 0.64% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parseTags
769871 0.29% 96.83% 1687514 0.64% github.com/influxdata/kapacitor/edge.(*statsEdge).incCollected
766201 0.29% 97.12% 2120657 0.81% github.com/influxdata/kapacitor.(*AlertNode).renderID
163847 0.062% 97.19% 1429628 0.54% github.com/influxdata/kapacitor.(*AlertNode).NewGroup
158393 0.06% 97.25% 15998928 6.09% github.com/influxdata/kapacitor.(*periodicBarrier).periodicEmitter
103541 0.039% 97.28% 1862170 0.71% github.com/influxdata/kapacitor.(*windowTimeBuffer).points
98309 0.037% 97.32% 2690768 1.03% github.com/influxdata/kapacitor.(*windowByTime).batch
85587 0.033% 97.35% 97154573 37.01% github.com/influxdata/kapacitor/edge.(*groupedConsumer).getOrCreateGroup
81923 0.031% 97.39% 94522520 36.01% github.com/influxdata/kapacitor.(*BarrierNode).NewGroup
0 0% 97.39% 3412833 1.30% github.com/influxdata/kapacitor.(*AlertNode).runAlert
0 0% 97.39% 126742493 48.28% github.com/influxdata/kapacitor.(*BarrierNode).runBarrierEmitter
0 0% 97.39% 23164149 8.82% github.com/influxdata/kapacitor.(*FromNode).Point
0 0% 97.39% 23619547 9.00% github.com/influxdata/kapacitor.(*FromNode).runStream
0 0% 97.39% 10632284 4.05% github.com/influxdata/kapacitor.(*InfluxQLNode).runInfluxQL
0 0% 97.39% 67208031 25.60% github.com/influxdata/kapacitor.(*TaskMaster).WritePoints
0 0% 97.39% 4753986 1.81% github.com/influxdata/kapacitor.(*WindowNode).runWindow
0 0% 97.39% 1420015 0.54% github.com/influxdata/kapacitor.(*alertState).Point
0 0% 97.39% 8480493 3.23% github.com/influxdata/kapacitor.(*floatPointAggregator).AggregatePoint
0 0% 97.39% 8824603 3.36% github.com/influxdata/kapacitor.(*influxqlGroup).BatchPoint
0 0% 97.39% 169161143 64.44% github.com/influxdata/kapacitor.(*node).start.func1
0 0% 97.39% 2230930 0.85% github.com/influxdata/kapacitor.(*windowByTime).Point
0 0% 97.39% 169160555 64.44% github.com/influxdata/kapacitor/edge.(*consumer).Consume
0 0% 97.39% 8824603 3.36% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).BatchPoint
0 0% 97.39% 1360800 0.52% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).EndBatch
0 0% 97.39% 27898115 10.63% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).Point
0 0% 97.39% 1687514 0.64% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).forward
0 0% 97.39% 10632284 4.05% github.com/influxdata/kapacitor/edge.(*groupedConsumer).BufferedBatch
0 0% 97.39% 145541008 55.44% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Consume
0 0% 97.39% 134249262 51.14% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Point
0 0% 97.39% 22129995 8.43% github.com/influxdata/kapacitor/edge.(*pointMessage).SetDimensions
0 0% 97.39% 1623485 0.62% github.com/influxdata/kapacitor/edge.(*streamStatsEdge).Collect
0 0% 97.39% 8824603 3.36% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).BatchPoint
0 0% 97.39% 26815094 10.22% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).Point
0 0% 97.39% 10632284 4.05% github.com/influxdata/kapacitor/edge.receiveBufferedBatch
0 0% 97.39% 71611969 27.28% github.com/influxdata/kapacitor/services/udp.(*Service).processPackets
0 0% 97.39% 17670196 6.73% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Fields
0 0% 97.39% 1682284 0.64% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Tags
0 0% 97.39% 4403938 1.68% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePoints
0 0% 97.39% 1785965 0.68% internal/poll.(*FD).ReadFrom
0 0% 97.39% 3763058 1.43% net.(*UDPConn).ReadFromUDP
0 0% 97.39% 1785965 0.68% net.(*netFD).readFrom
0 0% 97.39% 262393459 100% runtime.goexit
0 0% 97.39% 1785965 0.68% syscall.Recvfrom
```
and for `in use space`:
```
go tool pprof --text kapacitord top_combined/heap\?debug=1
File: kapacitord
Type: inuse_space
Showing nodes accounting for 19557.32MB, 97.33% of 20093.60MB total
Dropped 99 nodes (cum <= 100.47MB)
flat flat% sum% cum cum%
5447.82MB 27.11% 27.11% 5447.82MB 27.11% github.com/influxdata/kapacitor/edge.(*pointMessage).GroupInfo
4027.07MB 20.04% 47.15% 7133.04MB 35.50% github.com/influxdata/kapacitor.newPeriodicBarrier
3100.24MB 15.43% 62.58% 3101.97MB 15.44% time.NewTicker
1270.70MB 6.32% 68.91% 1270.70MB 6.32% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.Tags.Map
964.19MB 4.80% 73.71% 964.19MB 4.80% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).unmarshalBinary
886.50MB 4.41% 78.12% 8272.54MB 41.17% github.com/influxdata/kapacitor.(*BarrierNode).NewGroup
485.13MB 2.41% 80.53% 546.14MB 2.72% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parsePoint
485.01MB 2.41% 82.95% 485.01MB 2.41% github.com/influxdata/kapacitor.(*periodicBarrier).emitBarrier
460.51MB 2.29% 85.24% 686.52MB 3.42% github.com/influxdata/kapacitor/models.ToGroupID
305.03MB 1.52% 86.75% 602.55MB 3.00% github.com/influxdata/kapacitor.convertFloatPoint
297.52MB 1.48% 88.24% 297.52MB 1.48% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/influxql.encodeTags
242MB 1.20% 89.44% 7375.04MB 36.70% github.com/influxdata/kapacitor.(*BarrierNode).newBarrier
237.53MB 1.18% 90.62% 242.53MB 1.21% github.com/influxdata/kapacitor/edge.NewPointMessage
226.01MB 1.12% 91.75% 226.01MB 1.12% strings.(*Builder).WriteRune
194.03MB 0.97% 92.71% 194.03MB 0.97% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parseTags
170MB 0.85% 93.56% 170MB 0.85% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Name
142.02MB 0.71% 94.27% 142.02MB 0.71% github.com/influxdata/kapacitor/edge.(*pointMessage).ShallowCopy
119.01MB 0.59% 94.86% 318.52MB 1.59% github.com/influxdata/kapacitor/services/udp.(*Service).serve
109.01MB 0.54% 95.40% 109.01MB 0.54% syscall.anyToSockaddr
97.69MB 0.49% 95.89% 239.72MB 1.19% github.com/influxdata/kapacitor/edge.(*statsEdge).incCollected
90.50MB 0.45% 96.34% 199.51MB 0.99% net.(*UDPConn).readFrom
56.50MB 0.28% 96.62% 106.51MB 0.53% github.com/influxdata/kapacitor.(*AlertNode).renderID
55.12MB 0.27% 96.89% 8495.18MB 42.28% github.com/influxdata/kapacitor/edge.(*groupedConsumer).getOrCreateGroup
32.18MB 0.16% 97.05% 112.69MB 0.56% github.com/influxdata/kapacitor.(*windowTimeBuffer).points
28MB 0.14% 97.19% 574.14MB 2.86% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePointsWithPrecision
14.50MB 0.072% 97.26% 499.52MB 2.49% github.com/influxdata/kapacitor.(*periodicBarrier).periodicEmitter
7.50MB 0.037% 97.30% 101.01MB 0.5% github.com/influxdata/kapacitor.(*AlertNode).NewGroup
6MB 0.03% 97.33% 152.19MB 0.76% github.com/influxdata/kapacitor.(*windowByTime).batch
0 0% 97.33% 271.99MB 1.35% github.com/influxdata/kapacitor.(*AlertNode).runAlert
0 0% 97.33% 13426.72MB 66.82% github.com/influxdata/kapacitor.(*BarrierNode).runBarrierEmitter
0 0% 97.33% 815.04MB 4.06% github.com/influxdata/kapacitor.(*FromNode).Point
0 0% 97.33% 884.96MB 4.40% github.com/influxdata/kapacitor.(*FromNode).runStream
0 0% 97.33% 840.02MB 4.18% github.com/influxdata/kapacitor.(*InfluxQLNode).runInfluxQL
0 0% 97.33% 2820.44MB 14.04% github.com/influxdata/kapacitor.(*TaskMaster).WritePoints
0 0% 97.33% 411.47MB 2.05% github.com/influxdata/kapacitor.(*WindowNode).runWindow
0 0% 97.33% 602.55MB 3.00% github.com/influxdata/kapacitor.(*floatPointAggregator).AggregatePoint
0 0% 97.33% 649.56MB 3.23% github.com/influxdata/kapacitor.(*influxqlGroup).BatchPoint
0 0% 97.33% 15835.15MB 78.81% github.com/influxdata/kapacitor.(*node).start.func1
0 0% 97.33% 150.93MB 0.75% github.com/influxdata/kapacitor.(*windowByTime).Point
0 0% 97.33% 15829.73MB 78.78% github.com/influxdata/kapacitor/edge.(*consumer).Consume
0 0% 97.33% 649.56MB 3.23% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).BatchPoint
0 0% 97.33% 153.94MB 0.77% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).EndBatch
0 0% 97.33% 1192.77MB 5.94% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).Point
0 0% 97.33% 239.72MB 1.19% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).forward
0 0% 97.33% 840.02MB 4.18% github.com/influxdata/kapacitor/edge.(*groupedConsumer).BufferedBatch
0 0% 97.33% 14944.77MB 74.38% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Consume
0 0% 97.33% 14072.25MB 70.03% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Point
0 0% 97.33% 673.02MB 3.35% github.com/influxdata/kapacitor/edge.(*pointMessage).SetDimensions
0 0% 97.33% 230.77MB 1.15% github.com/influxdata/kapacitor/edge.(*streamStatsEdge).Collect
0 0% 97.33% 649.56MB 3.23% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).BatchPoint
0 0% 97.33% 1037.97MB 5.17% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).Point
0 0% 97.33% 840.02MB 4.18% github.com/influxdata/kapacitor/edge.receiveBufferedBatch
0 0% 97.33% 3394.58MB 16.89% github.com/influxdata/kapacitor/services/udp.(*Service).processPackets
0 0% 97.33% 964.19MB 4.80% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Fields
0 0% 97.33% 194.03MB 0.97% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Tags
0 0% 97.33% 574.14MB 2.86% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePoints
0 0% 97.33% 109.01MB 0.54% internal/poll.(*FD).ReadFrom
0 0% 97.33% 199.51MB 0.99% net.(*UDPConn).ReadFromUDP
0 0% 97.33% 109.01MB 0.54% net.(*netFD).readFrom
0 0% 97.33% 20051.80MB 99.79% runtime.goexit
0 0% 97.33% 109.01MB 0.54% syscall.Recvfrom
```
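What stands out in both dumps is that `time.NewTicker` and `newPeriodicBarrier` dominate retained objects and space: tens of millions of live tickers, gigabytes of them. My working theory (an assumption from reading the profiles, not confirmed against Kapacitor's code) is that each group's periodic barrier owns a ticker that is never stopped when the group is deleted. In Go, an unstopped ticker remains reachable from the runtime's timer heap and is never garbage collected, so a standalone sketch of that failure mode looks like:
```
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	// Simulate one periodic barrier per group: each group gets its own
	// ticker, and (hypothetically) nothing calls Stop() when the group
	// is deleted.
	for i := 0; i < 100000; i++ {
		t := time.NewTicker(time.Minute)
		_ = t // reference dropped without calling t.Stop()
	}

	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// The tickers stay reachable via the runtime timer heap, so the heap
	// remains large even after a forced GC; calling t.Stop() before
	// dropping the reference would release them.
	fmt.Printf("heap in use after GC: %d MB\n", m.HeapInuse/(1<<20))
}
```
If that theory holds, group churn (new IPs arriving every minute) would leak one ticker per retired group even while `working_cardinality` stays bounded.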
A profile dump at roughly the same time shows:
```
go tool pprof --text kapacitord top_combined/profile
File: kapacitord
Type: cpu
Time: Jan 25, 2021 at 7:32pm (PST)
Duration: 30.17s, Total samples = 43.72s (144.91%)
Showing nodes accounting for 34.25s, 78.34% of 43.72s total
Dropped 359 nodes (cum <= 0.22s)
flat flat% sum% cum cum%
3.20s 7.32% 7.32% 3.60s 8.23% syscall.Syscall6
2.96s 6.77% 14.09% 2.96s 6.77% runtime.futex
2.45s 5.60% 19.69% 2.45s 5.60% runtime.epollwait
1.87s 4.28% 23.97% 1.87s 4.28% runtime.usleep
1.54s 3.52% 27.49% 2.13s 4.87% runtime.mapaccess2_faststr
1.30s 2.97% 30.47% 6.20s 14.18% runtime.mallocgc
1.28s 2.93% 33.39% 1.28s 2.93% runtime.nextFreeFast
0.94s 2.15% 35.54% 1.18s 2.70% runtime.heapBitsSetType
0.94s 2.15% 37.69% 0.94s 2.15% runtime.memclrNoHeapPointers
0.88s 2.01% 39.71% 0.88s 2.01% runtime.memmove
0.86s 1.97% 41.67% 0.91s 2.08% runtime.lock
0.80s 1.83% 43.50% 3.64s 8.33% runtime.selectgo
0.69s 1.58% 45.08% 0.69s 1.58% memeqbody
0.64s 1.46% 46.55% 0.67s 1.53% runtime.unlock
0.60s 1.37% 47.92% 0.61s 1.40% runtime.(*itabTableType).find
0.50s 1.14% 49.06% 6.86s 15.69% runtime.findrunnable
0.46s 1.05% 50.11% 0.46s 1.05% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.scanLine
0.45s 1.03% 51.14% 0.46s 1.05% time.now
0.42s 0.96% 52.10% 1.62s 3.71% runtime.mapassign_faststr
0.37s 0.85% 52.95% 0.37s 0.85% aeshashbody
0.33s 0.75% 53.71% 0.33s 0.75% github.com/influxdata/kapacitor/edge.(*pointMessage).Fields
0.33s 0.75% 54.46% 0.33s 0.75% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.scanTo
0.32s 0.73% 55.19% 0.38s 0.87% runtime.mapiternext
0.29s 0.66% 55.86% 0.29s 0.66% runtime.(*waitq).dequeue
0.26s 0.59% 56.45% 3.93s 8.99% runtime.newobject
0.24s 0.55% 57.00% 3.44s 7.87% github.com/influxdata/kapacitor/edge.(*streamStatsEdge).Collect
0.24s 0.55% 57.55% 1.06s 2.42% runtime.(*mcentral).cacheSpan
0.24s 0.55% 58.10% 0.24s 0.55% runtime.casgstatus
0.24s 0.55% 58.65% 0.24s 0.55% sync.(*RWMutex).RLock
0.23s 0.53% 59.17% 0.23s 0.53% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.scanTagsValue
0.23s 0.53% 59.70% 0.84s 1.92% runtime.getitab
0.23s 0.53% 60.22% 1.92s 4.39% runtime.runqgrab
0.22s 0.5% 60.73% 5.38s 12.31% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).Point
0.21s 0.48% 61.21% 1.80s 4.12% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).forward
0.21s 0.48% 61.69% 0.52s 1.19% runtime.mapaccess1
0.21s 0.48% 62.17% 2.73s 6.24% runtime.netpoll
0.20s 0.46% 62.63% 3.81s 8.71% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).Point
0.20s 0.46% 63.08% 1.56s 3.57% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/influxql.encodeTags
0.20s 0.46% 63.54% 0.27s 0.62% sync.(*RWMutex).Unlock
0.19s 0.43% 63.98% 0.27s 0.62% runtime.mapaccess1_faststr
0.18s 0.41% 64.39% 1.44s 3.29% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.scanKey
0.18s 0.41% 64.80% 0.97s 2.22% runtime.sellock
0.15s 0.34% 65.14% 3s 6.86% github.com/influxdata/kapacitor.convertFloatPoint
0.15s 0.34% 65.48% 1.17s 2.68% github.com/influxdata/kapacitor/edge.(*statsEdge).incCollected
0.15s 0.34% 65.83% 0.77s 1.76% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parseTags
0.14s 0.32% 66.15% 0.37s 0.85% syscall.anyToSockaddr
0.13s 0.3% 66.45% 0.72s 1.65% github.com/influxdata/kapacitor/edge.(*pointMessage).GroupInfo
0.13s 0.3% 66.74% 3.07s 7.02% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePointsWithPrecision
0.13s 0.3% 67.04% 0.71s 1.62% runtime.assertI2I2
0.13s 0.3% 67.34% 0.92s 2.10% runtime.slicebytetostring
0.12s 0.27% 67.61% 0.54s 1.24% github.com/influxdata/kapacitor/edge.NewPointMessage
0.12s 0.27% 67.89% 1.54s 3.52% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.Tags.Map
0.12s 0.27% 68.16% 0.42s 0.96% runtime.makemap
0.12s 0.27% 68.44% 0.40s 0.91% runtime.mapiterinit
0.12s 0.27% 68.71% 0.33s 0.75% runtime.memhash
0.11s 0.25% 68.96% 5.96s 13.63% github.com/influxdata/kapacitor.(*TaskMaster).WritePoints
0.11s 0.25% 69.21% 1.52s 3.48% github.com/influxdata/kapacitor/edge.(*channelEdge).Emit
0.10s 0.23% 69.44% 2.22s 5.08% github.com/influxdata/kapacitor/edge.(*streamStatsEdge).Emit
0.10s 0.23% 69.67% 0.33s 0.75% github.com/influxdata/kapacitor/expvar.(*Map).Add
0.10s 0.23% 69.90% 9.62s 22.00% github.com/influxdata/kapacitor/services/udp.(*Service).processPackets
0.10s 0.23% 70.13% 0.42s 0.96% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.scanTags
0.09s 0.21% 70.33% 6.89s 15.76% github.com/influxdata/kapacitor/services/udp.(*Service).serve
0.09s 0.21% 70.54% 5.10s 11.67% net.(*UDPConn).readFrom
0.09s 0.21% 70.75% 7.64s 17.47% runtime.schedule
0.08s 0.18% 70.93% 1.28s 2.93% github.com/influxdata/kapacitor.(*streamEdge).CollectPoint
0.08s 0.18% 71.11% 6.38s 14.59% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Point
0.08s 0.18% 71.29% 0.34s 0.78% github.com/influxdata/kapacitor/edge.(*statsEdge).incEmitted
0.08s 0.18% 71.48% 0.45s 1.03% github.com/influxdata/kapacitor/models.ToGroupID
0.08s 0.18% 71.66% 0.25s 0.57% runtime.chanrecv
0.08s 0.18% 71.84% 1.05s 2.40% runtime.chansend
0.08s 0.18% 72.03% 1.56s 3.57% runtime.makeslice
0.07s 0.16% 72.19% 1.13s 2.58% github.com/influxdata/kapacitor.(*windowByTime).Point
0.07s 0.16% 72.35% 14.28s 32.66% github.com/influxdata/kapacitor/edge.(*consumer).Consume
0.07s 0.16% 72.51% 0.24s 0.55% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Next
0.07s 0.16% 72.67% 0.84s 1.92% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Tags
0.07s 0.16% 72.83% 0.24s 0.55% math/rand.(*Rand).Int63
0.06s 0.14% 72.96% 0.70s 1.60% github.com/influxdata/kapacitor.(*windowTimeBuffer).points
0.06s 0.14% 73.10% 0.33s 0.75% github.com/influxdata/kapacitor/timer.(*timer).Start
0.06s 0.14% 73.24% 4.63s 10.59% internal/poll.(*FD).ReadFrom
0.06s 0.14% 73.38% 0.25s 0.57% runtime.mapaccess2
0.06s 0.14% 73.51% 0.95s 2.17% runtime.notesleep
0.05s 0.11% 73.63% 0.25s 0.57% github.com/influxdata/kapacitor.(*AlertNode).serverInfo
0.05s 0.11% 73.74% 0.55s 1.26% github.com/influxdata/kapacitor.(*StreamNode).runSourceStream
0.05s 0.11% 73.86% 1.29s 2.95% github.com/influxdata/kapacitor.(*TaskMaster).forkPoint
0.05s 0.11% 73.97% 0.88s 2.01% github.com/influxdata/kapacitor.(*streamEdge).EmitPoint
0.05s 0.11% 74.09% 2.12s 4.85% github.com/influxdata/kapacitor/edge.(*channelEdge).Collect
0.05s 0.11% 74.20% 5.15s 11.78% net.(*UDPConn).ReadFromUDP
0.05s 0.11% 74.31% 0.31s 0.71% runtime.convI2I
0.05s 0.11% 74.43% 0.60s 1.37% runtime.resetspinning
0.05s 0.11% 74.54% 1.97s 4.51% runtime.runqsteal
0.05s 0.11% 74.66% 1.52s 3.48% runtime.send
0.05s 0.11% 74.77% 0.37s 0.85% runtime.strhash
0.05s 0.11% 74.89% 0.24s 0.55% runtime.typedmemmove
0.05s 0.11% 75.00% 3.65s 8.35% syscall.recvfrom
0.05s 0.11% 75.11% 0.36s 0.82% text/template.(*state).evalField
0.04s 0.091% 75.21% 1.59s 3.64% github.com/influxdata/kapacitor.(*alertState).Point
0.04s 0.091% 75.30% 0.53s 1.21% github.com/influxdata/kapacitor.EvalPredicate
0.04s 0.091% 75.39% 0.45s 1.03% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).Barrier
0.04s 0.091% 75.48% 1.42s 3.25% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).unmarshalBinary
0.04s 0.091% 75.57% 2.16s 4.94% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parsePoint
0.04s 0.091% 75.66% 4.67s 10.68% net.(*netFD).readFrom
0.04s 0.091% 75.75% 0.42s 0.96% runtime.selunlock
0.04s 0.091% 75.85% 0.36s 0.82% runtime.sysmon
0.04s 0.091% 75.94% 0.32s 0.73% sort.Strings
0.04s 0.091% 76.03% 4.09s 9.35% syscall.Recvfrom
0.04s 0.091% 76.12% 0.77s 1.76% text/template.(*state).walk
0.03s 0.069% 76.19% 0.47s 1.08% github.com/influxdata/kapacitor.(*FromNode).Point
0.03s 0.069% 76.26% 1.09s 2.49% github.com/influxdata/kapacitor.(*windowByTime).batch
0.03s 0.069% 76.33% 1.57s 3.59% github.com/influxdata/kapacitor/edge.(*groupedConsumer).getOrCreateGroup
0.03s 0.069% 76.40% 0.59s 1.35% github.com/influxdata/kapacitor/edge.BatchPointFromPoint
0.03s 0.069% 76.46% 4.16s 9.52% github.com/influxdata/kapacitor/edge.receiveBufferedBatch
0.03s 0.069% 76.53% 1.09s 2.49% runtime.(*mcache).refill
0.03s 0.069% 76.60% 0.22s 0.5% runtime.entersyscall
0.03s 0.069% 76.67% 0.29s 0.66% runtime.gentraceback
0.03s 0.069% 76.74% 7.87s 18.00% runtime.mcall
0.03s 0.069% 76.81% 0.27s 0.62% runtime.notetsleep_internal
0.03s 0.069% 76.88% 1.88s 4.30% runtime.startm
0.03s 0.069% 76.94% 2s 4.57% runtime.systemstack
0.03s 0.069% 77.01% 0.26s 0.59% strconv.ParseFloat
0.03s 0.069% 77.08% 0.42s 0.96% text/template.(*state).evalCommand
0.03s 0.069% 77.15% 0.39s 0.89% text/template.(*state).evalFieldChain
0.02s 0.046% 77.20% 0.56s 1.28% github.com/influxdata/kapacitor.(*AlertNode).determineLevel
0.02s 0.046% 77.24% 2.19s 5.01% github.com/influxdata/kapacitor.(*TaskMaster).runForking
0.02s 0.046% 77.29% 3.06s 7.00% github.com/influxdata/kapacitor.(*floatPointAggregator).AggregatePoint
0.02s 0.046% 77.33% 0.51s 1.17% github.com/influxdata/kapacitor.(*periodicBarrier).emitBarrier
0.02s 0.046% 77.38% 0.72s 1.65% github.com/influxdata/kapacitor.(*periodicBarrier).periodicEmitter
0.02s 0.046% 77.42% 3.36s 7.69% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).BatchPoint
0.02s 0.046% 77.47% 4.16s 9.52% github.com/influxdata/kapacitor/edge.(*groupedConsumer).BufferedBatch
0.02s 0.046% 77.52% 1.44s 3.29% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Fields
0.02s 0.046% 77.56% 0.29s 0.66% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parseFloatBytes
0.02s 0.046% 77.61% 0.26s 0.59% math/rand.(*Rand).Float64
0.02s 0.046% 77.65% 1.24s 2.84% runtime.(*mcache).nextFree
0.02s 0.046% 77.70% 0.28s 0.64% runtime.(*mheap).alloc_m
0.02s 0.046% 77.74% 1.01s 2.31% runtime.chansend1
0.02s 0.046% 77.79% 1.30s 2.97% runtime.goready
0.02s 0.046% 77.84% 7.83s 17.91% runtime.park_m
0.02s 0.046% 77.88% 1.01s 2.31% runtime.stopm
0.02s 0.046% 77.93% 0.85s 1.94% text/template.(*Template).execute
0.01s 0.023% 77.95% 0.54s 1.24% github.com/influxdata/kapacitor.(*AlertNode).findFirstMatchLevel
0.01s 0.023% 77.97% 0.22s 0.5% github.com/influxdata/kapacitor/edge.(*batchStatsEdge).Collect
0.01s 0.023% 78.00% 0.29s 0.66% github.com/influxdata/kapacitor/edge.(*groupedConsumer).DeleteGroup
0.01s 0.023% 78.02% 0.27s 0.62% github.com/influxdata/kapacitor/edge.(*pointMessage).SetDimensions
0.01s 0.023% 78.04% 0.32s 0.73% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).Barrier
0.01s 0.023% 78.06% 0.38s 0.87% github.com/influxdata/kapacitor/edge.Forward
0.01s 0.023% 78.09% 0.22s 0.5% github.com/influxdata/kapacitor/tick/stateful.(*expression).EvalBool
0.01s 0.023% 78.11% 3.14s 7.18% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePoints
0.01s 0.023% 78.13% 0.43s 0.98% runtime.(*mheap).alloc
0.01s 0.023% 78.16% 0.24s 0.55% runtime.(*mheap).allocSpanLocked
0.01s 0.023% 78.18% 1.84s 4.21% runtime.futexwakeup
0.01s 0.023% 78.20% 1.28s 2.93% runtime.goready.func1
0.01s 0.023% 78.23% 0.23s 0.53% runtime.makemap_small
0.01s 0.023% 78.25% 1.82s 4.16% runtime.notewakeup
0.01s 0.023% 78.27% 1.27s 2.90% runtime.ready
0.01s 0.023% 78.29% 1.66s 3.80% runtime.wakep
0.01s 0.023% 78.32% 0.22s 0.5% sort.Sort
0.01s 0.023% 78.34% 0.30s 0.69% text/template.(*state).printValue
0 0% 78.34% 0.65s 1.49% github.com/influxdata/kapacitor.(*AlertNode).NewGroup
0 0% 78.34% 1.26s 2.88% github.com/influxdata/kapacitor.(*AlertNode).renderID
0 0% 78.34% 3.07s 7.02% github.com/influxdata/kapacitor.(*AlertNode).runAlert
0 0% 78.34% 0.39s 0.89% github.com/influxdata/kapacitor.(*BarrierNode).NewGroup
0 0% 78.34% 0.30s 0.69% github.com/influxdata/kapacitor.(*BarrierNode).newBarrier
0 0% 78.34% 2.41s 5.51% github.com/influxdata/kapacitor.(*BarrierNode).runBarrierEmitter
0 0% 78.34% 1.50s 3.43% github.com/influxdata/kapacitor.(*FromNode).runStream
0 0% 78.34% 4.33s 9.90% github.com/influxdata/kapacitor.(*InfluxQLNode).runInfluxQL
0 0% 78.34% 2.19s 5.01% github.com/influxdata/kapacitor.(*TaskMaster).stream.func1
0 0% 78.34% 2.97s 6.79% github.com/influxdata/kapacitor.(*WindowNode).runWindow
0 0% 78.34% 3.29s 7.53% github.com/influxdata/kapacitor.(*influxqlGroup).BatchPoint
0 0% 78.34% 0.22s 0.5% github.com/influxdata/kapacitor.(*influxqlGroup).realizeReduceContextFromFields
0 0% 78.34% 14.83s 33.92% github.com/influxdata/kapacitor.(*node).start.func1
0 0% 78.34% 0.30s 0.69% github.com/influxdata/kapacitor.(*windowByTime).Barrier
0 0% 78.34% 0.29s 0.66% github.com/influxdata/kapacitor.newPeriodicBarrier
0 0% 78.34% 0.51s 1.17% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).EndBatch
0 0% 78.34% 0.46s 1.05% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Barrier
0 0% 78.34% 12.78s 29.23% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Consume
0 0% 78.34% 3.33s 7.62% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).BatchPoint
0 0% 78.34% 0.32s 0.73% github.com/influxdata/kapacitor/edge.NewBeginBatchMessage
0 0% 78.34% 1.71s 3.91% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/influxql.NewTags
0 0% 78.34% 0.29s 0.66% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).FloatValue
0 0% 78.34% 0.46s 1.05% runtime.(*mcentral).grow
0 0% 78.34% 0.28s 0.64% runtime.(*mheap).alloc.func1
0 0% 78.34% 0.22s 0.5% runtime.chanrecv2
0 0% 78.34% 0.28s 0.64% runtime.copystack
0 0% 78.34% 0.24s 0.55% runtime.entersyscallblock
0 0% 78.34% 0.24s 0.55% runtime.entersyscallblock_handoff
0 0% 78.34% 1.13s 2.58% runtime.futexsleep
0 0% 78.34% 0.24s 0.55% runtime.handoffp
0 0% 78.34% 0.34s 0.78% runtime.mstart
0 0% 78.34% 0.34s 0.78% runtime.mstart1
0 0% 78.34% 0.29s 0.66% runtime.newstack
0 0% 78.34% 0.48s 1.10% runtime.notetsleepg
0 0% 78.34% 0.66s 1.51% runtime.timerproc
0 0% 78.34% 0.85s 1.94% text/template.(*Template).Execute
0 0% 78.34% 0.39s 0.89% text/template.(*state).evalFieldNode
0 0% 78.34% 0.42s 0.96% text/template.(*state).evalPipeline
```
Perplexed, I decided to chop things up and create two TICKscripts that monitor each of those metrics independently. The first, `top_ips`, does no variable assignment; everything is piped together in a single flow. The second, `top_stores`, uses assignment and piping so that data streams to two alerts that do slightly different things with those triggers, like the combined script above.
Data to the measurement `ips` looks like:
```
ips,ip=127.0.0.1,role=ingresstype count=1
```
Here's the show output for `top_ips`:
```
ID: top_ips
Error:
Template:
Type: stream
Status: enabled
Executing: true
Created: 22 Jan 21 22:25 UTC
Modified: 26 Jan 21 06:52 UTC
LastEnabled: 26 Jan 21 06:52 UTC
Databases Retention Policies: ["toptraffic"."autogen"]
TICKscript:
dbrp "toptraffic"."autogen"
stream
|from()
.groupBy('ip')
.measurement('ips')
|barrier()
.period(1m)
.delete(TRUE)
|window()
.period(1m)
.every(5s)
.align()
|sum('count')
.as('totalCount')
|alert()
.flapping(0.25, 0.5)
.history(21)
.warn(lambda: "totalCount" > 17500)
.crit(lambda: "totalCount" > 22500)
.message('''Observed {{ index .Fields "totalCount" }} requests to Production IP {{ index .Tags "ip" }} within the last 1 minute.''')
.stateChangesOnly(5m)
.slack()
.channel('#ops-noise')
.exec('/usr/bin/kapacitor_pubsub_stdin_invoker.sh')
.log('/var/log/kapacitor/alerts.log')
DOT:
digraph top_ips {
graph [throughput="18076.66 points/s"];
stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="1891796886"];
from1 [avg_exec_time_ns="45.655µs" errors="0" working_cardinality="0" ];
from1 -> barrier2 [processed="1891796886"];
barrier2 [avg_exec_time_ns="22.507µs" errors="0" working_cardinality="58218" ];
barrier2 -> window3 [processed="1891721166"];
window3 [avg_exec_time_ns="251.156µs" errors="0" working_cardinality="58218" ];
window3 -> sum4 [processed="376239068"];
sum4 [avg_exec_time_ns="101.993µs" errors="0" working_cardinality="34455" ];
sum4 -> alert5 [processed="376239068"];
alert5 [alerts_inhibited="0" alerts_triggered="835" avg_exec_time_ns="58.686µs" crits_triggered="101" errors="0" infos_triggered="0" oks_triggered="367" warns_triggered="367" working_cardinality="34455" ];
}
```
... and for top stores the data looks like:
```
stores,id=123456789,role=ingresstype count=1
```
with an evaluated task like:
```
ID: top_stores
Error:
Template:
Type: stream
Status: enabled
Executing: true
Created: 22 Jan 21 22:30 UTC
Modified: 26 Jan 21 06:05 UTC
LastEnabled: 26 Jan 21 06:05 UTC
Databases Retention Policies: ["toptraffic"."autogen"]
TICKscript:
dbrp "toptraffic"."autogen"
var stores = stream
|from()
.groupBy('id')
.measurement('stores')
|barrier()
.period(1m)
.delete(TRUE)
|window()
.period(1m)
.every(5s)
.align()
|sum('count')
.as('totalCount')
stores
|alert()
.flapping(0.25, 0.5)
.history(21)
.warn(lambda: "totalCount" > 17500)
.crit(lambda: "totalCount" > 22500)
.message('''Observed {{ index .Fields "totalCount" }} requests to Production Store ID {{ index .Tags "id" }} within the last minute.''')
.noRecoveries()
.stateChangesOnly(5m)
.slack()
.channel('#ops-noise')
stores
|alert()
.flapping(0.25, 0.5)
.history(21)
.warn(lambda: "totalCount" > 17500)
.crit(lambda: "totalCount" > 22500)
.message('''Observed {{ index .Fields "totalCount" }} requests to Production Store ID {{ index .Tags "id" }} within the last minute.''')
.stateChangesOnly(5m)
.exec('/usr/bin/kapacitor_pubsub_stdin_invoker.sh')
.log('/var/log/kapacitor/alerts.log')
DOT:
digraph top_stores {
graph [throughput="15742.66 points/s"];
stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="1816122105"];
from1 [avg_exec_time_ns="25.922µs" errors="0" working_cardinality="0" ];
from1 -> barrier2 [processed="1816122105"];
barrier2 [avg_exec_time_ns="16.675µs" errors="0" working_cardinality="19560" ];
barrier2 -> window3 [processed="1816013270"];
window3 [avg_exec_time_ns="164.84µs" errors="0" working_cardinality="19560" ];
window3 -> sum4 [processed="191193880"];
sum4 [avg_exec_time_ns="592.492µs" errors="0" working_cardinality="12185" ];
sum4 -> alert6 [processed="191193879"];
sum4 -> alert5 [processed="191193879"];
alert6 [alerts_inhibited="0" alerts_triggered="1375" avg_exec_time_ns="93.317µs" crits_triggered="206" errors="0" infos_triggered="0" oks_triggered="586" warns_triggered="583" working_cardinality="12185" ];
alert5 [alerts_inhibited="0" alerts_triggered="789" avg_exec_time_ns="229.573µs" crits_triggered="206" errors="0" infos_triggered="0" oks_triggered="0" warns_triggered="583" working_cardinality="12185" ];
}
```
Note the cardinality of these, at least at the time sampled, was ~12k for store IDs and ~34k for IPs. On their own these seem like small potatoes, and even in the combined script, where a group-by splits first by IP and then by store ID, this shouldn't be too much data for a one- or two-minute window.
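As a back-of-envelope check on that claim, using the ~18k points/s throughput from the `top_ips` DOT output and guessing (an assumption on my part) ~150 bytes retained per buffered point:

$$18{,}000\ \text{points/s} \times 60\ \text{s} \times 150\ \text{B} \approx 162\ \text{MB}$$

so even buffering the entire raw stream for a full minute should cost on the order of hundreds of megabytes, nowhere near the ~20 GB heaps in the dumps.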
At first this seemed a more stable approach: memory didn't grow as fast, and I thought we'd level off. Unfortunately, as the graph below shows, we did not.
<img width="1656" alt="Screen Shot 2021-01-27 at 3 45 32 PM" src="https://user-images.githubusercontent.com/532881/106069114-c035d580-60b6-11eb-9689-7b01b51c08ac.png">
Heap dumps show the following for `in use objects`:
```
go tool pprof -inuse_objects --text kapacitord top_ip_and_store_id_last/heap\?debug=1
File: kapacitord
Type: inuse_objects
Showing nodes accounting for 247823085, 97.68% of 253717036 total
Dropped 130 nodes (cum <= 1268585)
flat flat% sum% cum cum%
43042527 16.96% 16.96% 43042532 16.96% time.NewTicker
33785532 13.32% 30.28% 33785532 13.32% github.com/influxdata/kapacitor/edge.(*pointMessage).GroupInfo
29265618 11.53% 41.82% 72471995 28.56% github.com/influxdata/kapacitor.newPeriodicBarrier
28924029 11.40% 53.22% 28924029 11.40% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.Tags.Map
26395161 10.40% 63.62% 26395161 10.40% github.com/influxdata/kapacitor/models.ToGroupID
14090455 5.55% 69.17% 86562450 34.12% github.com/influxdata/kapacitor.(*BarrierNode).newBarrier
13910441 5.48% 74.66% 13910441 5.48% github.com/influxdata/kapacitor.(*periodicBarrier).emitBarrier
13590942 5.36% 80.01% 13590942 5.36% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/influxql.encodeTags
9683888 3.82% 83.83% 9683888 3.82% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).unmarshalBinary
9202897 3.63% 87.46% 22793839 8.98% github.com/influxdata/kapacitor.convertFloatPoint
7045227 2.78% 90.23% 7045227 2.78% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Name
4871732 1.92% 92.15% 4871732 1.92% github.com/influxdata/kapacitor/edge.BatchPointFromPoint
1784280 0.7% 92.86% 1784280 0.7% github.com/influxdata/kapacitor/edge.(*pointMessage).ShallowCopy
1693245 0.67% 93.52% 2004546 0.79% github.com/influxdata/kapacitor/edge.NewPointMessage
1658880 0.65% 94.18% 2491189 0.98% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parsePoint
1656757 0.65% 94.83% 4679768 1.84% github.com/influxdata/kapacitor/services/udp.(*Service).serve
1646692 0.65% 95.48% 1646692 0.65% syscall.anyToSockaddr
1627118 0.64% 96.12% 1640457 0.65% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parseTags
1376319 0.54% 96.66% 3023011 1.19% net.(*UDPConn).readFrom
1376277 0.54% 97.21% 3867466 1.52% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePointsWithPrecision
617947 0.24% 97.45% 2016082 0.79% github.com/influxdata/kapacitor.(*AlertNode).renderID
163847 0.065% 97.51% 86966602 34.28% github.com/influxdata/kapacitor.(*BarrierNode).NewGroup
152931 0.06% 97.57% 14063372 5.54% github.com/influxdata/kapacitor.(*periodicBarrier).periodicEmitter
108311 0.043% 97.62% 4980043 1.96% github.com/influxdata/kapacitor.(*windowTimeBuffer).points
106502 0.042% 97.66% 5704480 2.25% github.com/influxdata/kapacitor.(*windowByTime).batch
45530 0.018% 97.68% 88450861 34.86% github.com/influxdata/kapacitor/edge.(*groupedConsumer).getOrCreateGroup
0 0% 97.68% 2851193 1.12% github.com/influxdata/kapacitor.(*AlertNode).runAlert
0 0% 97.68% 118282743 46.62% github.com/influxdata/kapacitor.(*BarrierNode).runBarrierEmitter
0 0% 97.68% 27556840 10.86% github.com/influxdata/kapacitor.(*FromNode).Point
0 0% 97.68% 27968454 11.02% github.com/influxdata/kapacitor.(*FromNode).runStream
0 0% 97.68% 24799627 9.77% github.com/influxdata/kapacitor.(*InfluxQLNode).runInfluxQL
0 0% 97.68% 48848474 19.25% github.com/influxdata/kapacitor.(*TaskMaster).WritePoints
0 0% 97.68% 8255397 3.25% github.com/influxdata/kapacitor.(*WindowNode).runWindow
0 0% 97.68% 1802300 0.71% github.com/influxdata/kapacitor.(*alertState).Point
0 0% 97.68% 22793839 8.98% github.com/influxdata/kapacitor.(*floatPointAggregator).AggregatePoint
0 0% 97.68% 23395427 9.22% github.com/influxdata/kapacitor.(*influxqlGroup).BatchPoint
0 0% 97.68% 182157414 71.80% github.com/influxdata/kapacitor.(*node).start.func1
0 0% 97.68% 5456825 2.15% github.com/influxdata/kapacitor.(*windowByTime).Point
0 0% 97.68% 182157408 71.80% github.com/influxdata/kapacitor/edge.(*consumer).Consume
0 0% 97.68% 23395427 9.22% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).BatchPoint
0 0% 97.68% 35505282 13.99% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).Point
0 0% 97.68% 24799627 9.77% github.com/influxdata/kapacitor/edge.(*groupedConsumer).BufferedBatch
0 0% 97.68% 154188954 60.77% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Consume
0 0% 97.68% 128939059 50.82% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Point
0 0% 97.68% 25772560 10.16% github.com/influxdata/kapacitor/edge.(*pointMessage).SetDimensions
0 0% 97.68% 23395427 9.22% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).BatchPoint
0 0% 97.68% 34815965 13.72% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).Point
0 0% 97.68% 24799627 9.77% github.com/influxdata/kapacitor/edge.receiveBufferedBatch
0 0% 97.68% 52715940 20.78% github.com/influxdata/kapacitor/services/udp.(*Service).processPackets
0 0% 97.68% 9683888 3.82% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Fields
0 0% 97.68% 1640457 0.65% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Tags
0 0% 97.68% 3867466 1.52% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePoints
0 0% 97.68% 1646692 0.65% internal/poll.(*FD).ReadFrom
0 0% 97.68% 3023011 1.19% net.(*UDPConn).ReadFromUDP
0 0% 97.68% 1646692 0.65% net.(*netFD).readFrom
0 0% 97.68% 253634506 100% runtime.goexit
0 0% 97.68% 1646692 0.65% syscall.Recvfrom
```
and for `in use space`:
```
go tool pprof --text kapacitord top_ip_and_store_id_last/heap\?debug=1
File: kapacitord
Type: inuse_space
Showing nodes accounting for 19161.22MB, 97.53% of 19646.18MB total
Dropped 128 nodes (cum <= 98.23MB)
flat flat% sum% cum cum%
5369.30MB 27.33% 27.33% 5369.30MB 27.33% github.com/influxdata/kapacitor/edge.(*pointMessage).GroupInfo
3794.53MB 19.31% 46.64% 6652.30MB 33.86% github.com/influxdata/kapacitor.newPeriodicBarrier
2852.22MB 14.52% 61.16% 2852.77MB 14.52% time.NewTicker
1230.21MB 6.26% 67.42% 1230.21MB 6.26% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.Tags.Map
945.72MB 4.81% 72.24% 945.72MB 4.81% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).unmarshalBinary
892.46MB 4.54% 76.78% 7766.76MB 39.53% github.com/influxdata/kapacitor.(*BarrierNode).NewGroup
551.54MB 2.81% 79.59% 966.55MB 4.92% github.com/influxdata/kapacitor.convertFloatPoint
541.01MB 2.75% 82.34% 541.01MB 2.75% github.com/influxdata/kapacitor/models.ToGroupID
455.63MB 2.32% 84.66% 521.63MB 2.66% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parsePoint
426.01MB 2.17% 86.83% 426.01MB 2.17% github.com/influxdata/kapacitor.(*periodicBarrier).emitBarrier
415.01MB 2.11% 88.94% 415.01MB 2.11% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/influxql.encodeTags
245.03MB 1.25% 90.19% 245.03MB 1.25% github.com/influxdata/kapacitor/edge.(*pointMessage).ShallowCopy
232.53MB 1.18% 91.37% 238.03MB 1.21% github.com/influxdata/kapacitor/edge.NewPointMessage
223.01MB 1.14% 92.51% 223.01MB 1.14% github.com/influxdata/kapacitor/edge.BatchPointFromPoint
215MB 1.09% 93.60% 6867.30MB 34.95% github.com/influxdata/kapacitor.(*BarrierNode).newBarrier
188.02MB 0.96% 94.56% 190.53MB 0.97% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parseTags
108.51MB 0.55% 95.11% 272.02MB 1.38% github.com/influxdata/kapacitor/services/udp.(*Service).serve
107.50MB 0.55% 95.66% 107.50MB 0.55% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Name
100.51MB 0.51% 96.17% 100.51MB 0.51% syscall.anyToSockaddr
83.84MB 0.43% 96.60% 306.85MB 1.56% github.com/influxdata/kapacitor.(*windowTimeBuffer).points
63MB 0.32% 96.92% 163.51MB 0.83% net.(*UDPConn).readFrom
46.70MB 0.24% 97.16% 146.72MB 0.75% github.com/influxdata/kapacitor/edge.(*statsEdge).incCollected
32.42MB 0.17% 97.32% 7904.19MB 40.23% github.com/influxdata/kapacitor/edge.(*groupedConsumer).getOrCreateGroup
21MB 0.11% 97.43% 542.63MB 2.76% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePointsWithPrecision
14MB 0.071% 97.50% 440.01MB 2.24% github.com/influxdata/kapacitor.(*periodicBarrier).periodicEmitter
6.50MB 0.033% 97.53% 340.85MB 1.73% github.com/influxdata/kapacitor.(*windowByTime).batch
0 0% 97.53% 192.18MB 0.98% github.com/influxdata/kapacitor.(*AlertNode).runAlert
0 0% 97.53% 12749.75MB 64.90% github.com/influxdata/kapacitor.(*BarrierNode).runBarrierEmitter
0 0% 97.53% 774.55MB 3.94% github.com/influxdata/kapacitor.(*FromNode).Point
0 0% 97.53% 832.51MB 4.24% github.com/influxdata/kapacitor.(*FromNode).runStream
0 0% 97.53% 1171.02MB 5.96% github.com/influxdata/kapacitor.(*InfluxQLNode).runInfluxQL
0 0% 97.53% 2687.48MB 13.68% github.com/influxdata/kapacitor.(*TaskMaster).WritePoints
0 0% 97.53% 724MB 3.69% github.com/influxdata/kapacitor.(*WindowNode).runWindow
0 0% 97.53% 966.55MB 4.92% github.com/influxdata/kapacitor.(*floatPointAggregator).AggregatePoint
0 0% 97.53% 1028.06MB 5.23% github.com/influxdata/kapacitor.(*influxqlGroup).BatchPoint
0 0% 97.53% 15669.47MB 79.76% github.com/influxdata/kapacitor.(*node).start.func1
0 0% 97.53% 370.80MB 1.89% github.com/influxdata/kapacitor.(*windowByTime).Point
0 0% 97.53% 15659.62MB 79.71% github.com/influxdata/kapacitor/edge.(*consumer).Consume
0 0% 97.53% 1028.06MB 5.23% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).BatchPoint
0 0% 97.53% 118.41MB 0.6% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).EndBatch
0 0% 97.53% 1316.67MB 6.70% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).Point
0 0% 97.53% 147.22MB 0.75% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).forward
0 0% 97.53% 1171.02MB 5.96% github.com/influxdata/kapacitor/edge.(*groupedConsumer).BufferedBatch
0 0% 97.53% 14827.11MB 75.47% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Consume
0 0% 97.53% 13633.07MB 69.39% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Point
0 0% 97.53% 529.51MB 2.70% github.com/influxdata/kapacitor/edge.(*pointMessage).SetDimensions
0 0% 97.53% 140.38MB 0.71% github.com/influxdata/kapacitor/edge.(*streamStatsEdge).Collect
0 0% 97.53% 1028.06MB 5.23% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).BatchPoint
0 0% 97.53% 1208.85MB 6.15% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).Point
0 0% 97.53% 1171.02MB 5.96% github.com/influxdata/kapacitor/edge.receiveBufferedBatch
0 0% 97.53% 3230.11MB 16.44% github.com/influxdata/kapacitor/services/udp.(*Service).processPackets
0 0% 97.53% 945.72MB 4.81% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Fields
0 0% 97.53% 190.53MB 0.97% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Tags
0 0% 97.53% 542.63MB 2.76% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePoints
0 0% 97.53% 100.51MB 0.51% internal/poll.(*FD).ReadFrom
0 0% 97.53% 163.51MB 0.83% net.(*UDPConn).ReadFromUDP
0 0% 97.53% 100.51MB 0.51% net.(*netFD).readFrom
0 0% 97.53% 19615.16MB 99.84% runtime.goexit
0 0% 97.53% 100.51MB 0.51% syscall.Recvfrom
```
A profile dump at roughly the same time shows:
```
go tool pprof --text kapacitord top_ip_and_store_id_last/profile
File: kapacitord
Type: cpu
Time: Jan 27, 2021 at 2:20pm (PST)
Duration: 30.17s, Total samples = 66.49s (220.40%)
Showing nodes accounting for 56.07s, 84.33% of 66.49s total
Dropped 417 nodes (cum <= 0.33s)
flat flat% sum% cum cum%
11.90s 17.90% 17.90% 12.39s 18.63% runtime.findObject
9.26s 13.93% 31.82% 26.13s 39.30% runtime.scanobject
5.15s 7.75% 39.57% 5.15s 7.75% runtime.markBits.isMarked
2.63s 3.96% 43.53% 2.78s 4.18% syscall.Syscall6
2.17s 3.26% 46.79% 2.88s 4.33% runtime.mapaccess2_faststr
1.32s 1.99% 48.77% 1.32s 1.99% runtime.epollwait
1.26s 1.90% 50.67% 9.32s 14.02% runtime.mallocgc
1.25s 1.88% 52.55% 1.25s 1.88% runtime.futex
1.17s 1.76% 54.31% 1.19s 1.79% runtime.pageIndexOf
0.96s 1.44% 55.75% 1.13s 1.70% runtime.heapBitsSetType
0.90s 1.35% 57.11% 0.90s 1.35% runtime.usleep
0.84s 1.26% 58.37% 0.84s 1.26% runtime.nextFreeFast
0.80s 1.20% 59.57% 0.83s 1.25% runtime.(*itabTableType).find
0.74s 1.11% 60.69% 0.74s 1.11% runtime.memclrNoHeapPointers
0.73s 1.10% 61.78% 4.10s 6.17% runtime.selectgo
0.71s 1.07% 62.85% 3.78s 5.69% runtime.gcWriteBarrier
0.70s 1.05% 63.90% 0.70s 1.05% memeqbody
0.67s 1.01% 64.91% 0.77s 1.16% runtime.lock
0.56s 0.84% 65.75% 0.60s 0.9% runtime.spanOf
0.55s 0.83% 66.58% 0.62s 0.93% runtime.unlock
0.52s 0.78% 67.36% 0.52s 0.78% runtime.memmove
0.48s 0.72% 68.09% 0.48s 0.72% github.com/influxdata/kapacitor/edge.(*pointMessage).Fields
0.48s 0.72% 68.81% 1.09s 1.64% runtime.gcmarknewobject
0.44s 0.66% 69.47% 0.44s 0.66% runtime.spanOfUnchecked
0.42s 0.63% 70.10% 0.42s 0.63% aeshashbody
0.39s 0.59% 70.69% 3.51s 5.28% runtime.wbBufFlush1
0.36s 0.54% 71.23% 5.38s 8.09% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).Point
0.35s 0.53% 71.76% 1s 1.50% runtime.mapiternext
0.33s 0.5% 72.25% 7.40s 11.13% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).Point
0.33s 0.5% 72.75% 6.15s 9.25% runtime.greyobject
0.32s 0.48% 73.23% 1.15s 1.73% runtime.getitab
0.31s 0.47% 73.70% 3.36s 5.05% runtime.findrunnable
0.30s 0.45% 74.15% 2.87s 4.32% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/influxql.encodeTags
0.29s 0.44% 74.58% 2.07s 3.11% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).forward
0.29s 0.44% 75.02% 0.58s 0.87% runtime.mapaccess1
0.29s 0.44% 75.45% 1.96s 2.95% runtime.mapassign_faststr
0.25s 0.38% 75.83% 0.61s 0.92% runtime.markBitsForAddr
0.20s 0.3% 76.13% 5.48s 8.24% runtime.newobject
0.20s 0.3% 76.43% 1.06s 1.59% runtime.runqgrab
0.19s 0.29% 76.72% 1.21s 1.82% github.com/influxdata/kapacitor/edge.(*statsEdge).incCollected
0.19s 0.29% 77.00% 0.84s 1.26% runtime.bulkBarrierPreWrite
0.18s 0.27% 77.27% 3.34s 5.02% github.com/influxdata/kapacitor/edge.(*streamStatsEdge).Collect
0.15s 0.23% 77.50% 19.80s 29.78% github.com/influxdata/kapacitor/edge.(*consumer).Consume
0.15s 0.23% 77.73% 3.20s 4.81% github.com/influxdata/kapacitor/edge.(*streamStatsEdge).Emit
0.15s 0.23% 77.95% 0.96s 1.44% runtime.sellock
0.14s 0.21% 78.16% 0.35s 0.53% github.com/influxdata/kapacitor.(*windowTimeBuffer).insert
0.14s 0.21% 78.37% 0.52s 0.78% runtime.selunlock
0.13s 0.2% 78.57% 0.95s 1.43% runtime.slicebytetostring
0.13s 0.2% 78.76% 0.96s 1.44% runtime.typedmemmove
0.12s 0.18% 78.94% 23.22s 34.92% runtime.gcDrain
0.12s 0.18% 79.12% 3.90s 5.87% runtime.schedule
0.11s 0.17% 79.29% 2.24s 3.37% github.com/influxdata/kapacitor.(*windowByTime).Point
0.11s 0.17% 79.46% 1.44s 2.17% github.com/influxdata/kapacitor/edge.(*pointMessage).GroupInfo
0.10s 0.15% 79.61% 1.80s 2.71% github.com/influxdata/kapacitor.(*TaskMaster).forkPoint
0.10s 0.15% 79.76% 0.80s 1.20% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.scanKey
0.10s 0.15% 79.91% 4.20s 6.32% net.(*UDPConn).readFrom
0.10s 0.15% 80.06% 0.97s 1.46% runtime.assertI2I2
0.10s 0.15% 80.21% 1.70s 2.56% runtime.makeslice
0.09s 0.14% 80.34% 0.83s 1.25% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parseTags
0.09s 0.14% 80.48% 0.34s 0.51% runtime.memhash
0.09s 0.14% 80.61% 1.45s 2.18% runtime.netpoll
0.08s 0.12% 80.73% 5.77s 8.68% github.com/influxdata/kapacitor.(*influxqlGroup).BatchPoint
0.08s 0.12% 80.85% 1.32s 1.99% github.com/influxdata/kapacitor/edge.(*groupedConsumer).getOrCreateGroup
0.08s 0.12% 80.97% 3.42s 5.14% runtime.gcDrainN
0.08s 0.12% 81.09% 0.90s 1.35% runtime.mapiterinit
0.07s 0.11% 81.20% 1.48s 2.23% github.com/influxdata/kapacitor.(*alertState).Point
0.07s 0.11% 81.31% 2.06s 3.10% github.com/influxdata/kapacitor/edge.(*channelEdge).Collect
0.07s 0.11% 81.41% 2.02s 3.04% github.com/influxdata/kapacitor/edge.(*channelEdge).Emit
0.06s 0.09% 81.50% 1.09s 1.64% github.com/influxdata/kapacitor.(*StreamNode).runSourceStream
0.06s 0.09% 81.59% 0.61s 0.92% github.com/influxdata/kapacitor.(*streamEdge).CollectPoint
0.06s 0.09% 81.68% 5.32s 8.00% github.com/influxdata/kapacitor.convertFloatPoint
0.06s 0.09% 81.77% 0.67s 1.01% github.com/influxdata/kapacitor/edge.(*statsEdge).incEmitted
0.06s 0.09% 81.86% 5.14s 7.73% github.com/influxdata/kapacitor/services/udp.(*Service).serve
0.06s 0.09% 81.95% 3.62s 5.44% internal/poll.(*FD).ReadFrom
0.06s 0.09% 82.04% 4.26s 6.41% net.(*UDPConn).ReadFromUDP
0.06s 0.09% 82.13% 0.53s 0.8% runtime.makemap
0.05s 0.075% 82.21% 2.83s 4.26% github.com/influxdata/kapacitor.(*TaskMaster).runForking
0.05s 0.075% 82.28% 0.90s 1.35% github.com/influxdata/kapacitor/edge.NewBatchPointMessage
0.05s 0.075% 82.36% 0.38s 0.57% github.com/influxdata/kapacitor/models.ToGroupID
0.05s 0.075% 82.43% 2.11s 3.17% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePointsWithPrecision
0.05s 0.075% 82.51% 0.64s 0.96% runtime.(*mcentral).cacheSpan
0.05s 0.075% 82.58% 0.39s 0.59% runtime.strhash
0.05s 0.075% 82.66% 0.63s 0.95% sort.Strings
0.04s 0.06% 82.72% 4.62s 6.95% github.com/influxdata/kapacitor.(*TaskMaster).WritePoints
0.04s 0.06% 82.78% 1.66s 2.50% github.com/influxdata/kapacitor.(*windowTimeBuffer).points
0.04s 0.06% 82.84% 7.16s 10.77% github.com/influxdata/kapacitor/services/udp.(*Service).processPackets
0.04s 0.06% 82.90% 1.24s 1.86% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.Tags.Map
0.04s 0.06% 82.96% 1.60s 2.41% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.parsePoint
0.04s 0.06% 83.02% 0.44s 0.66% runtime.chansend1
0.04s 0.06% 83.08% 30.96s 46.56% runtime.systemstack
0.04s 0.06% 83.14% 0.34s 0.51% syscall.anyToSockaddr
0.04s 0.06% 83.20% 2.82s 4.24% syscall.recvfrom
0.03s 0.045% 83.25% 0.80s 1.20% github.com/influxdata/kapacitor.(*FromNode).Point
0.03s 0.045% 83.29% 0.44s 0.66% github.com/influxdata/kapacitor.EvalPredicate
0.03s 0.045% 83.34% 8.25s 12.41% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Point
0.03s 0.045% 83.38% 0.38s 0.57% github.com/influxdata/kapacitor/edge.Forward
0.03s 0.045% 83.43% 0.67s 1.01% runtime.(*mcache).refill
0.03s 0.045% 83.47% 1.09s 1.64% runtime.runqsteal
0.03s 0.045% 83.52% 0.64s 0.96% runtime.send
0.03s 0.045% 83.56% 0.47s 0.71% runtime.timerproc
0.03s 0.045% 83.61% 3.19s 4.80% syscall.Recvfrom
0.03s 0.045% 83.65% 0.54s 0.81% text/template.(*state).walk
0.02s 0.03% 83.68% 0.98s 1.47% github.com/influxdata/kapacitor.(*streamEdge).EmitPoint
0.02s 0.03% 83.71% 1.97s 2.96% github.com/influxdata/kapacitor.(*windowByTime).batch
0.02s 0.03% 83.74% 0.36s 0.54% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).Barrier
0.02s 0.03% 83.77% 5.86s 8.81% github.com/influxdata/kapacitor/edge.(*timedForwardReceiver).BatchPoint
0.02s 0.03% 83.80% 1.40s 2.11% github.com/influxdata/kapacitor/edge.BatchPointFromPoint
0.02s 0.03% 83.83% 3.10s 4.66% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/influxql.NewTags
0.02s 0.03% 83.86% 0.87s 1.31% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Tags
0.02s 0.03% 83.89% 2.16s 3.25% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.ParsePoints
0.02s 0.03% 83.92% 3.65s 5.49% net.(*netFD).readFrom
0.02s 0.03% 83.95% 0.42s 0.63% runtime.chansend
0.02s 0.03% 83.98% 4.20s 6.32% runtime.mcall
0.02s 0.03% 84.01% 4s 6.02% runtime.park_m
0.02s 0.03% 84.04% 0.41s 0.62% runtime.ready
0.02s 0.03% 84.07% 0.38s 0.57% runtime.stopm
0.02s 0.03% 84.10% 3.54s 5.32% runtime.wbBufFlush
0.01s 0.015% 84.12% 0.45s 0.68% github.com/influxdata/kapacitor.(*AlertNode).findFirstMatchLevel
0.01s 0.015% 84.13% 5.40s 8.12% github.com/influxdata/kapacitor.(*floatPointAggregator).AggregatePoint
0.01s 0.015% 84.15% 1.78s 2.68% github.com/influxdata/kapacitor.(*periodicBarrier).emitBarrier
0.01s 0.015% 84.16% 2s 3.01% github.com/influxdata/kapacitor.(*periodicBarrier).periodicEmitter
0.01s 0.015% 84.18% 6.59s 9.91% github.com/influxdata/kapacitor/edge.(*groupedConsumer).BufferedBatch
0.01s 0.015% 84.19% 0.50s 0.75% github.com/influxdata/kapacitor/edge.NewPointMessage
0.01s 0.015% 84.21% 6.62s 9.96% github.com/influxdata/kapacitor/edge.receiveBufferedBatch
0.01s 0.015% 84.22% 1.11s 1.67% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).unmarshalBinary
0.01s 0.015% 84.24% 0.75s 1.13% runtime.(*mcache).nextFree
0.01s 0.015% 84.25% 0.36s 0.54% runtime.convTslice
0.01s 0.015% 84.27% 0.53s 0.8% runtime.futexsleep
0.01s 0.015% 84.28% 0.74s 1.11% runtime.futexwakeup
0.01s 0.015% 84.30% 0.34s 0.51% runtime.notesleep
0.01s 0.015% 84.31% 0.68s 1.02% runtime.notewakeup
0.01s 0.015% 84.33% 3.52s 5.29% runtime.wbBufFlush.func1
0 0% 84.33% 0.45s 0.68% github.com/influxdata/kapacitor.(*AlertNode).determineLevel
0 0% 84.33% 1.02s 1.53% github.com/influxdata/kapacitor.(*AlertNode).renderID
0 0% 84.33% 2.42s 3.64% github.com/influxdata/kapacitor.(*AlertNode).runAlert
0 0% 84.33% 3.25s 4.89% github.com/influxdata/kapacitor.(*BarrierNode).runBarrierEmitter
0 0% 84.33% 2.35s 3.53% github.com/influxdata/kapacitor.(*FromNode).runStream
0 0% 84.33% 6.89s 10.36% github.com/influxdata/kapacitor.(*InfluxQLNode).runInfluxQL
0 0% 84.33% 2.83s 4.26% github.com/influxdata/kapacitor.(*TaskMaster).stream.func1
0 0% 84.33% 4.89s 7.35% github.com/influxdata/kapacitor.(*WindowNode).runWindow
0 0% 84.33% 20.89s 31.42% github.com/influxdata/kapacitor.(*node).start.func1
0 0% 84.33% 5.86s 8.81% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).BatchPoint
0 0% 84.33% 0.50s 0.75% github.com/influxdata/kapacitor/edge.(*forwardingReceiver).EndBatch
0 0% 84.33% 0.37s 0.56% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Barrier
0 0% 84.33% 17.45s 26.24% github.com/influxdata/kapacitor/edge.(*groupedConsumer).Consume
0 0% 84.33% 0.36s 0.54% github.com/influxdata/kapacitor/edge.(*pointMessage).ShallowCopy
0 0% 84.33% 1.12s 1.68% github.com/influxdata/kapacitor/vendor/github.com/influxdata/influxdb/models.(*point).Fields
0 0% 84.33% 1.31s 1.97% runtime.convT2E
0 0% 84.33% 3.42s 5.14% runtime.gcAssistAlloc
0 0% 84.33% 3.42s 5.14% runtime.gcAssistAlloc.func1
0 0% 84.33% 3.42s 5.14% runtime.gcAssistAlloc1
0 0% 84.33% 23.23s 34.94% runtime.gcBgMarkWorker
0 0% 84.33% 23.22s 34.92% runtime.gcBgMarkWorker.func2
0 0% 84.33% 0.41s 0.62% runtime.goready
0 0% 84.33% 0.41s 0.62% runtime.goready.func1
0 0% 84.33% 0.67s 1.01% runtime.startm
0 0% 84.33% 0.51s 0.77% runtime.wakep
0 0% 84.33% 0.64s 0.96% text/template.(*Template).Execute
0 0% 84.33% 0.64s 0.96% text/template.(*Template).execute
```
So, after running through many iterations of changes to the TICKscripts, I'm left scratching my head. Notably, with all the TICKscripts disabled (inhibited from processing), the same constantly flowing data doesn't budge Kapacitor's memory from about a gigabyte.
A few questions:
1. Am I configuring Kapacitor correctly for what I want here: data sent directly to it that we don't care about at all past the window (again, 1-2 minutes)? That is, once you open a stream and consume from it, is having barrier events remove groups really the only way to purge data? That approach seems to add high CPU load, and it potentially accounts for the memory once you get past a certain size, even when cardinality is in check.
2. Configuring the barrier to emit a delete on idle causes quicker growth (a steeper graph), but that's probably just because it's the wrong option for this node: in Production there will essentially never be an idle period in the data Kapacitor receives for these measurements. That said, a lot of time and overhead seems to be spent in the tickers responsible for purging the data.
3. The cardinality does fluctuate but stays within a reasonable size while a TICKscript is enabled, as the various outputs above show. So why does Kapacitor grow until it OOMs?
4. Though `sum` doesn't really seem to be anywhere near the problem in this configuration, is there a cleaner way of counting the points in the stream? Is it possible to drop the `count=1` field entirely, given that counting is the only purpose it serves?
5. Retention in this situation, where the data never interacts with InfluxDB in any way, is moot: nowhere in Kapacitor is retention honored, as evidenced by the fact that Kapacitor lets you attach InfluxDB-style retention policy names to data that only ever exists in Kapacitor. Is that right?
I've attached ([top_combined_and_top_ip_and_store_kapacitor_1.5.7_growth.tar.gz](https://github.com/influxdata/kapacitor/files/5883862/top_combined_and_top_ip_and_store_kapacitor_1.5.7_growth.tar.gz)) the various dumps, profile captures, etc. from the time each of these configurations was in place; the structure is:
```
tree -D
.
├── top_combined
│ ├── [Jan 25 19:31] allocs?debug=1
│ ├── [Jan 25 19:31] goroutine?debug=1
│ ├── [Jan 25 19:33] goroutine?debug=2
│ ├── [Jan 25 19:31] heap?debug=1
│ ├── [Jan 25 22:52] index.html
│ ├── [Jan 25 19:32] mutex?debug=1
│ ├── [Jan 25 19:32] profile
│ ├── [Jan 25 19:32] show_top_combined
│ ├── [Jan 25 19:32] threadcreate?debug=1
│ └── [Jan 25 19:32] trace
├── top_ip_and_store_id
│ ├── [Jan 27 11:26] allocs?debug=1
│ ├── [Jan 27 11:23] goroutine?debug=1
│ ├── [Jan 27 11:23] goroutine?debug=2
│ ├── [Jan 27 11:25] heap?debug=1
│ ├── [Jan 27 11:26] index.html
│ ├── [Jan 27 11:24] mutex?debug=1
│ ├── [Jan 27 11:25] profile
│ ├── [Jan 27 11:27] show_top_ips
│ ├── [Jan 27 11:27] show_top_stores
│ ├── [Jan 27 11:26] threadcreate?debug=1
│ └── [Jan 27 11:24] trace
└── top_ip_and_store_id_last
├── [Jan 27 14:18] allocs?debug=1
├── [Jan 27 14:19] goroutine?debug=1
├── [Jan 27 14:19] goroutine?debug=2
├── [Jan 27 14:19] heap?debug=1
├── [Jan 27 14:24] index.html
├── [Jan 27 14:19] mutex?debug=1
├── [Jan 27 14:20] profile
├── [Jan 27 14:20] threadcreate?debug=1
└── [Jan 27 14:20] trace
3 directories, 30 files
```
The files in `top_ip_and_store_id` and `top_ip_and_store_id_last` were taken a few hours apart, as the timestamps above show.
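Since those two snapshots bracket several hours of growth under unchanged tasks, the delta can be isolated by diffing them with pprof's base-profile support, along these lines (a sketch; adjust paths per the tree above):
```
go tool pprof -inuse_space -base top_ip_and_store_id/heap\?debug=1 kapacitord top_ip_and_store_id_last/heap\?debug=1
```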