Kapacitor data flow

Luv · March 8, 2019, 10:03am

Hi,

I am pretty confused about Kapacitor. I have started Kapacitor on one node, and InfluxDB is running on another node.

I have set up InfluxDB in kapacitor.conf, and the subscriptions section says:

  [influxdb.subscriptions]
    # Set of databases and retention policies to subscribe to.
    # If empty will subscribe to all, minus the list in
    # influxdb.excluded-subscriptions
    #
    # Format
    # db_name = <list of retention policies>
    #
    # Example:
    # my_database = [ "telegraf", "telegraf" ]

So by default it subscribes to all databases. I then read about subscriptions in the docs, which say:

Rather than querying InfluxDB for data (except when using the BatchNode), all data is copied to your Kapacitor server or cluster through an InfluxDB subscription.

So does this mean all data is copied to Kapacitor? How long is it stored on the Kapacitor machine? If my telegraf database has a retention policy of 7 days, will Kapacitor also store this data for 7 days?

Doesn’t that add the overhead of needing a bigger machine for Kapacitor so that it can handle the data properly?

I just want alerts on my telegraf database. This is my TICKscript, generated by Chronograf:

    var db = 'telegraf'
    var rp = 'autogen'
    var measurement = 'cpu'
    var groupBy = ['host']
    var whereFilter = lambda: TRUE

    var name = 'Custom CPU'
    var idVar = name + '-{{.Group}}'
    var message = 'CPU high'
    var idTag = 'alertID'
    var levelTag = 'level'
    var messageField = 'message'
    var durationField = 'duration'

    var outputDB = 'chronograf'
    var outputRP = 'autogen'
    var outputMeasurement = 'alerts'
    var triggerType = 'threshold'

    var crit = 5

    var data = stream
        |from()
            .database(db)
            .retentionPolicy(rp)
            .measurement(measurement)
            .groupBy(groupBy)
            .where(whereFilter)
        |eval(lambda: "usage_idle")
            .as('value')

    var trigger = data
        |alert()
            .crit(lambda: "value" > crit)
            .message(message)
            .id(idVar)
            .idTag(idTag)
            .levelTag(levelTag)
            .messageField(messageField)
            .durationField(durationField)
            .log('/tmp/alerts.log')
            .slack()
            .workspace('xxxxx')

    trigger
        |eval(lambda: float("value"))
            .as('value')
            .keep()
        |influxDBOut()
            .create()
            .database(outputDB)
            .retentionPolicy(outputRP)
            .measurement(outputMeasurement)
            .tag('alertName', name)
            .tag('triggerType', triggerType)

    trigger
        |httpOut('output')

This is just a test alert that triggers if idle CPU is above 5%, which is almost always true, so I should be getting alerts all the time.

But I have not received a single alert. Also, there is no proper tutorial on how to set up Kapacitor with alerts; it's all very scattered.

Luv · March 8, 2019, 12:37pm

github.com/influxdata/kapacitor

Kapacitor in-memory storage questions

opened 07:14PM - 27 Jan 16 UTC

closed 09:06PM - 15 Feb 16 UTC

yosiat

Hi,

_Continue from https://github.com/influxdata/kapacitor/issues/73#issuecomme…nt-175249878_

As I understand, Kapacitor has in-memory storage instead of influxdb, according to this I have some question before I use it in production.

### Retention policy

Assuming I have _one_ simple alert that checks some measurement (for example cpu_usage) and checks if is it greater than 80 then we are in critical, does kapacitor store the points for cpu_usage forever in ram? what is the retention policy for each alert? Is it calculated from the alert - for example, alert with only condition - that points won't be saved, but If I have alert that uses window for 10 minutes the points will be stored for 10 minutes?

### Data loss

If I want to restart kapacitor and I have points in memory or kapacitor restarted for some reason, how can I make sure there is no data loss?

### Stats

How can I see for monitoring kapacitor (itself) purposes how much points are stored in kapacitor? And just to make sure - the alerts run on every point? or there is some scheduling - for example, every 10 seconds all of the alert are running?

This might help in case anybody else lands here.

voiprodrigo · March 8, 2019, 2:54pm

All data from the subscriptions will be streamed to Kapacitor. If you don’t configure any stream task, that data will be discarded. If you configure a stream task with a window node with a certain period, Kapacitor will continuously store in memory the data it needs to cover those windows.
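
As a rough sketch of the windowed stream case, assuming the same `telegraf`/`autogen`/`cpu` data as in the question: the 10-minute period, 1-minute emit interval, threshold of 10 and the `mean_idle` name are illustrative values, not something from this thread.

```
// Stream task (illustrative values): to emit each window, Kapacitor has to
// keep roughly the last 10 minutes of cpu points per host in memory.
stream
    |from()
        .database('telegraf')
        .retentionPolicy('autogen')
        .measurement('cpu')
        .groupBy('host')
    |window()
        // amount of data each window covers
        .period(10m)
        // how often a window is emitted downstream
        .every(1m)
    |mean('usage_idle')
        .as('mean_idle')
    |alert()
        .crit(lambda: "mean_idle" < 10)
        .log('/tmp/alerts.log')
```

With no window node, as in the script from the question, each point is evaluated as it arrives and nothing comparable needs to be buffered.
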
If you configure a batch task, Kapacitor will use the memory required to store the results of the query (or queries). Once the task pipeline finishes, that data is discarded.
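
A comparable batch task might look like the sketch below. The query text, schedule and threshold are again assumptions chosen for illustration; the database and measurement names are taken from the question.

```
// Batch task (illustrative values): Kapacitor queries InfluxDB every minute
// for the last 10 minutes and holds only the query result in memory
// while the pipeline runs.
batch
    |query('SELECT mean("usage_idle") AS mean_idle FROM "telegraf"."autogen"."cpu"')
        .period(10m)
        .every(1m)
        .groupBy('host')
    |alert()
        .crit(lambda: "mean_idle" < 10)
        .log('/tmp/alerts.log')
```

Here the aggregation runs inside InfluxDB, so Kapacitor only holds one mean per host per run instead of every raw point in the window.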

It’s another process on your system consuming CPU and RAM. Whether the overhead is significant depends only on the volume of data you subscribe to and on the number and computational complexity of the tasks you define (including the scope of your queries and the window periods you work with).
