TICK or Kapacitor for anomaly detection

Sunil_Jacob · May 8, 2017, 3:30pm

Hi All,

I am a novice learning to use the TICK stack for our organization. I have configured the TICK stack to monitor my running processes and servers. However, I would like to automate the following:

1. File system utilization above 80%
   - Look into the root cause of the disk space usage (it may be log files)
   - Try to clean up the file system
   - Send a notification to Slack
2. httpd process failure
   - Check the root cause by looking into the log file
   - Try to restart the httpd service and monitor it for some time
   - Send a notification to Slack about the nature of the problem and the recovery steps taken

Is this possible with Kapacitor? Early help would be really appreciated.

nathaniel · May 8, 2017, 3:44pm

Are you asking if Kapacitor can perform the root cause analysis (RCA) or just notify you that there is a problem?

If you want Kapacitor to perform the RCA, how would you, for example, script finding the cause of a disk filling up?

Sunil_Jacob · May 9, 2017, 7:29am

Thanks for getting back, Nathaniel. Yes, I was asking whether Kapacitor can perform RCA, or detect an anomaly relative to the usual pattern and act upon it.
For example, if the log directory is accumulating files, can it delete the old files and clean up the filesystem?

nathaniel · May 9, 2017, 3:45pm

The short answer is yes, Kapacitor can do those things.

The longer answer is Kapacitor does that via scripts that you still have to write.

For example, in the disk usage case there are two parts to the problem. First, detect that disk usage is too high. Second, free up disk space. Kapacitor can do the first part natively; for the second part you need to write a script, and Kapacitor will trigger it when needed.

Here is an example TICKscript to automatically clear up disk space when usage goes over 80%.

batch
    |query('SELECT last(disk_usage) as disk_usage FROM telegraf.autogen.disk')
        .period(1h)
        .every(1h)
        .groupBy(*)
    |alert()
        .crit(lambda: "disk_usage" > 80)
        // Call the disk cleaning script, the information about which disk etc is getting full is passed over STDIN as JSON to the process.
        .exec('/usr/local/bin/clean-up-disk.sh')
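
The cleanup script itself is not shown in the thread. As a rough illustration only, here is a minimal sketch of what a handler like clean-up-disk.sh might look like. It assumes the alert JSON on STDIN carries the `path` tag that Telegraf's disk input attaches to each series, and the function names, the `*.log` pattern, and the `MAX_AGE_DAYS` and `DELETE` variables are all hypothetical. It only lists files unless `DELETE=1` is set.

```shell
#!/bin/bash
# Hypothetical sketch of /usr/local/bin/clean-up-disk.sh (not from the thread).
# Assumption: the alert JSON on STDIN includes the "path" tag that Telegraf's
# disk input attaches to each series.

# Print the value of the "path" tag found in the JSON read from STDIN.
# sed keeps the sketch dependency-free; a real JSON parser such as jq is more robust.
extract_path() {
    tr -d '\n' | sed -n 's/.*"path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# List *.log files under $1 that are more than MAX_AGE_DAYS days old (default 7).
# They are deleted only when DELETE=1 is set, so the default is a dry run.
clean_old_files() {
    if [ "${DELETE:-0}" = "1" ]; then
        find "$1" -xdev -type f -name '*.log' -mtime +"${MAX_AGE_DAYS:-7}" -delete
    else
        find "$1" -xdev -type f -name '*.log' -mtime +"${MAX_AGE_DAYS:-7}" -print
    fi
}

main() {
    # Kapacitor supplies the JSON on a pipe; do nothing if run from a terminal.
    if [ -t 0 ]; then
        echo "expected alert JSON on STDIN" >&2
        return 0
    fi
    path="$(extract_path)"
    if [ -z "$path" ]; then
        echo "no path tag in alert data; nothing to clean" >&2
        return 0
    fi
    clean_old_files "$path"
}

main
```

Because the alert is grouped by all tags, each alerting mount point would trigger its own run. A real script would probably want to restrict deletion to known log directories rather than trust whatever path arrives in the payload.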

Sunil_Jacob · May 12, 2017, 3:42am

Thanks @nathaniel for the info.

Sunil_Jacob · May 12, 2017, 7:20am

@nathaniel
I tried the code above with some modifications. Please see the code below:
batch
    |query('SELECT last(used_percent) as disk_usage FROM telegraf.autogen.disk')
        .period(1m)
        .every(1m)
        .groupBy(*)
    |alert()
        .crit(lambda: "disk_usage" > 80)
        // Call the disk cleaning script, the information about which disk etc is getting full is passed over STDIN as JSON to the process.
        .log('/tmp/usage.log')
        .exec('/usr/local/bin/testStdin.sh')

When I run the above TICKscript, the log is written, but testStdin.sh does not seem to be called. The script is below:

#!/bin/bash
read disk_usage
postToSlack -t "this is a test message" -b "$var" -c "devops" -u "https://hooks.slack.com/services/T592WECRX/B59ND03UM/yT2k4kTb2Quj3cbaSD1fSOx"

I have set the permissions:

-rwxr-xr-x 1 kapacitor kapacitor 164 May 12 05:46 /usr/local/bin/testStdin.sh

Please advise
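
One way to narrow this down is to run a handler outside Kapacitor by piping a sample payload into it by hand. The sketch below is not from the thread: the sample JSON is a hand-written approximation of Kapacitor's alert data, and `handle_alert` is a hypothetical stand-in for testStdin.sh that prints instead of calling postToSlack. It reads all of STDIN into one variable and then uses that same variable, whereas the script above reads into `disk_usage` but passes `$var`, which is never set.

```shell
#!/bin/bash
# Hypothetical stand-in for testStdin.sh: read the whole payload, then use it.
handle_alert() {
    payload="$(cat)"    # read all of STDIN, not only the first line
    # In the real script this would be the postToSlack call with -b "$payload".
    printf 'would post: %s\n' "$payload"
}

# Hand-written sample resembling the alert data Kapacitor sends (an assumption).
sample='{"id":"disk:nil","level":"CRITICAL","data":{"series":[{"name":"disk","tags":{"path":"/"},"columns":["time","disk_usage"],"values":[["2017-05-12T07:00:00Z",91.5]]}]}}'

# Feed the sample to the handler over STDIN, as the exec handler would.
printf '%s' "$sample" | handle_alert
```

Running the real script the same way as the kapacitor user, for example with `sudo -u kapacitor /usr/local/bin/testStdin.sh < sample.json`, may also show whether the problem lies in the script's environment (such as `postToSlack` not being on that user's PATH) rather than in the alert itself.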
