We are benchmarking the TICK stack for a larger deployment. Here is our design:
Telegraf-Clients -------> Telegraf-proxy ------> Influxdb-OSS ------> Kapacitor -----> AWS API-Gateway -----> AWS-SQS
Kapacitor is configured to send alerts to API-GW with HTTPPOST. Our concern is how many alerts it can send at once. Unfortunately, the numbers are not good.
Test Scenario:
We have 4000 clients. They send OK data for 10 minutes and then CRITICAL data for 10 minutes, so 4000 alerts are generated every 10 minutes. We observed that it takes more than 25 minutes to deliver all alerts to the target. That puts the alert-handling rate well under 250/minute (4000 alerts over 25+ minutes works out to roughly 160/minute). It is worth mentioning that AWS API-GW has a good request rate of 10,000 RPS, so it is unlikely to be the bottleneck.
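To make the arithmetic explicit, here is a rough sketch in Python using only the figures from the test scenario above. The variable and function names are mine, and 25 minutes is treated as a lower bound on delivery time, so the observed rate is an upper bound.

```python
def rate_per_minute(alerts: int, minutes: float) -> float:
    """Average number of alerts handled per minute."""
    return alerts / minutes


ALERTS_PER_BURST = 4000        # one alert per client on each state change
DELIVERY_MINUTES = 25          # observed: "more than 25 minutes" (lower bound)
BURST_INTERVAL_MINUTES = 10    # a new burst of 4000 alerts every 10 minutes
API_GW_RPS = 10_000            # request rate quoted for AWS API-GW

# Upper bound on what Kapacitor actually delivered in our test.
observed = rate_per_minute(ALERTS_PER_BURST, DELIVERY_MINUTES)

# Rate needed just to finish one burst before the next one starts.
required = rate_per_minute(ALERTS_PER_BURST, BURST_INTERVAL_MINUTES)

# What the gateway could accept per minute at the quoted rate.
gateway_per_minute = API_GW_RPS * 60

print(f"observed:  <= {observed:.0f} alerts/minute")
print(f"required:  >= {required:.0f} alerts/minute")
print(f"gateway:      {gateway_per_minute} requests/minute")
```

So delivery is running at well under half the rate needed to keep up with the bursts, while the gateway limit is several orders of magnitude higher.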
- Is this normal behavior for Kapacitor?
- Is this behavior limited to HTTPPOST?