Hello InfluxData Community,
I hope you all are staying safe and healthy. Do any of you perform forecasting or anomaly detection with Influx? If so, I’d love to connect, hear about what you’re doing and the challenges you’re facing.
Thanks!
Our use of InfluxDB is primarily Industrial IoT, mostly environmental monitoring. We would love to eventually implement some sort of anomaly detection, and we’ve explored it.
We have alarms with some level of sophistication already in place if readings exceed certain thresholds. We’d love to complement this with automated monitoring of unusual behavior that’s inside those thresholds.
Anomaly detection is hard. Our domain further complicates the existing challenges. Generally speaking, classic anomaly detection techniques work best with regular patterns and a priori knowledge of variations. In our use that’s just not possible. Anomalies can be slow drifts, unexpected oscillations, momentary blips, etc. Small changes to environmental controls or new policies for managing our environment can wipe out the historical data necessary to retrain a system or develop a well understood envelope of typical behavior. This can be further exacerbated by the interrelation of environmental factors as they’re generally not independent of one another.
One approach we are hopeful about and have experimented with to a limited degree is using Numenta’s style of machine intelligence. It’s not deep learning or an expert system. It’s modeled on brain structures. Similar to human perception it’s able to recognize patterns without training. Among other things it’s able to detect variations in patterns of new data compared to trailing recent data. In concept this is a near-perfect match to our use case. The devil is in the details, of course. And, we’ve only gotten as far as using their experimentation tool with our own data. The results were promising, but we had to put this work on the back burner. In concept it’s an automated version of our own human-based anomaly detection techniques today where we glance at dashboards and notice changes that cause us to go, “Hmmm. What’s that?”
Hi @Anaisdg
I wanted to use Prophet on our InfluxDB data via the Python influxdb DataFrameClient.
But there is an issue which has not been resolved.
Infer (freq) frequency to dataframe after querying data using DataFrameClient
Issues with python influx dataframeclient
Also reported over github:
Thanks
Ashish
Hello @Ashish_Sikarwar,
I’m sorry that hasn’t been fixed. We’re prioritizing the v2 clients. However, they are compatible with InfluxDB 1.8+.
Perhaps you can use the v2 client and connect to InfluxDB v1.8+.
If you plan to use this new client library with InfluxDB 1.8 or greater, here are some important tips about gathering the required authorization parameters:
- Bucket Name:
There is no concept of Bucket in InfluxDB v1.x. However, a bucket is simply the combination of the database name and its retention policy. So, you can specify this by providing the InfluxDB 1.x database name and retention policy separated by a forward slash (/). If you do not supply a retention policy, the default retention policy is used.
For example: a bucket name of telegraf/1week allows you to write the InfluxDB 1.x database named “telegraf” with a retention policy named “1week”. Similarly, telegraf/ or telegraf allows you to write to the InfluxDB 1.x database named “telegraf” and the default retention policy (typically autogen).
- Token:
In InfluxDB v2, API Tokens are used to access the platform and all its capabilities. InfluxDB v1.x uses a username and password combination when accessing the HTTP APIs. Provide your InfluxDB 1.x username and password separated by a colon (:) to represent the token. For example: username:password
- Org:
The org parameter is ignored in compatibility mode with InfluxDB v1.x and can be left empty.
The same querying applies as the v2 docs, but the connection URL is different. For the Flux query endpoint, use:
http://:8086/api/v2/query
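As a rough illustration of the mapping described above, here is a minimal Python sketch. The helper name `v1_compat_params` and the sample credentials are made up for this example, and the commented-out client calls assume the third-party `influxdb-client` package is installed and a server is reachable on localhost.

```python
def v1_compat_params(database, username, password, retention_policy=""):
    """Map InfluxDB 1.x settings onto the bucket/token/org the v2 client expects.

    Hypothetical helper for illustration; it is not part of any client library.
    """
    # Bucket = database name + "/" + retention policy.
    # An empty policy ("telegraf/") selects the default retention policy.
    bucket = f"{database}/{retention_policy}"
    # Token = username and password separated by a colon.
    token = f"{username}:{password}"
    # Org is ignored in 1.x compatibility mode, so a placeholder is enough.
    return {"bucket": bucket, "token": token, "org": "-"}


params = v1_compat_params("telegraf", "my-user", "my-password", "1week")
print(params)

# Possible usage with the v2 Python client (assumed installed, not run here):
# from influxdb_client import InfluxDBClient
# client = InfluxDBClient(url="http://localhost:8086",
#                         token=params["token"], org=params["org"])
# flux = 'from(bucket: "telegraf/1week") |> range(start: -1h)'
# df = client.query_api().query_data_frame(flux)
```

The resulting DataFrame could then be handed to a forecasting library, which is the part that was blocked with the older DataFrameClient.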
Hello @mkarlesky,
Thank you for your answer and for sharing your story. Are you looking for a “deviation from the pack” sort of anomaly detection? Have you looked into using DBSCAN or Median Absolute Deviation?
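For readers unfamiliar with Median Absolute Deviation, here is a minimal Python sketch of a “deviation from the pack” check. It assumes several sensors reporting at the same timestamp. The sensor names, the sample values, and the 3.5 cutoff on the modified z-score are illustrative assumptions, not anything built into InfluxDB.

```python
from statistics import median


def mad_outliers(readings, threshold=3.5):
    """Return the names of series that deviate strongly from the pack.

    readings: dict mapping a series name to its value at one timestamp.
    """
    values = list(readings.values())
    med = median(values)
    # MAD: the median of absolute deviations from the median.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [name for name, v in readings.items()
            if abs(0.6745 * (v - med) / mad) > threshold]


# Made-up temperatures from eight sensors at one point in time.
snapshot = {
    "sensor_a": 21.0, "sensor_b": 21.2, "sensor_c": 20.9, "sensor_d": 21.1,
    "sensor_e": 21.0, "sensor_f": 27.5, "sensor_g": 21.3, "sensor_h": 20.8,
}
print(mad_outliers(snapshot))  # ['sensor_f']
```

Because it relies on medians, the score is not dragged around by the outlier itself, which is the usual reason to prefer it over a mean and standard deviation check.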
@Anaisdg Thank you so much for starting this thread and replying so quickly.
It’s been quite a while since we last looked at this application. I do not recall if we looked specifically at DBSCAN or Median Absolute Deviation. I think we looked at clustering and statistical approaches in general. If memory serves we shied away from these for fear of having to perform lots of processing of our data into multiple dimensions to search the frequency domain and time domain for multiple flavors of anomalies. Plus the needed tweaking of knobs for good results seemed a bit daunting as well.
I gather you’re exploring anomaly detection features for InfluxDB 2? If you’re able to explain what sort of abilities InfluxDB might support and how it could work I’d be happy to try to evaluate these in light of our use case. Depending on where we end up, I might even be able to provide some real data snapshots of real anomalies in our usage.
Thank you @Anaisdg for your reply!
I will definitely try that.
@Anaisdg @mkarlesky
I would like to try too what InfluxDB has for Anomaly detection.
We plan to use InfluxDB for anomaly detection in our database. We produce forecasts that are stored in an InfluxDB database along with, when they become available, the measurements from the corresponding site. We have basic indicators (MAE, RMSE and others), and we will see whether we can implement the calculation directly in the database and alert when the error is large.
This is a project for the coming months.
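For reference, the two indicators mentioned above can be sketched in a few lines of Python. The sample numbers and the alert limit are invented for illustration; in practice the forecast and measurement series would come from the database.

```python
from math import sqrt


def mae(forecast, actual):
    """Mean absolute error between paired forecast and measured values."""
    if len(forecast) != len(actual):
        raise ValueError("series must have the same length")
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


def rmse(forecast, actual):
    """Root mean squared error; penalises large misses more than MAE."""
    if len(forecast) != len(actual):
        raise ValueError("series must have the same length")
    return sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual))


forecast = [10.0, 12.0, 11.0, 13.0]
actual = [11.0, 12.0, 9.0, 13.0]

RMSE_LIMIT = 1.0  # assumed alert limit, to be tuned per site
print(mae(forecast, actual))   # 0.75
print(rmse(forecast, actual) > RMSE_LIMIT)  # True
```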
A colleague will be giving a webinar on Data Science Central in May. I’ll follow up with more info when I have it. Thanks!
Hello @mkarlesky and @Theo_Masson and @Ashish_Sikarwar,
I encourage you to register for this webinar No-Code ML for Forecasting and Anomaly Detection | InfluxData.
During this webinar, you will learn:
- How to initiate machine learning tasks directly within the Influx visual interface without intimate knowledge of how these algorithms are implemented.
- How data scientists can wrap existing, or develop new, machine learning algorithms for publication to the InfluxDB time series platform using familiar languages and frameworks.
Thanks!
Thank you @Anaisdg for sharing a wonderful opportunity.
Hi @Ashish_Sikarwar, I used fbprophet for forecasting before switching to Keras, but I didn’t quite understand what the issue is. Maybe I can help!
@Theo_Masson we used Kapacitor to monitor the error and even retrain models with the exec() nodes.
@gregory_scafarto,
Cool!! If you’re able to share, what problems were you trying to solve and why did you switch to Keras? And what algorithms did you use?
You can’t do incremental learning with fbprophet, so every time you want to retrain a model you have to query all the data from the database and use 100% of CPU and RAM for the duration of the training. If the number of time series to monitor is large, that is too expensive in terms of resources. With TensorFlow and Keras you can retrain the model on the latest points only.
I implemented LSTM models after decomposing the signal into trend, seasonal, and residual components. I also smoothed the trend. Then I built a 95% confidence interval with the formula ± 1.96 * std(residual), since the residual was roughly Gaussian.
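To make the ± 1.96 * std(residual) band concrete, here is a simplified Python sketch. It is not the setup described above: a centered moving average stands in for the smoothed trend, and the seasonal component and the LSTM models are left out entirely. The window size and the sample data are assumptions.

```python
from statistics import mean, stdev


def moving_average(values, window):
    """Centered moving average; the edges use whatever neighbours exist."""
    half = window // 2
    return [mean(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def residual_band(values, window=5, z=1.96):
    """Return (lower, upper) bounds: trend -/+ z * std(residual)."""
    trend = moving_average(values, window)
    residual = [v - t for v, t in zip(values, trend)]
    spread = z * stdev(residual)  # roughly 95% if the residual is Gaussian
    return [(t - spread, t + spread) for t in trend]


def out_of_band(values, window=5, z=1.96):
    """Indices of points that fall outside the band."""
    band = residual_band(values, window, z)
    return [i for i, (v, (lo, hi)) in enumerate(zip(values, band))
            if not lo <= v <= hi]


# Flat made-up signal with one spike at index 6.
signal = [10, 10, 10, 10, 10, 10, 30, 10, 10, 10, 10, 10, 10]
print(out_of_band(signal))  # [6]
```

Note that the spike inflates both the trend and the residual spread around it, which is one reason a smoothed trend and a separate seasonal term matter on real data.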
For the moment I’m using Kapacitor for anomaly detection: I compute the mean and the standard deviation of last week (with the built-in functions), compare them with the mean and standard deviation of this week, and trigger an alert if the variation is large. It’s a bit simple, but I’m looking for a “light” solution that stays inside the system. I tried to write a UDF, but I never managed to import the libraries I needed.
Hi, can you share your “simple” script for the variance comparison with last week? I was trying to do something similar but had some trouble with the Kapacitor syntax.
```
// Compare the previous window's mean and standard deviation with the current window's.
var host = 'DB'

// Previous window: queried with a 24h offset, then shifted forward so it lines up with the current one.
var mean_before = batch
    // The host variable is concatenated in; inside the quotes it would not be substituted.
    |query('SELECT mean(real_y) AS mean_before FROM "telegraf"."autogen".mean_ymem WHERE "host" = \'' + host + '\'')
        .period(24h)
        .every(24h)
        .align()
        .offset(24h)
    |shift(24h)
    |last('mean_before')
        .as('mean_before')
    |log()
        .prefix('P0-1')
        .level('DEBUG')

var stddev_before = batch
    |query('SELECT stddev(real_y) AS stddev_before FROM "telegraf"."autogen".mean_ymem WHERE "host" = \'' + host + '\'')
        .period(24h)
        .every(24h)
        .align()
        .offset(24h)
    |shift(24h)
    |last('stddev_before')
        .as('stddev_before')
    |log()
        .prefix('P0-2')
        .level('DEBUG')

// Current window: same queries without offset or shift.
var mean_after = batch
    |query('SELECT mean(real_y) AS mean_after FROM "telegraf"."autogen".mean_ymem WHERE "host" = \'' + host + '\'')
        .period(24h)
        .every(24h)
        .align()
    |last('mean_after')
        .as('mean_after')
    |log()
        .prefix('P0-3')
        .level('DEBUG')

var stddev_after = batch
    |query('SELECT stddev(real_y) AS stddev_after FROM "telegraf"."autogen".mean_ymem WHERE "host" = \'' + host + '\'')
        .period(24h)
        .every(24h)
        .align()
    |last('stddev_after')
        .as('stddev_after')
    |log()
        .prefix('P0-4')
        .level('DEBUG')

// Join the four streams; fields are then addressed as "<stream>.<field>".
var joined_data = mean_before
    |join(mean_after, stddev_after, stddev_before)
        .as('mean_before', 'mean_after', 'stddev_after', 'stddev_before')
        .tolerance(1h)

var performance_error = joined_data
    |eval(lambda: abs("mean_before.mean_before" - "mean_after.mean_after"),
        lambda: abs("stddev_before.stddev_before" - "stddev_after.stddev_after"))
        .as('mean_dev', 'std_dev')
    |alert()
        // A threshold of 0 fires on any change in both values; raise it to suit your data.
        .crit(lambda: "mean_dev" > 0 AND "std_dev" > 0)
        .message('mean deviation: {{ index .Fields "mean_dev" }}, standard deviation change: {{ index .Fields "std_dev" }}')
        .slack()
    |log()
        .prefix('P0-5')
        .level('DEBUG')
```
The log nodes are not necessary (they just help with debugging). To compare week over week, just change the offset and shift.
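For anyone who finds the TICKscript hard to follow, the same comparison can be sketched in plain Python. This is only an illustration of the logic, not a replacement for the Kapacitor task. The function names, the sample windows, and the explicit limits are assumptions added here (the script above alerts on any non-zero change).

```python
from statistics import mean, stdev


def window_shift(previous, current):
    """Absolute change in mean and in standard deviation between two windows."""
    mean_dev = abs(mean(current) - mean(previous))
    std_dev = abs(stdev(current) - stdev(previous))
    return mean_dev, std_dev


def should_alert(previous, current, mean_limit, std_limit):
    """Alert only when both the mean and the spread moved past their limits."""
    mean_dev, std_dev = window_shift(previous, current)
    return mean_dev > mean_limit and std_dev > std_limit


# Made-up values for the previous and the current window.
last_window = [1.0, 2.0, 3.0]
this_window = [4.0, 6.0, 8.0]

print(window_shift(last_window, this_window))  # (4.0, 1.0)
print(should_alert(last_window, this_window, mean_limit=1.0, std_limit=0.5))  # True
```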
Hi @gregory_scafarto, sorry it took so long, but better late than never.
Where are you now with anomaly detection? I have both InfluxQL and Flux available to test different strategies.
Are you using any specific algorithm or ML package?
Hi,
I don’t work there anymore, but if I remember right, I used the ADTK toolkit (Anomaly Detection Toolkit (ADTK) — ADTK 0.6.2 documentation).
I implemented all the models of the library as Kapacitor UDFs.
But the most promising results were with an LSTM auto-encoder (built with TensorFlow), also implemented as a UDF.
I can share the example UDF template for ADTK if you want.
I was still on InfluxDB 1.8, but I think it could now be done in a simpler way with Flux.