Filter data glitches

sauvant · October 1, 2019, 12:57pm

Hi experts,

I started working with InfluxDB (and Grafana) a few weeks ago and have already fallen in love.

Currently I collect 8 integer values (load cell data, A/D-converted by 8 HX711s) every 25ms. This works fine, including the nice dashboard possibilities in Grafana. But a strange effect in the data source results in data glitches every now and then. Normal values are below 1000 without load and go up to 50000 with load on the load cells. Single values in between are simply measured wrong. The wrong values are different for the 8 cells but more or less static for each one of them (cell1: 34618-34621, cell2: 152776 or 152778, …). That means whenever a wrong value appears, it is about the same wrong value per load cell. Strange, and unfortunately hard to avoid.

Logically it is not too complex to handle these records:

Value differs more than GLITCH_MIN_DIFF from direct predecessor and successor? Delete it. (Or change it to the average of predecessor and successor). This would not even consider the fact that the wrong values are quite predictable per load cell.
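The rule above can be sketched in a few lines of Python, applied to the values of one load cell after they have been read from the database. This is only a minimal sketch: the function name `filter_glitches`, the threshold value of 10000 and the sample numbers are assumptions for illustration, not part of any InfluxDB or Grafana API.

```python
from typing import List

# Assumed threshold; it has to be tuned to the real load cell data.
GLITCH_MIN_DIFF = 10000


def filter_glitches(values: List[int],
                    min_diff: int = GLITCH_MIN_DIFF,
                    replace: bool = True) -> List[int]:
    """Handle single points that differ by more than min_diff from BOTH
    their direct predecessor and their direct successor.

    replace=True  -> the point becomes the integer average of its neighbours
    replace=False -> the point is deleted
    The first and last points have only one neighbour and are kept as-is.
    """
    out = []
    for i, v in enumerate(values):
        if 0 < i < len(values) - 1:
            prev, nxt = values[i - 1], values[i + 1]
            # Always compare against the original neighbours.
            if abs(v - prev) > min_diff and abs(v - nxt) > min_diff:
                if replace:
                    out.append((prev + nxt) // 2)
                continue
        out.append(v)
    return out


if __name__ == "__main__":
    # Idle readings with one spike like the ones seen on cell1.
    print(filter_glitches([100, 120, 34619, 110, 130]))
```

Because a point is only touched when it is far away from both neighbours, a real load change that lasts for more than one sample is left alone, as long as its step per sample stays below the threshold.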

Can you help me implement such a filter in InfluxDB (it could run periodically to clean up new data) or in Grafana (filtering while reporting)?

Thanks and best regards
Keith

Anaisdg · October 1, 2019, 5:22pm

Hello @sauvant,
Thanks for your question.
It sounds like you want to use a Continuous Query, but I’m having trouble understanding your problem. Could you please share some of your data in line protocol (lp) so I can get a better understanding of your schema and what you’re trying to accomplish?

sauvant · October 1, 2019, 6:01pm

Hi, thanks for your help!

This is what my data looks like on the Grafana dashboard: http://ksau.de/temp/hx711.png

Every color is one of the load cells. The data is simply 8 integers with a timestamp. The diagram shows an idle situation. All real measurements are near 0; everything else is a false measurement. I want to filter those out.

The real data CAN reach the values of the wrong spikes, but real data never has this gradient. The bad spikes are only ONE record surrounded by normal values. I need some kind of low-pass filter for the data, or a query that considers, let’s say, 3 consecutive values and drops the middle one if it exceeds MAX_DIFF from the other two…

Does that make sense?

Anaisdg · October 3, 2019, 6:26pm

Hello @sauvant,
Yes. Thank you for the detail. Are you using 1.x or 2.0? Now that I understand what you’re trying to do, can you help me understand the bigger picture of what you’re trying to accomplish?

sauvant · October 6, 2019, 12:44pm

Hi,

Thanks again for your attention.
I am using the latest 1.x. Meanwhile I found out that “median” nearly does what I need. Currently I group by time and aggregate the values using median(). That eliminates the “bad” values as they are way off. But it decreases the resolution. Is there a way to group the “latest 3” points, with this window of three travelling through all the data…? Some kind of “moving median”? Is that possible with the advanced grouping possibilities?
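To make the “moving median” idea concrete, here is a small Python sketch of a window of 3 points sliding over the values of one load cell, so the output keeps one value per input point. It runs on the client side on data that has already been queried; the function name `moving_median` and the sample numbers are assumptions for illustration, and it does not show an InfluxQL feature.

```python
from statistics import median
from typing import List


def moving_median(values: List[int], window: int = 3) -> List[int]:
    """Replace every point by the median of the window centred on it.

    window must be odd. Points too close to the start or end to have a
    full window are kept unchanged, so the output has the same length
    (and therefore the same resolution) as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    out = []
    for i, v in enumerate(values):
        if i < half or i + half >= len(values):
            out.append(v)  # no full window available here
        else:
            out.append(median(values[i - half:i + half + 1]))
    return out


if __name__ == "__main__":
    # Idle readings with one single-sample spike.
    print(moving_median([100, 120, 34619, 110, 130]))
```

With a window of 3, a spike that is only one record wide can never be the middle value of its window, so it disappears, while the number of points stays the same as in the raw data.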

Best regards
Keith
