DROP MEASUREMENT too slow & blacklisting measurements

We are using the Telegraf statsd plugin. We have legacy code that emits tons of measurements which are not optimized for InfluxDB and which we don't actually use. As we don't have dev resources to change it, we don't want it to bombard InfluxDB. It has currently generated about 700,000 measurements. I am running a Python program which iterates over all 700,000 measurements and runs DROP MEASUREMENT against each one individually, and it is running very slowly. Is there any way to blacklist the measurements on the server side? We did try namedrop in Telegraf (under [[inputs.statsd]]), but somehow it is not working. Please advise.
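
The cleanup script is essentially a loop like the sketch below (simplified, using the influxdb Python client; host and database names are placeholders):

from influxdb import InfluxDBClient

# Placeholder connection details
client = InfluxDBClient(host="localhost", port=8086, database="telegraf")

# List every measurement, then drop them one at a time.
# Each DROP MEASUREMENT is its own query, which is why this takes so long
# with hundreds of thousands of measurements.
for m in client.get_list_measurements():
    name = m["name"]
    client.query('DROP MEASUREMENT "{}"'.format(name))
    print("dropped", name)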

Any suggestions? We badly need to address this problem.

@davidgs any thoughts?

@hellobhaskar Can you post your statsd configuration? Then maybe we can figure out why the namedrop setting isn't working for you. We might also be able to show you how to set up the config templates so the stats are sent in a more InfluxDB-friendly way. Please include an example of the stats that are being sent, too.
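
Roughly, the templates option under [[inputs.statsd]] is what maps a dotted bucket name onto a measurement, tags and fields instead of creating one measurement per bucket. A minimal sketch with a made-up bucket pattern (the real filter and template have to match your metric names):

[[inputs.statsd]]
  ## Illustrative only: this would map a bucket such as "cpu.us-west.idle"
  ## to measurement "cpu", tag region=us-west and field "idle".
  ## See Telegraf's template pattern docs for the full syntax.
  templates = [
    "cpu.* measurement.region.field",
  ]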

@daniel I have the following setup

OS: Ubuntu 18.04.2 LTS
Telegraf: 1.10.2-1
InfluxDB: 1.1.1 

Telegraf config

[[inputs.statsd]]

[[outputs.influxdb]]
  namedrop = ["uwsgi.worker.*.core.*.write_errors", "uwsgi.worker.*.respawns"]

Telegraf logs after restarting

(venv) vagrant@ubuntu:~$ sudo service telegraf status
● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
   Loaded: loaded (/lib/systemd/system/telegraf.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2019-06-01 10:47:02 UTC; 4s ago
     Docs: https://github.com/influxdata/telegraf
 Main PID: 959 (telegraf)
    Tasks: 11 (limit: 4703)
   CGroup: /system.slice/telegraf.service
           └─959 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d

Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z I! Loaded processors:
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z I! Loaded outputs: influxdb
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z I! Tags enabled: host=ubuntu
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ubuntu", Flush Interval:10s
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z D! [agent] Connecting outputs
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z D! [agent] Attempting connection to output: influxdb
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z D! [agent] Successfully connected to output: influxdb
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z D! [agent] Starting service inputs
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z I! Statsd UDP listener listening on:  [::]:8125
Jun 01 10:47:02 ubuntu telegraf[959]: 2019-06-01T10:47:02Z I! Started the statsd service on :8125

Measurements in InfluxDB

> show measurements
name: measurements
name
----
alerta_alerts
cpu
disk
diskio
kernel
mem
processes
swap
system

Now, I tried emitting some statsd metrics

echo "uwsgi.worker.0.core.0.write_errors:1|c" | nc -w 1 -u localhost 8125
echo "uwsgi.worker.1.core.0.write_errors:1|c" | nc -w 1 -u localhost 8125
echo "uwsgi.worker.0.delta_requests:1|c" | nc -w 1 -u localhost 8125
echo "uwsgi.worker.1.delta_requests:1|c" | nc -w 1 -u localhost 8125
echo "uwsgi.worker.0.avg_response_time:10|c" | nc -w 1 -u localhost 8125
echo "uwsgi.worker.1.avg_response_time:10|c" | nc -w 1 -u localhost 8125

Measurements in InfluxDB after emitting statsd metrics

> show measurements
name: measurements
name
----
alerta_alerts
cpu
disk
diskio
kernel
mem
processes
swap
system
uwsgi_worker_0_avg_response_time
uwsgi_worker_0_core_0_write_errors
uwsgi_worker_0_delta_requests
uwsgi_worker_1_avg_response_time
uwsgi_worker_1_core_0_write_errors
uwsgi_worker_1_delta_requests

> 

My expectation is that Telegraf should drop the measurements which match those patterns, but it is not doing so.

@daniel @davidgs can you please help me out?

First, are you really using InfluxDB 1.1.1? That’s a very old version, and an upgrade there might help.

I see that Telegraf is creating the measurements by replacing the . with _, so have you tried adding

namedrop = ["uwsgi_worker_*_core_*_write_errors", "uwsgi_worker_*_respawns"]

to your config to see if it then drops those? I haven't had time to check the code to see whether it evaluates the namedrop rule before or after converting the .s to _s, but it's worth checking out.
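
If it does turn out to be the converted name that is matched, the input section would look something like this (same patterns as above, just written against the underscore names; adjust them to whatever you actually want to block):

[[inputs.statsd]]
  ## namedrop can also be placed on the output plugin; either way, these
  ## patterns assume the "." has already been converted to "_".
  namedrop = ["uwsgi_worker_*_core_*_write_errors", "uwsgi_worker_*_respawns"]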

Best Regards,
dg

@davidgs @daniel That was an error in the test setup; we are actually running InfluxDB 1.7.6. With the patterns corrected, the initial test is working fine, but we will do a thorough check on the actual metrics. Can you please help me with how to speed up DROP MEASUREMENT? I have 13 million measurements to delete, and each measurement takes about 2 minutes to drop. We are running on NVMe SSDs and have plenty of RAM and CPU.

@davidgs @daniel Please let me know how I can speed up dropping the measurements. We are using InfluxDB 1.7.6.

Hi @hellobhaskar,

Is it an option to duplicate the “good” measurements into a new database?

And then drop the database with the 13,000,000 measurements?

Then continue working on the new database, or recreate the “old” one and copy the data back?

And of course, as David suggested, replace the . with _ in the namedrop …
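
A rough sketch of that copy in InfluxQL (the database names and the regex are only examples; GROUP BY * keeps the tags as tags instead of turning them into fields):

-- example names only
CREATE DATABASE telegraf_clean

SELECT * INTO "telegraf_clean"."autogen".:MEASUREMENT
  FROM "telegraf"."autogen"./^(cpu|mem|disk|diskio|system)$/
  GROUP BY *

-- once the copy has been verified
DROP DATABASE "telegraf"

For a large dataset you would probably want to run the SELECT … INTO in time-bounded chunks (WHERE time >= … AND time < …) rather than in one go.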

@MarcV @davidgs @daniel Unfortunately that is too tedious, as we have tons of measurements.
Can you please tell me: if I do
delete * from /mymeasurementregex/;

will that drop the points as well as the measurements?

Our immediate problem is that Chronograf/Grafana freezes when populating the measurement list. We have implemented the namedrop suggestion.