How to improve write performance

We have 50k sensors with 100 measurements per sensor, and each sensor reports every 20 seconds.

Series cardinality = 50k * 100 = 5M
Data points per second = 50k * 100 / 20s = 250k/sec

I’ve created one umbrella measurement to store all the readings, and here is a sample point:

{
    "SensorId":1176, // Tag       
    "MeasurementId":2001, // Tag
    "Value":100.234,  // Field
    "Timestamp":1508843693711475301
}
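
In line protocol this maps to roughly the following (a sketch; the umbrella measurement name sensor is just for illustration):

sensor,SensorId=1176,MeasurementId=2001 Value=100.234 1508843693711475301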

Telegraf consumes the data from RabbitMQ via the amqp_consumer plugin and writes it to InfluxDB.

telegraf.conf (only the changed fields are shown; defaults are omitted):

[agent]
metric_batch_size = 10000

[[inputs.amqp_consumer]]
prefetch_count = 10000
data_format = "json"
tag_keys = [
  "MeasurementId",
  "SensorId"
]

[[outputs.influxdb]]
content_encoding = "gzip"
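
For completeness, the surrounding agent settings, with the values I believe are still at their defaults marked as assumptions:

[agent]
metric_batch_size = 10000
## assumed default; how many metrics can be buffered in memory when the output cannot keep up
metric_buffer_limit = 10000
## assumed default; how often buffered metrics are flushed to InfluxDB
flush_interval = "10s"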

Machine configuration:
RAM : 16GB
CPU : Intel® Core™ i7-3770 CPU @ 3.40GHz
DISK Type : SSD

What should I do additionally to improve the write performance?

One idea you could try is to change the MeasurementId to be the field name for the Value; this would reduce the series cardinality back to 50k. You could also potentially transfer multiple values per line with this schema, but it would require changing the input format.

I’m curious, how many data points are you able to insert per second with a single Telegraf instance?

I have multiple measurements, and MeasurementId is the unique key. If I change it to a field, values will be overwritten because multiple measurements share the same timestamp.

Here is how it would look in line protocol. You can have many measurements on the same series without them overwriting each other, because each measurement ID becomes a unique field key:

sensor,sensor_id=1176 2001=100.234 1508843693711475301
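
And, as mentioned above, you could pack several measurement IDs for the same sensor and timestamp into one line as separate fields (the extra IDs and values here are made up for illustration):

sensor,sensor_id=1176 2001=100.234,2002=98.7,2015=0.5 1508843693711475301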

Hi Daniel, your suggestion helped us a lot. Both read and write performance improved significantly. I had designed the schema with a MySQL mindset and totally forgot that different sensors can have different numbers of fields.
