InfluxDB Performance when handling duplicate entries

Dear Influx Community member,

I am currently implementing a project where we collect data from machines via MQTT and a Unified Namespace (UNS). InfluxDB will be connected via a data bridge to the MQTT broker where the UNS data model lives, and it will ingest time series data from there.

In this particular case, the data that should be loaded as fields into a single row of an InfluxDB measurement can arrive over a period of a couple of minutes. My question is as follows:

  1. Could I just send the individual fields into InfluxDB with their timestamp and tags, letting InfluxDB take care of adding the new fields to the already existing row?

  2. Or should I assemble the data into complete rows myself before inserting them into InfluxDB, to minimize the need for InfluxDB to merge duplicates? (A sketch of both options follows below.)
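
To make the two options concrete, here is a rough sketch of what I mean, assuming InfluxDB 2.x with the influxdb-client Python package; the URL, token, org, bucket, measurement, and field names are only placeholders:

```python
# Sketch of the two write strategies; connection details and names are placeholders.
from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

ts = 1700000000000000000  # one shared timestamp, in nanoseconds

# Option 1: write each field as it arrives. All three points share the same
# measurement, tag set, and timestamp, so InfluxDB merges them into one row
# containing the union of the fields.
write_api.write(bucket="machines", record=f"machine_state,machine=press_01 temperature=72.5 {ts}")
write_api.write(bucket="machines", record=f"machine_state,machine=press_01 pressure=1.8 {ts}")
write_api.write(bucket="machines", record=f"machine_state,machine=press_01 speed=1200i {ts}")

# Option 2: assemble the fields myself and write a single complete row.
write_api.write(
    bucket="machines",
    record=f"machine_state,machine=press_01 temperature=72.5,pressure=1.8,speed=1200i {ts}",
)

client.close()
```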

I am asking mainly because of the performance impact on InfluxDB of merging such duplicates.

Some googling got me answers about how InfluxDB handles such duplicates, but nothing about the performance impact.

I look forward to your insights!

Robbert

Hello @Robbert,
You can’t merge measurements. You can use tasks to transform data and write it back to InfluxDB, but I don’t recommend this.
InfluxDB is schema-on-write, so I recommend organizing the data into complete rows before inserting them into InfluxDB.
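
If it helps, here is a rough sketch of what I mean by organizing the data first, assuming the influxdb-client Python package for InfluxDB 2.x; the measurement, expected field set, and connection details are only placeholders. The idea is to buffer the fields per machine and timestamp as they trickle in over MQTT, and write one complete point once the row is full:

```python
# Minimal sketch of the "organize first, write once" approach; names are placeholders.
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

EXPECTED_FIELDS = {"temperature", "pressure", "speed"}  # assumed known per machine

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

pending = {}  # (machine, ts_ns) -> {field_name: value}, partially assembled rows

def on_field(machine, ts_ns, field, value):
    """Collect one field from MQTT; write a single point once the row is complete."""
    fields = pending.setdefault((machine, ts_ns), {})
    fields[field] = value
    if EXPECTED_FIELDS <= fields.keys():  # all fields arrived -> flush once
        point = Point("machine_state").tag("machine", machine).time(ts_ns)
        for key, val in fields.items():
            point = point.field(key, val)
        write_api.write(bucket="machines", record=point)
        del pending[(machine, ts_ns)]
```

In a real setup you would also flush on a timeout, since your fields can arrive over a couple of minutes and a missing field would otherwise leave a row stuck in the buffer.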