Adding tags at a later time in v3

f1outsourcing · April 9, 2024, 9:11am

Is it possible in v3 to add tags at a later time, e.g. inserting IP addresses in real time and then later adding a tag in a batch with their host name?

scott · April 9, 2024, 2:21pm

@f1outsourcing No, this isn’t possible in v3. Each point is uniquely identified by its primary key (its timestamp and tag set). If you add a tag, the primary key is different, so the write will not update the previously written point but will create an entirely new one. For example, using the following line protocol:

m,ip=0.0.0.0 field1=1i 1712620800000000000
m,ip=0.0.0.0,host=host1 field1=1i 1712620800000000000

This line protocol will write two entirely separate points. In storage, it would look like this:

field1	host	ip	time
1		0.0.0.0	2024-04-09T00:00:00Z
1	host1	0.0.0.0	2024-04-09T00:00:00Z

However, because fields are not part of the primary key, if you were to add the host value as a field, it would update the existing point:

m,ip=0.0.0.0 field1=1i 1712620800000000000
m,ip=0.0.0.0 host=host1,field1=1i 1712620800000000000

Which would result in this in storage:

field1	host	ip	time
1	host1	0.0.0.0	2024-04-09T00:00:00Z
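
To make the two cases above concrete, here is a minimal Python sketch of the identity rule. It is a toy model only, not how the v3 storage engine is implemented: the `write` function and the dictionary-based `store` are hypothetical, and the measurement name is included in the key as an assumption.

```python
# Toy model only: this is NOT how InfluxDB v3 stores data internally.
# A point is keyed by (measurement, tag set, timestamp); fields are merged.

def write(store, measurement, tags, fields, timestamp):
    """Write one point into the toy store, merging fields on a key match."""
    key = (measurement, frozenset(tags.items()), timestamp)
    store.setdefault(key, {}).update(fields)

ts = 1712620800000000000

# Case 1: host added as a tag. The key changes, so a second point is created.
tag_store = {}
write(tag_store, "m", {"ip": "0.0.0.0"}, {"field1": 1}, ts)
write(tag_store, "m", {"ip": "0.0.0.0", "host": "host1"}, {"field1": 1}, ts)

# Case 2: host added as a field. The key is unchanged, so the point is updated.
field_store = {}
write(field_store, "m", {"ip": "0.0.0.0"}, {"field1": 1}, ts)
write(field_store, "m", {"ip": "0.0.0.0"}, {"host": "host1", "field1": 1}, ts)

print(len(tag_store), len(field_store))
```

The first store ends up with two entries and the second with one, matching the two tables above.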

With the v3 storage engine, there’s very little difference between tags and fields when it comes to storage. However, there are implications at query time and if you ever want to custom-partition your data (which you can only do using timestamps and tags).

I personally would not go this route. Rather than a batch process that updates data after it’s been written, I’d try to process the data on its way in and add the host tag. This is something you could use Telegraf to do.
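
As a rough illustration of adding the tag on the way in, here is a sketch in plain Python rather than Telegraf (no Telegraf configuration is shown here). The `add_host_tag` function and the `HOSTNAMES` lookup table are hypothetical, and the parsing is simplified: it assumes no escaped spaces or commas in the measurement or tag set.

```python
def add_host_tag(line, resolve):
    """Insert a host tag into one line-protocol line before it is written.

    Simplification: assumes no escaped spaces or commas in the
    measurement name or tag set. `resolve` maps an IP to a host or None.
    """
    # head is "measurement,tag=value,..."; rest is the field set and timestamp.
    head, rest = line.split(" ", 1)
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)

    host = resolve(tags.get("ip"))
    if host is None:
        return line  # unknown IP: leave the line unchanged

    tags["host"] = host
    # Emit tags sorted by key.
    tag_set = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{measurement},{tag_set} {rest}"

HOSTNAMES = {"0.0.0.0": "host1"}  # hypothetical lookup table

enriched = add_host_tag("m,ip=0.0.0.0 field1=1i 1712620800000000000", HOSTNAMES.get)
print(enriched)
```

Because the host tag is present when the point is first written, there is never an older point without it that would need updating.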

f1outsourcing · April 10, 2024, 9:01am

Thanks for this detailed answer, Scott. I have to read up a bit on the different applications for fields and tags. If I have to do this in real time, I’d probably waste a lot of resources; I guess 80% of this data would be unnecessary.

scott · April 10, 2024, 2:09pm

Something you might consider doing is writing all the “raw” data to one database that you can later query, process, and then write to a different database. The “raw” database would only need to retain data long enough to process it, so the retention period could be pretty short. That way, you’re not trying to update a point that’s already stored; you’re using stored data to build new points and store them in a separate location.
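
A minimal sketch of the processing step in that raw-to-processed flow, in Python. No real client calls are shown: `raw_rows` stands in for whatever a query against the “raw” database returns, and `build_processed_lines`, the row layout, and the hostname table are all hypothetical. Dropping rows with no known host is also an assumption, meant to reflect the case where much of the raw data is not needed.

```python
def build_processed_lines(raw_rows, hostnames):
    """Build new line-protocol lines, with a host tag, from raw rows.

    Rows whose IP has no known host are skipped (an assumption made
    for this sketch, not a requirement of the approach).
    """
    lines = []
    for row in raw_rows:
        host = hostnames.get(row["ip"])
        if host is None:
            continue
        lines.append(
            f'm,host={host},ip={row["ip"]} field1={row["field1"]}i {row["time"]}'
        )
    return lines

# Stand-in for rows queried from the short-retention "raw" database.
raw_rows = [
    {"ip": "0.0.0.0", "field1": 1, "time": 1712620800000000000},
    {"ip": "10.0.0.9", "field1": 1, "time": 1712620800000000000},
]

processed = build_processed_lines(raw_rows, {"0.0.0.0": "host1"})
print(processed)
```

The resulting lines would then be written to the second database as brand-new points, so nothing already stored is modified.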
