Adding tags at a later time in v3

Is it possible in v3 to add tags at a later time, e.g., inserting IP addresses in real time and then later adding a tag with their host name in a batch?

@f1outsourcing No, this isn’t possible in v3. Each point is uniquely identified by its primary key (its timestamp and tag set). If you add a tag, the primary key changes, so the write doesn’t update the previously written point; it creates an entirely new one. For example, take the following line protocol:

m,ip=0.0.0.0 field1=1i 1712620800000000000
m,ip=0.0.0.0,host=host1 field1=1i 1712620800000000000

This line protocol will write two entirely separate points. In storage, it would look like this:

field1  host   ip       time
1              0.0.0.0  2024-04-09T00:00:00Z
1       host1  0.0.0.0  2024-04-09T00:00:00Z

However, because fields are not part of the primary key, if you were to add the host value as a field, it would update the existing point:

m,ip=0.0.0.0 field1=1i 1712620800000000000
m,ip=0.0.0.0 host="host1",field1=1i 1712620800000000000

Which would result in this in storage:

field1  host   ip       time
1       host1  0.0.0.0  2024-04-09T00:00:00Z
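
To make the difference concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not InfluxDB code: the `write` helper and the dictionary "store" are made up for illustration, and the key is assumed to be the measurement, tag set, and timestamp.

```python
# Toy model only: a dict keyed by (measurement, tag set, timestamp).
# Writing to an existing key merges fields; a new key creates a new point.

def write(store, measurement, tags, fields, ts):
    key = (measurement, tuple(sorted(tags.items())), ts)
    store.setdefault(key, {}).update(fields)
    return store

ts = 1712620800000000000

# Case 1: host added as a tag, so the key differs and a second point appears.
as_tag = {}
write(as_tag, "m", {"ip": "0.0.0.0"}, {"field1": 1}, ts)
write(as_tag, "m", {"ip": "0.0.0.0", "host": "host1"}, {"field1": 1}, ts)

# Case 2: host added as a field, so the key is unchanged and the point is updated.
as_field = {}
write(as_field, "m", {"ip": "0.0.0.0"}, {"field1": 1}, ts)
write(as_field, "m", {"ip": "0.0.0.0"}, {"host": "host1", "field1": 1}, ts)

print(len(as_tag))    # 2
print(len(as_field))  # 1
```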

With the v3 storage engine, there’s very little difference between tags and fields when it comes to storage. However, the choice has implications at query time and if you ever want to custom-partition your data (which you can only do using timestamps and tags).

I personally would not go this route. Rather than a batch process that updates data after it’s been written, I’d try to process the data on its way in and add the host tag at that point. This is something you could use Telegraf to do.
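
If you'd rather do the enrichment in your own code than in Telegraf, a minimal sketch might look like the following. The function names `resolve_host` and `to_line` are hypothetical, the measurement `m` and field `field1` are taken from the example above, and it assumes a reverse DNS lookup is how you would get the host name.

```python
import socket

def resolve_host(ip):
    """Reverse DNS lookup; returns None when no name is found."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None

def to_line(ip, value, ts_ns, resolver=resolve_host):
    """Build one line of line protocol, adding the host tag when it resolves.

    Note: tag values containing spaces, commas, or equals signs would need
    escaping; plain host names and IP addresses do not.
    """
    host = resolver(ip)
    tags = f"ip={ip}"
    if host:
        tags += f",host={host}"
    return f"m,{tags} field1={value}i {ts_ns}"
```

The resolver is passed in as a parameter so that you can swap in a cached or static lookup instead of hitting DNS for every point.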

Thanks for this detailed answer, Scott. I have to read up a bit on the different applications for fields and tags. If I have to do this in real time, I’d probably waste a lot of resources; I guess 80% of this data would be unnecessary.

Something you might consider doing is writing all the “raw” data to one database that you can later query, process, and then write to a different database. The “raw” database would only need to retain data long enough to process it, so the retention period could be pretty short. That way, you’re not trying to update a point that’s already stored, you’re using stored data to build new points and store them in a separate location.
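
As a rough sketch of the processing step in that approach: the `enrich` function below is hypothetical, and it assumes the rows queried from the "raw" database arrive as dictionaries with `ip`, `field1`, and `time` (nanoseconds) keys. Reading from the raw database and writing the resulting lines to the second database are left to whatever client you use.

```python
def enrich(raw_rows, resolver):
    """Build new line protocol from rows read out of the raw database.

    Each distinct IP is resolved only once per batch, which is where a
    batch job can save work compared with resolving every point.
    """
    hosts = {ip: resolver(ip) for ip in {row["ip"] for row in raw_rows}}
    lines = []
    for row in raw_rows:
        host = hosts[row["ip"]]
        tags = f"ip={row['ip']}" + (f",host={host}" if host else "")
        lines.append(f"m,{tags} field1={row['field1']}i {row['time']}")
    return lines
```

Because the output is a set of brand-new points with the host tag already in place, nothing that is already stored has to be updated.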