Download (using Flux) and then re-upload modified data?

Hello everybody,

occasionally I need to correct single or multiple data points in my Influx2 DB, add some, or remove others. Since the built-in UI doesn’t have this capability, I want to script it. But before I start and head in the wrong direction: is it possible to UPDATE values in Influx2 databases at all? Can I retrieve annotated CSV using Flux, modify it (change _value columns, add or delete rows), and upload it again so that Influx updates the existing data?

If this is not directly possible, how do I accomplish this?

Thank you!

Does anybody have an idea? :slight_smile:

It is possible.

To update a point, you insert that very same point again with different field values; this causes an update.
For the overwrite to succeed, the new point must have the same series (measurement plus tag set) and the same timestamp.
Note that you can only update field values this way. Any other “update” (e.g. of tag values) requires a delete and a re-insert.

You can add rows as you like; just write them with the proper series so you can find them later.
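To make the “same series + timestamp” rule concrete, here is a sketch of the line protocol involved. All names below (water, meter=main, example-bucket) are placeholders I made up, not anything from an actual schema:

```shell
# Line protocol layout: measurement,tag_set field_set timestamp
MEASUREMENT=water
TAGS=meter=main
FIELDS=level=10
TS=1672531200000000000
POINT="$MEASUREMENT,$TAGS $FIELDS $TS"
echo "$POINT"
# Writing this point, then writing the same series + timestamp again with
# e.g. level=12, leaves a single point whose field value is 12 (an upsert):
#   influx write --bucket example-bucket "$POINT"
```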

OK, thank you :slight_smile: So the first potential showstopper is solved.
When editing, I can make the non-field values read-only.

Follow-up question: Is there a way to retrieve data in a format that could be re-inserted without change?
As far as I understand the docs, a query can only return CSV data, and this doesn’t include the annotation headers. But an insert requires either line protocol or fully annotated CSV. So it’s not possible to query a set of records in a format that could be inserted directly into another database / Influx instance / whatever. If this is true, one could say “Influx cannot speak to itself” … :slight_smile:

What am I missing? This must be possible, right?

I’m not knowledgeable enough in InfluxDB 2 to answer that… (I’m still using 1.8).

The following docs section might help you; it explains the annotated CSV format used for exporting/importing data:
https://docs.influxdata.com/influxdb/v2.1/reference/syntax/annotated-csv/

@Jens, you can write annotated CSV to InfluxDB, but currently only with the influx write command or through the InfluxDB UI.
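For completeness, a sketch of the full round trip with the CLI; the bucket, measurement, tag, and file names here are all placeholders of mine, not required names:

```shell
# Export annotated CSV (--raw keeps the #group/#datatype/#default rows),
# edit it, and write it back; all names here are placeholders:
#   influx query --raw 'from(bucket: "example-bucket") |> range(start: -1h)' > data.csv
#   ...edit the _value cells in data.csv...
#   influx write --bucket example-bucket --format csv --file data.csv
# A minimal annotated CSV body in that shape looks like this:
CSV='#group,false,false,false,false,true,true,true
#datatype,string,long,dateTime:RFC3339,double,string,string,string
#default,_result,,,,,,
,result,table,_time,_value,_field,_measurement,meter
,,0,2023-01-01T00:00:00Z,10,level,water,main'
printf '%s\n' "$CSV"
```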

However, if you’re doing the value updates in Flux, you can just use to() to write the modified values back to InfluxDB. to() structures the data as line protocol and writes it back to InfluxDB.

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "example-measurement")
    // Update field values
    |> map(fn: (r) => ({ r with _value: r._value * 10.0 }))
    |> to(bucket: "example-bucket")

This will overwrite the existing values with the updated values.

To add lines of data, you don’t really need to query data out. You can just write the new points with appropriate timestamps and tags and InfluxDB will figure out the rest.
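Something along these lines, for example (bucket, series, and timestamp are placeholders; this assumes a running InfluxDB with configured auth):

```shell
# Adding a point needs no prior query; a write with the proper series and
# a timestamp is enough (second precision here):
influx write --bucket example-bucket --precision s \
  'water,meter=main level=13 1672534800'
```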

To delete data, you need to use the /api/v2/delete API or influx delete command. More information is available in the documentation – Delete data from InfluxDB
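A hedged example of such a delete (bucket, measurement, tag value, and time range are all placeholders; this assumes a running InfluxDB with configured auth):

```shell
# Deletes select data by a time range plus a predicate on measurement/tags:
influx delete --bucket example-bucket \
  --start 2023-01-01T00:00:00Z \
  --stop 2023-01-02T00:00:00Z \
  --predicate '_measurement="water" AND meter="main"'
```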

@Giovanni_Luisotto, @scott, thank you!

My intention is to write an interactive “table editor” for Influx, somewhat like phpMyAdmin or SQL Studio, where users can edit and resubmit field values and (possibly later) also add new values or delete incorrect ones. To be able to do that, I need to retrieve data from a measurement and submit the modified values for update.

My impression is that I cannot keep the tabular format, but would have to convert the data to line protocol to be able to resubmit it to Influx as an update, taking care that all tags are preserved so that no new series are accidentally created.

Right?
Or is there an alternative?

Thanks,
Jens

Can anybody confirm whether my impression (see above) is correct?
Or is there an alternative?
Thanks!

I think you are right, but it also depends on how you interact with the DB, as some client libraries “hide” the line protocol.
I don’t know your use case, but InfluxDB does not encourage updates and deletes on data.

In a “standard” use case (like monitoring) I’d consider it madness to access the data at single-point granularity, as there are millions of data points (in my case, which is not that big, I write ~315,000 per minute).

Hello,
my use case can’t be that special. IoT devices make mistakes; sometimes I need to delete a day of data (or an hour, or one specific value) because it’s garbage. Such garbage distorts all average and aggregation calculations.
One example is a water consumption meter that reads an analog dial using a camera and an OCR engine trained by a neural network: GitHub - jomjol/AI-on-the-edge-device

I think that for any database, editing/updating existing data is a central feature and should at least be possible.

Jens

After doing some more research and really finding nothing, I whipped up something to make it at least possible. Because nothing should be impossible. :slight_smile:

Here’s the result, for anybody who may be interested.