Hi, we have run into a problem.
We’ve been uploading a lot of data into one measurement, but it turned out that the timezone selected in our app was wrong, so the data in this measurement has incorrect timestamps. It needs to be corrected by adding two hours to each data point.
So, what’s the easiest way to update the data that is already stored?
Hello @Sky_Gt,
You could use Flux to shift the time in your query with timeShift():
timeShift(duration: 2h, columns: ["_start", "_stop", "_time"])
Otherwise, if you’re looking to implement this change at the database level, you will need to pull the data out and rewrite it with the correct timestamps. The timestamp acts as part of the primary key for each point (together with the measurement and tag set), so you can’t update it in place. Updates in general are difficult in InfluxDB; this was a design decision to allow for greater performance.
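As a rough illustration of the rewrite step: if you export the points as line protocol, correcting them is just arithmetic on the trailing timestamp. The sketch below assumes every line ends with a nanosecond-precision timestamp. The helper names (shift_line, shift_lines) are made up for this example, and the code does not talk to InfluxDB at all.

```python
# Assumption: timestamps are in nanoseconds, the line protocol default.
TWO_HOURS_NS = 2 * 60 * 60 * 1_000_000_000


def shift_line(line: str, offset_ns: int = TWO_HOURS_NS) -> str:
    """Return one line-protocol line with its trailing timestamp shifted.

    The timestamp is the last space-separated token, so splitting from the
    right keeps string field values that contain spaces intact. A line
    without a timestamp raises ValueError instead of being silently changed.
    """
    body, timestamp = line.rstrip("\n").rsplit(" ", 1)
    return f"{body} {int(timestamp) + offset_ns}"


def shift_lines(lines, offset_ns: int = TWO_HOURS_NS):
    """Shift every data line, skipping blank lines and '#' comments."""
    return [
        shift_line(line, offset_ns)
        for line in lines
        if line.strip() and not line.startswith("#")
    ]


if __name__ == "__main__":
    print(shift_line("cpu,host=a usage=0.5 1609459200000000000"))
```

The shifted lines can then be written back, ideally to a separate measurement or bucket so the old and new points can’t be confused, and the original points deleted afterwards.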
Hi guys, I’m in a situation where I don’t know the exact time the data should be written with; I only find that out at a later point. I’m dumping the data into a bucket, let’s say “bucket1”. How can I correct the time afterwards? My idea was to write a task that keeps polling a REST API, which returns the time offset for the data once it is available. After getting the offset, the task would rewrite the data into a new bucket with the corrected time. Has anyone tried that? We haven’t found a good way of doing it; can you point me to an example?
In more detail: the task queries the REST API, which returns a time window (t_start to t_end) and a time offset t_offset. I can then query the bucket for all data between t_start and t_end and rewrite it to another bucket, adding t_offset to its time column. After that, I need to remove this data from the original bucket.
Hello @yash.singh,
You’ve got the correct idea.
You’d want to create a task where you query your data and use timeShift() to change the timestamps.
However, using a task assumes that you want to be continuously querying data from your “bucket1” and applying the same manipulation to the timestamp. Is that correct?
It sounds to me like this isn’t something you’re looking to perform on a schedule, because the time shift you want to apply keeps changing. Where are you getting the information about how to change the timestamps? You might be able to gather that value and use it to conditionally transform your data from “bucket1” in a task. However, it sounds to me more like something that needs to be done ad hoc, in which case you don’t need a task. You can just use a query. If that’s true, your query might look something like:
from(bucket: "bucket1")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "mymeasurement")
|> filter(fn: (r) => r["_field"] == "myfield")
|> filter(fn: (r) => r["mytag"] == "mytagvalue")
|> timeShift(duration: 10h, columns: ["_start", "_stop", "_time"])
|> to(bucket: "bucket2")
To remove that data from the original bucket, you would use a delete with a predicate, via the CLI or the API:
influx delete --bucket example-bucket \
--start '1970-01-01T00:00:00Z' \
--stop $(date -u +"%Y-%m-%dT%H:%M:%SZ") \
--predicate '_measurement="example-measurement" AND exampleTag="exampleTagValue"'
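If the offset differs per window, the same two steps can be driven from a small script. Here is a minimal sketch, assuming each window arrives as a dict with t_start, t_end (RFC3339 strings) and t_offset (a Flux duration such as 2h). That shape, the function names, and the bucket and measurement names are all placeholders. The code only builds the Flux query and the influx delete arguments; actually running them (for example with a client library or subprocess) is left out.

```python
def build_shift_query(src, dst, measurement, start, stop, offset):
    """Build a Flux query that copies one window to dst with shifted time."""
    return "\n".join([
        f'from(bucket: "{src}")',
        f"    |> range(start: {start}, stop: {stop})",
        f'    |> filter(fn: (r) => r["_measurement"] == "{measurement}")',
        f'    |> timeShift(duration: {offset}, columns: ["_start", "_stop", "_time"])',
        f'    |> to(bucket: "{dst}")',
    ])


def build_delete_args(bucket, measurement, start, stop):
    """Build the argument list for `influx delete` over the same window."""
    return [
        "influx", "delete",
        "--bucket", bucket,
        "--start", start,
        "--stop", stop,
        "--predicate", f'_measurement="{measurement}"',
    ]


def plan(windows, src, dst, measurement):
    """Return one (flux_query, delete_args) pair per window, in order."""
    return [
        (
            build_shift_query(src, dst, measurement,
                              w["t_start"], w["t_end"], w["t_offset"]),
            build_delete_args(src, measurement, w["t_start"], w["t_end"]),
        )
        for w in windows
    ]


if __name__ == "__main__":
    # Hypothetical window, standing in for whatever the REST API returns.
    windows = [{"t_start": "2021-01-01T00:00:00Z",
                "t_end": "2021-01-02T00:00:00Z",
                "t_offset": "2h"}]
    for query, delete_args in plan(windows, "bucket1", "bucket2", "mymeasurement"):
        print(query)
        print(" ".join(delete_args))
```

Whatever runs these should only issue the delete for a window after the shifted copy has been written successfully, since the delete can’t be undone.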
Thanks, Ana. This is how we are doing it now. Because we need a loop, we are using an external Python script to run it.