New to InfluxDB, questions about writing and retention

Hi there!

We are currently evaluating InfluxDB for use as a process historian for our SCADA product.
We came across the following questions that we could not answer from the online documentation, so perhaps someone can assist:

  1. When writing data using “/write” we get a response from the database. At this point (i.e. when we receive the response from the server), assuming it is an “OK” with no error, is it guaranteed that the data is persisted in the DB (at least in the WAL)? Background: We buffer data locally and want to be able to clear the buffer after the write returns. This requires that no data is lost if the system crashes at exactly that point in time (i.e. after we receive the response).

  2. We define a retention policy of, let’s say, 1 year. Now we back up a shard, which then gets deleted after this year. Two years later we want to import this shard again, as the data in it is needed again. What happens then? Will the shard automatically be deleted, as its time period is beyond the retention period? Or is it better for this kind of use case to set retention to unlimited and have our application do the deletion “manually” after one year?

  3. If I change the retention policy of a series, are the points written with the previous retention policy still queryable?
    Example: We write points into a specific series with retention policy “2weeks”. Starting at a given point in time, we write data for this series using another retention policy, let’s say “4weeks”. As the retention policy cannot be specified in a query, will InfluxDB automatically find data for this series in both “policies”?

Thank you very much,
Ewald

For #1, I believe yes: once you get an OK response code (200 or 204) back from the write interface, you can safely assume it’s written (or at least, should be written). An Influx engineer may be able to provide more technical detail on this situation.
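To make the "clear the buffer only after a successful write" idea concrete, here is a minimal Python sketch. It assumes InfluxDB 1.x, where a successful `/write` returns 204 No Content, and it treats only that code as an acknowledgement. The names `flush_buffer` and `http_send`, the URL, and the database name `scada` are all made up for illustration. It does not answer whether a 204 implies the WAL has been flushed to disk; it only shows the client-side bookkeeping.

```python
import urllib.error
import urllib.request
from typing import Callable, List


def http_send(body: str, url: str = "http://localhost:8086/write?db=scada") -> int:
    """POST line-protocol text and return the HTTP status code.

    The URL and database name are placeholders, not taken from the thread.
    """
    req = urllib.request.Request(url, data=body.encode("utf-8"), method="POST")
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; report their status code.
        return err.code


def flush_buffer(buffer: List[str], send: Callable[[str], int] = http_send) -> bool:
    """Send all buffered points; clear the buffer only on HTTP 204."""
    if not buffer:
        return True
    try:
        status = send("\n".join(buffer))
    except OSError:
        # Connection problems: keep the points so they can be retried.
        return False
    if status == 204:
        buffer.clear()
        return True
    # Any other status: keep the points and let the caller retry.
    return False
```

The sender is passed in as a parameter so the retry logic can be exercised without a running server, e.g. `flush_buffer(points, lambda body: 204)`.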
#2, I’m not sure what to recommend for you here, as I haven’t thought about archival of data in Influx before.
#3, you can specify the retention policy in the query.
You specify it before the measurement name using dot notation,
e.g. select * from "2 weeks"."my measurement name"
or select * from "4 weeks"."my measurement name"
And yes, data written into different retention policies ends up in totally separate buckets, and your query needs to pull the data from the correct one. For that reason, it’s probably better to give each retention policy a single, duration-independent name, e.g. short term or long term. If you need to change the duration of that retention policy, you just change it; the name stays the same, so all the data is still in the same bucket and the queries can still pull whatever data is in that RP and measurement.
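As a rough illustration of the dot notation and of changing a policy's duration while keeping its name, here is a small Python sketch that only builds InfluxQL strings. The helper names, the policy name `short_term`, and the database name `scada` are invented for the example; the statements follow the InfluxQL 1.x `SELECT ... FROM "rp"."measurement"` and `ALTER RETENTION POLICY ... ON ... DURATION ...` forms.

```python
def quote_ident(name: str) -> str:
    """Wrap an InfluxQL identifier in straight double quotes, escaping any inside it."""
    return '"' + name.replace('"', '\\"') + '"'


def select_all(retention_policy: str, measurement: str) -> str:
    """Build a query that reads one measurement from one specific retention policy."""
    return f"SELECT * FROM {quote_ident(retention_policy)}.{quote_ident(measurement)}"


def alter_duration(retention_policy: str, database: str, duration: str) -> str:
    """Build a statement that changes a policy's duration but keeps its name."""
    return (
        f"ALTER RETENTION POLICY {quote_ident(retention_policy)} "
        f"ON {quote_ident(database)} DURATION {duration}"
    )


# Data written under two differently named policies needs two separate queries.
print(select_all("2 weeks", "my measurement name"))
print(select_all("4 weeks", "my measurement name"))

# With one stable policy name, only the duration changes (here from 2w to 4w).
print(alter_duration("short_term", "scada", "4w"))
```

Note that the identifiers use straight double quotes; curly quotes pasted from a forum or word processor will not parse.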

You could consider using OPC-UA and the Factry connection to InfluxDB (https://www.factry.io/). It works uninterrupted and can handle large volumes and fast data logging.