InfluxQL returning partially replicated records

We’re currently upgrading InfluxDB to 2.7. Because of load balancing and the limitations on basic authorisation in 2.7, we can no longer use Flux, as the /v2 API only supports a token, which is specific to a single server. This has resulted in the following changes:

  • Replacement of the influxdb-relay with v2.7’s Replication.
  • Querying the original (v1) API, rather than /v2
  • Query language moved to InfluxQL from Flux

Our applications were going to /v2 using a query like the one below, intending to get the last record matching the criteria:

  |> range(start: -30m)
  |> filter(fn: (r) =>
    r._measurement == "measurement" and
    r.my_tag == "tag_val" and
    r.my_other_tag == "another_tag_val"
  )
  |> group(columns: ["_field"], mode: "by")
  |> last()

With our new implementation, we’ve replicated this query in InfluxQL. It no longer goes to the /v2 API, and it runs against our new v2.7 InfluxDB servers.

SELECT * FROM "retention_policy"."measurement" WHERE ("my_tag" = 'tag_val' AND "my_other_tag"::tag = 'another_tag_val') ORDER BY time DESC LIMIT 1

In effect, this gives us the same results, but intermittently we receive a record with one or more (usually ~50%+) of the field values missing. Retrying the query straight afterwards returns the record with all the fields.
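As a stopgap on the client side, the retry behaviour described above can be sketched in Python. This is only a sketch under assumptions, not a fix for the underlying cause: it assumes the usual v1 /query JSON layout (results, then series, each with columns and values), and the names first_row, missing_fields, query_until_complete and the fetch callable are hypothetical helpers, not part of any InfluxDB client. It also assumes the caller knows which fields should always be present, since a field that is legitimately null would trigger retries too.

```python
import time
from typing import Any, Callable, Dict, List, Optional


def first_row(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the first row of a v1 /query JSON response as {column: value}."""
    for result in response.get("results", []):
        for series in result.get("series", []):
            columns = series.get("columns", [])
            for values in series.get("values", []):
                return dict(zip(columns, values))
    return None  # no series or no rows in the response


def missing_fields(row: Dict[str, Any], expected: List[str]) -> List[str]:
    """List the expected fields that are absent or null in the row."""
    return [name for name in expected if row.get(name) is None]


def query_until_complete(
    fetch: Callable[[], Dict[str, Any]],  # hypothetical: runs the query, returns parsed JSON
    expected: List[str],                  # fields the caller assumes are always written
    attempts: int = 3,
    delay_s: float = 0.2,
) -> Optional[Dict[str, Any]]:
    """Re-run the query until every expected field has a value or attempts run out."""
    row = None
    for attempt in range(attempts):
        row = first_row(fetch())
        if row is not None and not missing_fields(row, expected):
            return row
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return row  # possibly still partial; the caller decides how to handle it
```

The last row seen is returned even when it is still incomplete, so the caller can log which fields were missing rather than silently dropping the record.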

We have run a TCP dump and followed the writes of the data through to where we read them. When this happens, we see the sequence below, where our query falls between the replication request and response for the same record:

  • REPL HTTP REQ
  • QUERY HTTP REQ
  • QUERY HTTP RESP
  • REPL HTTP RESP

As part of our troubleshooting, we re-enabled influxdb-relay with v2.7 and turned off replication to see whether replication is the cause. However, we still observe the same issue, although seemingly less frequently.

We would expect the database to natively return only records that have been fully written/replicated, rather than one that’s still in the process of being replicated (like a transaction). Does anyone have any suggestions, or know of any issues relating to this?

Hello @influxnoob,
If you’re using InfluxQL why move to v2?
I would stay at v1 as it will be easier to migrate to v3 in the future and then you can query with either InfluxQL or SQL.
I haven’t heard of an InfluxQL query against v2 dropping records. That’s wild.
Ah, that makes sense that it’s happening during replication, though I still haven’t heard of this happening. I don’t have any suggestions for fixing this, though, and I’m afraid I won’t be able to find someone who does (though I’ll ask around).

@Jay_Clifford do you know?

Hi @Anaisdg, we’ve had to move away from /v2 due to load balancing. With /v2 seemingly no longer supporting basic auth, we have had to move to the original API.

With two or more servers behind the load balancer, we also couldn’t use an authorization token as we can never know which server we’re going to hit and therefore would have to keep trying until a token<>server matches.
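For context, the kind of request we now send can be sketched roughly as below. It is a sketch under assumptions: it targets the v1-compatible /query endpoint with the db and q parameters and HTTP basic auth, and the host name, database name and credentials in the usage note are placeholders. The function build_v1_query_request is a hypothetical helper that only builds the request; it does not send it.

```python
import base64
import urllib.parse
import urllib.request


def build_v1_query_request(
    base_url: str,   # e.g. the load balancer address (placeholder in the usage note)
    database: str,   # v1 database name; 2.x maps it to a bucket via a DBRP mapping
    influxql: str,
    username: str,
    password: str,
) -> urllib.request.Request:
    """Build (but do not send) a GET request for the v1-compatible /query endpoint."""
    params = urllib.parse.urlencode({"db": database, "q": influxql})
    # Basic auth header: the same credentials work whichever server the balancer picks.
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/query?{params}",
        headers={
            "Authorization": f"Basic {credentials}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

Something like build_v1_query_request("http://influx-lb.example:8086", "mydb", query, "user", "pass") could then be passed to urllib.request.urlopen, with the JSON body parsed from the response.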

Just to re-emphasise, we’re getting all the fields returned, just not their values until we retry.

Hi @Anaisdg,

Just to add, I’ve also found another post which seems to be similar;