InfluxDB subscriber disk size differs from "main"

Good day,

We are running a two-node InfluxDB 1.8 setup:
1 main InfluxDB
1 replica InfluxDB

The replica has a subscription on the main, so the main InfluxDB node pushes all data to the replica InfluxDB node. The main and replica use exactly the same settings.
Now, after 3 months of using this setup, I noticed that the sizes on disk were not equal. In fact, the influxdata directory of the main node is about 1.1 GiB larger than that of the replica (9.6 GiB vs 8.5 GiB). This of course worries me: I am afraid that I am missing that much data in the replica.
I have checked the “writeFailures” counter on the main node and it is 3561. It doesn’t seem to me that 3561 write failures would add up to 1.1 GiB of data.

Does anyone know what the cause of this size difference could be? Am I correct in assuming that this means we are missing 1.1 GiB of data on the replica? How could I detect which data is missing? Could this difference also be caused by data being compressed differently?
I am aware that I could do a simple count. However, such a count is a very expensive query, and these are production nodes, so running it there would be a very bad idea.
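A cheaper variant of the count might be to restrict it to one short time window and compare the result from both nodes. The sketch below only builds such a query and diffs two sets of per-field counts; it does not contact InfluxDB. The measurement name "cpu", the window length, and both helper function names are hypothetical and used for illustration only.

```python
from datetime import datetime, timedelta, timezone


def window_count_query(measurement, start, minutes=5):
    """Build an InfluxQL COUNT query limited to one short time window.

    `measurement` and the window length are illustrative assumptions.
    """
    end = start + timedelta(minutes=minutes)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC3339 timestamps in UTC
    return (
        f'SELECT COUNT(*) FROM "{measurement}" '
        f"WHERE time >= '{start.strftime(fmt)}' AND time < '{end.strftime(fmt)}'"
    )


def diff_counts(main_counts, replica_counts):
    """Return {field: (main, replica)} for every field whose counts differ."""
    fields = sorted(set(main_counts) | set(replica_counts))
    return {
        f: (main_counts.get(f, 0), replica_counts.get(f, 0))
        for f in fields
        if main_counts.get(f, 0) != replica_counts.get(f, 0)
    }


if __name__ == "__main__":
    start = datetime(2021, 1, 1, tzinfo=timezone.utc)
    print(window_count_query("cpu", start))
```

Running the same bounded query against both nodes for a handful of windows and passing the per-field counts to `diff_counts` would show whether, and roughly when, the replica diverges, without scanning the full data set.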

Any help would be appreciated.

Hello @mvgastel,
Welcome! I’m not sure. Let me forward your question to the InfluxDB team. Your patience is appreciated.


We seem to have found the issue; apparently the subscription was somehow set up on both the main and the replica. As in: the main was sending data to the replica, but the replica was also sending data to itself. This is obviously wrong.
After removing the subscription of the replica to itself, the data on the main and replica has been equal. You can consider this issue solved.
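For anyone wanting to check for the same misconfiguration, here is a minimal sketch that scans the output of `SHOW SUBSCRIPTIONS` for destinations pointing back at the node itself. It assumes the JSON shape returned by the InfluxDB 1.x `/query` HTTP endpoint (one series per database with the columns `retention_policy`, `name`, `mode`, `destinations`). The host name, database name, and subscription name in the sample are made up, and `find_self_subscriptions` is a hypothetical helper, not part of any InfluxDB client.

```python
from urllib.parse import urlparse


def find_self_subscriptions(show_subscriptions_json, own_hosts):
    """Return (database, retention_policy, name, destination) tuples for
    subscriptions whose destination host is one of this node's own hosts.

    Assumes the response shape of `SHOW SUBSCRIPTIONS` over the 1.x HTTP API.
    """
    own = {h.lower() for h in own_hosts}
    hits = []
    for result in show_subscriptions_json.get("results", []):
        for series in result.get("series", []):
            columns = series["columns"]
            for row in series["values"]:
                rec = dict(zip(columns, row))
                for dest in rec.get("destinations") or []:
                    host = (urlparse(dest).hostname or "").lower()
                    if host in own:
                        hits.append(
                            (series["name"], rec["retention_policy"], rec["name"], dest)
                        )
    return hits


if __name__ == "__main__":
    # Sample response as it might look on the replica (all values hypothetical).
    sample = {
        "results": [{
            "statement_id": 0,
            "series": [{
                "name": "metrics",
                "columns": ["retention_policy", "name", "mode", "destinations"],
                "values": [["autogen", "sub0", "ALL", ["http://replica.example:8086"]]],
            }],
        }]
    }
    print(find_self_subscriptions(sample, ["replica.example", "localhost"]))
```

Any subscription reported this way could then be removed with an InfluxQL statement of the form `DROP SUBSCRIPTION "sub0" ON "metrics"."autogen"`, substituting the real subscription, database, and retention policy names.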


Given your explanation, I am surprised that the main node contained more data than the replica node.