InfluxDB subscriber disk size differs from "main"

mvgastel · July 8, 2021, 7:28am

Good day,

We are running a two-node InfluxDB 1.8 setup:
1 main InfluxDB node
1 replica InfluxDB node

The replica is fed by a subscription on the main, so the main InfluxDB node pushes all data to the replica InfluxDB node. The main and replica are using the exact same settings.
Now, after 3 months of using this setup, I noticed that the sizes on disk were not equal. In fact, the influxdata directory of the main node is about 1.1 GiB larger than that of the replica (9.6 GiB vs 8.5 GiB). This of course worries me: I am afraid that I am missing that much data in the replica.
I have checked the “writeFailures” counter on the main node and it is 3561. It doesn’t seem to me that 3561 write failures would account for 1.1 GiB of data.

Does anyone know what the cause of this size difference could be? Am I correct in assuming that this means we are missing 1.1 GiB of data on the replica? How could I detect which data is missing? Could this difference also be caused by data being compressed differently?
I am aware that I could do a simple count. However, such a count is a very expensive query to run, and these are production nodes, so that seems like a very bad idea.
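
A cheaper variant of the count might be to split it into small, bounded time windows and compare the per-window results from both nodes, so that no single query scans the whole dataset. Below is a minimal Python sketch of that idea. The helper names, the measurement name "cpu" and the sample counts are made up for illustration; actually sending the queries to each node is left out.

```python
from datetime import datetime, timedelta, timezone


def window_queries(measurement, start, end, step):
    """Yield (window_start, InfluxQL) pairs, one bounded COUNT(*) per window."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        query = (
            f'SELECT COUNT(*) FROM "{measurement}" '
            f"WHERE time >= '{cursor.strftime(fmt)}' AND time < '{upper.strftime(fmt)}'"
        )
        yield cursor, query
        cursor = upper


def differing_windows(main_counts, replica_counts):
    """Return the window keys whose counts differ between the two nodes."""
    keys = set(main_counts) | set(replica_counts)
    return sorted(k for k in keys if main_counts.get(k) != replica_counts.get(k))


if __name__ == "__main__":
    # Hypothetical measurement and time range, for illustration only.
    start = datetime(2021, 7, 1, tzinfo=timezone.utc)
    for window_start, query in window_queries(
        "cpu", start, start + timedelta(hours=3), timedelta(hours=1)
    ):
        print(window_start.isoformat(), query)

    # Made-up counts standing in for the results collected from each node.
    main = {"00:00": 3600, "01:00": 3600, "02:00": 3600}
    replica = {"00:00": 3600, "01:00": 3412, "02:00": 3600}
    print(differing_windows(main, replica))
```

Running one window at a time, with a pause in between, should keep the load on the production nodes small, and any window that differs narrows down where data might be missing.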

Any help would be appreciated.

Anaisdg · July 8, 2021, 7:05pm

Hello @mvgastel,
Welcome! I’m not sure. Let me forward your question to the InfluxDB team. Your patience is appreciated.

mvgastel · July 21, 2021, 9:45am

Hey,

We seem to have found the issue; apparently the subscription was somehow set up on both the main and the replica. As in: the main was sending data to the replica, but the replica was also sending data to itself. This is obviously wrong.
After removing the subscription of the replica to itself, the data on the main and replica has been equal. You can consider this issue solved.
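
For anyone running into the same thing, here is a minimal Python sketch of how such a self-subscription could be spotted. It assumes the JSON that the InfluxDB 1.x `/query` endpoint returns for `SHOW SUBSCRIPTIONS`, with the columns `retention_policy`, `name`, `mode` and `destinations`. The function name, host names and sample response are hypothetical.

```python
from urllib.parse import urlparse


def self_subscriptions(show_subscriptions_json, own_hosts):
    """Return (database, subscription, destination) for every destination
    whose host is one of this node's own host names or addresses."""
    own = {h.lower() for h in own_hosts}
    hits = []
    for result in show_subscriptions_json.get("results", []):
        for series in result.get("series", []):
            columns = series["columns"]
            name_idx = columns.index("name")
            dest_idx = columns.index("destinations")
            for row in series["values"]:
                for dest in row[dest_idx]:
                    host = urlparse(dest).hostname
                    if host and host.lower() in own:
                        # series["name"] is the database the subscription is on
                        hits.append((series["name"], row[name_idx], dest))
    return hits


if __name__ == "__main__":
    # Hypothetical response as it might look when queried on the replica.
    response = {
        "results": [
            {
                "statement_id": 0,
                "series": [
                    {
                        "name": "telemetry",
                        "columns": ["retention_policy", "name", "mode", "destinations"],
                        "values": [
                            ["autogen", "to_replica", "ALL", ["http://replica.example:8086"]]
                        ],
                    }
                ],
            }
        ]
    }
    print(self_subscriptions(response, ["replica.example"]))
```

Running `SHOW SUBSCRIPTIONS` on each node and passing in that node's own host names should list any subscription that points back at the node itself. Such a subscription can then be removed with `DROP SUBSCRIPTION`.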

Regards,
Michael

Pooh · July 21, 2021, 10:05am

Given your explanation, I am surprised that the main node contained more data
than the replica node.

Antony.
