Replication stream buffer size

Hello,

I’m using the replication stream feature to keep one bucket in sync with a remote one. No complaints here, it works very well!

I would like to get some technical information about the way the buffer is stored on disk:

  • is the data encoded? Compressed? Is it raw binary data?

The goal is to estimate the disk space my queued writes would take if the remote is down for a certain amount of time. My write queries are always the same in terms of measurement, tags and number of fields. I can of course run an experiment and measure the size after a while, but I’d like precise technical information if possible :slight_smile:

Thank you very much,

Hello @LC_BTS,
Welcome to the community! Thanks for your question. Out of curiosity, what are you doing with InfluxDB?
I don’t know. I’m asking around.
As for estimating the disk space used by the buffer, there isn’t a tool for this. I assume you’ll have to run a test where you make the destination unavailable and monitor disk usage. I’m sorry I can’t be more helpful.

@LC_BTS
Here’s what I gathered:

IIRC the EDR (Edge Data Replication) implementation is very similar to hinted handoff in Enterprise, and I assume the storage format is similar too. I think the Enterprise docs might have some guidance on sizing the hinted handoff queue that you could pull from.
Configure InfluxDB Enterprise data nodes | InfluxDB Enterprise Documentation
InfluxDB Enterprise features | InfluxDB Enterprise Documentation

Hello, thanks for your answer. I’m managing a device that may lose its internet connection for a few hours or days, and I’d like to know whether the replication stream mechanism can handle the downtime, given our current rate of measurements.

Hello, thanks for the links!

I also took a peek at the source code directly and found more information: the buffer is gzip-compressed using a Go library, so I should be able to run more accurate tests using the same code.

@LC_BTS Thanks for sharing!