Tool for InfluxDB batch ETL

Hello everyone,

I’m currently facing challenges with replicating InfluxDB data to a data lake.

I’m using InfluxDB v2 OSS on-premises and have developed a fault-tolerant Python script to replicate its raw data. While functional, the script has become a headache: with the volume of data involved, it simply takes too long to run.

My goal is to achieve full data replication to Parquet files stored in a MinIO instance. However, I haven’t found a tool that fits my needs. The Airflow operator isn’t sufficient, Telegraf and Kapacitor don’t seem to solve the problem, and Quix Streams would require me to set up Kafka, which I’d prefer to avoid since I’m not dealing with streaming right now.
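For reference, the core of my script looks roughly like the sketch below (simplified; the URL, token, org, bucket names, and time window are placeholders, and it assumes the influxdb-client, pandas, pyarrow, and s3fs packages):

```python
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
import s3fs
from influxdb_client import InfluxDBClient

# --- placeholders: adjust to your environment ---
INFLUX_URL = "http://localhost:8086"
INFLUX_TOKEN = "my-token"
INFLUX_ORG = "my-org"
INFLUX_BUCKET = "my-bucket"

# MinIO speaks the S3 API, so s3fs just needs a custom endpoint
fs = s3fs.S3FileSystem(
    key="minio-access-key",
    secret="minio-secret-key",
    client_kwargs={"endpoint_url": "http://minio:9000"},
)

with InfluxDBClient(url=INFLUX_URL, token=INFLUX_TOKEN, org=INFLUX_ORG) as client:
    # export one bounded time window per run, so a failed window can be retried
    flux = f'''
        from(bucket: "{INFLUX_BUCKET}")
          |> range(start: 2024-01-01T00:00:00Z, stop: 2024-01-02T00:00:00Z)
    '''
    frames = client.query_api().query_data_frame(flux)
    # the client may return one DataFrame per Flux table
    if isinstance(frames, list):
        frames = pd.concat(frames, ignore_index=True)

    pq.write_table(
        pa.Table.from_pandas(frames),
        "datalake/influx/2024-01-01.parquet",  # bucket/key on MinIO
        filesystem=fs,
    )
```

Running windows like this one by one over the full history is exactly what’s taking too long.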

Is there any tool that can handle this, similar to how Airbyte manages batch replication for Postgres and other databases?

Thanks in advance!

Hello @ArthurKretzer,
Might I interest you in InfluxDB v3? It has a Python processing engine embedded in it, and someone in the org built a Parquet exporter on top of it that exports InfluxDB v3 data as Parquet into Iceberg, so you can read the Iceberg tables from tools like DuckDB or Snowflake. Unfortunately it's not in a public repo yet, but I'm working to move it there ASAP. Either way, the Python processing engine in InfluxDB 3 Core and Enterprise might be worth a look for your use case.
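To give you a feel for the processing engine, a plugin is just a Python file with a known entry point that the database calls on a trigger. Here's a rough sketch of a scheduled plugin that dumps recent rows from a table to a local Parquet file. To be clear, this is *not* the exporter I mentioned: the table name, query, and output path are made up, and it assumes pyarrow is installed into the plugin environment:

```python
# Sketch of a scheduled plugin for the InfluxDB 3 processing engine
# (illustrative only -- "sensors", the window, and /exports are made up)
import pyarrow as pa
import pyarrow.parquet as pq


def process_scheduled_call(influxdb3_local, call_time, args=None):
    # pull the last hour of a hypothetical "sensors" table via SQL
    rows = influxdb3_local.query(
        "SELECT * FROM sensors WHERE time >= now() - INTERVAL '1 hour'"
    )
    if not rows:
        influxdb3_local.info("no rows in this window, skipping export")
        return

    # rows come back as a list of dicts, which pyarrow can ingest directly
    table = pa.Table.from_pylist(rows)
    pq.write_table(table, f"/exports/sensors_{call_time}.parquet")
    influxdb3_local.info(f"exported {table.num_rows} rows")
```

You'd register something like this with a scheduled trigger (e.g. run every hour), and from there shipping the files to MinIO is a small step.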

Oooh, also FYI: Quix doesn't require you to set up Kafka yourself; it handles all the Kafka under the hood for you, so it might be easier than you think. Its whole selling point is relieving you of exactly the pain you just mentioned.