Export big data from InfluxDB




I need to export a few days of data from InfluxDB to files (CSV and Excel).
Can you suggest the right way to proceed?

I tried these solutions:


curl -G 'http://my.server.com:8086/query?db=mydb' --data-urlencode "q=select * from sensor where time > now() - 2d" > output_file

pros: takes only 6 seconds
cons: hard to format the output as CSV and Excel
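
As an aside, the 1.x HTTP API can also return CSV directly if you send an Accept header (this is in the InfluxDB 1.x API docs; Excel would still need a separate conversion step):

curl -G 'http://my.server.com:8086/query?db=mydb' -H "Accept: application/csv" --data-urlencode "q=select * from sensor where time > now() - 2d" > output_file.csv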

python InfluxDBClient

import influxdb
import pandas as pd

client = influxdb.InfluxDBClient( ... )
# chunked=True streams the response, so the server's max-row-limit does not apply
points = client.query(query, chunked=True, chunk_size=10000).get_points()
df = pd.DataFrame(points)
# iterating a DataFrame yields column names, so write the frame directly
df.to_excel(writer, "Sheet")

pros: easy to format the output with the pandas lib; no influx max-row-limit (default=10000)
cons: it takes 2 minutes and 24 seconds

python DataFrameClient

import influxdb

client = influxdb.DataFrameClient( ... )
dfs_dict = client.query(query, chunked=True)
# the result is a dict mapping measurement name -> DataFrame
ret = dfs_dict["sensor"]
ret.to_excel(writer, "Sheet")

pros: easy to format the output with the pandas lib; it takes only 10 seconds
cons: influx max-row-limit (default=10000) still applies

The cons of each solution are unacceptable for me :frowning:
Where am I wrong?
I can't wait that long, I can't change max-row-limit in influxdb.conf, and I need to split the output and write it in CSV + Excel format.
I hope you can help.



You can define the output type as json|csv|column

-format 'json|csv|column'
Format specifies the format of the server responses: json, csv, or column.

So your command will become:
curl -G 'http://my.server.com:8086/query?db=mydb' --data-urlencode "q=select * from sensor where time > now() - 2d" -format csv > output_file



Thanks for the reply.
I can't find a -format option for curl.
But anyway, it couldn't be used to export in Excel format :disappointed:


Yeah, you're right, it is not a curl option but an option of the influx CLI. You can run it directly on the server which has the data:
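
Something like this (a sketch, assuming the 1.x influx CLI is installed on that machine; the database name and query are taken from your first post):

influx -database 'mydb' -execute 'select * from sensor where time > now() - 2d' -format 'csv' > output_file.csv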


OK, thanks, but I can't directly access the server which has the data.


Hi guys,
What's the impact of this kind of query on the InfluxDB server?
We're experiencing the same thing (trying to back up every day into a CSV file) and we saw RAM saturation and then continuous swapping to disk (the query takes a really long time).

We have a single-instance node, 4 cores, 8-10 GB RAM (the total RAM is 16 GB, but we have other processes running on it).
We are inserting 2,400,000 points per hour and we query like this: select * from telemetry where time >= '2017-09-01 00:00:00' and time <= '2017-09-01 23:59:59'


We also have the same problem as resteeee, and at the moment we "cut" one query into many:

select * from sensor where time >= 0 and time < 10
select * from sensor where time >= 10 and time < 20
select * from sensor where time >= 20 and time < 30

and so on …

But how can I know in advance whether I need to use many queries instead of one, and how many?
And what should I do if Influx returns me only [max-row-limit] rows?
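
One way to sidestep both questions is to loop over fixed-size time windows, sized so each sub-query stays well under max-row-limit. A sketch of my own (not an official recipe; the window size, measurement name, and client parameters are assumptions):

import influxdb
from datetime import datetime, timedelta

client = influxdb.InfluxDBClient(host, port, user, pwd, db)

start = datetime(2017, 9, 1)
stop = datetime(2017, 9, 8)
# pick the window so that (points per hour * hours) < max-row-limit;
# a count query grouped by time can tell you the ingest rate first
window = timedelta(hours=1)

t = start
while t < stop:
    q = ("select * from sensor where time >= '{}Z' and time < '{}Z'"
         .format(t.isoformat(), (t + window).isoformat()))
    points = list(client.query(q).get_points())
    # ... append `points` to your CSV/Excel writer here ...
    t += window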


That’s a good question. Can anybody from InfluxData reply to these comments? Please?


Now I'm trying to extract 200,000,000 rows (7 days of data) using this minimal script:

import influxdb

client = influxdb.DataFrameClient(host, port, user, pwd, db)
dfs_dict = client.query(query, chunked=True, chunk_size=100000)
# take the first (and only) measurement in the result dict
measurement = next(iter(dfs_dict))
ret = dfs_dict[measurement]
ret.to_csv(file_path, sep=",", encoding="utf-8")

Result: saturation of 8 GB RAM + 8 GB swap.
Possible solution: cut my query into 10,000 smaller queries.
InfluxData team, is this the right and fastest way to proceed?
Which client is the most suitable for this purpose?
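
For what it's worth, here is a sketch that avoids holding the whole result in client memory: stream the chunked HTTP response yourself and append each chunk to the CSV as it arrives (my own approach, not an InfluxData recommendation; the URL, db, and query are the ones from earlier in the thread):

import csv
import json
import requests

url = "http://my.server.com:8086/query"
params = {
    "db": "mydb",
    "q": "select * from sensor where time > now() - 2d",
    "chunked": "true",       # the server then streams one JSON document per chunk
    "chunk_size": "100000",
}

with requests.get(url, params=params, stream=True) as resp:
    resp.raise_for_status()
    with open("output.csv", "w", newline="") as f:
        writer = None
        for line in resp.iter_lines():   # one chunk per line
            if not line:
                continue
            body = json.loads(line)
            for result in body.get("results", []):
                for series in result.get("series", []):
                    if writer is None:
                        writer = csv.writer(f)
                        writer.writerow(series["columns"])  # header once
                    writer.writerows(series["values"])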



Very interested in a solution. We’re facing the same problem; currently, our data is safely stored in InfluxDB with no way to get it out for analysis :frowning:


Any ideas on how to deal with the memory saturation? Having the same issue. I need to export 180,000,000 rows to a CSV file, but the memory just blows up.


Also interested in a solution for challenges like this; I am facing the same problem.


I currently dump month-by-month and that works. It’s incredibly slow, but it works.


Are you using Influx CLI for that or something else?


You may have a look at InfluxDB Fetcher, https://github.com/hgomez/influxdb

Very useful for crafting a select request and generating a write-protocol file.


The problem is not the format, but that requesting a data dump via query-and-save is unusably slow. The InfluxDB fetcher would have the same problem, as it’s the data production on the InfluxDB end that’s slow, not the saving.


There are new backup/restore features in the latest 1.5 release of InfluxDB that should address this problem.


The feature of interest here is to set time bounds for the export. This can be done with the new -start and -end flags. Read more in the 1.5 backup and restore documentation.
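
A sketch of the invocation (flag names as documented for 1.5; the database name, time bounds, and output path are placeholders):

influxd backup -portable -database mydb -start 2018-01-01T00:00:00Z -end 2018-01-08T00:00:00Z /tmp/mydb_backup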



It's ultra fast, but it backs up the data in a binary compressed format.