Setup: Local InfluxDB running on Raspberry Pi 3, ~150 series, ~85 points per second. Local InfluxDB is not always on (gaps in data) and is not always connected to Internet. Also using Grafana on local device. Performance of this setup works great!
Now, I am trying to sync this data (and eventually, multiple devices) to a server. I wrote a Go program using the Go client library to query the remote server for the last data point’s time then query the local database from that time and write points to the server:
SELECT * FROM "measurement" WHERE time > '2017-03-16T21:37:37.594490112Z' GROUP BY * LIMIT 5000
The problem is that this query takes 12 seconds on the Pi , which seems very long. It takes 650ms on my computer which also seems long. If I query a single field for 5000 points, it only takes 600ms on the Pi. By monitoring the system statistics, the above query is processor bound (30% to 40% CPU usage) and not I/O bound.
So, here are my questions:
- Is this an issue with Influx, my strategy or my setup?
- I thought about re-writing my program to query each field separately using Go functions, then “fan-in” the results to a single batch write. Is this more efficient than what Influx is doing internally?
- I have also thought about using GROUP BY time, but this seems more difficult as the number of points could vary versus using LIMIT. I would also get lots of empty results with gaps in the data. Is this a better strategy?
- Are their any other strategies that I am missing?