influx 1.2.2, grafana 4.x (any version 4 i tried )
I’ve had this problem for a while and have been trying to figure out a solution. I’m passing a large amount of guids to a regex expression, and it is taking a long time for the query to run 7 to 8 minutes. This works fine grafana 3.2 I can’t see the difference when grafana makes the call between the versions. Doing a “show queries” in the influx the query looks a lot like this 10 guid query excepted it has 3000 guids
(When i do a show queries from influx i see this query)
SELECT sum(success) / sum(total) * 100.000 AS success FROM rates WHERE id =~ /(3897f423-a4c1-4db2-b597-b2aee9485857|f3470a5f-20e6-496b-a490-de4541796748|2578a079-3dc6-448f-93e0-18915f429566|827f3a06-c9a4-4355-93a9-2202798441d5|fg013f23-62a8-4e10-94cd-f49ed3364b4d|530aed30-a58e-4b1f-bffa-7e06e885a55c|c908bad0-527b-483b-83f2-c18ee5179318|67b5cf45-b205-4295-820c-eba3515e713d|g096c2c0-b4c0-4db1-8304-dcea87c6ff99|25497141-5af2-4f0d-89b5-bb83b003a8f0)/ AND time > now() - 7d GROUP BY time(1d)
Two reasons I have to do the regex. 1) I need to be able to do grouping on the data after the collection so I can’t do it with tags. There is no in clause in influx SQL. Even if there were, that would not solve my problem because of the first reason.
Any idea how to figure out what’s wrong with this? Is the query OK?
If id is a field, then influx can’t use the tag index to quickly look up data, and it will need to scan every point to compare the entire regex against the id field.
The first thing I’d try, if possible, would be to change the query from id =~ /(foo|bar|...)/ to (id = 'foo' OR id = 'bar' OR ...). It’s hard to say without benchmarking, but if you’re looking for exact values, direct string comparison may go faster than executing a regular expression against each point.
I’m using grafana and the query is what grafana sends to influx. The OR thing would be difficult to do. In the git hup issues, doing the regex was a recomendation to work around the “in” clause missing in the influx SQL
here is the grafana query
SELECT sum(success) / sum(total) * 100.000 AS success FROM rates WHERE id =~ /$idFilter/ AND time > now() - 7d GROUP BY time(1d)
funny thing is, it works fine in grafana 3.2, it works really slow in grafana 4.x. with the same version of influx 1.2.2
After many hours of trying to figure what is wrong with this, I finally figured out what is happening, in grafana 3.2 the list is sorted and in 4.6 the list of guids is unsorted.
I ran with 3.2 and 4.6 and found the query in each influx log. I cut out the queries and it ran this script
function timer()
{
if [[ $# -eq 0 ]]; then
echo $(date ‘+%s’)
else
local stime=$1
etime=$(date ‘+%s’)
if [[ -z "$stime" ]]; then stime=$etime; fi
dt=$((etime - stime))
ds=$((dt % 60))
dm=$(((dt / 60) % 60))
dh=$((dt / 3600))
printf '%d:%02d:%02d' $dh $dm $ds
the 3.2 completed in 2 seconds
the 4.6 completed in 2:08 (two minutes 8 seconds)
looking at the queries they appear to be identical.
As I look for differences I noticed that the 3.2 quids are sorted in ascending order. The 4.6 are not. That looks like the only difference. I wrote a tool that sorted the guid from the 4.6 query and re-submit it with the quids in sorted order. This new sorted list worked in 1 second.
This means that an unsorted list of guid in a regex run slow and a sorted list of guid runs fast.