Slow queries when joining with SQL

Hello,
I’m following the docs about querying SQL data sources.
Separately, the SQL and InfluxDB data retrieval take less than 1 second.
When I perform the join (on a single id field, as in the docs example) the query takes about 30 seconds.
Is there something I can do to improve performances?
Thanks

Hello @eloparco,
Cool question!
Can you please provide your Flux query as well as a description of the data you’re returning from your SQL source?
Also, what are you doing with SQL + Flux? Sounds interesting. I’d love to learn more.

Hi @Anaisdg,
it’s basically something similar to the example in the docs. The solution that I found in my case consists in adding an aggregation for the InfluxDB data before the join with the SQL data, to reduce the number of joins needed.

Considering the example with sensors in the docs and the goal of getting the average temperature per sensor: instead of doing the join and then calculate the average, I had to group per sensor, calculate the average temperature and only then perform the join on the sensor_id field.

Of course, in my specific case, it wasn’t that easy. I needed to add an additional tag to do some aggregation before the join, but the point was still to reduce the number of joins through some grouping operations performed ahead.

Anyway, thanks for the interest! I’ve been working for a few months with InfluxDB and I’m finding a lot of useful resources here in the forum :slight_smile:

1 Like

@eloparco,
Great! I’m happy you found a solution.

I’m running in to the same problem on InfluxDB 2.1. I have an SQL query and Influx query that both run very quickly, (<1s), but calling a join on them spikes CPU usage up majorly, and I haven’t had enough patience to even see it finish (after waiting 10+ minutes.) My situation is very close to the example in the docs. I have sensors recording their raw info to InfluxDB, then I want to join metadata from our SQL database to that to do rollup calculations every 5 minutes in an Influx task. I have already rolled up the timeseries data so I only have one point per each sensor before trying the join. I feel like something must be going wrong, because this is a simple single column join, which should not take this crazy amount of time. Not sure if anyone has any advice, or if I just need to do the processing outside of Influx and push the new points back in.