Hi,
The query I am facing an execution time issue with is:
import "regexp"
import "strings"
Customer_assetIDs1_raw = from(bucket: "DATA_DB")
    |> range(start: 1970-01-01T00:00:00Z, stop: 2022-10-09T11:29:01Z)
    |> filter(fn: (r) => r["_measurement"] == "123_CUST")
    |> filter(fn: (r) => r["_field"] == "sub_start_date")
    |> toTime()
    |> filter(fn: (r) => r["_value"] >= 2020-07-01T00:00:00Z)
    |> last()
    |> group()
    |> keep(columns: ["assetId"])
    |> distinct(column: "assetId")
    // distinct() writes its output to _value, so read that column out as an array
    |> findColumn(fn: (key) => true, column: "_value")
// Here we get roughly 2,700 assetIds
Customer_assetIDs1 = if length(arr: Customer_assetIDs1_raw) > 0 then regexp.compile(v: "(${strings.joinStr(arr: Customer_assetIDs1_raw, v: ")|(")})") else regexp.compile(v: "''")
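For clarity, here is what that interpolation produces, sketched with three made-up IDs (the real arrays have ~2,700 entries):

import "regexp"
import "strings"

// Hypothetical IDs, for illustration only
ids = ["A100", "B200", "C300"]
// joinStr yields "A100)|(B200)|(C300", so the wrapped string compiles to
// the alternation (A100)|(B200)|(C300); with our data this becomes one
// pattern with ~2,700 branches
pattern = regexp.compile(v: "(${strings.joinStr(arr: ids, v: ")|(")})")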
Customer_assetIDs2_raw = from(bucket: "DATA_DB")
    |> range(start: 1970-01-01T00:00:00Z, stop: 2022-10-09T11:29:01Z)
    |> filter(fn: (r) => r["_measurement"] == "123_CUST")
    |> filter(fn: (r) => r["_field"] == "sub_start_date")
    |> toTime()
    |> filter(fn: (r) => r["_value"] <= 2022-06-01T00:00:00Z)
    |> last()
    |> group()
    |> keep(columns: ["assetId"])
    |> distinct(column: "assetId")
    |> findColumn(fn: (key) => true, column: "_value")
// Here we get roughly 2,000 assetIds
Customer_assetIDs2 = if length(arr: Customer_assetIDs2_raw) > 0 then regexp.compile(v: "(${strings.joinStr(arr: Customer_assetIDs2_raw, v: ")|(")})") else regexp.compile(v: "''")
Customer_assetIDs3_raw = from(bucket: "DATA_DB")
    |> range(start: 2020-11-08T11:29:01Z, stop: 2022-10-09T11:29:01Z)
    |> filter(fn: (r) => r["_measurement"] == "123_CUST")
    |> filter(fn: (r) => r["_field"] == "sub_status")
    |> last()
    // keep only assets whose latest status is OPEN
    |> filter(fn: (r) => r["_value"] == "OPEN")
    |> group()
    |> keep(columns: ["assetId"])
    |> distinct(column: "assetId")
    |> findColumn(fn: (key) => true, column: "_value")
// Here we get roughly 2,400 assetIds
Customer_assetIDs3 = if length(arr: Customer_assetIDs3_raw) > 0 then regexp.compile(v: "(${strings.joinStr(arr: Customer_assetIDs3_raw, v: ")|(")})") else regexp.compile(v: "''")
Customer_598732134 = from(bucket: "DATA_DB")
    |> range(start: 0, stop: 2022-10-09T11:29:01Z)
    |> filter(fn: (r) => r["_measurement"] == "123_CUST")
    |> filter(fn: (r) => r["_field"] == "plan_t")
    // Finally we filter the data with the three regex patterns built above.
    // Even filtering with only one of those patterns takes too much time.
    |> filter(fn: (r) => r["assetId"] =~ Customer_assetIDs1 and r["assetId"] =~ Customer_assetIDs2 and r["assetId"] =~ Customer_assetIDs3)
    |> drop(columns: ["prop", "_measurement"])
    |> group(columns: ["_field", "assetId"])
property_plan_type_11 = Customer_598732134
    |> range(start: 1970-01-01T00:00:00Z, stop: 2022-10-09T11:29:01Z)
    |> filter(fn: (r) => r["_field"] == "plan_t")
    |> filter(fn: (r) => r["_value"] != "Basic")
    |> last(column: "_value")
    |> pivot(rowKey: ["assetId"], columnKey: ["_field"], valueColumn: "_value")
    |> drop(columns: ["_start", "_stop", "_measurement"])
    |> rename(columns: {plan_t: "plan_t"})
property_plan_type_11
    |> drop(columns: ["_measurement", "_start", "_stop", "fkey"])
    |> group()
    |> limit(n: 100000)
At first we used contains() for this filtering, but it was taking too much time whether the ID list was small or long. After that we read the Query Optimization link.
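For reference, the contains() version looked roughly like this (a sketch over the same pipeline, reusing the Customer_assetIDs1_raw array built above):

from(bucket: "DATA_DB")
    |> range(start: 0, stop: 2022-10-09T11:29:01Z)
    |> filter(fn: (r) => r["_measurement"] == "123_CUST")
    |> filter(fn: (r) => r["_field"] == "plan_t")
    // contains() scans the whole set array for every row, so the cost grows
    // with both the row count and the length of the ID list
    |> filter(fn: (r) => contains(value: r["assetId"], set: Customer_assetIDs1_raw))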
Then we tried regexp. It works well for a short regex string, but takes too much time for a long one.
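To make "long" concrete: a pattern with a handful of branches is cheap to match, but ours have the same shape with ~2,700 branches each, and every row's assetId is matched against up to three of them (the IDs below are made up):

import "regexp"

// A 3-branch alternation is cheap to match per row...
short = regexp.compile(v: "(A100)|(B200)|(C300)")
// ...but our generated patterns look like
//   (id_0001)|(id_0002)| ... |(id_2700)
// and the regex engine has to try those branches for every row it filters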
So we now build the three regex strings and filter the final query with them, and it takes more than 10 minutes and counting. If we remove that filter part, the query executes in 6-10 seconds.
My question is: how can we make this query fast? Is there another function or library we could use? Please suggest.