Usage of If Else with map or reduce

I am trying to write a query to summarize the monitoring status of an alert task, but I keep getting 'expected int but found bool'. I get the same error whether I use map() or reduce(). I saw this documentation, but it does the opposite of what I am doing: it turns floats into strings. What am I missing? I plan on writing these values to another bucket because I cannot access _monitoring without giving Grafana a full-access token. It also lets me detect a lack of records (via the count) and compute a weighted value which hopefully won't flap.

from(bucket: "_monitoring")
  |> range(start: -1d) // or: -task.every
  |> filter(fn: (r) => r["_measurement"] == "statuses")
  |> filter(fn: (r) => r["site_code"] == "FM")
  |> filter(fn: (r) => r["stage"] == "2")
  |> group(columns: ["site_code", "stage", "model", "component", "data_point"])
  |> reduce(
        fn: (r, accumulator) => ({
        statusValue:
            if r["_level"] == "OK" then accumulator.sum + 3
            else if r["_level" == "WARN"] then accumulator.sum + 2
            else if r["_level" == "CRIT"] then accumulator.sum + 1
            else accumulator.sum + 0,
        count: accumulator.count + 1
        }),
        identity: {statusValue: 0, count: 0 }     
     )

OK, so I solved part of the problem… reduce will not work with the syntax above.

I now have:

 if r._level == "OK" then accumulator.sum + 3

and I am getting:

 type error @9:6-19:7: expected A but found {statusValue:int, count:int} for return type (argument fn)

I have the identity information defined…

OK, so one step forward. It turns out the accumulator field names are actually arbitrary, but the record returned by fn must have exactly the same field names and types as the identity record, and those same names must be used when reading from the accumulator. My first attempt returned a field called statusValue while reading accumulator.sum, so the types could not unify. I could not find documentation spelling this out; the sum, count, total, and product names seen in examples are just conventions.
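For the record, here is a minimal reduce() that compiles, showing that the names are arbitrary as long as fn's return record, the accumulator reads, and identity all agree (statusTotal and seen are made-up names, just to make the point):

```flux
|> reduce(
    fn: (r, accumulator) => ({
        // every field returned here must also appear, with the same
        // name and type, in identity and in the accumulator reads
        statusTotal:
            if r._level == "ok" then accumulator.statusTotal + 3
            else if r._level == "warn" then accumulator.statusTotal + 2
            else if r._level == "crit" then accumulator.statusTotal + 1
            else accumulator.statusTotal + 0,
        seen: accumulator.seen + 1,
    }),
    identity: {statusTotal: 0, seen: 0},
)
```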

I am not sure if map() or reduce() would be better here yet. Also, the group() has to come before the reduce(), since reduce() operates on each group separately.

OK, so this is what I have. I want to write this to the database, as it will make for a very easy query from Grafana. It will also insulate my custom panels from column-name changes etc. if need be.

option task = {name: "Testing", every: 30s, offset: 30s}

from(bucket: "_monitoring")
    |> range(start: -5m) // or: -task.every
    |> filter(fn: (r) => r["_measurement"] == "statuses")
    |> filter(fn: (r) => r["site_code"] == "FM")
    |> filter(fn: (r) => r["stage"] == "2")
    |> filter(fn: (r) => r["_field"] == "value")
    |> map(fn: (r) => ({
        r with
        statusValue:
            if r._level == "ok" then 3
            else if r._level == "warn" then 2
            else if r._level == "crit" then 1
            else 0
    }))
    |> group(columns: ["site_code", "stage", "model", "component", "data_point"])
    |> reduce(fn: (r, accumulator) => ({
        sum: accumulator.sum + r.statusValue,
        count: accumulator.count + 1
        }),
        identity: { sum: 0, count: 0 }
    )
    |> map(fn: (r) => ({
        r with
        _time: now(),
        _measurement: "status"
    }))
    |> to(
        bucket: "crec_wte", 
        org: "crec", 
        fieldFn: (r) => ({"sum": r.sum, "count": r.count})
    )

The query takes 0.71 s according to multiple runs in the UI, which is a dirty way of performance testing. The two map() calls aren't helping; the first is the most expensive, of course. Anyone want to provide any feedback?
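One possible way to drop the first map() entirely is to fold the if/else into the reduce() function itself, since if expressions are ordinary expressions in Flux, so the conversion and the sum happen in a single pass (a sketch, untested against the real schema):

```flux
|> group(columns: ["site_code", "stage", "model", "component", "data_point"])
|> reduce(
    fn: (r, accumulator) => ({
        // convert the string level and accumulate in one step
        sum: accumulator.sum +
            (if r._level == "ok" then 3
            else if r._level == "warn" then 2
            else if r._level == "crit" then 1
            else 0),
        count: accumulator.count + 1,
    }),
    identity: {sum: 0, count: 0},
)
```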

Hello @kramik1,
Sorry, can you please help me out a bit?
Are you just looking for advice on query optimization?

Yes, I figured out the if/else problem. I am thinking that if I want to optimize it (remove the first map), I should move the task away from the monitor feature and just write to my own bucket with the level already stored numerically instead of as a string. I am expecting to have hundreds of these types of tasks in the future, so removing the first map will compound the savings in CPU usage.
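A sketch of what such a custom check task might look like, bypassing monitor.check() and writing a numeric level straight to its own measurement (the my_statuses name, the source measurement, and the thresholds are all made up for illustration):

```flux
option task = {name: "numeric_status", every: 30s, offset: 30s}

from(bucket: "crec_wte")
    |> range(start: -task.every)
    |> filter(fn: (r) => r._measurement == "raw" and r._field == "value")
    |> map(fn: (r) => ({
        r with
        _measurement: "my_statuses",
        _field: "level",
        // made-up thresholds: encode the level numerically at write time,
        // so downstream summarizing tasks never need a string-to-int map()
        _value: if r._value < 10.0 then 3
            else if r._value < 20.0 then 2
            else 1,
    }))
    |> to(bucket: "crec_wte", org: "crec")
```

The map() moves to write time and runs once per point, instead of once per point in every downstream task that reads the statuses.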

@kramik1,
Do you mean write levels other than “ok”,“crit”,“info” etc? And assign numerical values instead?

Yes, numerical values instead, don’t really need the string representations.

Hello @kramik1,
yes then I agree you might want to bypass using the monitor.check() functions and write data to a separate measurement in the _monitoring bucket (or wherever you want). This is an interesting use case though.

This issue is related to what you’re requesting. I encourage you to comment on it:

@kramik1,
I've shared your query with the Flux team to see if they can recommend any obvious optimizations that I'm not seeing. To help that effort, can you please include some information:

  • How many rows of data are you querying for?
  • What is your execution time expectation?

Finally, have you tried using the Flux profiler to optimize your query?
https://docs.influxdata.com/influxdb/v2.0/reference/flux/stdlib/profiler/
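Enabling the profiler is just an import and an option at the top of the query; the profile tables then come back alongside the query results ("query" and "operator" are the two profilers documented there):

```flux
import "profiler"

// emit per-query and per-operator timing tables with the results
option profiler.enabledProfilers = ["query", "operator"]

from(bucket: "_monitoring")
    |> range(start: -5m)
    |> filter(fn: (r) => r._measurement == "statuses")
```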

Thanks for the follow-ups. I commented on the first issue above, and I will need to take a look at the profiler.

Right now I run the above query every 5 seconds with a 5-minute window, which should give 10 records per monitored item. I expect to have many hundreds of monitored items as the project grows. I query the last record for a large group of items in Grafana to get the current status, which of course is very fast since I preprocess it with this query. So I could be mapping 5,000 values every 5 seconds in the near future.

I will run it through the profiler, but in the future I could move to using my own monitoring tasks to get rid of the map. I don't need to optimize early, though.