Number Taxi rides

A taxi makes 10 trips a day. Engine RPM is detected. The journey must be numbered/marked so that each journey can be calculated separately.

Whenever the engine RPM goes from 0 to >0, a new journey begins.

How can I create a key/field with Flux that is incremented for each trip? Or how can I identify each ride?

from(bucket: “taxirides”)
|> range(start: today() )
|> filter(fn: (r) => r._measurement == “Taxi” and r._field == “EngineRPM” r._ride < 1)
???

One could also think of machine times to determine the running times.

Hello @Ulrich,
I would assign a state to the values.

|> map(fn: (r) => ({
    r with
    _level:
      if r._value >= 0.0 then "crit"
      else "ok"
    })
  )
|> monitor.stateChanges(toLevel: "crit")
|> count() 

Let me know if that helps.
Thanks!

Hello Anaïs

many thanks for the help. Now I have the number of trips, but I can’t determine any other information.

I would like to count up one _tag per trip so that I can evaluate one trip at a time. How long did it last? How many km have been driven? What was the average speed?

It would be perfect if each state was numbered.

0 - engine rpm == 0
1 - motor rpm != 0
1 - motor rpm != 0
1 - motor rpm != 0
2 - engine rpm == 0
2 - engine rpm == 0
3 - motor rpm != 0

Then every odd number would be a ride and every even number a rest.

I would have to compare the current value with the previous value in a loop and increment a counter at 0 and not 0 and write it back to the table.

I’m surprised no one has already made this requirement. So you could evaluate machine times or flight plans…

I hope I have made myself clear. My English is not perfect.

@Ulrich
In this post, I helped the person calculate the time spent in a given state. Did you see it / try it already?

I am sure what you are trying to do is possible in Flux. I will spend some time later if I can.

Hello Grant,

that would be very nice. Thanks in advance.

I want to do another example:

A light is switched on and off again and again. The current power consumption is measured. In the table, the times and amperes are recorded.

I would like to show in a table (Grafana) how often the lamp was on today, when it was on and how much electricity it used. And I want to know how long it was out.

But not aggregated, but individually for each phase.

0 amp is off and something >0 is on.

I am very happy if you push me in the right direction.

Hi @Ulrich

Can you clarify what do you mean by this?

Do you want a table like this (for a given time selection window in Grafana, e.g. past 24 hours, past 7 days, etc).?

Time On (seconds) Time Off (seconds) kWh used
SwitchA 20160 9000 35
SwitchB 18600 46500 22

Table:

time ampere volt
2023-03-01T09:00:00Z 0 0
2023-03-01T10:00:00Z 0 0
2023-03-01T10:00:01Z 1 12.0
2023-03-01T10:00:02Z 1 12.2
2023-03-01T10:00:03Z 0 0

2023-03-01T10:12:00Z 0
2023-03-01T10:12:01Z 1 12.2
2023-03-01T10:12:02Z 1 12.2
2023-03-01T10:12:03Z 1 12.5
2023-03-01T10:12:04Z 0

Result:
start stop sum(ampere) mean(volt)
2023-03-01T09:00:00Z 2023-03-01T10:00:00Z 0 0
2023-03-01T10:00:01Z 2023-03-01T10:00:02Z 2 12.3
2023-03-01T10:00:03Z 2023-03-01T10:12:00Z 0 0
2023-03-01T10:12:01Z 2023-03-01T10:12:03Z 3 12.3

OK, that helps. How is the above value (12.3) calculated? The mean of 12.0 and 12.2 does not equal 12.3

Oh, sorry. My fault. Of course it should be 12.1

:slight_smile:

Hi @Ulrich

Before we do the light switch example, let’s go back to the taxi ride.

I created some fake data in my system. Does this look OK for demonstration purposes? If yes, we can write queries to evaluate each trip’s distance (km driven) and the average speed for that trip.

Perfect. Both examples are the same in that over time something supplies data multiple times and one cycle of the data should be used in each case.
The columns are sometimes calculated with difference(), sometimes with mean() or with sum(). Everything is summarized with a column != 0.
I look forward to your example. I think that should definitely be included in the documentation.

Hi @Ulrich

You are right…this is a tough nut to crack. All the solutions I can find dealing with state changes are about aggregating the whole dataset. What we want to do here is separate out each chunk of records where the EngineRPM == 1 and perform some math on the other columns (Odometer, VehicleSpeed).

@Anaisdg or @scott

Is there a way in Flux to do this? See my annotated screenshot below. How does one “ungroup” the data into two tables (the yellow and the green data)?

Maybe it’s only possible with a special function. Am I really the first to have this requirement? I can hardly imagine that, because especially with IoT there are many devices that want to be viewed according to their operating status.

I really hope there is a solution. Many thanks for your support.

Similar use cases come up a lot and I would summarize them as “detecting cycles and grouping by cycles.” Something I’ve been advocating for in Flux for a long time, but it hasn’t been added yet is a version of map() or something similar that can use values from previous rows.

There is a proposal for a scan function that would do just this. In the proposal, there’s an example of how you could use scan() to detect phases and increment a phase ID when a new phase starts. EPIC: scan function · Issue #4671 · influxdata/flux · GitHub

Unfortunately it’s not available at this time.

Thanks @scott

I told @Ulrich Ulrich that I was 95% sure this could be done…glad I hedged and assumed there was a 5% chance that it could not.

Thanks to you both for the help and clarification.

Sometimes 5% can decide everything. :slight_smile:

The SCAN() function described on Github prepares exactly what I described at the beginning. Once the data is grouped about a license plate, everything else can be determined.

If variables could be reassigned, it would already be possible today. So you could remember _value and increment a counter and write it back. This could be implemented as a trigger.

What can we do to ensure that this proposal is heard and implemented as soon as possible?

ChatGPT:

"Oh, I see what you mean now! In that case, you could adjust the query to calculate the number of trips based on the duration of the non-zero values as follows:

scss

from(bucket: "myDatabase")
  |> range(start: -1h)  // Query the last hour
  |> filter(fn: (r) => r._measurement == "taxiData" and r._field == "rpm")  // Select only the RPM data
  |> map(fn: (r) => if r._value > 0 then 1 else 0)  // Return a 1 if RPM > 0, otherwise 0
  |> derivative(unit: 1s, nonNegative: true)  // Calculate the rate of change to determine the duration of the ones (i.e. the trips)
  |> filter(fn: (r) => r._value == 1)  // Select only the values with a rate of change of 1
  |> map(fn: (r) => 1)  // Return a 1 to count the number of trips
  |> sum()  // Calculate the total number of trips

In this query, we first query the data from the last hour in the “myDatabase” bucket and filter down to only the datapoints with the “rpm” field. We then map each value to a 1 if it is greater than 0, or a 0 otherwise, which creates a sequence of 1’s and 0’s indicating whether the taxi was driving at that point in time or not. We then calculate the rate of change (i.e. the duration) of the 1’s per second using the derivative function and select only the values with a rate of change of 1 (i.e. a transition from 0 to 1). Finally, we return a 1 to count the number of trips and calculate the total number of trips using the sum function."

ChatGPT:

To calculate the average speed per trip, you can use the following query:

from(bucket: "myDatabase")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "taxiData" and (r._field == "rpm" or r._field == "speed"))
  |> group(columns: ["_start", "_stop", "tripId"])
  |> map(fn: (r) => ({
      r with
      tripDuration: float(v: duration(v: r._stop) - duration(v: r._start)),
      tripDistance: last(r._value) * float(v: duration(v: r._stop) - duration(v: r._start)) / float(v: 3600),
      avgSpeed: if last(r._field) == "speed" then last(r._value) else r._value / last(r._value) * tripDistance / tripDuration
    }))
  |> keep(columns: ["tripId", "avgSpeed"])
  |> group(columns: ["tripId"])
  |> mean()

In this query, we first filter down to the “rpm” and “speed” fields for the “taxiData” measurement. We then group the data by a unique identifier for each trip, which we can derive by assigning a unique ID to each trip based on the transitions in the “rpm” data (similar to the previous query).

We then map each group to include the trip duration, trip distance (calculated as the last speed value multiplied by the trip duration), and the average speed (calculated as the distance divided by the duration). We use an if statement to handle cases where the last field is “speed” (i.e. the taxi was still moving at the end of the trip).

We then group the data by the trip ID and calculate the mean of the average speeds for each trip.

Note that this query assumes that the speed data is recorded in km/h. If it’s recorded in a different unit, you will need to adjust the calculation accordingly.

Hi @Ulrich

Did the ChatGPT solution work?

To count the number of trips, I have received a working solution. Slowly but surely, ChatGPT understands what I am looking for and provides me with solutions. I still need to test them all. I am amazed at what ChatGPT can do.

from(bucket: v.defaultBucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "97912487" and r._field == "S1_PID_0C_EngineRPM")
  |> map(fn: (r) => ({ r with _value: if r._value > 0 then 1 else 0 }))
  |> derivative(unit: 1s, nonNegative: true)
  |> filter(fn: (r) => r._value == 1)
  |> map(fn: (r) => ({ r with _value: 1 }))
  |> sum()