last() returns first point instead of last

I have a problem with the last() function (InfluxDB 2.0.4-2, Arch Linux) when it is executed on a bucket stored on disk. The data, ranging from 14 Feb 2021 to 14 Mar 2021, comes from my smart energy meter.

I requested the last data point with:

from(bucket: "last-test")
  |> range(start: -1mo)
  |> last()

but the query returned the first data point instead, with timestamp 14 Feb 2021 (_time column):

,result,table,_start,_stop,_time,_value,_field,_measurement
,,0,2021-02-13T10:38:06.059009644Z,2021-03-15T21:08:06.059009644Z,2021-02-14T23:00:00Z,1.565374469739827,bill,daily
,,1,2021-02-13T10:38:06.059009644Z,2021-03-15T21:08:06.059009644Z,2021-02-14T23:00:00Z,4.465795000000071,energy,daily
,,2,2021-02-13T10:38:06.059009644Z,2021-03-15T21:08:06.059009644Z,2021-02-14T23:00:00Z,0.2426,price,daily
,,3,2021-02-13T10:38:06.059009644Z,2021-03-15T21:08:06.059009644Z,2021-02-14T23:00:00Z,14.66,rate,daily
,,4,2021-02-13T10:38:06.059009644Z,2021-03-15T21:08:06.059009644Z,2021-02-14T23:00:00Z,4405.794864,total,daily

However, the correct last data point, with timestamp 14 Mar 2021, is selected when the same dataset is not read from the bucket on disk but supplied directly as CSV embedded in the Flux query:

import "csv"
data = csv.from(csv: "
...
...
")

data
  |> last()

Output:

,result,table,_start,_stop,_time,_value,_field,_measurement
,,0,2021-02-13T06:23:59.047896607Z,2021-03-15T16:53:59.047896607Z,2021-03-14T23:59:59.88Z,2.9910504875398063,bill,daily
,,1,2021-02-13T06:23:59.047896607Z,2021-03-15T16:53:59.047896607Z,2021-03-14T23:59:59.88Z,10.342448000000331,energy,daily
,,2,2021-02-13T06:23:59.047896607Z,2021-03-15T16:53:59.047896607Z,2021-03-14T23:59:59.88Z,0.2426,price,daily
,,3,2021-02-13T06:23:59.047896607Z,2021-03-15T16:53:59.047896607Z,2021-03-14T23:59:59.88Z,14.66,rate,daily
,,4,2021-02-13T06:23:59.047896607Z,2021-03-15T16:53:59.047896607Z,2021-03-14T23:59:59.88Z,4701.490756,total,daily

I have prepared a tar.gz archive (40 KB) with the dataset and all queries here (link valid until 31 Mar 2021).

Any ideas what could have gone wrong?

I found the trigger, but still don’t understand why it happens. It depends on which column last() operates on. When I change

from(bucket: "last-test")
  |> range(start: -1mo)
  |> last()

to

from(bucket: "last-test")
  |> range(start: -1mo)
  |> last(column: "_time")

then the correct last point in time is selected. I guess last() operates on _value by default.

However, I expect last() to operate on the _time column by default, and min() and max() on the _value column. The column choice also doesn’t explain why the result differs between data read from a bucket and data held in a variable. last() should return the same data point in both cases.
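To make the expectation above concrete, here is a minimal self-contained sketch that sorts by _time explicitly before calling last(), so the result does not depend on the input order. It uses csv.from() with only two of the energy rows from the outputs above. The annotation rows and the variable name csvData are my own assumptions for illustration, not part of the original dataset. I have not checked whether the same explicit sort also works around the bucket behaviour on an affected version.

```flux
import "csv"

// Two rows taken from the outputs above (energy field only).
// The #datatype/#group/#default annotation rows are assumed here.
csvData = "
#datatype,string,long,dateTime:RFC3339,double,string,string
#group,false,false,false,false,true,true
#default,_result,,,,,
,result,table,_time,_value,_field,_measurement
,,0,2021-02-14T23:00:00Z,4.465795000000071,energy,daily
,,0,2021-03-14T23:59:59.88Z,10.342448000000331,energy,daily
"

csv.from(csv: csvData)
  // Order rows by timestamp (ascending by default) ...
  |> sort(columns: ["_time"])
  // ... so that last() picks the most recent row of each table.
  |> last()
```

With this input the query should return the row with timestamp 14 Mar 2021, which is the behaviour I expected from the bucket query as well.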

I suspect a subtle bug in the InfluxDB storage engine. Could someone please look into this?

This issue has also been discussed here: https://community.influxdata.com/t/recent-edition-of-influxdb-2-0-2-last-function-not-work-as-expected/16942/5?u=pohl6532.

It seems the last() function has been broken since v2.0.2.

Thank you for bringing this to our attention, @pohl6532. A patch that fixes this has been submitted, and the fix should be in the next release. The patch can be found here: https://github.com/influxdata/influxdb/pull/21140