Garbage performance when getting the last n elements from fields

I am running Influx Cloud 2.0.

I have a measurement where each field is a sensor posting data at differing intervals. For each field I want to get the latest n records.

For a single record the Flux query is:

from(bucket: "<bucket>")
    |> range(start: 0, stop: v.timeRangeStop)
    |> filter(fn: (r) => r._measurement == "<measurement>")
    |> last()

This executes instantly, but replacing last() with tail(n: 1), or with |> sort(columns: ["_time"], desc: true) |> limit(n: 1), takes over 7 seconds.
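For clarity, the two slow variants written out in full (same placeholders as above):

from(bucket: "<bucket>")
    |> range(start: 0, stop: v.timeRangeStop)
    |> filter(fn: (r) => r._measurement == "<measurement>")
    |> tail(n: 1)

from(bucket: "<bucket>")
    |> range(start: 0, stop: v.timeRangeStop)
    |> filter(fn: (r) => r._measurement == "<measurement>")
    |> sort(columns: ["_time"], desc: true)
    |> limit(n: 1)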

In InfluxQL (via the influx v1 shell and a DBRP) I get the same 7s performance when using:

SELECT LAST(*) FROM ProcessTimesPV
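(The DBRP mapping was created beforehand with something along the lines of:

influx v1 dbrp create --db <db> --rp <rp> --bucket-id <bucket-id> --default

so the v1 shell can resolve the database name to the bucket.)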

I cannot narrow the time window down, and I cannot aggregate the values or lower the volume of data. I thought this would be a rather simple use case for a modern TSDB.
I assume the data is sorted by time in memory, so why is it so hard to get the top elements?

According to Optimize Flux queries | InfluxDB OSS v2 Documentation, the descending sort plus limit should be a pushdown, but apparently it is not.

To further explain: the goal is to convert this code to InfluxDB Cloud v3, so the real solution should be in InfluxQL.

Run EXPLAIN ANALYZE on the InfluxQL version and get a sense of the amount of data it is processing. You could post the output here so folks can take a look and perhaps make suggestions.
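For example, prefixing the query you already have (assuming the same measurement name):

EXPLAIN ANALYZE SELECT LAST(*) FROM ProcessTimesPV

The ANALYZE variant actually executes the query and reports per-node timings, block decode counts, and bytes read, which is what we need to see here.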

Querying for the last element of one field gets me similarly poor performance:

select
    ├── execution_time: 16.439µs
    ├── planning_time: 1.724303429s
    ├── total_time: 1.724319868s
    └── create_iterator
        ├── labels
        │   └── measurement: ProcessTimesPV
        ├── cursors_ref: 1
        ├── cursors_aux: 0
        ├── cursors_cond: 0
        └── planning_time: 51.216µs

Without the full EXPLAIN output, I can’t tell how much data you are processing. When you look at the EXPLAIN output, how many blocks are being decoded, for instance? What are the total bytes you are reading?