Hello everyone,
my first post here
Trying to figure out the things I can do with the Iox engine, and the very first thing that I think I’m missing is how to set up the session time zone for a query.
For example, I’d like to select data (with SQL) with a windowed function, let’s say with a 1 day window.
Or again, I write a simple select, including the timestamp in the select list, and a where clause on the timestamp as well.
I’d like them to automatically work with a desired timezone (and return the timestamps in that timezone as well), so that I don’t need to make awkward calculations with the UTC offset and the DST adjustment.
Is that possible? For example, in Postgres I would write something like “SET SESSION TIME ZONE ‘Europe/Berlin’”, or in Flux, a command like “option location = timezone.location(name: “Europe/Rome”)” was available.
Thank you!
Hello @Torakiki73,
Welcome! Thanks for giving 3.x a shot!
Unfortunately I don’t think there is that functionality. We use the datafusion SQL implementation under the hood and I don’t see that existing there in the reference:
https://arrow.apache.org/datafusion/user-guide/sql/index.html
I recommend just using to SQL to query your data and then transforming it with the python client library which supports pandas and polars.
Hello Anais,
thanks a lot for your answer!
Retrieving all the data points from Influx, and then transforming them with Python or something, is surely a viable option, even though I was hoping for something built-in
I have some concerns over the performances when big time spans are involved, covering maybe tens of thousands datapoints, but that’s something I’ll have to test.
Thanks again for your time!
Regards,
Alberto
Hello Anais,
I’m a “little” late :), but as I was checking the SQL documentation, at the page:
https://arrow.apache.org/datafusion/user-guide/configs.html
there’s something interesting about the time zone at session level, the “datafusion.execution.time_zone” parameter. I’m not sure how its value could be set, as it states that the default value is +00:00 (if it’s just an offset from UTC there would still be problems when the query crosses DST / Standard Time periods), anyway, can it be somehow set when working with InfluxDb queries?
Thanks again.
Alberto
@Torakiki73 The SQL documentation for InfluxDB v3 is still a work in progress and is missing a few things (that I’m working on updating). One of which is actually a solution to your problem!
The InfluxDB SQL implementation lets you update timestamps with AT TIME ZONE <tz string>
. So in a query, it would look like this:
Absolute time range
SELECT
time AT TIME ZONE 'America/Los_Angeles' AS time,
temp
FROM
home
WHERE
time >= ('2022-03-01T09:00:00Z' AT TIME ZONE 'America/Los_Angeles')
and time <= ('2022-04-01T09:00:00Z' AT TIME ZONE 'America/Los_Angeles')
Relative time range
SELECT
time AT TIME ZONE 'America/Los_Angeles' AS time,
temp
FROM
home
WHERE
time >= (now() AT TIME ZONE 'America/Los_Angeles') - INTERVAL '30 days'
Hello Scott,
thanks a lot for the precious info!
I’ve tried some queries but, either I’m missing something, or there’s something fishy with the “AT TIME ZONE” clause; when querying with ‘Europe/Rome’ for example. It seems like it applies the offset from UTC the opposite way, so for example, for today’s data, it takes away 2 hours from UTC, instead of adding them.
Could you help me with that?
I’m taking advantage of this, to ask you if the queries crossing DST/Standard Time periods consider the different offsets properly, for the data points before and after the DST/Standard or Standard/DST change.
Thank you very much.
Alberto
@Torakiki73 Can you provide your query?
@scott
Hello Scott, here’s a simple query.
I would expect, for today’s data for example, to get a time02_cet being two hours ahead, instead it’s two hours behind. Seems like the current logic behind the “AT TIME ZONE ‘timezone’” is something like “you’re telling me that the time on influx is in ‘timezone’ instead of UTC, and I convert it to UTC”.
SELECT
time as time01_utc
, time at time zone 'Europe/Rome' as time02_cet
, value_numeric
FROM
datapoints
WHERE
time >= now() - interval '1 hour'
ORDER BY
time
LIMIT 100
Can you also share your result set?
Ok, I’ve had to do a little experimenting, but I think I understand the behavior a bit better. AT TIME ZONE
essentially adds the tz offset to the timestamp, but still returns the UTC time stamp. For example:
SELECT *
FROM
(VALUES ('2024-01-01 12:00:00'::TIMESTAMP),
('2024-01-01 12:00:00 +01:00'::TIMESTAMP),
(('2024-01-01 12:00:00'::TIMESTAMP AT TIME ZONE 'Europe/Rome')::TIMESTAMP )
) AS data(time)
Returns:
time |
---|
2024-01-01T12:00:00.000Z |
2024-01-01T11:00:00.000Z |
2024-01-01T11:00:00.000Z |
So the timestamp it returns represents the UTC time it was at the time of the specified timestamp in the specified timezone. So with your query (using the first row in the result as an example), when you apply AT TIME ZONE
to 2024-04-17T14:59:56Z
, it replaces the Z
with the offset of the time zone (+02:00
), which results in the UTC timestamp of 2024-04-17T12:59:56Z
. Meaning, at 14:59:56
in Rome, it was 12:59:56
UTC.
With this understanding, the AT TIME ZONE
operator is probably really only useful in the where clause, letting you pass local time into the where clause. But InfluxDB will always return UTC timestamps.
Hello Scott,
this is what I “suspected”, and wrote in one of my previous message.
Unfortunately, this “at time zone”, unlike the same clause present in SQL Server, is useless for my needs, but I also believe it would be useless for the majority of users, as I think everybody saves his data in UTC already. I can’t think of any real case scenario in which you save data in non-UTC timezone, and then query them needing the UTC, or needing to add a where clause involving UTC dates.
Anyway, thanks a lot for your answers and the effort Scott!
Hopefully a new time zone feature will be added, allowing to set the query location, like the command available in Flux “option location = timezone.location(name: “Europe/Rome”)”.
Thanks again.
Alberto