Timestamp Issue When Using Epoch Dates in Pandas Dataframes With write_api

I want to use Epoch dates with second precision as timestamps in an InfluxDB2 database.

I am using the influxdb_client write_api to write a pandas dataframe.

When I examine the bucket that results from the write it seems that the timestamps are not accurate.

For example, here are six Epoch dates that I have used with the corresponding dates in readable format and then timestamps as reflected in the bucket when I explore the data

Epoch date 	Human-readable date (GMT) 	Influx _time
1614525576	2021-02-28 15:19:36		2021-02-28T16:11:40.564Z
1614520964	2021-02-28 14:02:44		2021-02-28T16:11:40.564Z
1614464526	2021-02-27 22:22:06		2021-02-27T23:55:40.556Z
1614276361	2021-02-25 18:06:01		2021-02-25T19:03:40.53Z
1609688700	2021-01-03 15:45:00		2021-01-03T18:11:39.904Z
1609682156	2021-01-03 13:55:56		2021-01-03T14:07:39.902Z

Here is the code I used to write the data to Influx with token, org and bucket strings redacted:

import pandas as pd
from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS

token = "redacted"
org = "redacted"
bucket = "redacted"

data = [[1614525576, 'Running', 3],  [1614520964, 'Cycling', 30], [1614464526, 'Running', 2], [1614276361, 'Cycling', 20], [1609688700, 'Running', 1],  [1609682156, 'Cycling', 10]]

df = pd.DataFrame(data, columns = ['Date', 'Activity', 'Distance']) 
df.set_index('Date', inplace=True)

client = InfluxDBClient(url="influxdb.itinker.net:8086", token=token, org=org)
write_api = client.write_api(write_options=SYNCHRONOUS)

write_api.write(bucket, org, df, data_frame_measurement_name='exercise', data_frame_tag_columns=['Activity'], write_precision='s')

write_api.close()
client.close()

print(df)
print(df.dtypes)

And here is the output from running the script

          Activity  Distance
Date
1614525576  Running         3
1614520964  Cycling        30
1614464526  Running         2
1614276361  Cycling        20
1609688700  Running         1
1609682156  Cycling        10
Activity    object
Distance     int64
dtype: object

Why is there a discrepancy between the Influx timestamp and the date I would expect from the Epoch dates I supplied?

Hi @RERobbins,

I just tried your code and everything seem to work correctly. Here is my python code:

import pandas as pd

from influxdb_client import InfluxDBClient, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS

url = "http://localhost:8086"
token = "my-token"
org = "my-org"
bucket = "my-bucket"

data = [[1614525576, 'Running', 3], [1614520964, 'Cycling', 30], [1614464526, 'Running', 2],
        [1614276361, 'Cycling', 20], [1609688700, 'Running', 1], [1609682156, 'Cycling', 10]]

df = pd.DataFrame(data, columns=['Date', 'Activity', 'Distance'])
df.set_index('Date', inplace=True)

client = InfluxDBClient(url=url, token=token, org=org)
write_api = client.write_api(write_options=SYNCHRONOUS)

write_api.write(bucket, org, df, data_frame_measurement_name='exercise', data_frame_tag_columns=['Activity'],
                write_precision=WritePrecision.S)

write_api.close()

"""
Query with converted time to epoch
"""
query = f"""
from(bucket: "{bucket}")
  |> range(start: 0)
  |> filter(fn: (r) => r["_measurement"] == "exercise")
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
  |> map(fn: (r) => ({{ r with epoch: int(v: r._time) / 1000000000 }}))
"""

print(f'Query: \n {query}')

query_api = client.query_api()
tables = query_api.query(query=query, org=org)

# Flatten output tables into list of activities
activities = [{"epoch": row.values["epoch"], "Activity": row.values["Activity"], "Distance": row.values["Distance"]}
              for table in tables for row in table]

client.close()

print(f'DataFrame:')
print(df)
print(df.dtypes)
print()

print(f'Query Result:')
print()
line = map(lambda row: f"{row['epoch']}\t{row['Activity']}\t{row['Distance']}\t\n",
           sorted(activities, key=lambda x: x['epoch'], reverse=True))
print(''.join(line))

and here is an output:

Query: 
 
from(bucket: "my-bucket")
  |> range(start: 0)
  |> filter(fn: (r) => r["_measurement"] == "exercise")
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
  |> map(fn: (r) => ({ r with epoch: int(v: r._time) / 1000000000 }))

DataFrame:
           Activity  Distance
Date                         
1614525576  Running         3
1614520964  Cycling        30
1614464526  Running         2
1614276361  Cycling        20
1609688700  Running         1
1609682156  Cycling        10
Activity    object
Distance     int64
dtype: object

Query Result:

1614525576	Running	3	
1614520964	Cycling	30	
1614464526	Running	2	
1614276361	Cycling	20	
1609688700	Running	1	
1609682156	Cycling	10	

How do you perform an Query to InfluxDB?

Regards

I was exploring the bucket directly from the Influx admin interface. I didn’t write any code to do the query. I will use your work and see if I can replicate your results.

I was able to replicate your success with the query but I am still confused why the GMT time strings that show up when I browse the bucket from the Influx web interface are different than the time strings (UTC) that I started with.

I would like to experiment some more. Is the Epoch time stamp integer the only data format I can use for the time field with the write client when using Pandas data frames? Is it possible to use an ISO 8601 or RFC 3339 string instead?

I think I have answered my own question. When I examined the queries that were being generated by the query builder in the gui InfluxDB data explorer tool I noticed the use of window periods in the script editor that had the effect of truncating time stamps. So, for example, if I set the window period to 1s, it was clear that there was no problem with my time stamps being seen as written.

@bednar I noticed that you used

in your code in place of

and was wondering what the difference was between those two approaches.

Thank you very much.

– Rich

One thing to remember is that Python treats a time string with Z as local (server) time not UTC.

It is because the fromisoformat method is the inverse of isoformat and that method does not generate a date time string with a Z.

Whilst Z is correct according to the standard, InfluxData would be better to use +00:00.