gorkem
December 20, 2021, 2:24pm
1
Hello again,
So, I have a pandas DataFrame with field sets and a time column as the index. There is no tag column in the DataFrame. Here is an example:
                       open    high     low   close   volume
open_time
2021-12-20 14:13:00  1.2594  1.2594  1.2565  1.2571  55401.0
2021-12-20 14:14:00  1.2567  1.2592  1.2566  1.2585  32964.0
2021-12-20 14:15:00  1.2588  1.2675  1.2587  1.2652  51830.0
2021-12-20 14:16:00  1.2654  1.2668  1.2642  1.2651  28631.0
2021-12-20 14:17:00  1.2651  1.2666  1.2646  1.2666  21729.0
I want to write this DataFrame with a tag set, but I couldn’t find a way to do it with write_api. Here is an example of what I’m trying to do.
‘fixedtags’ is a made-up parameter name that I’m using only to illustrate the idea.
influx_write_api.write('testBucket', influx_org, record=df, data_frame_measurement_name='price', fixedtags={'symbol': 'ABC','tag2':'tagValue2'})
Is this doable? I don’t want to modify the DataFrame by adding the same tags to thousands of rows, since that is a waste of resources.
gorkem
December 20, 2021, 3:29pm
3
Hey man,
Yeah, I was just looking at the default tags, but the script runs in a for loop, so the tags change on every iteration.
for kline in klines:
    symbol = kline['symbol']
    df = pd.DataFrame(kline['klines'], columns=KLINE_LABELS)
    influx_write_api.write('testBucket', influx_org, record=df, data_frame_measurement_name='price', fixedtags={'symbol': symbol})
Should I just modify the DataFrame and then write it? I was hoping there might be a way to do this without modifying it.
Franky1
December 20, 2021, 4:36pm
4
Here is some untested pseudo-code:
for kline in klines:
    df = pd.DataFrame(kline['klines'], columns=KLINE_LABELS)
    # add 'symbol' as a constant column so it can be written as a tag
    df['symbol'] = kline['symbol']
    influx_write_api.write('testBucket', influx_org, record=df,
                           data_frame_measurement_name='price', data_frame_tag_columns=['symbol'])
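For anyone trying the tag-column approach, here is a minimal sketch of the DataFrame preparation step only. The sample `klines` data and the `tagged_frame` helper are hypothetical, and the layout of `KLINE_LABELS` is an assumption, since the thread never shows it. The assignment broadcasts one scalar over the whole column, so no Python-level loop over rows is involved.

```python
import pandas as pd

# Assumed column layout; the thread does not show the real KLINE_LABELS.
KLINE_LABELS = ["open_time", "open", "high", "low", "close", "volume"]

# Hypothetical sample shaped like the data in the first post.
klines = [
    {
        "symbol": "ABC",
        "klines": [
            ["2021-12-20 14:13:00", 1.2594, 1.2594, 1.2565, 1.2571, 55401.0],
            ["2021-12-20 14:14:00", 1.2567, 1.2592, 1.2566, 1.2585, 32964.0],
        ],
    },
]


def tagged_frame(kline):
    """Build a time-indexed DataFrame and attach the symbol as a constant column."""
    df = pd.DataFrame(kline["klines"], columns=KLINE_LABELS)
    df["open_time"] = pd.to_datetime(df["open_time"])
    df = df.set_index("open_time")
    # A scalar on the right-hand side is broadcast to every row.
    return df.assign(symbol=kline["symbol"])


frames = [tagged_frame(kline) for kline in klines]
```

Each frame could then be passed to `write(...)` with `data_frame_tag_columns=['symbol']`, as in the pseudo-code above. Whether the extra column is affordable at your data volume is something to measure rather than assume.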
bednar
January 3, 2022, 10:01am
5
Hi @gorkem ,
You cannot define default tags per write. You can use something like the following, but it is not efficient:
for kline in klines:
    symbol = kline['symbol']
    # default tags belong to the write API instance, so a new one is created per iteration
    point_settings = PointSettings()
    point_settings.add_default_tag("symbol", symbol)
    point_settings.add_default_tag("tag2", "tagValue2")
    influx_write_api = self.client.write_api(write_options=SYNCHRONOUS, point_settings=point_settings)
    df = pd.DataFrame(kline['klines'], columns=KLINE_LABELS)
    influx_write_api.write('testBucket', influx_org, record=df, data_frame_measurement_name='price')
Please create a new issue on GitHub if you are interested in the possibility of specifying default tags for every write.
Regards
gorkem
January 8, 2022, 5:32pm
6
I decided not to write tags at all. With limited computational resources, it is not feasible to add millions of 3-5 character strings (the symbols) to a pandas DataFrame for every write.
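One option not discussed in this thread is to leave the DataFrame untouched and build line protocol strings yourself, attaching the tags while formatting. The sketch below is only an illustration under stated assumptions: `to_line` and the sample `rows` are hypothetical, all fields are floats, timestamps are naive and in UTC, and no key or value needs escaping. Line protocol carries the tag set on every line, so this moves the cost from DataFrame memory to per-row string formatting rather than removing it.

```python
from datetime import datetime, timezone


def to_line(measurement, tags, fields, ts):
    """Format one point as a line protocol string with a second-precision timestamp.

    Assumes float fields, naive UTC timestamps, and no characters that need escaping.
    """
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={float(value)}" for key, value in fields.items())
    epoch_seconds = int(ts.replace(tzinfo=timezone.utc).timestamp())
    return f"{measurement}{tag_part} {field_part} {epoch_seconds}"


# Hypothetical rows matching the first two lines of the example data.
rows = [
    (datetime(2021, 12, 20, 14, 13),
     {"open": 1.2594, "high": 1.2594, "low": 1.2565, "close": 1.2571, "volume": 55401.0}),
    (datetime(2021, 12, 20, 14, 14),
     {"open": 1.2567, "high": 1.2592, "low": 1.2566, "close": 1.2585, "volume": 32964.0}),
]
tags = {"symbol": "ABC", "tag2": "tagValue2"}

lines = [to_line("price", tags, fields, ts) for ts, fields in rows]
```

The resulting list of strings can be passed as `record=` to `write(...)` together with `write_precision=WritePrecision.S`. Rows can be drawn from an existing DataFrame with `zip(df.index, df.to_dict("records"))`. Whether this is faster than adding a tag column depends on the workload, so it is worth benchmarking both.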
First, consider whether you really need to iterate over the rows of a DataFrame. Iterating through pandas DataFrame objects is generally slow and defeats the purpose of using a DataFrame. It is an anti-pattern, something you should do only when you have exhausted every other option. It is better to look for a vectorized solution, a list comprehension, or the DataFrame.apply() method.
Pandas DataFrame loop using list comprehension
result = [(x, y, z) for x, y, z in zip(df['Name'], df['Promoted'], df['Grade'])]
Pandas DataFrame loop using DataFrame.apply()
result = df.apply(lambda row: row["Name"] + " , " + str(row["TotalMarks"]) + " , " + row["Grade"], axis=1)
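The vectorized option mentioned above has no example, so here is a small sketch of it next to the apply() version. The sample data is made up; the column names follow the snippets above.

```python
import pandas as pd

# Made-up sample data using the column names from the snippets above.
df = pd.DataFrame({
    "Name": ["Ann", "Bob"],
    "TotalMarks": [91, 78],
    "Grade": ["A", "B"],
})

# Row-wise version: the lambda is called once per row.
via_apply = df.apply(
    lambda row: row["Name"] + " , " + str(row["TotalMarks"]) + " , " + row["Grade"],
    axis=1,
)

# Vectorized version: whole columns are concatenated at once.
vectorized = df["Name"] + " , " + df["TotalMarks"].astype(str) + " , " + df["Grade"]
```

Both produce the same strings. The vectorized form avoids building a Series object for every row, which is usually where apply() with axis=1 spends its time.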