Python Client Writing Pandas DataFrames with STATIC tag sets

Hello again,

I have a pandas DataFrame whose columns are the fields and whose index is the time column. There is no tag column in the DataFrame. Here is an example:

                       open    high     low   close   volume
open_time                                                   
2021-12-20 14:13:00  1.2594  1.2594  1.2565  1.2571  55401.0
2021-12-20 14:14:00  1.2567  1.2592  1.2566  1.2585  32964.0
2021-12-20 14:15:00  1.2588  1.2675  1.2587  1.2652  51830.0
2021-12-20 14:16:00  1.2654  1.2668  1.2642  1.2651  28631.0
2021-12-20 14:17:00  1.2651  1.2666  1.2646  1.2666  21729.0

I want to write this df with a tag set, but I couldn’t find a way to do it with write_api. Here is an example of what I’m trying to do. Note that ‘fixedtags’ is not a real parameter; I made it up to show the idea.

influx_write_api.write('testBucket', influx_org, record=df, data_frame_measurement_name='price', fixedtags={'symbol': 'ABC','tag2':'tagValue2'})

Is this doable? I don’t want to modify the dataframe by adding the same tags to thousands of rows, since that is a waste of resources.

Default Tags

GitHub - influxdata/influxdb-client-python: InfluxDB 2.0 python client

Hey man,

Yeah, I was just looking at the default tags, but the script runs in a for loop, so the tags change on every iteration.

    for kline in klines:
        symbol = kline['symbol']
        df = pd.DataFrame(kline['klines'], columns=KLINE_LABELS)
        influx_write_api.write('testBucket', influx_org, record=df, data_frame_measurement_name='price', fixedtags={'symbol': symbol})

Should I just modify the dataframe and then write it? I was hoping there might be a way to do it without modifying it.

Here is some untested pseudo-code:

for kline in klines:
    df = pd.DataFrame(kline['klines'], columns=KLINE_LABELS)
    # Assigning a scalar broadcasts it to every row, giving a constant tag column
    df['symbol'] = kline['symbol']
    _write_client.write('testBucket', influx_org, record=df,
        data_frame_measurement_name='price', data_frame_tag_columns=['symbol'])

Hi @gorkem,

You cannot define default tags for separate writes. You can use something like the following, but it is not efficient:

from influxdb_client.client.write_api import SYNCHRONOUS, PointSettings

for kline in klines:
    symbol = kline['symbol']

    # Default tags belong to the write API, so a new one is created for each symbol
    point_settings = PointSettings()
    point_settings.add_default_tag("symbol", symbol)
    point_settings.add_default_tag("tag2", "tagValue2")

    influx_write_api = self.client.write_api(write_options=SYNCHRONOUS, point_settings=point_settings)

    df = pd.DataFrame(kline['klines'], columns=KLINE_LABELS)
    influx_write_api.write('testBucket', influx_org, record=df, data_frame_measurement_name='price')

Please create a new issue on GitHub if you are interested in the possibility of specifying default tags for each write.

Regards
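One more option that leaves the DataFrame untouched is to build the line protocol strings yourself and put the shared tag set in front of every row. Below is a minimal sketch of that idea, not something taken from the client. The helper name frame_to_lines is made up for illustration, the sample data is the first two rows from the question, and it assumes all columns are float fields and that the measurement and tags need no escaping.

```python
import pandas as pd


def frame_to_lines(df, measurement, static_tags):
    """Return one line-protocol string per row, all sharing the same tag set."""
    # Build the "measurement,tag=value,..." prefix once instead of once per row.
    # Assumes names and tag values contain no spaces, commas or '=' characters
    # (line protocol would require escaping those).
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(static_tags.items()))
    prefix = f"{measurement},{tag_part}" if tag_part else measurement

    lines = []
    for ts, row in zip(df.index, df.itertuples(index=False)):
        # Every column is written as a float field in this sketch.
        fields = ",".join(f"{col}={float(val)}" for col, val in zip(df.columns, row))
        # Timestamp.value is nanoseconds since the epoch (naive times read as UTC).
        lines.append(f"{prefix} {fields} {ts.value}")
    return lines


df = pd.DataFrame(
    {"open": [1.2594, 1.2567], "high": [1.2594, 1.2592], "low": [1.2565, 1.2566],
     "close": [1.2571, 1.2585], "volume": [55401.0, 32964.0]},
    index=pd.to_datetime(["2021-12-20 14:13:00", "2021-12-20 14:14:00"]),
)
df.index.name = "open_time"

lines = frame_to_lines(df, "price", {"symbol": "ABC", "tag2": "tagValue2"})
print(lines[0])
```

The resulting list of strings should be usable as the record argument of write_api.write in place of the DataFrame, since the client also accepts line protocol strings. The Python loop over rows will be slower than the client's own DataFrame serializer, so this trades speed for not copying the tags into the frame.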

I decided not to write tags at all. With limited computational resources, it is not feasible to add millions of 3-5 character strings (the symbols) to a pandas dataframe for every write.
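For anyone weighing the same trade-off, a constant tag column does not have to cost a full string per row. The sketch below compares a plain string column with a pandas categorical column, which keeps the string once plus a one-byte code per row. The row count and the 'close' stand-in column are made up, and whether the client's DataFrame serializer handles a categorical tag column correctly is an assumption I have not verified.

```python
import numpy as np
import pandas as pd

n = 100_000
df = pd.DataFrame({"close": np.ones(n)})  # stand-in for one symbol's klines

# Plain string column: the scalar is broadcast to every row.
with_string = df.assign(symbol="ABC")

# Categorical column: "ABC" is stored once, each row holds an int8 code.
codes = np.zeros(n, dtype="int8")
with_category = df.assign(symbol=pd.Categorical.from_codes(codes, categories=["ABC"]))

obj_bytes = with_string["symbol"].memory_usage(index=False, deep=True)
cat_bytes = with_category["symbol"].memory_usage(index=False, deep=True)
print(obj_bytes, cat_bytes)
```

The exact numbers depend on the pandas version and string storage, so none are quoted here, but the categorical column should come out at roughly one byte per row.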

First, consider whether you really need to iterate over the rows of a DataFrame. Iterating through pandas DataFrame objects is generally slow and defeats the purpose of using a DataFrame. It is an anti-pattern that you should only fall back on once you have exhausted every other option. It is better to look for a vectorized solution, a list comprehension, or the DataFrame.apply() method.

Pandas DataFrame loop using list comprehension

result = [(x, y, z) for x, y, z in zip(df['Name'], df['Promoted'], df['Grade'])]

Pandas DataFrame loop using DataFrame.apply()

result = df.apply(lambda row: row["Name"] + " , " + str(row["TotalMarks"]) + " , " + row["Grade"], axis=1)
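The vectorized option mentioned above has no example yet, so here is a small sketch of the same string-building task done with whole-column operations instead of apply(). The sample rows are invented for illustration; only the column names come from the snippets above.

```python
import pandas as pd

# Hypothetical sample data using the column names from the examples above.
df = pd.DataFrame({
    "Name": ["Ann", "Bob"],
    "TotalMarks": [91, 78],
    "Grade": ["A", "B"],
})

# Column-wise concatenation: each operation acts on the whole Series at once,
# so no Python-level function is called per row.
result = df["Name"] + " , " + df["TotalMarks"].astype(str) + " , " + df["Grade"]
print(result.tolist())
```

This produces the same strings as the apply() version and usually scales better on large frames, because the per-row lambda call is avoided.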