DataFrameClient Aggregated query how to group by Tags and assign it to pandas dataframe

Ashish_Sikarwar · January 17, 2020, 4:50pm

Hello,

How can I set time as the index column and load data grouped by tag into a pandas DataFrame?

I need the following output:

time, Percent_Processor_Time, host
2020-01-17 15:38:00, 10, server01

but df.head(5) shows only Percent_Processor_Time; the time and host columns are missing.

Code snippet:

from influxdb import DataFrameClient

cli = DataFrameClient('influxserver', 8086, "username", "influxdb", "password", ssl=True)

# Mean CPU per host in 1-minute buckets over the last hour
Groupedby_Query = '''SELECT mean(Percent_Processor_Time) AS "Percent_Processor_Time" FROM "Processor"
WHERE time > now() - 1h AND instance = '_Total' group by time(1m),host'''
datasets = cli.query(Groupedby_Query, chunked=True, chunk_size=100000)
column = next(iter(datasets))  # first key in the result
df = datasets[column]
df.index = df.index.tz_localize(None)  # drop the timezone from the time index
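
For reference, one possible way to get time, Percent_Processor_Time and host side by side is sketched below. It rests on an assumption that should be checked against print(datasets): that for a GROUP BY tag query the 1.x DataFrameClient returns a dict keyed by (measurement, ((tag, value), ...)), with one DataFrame per tag set and time already in the index. The helper name flatten_grouped_result and the sample values are hypothetical; the sample dict only stands in for a real query result so the snippet runs without a server.

```python
import pandas as pd


def flatten_grouped_result(datasets):
    """Combine per-tag DataFrames into one frame with the tags as columns."""
    frames = []
    for key, frame in datasets.items():
        # Assumed key shape: ("Processor", (("host", "server01"),)).
        # A plain string key is treated as a series without tags.
        tags = dict(key[1]) if isinstance(key, tuple) else {}
        frame = frame.copy()
        for tag_name, tag_value in tags.items():
            frame[tag_name] = tag_value
        frame.index = frame.index.tz_localize(None)  # drop the timezone
        frame.index.name = "time"
        frames.append(frame)
    return pd.concat(frames).sort_index()


# Hypothetical stand-in for the dict returned by cli.query(...)
sample = {
    ("Processor", (("host", "server01"),)): pd.DataFrame(
        {"Percent_Processor_Time": [10.0, 12.0]},
        index=pd.to_datetime(
            ["2020-01-17 15:38:00", "2020-01-17 15:39:00"], utc=True
        ),
    ),
    ("Processor", (("host", "server02"),)): pd.DataFrame(
        {"Percent_Processor_Time": [20.0, 22.0]},
        index=pd.to_datetime(
            ["2020-01-17 15:38:00", "2020-01-17 15:39:00"], utc=True
        ),
    ),
}

df = flatten_grouped_result(sample)
print(df.reset_index().head(5))  # time becomes a regular column here
```

With a real result, flatten_grouped_result(datasets) would replace the next(iter(datasets)) step, which keeps only the first host.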

Thanks

Anaisdg · January 18, 2020, 12:13am

Hello @Ashish_Sikarwar,
Thanks for your question. First, I just want to know: have you considered using v2.0? The Python pandas client for 1.x is no longer maintained. I recommend checking out this tutorial for pandas and Influx v2.x. If you are married to using 1.x, can you please share the output of:

the raw query in InfluxDB
print(datasets)
Thanks!

Ashish_Sikarwar · January 20, 2020, 11:14am

Hello @Anaisdg,
Thank you very much for your reply.

Yes, I am considering v2.0 on my lab instance.
I am using a conda environment, so I guess I just need to install "influxdb_client".

At the same time, if I can make the query work with 1.x, it will help me complete the work I have already started.

Raw query in influxdb

SELECT mean("Percent_Processor_Time") AS "mean_Percent_Processor_Time" FROM "influx_server"."autogen"."Processor" WHERE time > now() -15d GROUP BY time(1h), "host" FILL(none)

Example:

print(datasets)
print_datasets.txt (3.3 KB)

Thanks
Ashish

Anaisdg · January 22, 2020, 4:37pm

Hello @Ashish_Sikarwar,
I need some time to try and figure this out. I can't get that client to work for me right now; I'm running into "Pandas 0.24.0 TypeError TzInfo" (issue #676 in influxdata/influxdb-python on GitHub). Did you run into that? What environment are you working in?

Ashish_Sikarwar · January 23, 2020, 9:59am

Hi @Anaisdg,

Sure, take your time.
I am using a Windows environment.
Python 3.7.4
Pandas 0.24.2
Influx 1.7

Thanks
Ashish

Ashish_Sikarwar · January 24, 2020, 11:31am

FYI

This has led to another issue.
I need to be able to access the time column after fetching data with DataFrameClient and assign a frequency to it. Without freq, one cannot fully realize the potential of statsmodels.

df.index.freq = 'MS' OR df.index.freq = 'D'

anaconda3\lib\site-packages\statsmodels\tsa\base\tsa_model.py:219: ValueWarning: A date index has been provided, but it has no associated frequency information and so will be ignored when e.g. forecasting.
  ' ignored when e.g. forecasting.', ValueWarning)

Without a frequency, the x axis is based on integers instead of dates, which prevents me from testing with training_data and test_data.
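
A minimal sketch of one way to attach a frequency, assuming the data for a single host is already in a DataFrame with a timezone-naive DatetimeIndex. The sample values are made up. Assigning df.index.freq directly only works when the timestamps already match that frequency exactly, so with 1-minute buckets 'MS' or 'D' would be rejected; DataFrame.asfreq instead reindexes onto a regular grid and leaves NaN where a bucket is missing.

```python
import pandas as pd

# Hypothetical 1-minute data for one host, with the 15:40 bucket missing
index = pd.to_datetime(
    ["2020-01-17 15:38:00", "2020-01-17 15:39:00", "2020-01-17 15:41:00"]
)
one_host = pd.DataFrame({"Percent_Processor_Time": [10.0, 12.0, 11.0]}, index=index)

# Reindex onto a regular 1-minute grid; the index now carries a frequency
regular = one_host.asfreq("min")

print(one_host.index.freq)
print(regular.index.freq)
print(regular)
```

asfreq needs unique timestamps, so it has to be applied per host rather than on a frame that mixes several hosts. The NaN rows it introduces may need filling or dropping before fitting a model.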

[Image: before commenting prediction]

[Image: after commenting the prediction]

Thanks
Ashish
