DataFrameClient aggregated query: how to group by tags and assign the result to a pandas DataFrame

Hello,

How can I set time as the index column and assign data grouped by tag to a pandas DataFrame?

I need the following output:

time, Percent_Processor_Time, host
2020-01-17 15:38:00, 10, server01

But df.head(5) did not show the time and host columns; it only showed Percent_Processor_Time.

Code snippet:

from influxdb import DataFrameClient

cli = DataFrameClient('influxserver', 8086, "username", "influxdb", "password", ssl=True)

Groupedby_Query = '''SELECT mean(Percent_Processor_Time) AS "Percent_Processor_Time" FROM "Processor"
WHERE time > now() - 1h AND instance = '_Total' group by time(1m),host'''
datasets = cli.query(Groupedby_Query, chunked=True,chunk_size=100000)
# Take the first key of the result and the DataFrame stored under it
column = next(iter(datasets))
df = datasets[column]
# Drop the timezone from the time index
df.index = df.index.tz_localize(None)

Thanks
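
A possible explanation, offered as a sketch rather than a confirmed answer: with GROUP BY host, DataFrameClient.query appears to return a dict holding one DataFrame per tag set, keyed by a (measurement, tags) tuple, with time as each frame's index rather than a column. Taking only next(iter(datasets)) would then keep a single host and no host column. The snippet below simulates such a dict with made-up data (server02 and all values are hypothetical, and the key layout is an assumption about the 1.x client), then combines the frames. combine_grouped is an illustrative helper, not part of the client.

```python
import pandas as pd


def combine_grouped(datasets):
    """Concatenate per-tag DataFrames and add each tag as a regular column."""
    frames = []
    for key, frame in datasets.items():
        # Assumption: grouped results are keyed by (measurement, ((tag, value), ...));
        # ungrouped results are keyed by the measurement name alone.
        tags = dict(key[1]) if isinstance(key, tuple) else {}
        frame = frame.copy()
        for tag_name, tag_value in tags.items():
            frame[tag_name] = tag_value
        frames.append(frame)
    combined = pd.concat(frames).sort_index(kind="mergesort")
    combined.index.name = "time"
    return combined


# Hypothetical stand-in for the result of cli.query(...)
idx = pd.date_range("2020-01-17 15:38:00", periods=2, freq="1min", tz="UTC")
datasets = {
    ("Processor", (("host", "server01"),)): pd.DataFrame(
        {"Percent_Processor_Time": [10.0, 12.0]}, index=idx
    ),
    ("Processor", (("host", "server02"),)): pd.DataFrame(
        {"Percent_Processor_Time": [20.0, 22.0]}, index=idx
    ),
}

df = combine_grouped(datasets)
df.index = df.index.tz_localize(None)  # drop the timezone, as in the original snippet
print(df.reset_index().head())         # reset_index() turns time into a column
```

If the real result has this shape, time stays available as df.index, and df.reset_index() exposes it as an ordinary column next to host.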

Hello @Ashish_Sikarwar,
Thanks for your question. First, have you considered using v2.0? The Python pandas client for 1.x is no longer maintained. I recommend checking out this tutorial for pandas and InfluxDB v2.x. If you are married to using 1.x, can you please share the output of:

  1. the raw query in InfluxDB
  2. print(datasets)

Thanks!

Hello @Anaisdg,
Thank you very much for your reply.

Yes, I am considering v2.0 on my lab instance.
I am using a conda environment, so I guess I just need to install "influxdb_client".

At the same time, if I can make the query work with 1.x, it will help me complete the work I started.

  1. Raw query in InfluxDB
SELECT mean("Percent_Processor_Time") AS "mean_Percent_Processor_Time" FROM "influx_server"."autogen"."Processor" WHERE time > now() - 15d GROUP BY time(1h), "host" FILL(none)

  2. print(datasets)
    print_datasets.txt (3.3 KB)

Thanks
Ashish

Hello @Ashish_Sikarwar,
I need some time to try and figure this out. I can't get that client to work for me right now; I'm running into "Pandas 0.24.0 TypeError TzInfo" (Issue #676 in influxdata/influxdb-python on GitHub). Did you run into that? What environment are you working in?

Hi @Anaisdg,

Sure, take your time.
I am using a Windows environment.
Python 3.7.4
Pandas 0.24.2
Influx 1.7

Thanks
Ashish

FYI

It has led to another issue.
I need to be able to access the time column after fetching data with DataFrameClient and assign a frequency to it. Without a frequency, one cannot make full use of statsmodels.

df.index.freq = 'MS' OR df.index.freq = 'D'

anaconda3\lib\site-packages\statsmodels\tsa\base\tsa_model.py:219: ValueWarning: A date index has been provided, but it has no associated frequency information and so will be ignored when e.g. forecasting.
  ' ignored when e.g. forecasting.', ValueWarning)

Without a frequency, the x-axis is based on integers instead of dates, which prevents me from testing with training_data and test_data.
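
One way this might be approached, sketched under assumptions: the raw query uses FILL(none), which drops empty intervals, so the returned index can have gaps and no frequency attached. pandas' asfreq reindexes onto a regular grid and sets index.freq, which is what the statsmodels warning asks for. The data below is made up, the one-minute step and the interpolation of gaps are illustrative choices, and asfreq needs unique timestamps, so this would apply to one host at a time.

```python
import pandas as pd

# Hypothetical single-host series with a missing 15:40 sample
idx = pd.DatetimeIndex(
    ["2020-01-17 15:38:00", "2020-01-17 15:39:00", "2020-01-17 15:41:00"]
)
series = pd.Series([10.0, 12.0, 11.0], index=idx, name="Percent_Processor_Time")
print(series.index.freq)   # no frequency is attached to an irregular index

# Reindex onto a regular one-minute grid; missing timestamps become NaN
regular = series.asfreq("1min")

# Assumption: linear interpolation is an acceptable way to fill the gaps
regular = regular.interpolate()

print(regular.index.freq)  # the index now carries a frequency
print(regular)
```

Assigning df.index.freq directly, as in the line above, only succeeds when the existing timestamps already conform to that frequency, which is why reindexing first may be the safer route.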

Before commenting prediction

After commenting the prediction:

Thanks
Ashish