Why does the client return 2 dataframes?

klyk · August 11, 2022, 11:18am

The question is why a query returns multiple dataframes instead of one.
Each dataframe contains several tables (the table column), where the term "table", as far as I can see, refers to a group of time series.
Consider an example:

import influxdb_client

bucket = "aaa"
org = "bbb"
token = "<my token from secrets>"
# Store the URL of your InfluxDB instance
url="https://us-west-2-2.aws.cloud2.influxdata.com/"

client = influxdb_client.InfluxDBClient(
   url=url,
   token=token,
   org=org
)

query_api = client.query_api()

query = """from(bucket: "<bucket>")
  |> range(start: -10m)
  |> filter(fn: (r) => r["_measurement"] == "xxx")
  |> filter(fn: (r) => r["_field"] == "mean_yyy")
  |> aggregateWindow(every: 5m, fn: mean, createEmpty: false)
  |> yield(name: "mean")
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")"""
dataframes = query_api.query_data_frame(query)

What I get is two dataframes with the following columns:

Index(['result', 'table', '_start', '_stop', '_time', '_value', '_field',
       '_measurement', 'host_id'],
      dtype='object')
# and
Index(['result', 'table', '_start', '_stop', '_time', '_measurement',
       'host_id', 'mean_yyy'],
      dtype='object')

More than that, _value from the first dataframe seems to have the same values as mean_yyy from the second.

If I request 2 fields instead of one, this generates 1400 dataframes. This looks really confusing.

Anaisdg · August 12, 2022, 6:08pm

Hello @klyk,
A dataframe is returned for each result (stream of tables) in the response.
You have one result named “mean” from the |> yield(name: "mean") call and a second one named “_result” (the default name) from the last line.
Does that make sense? Comment out the yield() function and I believe you’ll have what you expect.
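
A minimal sketch of that suggestion, reusing the placeholder names from the question ("aaa", "xxx", and "mean_yyy" are not real values). The helper build_query is hypothetical and only assembles the Flux text. With the intermediate yield() removed, the script should produce only the default result:

```python
def build_query(bucket: str, measurement: str, field: str) -> str:
    """Assemble the Flux script from the question, minus the intermediate yield()."""
    return "\n".join([
        f'from(bucket: "{bucket}")',
        "  |> range(start: -10m)",
        f'  |> filter(fn: (r) => r["_measurement"] == "{measurement}")',
        f'  |> filter(fn: (r) => r["_field"] == "{field}")',
        "  |> aggregateWindow(every: 5m, fn: mean, createEmpty: false)",
        # The named intermediate output is omitted, so only the pivoted data is returned.
        '  |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")',
    ])


query = build_query("aaa", "xxx", "mean_yyy")
print(query)
# With the client from the question:
# dataframes = query_api.query_data_frame(query)
```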

klyk · August 16, 2022, 9:25am

It's a bit clearer now, but why does it emit 1400 dataframes if I specify something like |> filter(fn: (r) => r["_field"] == "mean_yyy" or r["_field"] == "mean_zzz")? That is, I added an or on the field name to one of the filters.

Anaisdg · August 16, 2022, 5:25pm

I believe each table in the stream returns a dataframe.
You must have a ton of tags? How many tags do you have for those two fields?
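
Whatever the cause turns out to be, query_data_frame can return either a single DataFrame or a list of them, so calling code may want to handle both. A rough sketch, assuming pandas is installed; as_list and combine are hypothetical helper names, and concatenating only makes sense if the frames have comparable columns:

```python
def as_list(result):
    """Normalize a query_data_frame() result to a list of frames."""
    return result if isinstance(result, list) else [result]


def combine(result):
    """Stack all returned frames into one DataFrame (columns are unioned)."""
    import pandas as pd  # imported lazily; assumed to be installed

    frames = as_list(result)
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

For example, df = combine(query_api.query_data_frame(query)) gives one frame to work with, while printing len(as_list(...)) and each frame's columns shows how the response was split.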
