Newbie Question : Unable to do a GROUP by day count

Hello:

I’m creating a time series database to track the number of image pulls in our private docker repository.
I was able to import the logs from a PostgreSQL DB to InfluxDB2 using InfluxDBClient python library.

The schema is very simple. Below is basically how my data point looks like:

p = Point("image_pull")
p.tag("image_tag", image_tag)
p.time(pull_time)
p.field(cnt, 1)            # Always 1 and only 1 field

The data was successfully populated. I want to query for how many docker images were pulled for the past 2 days. I tried the following query (place in a file called query.txt):

from (bucket: "test_repos")
|> range(start: -2d)
|> window(every: 1d)
|> filter(fn: (r) =>
    r._measurement == "image_pull" and
    r._field == "cnt"
)
|> sum()
|> limit(n:5)

I queried by running:

influx query --file query.txt

What I don’t understand is:

  1. Why the influx return so many lines despite of the range(start: -2d) and the limit() statement
  2. I was still seeing 1 for _value:int field. I’m expecting it to be the total number of image pulls for day.

I’m expecting a result similar to the SQL statement below

SELECT
                       date(timestamp_field), count(*)
FROM
                       logs
GROUP BY
                      date(timestamp_field)

Thank you in advance

I made it work by adding drop() function to filter out the columns that I don’t need . Is this the standard way of doing things in InfluxDB? Or what is did is workaround? A little of counterintuitive coming from the background of SQL :slight_smile:

from (bucket: "test_repos")
|> range(start: -2d)
|> window(every: 1d)
|> drop("columns": ["user", "project", "version", "resource"])

|> filter(fn: (r) =>
    r._measurement == "image_pull" and
    r._field == "cnt"
)
|> sum()
1 Like