How to get the most frequent record in influxdb

peaksnail · August 8, 2017, 9:59am

hi all,

In my case, there are many users and every one will upload their actions to influxdb.
and the influxdb data model like

event province city sex uid

Login provinceA cityA male userA
Buy provinceA cityA male userA
PageView provinceA cityA male userA
Login provinceA cityB female userB
Logout provinceB cityC male userC

the event, province, city, sex is tag key ,and uid is field key.
The uid will be uniq, so I put it in the field key to avoid series candidate issue.

Now I want to get the users who are frequent to upload their action data. and how to realize it in influxdb or kapacitor

in this case ,The userA is the most frequent users

thanks

sbains · August 8, 2017, 2:55pm

The count function will give you the list of the users (frequently visited users) and you can use the where clause to find the users in certain time period. The max function if clubbed with count will give you the top user.

peaksnail · August 9, 2017, 2:34am

hi,I have try to use count to get the users (sql: select count(“uid”) from measurement and time > now() - 1h group by time(1h) ), but failed.
With this example ,I want to get the result like

user count
userA 3
userB 1
userC 1

but the count function just get the result
time count
1502242645775000000 5

It can not get the users list

Could you show me the sql with this example ? Or the grammar of my sql is wrong ?

thanks

peaksnail · August 9, 2017, 2:46am

hi

The document of influxdb in Functions | InfluxDB OSS 1.3 Documentation

has explained that count " Returns the number of non-null field values."

So I doubt whether the count could get the list of the users
Or is there some other way to get it?

thanks

peaksnail · August 9, 2017, 3:12am

I found a similar topic in Multiple count and where in query

but it just has three value 0,1,3, so that can use tag with “group by” operation.
but in my case,the value will be millions,so it can not use tags .

thanks

sbains · August 9, 2017, 3:54pm

Try a nested query with count like below:

select top(count,,1) from (select count(“uid”) from measurement where time > now() - 1h group by time(1h))

Just replace the with a tag which is present in all the results.

peaksnail · August 10, 2017, 10:57am

hi
I am sorry that it can not get the right results.

The count function can not get user list,it can just count all record.

If I have 4 data belong to the same hour, like below

time	event	uid
12:01	PageView	userA
12:02	PageView	userB
12:03	PageView	userA
12:04	PageView	userC

the count function will get

time	count
12:00	4

But i want to get the result like

time	event	uid
12:00	userA	2
12:00	userB	1
12:00	userC	1

And the key is that uid is a field.

thanks

sbains · August 10, 2017, 3:08pm

Can you paste the result using the below query:

select top(count,event,1) from (select count(“uid”) from measurement where time > now() - 1h group by time(1h))

peaksnail · August 11, 2017, 1:30am

hi
It will panic when execute the sql ,error info below

[I] 2017-08-11T01:24:40Z SELECT top(count, source, 10) FROM (SELECT count(uid) FROM spider.autogen.app WHERE time > now() - 1h GROUP BY time(1h)) service=query
panic: runtime error: index out of range

goroutine 717 [running]:
panic(0xa05e00, 0xc4200120f0)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/influxdata/influxdb/influxql.filterIntegerByUniqueTags(0xc4213b0240, 0x2, 0x2, 0xc4222e5be8, 0x1, 0x1, 0xbaf1b8, 0x4139ce, 0x7fb3672d6690, 0x1100000038)
/root/go/src/github.com/influxdata/influxdb/influxql/call_iterator.go:1050 +0x2b6
github.com/influxdata/influxdb/influxql.NewIntegerTopReduceSliceFunc.func1(0xc4213b0240, 0x2, 0x2, 0xc421d03ba8, 0xc421d03c98, 0xc421d03c98)
/root/go/src/github.com/influxdata/influxdb/influxql/call_iterator.go:860 +0x45b
github.com/influxdata/influxdb/influxql.(*IntegerSliceFuncReducer).Emit(0xc420acbba0, 0xc421d03ba8, 0xc4207ca810, 0x3)
/root/go/src/github.com/influxdata/influxdb/influxql/functions.gen.go:595 +0x44
github.com/influxdata/influxdb/influxql.(*integerReduceIntegerIterator).reduce(0xc420b5adc0, 0x18, 0xc422862790, 0x7716b7, 0xc4228e70c0, 0xc42214f201)
/root/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:4131 +0x807
github.com/influxdata/influxdb/influxql.(*integerReduceIntegerIterator).Next(0xc420b5adc0, 0xc422862780, 0x0, 0xc42214f2c0)
/root/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:4044 +0xa1
github.com/influxdata/influxdb/influxql.(*integerInterruptIterator).Next(0xc420d66d80, 0xffffffffffffffff, 0x0, 0xc42214f260)
/root/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:3432 +0x52
github.com/influxdata/influxdb/influxql.(*bufIntegerIterator).Next(0xc420d66da0, 0x67ebf1297c25d8, 0x0, 0x0)
/root/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:2750 +0x3c
github.com/influxdata/influxdb/influxql.(*integerAuxIterator).stream(0xc420d66de0)
/root/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:3529 +0x32
created by github.com/influxdata/influxdb/influxql.(*integerAuxIterator).Start
/root/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:3515 +0x3f

And I try execute sql select count(“uid”) from app where time > now() - 1h group by time(1h)
the result like below
name: app
time count

1502409600000000000 7
1502413200000000000 27

there has no any tag in result ,how can the top function work for tag?
thanks

peaksnail · August 11, 2017, 2:48am

Hi

The info of panic is from version 1.2.I have upgrade to 1.3.2. Now the sql work normally.But the result may be wrong.

From the start,I insert 5 points into measurement called t, every point has tag p and s, and field uid, like

select * from t
name: t
time                p    s   uid
----                -    -   ---
1502419072967540554 pv   yzl usera
1502419075188749394 pv   yzl userb
1502419078856887159 pv   yzl usera
1502419080484044059 pv   yzl userc
1502419190701236366 open yzl userc

Second,I use the sql you give me

select top(count, p, 1) from (select count("uid") from t where time > now() - 1h group by time(1h))
name: t
time                top p
----                --- -
1502416800000000000 4

or

select top(count, s, 1) from (select count("uid") from t where time > now() - 1h group by time(1h))
name: t
time                top s
----                --- -
1502416800000000000 5

But I just want the users list who visit frequently ,like

time                 uid     count
----                 -----   -----
1502416800000000000  usera   2
1502416800000000000  userb   1
1502416800000000000  userc   2

thanks

sbains · August 11, 2017, 4:43am

I don’t this it can be achieved with this schema. You need to move this userid field as a “tag” value. Since the grouping option is only available via that

Topic		Replies	Views
Trying to get top 3 records on the basis of user and value? InfluxDB 1 influxdb	2	34	September 25, 2024
Trying to get Count Distinct across multiple columns Fluxlang influxdb , influxql , query	0	1446	March 22, 2019
Influxdb subquery to count top X	14	10953	August 11, 2017
InfluxDB Flux - get count of distinct occurrences Fluxlang influxdb	4	8457	July 30, 2020
How can I get only the total number using count InfluxDB 2	2	585	January 2, 2020

How to get the most frequent record in influxdb

Related topics