Problem with tasks and downsampling

aksonov · January 19, 2021, 4:36pm

Please help me create the following statistic: I need to count daily user registrations and output the percentage difference between the current day and the average of the 4 previous weekdays.
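The statistic described above could be sketched in Flux roughly as follows. This is a hedged, untested example under several assumptions: the daily counts already exist as one point per day, timestamped at midnight UTC, in a bucket named "d" with measurement "paid" and a hypothetical field daily_count; and "4 previous weekdays" is read as the same weekday in each of the previous 4 weeks.

```flux
import "date"

// Midnight (UTC) of the current day.
today = date.truncate(t: now(), unit: 1d)

from(bucket: "d")
    // 29 days back from now covers today plus the same weekday 7, 14, 21 and 28 days ago.
    |> range(start: -29d)
    // "paid" / "daily_count" are assumed names for the downsampled series.
    |> filter(fn: (r) => r._measurement == "paid" and r._field == "daily_count")
    // Keep only points that fall on the same weekday as today.
    |> filter(fn: (r) => date.weekDay(t: r._time) == date.weekDay(t: today))
    |> group()
    // Collect today's count and the sum and number of the earlier counts.
    |> reduce(
        identity: {current: 0.0, sum: 0.0, n: 0.0},
        fn: (r, accumulator) => ({
            current: if r._time == today then float(v: r._value) else accumulator.current,
            sum: if r._time < today then accumulator.sum + float(v: r._value) else accumulator.sum,
            n: if r._time < today then accumulator.n + 1.0 else accumulator.n,
        }),
    )
    // Percentage difference between today's count and the average of the earlier ones.
    |> map(fn: (r) => ({r with pct_diff: (r.current - r.sum / r.n) / (r.sum / r.n) * 100.0}))
```

The sketch assumes at least one earlier point exists and that the earlier counts are not all zero; otherwise the division has no meaningful result.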

So I’m trying to create a task to downsample the data first, like this:

But I see only one record with count instead of 30 or more rows (one per day) for existing data…

Anaisdg · January 20, 2021, 5:58pm

Hello @aksonov,
Welcome!

Hmm that’s very strange.
When I execute the same query for the last 30 days, I get 30 rows in each table, with one table per measurement.

Can you please share a screenshot of your raw data view or table view?

Thanks

Anaisdg · January 20, 2021, 5:59pm

@aksonov ,
Alternatively, can you export some data to annotated CSV? Maybe just a couple of days, and I can try to look at that.

Thank you.

aksonov · January 20, 2021, 7:27pm

How do I export the data? When I run this query I also get 30 rows (from bucket “Test”), but running the task with that query returns just one row (“d” bucket).

aksonov · January 20, 2021, 7:28pm

It would be very useful for me to see a data example for a similar use case (like user registration) and a sample downsampling task that writes to another bucket, where I can see the number of registrations per day.

aksonov · January 22, 2021, 5:00pm

Okay, looks like I’ve found the problem: I have to add the _measurement and _field columns to the data written to the new bucket to make it work. But there is another problem: the daily count is now duplicated every time the task runs. Is it possible to update the daily count every 10 minutes, for example, without creating a new record each time?

Anaisdg · January 22, 2021, 5:48pm

Hello @aksonov ,
Without some example input and output data and your Flux query, I’m afraid I can’t do much to help you. Can you please share your input data, expected output data, and your Flux queries?
Thanks

aksonov · January 22, 2021, 6:28pm

Data:
paid,usr_id=1 username="test1"
paid,usr_id=2 username="test2"
paid,usr_id=3 username="test3"
paid,usr_id=4 username="test4"
paid,usr_id=5 username="test5"
paid,usr_id=6 username="test6"
paid,usr_id=7 username="test7"

Query for task:
option task = {
    name: "DailyCount",
    every: 1m,
    offset: 0m
}

All I want is one count record per day (i.e. 7), not a new record every minute (which is what I see now: 7 is inserted every minute).

Anaisdg · January 22, 2021, 8:04pm

Hello @aksonov,
So the line protocol you shared is the output of the task?
Can you please share your input data from(bucket: "Test")?
Can you try and use last() before the to() function?

aksonov · January 22, 2021, 8:41pm

Line protocol was input data (Test). Output is a new row with _value 7 (count) every minute.

last() doesn’t have any effect.

Anaisdg · January 22, 2021, 10:02pm

@aksonov,
You might have to specify the right column

last(column: "daily_count")

Also, why are you using the following line, since your measurement is already “paid”?

|> set(key: "_measurement", value: "paid")

You might also need to group() your data into one table instead of a stream of tables before applying the last() function.

aksonov · January 23, 2021, 8:25am

I had to set _measurement because otherwise it says the “_measurement” field is not found. Have you run that query yourself? I tried to use last(column: "daily_count") but it says “daily_count field is not found”. Could you give me the exact query?

Anaisdg · January 25, 2021, 2:05am

I would love to give you the exact query, but I’m having some trouble with the data you gave me. Yes, I was able to run it myself, but since you didn’t provide timestamps I had to write my own. I assumed that all of the data you gave me occurred in one day, and it works for me: only one value is returned by the query you shared, so I can use the to() function to write one data point to any bucket of my choosing.

Please note that the count is 5 because I only wrote a subset of your data, from id = 1 to id = 5, as I felt it was sufficient to try to understand your problem.

Can you please provide me with timestamps for your data? Or alternatively use the export to CSV button to export your raw input data to CSV so I can try it for myself?

Communicating data transformations can be hard! Thanks for sticking with me.

aksonov · January 25, 2021, 9:56am

Thank you for your answer! Looks like there is some misunderstanding. That query really does return one row, but the task with that query inserts that row every time it runs. I need something like an SQL “UPDATE”, not an “INSERT”: just ONE record per day, with the task running every 5m to keep the data current. Maybe I don’t understand the Flux query language well…

aksonov · January 29, 2021, 4:46pm

Any response? Is it possible not to create a new record on every task launch?
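One possible way to get the “UPDATE” behavior asked about here relies on how InfluxDB handles duplicates: a new point with the same measurement, tag set, field key, and timestamp as an existing point overwrites that field value instead of adding a row. The sketch below is a minimal, untested example under stated assumptions: the raw data is in bucket "Test" with measurement "paid" and field "username" as in the sample data above, the output bucket is "d", and the output field name daily_count is hypothetical. It stamps the count with midnight (UTC) of the current day on every run, so each run should replace the same point rather than insert a new one.

```flux
import "date"

option task = {name: "DailyCount", every: 5m}

// Midnight (UTC) of the current day; identical for every run on the same day.
today = date.truncate(t: now(), unit: 1d)

from(bucket: "Test")
    |> range(start: today)
    |> filter(fn: (r) => r._measurement == "paid" and r._field == "username")
    // Merge the per-user series (one per usr_id tag) into a single table.
    |> group(columns: ["_measurement"])
    |> count()
    // count() drops _time and _field here, and to() requires both, so add them back.
    // daily_count is an assumed field name.
    |> map(fn: (r) => ({r with _time: today, _field: "daily_count"}))
    |> to(bucket: "d")
```

Because the timestamp no longer changes between runs, the bucket should hold one daily_count point per day. Note that a run shortly after midnight only looks at the new day, so registrations from the last few minutes of the previous day may be missed by this sketch.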
