Using Kapacitor to calculate time spent in multiple states for different resources

Hi,

I’m trying to configure a stream task. I have a system that collects data points at approximately 1-minute intervals, each recording what state a resource is in, so the data looks like this:

2019-09-04T18:55:09.380664171Z 432 Resource AVAILABLE
2019-09-04T18:56:09.055967266Z 432 Resource AVAILABLE
2019-09-04T18:57:23.129462143Z 432 Resource AVAILABLE
2019-09-04T18:58:19.164020423Z 432 Resource AVAILABLE
2019-09-04T18:59:13.652928799Z 432 Resource OCCUPIED
2019-09-04T19:00:06.396011709Z 432 Resource OCCUPIED
2019-09-04T19:01:31.327628547Z 432 Resource OCCUPIED

2019-09-04T19:28:51.544204886Z 432 Resource OCCUPIED
2019-09-04T19:29:46.047909301Z 432 Resource OCCUPIED
2019-09-04T19:30:39.107772395Z 432 Resource AVAILABLE
2019-09-04T19:33:39.997265596Z 432 Resource AVAILABLE

I would like to use Kapacitor to create another measurement that just shows the time spent in each state, and I am using the following TICKscript:

dbrp "data"."autogen"

stream
    |from()
        .database('data')
        .measurement('resource_data')
    |groupBy('resource_id')
    |stateDuration(lambda: "status" == 'OCCUPIED')
        .unit(1s)
    |influxDBOut()
        .database('data')
        .retentionPolicy('autogen')
        .measurement('resource_state')

This seems to be working. The issue I’m having is how to expand it to cover all the states (statuses) that I have, such as AVAILABLE, OCCUPIED, UNKNOWN, etc., across multiple monitored resources.

Also, this stream will work for incoming data, but is there a way to post-process all the previous data I’ve already collected?

Thanks!

Hi,

You can give your stateDuration node an .as() property, so it would become:

|stateDuration(lambda: "status" == 'OCCUPIED')
    .unit(1s)
    .as('OCCUPIED_DURATION')

Then you can create a stateDuration node for each of your statuses.

For your second question, you should look into using batch TICKscripts with Kapacitor, which can query your historical data. However, this could change the timestamp to the time at which the data is reinserted.

Kapacitor Batch Node

Hope that helps.

Thanks Phil! That helps quite a bit. I’m still slightly confused as to how I would represent multiple states this way though because the stateDuration node is streaming from the main data source. In other words, I don’t think something like this would work, right?

stream
    |from()
        .database('data')
        .measurement('resource_data')
    |groupBy('resource_id')
    |stateDuration(lambda: "status" == 'OCCUPIED')
        .unit(1s)
        .as('OCCUPIED_DURATION')
    |stateDuration(lambda: "status" == 'AVAILABLE')
        .unit(1s)
        .as('AVAILABLE_DURATION')

...

Also, I haven’t been able to find an example of a batch task that goes through all of the past data. Would you be able to point me to a good example?

Thank you so much!

Hi,

I’m not sure I fully understand. Can the service be in multiple states at once? Also, those states are tag values, right? You should group by any tags you want to include. So as well as resource_id, you should also group by status (or whatever you have named your “status” tag) to make use of its values.

A script like that should work. We use something similar to generate warning and critical alerts for memory usage: a warning if the state duration is over 5 minutes, then critical if it reaches 10 minutes.
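For illustration, here is a minimal sketch of that kind of duration-based alert. It is not the actual script used for the memory alerts: the "telegraf" database, the 'mem' measurement, the "used_percent" field, the 'host' tag, the 90.0 threshold and the log path are all assumptions made up for the example. By default stateDuration writes its result to a field called state_duration, which the alert node then checks.

```
// Hypothetical database and retention policy
dbrp "telegraf"."autogen"

stream
    |from()
        .measurement('mem')
    |groupBy('host')
    // Track how long memory usage has stayed above the (assumed) threshold
    |stateDuration(lambda: "used_percent" > 90.0)
        .unit(1m)
    |alert()
        // Warn once the state has lasted 5 minutes
        .warn(lambda: "state_duration" >= 5)
        // Escalate to critical at 10 minutes
        .crit(lambda: "state_duration" >= 10)
        .log('/tmp/mem_alerts.log')
```

With .unit(1m) the duration is expressed in minutes, so the thresholds 5 and 10 line up with the 5 and 10 minute marks.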

I would advise against querying ALL data at once if you have a lot, but this should help

You would need to specify the time frame in your query, but I haven’t tried querying large amounts of old data, to be honest. I use the Kapacitor batch script method to downsample my data not long after it has arrived, so that I can purge the raw data with a 40-day RP.
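A rough sketch of what a batch version of the original task might look like is below. It reuses the database, measurement and tag names from the stream script earlier in the thread; the one-hour .period() and .every() values are placeholders, and it assumes resource_id is a tag. As far as I know, a Kapacitor batch query takes its time window from .period() rather than from a WHERE time clause, so each run only looks at the most recent window, not the whole history.

```
dbrp "data"."autogen"

batch
    // Kapacitor fills in the time range for each run from .period()
    |query('SELECT "status" FROM "data"."autogen"."resource_data"')
        .period(1h)
        .every(1h)
        .groupBy('resource_id')
    |stateDuration(lambda: "status" == 'OCCUPIED')
        .unit(1s)
        .as('OCCUPIED_DURATION')
    |influxDBOut()
        .database('data')
        .retentionPolicy('autogen')
        .measurement('resource_state')
```

For data that is already older than the window, the kapacitor CLI's record and replay commands may be worth a look, since they are meant for running a task against recorded data; I have not tested that on a large data set.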

To view the results by status, you could then run something like this in the influx CLI:

select * from measurement where "resource_id" = ID group by "state"

I’m about to head out for the day, so I won’t be back until tomorrow, but let me know how you get on.

Thanks so much, this is super helpful!

The service can be in OCCUPIED, AVAILABLE, UNKNOWN, or FAULTED. So the way it is right now with just OCCUPIED means it will give me the duration that any of the resources spend in the OCCUPIED state, but I’d like it to also calculate the duration of the other states as well. I realize I could use multiple tick scripts but wasn’t sure if there was a way to do it cleanly in one. Does that help clear things up?

I’ll take a look at the other resources you posted and let you know what I’m able to come up with for downsampling of old data. Thanks again!

EDIT: I should also clear up that status isn’t a tag, it’s a field. The status is what I’m measuring / polling for every few minutes.
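Putting the suggestions in this thread together, a single script covering all four states could look like the sketch below. It is untested; the UNKNOWN_DURATION and FAULTED_DURATION field names are just chosen to match the pattern suggested above. Because status is a field and not a tag, it is compared inside the lambda expressions and not passed to groupBy, which only works on tags. As far as I understand the Kapacitor docs, each stateDuration node passes the point through with one extra field, and that field is -1 while the point is not in that state.

```
dbrp "data"."autogen"

stream
    |from()
        .database('data')
        .measurement('resource_data')
    |groupBy('resource_id')
    // One stateDuration node per status; each adds its own field to the point
    |stateDuration(lambda: "status" == 'OCCUPIED')
        .unit(1s)
        .as('OCCUPIED_DURATION')
    |stateDuration(lambda: "status" == 'AVAILABLE')
        .unit(1s)
        .as('AVAILABLE_DURATION')
    |stateDuration(lambda: "status" == 'UNKNOWN')
        .unit(1s)
        .as('UNKNOWN_DURATION')
    |stateDuration(lambda: "status" == 'FAULTED')
        .unit(1s)
        .as('FAULTED_DURATION')
    |influxDBOut()
        .database('data')
        .retentionPolicy('autogen')
        .measurement('resource_state')
```

Each point written to resource_state would then carry four duration fields per resource_id, with only the field for the current status counting up.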