Execution Order of TICK Scripts

Hi,

I cannot find any documentation about the execution order of TICK scripts.
I need the scripts to run in a specific order; how can I achieve this? Does the order depend on the script name, on the order in which the tasks were enabled, or on something else?

Best regards,

Simon

@sj14 Stream TICK scripts process data continually, and batch scripts execute according to their every() property. Execution depends entirely on the data each task runs on and on its configuration; it has nothing to do with when the task was enabled.

The only time execution order would matter is if you are running two stream tasks on the same data with different alerts. Is that what you are trying to do?

Yes, that's what I'm trying to do. It's not only different alerts: the first TICK script prepares the data and writes it into InfluxDB, and the second TICK script should use the prepared data (which is based on the second-to-last streamed value). From my understanding, it is not possible to do both steps in one TICK script when I want to predict values with the Holt-Winters function and compare the predicted value with the real value (see TICK script: use from() values after chaining functions).
That's a little hard to describe; I hope it was understandable :slight_smile:

Best regards,
Simon

@sj14 Ah! That makes much more sense! If you are using stream, then the second script will only fire once the first has written to InfluxDB and the data has been sent back to Kapacitor. Your order should be guaranteed there.

Thanks, but I cannot see how the order is guaranteed.
Both scripts are executed when the next value from the same stream comes in.
The first script compares the incoming value with the last predicted value (generated from the previous streamed value).
The second script predicts the next value (so it has to run second).

If the prediction script ran first, the compare script would compare the current value with a prediction based on that same current value. But the prediction has to be compared with the value that arrives after the current one.

I think your answer suggests using a stream on the incoming predictions, but then I have the same problem of finding the corresponding original value. Thus, the easiest option in my opinion is a fixed script execution order.

Maybe I am too deep into the problem and there is a very easy solution. Does anyone have a suggestion for how to use Holt-Winters, compare the predicted value with the real value, and alert if the values are too far apart (preferably also working with a stream replay file)?

Best regards,
Simon

There is no order guaranteed across different TICK scripts. Are both of your TICK scripts stream or is one stream and the other batch?

If they are both stream they can be combined into the same task. Otherwise you can play tricks with the times on the batch data to wait for the stream task to complete first. Something like the .offset parameter would work.
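To make the offset idea concrete, here is a rough Python sketch of which time range a delayed batch query would cover. It is only an illustration, not Kapacitor code: the helper name query_window is made up, and it assumes the window simply ends offset before the current time and spans period.

```python
from datetime import datetime, timedelta, timezone


def query_window(now, period, offset):
    """Return the (start, stop) range a delayed batch query would cover.

    Assumption for illustration: the window ends `offset` before `now`
    and is `period` long. This mirrors the idea of querying slightly in
    the past so that another task has had time to write its results.
    """
    stop = now - offset
    start = stop - period
    return start, stop


# Hypothetical values: a 2s window, delayed by 1s.
now = datetime(2015, 10, 18, 0, 0, 10, tzinfo=timezone.utc)
start, stop = query_window(now, period=timedelta(seconds=2), offset=timedelta(seconds=1))
print(start.isoformat(), stop.isoformat())
```

The delay only helps if the stream task reliably finishes writing within the offset, so it is a workaround rather than an ordering guarantee.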

We worked on a single stream script here, but I came to the conclusion that it does not work. The .shift parameter only changes the time, but I also need to cache the predicted value for the given time until the real value comes in. But maybe you have another approach, or I misunderstood something.

Currently I have stream replay files. I have already looked into converting them to batch files, without success so far. I don't know how to set the expected number of batch collectors in the replay file:

running replay: unexpected number of batch collectors. exp 3 got 1

Edit: I forgot to give the content of the batch replay file and the TICK script that produce the error above:

replay file:

{"name":"table_test","points":[
{"fields":{"value":1000},"time":"2015-10-18T00:00:00Z"},
{"fields":{"value":1001},"time":"2015-10-18T00:00:02Z"},
{"fields":{"value":1002},"time":"2015-10-18T00:00:04Z"},
{"fields":{"value":1003},"time":"2015-10-18T00:00:06Z"},
{"fields":{"value":1004},"time":"2015-10-18T00:00:08Z"}]}

tick script:

batch
|query('''
    SELECT value
    FROM "mydb"."autogen"."table_test"
''')
    .period(1s)
    .every(1s)
|alert()
    .crit(lambda: "value" > 0.5)
    .log('/var/lib/kapacitor/alerts/batch.log')

Both scripts are executed, when the next value from the same stream comes in.
The first script compares the incoming value with the last predicted value (generated by the previous streamed value)
The second script predicts the next value (has to be the 2nd script, to run).

Took me a while to get it but I think I have it now. Your first description threw me off, but this one makes a bit of sense. So you have two signals? One real, one predicted? Like this, when current time = 1?

time real pred
  0    5   5.5
  1    6   6.2
  2        7.1

The predicted signal is in the future. Then time = 2 arrives and a new real value comes in. You want two things to happen:

#1. The incoming real value is compared with the last predicted value. E.g., given the following data, the comparison script records a 0.1 difference between real and predicted.

time real pred
  0    5   5.5
  1    6   6.2
  2    7   7.1

#2. Then, the next predicted value is calculated and inserted. E.g.:

time real pred
  0    5   5.5
  1    6   6.2
  2    7   7.1
  3        7.9

Am I close?

If so, unless there’s some real time communication between the two scripts, I wouldn’t rely on any particular ordering. Instead, maybe you can use the timestamps to guarantee the correct computation, regardless of the order?
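To illustrate what I mean by using timestamps, here is a minimal Python sketch (not TICKscript; the function name and the event tuples are made up for the example). It pairs real and predicted values by the timestamp they belong to, so the result is the same whichever side arrives first.

```python
def compare_when_ready(events):
    """Pair real and predicted values by timestamp, in any arrival order.

    `events` is an iterable of (kind, time, value) tuples, where kind is
    'real' or 'pred'. Yields (time, real - pred) as soon as both values
    for a timestamp have been seen.
    """
    pending = {}  # time -> {'real': value, 'pred': value}
    for kind, time, value in events:
        slot = pending.setdefault(time, {})
        slot[kind] = value
        if 'real' in slot and 'pred' in slot:
            del pending[time]
            yield time, slot['real'] - slot['pred']


# Same data as the tables above, delivered in two different orders.
events_a = [('pred', 2, 7.1), ('real', 2, 7), ('pred', 3, 7.9)]
events_b = [('real', 2, 7), ('pred', 3, 7.9), ('pred', 2, 7.1)]
diffs_a = list(compare_when_ready(events_a))
diffs_b = list(compare_when_ready(events_b))
print(diffs_a == diffs_b)
```

The prediction for time 3 just stays buffered until a real value with that timestamp shows up, which is the caching behaviour you said you were missing.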

Yes, that's exactly the process :+1:. Sorry, I should have posted a nice overview like this, but sometimes you are just too close to a problem to give a good overview.

I have thought about using timestamps too, but it seems InfluxDBOut does not allow setting a custom timestamp? https://docs.influxdata.com/kapacitor/v1.2/nodes/influx_d_b_out_node/ I will think about other approaches using the time and let you know. :slight_smile:

Best regards,
Simon

@Heath_Raftery @sj14 Thanks for the clarification, this is possible within a single TICKscript.

Below is a batch-based example, but it should be easy to convert into a streaming TICKscript if so desired.

// How much history to use when predicting values
var history = 10d
// How often to run the prediction
var every = 1d

// Select batches of data with length 'history' at every 'every' interval.
var real_with_history = batch
    |query('SELECT value FROM ...')
         .period(history)
         .every(every)
         .align()

// Grab just the most recent value
var real = real_with_history
    |last('value')
        .as('value')

// Predict just one value into the future 
var pred = real_with_history
    |holtWinters('value', 1, 0, every)
        .as('value')

// Part #1 compare most recent real value with predicted value.
// NOTE: the very first real value will not have a predicted value to join with,
// that is OK, it will be dropped.
// The join node buffers data until the next point arrives with the correct timestamp.
// In this case it will buffer the predicted point for 1d.
real
    |join(pred)
       .as('real', 'pred')
    |eval(lambda: "real.value" - "pred.value")
        .as('diff')
   // do something with the diff

// Part #2 store predicted value back in InfluxDB
pred
    |influxDBOut()

Does that help? What questions do you have?

Also of note: since this is a batch task, it becomes trivial to replay this task against past data.

kapacitor replay-live batch -task real_pred_task -rec-time -past 200d
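If it helps to see the data flow outside Kapacitor, below is a rough Python simulation of the two parts of the script. It is a sketch under assumptions, not what Kapacitor does internally: forecast_next uses plain Holt linear smoothing as a stand-in for holtWinters without seasonality, alpha and beta are arbitrary values, and run and its history argument (at least 2 points) are names made up for the example. The data is the replay file posted earlier in the thread.

```python
def forecast_next(values, alpha=0.5, beta=0.5):
    """One-step-ahead forecast with Holt's linear (double exponential) smoothing.

    Stand-in for holtWinters with no seasonality. alpha and beta are
    arbitrary illustration values, not fitted parameters.
    """
    level, trend = values[0], values[1] - values[0]
    for v in values[1:]:
        prev_level = level
        level = alpha * v + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend


def run(series, history=3):
    """Replay `series` (list of (time, value)) and return (time, real - pred) rows."""
    buffered = {}  # prediction time -> predicted value, waiting for its real point
    diffs = []
    for i in range(history, len(series) + 1):
        window = series[i - history:i]
        t, real = window[-1]
        # Part #1: compare the newest real value with the prediction made earlier.
        if t in buffered:
            diffs.append((t, real - buffered.pop(t)))
        # Part #2: predict one step ahead and buffer it under its future timestamp.
        step = window[-1][0] - window[-2][0]
        buffered[t + step] = forecast_next([v for _, v in window])
    return diffs


# Values 1000..1004 at 2-second spacing, as in the replay file above.
series = [(t, 1000 + t // 2) for t in range(0, 10, 2)]
diffs = run(series)
print(diffs)
```

The first window (ending at time 4) produces no diff because nothing was predicted for it yet, which matches the NOTE in the script about the first real value being dropped.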

@nathaniel, thank you very much. I will test it and let you know.

@nathaniel, thanks for your patience, it works!
I had to change:

var pred = real_with_history
    |holtWinters('value', 1, 0, every)
        .as('value')

to:

var pred = real_with_history
    |holtWinters('value', 1, 0, every)
        .as('value')
    |last('value')
        .as('value')

otherwise, there was an error enabling the task:

enabling task removeNoise: cannot add child mismatched edges: holtWinters4:batch -> join6:stream

Thank you very much!

It didn't work for me. I can see only a few data points in the join log.
DOT:
digraph holtwinters {
graph [throughput="0.00 batches/s"];

query1 [avg_exec_time_ns="3.126451ms" batches_queried="46" errors="0" points_queried="523" working_cardinality="0" ];
query1 -> holtWinters4 [processed="46"];
query1 -> last2 [processed="46"];

holtWinters4 [avg_exec_time_ns="42.417154ms" errors="0" working_cardinality="0" ];
holtWinters4 -> last5 [processed="46"];

last5 [avg_exec_time_ns="5.675µs" errors="0" working_cardinality="0" ];
last5 -> log6 [processed="45"];

log6 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
log6 -> join8 [processed="45"];

last2 [avg_exec_time_ns="17.184µs" errors="0" working_cardinality="0" ];
last2 -> log3 [processed="46"];

log3 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
log3 -> join8 [processed="46"];

join8 [avg_exec_time_ns="6.258µs" errors="0" working_cardinality="1" ];
join8 -> log9 [processed="10"];

log9 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
log9 -> eval10 [processed="10"];

eval10 [avg_exec_time_ns="0s" errors="0" working_cardinality="1" ];
eval10 -> log11 [processed="10"];

log11 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
}

Tick Script

var real_with_history = batch
    |query('select kw_total from "db"."autogen"."measurement" where "name" = \'cr1\'')
         .period(2m)
         .every(10s)
         .align()

// Grab just the most recent value
var real = real_with_history
    |last('kw_total')
        .as('value')
    |log()
// Predict just one value into the future
var pred = real_with_history
    |holtWinters('kw_total', 1, 0, 10s)
        .as('value')
    |last('value')
        .as('value')
    |log()
// Part #1 compare most recent real value with predicted value.
// NOTE: the very first real value will not have a predicted value to join with,
// that is OK, it will be dropped.
// The join node buffers data until the next point arrives with the correct timestamp.
// In this case it will buffer the predicted point for 1d.
real
    |join(pred)
       .as('real', 'pred')
    |log()
    |eval(lambda: "real.value" - "pred.value")
        .as('diff')
    |log()

Thanks,
Kranthi