Compute SLAs with Flux as % time OK vs % time NOK

Hello everybody.

I would like to use Flux to compute SLAs from OK/NOK events (for example, assuming 1 = OK, 0 = NOK) as % time OK vs % time NOK within one time interval (start/end).

I’ve found a Python library, traces, which does this calculation with the method traces.distribution(start, end):

https://traces.readthedocs.io/en/latest/#quickstart-using-traces

Is there any Flux function/library, or any combination of functions, that can help us compute % time OK vs % time NOK the way the previous function does?

Thank you very much!

Hello @toni-moreno,
Can you please help me help you by providing some input data and your expected output?
At first glance, traces.distribution(start, end) seems to just output the quantiles.
You can create histograms with InfluxDB v2 as well as compute quantiles with:
https://v2.docs.influxdata.com/v2.0/reference/flux/stdlib/built-in/transformations/aggregates/quantile/

That seems different from calculating % time OK vs % time NOK.
For that type of calculation I would look into these Flux functions:
elapsed() returns the time between subsequent records, which gives you the duration spent in each OK/NOK state.
Use it in conjunction with map() to actually calculate the percentages.
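To make the elapsed()/map() idea concrete, here is a rough sketch of the same logic in plain Python (not Flux): first compute the gap between each record and the next one (what elapsed() does), then turn each state's accumulated duration into a percentage (what map() would do). The event list and window are made up for illustration.

```python
from datetime import datetime

# Hypothetical event stream: (timestamp, state), 1 = OK, 0 = NOK.
events = [
    (datetime(2042, 2, 1, 6, 0, 0), 0),
    (datetime(2042, 2, 1, 7, 45, 56), 1),
    (datetime(2042, 2, 1, 8, 51, 42), 0),
]
window_end = datetime(2042, 2, 1, 13, 0, 0)

# Step 1 (elapsed()-like): each state is held from its record until the
# next record; the last state runs until the end of the window.
durations = {0: 0.0, 1: 0.0}
for (ts, state), (next_ts, _) in zip(events, events[1:] + [(window_end, None)]):
    durations[state] += (next_ts - ts).total_seconds()

# Step 2 (map()-like): convert each state's total duration to a fraction.
total = sum(durations.values())
percentages = {state: d / total for state, d in durations.items()}
print(percentages)
```

The Flux version would group durations by the OK/NOK value before summing, but the arithmetic is the same.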

However, the easiest way for me to help you is if you can give me some sample data and the output you want your query to return.

Thanks :slight_smile:

Alternatively, this might be interesting to you. We’re working on supporting trace data, but it’s not ideal at the moment.

Hello @Anaisdg and sorry for the late response.

My use case is simple. I need to compute % OK vs % NOK for some k8s-deployed services monitored by someone else (not me) with Prometheus and Alertmanager.

The only data I have is events of the form service_id: timestamp: OK or service_id: timestamp: NO-OK in a SQL database I can access.

I think the example from the traces lib is good enough, replacing 0 with NO-OK and 1 with OK:

>>> time_series = traces.TimeSeries()
>>> time_series[datetime(2042, 2, 1,  6,  0,  0)] = 0 #  6:00:00am
>>> time_series[datetime(2042, 2, 1,  7, 45, 56)] = 1 #  7:45:56am
>>> time_series[datetime(2042, 2, 1,  8, 51, 42)] = 0 #  8:51:42am
>>> time_series[datetime(2042, 2, 1, 12,  3, 56)] = 1 # 12:03:56pm
>>> time_series[datetime(2042, 2, 1, 12,  7, 13)] = 0 # 12:07:13pm
>>>
>>> time_series.distribution(
>>>   start=datetime(2042, 2, 1,  6,  0,  0), # 6:00am
>>>   end=datetime(2042, 2, 1,  13,  0,  0)   # 1:00pm
>>> )
Histogram({0: 0.8355952380952381, 1: 0.16440476190476191})

In this case the service was OK 16% of the time and NO-OK 83% of the time between 6:00am and 1:00pm.

If I can schedule this calculation periodically over predefined time slots (1h/1d/1week/1month), I will be able to compute time-based SLAs for any external service this way.
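Without the traces library, the same window-based distribution can be sketched in plain Python. The function below is my own illustration (not part of any library): it takes sorted (timestamp, state) events plus a start/end window, clamps events to the window edges, and returns the fraction of the window spent in each state, so it can be rerun for any 1h/1d/1week/1month slot.

```python
from datetime import datetime

def distribution(events, start, end):
    """Fraction of [start, end) spent in each state, given time-sorted
    (timestamp, state) events. Events before `start` are clamped to
    `start`, so the last of them supplies the state at window start."""
    total = (end - start).total_seconds()
    in_window = [(max(ts, start), v) for ts, v in events if ts < end]
    shares = {}
    # Each state lasts from its (clamped) timestamp until the next
    # event, or until the end of the window for the final event.
    next_times = [ts for ts, _ in in_window[1:]] + [end]
    for (ts, state), next_ts in zip(in_window, next_times):
        shares[state] = shares.get(state, 0.0) + (next_ts - ts).total_seconds() / total
    return shares

events = [
    (datetime(2042, 2, 1, 6, 0, 0), 0),
    (datetime(2042, 2, 1, 7, 45, 56), 1),
    (datetime(2042, 2, 1, 8, 51, 42), 0),
    (datetime(2042, 2, 1, 12, 3, 56), 1),
    (datetime(2042, 2, 1, 12, 7, 13), 0),
]
result = distribution(events,
                      datetime(2042, 2, 1, 6, 0, 0),
                      datetime(2042, 2, 1, 13, 0, 0))
print(result)  # matches the traces example: ~83.6% in state 0, ~16.4% in state 1
```

Scheduling this over fixed slots is then just a matter of calling it with successive (start, end) pairs.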

The alternative with elapsed()/map() seems too difficult; I think it would be a good idea for Flux to have a dedicated function to compute this kind of distribution data.

thank you very much.

Hello again @Anaisdg.

I’ve just released a (very alpha) version of a kind of traces Golang port here.

I’m still planning how to work with these kinds of unevenly spaced events and time windows to compute SLAs.
