How to store multiple time series (for blocks of recurring predictions)

#1

Hi all,

I’m looking at storing weather forecast data that arrives as regular data sets containing predictions of future values. I’ve read a few of the questions here, but I haven’t been able to find a solution that lets you both compare predictions and grab the series of just the latest predictions.

In my case I am getting a few hundred data points in each update, from the current reading now until the predicted value six hours into the future. I am getting refined versions of these every half hour, 24/7. So every 30 minutes I get a block of data with the current reading and the predictions for the next six hours. I’d like to put all this into Influx if I can.
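To make the shape of the data concrete, here is a rough Python sketch of one half-hourly block. The one-minute spacing, the `ForecastPoint` name, and the values are assumptions for illustration only, not the real feed format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ForecastPoint:
    issued_at: datetime  # when the block was received
    valid_at: datetime   # the time the value refers to
    value: float         # current reading (first point) or a prediction


def make_block(issued_at, values, step=timedelta(minutes=1)):
    """Build one block: values[0] is the current reading, the rest are predictions."""
    return [
        ForecastPoint(issued_at, issued_at + i * step, v)
        for i, v in enumerate(values)
    ]


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
# 361 placeholder values: now, now+1min, ..., now+6h (assumed one-minute spacing)
block = make_block(issued, [10.0 + 0.01 * i for i in range(361)])
print(len(block), block[0].valid_at, block[-1].valid_at)
```

A new block like this arrives every 30 minutes, so consecutive blocks overlap heavily in the times they predict.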

If I have a measurement for “now”, I’ll also need one for “now+1 minute”, “now+2 minutes” and so on, so I will end up with hundreds of measurements to cover the six-hour period. This doesn’t seem right to me, although it would make it easy to compare the accuracy of the predictions: I can set each timestamp to the time being predicted, which makes the timestamps directly comparable across all the measurements. (In other words, graphing the “now” measurement against the “now+60min” measurement will show how accurate the one-hour predictions were.) Unfortunately I also want to graph the latest predictions (with the X axis starting at “now” and ending at +6h), which I don’t think will be possible if each data point is in a different measurement. I’d need to graph the last value in each of the few hundred measurements on the Y axis, with the name of the measurement itself dictating where on the X axis the point is drawn.

Another suggestion was to use multiple fields, which seems like it would have a similar issue if I tried to graph the single latest value across many fields.

The last option I saw, which came with a warning that it may not be the best, was to use tags. This means I’ll need a few hundred tag values - “now”, “now+1min”, “now+2min”, etc. - which will let me keep a single measurement and a single field. However, the timestamps of the “now+60min” tag will be one hour behind, because the value at t=X will be a prediction for an hour in the future, which seems like it could complicate things. It will also require a few hundred different tag values, but that number would at least be constant over time.
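As a sketch of what the tag approach might look like when written as line protocol: the measurement name `forecast`, the tag key `lead_min`, and the field key `value` are names I made up, and the values are placeholders. Strictly speaking this is one tag key with a few hundred values, and every point in a block is stamped with the retrieval time.

```python
from datetime import datetime, timezone


def to_line(issued_at, lead_min, value):
    """Format one point as InfluxDB line protocol, timestamped at retrieval time."""
    ts_ns = int(issued_at.timestamp()) * 1_000_000_000  # nanosecond precision
    return f"forecast,lead_min={lead_min} value={float(value)} {ts_ns}"


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
# Hypothetical leads: the current reading, one minute ahead, one hour ahead
lines = [to_line(issued, lead, 10.0 + 0.1 * lead) for lead in (0, 1, 60)]
for line in lines:
    print(line)
```

All three lines share the 12:00 timestamp even though the `lead_min=60` value describes 13:00, which is the one-hour offset I mentioned.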

However, even with this approach I’m still not sure how I could graph the latest series of predictions when each data point has a different tag. It seems like I need a field that contains a time-of-prediction timestamp (to go along with the time of retrieval), but Influx doesn’t seem to have fields with dates or times in them.
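To illustrate why a second timestamp seems to be what I need, here is a plain-Python sketch (not Influx) in which every point carries both the retrieval time and the time it is predicting. The tuple layout and the sample numbers are invented, and I use 30-minute steps just to keep it short. With both times present, each of my two queries becomes a simple filter.

```python
from datetime import datetime, timedelta, timezone

t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
half_hour = timedelta(minutes=30)

# (issued_at, valid_at, value): three invented blocks of four points each
points = [
    (t0 + b * half_hour, t0 + (b + k) * half_hour, 10.0 + b + 0.5 * k)
    for b in range(3)
    for k in range(4)
]


def latest_forecast(points):
    """Query 1: the most recent block, ordered by the time being predicted."""
    newest = max(issued for issued, _, _ in points)
    return sorted((valid, v) for issued, valid, v in points if issued == newest)


def errors_at_lead(points, lead):
    """Query 2: prediction minus the reading later observed, for one lead time."""
    observed = {valid: v for issued, valid, v in points if issued == valid}
    return {
        valid: v - observed[valid]
        for issued, valid, v in points
        if valid - issued == lead and valid in observed
    }


print(latest_forecast(points))
print(errors_at_lead(points, timedelta(minutes=60)))
```

The first function is the "latest predictions" graph and the second is the accuracy comparison at a fixed lead time. What I can’t see is how to express the same thing in Influx.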

It would be great if someone else more experienced than me could offer any suggestions about how this data could be stored while allowing the two types of queries to be performed.

Many thanks for any hints!

#2

Hi,

Have you considered using something like LoudML machine learning?

Influx webinar
LoudML tutorial by one of the guys at LoudML, the guy in the webinar

I’ve dabbled with it a bit. I want to predict 30 days into the future for potential failures, but you should be able to predict 60 minutes ahead with something like LoudML.

The community/free edition allows you to run up to 5 models.

#3

Thanks for the reply. I’m not doing any of the predictions myself; I’m getting the data from another organisation at regular intervals and looking for somewhere to store it so that decisions can be made on it. I thought Influx would be the way to go, but it doesn’t look like it’s able to cope with this sort of multi-dimensional time series data?