Hi there,
Is InfluxDB suitable for the following type of data? If so, could you please give me some hints on how I should store it so that I can query it efficiently?
id, measure_date, prediction_date, value
1, 2017-01-01 00:00:00, 2017-01-01 00:00:00, 100.00 <- actual measured value
2, 2017-01-01 00:00:00, 2017-01-01 00:15:00, 110.00 <- prediction of what could be measured 15 minutes later
3, 2017-01-01 00:00:00, 2017-01-01 00:30:00, 120.00
4, 2017-01-01 00:15:00, 2017-01-01 00:15:00, 110.00 <- measured value 15 mins later… the prediction at id 2 was right
5, 2017-01-01 00:15:00, 2017-01-01 00:30:00, 122.00 <- more accurate prediction than at id 3 for the same point in time
Could one solution look something like…
id, ts_id, prediction_id, date, value
1, 1, null, 2017-01-01 00:00:00, 100 <- actual measured value
2, 1, 2017-01-01 00:00:00, 2017-01-01 00:15:00, 110 <- so the first timestamp would be a string
greetings
MariaCobretti
Yes, InfluxDB is well suited for that type of data.
You’ll want to get familiar with InfluxDB’s key concepts, which will help when deciding the best format to write your data. In particular, what should be stored as a tag (indexed) vs what should be stored as a field (not indexed).
In the example data you gave, it might be tempting to store IDs in a tag so they would be indexed and predictions could be quickly matched with the actual measurements in queries. However, the ID values would grow unbounded and so would the index if the IDs were stored as tags.
If possible, it would be best to have the client store the prediction until it also has the associated measurement and then write both together. Using InfluxDB’s line protocol, that might look something like:
env pred_temp_f=110.0,meas_temp_f=110.0
env pred_temp_f=120.0,meas_temp_f=119.5
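As a rough illustration of that client-side pairing, here is a minimal Python sketch. It only builds line protocol strings and does not talk to InfluxDB. The class and function names are made up for this example; the measurement name env and the field names are taken from the lines above. It assumes timestamps are naive UTC datetimes and that a later prediction for the same point in time simply replaces the earlier one.

```python
from datetime import datetime, timezone


def to_line(measurement, fields, ts):
    """Format one point as line protocol: float fields, nanosecond timestamp."""
    field_str = ",".join(f"{key}={float(value)}" for key, value in fields.items())
    # Assumption: ts is a naive datetime that is meant as UTC.
    ns = int(ts.replace(tzinfo=timezone.utc).timestamp()) * 1_000_000_000
    return f"{measurement} {field_str} {ns}"


class PredictionBuffer:
    """Hypothetical helper: holds predictions until the measurement arrives."""

    def __init__(self):
        self._pending = {}  # target time -> latest predicted value

    def add_prediction(self, target_time, value):
        # A newer prediction for the same target time overwrites the older one.
        self._pending[target_time] = value

    def add_measurement(self, time, value):
        """Return the line to write for this measurement."""
        fields = {}
        predicted = self._pending.pop(time, None)
        if predicted is not None:
            fields["pred_temp_f"] = predicted
        fields["meas_temp_f"] = value
        return to_line("env", fields, time)


buf = PredictionBuffer()
buf.add_prediction(datetime(2017, 1, 1, 0, 15), 110.0)
print(buf.add_measurement(datetime(2017, 1, 1, 0, 15), 110.0))
```

The returned string could then be sent with whatever client or HTTP write call you already use. If several predictions per measurement need to be kept, the dictionary would have to hold a list per target time instead of a single value.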
Thank you for the answer,
I think I understand your suggestion, but since I want to store around 200 predictions for each measurement, it would make the table pretty wide, I guess, and I would also have to calculate the correct time of each prediction based on its key (e.g. every 15 mins).
Instead I will use fields to associate a prediction with its corresponding measurement.
A unique id is probably not that useful, since you don't want to have it as a tag, nor do you want to query on fields.
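For reference, the time calculation mentioned above is fairly small. The sketch below assumes a fixed 15-minute spacing between predictions and a hypothetical field naming scheme (pred_001, pred_002, …); neither is something InfluxDB prescribes.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=15)  # assumed spacing between predictions


def field_name(k):
    """Hypothetical field key for the k-th prediction step, e.g. pred_002."""
    return f"pred_{k:03d}"


def prediction_time(measure_time, k):
    """Point in time that the k-th prediction refers to."""
    return measure_time + k * STEP


def horizon_index(measure_time, predicted_for):
    """Inverse of prediction_time: how many steps ahead a prediction lies."""
    steps, remainder = divmod(predicted_for - measure_time, STEP)
    if remainder or steps < 0:
        raise ValueError("prediction time is not on the 15-minute grid")
    return steps


start = datetime(2017, 1, 1, 0, 0)
print(field_name(2), prediction_time(start, 2))
print(horizon_index(start, datetime(2017, 1, 1, 0, 30)))
```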