Doubts about creating a schema

Hello there,

(I am totally new to InfluxDB.)

I am working on a project where we have high frequency data (around 1800 data points per second).
Ideally, we want to have this data separated into “experiments”. Each experiment has a unique ID and lasts between 1 and 3 hours.

My question is: would it make sense to create a bucket for each experiment, or is it better to save the data to the same bucket under different tags using the unique IDs?

Any tip would be appreciated. Cheers

Welcome to the forum.

Fellow beginner here, but some questions first…

  1. Do you have (or need) tags on the data points that you are ingesting? Or is it enough to just collect all 1800 data points per second x 3 hours and then do calcs on that dataset?
  2. Do you ever plan to do comparisons between each set of data (e.g. between each bucket, if you indeed stored each experiment in a different bucket)?
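
For a sense of scale, here is a quick back-of-the-envelope sketch of how many points one experiment would produce. It assumes the rate stays at a steady 1800 points per second for the whole run; the function name is made up for illustration.

```python
# Rough size of one experiment, using the figures quoted in this thread.
POINTS_PER_SECOND = 1800  # ingest rate mentioned in the question
SECONDS_PER_HOUR = 3600


def points_per_experiment(hours: float, rate: int = POINTS_PER_SECOND) -> int:
    """Number of points written during an experiment of the given length."""
    return int(rate * SECONDS_PER_HOUR * hours)


shortest = points_per_experiment(1)  # 1-hour experiment
longest = points_per_experiment(3)   # 3-hour experiment
print(f"1 h: {shortest:,} points, 3 h: {longest:,} points")
# prints "1 h: 6,480,000 points, 3 h: 19,440,000 points"
```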

Hello,

  1. In my case tags are not needed. Perhaps I could use a different “measurement” value per experiment, all stored in the same bucket? (It is enough to store the data, plot it, probably with Grafana, and later dump it to CSV or JSON to extract different features from it and perhaps use it in machine learning.)

  2. As for comparing the data directly: no, there is no requirement for that so far.

Hi again,

Based on your answers, I would create one bucket. Each experiment would be assigned a different measurement name that reflects the unique ID you mentioned (e.g. Experiment_8YT).
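
To make that concrete, here is a rough sketch of how a single point would look in InfluxDB line protocol under the two layouts discussed in this thread: one measurement per experiment, or one shared measurement with the experiment ID as a tag. The helper functions, the field key `value`, the tag key `experiment_id`, the shared measurement name `experiment`, and the timestamp are all made up for illustration; only `Experiment_8YT` comes from the suggestion above.

```python
def escape_measurement(name: str) -> str:
    # Line protocol: commas and spaces in measurement names must be escaped.
    return name.replace(",", r"\,").replace(" ", r"\ ")


def escape_key(text: str) -> str:
    # Tag keys, tag values and field keys: escape commas, equals signs, spaces.
    return text.replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def to_line(measurement, fields, timestamp_ns, tags=None):
    """Build one line-protocol line with float fields and a nanosecond timestamp."""
    head = escape_measurement(measurement)
    for key, value in sorted((tags or {}).items()):
        head += f",{escape_key(key)}={escape_key(value)}"
    body = ",".join(f"{escape_key(k)}={float(v)}" for k, v in fields.items())
    return f"{head} {body} {timestamp_ns}"


ts = 1700000000000000000  # hypothetical timestamp in nanoseconds

# Layout A: the experiment ID is part of the measurement name.
print(to_line("Experiment_8YT", {"value": 0.42}, ts))
# prints "Experiment_8YT value=0.42 1700000000000000000"

# Layout B: one shared measurement, experiment ID stored as a tag.
print(to_line("experiment", {"value": 0.42}, ts, tags={"experiment_id": "8YT"}))
# prints "experiment,experiment_id=8YT value=0.42 1700000000000000000"
```

Either way everything lives in one bucket, and the only thing that changes per experiment is the measurement name (layout A) or the tag value (layout B).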

Also, beginner tip (from another beginner)…when I am setting up this stuff at first, I often goof up something here or there. I usually put everything in a bucket that I created called “JunkBucket”, populate it with my data, build the query in InfluxDB Explorer, paste into Grafana, etc. Once I am happy with the outcome, I simply change the bucket name to the actual bucket name I want to use. Then delete JunkBucket.
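
If you build your queries from a script, the same trick is easy to sketch: keep the bucket name as a parameter so that switching from the scratch bucket to the real one is a one-word change. This assumes you are querying with Flux (InfluxDB 2.x); the function name, the `-3h` default range, and the bucket name `Experiments` are just placeholders.

```python
def experiment_query(bucket: str, measurement: str, start: str = "-3h") -> str:
    """Return a Flux query selecting one experiment's measurement from a bucket."""
    return (
        f'from(bucket: "{bucket}")\n'
        f"  |> range(start: {start})\n"
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")'
    )


# While experimenting, point the query at the scratch bucket...
print(experiment_query("JunkBucket", "Experiment_8YT"))

# ...and once the result looks right, only the bucket name changes.
print(experiment_query("Experiments", "Experiment_8YT"))
```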
