Which of the following schemas is recommended, and are there any issues with this approach?

We are designing a schema for a new product. I would like to know which of the following options is recommended, and whether you see any issues with option #2.

  1. option #1: row-based

my-measurement,rig=100,sourceTag=t1 value=100 1465839830100400200
my-measurement,rig=100,sourceTag=t100 value=200 1465839830200400200
my-measurement,rig=100,sourceTag=t500 value=300 1465839830300400200

  2. option #2: wide structure

my-measurement,rig=100 t1=100,t100=200 1465839830100400200
my-measurement,rig=100 t500=300 1465839830100400200

Note that:

  1. sourceTag values can be from t1 to t2000 (1 to 2000)
  2. in option #1, there is only one insert for a given time, rig, and sourceTag (no updates)
  3. in option #2, there will be multiple inserts for a given time and rig (updates). Subsequent inserts only add new fields (t1 to t2000); the rig and the time stay the same across those inserts
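
To make the comparison concrete, here is a minimal Python sketch that builds both layouts from the same readings. The `READINGS` data and the helper names `row_based` and `wide` are made up for illustration, and no escaping is done because the names here contain no spaces, commas, or equals signs. The sketch uses the standard line protocol form, with a comma between the measurement name and the tag set.

```python
# Hypothetical readings: (rig, source tag, value, timestamp in ns).
READINGS = [
    ("100", "t1", 100, 1465839830100400200),
    ("100", "t100", 200, 1465839830100400200),
    ("100", "t500", 300, 1465839830100400200),
]


def row_based(readings, measurement="my-measurement"):
    """Option #1: one point per (rig, sourceTag, time) with a single 'value' field."""
    return [
        f"{measurement},rig={rig},sourceTag={tag} value={value} {ts}"
        for rig, tag, value, ts in readings
    ]


def wide(readings, measurement="my-measurement"):
    """Option #2: one point per (rig, time) with one field per source tag."""
    grouped = {}
    for rig, tag, value, ts in readings:
        grouped.setdefault((rig, ts), {})[tag] = value
    lines = []
    for (rig, ts), fields in grouped.items():
        field_set = ",".join(f"{key}={val}" for key, val in fields.items())
        lines.append(f"{measurement},rig={rig} {field_set} {ts}")
    return lines


for line in row_based(READINGS) + wide(READINGS):
    print(line)
```

With three readings at one timestamp, option #1 yields three lines and option #2 yields one.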

Currently, this is a single-node environment.

Please let me know.

It really depends on what kind of analysis you’ll be using this data for. With Option 2, a later write overwrites an earlier one whenever the measurement, tags, timestamp, and field key are the same and only the field value differs, so you won’t have every data point available over time. What you do with the data hugely affects how you should organize it.
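
As far as I understand InfluxDB's handling of duplicate points, a write with the same measurement, tag set, and timestamp lands on the same point: new field keys are merged in, and an existing field key takes the newest value. The sketch below only simulates that rule in plain Python; `write_point` and the in-memory `store` are hypothetical stand-ins, not InfluxDB APIs.

```python
# In-memory stand-in for the database: one entry per (measurement, tags, time).
store = {}


def write_point(store, measurement, tags, fields, ts):
    """Merge fields into the point identified by measurement, tag set, and timestamp."""
    key = (measurement, tuple(sorted(tags.items())), ts)
    # New field keys are added; an existing field key is overwritten.
    store.setdefault(key, {}).update(fields)
    return store[key]


ts = 1465839830100400200
write_point(store, "my-measurement", {"rig": "100"}, {"t1": 100, "t100": 200}, ts)
write_point(store, "my-measurement", {"rig": "100"}, {"t500": 300}, ts)
# Same point, same field key: the earlier t1 value is lost.
merged = write_point(store, "my-measurement", {"rig": "100"}, {"t1": 999}, ts)

print(merged)  # {'t1': 999, 't100': 200, 't500': 300}
```

So adding t500 in a second insert is harmless, but resending t1 for the same rig and time replaces the old value.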

I would like to know which way is recommended, and whether you see any issues with option #2 other than the overwrites.

We had this question in my project and wound up trying it both ways. We had more problems with Option 1 overall.

Some problems we had with Option 1:

  • We use Grafana as a front-end, and with Option 1, Grafana doesn’t know what your “tags” are. As a result it can’t hint to you what tag names you might want to filter on.
  • You can’t group-by the values of a given tag like t100. You can’t add a clause that says “group by the different values that you see when sourceTag is t100.”
  • I wasn’t able to do calculations between fields when the fields are in different “points” as in your Option 1.
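
To illustrate the last point: with the row-based layout, a calculation such as t1 minus t100 first needs the rows lined up by rig and timestamp, whereas in the wide layout both values already sit in the same point. Here is a rough sketch of that client-side alignment in Python, using made-up data and hypothetical helpers `pivot` and `difference`:

```python
# Made-up row-based query results (Option 1 shape).
rows = [
    {"time": 1, "rig": "100", "sourceTag": "t1", "value": 100.0},
    {"time": 1, "rig": "100", "sourceTag": "t100", "value": 40.0},
    {"time": 2, "rig": "100", "sourceTag": "t1", "value": 110.0},
    {"time": 2, "rig": "100", "sourceTag": "t100", "value": 45.0},
]


def pivot(rows):
    """Group rows by (rig, time) so each group looks like one wide point."""
    wide = {}
    for row in rows:
        wide.setdefault((row["rig"], row["time"]), {})[row["sourceTag"]] = row["value"]
    return wide


def difference(wide, a, b):
    """Compute a - b for every (rig, time) where both signals are present."""
    return {
        key: fields[a] - fields[b]
        for key, fields in wide.items()
        if a in fields and b in fields
    }


diffs = difference(pivot(rows), "t1", "t100")
print(diffs)  # {('100', 1): 60.0, ('100', 2): 65.0}
```

This only works when the signals share exact timestamps. In the Option 1 example above they do not, so some resampling would be needed first.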

Conversely, we haven’t had any problem with extremely wide points that have many, many “tag” values on them (stored as fields), as in Option 2 (hundreds or thousands of tags in our case).

Thank you for the reply. Excellent points to consider.