Multidimensional Data to Influx

I would like to insert the data from the Linux task manager (top) into InfluxDB.

Setup

  • 100 snapshots of top are taken and parsed by an existing Python program.
  • The snapshots are written into InfluxDB.
  • Every value across the 100 snapshots is summarized as its average, min, and max.
  • The 100 snapshots are taken monthly, for every newly released software version.
  • In the end I want a graph with the software versions on X and, e.g., the CPU utilization on Y. Each line in the graph is the CPU utilization of one running binary.
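
The summarizing step in the setup above could look roughly like this. It is only a sketch: the shape of the parsed snapshots (one dict per snapshot, keyed by binary name) and the names `snapshots`, `summarize`, `binary_a`, and `binary_b` are assumptions, since the existing parser is not shown here.

```python
from statistics import mean

# Hypothetical parser output: one dict per top snapshot,
# mapping binary name -> metrics seen in that snapshot.
snapshots = [
    {"binary_a": {"cpu": 10.0, "mem": 1.5}, "binary_b": {"cpu": 2.0, "mem": 0.5}},
    {"binary_a": {"cpu": 20.0, "mem": 2.5}, "binary_b": {"cpu": 4.0, "mem": 0.5}},
]


def summarize(snapshots):
    """Collapse per-snapshot values into avg/min/max per binary and metric."""
    collected = {}  # (binary, metric) -> list of values across snapshots
    for snap in snapshots:
        for binary, metrics in snap.items():
            for metric, value in metrics.items():
                collected.setdefault((binary, metric), []).append(value)

    summary = {}  # binary -> {"cpu_avg": ..., "cpu_min": ..., "cpu_max": ..., ...}
    for (binary, metric), values in collected.items():
        stats = summary.setdefault(binary, {})
        stats[f"{metric}_avg"] = mean(values)
        stats[f"{metric}_min"] = min(values)
        stats[f"{metric}_max"] = max(values)
    return summary


summary = summarize(snapshots)
```

With 100 real snapshots the list would simply be longer; the result is one small dict of summary values per binary for the software version under test.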

Data structure
Software version → Binary → CPU, Mem, …

Question

  • How should I design the data schema to get all info into the database?
  • Can I actually have multiple values for one measurement?
  • The timestamp is irrelevant; what counts is software version + iteration index.
  • How many buckets do I need? One per sample?

Hello @Armin,

  • How should I design the data schema to get all info into the database?
    It depends on which version you’re using. If you’re using 3.x, you don’t have to worry about the schema, as you can have unlimited cardinality.
    If you’re using 2.x…
    I would make all your numerical values fields.
    And maybe make your user a tag? It depends on whether you need to filter by a specific user frequently. Tags are indexed and fields aren’t, so if you want to filter by a specific value frequently, make that key a tag.

  • Can I actually have multiple values for one measurement?
    Yes, you should have multiple fields and tags for one measurement. This might be helpful for 2.x: Designing Your Schema | Time to Awesome
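
To make the "multiple fields and tags for one measurement" point concrete, here is a minimal sketch that builds a single InfluxDB line protocol record by hand. The measurement name `top_summary`, the choice of `version` and `binary` as tags, and the helper names are assumptions for illustration, not something prescribed above. The timestamp is left out, which line protocol allows; the server then assigns the write time.

```python
def escape_key(value):
    """Escape the separators line protocol uses in tag keys, tag values and field keys."""
    return str(value).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line(measurement, tags, fields):
    """Build one line protocol record: measurement,tags fields (no timestamp)."""
    tag_part = ",".join(
        f"{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    # All numeric values are written as floats so the field type stays consistent.
    field_part = ",".join(
        f"{escape_key(k)}={float(v)}" for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_part} {field_part}"


line = to_line(
    "top_summary",                                   # hypothetical measurement name
    {"version": "1.2.0", "binary": "my daemon"},     # tags: values you filter or group by
    {"cpu_avg": 15.0, "cpu_min": 10.0, "cpu_max": 20.0},  # fields: the numeric values
)
print(line)
```

One record per software version and binary then carries all summary values as fields, and the same measurement can hold every version.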

  • The timestamp is irrelevant; what counts is software version + iteration index.
    If the timestamp is irrelevant, I don’t recommend using a time series database. You might benefit more from a relational database.

  • How many buckets do I need? One per sample?
    I would use one bucket for all similar data.
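
If you go the relational route mentioned above, a rough sketch with Python's built-in `sqlite3` module could look like this. The table name `top_summary`, its columns, and the sample rows are made up for illustration; any relational database would work the same way.

```python
import sqlite3

# In-memory database for the sketch; a file path would persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE top_summary (
        version     TEXT NOT NULL,
        binary_name TEXT NOT NULL,
        metric      TEXT NOT NULL,
        avg_value   REAL,
        min_value   REAL,
        max_value   REAL,
        PRIMARY KEY (version, binary_name, metric)
    )
    """
)

# Made-up summary rows: one row per software version, binary and metric.
rows = [
    ("1.0.0", "binary_a", "cpu", 15.0, 10.0, 20.0),
    ("1.1.0", "binary_a", "cpu", 12.0, 9.0, 18.0),
    ("1.0.0", "binary_b", "cpu", 3.0, 2.0, 4.0),
]
conn.executemany("INSERT INTO top_summary VALUES (?, ?, ?, ?, ?, ?)", rows)

# One series for the graph: average CPU of a single binary across versions.
# ORDER BY on a TEXT column sorts lexically, so real version strings may
# need a separate sortable column.
series = conn.execute(
    "SELECT version, avg_value FROM top_summary "
    "WHERE binary_name = ? AND metric = ? ORDER BY version",
    ("binary_a", "cpu"),
).fetchall()
print(series)
```

Here the software version is an ordinary column, so no timestamp is needed, and each binary becomes one query result to plot.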