InfluxDB disk usage

Hi Guys,

I’m very new to InfluxDB and have a question about the disk usage of my InfluxDB installation.
First, let me describe my scenario:
I have a bunch of articles in my MySQL database, and all of them have a price and an available quantity. I would like to know about the price and quantity changes. Since I want to store the data every hour (I lack the possibility to do it on price/quantity change, and that is nothing I can change in the near future), I decided to use InfluxDB.

I wrote a simple importer that stores all ~1 million records in the database, with 5 columns:
the timestamp, a productId (tag, integer), an articleId (tag, integer), the quantity (field, integer) and the price (field, float).
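
For illustration, a minimal sketch of how one such record could look in InfluxDB line protocol. The measurement name "article_stats" and the sample values are assumptions made up for this example; only the tag and field names come from the description above.

```python
def to_line_protocol(measurement, product_id, article_id, quantity, price, ts_ns):
    """Format one record as an InfluxDB line protocol string."""
    # Tags are indexed and always stored as strings, even if they look numeric.
    tags = f"productId={product_id},articleId={article_id}"
    # Integer fields need a trailing "i"; a bare number is stored as a float.
    fields = f"quantity={int(quantity)}i,price={float(price)}"
    # Timestamp is given in nanoseconds since the Unix epoch.
    return f"{measurement},{tags} {fields} {ts_ns}"


# Hypothetical sample record.
line = to_line_protocol("article_stats", 42, 1001, 7, 19.99, 1546300800000000000)
print(line)
```

Note that both IDs are tags here, so every distinct productId/articleId combination becomes its own series.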

Why I’m writing this post:
The original MySQL table with 53 columns(!!) is about 250 MB in size. Every import of that data into InfluxDB costs me around 550 MB of disk space. I really don’t understand why each import takes that much space and why the database is that big.
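
To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes about 1 million rows in the MySQL table and about 1 million points per import, and it treats 1 MB as 10^6 bytes; both are approximations, not measured values.

```python
ROWS = 1_000_000                         # assumed: ~1 million records per import
MYSQL_BYTES = 250 * 10**6                # ~250 MB for the 53-column table
INFLUX_BYTES_PER_IMPORT = 550 * 10**6    # ~550 MB per import

mysql_per_row = MYSQL_BYTES / ROWS
influx_per_point = INFLUX_BYTES_PER_IMPORT / ROWS

# Raw payload of one point before any compression:
# 8-byte timestamp + 8-byte integer field + 8-byte float field.
raw_point_bytes = 8 + 8 + 8

print(f"MySQL:    {mysql_per_row:.0f} bytes per row")
print(f"InfluxDB: {influx_per_point:.0f} bytes per point")
print(f"Raw values per point: {raw_point_bytes} bytes")
```

Under these assumptions a point costs roughly 550 bytes on disk while its raw values need only 24, which suggests that most of the space goes to something other than the values themselves.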

I changed parts of the default configuration, e.g. to reduce memory usage. What I changed:

  • Set index-version to tsi1
  • Set max-series-per-database to 0 because I don’t know yet how many series I’ll need
  • Set max-values-per-tag to 0 for the same reason

All the other configuration options are set to their default values (besides http, of course). Is there any hint as to what I could do to reduce the size? I’m a bit confused, because I read a lot of articles about the compression possibilities InfluxDB offers. Do I have to enable them somehow? Or is it just because of the 2 tags (productId, articleId), which (afaik) increase the cardinality? Or am I mixing things up?
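
On the cardinality question: in InfluxDB 1.x a series is identified by the measurement plus its tag set, so with a single measurement the series cardinality is the number of distinct productId/articleId combinations. A minimal sketch of that count, using made-up sample rows:

```python
def series_cardinality(rows):
    """Count distinct (productId, articleId) tag combinations."""
    return len({(r["productId"], r["articleId"]) for r in rows})


# Hypothetical sample rows; field values are omitted because
# fields do not contribute to series cardinality.
rows = [
    {"productId": 1, "articleId": 10},
    {"productId": 1, "articleId": 11},
    {"productId": 2, "articleId": 20},
    {"productId": 1, "articleId": 10},  # same series again, e.g. the next hourly import
]

print(series_cardinality(rows))
```

If every article appears once per import, this would put the cardinality at around 1 million series, and repeated hourly imports would add points to existing series rather than create new ones.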

Let me know if there is anything missing or in case you need any further information.

Best Regards
Chris

I’m not an expert, but I had the same situation at first, which was due to the default shard duration being set to 7 days. In my case I was bulk loading data with a time span of about 40 years. InfluxDB ended up creating a large number of (small) shards, and each one has some overhead. Once I was able to set the shard duration more appropriately, disk usage, memory, etc. improved dramatically. This may or may not apply to your situation.
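
A rough sketch of why the shard duration matters for a 40-year span. The 52-week alternative is just an example value, and the counts are approximate because real shard groups are aligned to fixed time boundaries.

```python
import math
from datetime import timedelta


def shard_group_count(span, shard_duration):
    """Approximate number of shard groups needed to cover a time span."""
    return math.ceil(span / shard_duration)


span = timedelta(days=40 * 365)  # about 40 years, ignoring leap days

weekly = shard_group_count(span, timedelta(days=7))    # default 7-day duration
yearly = shard_group_count(span, timedelta(weeks=52))  # example: 52-week duration

print(f"7-day shards:   ~{weekly} shard groups")
print(f"52-week shards: ~{yearly} shard groups")

# In InfluxQL the duration can be changed with a statement like the one below
# ("mydb" is a hypothetical database name, "autogen" is the default policy):
#   ALTER RETENTION POLICY "autogen" ON "mydb" SHARD DURATION 52w
# The new duration only applies to shard groups created afterwards.
```

With hourly imports covering a short time range, the number of shards stays small, so this effect would be much weaker than in the 40-year bulk load described above.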

Thanks for your reply.
Indeed, the database size has decreased in the meantime by roughly 1 GB.

Best, Chris