Wrong max-series-per-database exceeded error?

SWalter · October 7, 2019, 3:29pm

Hi,

We are currently seeing the following errors in our Telegraf output:

2019-09-21T18:20:51Z E! [outputs.influxdb]: when writing to [http://10.10.xxx.xxx:8086]: received error partial write: max-series-per-database limit exceeded: (1000000) dropped=832; discarding points
2019-09-21T18:20:51Z E! [outputs.influxdb]: when writing to [http://10.10.xxx.xxx:8086]: received error partial write: max-series-per-database limit exceeded: (1000000) dropped=1000; discarding points
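To see how many points are being discarded, the limit and the dropped count can be pulled out of log lines like the two above. This is only a minimal sketch: the regular expression is derived from these two sample lines alone, and the function names are made up for illustration.

```python
import re

# Pattern derived only from the two log lines quoted in this post;
# other Telegraf versions may format the message differently.
PATTERN = re.compile(
    r"max-series-per-database limit exceeded: "
    r"\((?P<limit>\d+)\) dropped=(?P<dropped>\d+)"
)


def parse_series_limit_errors(log_text):
    """Return one (limit, dropped) tuple per matching error message."""
    return [
        (int(m.group("limit")), int(m.group("dropped")))
        for m in PATTERN.finditer(log_text)
    ]


def total_dropped(log_text):
    """Sum the dropped-point counts over all matching error messages."""
    return sum(dropped for _, dropped in parse_series_limit_errors(log_text))


sample = """\
2019-09-21T18:20:51Z E! [outputs.influxdb]: when writing to [http://10.10.xxx.xxx:8086]: received error partial write: max-series-per-database limit exceeded: (1000000) dropped=832; discarding points
2019-09-21T18:20:51Z E! [outputs.influxdb]: when writing to [http://10.10.xxx.xxx:8086]: received error partial write: max-series-per-database limit exceeded: (1000000) dropped=1000; discarding points
"""

print(parse_series_limit_errors(sample))  # [(1000000, 832), (1000000, 1000)]
print(total_dropped(sample))              # 1832
```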

So we checked the metric _internal.monitor → database → max(numSeries) in Chronograf: 246k

So we should be far below the limit of 1 million series per database. We have reloaded the Telegraf config and restarted the Telegraf container, without success.

What we haven’t done yet is restart InfluxDB, but we don’t expect that to help.

Any suggestions on how to continue?

Best Regards,

Stephan Walter

hbs · October 7, 2019, 5:35pm

How many fields do you have? It seems there are two ways of counting series. One counts unique combinations of measurement+tags, regardless of the number of fields. The other, used when checking cardinality limits, counts measurement+tags+field. So if your 246k series have 4 or more fields each, you might be in the second case.

I find it disturbing that there are two definitions of what a series is, but as this is what appears in the doc, there were probably some very good reasons that led to this choice. Maybe someone from Influx can elaborate.
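To make the two counting schemes described above concrete, here is a small sketch that counts both for a handful of line-protocol points. It illustrates the two definitions as stated in this reply; it is not InfluxDB's internal implementation. The parser is deliberately simplified (no escaped spaces or commas, no string fields containing commas) and the sample points are invented.

```python
def parse_point(line):
    """Split a simple line-protocol line into (measurement, tags, field_keys).

    Simplified on purpose: assumes no escaped spaces or commas and no
    quoted string field values that contain commas or spaces.
    """
    head, fields, *_ = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = tuple(sorted(tag_pairs))  # tag order must not matter
    field_keys = [kv.split("=", 1)[0] for kv in fields.split(",")]
    return measurement, tags, field_keys


def count_series(lines):
    """Return (count by measurement+tags, count by measurement+tags+field)."""
    by_tags = set()
    by_tags_and_field = set()
    for line in lines:
        measurement, tags, field_keys = parse_point(line)
        by_tags.add((measurement, tags))
        for key in field_keys:
            by_tags_and_field.add((measurement, tags, key))
    return len(by_tags), len(by_tags_and_field)


# Invented sample data, loosely shaped like Telegraf cpu/mem output.
points = [
    "cpu,host=a,cpu=cpu0 usage_user=1.0,usage_system=2.0,usage_idle=97.0 1569090051000000000",
    "cpu,host=a,cpu=cpu1 usage_user=3.0,usage_system=1.0,usage_idle=96.0 1569090051000000000",
    "cpu,host=a,cpu=cpu0 usage_user=1.5,usage_system=2.5,usage_idle=96.0 1569090061000000000",
    "mem,host=a used=10i,free=20i 1569090051000000000",
]

print(count_series(points))  # (3, 8)
```

The third point reuses the tag set of the first one, so it adds nothing to either count. With three fields on each cpu series and two on mem, the second definition gives 8 where the first gives 3.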

SWalter · October 8, 2019, 7:51am

Hi,

Thank you for this new detail. I wasn’t aware of it.

This database contains 19 measurements from the default Telegraf input plugins.

They have between 2 and more than 10 fields, with lots of tags. So I would assume that we are far above 1 million if we count measurement+tags+field.

Do you have any idea how to calculate the number of series under this second definition?

We were not aware of this problem, so we may have missed it for quite some time, which we have to avoid in the future. It would be great to have a Kapacitor-based alert before this happens, so that we can react.
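The threshold logic behind such an alert could look roughly like the sketch below. It is plain Python, not a Kapacitor TICKscript. It assumes an InfluxDB 1.x `/query` HTTP endpoint and reads the same `numSeries` value from `_internal.monitor` that was checked in Chronograf. The base URL, the 80% warning ratio, and the sample response body are assumptions for illustration. Note that in this thread the reported numSeries stayed well below the limit while writes were rejected, so a check on this value alone may not catch every case.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, database):
    """Build a /query URL that asks for the latest numSeries of one database."""
    q = (
        'SELECT last("numSeries") FROM "_internal"."monitor"."database" '
        'WHERE "database" = \'' + database + "'"
    )
    return base_url + "/query?" + urlencode({"q": q})


def latest_num_series(response_body):
    """Extract the numSeries value from a /query JSON response body."""
    data = json.loads(response_body)
    series = data["results"][0]["series"][0]
    row = dict(zip(series["columns"], series["values"][0]))
    return row["last"]


def check_limit(num_series, limit, warn_ratio=0.8):
    """Classify usage; warn_ratio=0.8 is an arbitrary example threshold."""
    ratio = num_series / limit
    if ratio >= 1.0:
        return "critical"
    if ratio >= warn_ratio:
        return "warning"
    return "ok"


# Hand-written sample response, not captured from a real server.
SAMPLE = json.dumps({
    "results": [{
        "statement_id": 0,
        "series": [{
            "name": "database",
            "columns": ["time", "last"],
            "values": [["2019-10-05T00:00:00Z", 402000]],
        }],
    }]
})

print(build_query_url("http://localhost:8086", "telegraf"))
print(check_limit(latest_num_series(SAMPLE), 1000000))  # ok
print(check_limit(850000, 1000000))                     # warning
```

In a real setup the URL would be fetched on a schedule (for example with `urllib.request.urlopen`) and the result passed to whatever notification channel is in use.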

Best Regards,

Stephan

SWalter · October 8, 2019, 8:42am

Ok, one more thing.

I have now checked the reported numSeries for the last 7 days. Until October 5, the affected database had 402k series. Then there was a drop to 159k series.

So I wonder how this could happen, and how it fits with the possible explanation above.

SWalter · October 9, 2019, 2:00pm

We have now increased the max series limit by a factor of 10, and all error messages are gone.

Nevertheless, it is not clear to us how we could detect that we are running out of series again.

So it would be great to get some advice.

Best Regards,

Stephan Walter

SWalter · October 9, 2019, 3:19pm

I have seen that there are quite old files at /var/lib/influxdb/data/telegraf/_series/

root@influxdb:/var/lib/influxdb/data# ls -hal telegraf///*
-rw-r--r-- 1 root root 4.0M Apr 29 14:06 telegraf/_series/00/0000
-rw-r--r-- 1 root root 8.0M May 8 05:43 telegraf/_series/00/0001
-rw-r--r-- 1 root root 16M Jun 3 16:21 telegraf/_series/00/0002
-rw-r--r-- 1 root root 32M Jul 16 12:58 telegraf/_series/00/0003
-rw-r--r-- 1 root root 64M Oct 8 20:08 telegraf/_series/00/0004
-rw-r--r-- 1 root root 8.1M Sep 11 05:45 telegraf/_series/00/index
-rw-r--r-- 1 root root 4.0M Apr 29 13:54 telegraf/_series/01/0000
-rw-r--r-- 1 root root 8.0M May 8 05:52 telegraf/_series/01/0001
-rw-r--r-- 1 root root 16M Jun 3 16:23 telegraf/_series/01/0002
-rw-r--r-- 1 root root 32M Jul 16 12:59 telegraf/_series/01/0003
-rw-r--r-- 1 root root 64M Oct 8 20:08 telegraf/_series/01/0004
-rw-r--r-- 1 root root 8.1M Sep 11 06:46 telegraf/_series/01/index

So is this maybe the source of our max-series-per-database problem?

In the topic “InfluxDB 1.7.4 fails after 9 months without issues” (post #7 by MarcV), it was suggested to delete shards manually. So maybe this also applies to series?

The reason I ask is that we modified the retention policy only after quite a long time.

SWalter · November 12, 2019, 8:52am

Does nobody have anything to say about the series folder?

Should we run this utility to recreate the index?

bale836 · February 19, 2020, 7:39am

I had the same issue. Rebuild the TSI index; after that, the count is correct.
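For reference, InfluxDB 1.x ships an `influx_inspect buildtsi` subcommand for rebuilding the TSI index. The sketch below only assembles and prints the command line and does not run anything. The data directory is the one shown earlier in this thread. The WAL directory is an assumed default, and the helper name is made up. The documented procedure expects influxd to be stopped first and the command to be run as the user that owns the data files, so check the instructions for your version before using it.

```python
import shlex


def buildtsi_command(data_dir, wal_dir, database=None):
    """Assemble an `influx_inspect buildtsi` command line (not executed here)."""
    cmd = ["influx_inspect", "buildtsi", "-datadir", data_dir, "-waldir", wal_dir]
    if database:
        # Restrict the rebuild to a single database.
        cmd += ["-database", database]
    return cmd


cmd = buildtsi_command(
    "/var/lib/influxdb/data",  # data directory as seen in this thread
    "/var/lib/influxdb/wal",   # assumption: default WAL location
    database="telegraf",
)

# Print for review instead of executing; run it manually once influxd is stopped.
print(" ".join(shlex.quote(part) for part in cmd))
```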

SWalter · March 17, 2020, 4:00pm

I will try to check this as soon as possible.
