We are currently seeing the following errors in our Telegraf output:
2019-09-21T18:20:51Z E! [outputs.influxdb]: when writing to [http://10.10.xxx.xxx:8086]: received error partial write: max-series-per-database limit exceeded: (1000000) dropped=832; discarding points
2019-09-21T18:20:51Z E! [outputs.influxdb]: when writing to [http://10.10.xxx.xxx:8086]: received error partial write: max-series-per-database limit exceeded: (1000000) dropped=1000; discarding points
So we checked the following metric in Chronograf: _internal.monitor → database → max(numSeries) → 246k
That should put us far below the limit of 1 million series per database. We have nevertheless reloaded the Telegraf config and restarted the Telegraf container, without success.
What we haven’t done so far is restart InfluxDB, but we don’t expect that to help.
How many fields do you have? There seem to be two ways of counting series. One counts unique combinations of measurement + tags, regardless of the number of fields. The other, used when checking cardinality limits, counts measurement + tags + field. So if your 246k series have about 4 or more fields each, you might be in the second case.
I find it disturbing that there are two definitions of what a series is, but since this is what appears in the docs, there were probably some very good reasons that led to this choice. Maybe someone from Influx can elaborate.
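To make the difference between the two counts concrete, here is a minimal Python sketch. It only illustrates the two definitions described above; it is not how InfluxDB computes cardinality internally, and the sample points and the `count_series` helper are made up for this example.

```python
from typing import Iterable, Mapping, Tuple

# A point is modelled as (measurement, tags, field keys). Hypothetical shape,
# chosen only for this illustration.
Point = Tuple[str, Mapping[str, str], Iterable[str]]


def count_series(points: Iterable[Point]) -> Tuple[int, int]:
    """Return (measurement+tags count, measurement+tags+field count)."""
    by_tags = set()
    by_field = set()
    for measurement, tags, fields in points:
        # Sort the tags so that tag order does not create a "new" series.
        tag_key = (measurement, tuple(sorted(tags.items())))
        by_tags.add(tag_key)
        for field in fields:
            by_field.add(tag_key + (field,))
    return len(by_tags), len(by_field)


# Made-up sample data, loosely shaped like Telegraf cpu/mem output.
points = [
    ("cpu", {"host": "a", "cpu": "cpu0"}, ["usage_user", "usage_system"]),
    ("cpu", {"host": "a", "cpu": "cpu1"}, ["usage_user", "usage_system"]),
    ("mem", {"host": "a"}, ["used", "free", "cached"]),
    ("cpu", {"host": "a", "cpu": "cpu0"}, ["usage_user"]),  # repeats an existing series
]

tag_series, field_series = count_series(points)
print(tag_series, field_series)  # 3 7
```

With this sample, the first definition gives 3 series and the second gives 7, so the same data can look very different depending on which count is compared against the limit.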
Thank you for this new detail; I wasn’t aware of it.
This database contains 19 measurements from the default Telegraf input plugins.
They have between 2 and more than 10 fields each, with lots of tags. So I would assume that we are far above 1 million if we count measurements + tags + fields.
Do you have any idea how to calculate the series count under this second definition of “series”?
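One rough way to approximate that second count, assuming you know the number of series (measurement + tags) and the number of fields per measurement, is to multiply the two and sum over all measurements. The sketch below does that in Python. The per-measurement breakdown is invented for illustration (only the 246k total comes from this thread), and the result is an upper bound, because it assumes every series of a measurement carries every field.

```python
from typing import Dict, Tuple


def estimate_field_series(per_measurement: Dict[str, Tuple[int, int]]) -> int:
    """Upper-bound estimate of measurement+tags+field combinations.

    per_measurement maps a measurement name to (series_count, field_count).
    """
    return sum(series * fields for series, fields in per_measurement.values())


# Hypothetical breakdown that adds up to the 246k series reported by numSeries.
example = {
    "cpu": (100_000, 10),
    "disk": (46_000, 7),
    "net": (100_000, 4),
}

total_tag_series = sum(series for series, _ in example.values())
estimate = estimate_field_series(example)

print(total_tag_series)  # 246000
print(estimate)          # 1722000
```

For comparison, a flat 4 fields on all 246k series would give 984,000 and a flat 5 fields would give 1,230,000, so the 1 million mark is crossed somewhere between 4 and 5 fields per series on average.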
We were not aware of this problem, so we may have missed it for quite some time, which we have to avoid in the future. It would be great to have a Kapacitor-based alert that fires before this happens, so that we can react.
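The threshold logic such an alert would need can be sketched as follows. This is plain Python rather than a Kapacitor TICKscript, so it only shows the decision rule. The limit of 1,000,000 is taken from the error message; the 80% warning ratio and the function name `series_alert_level` are assumptions for this example.

```python
def series_alert_level(num_series: int,
                       limit: int = 1_000_000,
                       warn_ratio: float = 0.8) -> str:
    """Classify a series count against the per-database series limit.

    limit      -- the max-series-per-database value from the error message
    warn_ratio -- assumed fraction of the limit at which to warn early
    """
    if num_series >= limit:
        return "critical"
    if num_series >= warn_ratio * limit:
        return "warning"
    return "ok"


print(series_alert_level(246_000))    # ok
print(series_alert_level(850_000))    # warning
print(series_alert_level(1_000_000))  # critical
```

Whichever series count turns out to be the one enforced by the limit is the value that should be fed into such a check.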
I have now checked the reported numSeries for the last 7 days. Until 5 October, the affected database had 402k series; then the count dropped to 159k.
So I wonder how this could happen, and how it fits with the possible explanation above?