Help! InfluxDB will not add any more Telegraf hosts

Hello all,

I'm really struggling with this and wondering if anyone knows of a solution. I have set up InfluxDB 2 with Telegraf as the collector. Most of my Windows servers (all using the same Telegraf version, 1.18.2, and the same config) work fine, but some Windows 2012 R2 servers are not registering as hosts in InfluxDB, even though they can telnet to the InfluxDB server.
I have tried uninstalling and reinstalling remotely, installing manually on the server itself, and switching the bucket and token to another bucket on the server. Nothing has worked. I'm pulling my hair out and can't get to the bottom of it. Currently 151 of the 167 Telegraf agents are writing data into InfluxDB. Is there a limit to how many series there can be?
What's the cardinality limit, and how can I check whether it has been exceeded?

The reason I ask is that I still have another 200 to add :( Any help would be greatly appreciated.

Stu

I am now wondering if this could be part of the problem:

On Friday I moved the datastore from the usr drive to a data drive using the following steps:

Stopped the InfluxDB service

Moved the data with rsync
rsync -a /var/lib/influxdb/ /data/influxdb/

Changed the locations in config.toml

bolt-path = "/data/influxdb/influxd.bolt"
engine-path = "/data/influxdb/engine"

Set the ownership of the new location
sudo chown -R influxdb:influxdb /data/influxdb/
Started the InfluxDB service

All seemed to be working, as it was showing "Hosts".

I have run a test on one of the servers' Telegraf agents with the following:
D:\Program Files\telegraf>telegraf.exe --config "D:\program files\telegraf\telegraf.conf" --once --debug

And the result is

2021-05-11T07:06:54Z E! [outputs.influxdb_v2] When writing to [http://xxx.xxx.xxx.xx:8086]: failed to write metric (403 Forbidden): 403 Forbidden
2021-05-11T07:06:54Z D! [outputs.influxdb_v2] Buffer fullness: 159 / 10000 metrics
2021-05-11T07:06:54Z E! [agent] Error writing to outputs.influxdb_v2: failed to write metric (403 Forbidden): 403 Forbidden
2021-05-11T07:06:54Z I! [agent] Stopping running outputs
2021-05-11T07:06:54Z D! [agent] Stopped Successfully

Could there be something I have missed with permissions?

Overnight I have also tried:

Clearing the data from the bucket
Taking some servers out of the loop and pushing others in

Nothing has worked. If this is indeed a permissions problem, I really need some guidance, please.

Once again help is very much appreciated

I can successfully change buckets for hosts that are already in InfluxDB. Is there a "store" for this information, something that could be locking the other hosts out?

I have renamed the /data/influxdb/ folder and reset the backup, so I'm kind of starting from new.

I have added a new, different server, but I still get the 403 Forbidden when I try to add the previous servers.

Hello @ThePeltonian,
What version of InfluxDB are you using?
Can you please share your telegraf config?
If you're using OSS there shouldn't be a cardinality limit (theoretically). How much you can store should depend on your hardware.
Do you know what your cardinality is?
You could have missed something with permissions. If you’re using 2.x I would recommend deleting your tokens and recreating all access tokens and using those.
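
To make "cardinality" a bit more concrete: in InfluxDB 2.x a series is a unique combination of measurement, tag set and field key, and series cardinality is the number of those combinations in a bucket. The following is only a rough Python sketch of that idea, not an official tool. The function name `series_cardinality` and the sample lines are made up for illustration, and the parser assumes line protocol with no escaped spaces, commas or equals signs.

```python
def series_cardinality(lines):
    """Count unique (measurement, tag set, field key) combinations.

    Simplified parser: assumes no escaped spaces, commas or equals signs,
    and no spaces inside string field values.
    """
    series = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(" ")
        if len(parts) < 2:
            continue  # a point needs at least a key and a field set
        key, fields = parts[0], parts[1]
        measurement, _, tag_str = key.partition(",")
        # Tag order does not change series identity, so sort the tags.
        tags = tuple(sorted(tag_str.split(","))) if tag_str else ()
        for field in fields.split(","):
            field_key = field.split("=", 1)[0]
            series.add((measurement, tags, field_key))
    return len(series)


# Hypothetical sample data, not taken from a real server.
sample = [
    "win_cpu,host=srv01,instance=_Total Percent_Idle_Time=97.1,Percent_User_Time=1.2 1620720414000000000",
    "win_cpu,host=srv01,instance=_Total Percent_Idle_Time=96.4,Percent_User_Time=1.9 1620720424000000000",
    "win_cpu,host=srv02,instance=_Total Percent_Idle_Time=88.0 1620720414000000000",
    "win_mem,host=srv01 Available_Bytes=1024i 1620720414000000000",
]
print(series_cardinality(sample))  # prints 4
```

The point of the sketch is that each extra host multiplies the series count by however many measurement, tag and field combinations its config produces, so a few hundred hosts with the same config grow cardinality linearly rather than explosively.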

Hi @Anaisdg

Version is influxdb2-2.0.4.x86_64.rpm
telegraf version is latest (attached config file)

telegraf.txt (18.5 KB)

The InfluxDB server has 8 CPUs and 32 GB of RAM.

Since writing the initial post I have uninstalled and reinstalled InfluxDB, so I have all new tokens etc.
BUT since reinstalling, more servers have joined the ones that cannot write to InfluxDB.
Out of a total of 168 servers, only 147 can actually push data to InfluxDB. All are Windows servers and can telnet to the InfluxDB server without problems, and all use the same Telegraf client and config, but some just cannot write to InfluxDB (a couple used to be able to, but cannot since the reinstall).

I don't know what my cardinality is. How do I check this?
All Telegraf clients run under a service account which is admin on all servers.

Below is the full output from running Telegraf in debug test mode on a host that fails. The service "Access is denied" messages appear on all servers, whether they are collecting or not:

D:\Data\telegraf>telegraf.exe --config "D:\program files\telegraf\telegraf.conf" --once --debug
2021-05-14T08:45:51Z I! Starting Telegraf 1.18.2
2021-05-14T08:45:51Z D! [agent] Initializing plugins
2021-05-14T08:45:51Z D! [agent] Connecting outputs
2021-05-14T08:45:51Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2021-05-14T08:45:51Z D! [agent] Successfully connected to outputs.influxdb_v2
2021-05-14T08:45:51Z D! [agent] Starting service inputs
2021-05-14T08:45:51Z D! [inputs.win_eventlog] Subscription handle id:1
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'BrokerInfrastructure': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'CertPropSvc': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'DcomLaunch': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'DPS': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'EFS': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'gpsvc': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'LSM': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'MSDTC': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'RpcEptMapper': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'RpcSs': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'SCardSvr': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'ScDeviceEnum': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'Schedule': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'SCPolicySvc': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'SepLpsService': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'SepMasterService': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'sepWscSvc': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'sppsvc': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'SystemEventsBroker': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'TrustedInstaller': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'WdiServiceHost': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'WdiSystemHost': Access is denied.
2021-05-14T08:45:51Z D! [inputs.win_services] could not open service: 'WSService': Access is denied.
2021-05-14T08:45:51Z E! [inputs.win_perf_counters] Error in plugin: No data to return.
2021-05-14T08:45:51Z E! [inputs.win_perf_counters] Error in plugin: No data to return.
2021-05-14T08:45:52Z D! [agent] Stopping service inputs
2021-05-14T08:45:52Z D! [agent] Input channel closed
2021-05-14T08:45:52Z I! [agent] Hang on, flushing any cached metrics before shutdown
2021-05-14T08:45:52Z E! [outputs.influxdb_v2] When writing to [http://xx.xxx.xx.xxx:8086]: failed to write metric (403 Forbidden): 403 Forbidden
2021-05-14T08:45:52Z D! [outputs.influxdb_v2] Buffer fullness: 156 / 10000 metrics
2021-05-14T08:45:52Z E! [agent] Error writing to outputs.influxdb_v2: failed to write metric (403 Forbidden): 403 Forbidden
2021-05-14T08:45:52Z I! [agent] Stopping running outputs
2021-05-14T08:45:53Z D! [agent] Stopped Successfully
2021-05-14T08:45:53Z E! [telegraf] Error running agent: input plugins recorded 2 errors
D:\Data\telegraf>

The output from a host that works:

telegraf.exe --config "D:\program files\telegraf\telegraf.conf" --once --debug
2021-05-14T08:57:28Z I! Starting Telegraf 1.18.2
2021-05-14T08:57:29Z D! [agent] Initializing plugins
2021-05-14T08:57:29Z D! [agent] Connecting outputs
2021-05-14T08:57:29Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2021-05-14T08:57:29Z D! [agent] Successfully connected to outputs.influxdb_v2
2021-05-14T08:57:29Z D! [agent] Starting service inputs
2021-05-14T08:57:29Z D! [inputs.win_eventlog] Subscription handle id:1
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'BrokerInfrastructure': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'CertPropSvc': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'DcomLaunch': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'DPS': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'EFS': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'gpsvc': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'LSM': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'MSDTC': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'RpcEptMapper': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'RpcSs': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'SCardSvr': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'ScDeviceEnum': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'Schedule': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'SCPolicySvc': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'sppsvc': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'SystemEventsBroker': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'TrustedInstaller': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'WdiServiceHost': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'WdiSystemHost': Access is denied.
2021-05-14T08:57:29Z D! [inputs.win_services] could not open service: 'WSService': Access is denied.
2021-05-14T08:57:29Z E! [inputs.win_perf_counters] Error in plugin: No data to return.
2021-05-14T08:57:30Z E! [inputs.win_perf_counters] Error in plugin: No data to return.
2021-05-14T08:57:31Z D! [agent] Stopping service inputs
2021-05-14T08:57:31Z D! [agent] Input channel closed
2021-05-14T08:57:31Z I! [agent] Hang on, flushing any cached metrics before shutdown
2021-05-14T08:57:31Z D! [outputs.influxdb_v2] Wrote batch of 177 metrics in 31.997ms
2021-05-14T08:57:31Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
2021-05-14T08:57:31Z I! [agent] Stopping running outputs
2021-05-14T08:57:31Z D! [agent] Stopped Successfully
2021-05-14T08:57:31Z E! [telegraf] Error running agent: input plugins recorded 2 errors

Any help would be great on this as it will be a showstopper for us using Influx for monitoring.

BTW, to update InfluxDB to the latest release, would it just be a case of stopping the influxdb service, running "rpm --upgrade influxdb2-2.0.6.x86_64.rpm", and then checking that the configs are OK before starting up again?

@Anaisdg it looks like some of the hosts were connecting via a proxy, which was affecting them, so the cause is known now :)

Hey Stu,

It looks like the main issue is the 403 Forbidden, which is a message from InfluxDB saying that the access token supplied does not have the permissions needed to write to the organization/bucket specified. It's not clear to me why this would work for some hosts and not for others, so I'd focus on the differences between those. It's unlikely to be anything related to load or cardinality, so I'd discount those options. Are you specifying the token directly in the config file, and is it an all-access token?
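
One way to narrow down where a 403 comes from is to send a single test point to the InfluxDB 2.x write endpoint (`/api/v2/write`) from the failing host, outside of Telegraf. Below is a minimal Python sketch using only the standard library. The helper names `build_write_request` and `try_write`, and the URL, org, bucket and token values, are placeholders I made up, so substitute your own. It deliberately ignores any proxy settings so that the response comes from whatever is listening at the URL you give it.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_write_request(base_url, org, bucket, token, line):
    """Build a POST to the InfluxDB 2.x /api/v2/write endpoint."""
    query = urllib.parse.urlencode(
        {"org": org, "bucket": bucket, "precision": "s"}
    )
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v2/write?{query}",
        data=line.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
    )


def try_write(request, timeout=10):
    """Send the request with proxies disabled; return (status, body)."""
    # An empty ProxyHandler makes urllib ignore proxy environment variables.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; keep the code and body.
        return err.code, err.read().decode("utf-8", "replace")


# Placeholder values (assumptions): replace with your server and credentials.
req = build_write_request(
    "http://influxdb.example:8086", "my-org", "my-bucket", "MY_TOKEN",
    "write_test,host=manual value=1i",
)
print(req.get_method(), req.full_url)
```

Calling `try_write(req)` against a real server should return status 204 with an empty body if the token can write to that bucket. If this direct request succeeds while Telegraf on the same host still logs 403, that suggests the difference lies in how Telegraf reaches the server rather than in the token itself.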

The win_services messages seem to be related to the process not having permissions to read service data about other services. I found this that might be helpful: https://help.pdq.com/hc/en-us/articles/220531207-Service-Manager-Access-Denied

Cheers,
Steven

Cheers for that info buddy, much appreciated. The issue lay with a system environment variable that should not have been there: it was forcing traffic through a proxy that does not seem to exist. Sadly, sometimes these things happen that are out of our control :)
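
For anyone who lands here with the same symptom, a quick way to spot a stray proxy variable is to list the proxy-related environment variables that a process sees. This is just a small Python sketch with a made-up helper name, `find_proxy_vars`. It checks `HTTP_PROXY`, `HTTPS_PROXY` and `NO_PROXY` (in any letter case), which are the names Go's standard HTTP client reads, and as far as I know the Telegraf `influxdb_v2` output falls back to those when no proxy is set in its own config.

```python
import os

# Variable names read by Go's standard proxy-from-environment logic
# (upper- and lowercase forms are both honoured).
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")


def find_proxy_vars(environ):
    """Return the non-empty proxy variables found in a mapping of env vars."""
    return {
        name: value
        for name, value in environ.items()
        if name.upper() in PROXY_VARS and value
    }


if __name__ == "__main__":
    found = find_proxy_vars(os.environ)
    if found:
        for name, value in sorted(found.items()):
            print(f"{name}={value}")
    else:
        print("no proxy variables set for this process")
```

Run it under the same account the Telegraf service uses, since user-level and system-level variables can differ. Comparing the output between a host that writes fine and one that gets 403 should show whether a proxy setting is the difference.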

Amazing. Good find. :) Thanks for letting us know.