Hi,
So I'm using the inputs.prometheus plugin with RBAC configured in Kubernetes, and I'm using the Prometheus annotations to discover and scrape the pods in the cluster.
Now I would like to scrape some services in Kubernetes. I see in the documentation that this is an option, but only via the Consul Catalog:
Please see the attached screenshot:
Scraping the Kubernetes service endpoint directly works 100%. I would have thought that adding the pod annotations to the corresponding pod (the pod backing that service) would return the same metrics, but it does not.
Below is the inputs.prometheus plugin config, and RBAC is of course working, since the plugin returns the paths it is scraping, as seen in the attachment above.
OK, so I just verified that my configs are correct. Usually I scrape the service endpoints from a static list and that works 100%. What I'm doing now is scraping the pods (with the annotations).
I also checked the service details via kubectl and verified them against the URLs returned by inputs.prometheus, and they match exactly, but still no metrics…
Every time I try to upload anything with more than 2 links, this forum complains that as a new user I'm limited to only 2 links. That's why I'm uploading screenshots.
2022-09-13T16:11:40Z E! [inputs.prometheus] Error in plugin: http://10.63.77.202:8080/actuator/prometheus returned HTTP status 404 Not Found
2022-09-13T16:11:40Z E! [inputs.prometheus] Error in plugin: http://10.63.77.25:8080/actuator/prometheus returned HTTP status 404 Not Found
2022-09-13T16:12:42Z E! [inputs.prometheus] Error in plugin: http://10.63.77.202:8080/actuator/prometheus returned HTTP status 404 Not Found
2022-09-13T16:12:42Z E! [inputs.prometheus] Error in plugin: http://10.63.77.25:8080/actuator/prometheus returned HTTP status 404 Not Found
2022-09-13T16:13:27Z D! [outputs.file] Buffer fullness: 0 / 20000 metrics
2022-09-13T16:13:32Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 20000 metrics
2022-09-13T16:13:33Z D! [outputs.prometheus_client] Buffer fullness: 0 / 20000 metrics
The prometheus plugin goes through all the URLs, creates a goroutine for each URL, and collects the data. So while a couple of them return 404, the others should still have been captured.
Are you certain that these endpoints are reporting valid Prometheus metrics? I would have expected an error in that case as well, but the fact that no metrics are returned makes me wonder if the metric endpoints are empty.
As mentioned above, if I take those endpoints and put them in a static list, they return metrics.
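For context, the static-list config I normally use is roughly like this (a sketch only; the service hostname and namespace below are placeholders, and the /actuator/prometheus path matches the logs above):

[[inputs.prometheus]]
  ## Placeholder Kubernetes service DNS name, not one of my real services
  urls = ["http://my-service.my-namespace.svc.cluster.local:8080/actuator/prometheus"]
  metric_version = 2
  response_timeout = "40s"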
These endpoints are all returning metrics via Prometheus, as that is our current monitoring app. I'm testing out Telegraf and it works great; it's just this auto-discovery mode of the inputs.prometheus plugin that is giving me an issue:
[[inputs.prometheus]]
  metric_version = 2
  monitor_kubernetes_pods = true
  pod_scrape_scope = "cluster"
  pod_scrape_interval = 60
  response_timeout = "40s"
  insecure_skip_verify = true
  monitor_kubernetes_pods_namespace = "namespace"
  namepass = ['metrics1', 'metrics2', 'metrics3']
I ran that telegraf debug command in the container shell and it returned nothing. It's strange.
Maybe there is something in the config above (the telegraf agent) that's causing an issue.
I did find this in my other logs:
2022-09-13T07:49:33Z D! [inputs.prometheus] registered a delete request for "my-app-bc55d4954-hld5n" in namespace "my-namespace"
2022-09-13T07:49:33Z D! [inputs.prometheus] will stop scraping for "http://10.63.73.142:8080/actuator/prometheus"
I'm not sure if there was an issue with that app at the time, but that's about the only error I can see related to a particular app. I'm scraping quite a few apps, as you can see.
Is there any other debug command that I can use in the shell to verify this plugin?
telegraf --config /etc/telegraf/telegraf.conf --input-filter prometheus --test --debug
Hmm, --test does have some edge cases when using a service input, which the prometheus input with Kubernetes monitoring is. I would suggest running with --test-wait 120 to ensure you cover at least one collection interval.
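So the full command from above would become something like:

telegraf --config /etc/telegraf/telegraf.conf --input-filter prometheus --test --test-wait 120 --debug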
Unfortunately, I do think we are reaching the end of my knowledge of this plugin. At this point I would file a bug and we can see about building a debug version with more log output to understand why data is not getting parsed.
So I eventually figured out what the problem is: the namepass filtering breaks the inputs.prometheus plugin. Any ideas on what this could be? Or is namepass not compatible with this auto-discovery?
[[inputs.prometheus]]
  metric_version = 2
  monitor_kubernetes_pods = true
  pod_scrape_scope = "cluster"
  pod_scrape_interval = 60
  response_timeout = "40s"
  insecure_skip_verify = true
  monitor_kubernetes_pods_namespace = "namespace"
  namepass = ['metrics1', 'metrics2', 'metrics3']
When I remove the namepass metric filter, the metrics start flowing in.
Sorry for the delay, I've been taking some time off.
Ah! So namepass determines which metric names are emitted. I had wrongly assumed you changed those names for anonymity, but if you don't have any metrics actually called "metrics1", "metrics2", and/or "metrics3", then nothing will be emitted.
This is helpful in cases where you want to slim down the metrics, or where you only want certain metrics to go to a specific output, for example.
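To make that concrete, here is a minimal sketch of what a working filter could look like. The metric names (jvm_*, http_server_requests_*) are assumptions based on a typical Spring Boot /actuator/prometheus endpoint, and the sketch uses metric_version = 1, where each Prometheus metric name becomes its own Telegraf measurement that namepass can match against (glob patterns are supported):

[[inputs.prometheus]]
  ## Sketch only: with metric_version = 1 each Prometheus metric name becomes
  ## a Telegraf measurement name, which is what namepass filters on.
  metric_version = 1
  monitor_kubernetes_pods = true
  pod_scrape_scope = "cluster"
  monitor_kubernetes_pods_namespace = "namespace"
  ## Assumed Spring Boot Actuator metric names; the globs pass everything
  ## starting with these prefixes.
  namepass = ["jvm_*", "http_server_requests_*"]

## namepass also works per output, e.g. to send only a subset of metrics
## to one output (outputs.file to stdout here is just for illustration):
[[outputs.file]]
  files = ["stdout"]
  namepass = ["jvm_*"]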