I am using Telegraf 1.12.6 (git: HEAD 6c7f2d62) on SLES 15 SP1, configured to expose metrics to a Prometheus server, and I have a special case in which I only need to collect metrics once an hour or once a day.
So I put
interval = "1h" and, in a second attempt, interval = "60m" in my inputs section.
With both configurations Telegraf does not offer those metrics, even if I wait for multiple hours. If I change the interval of this particular input to 10s or 1m, everything works fine.
Can anyone give me a hint as to what stops Telegraf from collecting metrics when using larger intervals?
Your configuration looks fine to me.
Have you tried to test the configuration with the overridden interval?
(the best option is to test with the same user that runs the telegraf service/process)
telegraf --config <PathToConf> --test
To view/write the full output ensure that you are logging everything
(you may want to create a copy of the original conf to not mess up the original)
[agent]
{...}
## Log
debug = true
quiet = false
## empty string "" means to log to stderr, otherwise specify the path to the file.
logfile = ""
{...}
The test should return some data (in influx line format), as if Telegraf had run once.
Looks like this one was a misunderstanding or, better said, a "wrong expectation" on my part.
In the meantime we found out that Telegraf does in fact deliver the metrics, but only for one scrape by Prometheus. I expected the same metrics to be delivered on every scrape during the hour.
I don't know how Prometheus works, but just out of curiosity, do you have an explicit datetime for your data?
If yes, is Prometheus updating/overriding the existing data points every time it reads the file?
Great hint! Since yesterday we had been struggling with missing metrics (about 14,000 metrics that should be read from a file), and raising the expiration_interval fixed this.
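For anyone landing here with the same symptom, here is a minimal sketch of the kind of configuration this amounts to. It is not the poster's actual config: the input plugin (inputs.file), the file path, the listen address and the 65m value are all assumptions chosen for illustration. The idea is that the prometheus_client output stops exposing a metric once it has not been updated within expiration_interval (60s by default, as far as I know), so with an hourly input the metric is only visible to roughly one scrape unless the expiration is raised above the collection interval.

```toml
# Sketch only: plugin choice, path, address and durations are assumptions.
[[inputs.file]]
  ## Hypothetical path; the thread does not name the file being read.
  files = ["/path/to/metrics.out"]
  data_format = "influx"
  ## Per-input override of the agent-wide collection interval.
  interval = "1h"

[[outputs.prometheus_client]]
  ## Address on which Telegraf exposes metrics for Prometheus to scrape.
  listen = ":9273"
  ## Keep each metric exposed for longer than the slowest input interval,
  ## so every scrape between two collections still sees it.
  expiration_interval = "65m"
```

Picking a value slightly above the input interval leaves a small margin for a late collection; for a once-a-day input the same reasoning would suggest something a little over 24h.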