Telegraf not able to write its state in statefile in file input plugin

Vivek_parody · October 26, 2023, 2:04pm

Hi, I am using the file input plugin. After restarting the Telegraf service, all files are read again and the data is duplicated.
I found the statefile option in the [agent] section.
But Telegraf is not writing its state (a checkpoint of how far it has read in each file) to the statefile.
Am I using the correct configuration? Please suggest a solution.
Configuration:
Telegraf.conf

[agent]
  collection_jitter = "0s"
  debug = true
  flush_interval = "60s"
  flush_jitter = "0s"
  interval = "60s"
  logfile = ""
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  omit_hostname = false
  precision = ""
  quiet = false
  round_interval = true
  hostname = "dev"
  statefile = "/usr/share/telegraf/data/statefile"

InputFile.conf

[[inputs.file]]
  alias = "lcmfile"
  files = ["/var/log/containers/lcm-service*"]
  data_format = "grok"
  grok_patterns = ['{"log":"%{GREEDYDATA:log_message}","stream":"%{WORD:stream}","time":"%{TIMESTAMP_ISO8601:timestamp}"}']
  #grok_patterns = ['%{TIMESTAMP_ISO8601:log_timestamp} %{DATA:log_source} %{WORD:log_level} %{GREEDYDATA:log}']
  tags = {host_ip="$HOSTNAME",lcm-log=""}
  file_tag = "filepath"

[[processors.regex]]
  namepass = ["file"]
  tagpass = ["lcm-log"]
  alias = "regex_lcm"
  [[processors.regex.tags]]
    key = "filepath"
    pattern = "^(.*?)_.*?log"
    replacement = "${1}"
    result_key = "podname"
    append = false

[[outputs.elasticsearch]]
  urls = ["http://opensearch-master:9200"]
  index_name = "{{podname}}-%Y.%m.%d"
  username = "admin"
  password = "admin"
  namepass = ["file"]
  metric_batch_size = 100
  tagpass = ["lcm-log"]

Logs:

2023-10-26T12:41:07Z I! Loading config: /etc/telegraf/telegraf.conf
2023-10-26T12:41:07Z I! Loading config: /etc/telegraf/telegraf.d/baremetal-metrics.conf
2023-10-26T12:41:07Z I! Loading config: /etc/telegraf/telegraf.d/bmc-log.conf
2023-10-26T12:41:07Z I! Loading config: /etc/telegraf/telegraf.d/lcm-log.conf
2023-10-26T12:41:07Z I! Loading config: /etc/telegraf/telegraf.d/telegraf.conf
2023-10-26T12:41:07Z I! Starting Telegraf 1.28.2 brought to you by InfluxData the makers of InfluxDB
2023-10-26T12:41:07Z I! Available plugins: 240 inputs, 9 aggregators, 29 processors, 24 parsers, 59 outputs, 5 secret-stores
2023-10-26T12:41:07Z I! Loaded inputs: cpu disk diskio file (5x) kernel mem net processes swap system
2023-10-26T12:41:07Z I! Loaded aggregators: 
2023-10-26T12:41:07Z I! Loaded processors: regex
2023-10-26T12:41:07Z I! Loaded secretstores: 
2023-10-26T12:41:07Z I! Loaded outputs: elasticsearch (2x) prometheus_client
2023-10-26T12:41:07Z I! Tags enabled: host=dev
2023-10-26T12:41:07Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"dev", Flush Interval:1m0s
2023-10-26T12:41:07Z D! [agent] Initializing plugins
2023-10-26T12:41:07Z W! DeprecationWarning: Value "false" for option "ignore_protocol_stats" of plugin "inputs.net" deprecated since version 1.27.3 and will be removed in 1.36.0: use the 'inputs.nstat' plugin instead
2023-10-26T12:41:07Z D! [agent] Initializing plugin states
2023-10-26T12:41:07Z D! [agent] Connecting outputs
2023-10-26T12:41:07Z D! [agent] Attempting connection to [outputs.prometheus_client]
2023-10-26T12:41:07Z I! [outputs.prometheus_client] Listening on http://[::]:9273/metrics
2023-10-26T12:41:07Z D! [agent] Successfully connected to outputs.prometheus_client
2023-10-26T12:41:07Z D! [agent] Attempting connection to [outputs.elasticsearch]

Please help me understand the problem and provide the correct configuration.

Anaisdg · October 26, 2023, 3:41pm

Hello @Vivek_parody,
I apologize, I’m not quite sure I understand the problem. I don’t see any errors in the logs.

The file plugin parses the complete contents of a file. It doesn’t have any checkpoints.

Perhaps the tail input plugin is closer to what you’re looking for?

github.com

influxdata/telegraf/blob/master/plugins/inputs/tail/README.md

# Tail Input Plugin

The tail plugin "tails" a logfile and parses each log message.

By default, the tail plugin acts like the following unix tail command:

```shell
tail -F --lines=0 myfile.log
```

- `-F` means that it will follow the _name_ of the given file, so
that it will be compatible with log-rotated files, and that it will retry on
inaccessible files.
- `--lines=0` means that it will start at the end of the file (unless
the `from_beginning` option is set).

see <http://man7.org/linux/man-pages/man1/tail.1.html> for more details.

The plugin expects messages in one of the [Telegraf Input Data
Formats](../../../docs/DATA_FORMATS_INPUT.md).

Vivek_parody · October 26, 2023, 3:53pm

Thanks @Anaisdg for the response.
Yes, I can use the tail plugin, but I think it has one limitation: it does not read files that are created after the Telegraf service has started. That is why I am using the file input plugin. Is there any solution for this issue with the tail plugin?

jpowers · October 26, 2023, 4:20pm

@Vivek_parody not sure I follow what your goal is? Are you trying to read new files as they come in? If so I would look at the directory monitor plugin.

Vivek_parody · October 26, 2023, 4:27pm

Hey @jpowers, my goal is to read the Kubernetes pod logs, which are written to the directory /var/log/containers/*. If I use the file plugin, it does not maintain state. If I use the tail plugin, it does not read new files created after the Telegraf service has started.
If you are suggesting the directory monitor plugin, will it solve the above challenges?

jpowers · October 26, 2023, 5:00pm

I am still not clear why you are trying to use the state file. Are you constantly reloading telegraf?

So if I use file plugin then its not maintaining the state.

The file plugin reads the entire file at every collection interval. When reading logs this is probably not what you want. It does not save state.

The tail plugin will effectively tail a file: it reads the file from the beginning at first, and then at each collection interval reads only new lines. It does save state when the plugin is shut down cleanly. A sudden shutdown will not save state.

The directory monitor plugin will monitor files as they come in, process them, and move on. I don’t believe this plugin tracks state either.

Each has its own use cases, even though they are very similar.

Vivek_parody · October 27, 2023, 6:20am

Thanks for your response, @jpowers.
Telegraf may go down for various reasons, for example if it is OOM-killed, or if I need to update its config and restart the instance. I think saving state is basic functionality it should provide.
Since the tail plugin does not read new files, I cannot use it either.
Could save-state functionality be added to the Telegraf file input plugin?

jayesh_verma · October 27, 2023, 7:40am

I think you are looking for something similar to filebeat/fluentbit to read logs.

Vivek_parody · October 27, 2023, 8:21am

Yes, exactly: an alternative to filebeat/fluentbit.

jpowers · October 27, 2023, 1:41pm

A restart would save state, as we have time to clean up. An OOM kill is not in scope. That would be a sudden shutdown where Telegraf gets killed, and it is not a scenario we are aiming to support.

Since tail plugin does not read the new file I can not use it either.

Tail reads new files at each gather interval.

Vivek_parody · October 27, 2023, 2:55pm

Hi @jpowers,

A restart would save state as we have time to clean up. An OOM, is not in scope. That would be a sudden shutdown where Telegraf gets killed and is not a scenario we are after supporting.

I have deployed Telegraf in a Kubernetes environment as a pod. When I delete the pod/container, it does not save its state in the statefile. I have cross-checked this many times. Maybe Telegraf treats it as a sudden shutdown.

Tail reads new files at each gather interval.

Can you please provide the configuration for the gather interval?

jpowers · October 27, 2023, 3:17pm

That is probably the case

Can you please provide me configurations for gather interval

By default this is every 10 seconds. Otherwise it can be set in the agent settings:

[agent]
   interval = "10s"

Vivek_parody · October 27, 2023, 3:55pm

That is probably the case

@jpowers Okay, so I cannot use the file plugin for my case.

Sorry for asking the same thing again, but just to confirm: are you sure that the tail plugin reads new files created after Telegraf has started? I have tried the tail plugin and it is not reading the new files.

jpowers · October 27, 2023, 4:06pm

The gather function is called at each interval, and it goes and looks for new files.

The tailNewFiles function will glob based on the file pattern you provided. One other item: if you are on Windows, watch_method may need to be set to poll.
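
As a rough illustration of the two settings mentioned here, the fragment below sets a per-plugin gather interval and the polling watch method. The glob is taken from the configuration earlier in the thread. The `value` parser is only a stand-in, assumed here so the fragment has a parser, and the 10s interval is an example value rather than a recommendation.

```toml
[[inputs.tail]]
  # Glob from the original post; matching files are looked up again at each gather.
  files = ["/var/log/containers/lcm-service*"]
  # Per-plugin override of the agent interval (example value).
  interval = "10s"
  # "inotify" is the default; "poll" may be needed on Windows.
  watch_method = "poll"
  # Stand-in parser (assumption): emit each line as a single string field.
  data_format = "value"
  data_type = "string"
```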

Vivek_parody · November 1, 2023, 10:50am

Thanks @jpowers. The tail plugin solves my problem: it writes its state to the statefile and also tails new files.
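
The final configuration was not posted, so the following is only a minimal sketch of a tail-based setup consistent with the thread. It reuses the statefile path, glob, and grok pattern from the original post. `from_beginning` is the option named in the README excerpt above, and `path_tag` is the tail plugin's counterpart to the file plugin's `file_tag`; both are assumptions about what the working setup looked like.

```toml
[agent]
  interval = "10s"
  # State is written here on a clean shutdown, as discussed above.
  statefile = "/usr/share/telegraf/data/statefile"

[[inputs.tail]]
  alias = "lcmfile"
  files = ["/var/log/containers/lcm-service*"]
  # Read existing content on first start instead of starting at the end (assumption).
  from_beginning = true
  data_format = "grok"
  grok_patterns = ['{"log":"%{GREEDYDATA:log_message}","stream":"%{WORD:stream}","time":"%{TIMESTAMP_ISO8601:timestamp}"}']
  # Tag holding the path of the tailed file; a regex written for the
  # file plugin's file_tag value may need adjusting.
  path_tag = "filepath"
```

With a statefile configured, a clean restart should resume from the saved position, while a sudden kill (such as an OOM kill) would not save state, per the replies above.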
