Generating metrics from log file tail

Hello all, first post

I have a logfile in this format:
[ERROR] 2024-09-14T09:49:27Z master_1 runtime ORA_WRITE restofmessage
[INFO ] 2024-09-14T09:49:27Z master_2 runtime ORA_VALID restofmymessage
[FATAL] 2024-09-14T09:49:27Z master_1 runtime ORA_INVALID ofmymessage
[FATAL] 2024-09-14T09:49:27Z master_1 jobserver ORA_INVALID ofmymessage

my telegraf.conf looks like:

Global tags can be specified here in key=“value” format.

[global_tags]

dc = “us-east-1” # will tag all metrics with dc=us-east-1 ssss

Configuration for telegraf agent

[agent]

Default data collection interval for all inputs

interval = “10s”
round_interval = true
debug = true
metric_batch_size = 1000
metric_buffer_limit = 25

collection_jitter = “0s”

flush_interval = “10s”
flush_jitter = “2s”
precision = “”

Override default hostname, if empty use os.Hostname()

hostname = “”
omit_hostname = false

[[inputs.tail]]
files = [“/home/carlos/log1.log”]
from_beginning = true
watch_method = “poll”
tagexclude=[“path”]
grok_patterns = [‘[%{LOGLEVEL:level}\s*] %{TIMESTAMP_ISO8601:timestamp} %{WORD:jobname} %{WORD:server:drop} %{WORD:metricname:tag} %{GREEDYDATA:message:drop}’]
data_format = “grok”
name_override = “crdl”

[[outputs.file]]
files = [“stdout”]
data_format = “influx”

[[outputs.cloudwatch]]
region = “eu-west-2”
period = “15s”
namespace=“g6”

with access_key and secret key omitted but valid , as the cpu metrics are storing ok

I run telegraf and get the following output:
carlos@carlos-chill:~$ telegraf --debug
2024-09-17T11:57:33Z I! Starting Telegraf 1.21.4+ds1-0ubuntu2
2024-09-17T11:57:33Z I! Using config file: /etc/telegraf/telegraf.conf
2024-09-17T11:57:33Z I! Loaded inputs: cpu tail
2024-09-17T11:57:33Z I! Loaded aggregators:
2024-09-17T11:57:33Z I! Loaded processors:
2024-09-17T11:57:33Z I! Loaded outputs: cloudwatch file
2024-09-17T11:57:33Z I! Tags enabled: host=carlos-chill
2024-09-17T11:57:33Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:“carlos-chill”, Flush Interval:10s
2024-09-17T11:57:33Z D! [agent] Initializing plugins
2024-09-17T11:57:33Z D! [agent] Connecting outputs
2024-09-17T11:57:33Z D! [agent] Attempting connection to [outputs.file]
2024-09-17T11:57:33Z D! [agent] Successfully connected to outputs.file
2024-09-17T11:57:33Z D! [agent] Attempting connection to [outputs.cloudwatch]
2024-09-17T11:57:33Z D! [agent] Successfully connected to outputs.cloudwatch
2024-09-17T11:57:33Z D! [agent] Starting service inputs
2024-09-17T11:57:33Z D! [inputs.tail] Tail added for “/home/carlos/log1.log”
2024-09-17T11:57:45Z D! [outputs.cloudwatch] Wrote batch of 5 metrics in 19.64µs
2024-09-17T11:57:45Z D! [outputs.cloudwatch] Buffer fullness: 0 / 25 metrics
crdl,host=carlos-chill,metricname=ORA_WRITE jobname=“master_1”,level=“ERROR”,timestamp=“2024-09-14T09:49:27Z” 1726574253810746898
crdl,host=carlos-chill,metricname=ORA_VALID jobname=“master_2”,level=“INFO”,timestamp=“2024-09-14T09:49:27Z” 1726574253810763208
crdl,host=carlos-chill,metricname=ORA_INVALID timestamp=“2024-09-14T09:49:27Z”,jobname=“master_1”,level=“FATAL” 1726574253810775478
crdl,host=carlos-chill,metricname=ORA_INVALID level=“FATAL”,timestamp=“2024-09-14T09:49:27Z”,jobname=“master_1” 1726574253810785378
crdl,host=carlos-chill,metricname=ORA_END level=“WARN”,timestamp=“2024-09-14T09:49:27Z”,jobname=“master_1” 1726574253810794388
2024-09-17T11:57:45Z D! [outputs.file] Wrote batch of 5 metrics in 58.4µs
2024-09-17T11:57:45Z D! [outputs.file] Buffer fullness: 0 / 25 metrics
2024-09-17T11:57:56Z D! [outputs.cloudwatch] Wrote batch of 13 metrics in 138.622367ms
2024-09-17T11:57:56Z D! [outputs.cloudwatch] Buffer fullness: 0 / 25 metrics
cpu,cpu=cpu0,host=carlos-chill usage_nice=0,usage_iowait=0.2002002002001994,usage_irq=0,usage_guest_nice=0,usage_system=0.2002002002001283,usage_idle=98.89889889889325,usage_softirq=0,usage_steal=0,usage_guest=0,usage_user=0.7007007007005202 1726574270000000000
cpu,cpu=cpu1,host=carlos-chill usage_nice=0,usage_steal=0,usage_user=0.3006012024048342,usage_system=0.10020040080165885,usage_idle=99.59919839680305,usage_iowait=0,usage_irq=0,usage_softirq=0,usage_guest=0,usage_guest_nice=0 1726574270000000000
cpu,cpu=cpu2,host=carlos-chill usage_guest_nice=0,usage_system=0.9036144578312737,usage_nice=0,usage_softirq=0,usage_steal=0,usage_guest=0,usage_user=0.6024096385541825,usage_idle=98.49397590359999,usage_iowait=0,usage_irq=0 1726574270000000000
cpu,cpu=cpu3,host=carlos-chill usage_user=1.1022044088178202,usage_idle=98.29659318637194,usage_iowait=0.10020040080158765,usage_softirq=0,usage_guest=0,usage_system=0.5010020040080094,usage_nice=0,usage_irq=0,usage_steal=0,usage_guest_nice=0 1726574270000000000
cpu,cpu=cpu4,host=carlos-chill usage_system=0.4024144869215021,usage_iowait=0.1006036217303934,usage_steal=0,usage_guest=0,usage_guest_nice=0,usage_user=0.8048289738432902,usage_idle=98.69215291751291,usage_nice=0,usage_irq=0,usage_softirq=0 1726574270000000000
cpu,cpu=cpu5,host=carlos-chill usage_system=0.20080321285138328,usage_idle=99.49799196787875,usage_iowait=0,usage_guest_nice=0,usage_steal=0,usage_guest=0,usage_user=0.30120481927714626,usage_nice=0,usage_irq=0,usage_softirq=0 1726574270000000000
cpu,cpu=cpu6,host=carlos-chill usage_nice=0,usage_iowait=0.40120361083247774,usage_irq=0,usage_softirq=0,usage_steal=0,usage_guest=0,usage_guest_nice=0,usage_user=0.9027081243730481,usage_system=0.2006018054162567,usage_idle=98.49548645936372 1726574270000000000
cpu,cpu=cpu7,host=carlos-chill usage_user=0.6999999999999318,usage_idle=98.60000000000582,usage_softirq=0,usage_steal=0,usage_guest=0,usage_guest_nice=0,usage_system=0.5999999999999517,usage_nice=0,usage_iowait=0.10000000000001563,usage_irq=0 1726574270000000000
cpu,cpu=cpu8,host=carlos-chill usage_guest=0,usage_system=0.6018054162488086,usage_idle=97.39217652958641,usage_nice=0,usage_irq=0,usage_softirq=0,usage_user=2.0060180541625052,usage_iowait=0,usage_steal=0,usage_guest_nice=0 1726574270000000000
cpu,cpu=cpu9,host=carlos-chill usage_idle=99.09819639278423,usage_softirq=0,usage_guest=0,usage_guest_nice=0,usage_user=0.701402805611057,usage_system=0.20040080160313878,usage_irq=0,usage_steal=0,usage_nice=0,usage_iowait=0 1726574270000000000
cpu,cpu=cpu10,host=carlos-chill usage_guest=0,usage_guest_nice=0,usage_user=0.7056451612902487,usage_system=0.5040322580644838,usage_idle=98.79032258065544,usage_iowait=0,usage_nice=0,usage_irq=0,usage_softirq=0,usage_steal=0 1726574270000000000
cpu,cpu=cpu11,host=carlos-chill usage_steal=0,usage_idle=99.09729187562536,usage_nice=0,usage_softirq=0,usage_irq=0,usage_guest=0,usage_guest_nice=0,usage_user=0.7021063189568483,usage_system=0.20060180541629327,usage_iowait=0 1726574270000000000
cpu,cpu=cpu-total,host=carlos-chill usage_iowait=0.07523196522609002,usage_softirq=0,usage_user=0.7941151884976855,usage_idle=98.74613391289815,usage_irq=0,usage_steal=0,usage_guest=0,usage_guest_nice=0,usage_system=0.3845189333779307,usage_nice=0 1726574270000000000
2024-09-17T11:57:56Z D! [outputs.file] Wrote batch of 13 metrics in 145.51µs
2024-09-17T11:57:56Z D! [outputs.file] Buffer fullness: 0 / 25 metrics

the cpu metrics are getting generated in cloudwatch I can see on the aws console…

the telegraf output from the tail is not generating anything in CW despite the output saying that it has

could someone tell me why ?
Cheers

Hello @Carl_Rees,
It can be helfpul to use the following flags:

--debug	Enable debug logging.
--test	Gather metrics once and print them.

Right now the stdout plugin is converting to lineprotocol but we want to see the actual cloudwawtch metrics format :slight_smile:

Many Thanks Anaisdg, I did try those options but they gave me no new info that I can see: please find below

carlos@carlos-chill:~$ telegraf --debug
2024-09-18T08:19:01Z I! Starting Telegraf 1.21.4+ds1-0ubuntu2
2024-09-18T08:19:01Z I! Using config file: /etc/telegraf/telegraf.conf
2024-09-18T08:19:01Z I! Loaded inputs: cpu tail
2024-09-18T08:19:01Z I! Loaded aggregators:
2024-09-18T08:19:01Z I! Loaded processors:
2024-09-18T08:19:01Z I! Loaded outputs: cloudwatch file
2024-09-18T08:19:01Z I! Tags enabled: host=carlos-chill
2024-09-18T08:19:01Z I! [agent] Config: Interval:20s, Quiet:false, Hostname:“carlos-chill”, Flush Interval:10s
2024-09-18T08:19:01Z D! [agent] Initializing plugins
2024-09-18T08:19:01Z D! [agent] Connecting outputs
2024-09-18T08:19:01Z D! [agent] Attempting connection to [outputs.file]
2024-09-18T08:19:01Z D! [agent] Successfully connected to outputs.file
2024-09-18T08:19:01Z D! [agent] Attempting connection to [outputs.cloudwatch]
2024-09-18T08:19:01Z D! [agent] Successfully connected to outputs.cloudwatch
2024-09-18T08:19:01Z D! [agent] Starting service inputs
2024-09-18T08:19:01Z D! [inputs.tail] Tail added for “/home/carlos/log1.log”
crdl,host=carlos-chill,jobname=master_1,metricname=ORA_WRITE level=“ERROR”,timestamp=“2024-09-14T09:49:27Z” 1726647541054851712
crdl,host=carlos-chill,jobname=master_2,metricname=ORA_VALID level=“INFO”,timestamp=“2024-09-14T09:49:27Z” 1726647541054868762
crdl,host=carlos-chill,jobname=master_1,metricname=ORA_INVALID level=“FATAL”,timestamp=“2024-09-14T09:49:27Z” 1726647541054880012
crdl,host=carlos-chill,jobname=master_1,metricname=ORA_INVALID level=“FATAL”,timestamp=“2024-09-14T09:49:27Z” 1726647541054888352
crdl,host=carlos-chill,jobname=master_1,metricname=ORA_END level=“WARN”,timestamp=“2024-09-14T09:49:27Z” 1726647541054902002
2024-09-18T08:19:11Z D! [outputs.file] Wrote batch of 5 metrics in 71.229µs
2024-09-18T08:19:11Z D! [outputs.file] Buffer fullness: 0 / 25 metrics
2024-09-18T08:19:12Z D! [outputs.cloudwatch] Wrote batch of 5 metrics in 24.31µs
2024-09-18T08:19:12Z D! [outputs.cloudwatch] Buffer fullness: 0 / 25 metrics
2024-09-18T08:19:22Z D! [outputs.file] Buffer fullness: 0 / 25 metrics
^C2024-09-18T08:19:23Z D! [agent] Stopping service inputs
2024-09-18T08:19:23Z D! [inputs.tail] Tail removed for “/home/carlos/log1.log”
2024-09-18T08:19:23Z D! [agent] Input channel closed
2024-09-18T08:19:23Z I! [agent] Hang on, flushing any cached metrics before shutdown
2024-09-18T08:19:23Z D! [outputs.cloudwatch] Buffer fullness: 0 / 25 metrics
2024-09-18T08:19:23Z D! [outputs.file] Buffer fullness: 0 / 25 metrics
2024-09-18T08:19:23Z I! [agent] Stopping running outputs
2024-09-18T08:19:23Z D! [agent] Stopped Successfully
carlos@carlos-chill:~$ telegraf --test
2024-09-18T08:19:26Z I! Starting Telegraf 1.21.4+ds1-0ubuntu2
2024-09-18T08:19:26Z I! Using config file: /etc/telegraf/telegraf.conf
2024-09-18T08:19:26Z I! Loaded inputs: cpu tail
2024-09-18T08:19:26Z I! Loaded aggregators:
2024-09-18T08:19:26Z I! Loaded processors:
2024-09-18T08:19:26Z W! Outputs are not used in testing mode!
2024-09-18T08:19:26Z I! Tags enabled: host=carlos-chill
2024-09-18T08:19:26Z D! [agent] Initializing plugins
2024-09-18T08:19:26Z D! [agent] Starting service inputs
2024-09-18T08:19:26Z D! [inputs.tail] Tail added for “/home/carlos/log1.log”

crdl,host=carlos-chill,jobname=master_1,metricname=ORA_WRITE level=“ERROR”,timestamp=“2024-09-14T09:49:27Z” 1726647566853205960
crdl,host=carlos-chill,jobname=master_2,metricname=ORA_VALID level=“INFO”,timestamp=“2024-09-14T09:49:27Z” 1726647566853233060
crdl,host=carlos-chill,jobname=master_1,metricname=ORA_INVALID level=“FATAL”,timestamp=“2024-09-14T09:49:27Z” 1726647566853249310
crdl,host=carlos-chill,jobname=master_1,metricname=ORA_INVALID level=“FATAL”,timestamp=“2024-09-14T09:49:27Z” 1726647566853263059
crdl,host=carlos-chill,jobname=master_1,metricname=ORA_END level=“WARN”,timestamp=“2024-09-14T09:49:27Z” 1726647566853273119
2024-09-18T08:19:27Z D! [agent] Stopping service inputs
cpu,cpu=cpu0,host=carlos-chill usage_guest=0,usage_guest_nice=0,usage_idle=100,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=0,usage_user=0 1726647567000000000
2024-09-18T08:19:27Z D! [inputs.tail] Tail removed for “/home/carlos/log1.log”
cpu,cpu=cpu1,host=carlos-chill usage_guest=0,usage_guest_nice=0,usage_idle=100,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=0,usage_user=0 1726647567000000000
2024-09-18T08:19:27Z D! [agent] Input channel closed
2024-09-18T08:19:27Z D! [agent] Stopped Successfully
cpu,cpu=cpu2,host=carlos-chill usage_guest=0,usage_guest_nice=0,usage_idle=98.03921568627626,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=1.9607843137255705,usage_user=0 1726647567000000000
cpu,cpu=cpu3,host=carlos-chill usage_guest=0,usage_guest_nice=0,usage_idle=100,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=0,usage_user=0 1726647567000000000

why would telegraf say

2024-09-18T08:19:12Z D! [outputs.cloudwatch] Wrote batch of 5 metrics in 24.31µs

when it has not ?

How can you tell it has not? Double-checked the code and there is no way the metrics get lost on Telegraf side without an error in the log. The only way those metrics can get lost is if the server drops them without returning an error to Telegraf…