Greetings,
I have a running InfluxDB/Telegraf environment. I modified one of the Telegraf input plugins (lustre2) to emit an additional name tag with its own set of metric/value pairs.
If I set the output to file/stdout and run Telegraf, I get the newly added name and metric/value pairs, and it all looks as it should. If I set the output back to the InfluxDB instance, InfluxDB stores all of the original name and metric/value pairs but none of the newly added ones.
Example (InfluxDB lookup):
> SELECT * FROM "lustre2" WHERE time > now() - 2m
name: lustre2
time                 close getattr getxattr host          mkdir mknod name    open read_bytes     read_calls setattr statfs sync unlink write_bytes   write_calls
----                 ----- ------- -------- ----          ----- ----- ----    ---- ----------     ---------- ------- ------ ---- ------ -----------   -----------
2022-02-03T21:46:40Z                        oss01.cluster             OST0001      9080047906816  8659659                               6105613017088 5825226
2022-02-03T21:46:50Z                        oss01.cluster             OST0001      9080047906816  8659659                               6105613017088 5825226
2022-02-03T21:46:55Z 4036  3426    918      mds00.cluster 10    861   MDT0000 4074                           628     291    17   818
2022-02-03T21:47:00Z                        oss01.cluster             OST0001      9080047906816  8659659                               6105613017088 5825226
2022-02-03T21:47:00Z                        oss00.cluster             OST0000      18073682530304 17275330                              5874028920832 5605000
2022-02-03T21:47:00Z                        oss02.cluster             OST0002      15569399943168 14849550                              6153444524032 5869626
Example (Telegraf stdout):
# /usr/bin/telegraf --config ./telegraf.conf --input-filter lustre2 --output-filter file
2022-02-03T21:59:46Z I! Starting Telegraf 1.22.0-0c742868
2022-02-03T21:59:46Z I! Loaded inputs: lustre2
2022-02-03T21:59:46Z I! Loaded aggregators:
2022-02-03T21:59:46Z I! Loaded processors:
2022-02-03T21:59:46Z I! Loaded outputs: file
2022-02-03T21:59:46Z I! Tags enabled: host=oss01.cluster
2022-02-03T21:59:46Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"oss01.cluster", Flush Interval:10s
lustre2,host=oss01.cluster,name=lnet errors=0i,route_count=0i,drop_count=0i,route_lenght=0i,msgs_alloc=0i,send_count=29307482i,recv_count=29305067i,send_length=9086081338464i,recv_length=6245205633248i,drop_length=0i,msgs_max=96i 1643925590000000000
lustre2,host=oss01.cluster,name=OST0001 read_bytes=9080047906816i,read_calls=8659659i,write_bytes=6238057115648i,write_calls=5951636i 1643925590000000000
lustre2,host=oss01.cluster,name=OST0001 read_bytes=9080047906816i,read_calls=8659659i,write_bytes=6242376200192i,write_calls=5955758i 1643925600000000000
lustre2,host=oss01.cluster,name=lnet route_count=0i,drop_count=0i,send_length=9086083051808i,recv_length=6249520438320i,drop_length=0i,msgs_alloc=1i,msgs_max=96i,recv_count=29313307i,route_lenght=0i,errors=0i,send_count=29315723i 1643925600000000000
lustre2,host=oss01.cluster,name=lnet msgs_alloc=0i,msgs_max=96i,send_count=29323811i,errors=0i,recv_count=29321401i,route_count=0i,drop_count=0i,send_length=9086084734400i,recv_length=6253763557504i,route_lenght=0i,drop_length=0i 1643925610000000000
lustre2,host=oss01.cluster,name=OST0001 read_bytes=9080047906816i,read_calls=8659659i,write_bytes=6246611054592i,write_calls=5959800i 1643925610000000000
lustre2,host=oss01.cluster,name=OST0001 read_bytes=9080047906816i,read_calls=8659659i,write_bytes=6250276163584i,write_calls=5963298i 1643925620000000000
lustre2,host=oss01.cluster,name=lnet drop_length=0i,msgs_max=96i,send_count=29330811i,route_count=0i,drop_count=0i,recv_length=6257423033936i,msgs_alloc=1i,errors=0i,recv_count=29328392i,send_length=9086086191072i,route_lenght=0i 1643925620000000000
The original unmodified metrics (name=OST00*) appear in InfluxDB, but the new metrics (name=lnet) do not, even though Telegraf clearly generates them, as seen above.
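To make the comparison less error-prone than reading the raw output by eye, here is a minimal Python sketch that groups the field keys in the Telegraf file/stdout output by their name tag. The helper names (parse_line, field_keys_by_name) are made up for illustration, and the parser is deliberately simplified: it assumes no escaped spaces or commas, which holds for the sample lines above but not for line protocol in general.

```python
from collections import defaultdict


def parse_line(line):
    """Split one line-protocol record into (measurement, tags, fields, timestamp).

    Simplified on purpose: assumes no escaped spaces, commas, or quoted
    string fields, which is true for the lustre2 sample output.
    """
    head, field_part, timestamp = line.strip().split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = dict(pair.split("=", 1) for pair in field_part.split(","))
    return measurement, tags, fields, int(timestamp)


def field_keys_by_name(lines, measurement="lustre2"):
    """Collect the set of field keys seen for each value of the 'name' tag."""
    keys = defaultdict(set)
    for line in lines:
        if not line.startswith(measurement + ","):
            continue  # skip Telegraf log lines such as "... I! Loaded inputs"
        _, tags, fields, _ = parse_line(line)
        keys[tags.get("name", "")].update(fields)
    return {name: sorted(found) for name, found in keys.items()}


# Two metric lines and one log line copied from the Telegraf output above.
sample = [
    "2022-02-03T21:59:46Z I! Loaded outputs: file",
    "lustre2,host=oss01.cluster,name=lnet errors=0i,route_count=0i,drop_count=0i,route_lenght=0i,msgs_alloc=0i,send_count=29307482i,recv_count=29305067i,send_length=9086081338464i,recv_length=6245205633248i,drop_length=0i,msgs_max=96i 1643925590000000000",
    "lustre2,host=oss01.cluster,name=OST0001 read_bytes=9080047906816i,read_calls=8659659i,write_bytes=6238057115648i,write_calls=5951636i 1643925590000000000",
]

for name, keys in field_keys_by_name(sample).items():
    print(name, keys)
```

The resulting key list for name=lnet can then be compared against the columns that the InfluxDB query returns for the same measurement.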
Do I have to archive/destroy the existing InfluxDB database and start a new one for it to accept the new columns (the lnet fields such as send_count, msgs_alloc, etc.)? I can’t figure out why Telegraf sends the data but InfluxDB doesn’t store it.
I verified that InfluxDB keeps storing new values for the original columns, even from the data streams that also carry the lnet data. It’s as if InfluxDB is dropping or ignoring the newly added name/metric/value pairs.
Any advice is greatly appreciated.
Thanks!