I created a conf file that uses inputs.exec to run a Python script every 5 minutes. The script outputs a long list of CPU metrics, each followed by its timestamp from the source device. At first I could not get the script to print anything in Telegraf, until I realized that the timestamps were not the length Telegraf needed and that I had to add precision = "1ns". Now I am able to run it with one device, but when I give it a list of devices, some of them show up and some do not. Is it possible that the script (which takes roughly 10 seconds to run) needs a bigger timestamp precision? I tried looking around to understand what the precision flag is actually for but didn't find much information.
When you say different lengths, what do you mean? Are some nanosecond precision and some second precision?
when I give it a list of devices some of those show up and some do not
What do your metrics look like? If you print your results with outputs.file what do you see?
The timestamps from the device are in milliseconds and look like 1723213167009, so I started appending 000000 to match the length of the timestamps Telegraf generates for other traffic (like SNMP queries), and then set the precision to 1ns.
I have tried printing to a file multiple times, and every time it has looked correct. One theory I came up with last night: is there possibly a buffer that I am maxing out, so that some metrics get dropped? I only bring that up because when I take the exact same script and set it to only 1 device in the list, everything works correctly. However, when I set it to 9 devices in the list, some devices report no data into the database at all, and the others have huge gaps of missing data.
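Appending six zeros to a millisecond timestamp is the same as multiplying it by 1,000,000. A minimal sketch of doing that conversion numerically in the script instead of by string concatenation (the function name `ms_to_ns` is hypothetical, not something from the original script):

```python
NS_PER_MS = 1_000_000  # nanoseconds in one millisecond


def ms_to_ns(timestamp_ms):
    """Convert an epoch timestamp in milliseconds to nanoseconds."""
    return int(timestamp_ms) * NS_PER_MS


# The device timestamp quoted above, converted to nanoseconds.
print(ms_to_ns(1723213167009))  # 1723213167009000000
```

Doing the conversion as integer arithmetic gives the same digits as appending "000000", but it also fails loudly if the device ever returns something that is not a whole number of milliseconds.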
This is why I asked for example metrics. It is a guessing game without seeing the actual metrics, preferably with actual telegraf logs.
I ran --test with the .conf file we are using, which printed to the CLI, and I logged the entire output with PuTTY. Here are some example metrics (with semi-sensitive information changed).
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=2,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.98 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=3,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=87.03 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=4,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.18 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=5,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=87.86 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=6,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=74.24 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=7,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=89.42 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=8,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.98 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=9,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.17 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=10,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.19 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=11,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.96 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=12,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.19 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=13,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.97 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=14,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.19 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=15,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.17 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=16,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.19 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=17,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.19 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=18,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=19,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=86.61 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=20,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=73.93 1723224965072000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=21,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=87.71 1723224965072000000
I was scrolling through and every single datapoint is there for all 6 devices in the list (which doesn't happen when we send it to a database). I did notice that the very last device in the list had all the entries plus extras (I will see if I can figure out why), and I did see these items in the output:
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=38,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.19 1723225150051000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=39,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.79 1723225150051000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=40,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.99 1723225150051000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=41,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.19 1723225150051000000
2024-08-09T17:41:26Z D! [agent] Stopping service inputs
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=42,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.99 1723225150051000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=43,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.19 1723225150051000000
2024-08-09T17:41:26Z D! [agent] Input channel closed
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=44,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.14 1723225150051000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=45,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.35 1723225150051000000
2024-08-09T17:41:26Z D! [agent] Stopped Successfully
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=46,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=99.14 1723225150051000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=47,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.92 1723225150051000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=48,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=49.01 1723225150051000000
Is there a specific log file that I should look for? Running with --test and --debug didn't produce any logs or errors.
It sounds like everything is parsing just fine, so I wouldn't expect there to be any errors.
I was scrolling through and every single datapoint is there for all 6 devices in the list (which doesn’t happen when we send it to a database).
What tag is used to differentiate between the devices? source? Does each device have a unique tag?
I did notice that the very last device in the list had all the entries plus extras (I will see if I can figure out why)
What do you mean?
In general, when someone says not all of their metrics are arriving in InfluxDB, it usually means they have duplicate metrics, where the metric name, tag set, and timestamp are all equal. As a result, only the last entry shows up.
I realized you never said InfluxDB is your output. Is that in fact what you are using, or is it something else?
I have tried with VictoriaMetrics and InfluxDB; both have the same result.
It would be a combination of tags. Each source will have 2800+ metric lines: the number of CPUs (47+) times the number of timestamps (5 seconds apart) over 5 minutes of data. There should be no duplicate lines of metrics because each one is printed with the source, CPU number, and a timestamp, but I will definitely go check and make sure.
The first 5 devices in the list were all printing exactly 2890 lines of metrics, but the last one was printing 3000+. I just figured out that it has more CPUs than the other 5, so it makes sense that it is printing more lines.
To be clear, the following would be considered duplicates (I edited lines from above):
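As a rough back-of-the-envelope check of the buffer theory raised earlier, the line counts above can be added up and compared with the agent's metric buffer. This sketch assumes Telegraf's default `metric_buffer_limit` of 10000 in the `[agent]` section; the actual value in this config is not shown in the thread, and "3000+" is treated as a lower bound of 3000.

```python
# Assumption: Telegraf's default [agent] metric_buffer_limit, unless overridden.
DEFAULT_METRIC_BUFFER_LIMIT = 10_000

# Five devices at 2890 lines each, plus the last device at "3000+" (lower bound).
lines_per_device = [2890] * 5 + [3000]

total_lines = sum(lines_per_device)
print(total_lines)  # 17450
print(total_lines > DEFAULT_METRIC_BUFFER_LIMIT)  # True
```

Under these assumptions a single run of the script emits more metrics than the default buffer holds, so it may be worth checking the `[agent]` settings. Whether metrics are actually dropped depends on flush timing and the real configuration, so this is only a hint, not a diagnosis.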
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=47,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=98.92 1723225150051000000
vendordevice_api_cpu_headend_stat,bucket_name=vendordevice,cpu_number=47,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=device,stat=stat stat=49.01 1723225150051000000
Even if they have different field values (98.92 vs. 49.01), the tag set, metric name, and timestamp are identical, and so InfluxDB will only record the last item.
That makes total sense; however, that should not be the issue. I will still go in and add another dynamic tag to see if it makes a difference. I don't think that is the problem, because the script prints the CPU number as one of the tags and also appends the timestamp on the end, so those two things alone should make each line unique.
I went through and retested, and there is a discrepancy between running in --test mode and printing to a file. If I run the Telegraf config in --test mode, it prints out everything exactly how I expect it. If I print it to a file, it is missing a ton of metric lines. Here is an example of some of the lines that were omitted from the file but existed in the CLI output from --test mode. Again, certain pieces are replaced, but the two pieces that change are cpu_number and the timestamp, which would keep any of these from being duplicates. I also put the entire CLI --test mode output in Excel and searched for duplicates and got zero back.
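The duplicate rule described above can be checked mechanically. This is a minimal sketch, not a full line-protocol parser: it assumes no escaped spaces or commas and no spaces inside field values, which holds for the sample lines in this thread. The names `series_key` and `find_duplicates` are hypothetical helpers.

```python
from collections import Counter


def series_key(line):
    """Return (measurement, sorted tag set, timestamp) for one line.

    Simplified parser: assumes no escaped spaces or commas and no
    spaces inside field values, as in the sample lines above.
    """
    head, _fields, timestamp = line.strip().rsplit(" ", 2)
    measurement, *tags = head.split(",")
    return measurement, tuple(sorted(tags)), timestamp


def find_duplicates(lines):
    """Map each series key that occurs more than once to its count."""
    counts = Counter(series_key(line) for line in lines if line.strip())
    return {key: n for key, n in counts.items() if n > 1}


# Shortened versions of the edited lines above (most tags omitted).
sample = [
    "vendordevice_api_cpu_headend_stat,cpu_number=47,source=device stat=98.92 1723225150051000000",
    "vendordevice_api_cpu_headend_stat,cpu_number=47,source=device stat=49.01 1723225150051000000",
    "vendordevice_api_cpu_headend_stat,cpu_number=48,source=device stat=49.01 1723225150051000000",
]
duplicates = find_duplicates(sample)
print(len(duplicates))  # 1
```

The field value is deliberately left out of the key, so the first two sample lines collide even though their values differ. Running this over the full script output would show whether any points overwrite each other.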
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=0,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=32.92 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=1,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.58 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=2,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.18 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=3,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=85.24 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=4,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.18 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=5,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=87.22 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=6,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=94.27 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=7,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=87.42 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=8,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.18 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=9,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.18 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=10,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.19 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=11,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.38 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=12,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.19 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=13,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.18 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=14,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.19 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=15,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.38 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=16,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.19 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=17,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=18,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=19,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=86.5 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=20,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=65.44 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=21,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=84.66 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=22,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=97.79 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=23,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=76.22 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=24,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.19 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=25,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=76.76 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=26,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=27,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=79.09 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=28,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=29,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=30,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=31,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=32,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=33,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=76.22 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=34,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=35,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=75.62 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=36,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=37,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=78.41 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=38,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=39,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=79.92 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=40,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=41,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=72.86 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=42,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.6 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=43,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=75.87 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=44,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=45,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=46,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.6 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=47,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=92.7 1723558907373000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=ALL,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=91.98 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=0,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=52.22 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=1,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.17 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=2,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.18 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=3,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=84.76 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=4,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=98.98 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=5,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=87.84 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=6,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=95.07 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=7,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=87.06 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=8,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=98.98 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=9,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.59 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=10,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.39 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=11,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.38 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=12,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.39 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=13,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.38 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=14,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.39 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=15,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.38 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=16,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=17,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=18,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=19,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=85.59 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=20,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=66.38 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=21,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=85.01 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=22,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=97.98 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=23,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=77.59 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=24,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=98.99 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=25,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=77.04 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=26,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=27,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=79.61 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=28,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=29,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=98.8 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=30,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=31,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=32,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=33,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=75.37 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=34,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=35,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=75.62 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=36,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=37,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=78.76 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=38,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=39,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=80.7 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=40,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.6 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=41,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=72.54 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=42,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=43,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=75.67 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=44,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=45,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.2 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=46,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=99.4 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=47,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=93.46 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=ALL,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=91.67 1723558917389000000
I also noticed that if I run the Python script directly by itself, it prints out differently than if I run it through Telegraf. For example, Telegraf reorders the tags (moving cpu_number) and adds the host tag.
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,datasource=vendordevice_api_system_cpu_headend,stat=vendorstat,source=devicename,cpu_number=59 vendorstat=67.13 1723560429169000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,datasource=vendordevice_api_system_cpu_headend,stat=vendorstat,source=devicename,cpu_number=60 vendorstat=82.20 1723560429169000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,datasource=vendordevice_api_system_cpu_headend,stat=vendorstat,source=devicename,cpu_number=61 vendorstat=83.90 1723560429169000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,datasource=vendordevice_api_system_cpu_headend,stat=vendorstat,source=devicename,cpu_number=62 vendorstat=87.89 1723560429169000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,datasource=vendordevice_api_system_cpu_headend,stat=vendorstat,source=devicename,cpu_number=63 vendorstat=83.14 1723560429169000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=58,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=96.81 1723560519359000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=59,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=67.27 1723560519359000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=60,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=81.14 1723560519359000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=61,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=84.74 1723560519359000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=62,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=87.16 1723560519359000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=63,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=85.71 1723560519359000000
Yes, there will be differences between running with --test and --once, as they do different things.
One thing to note: within a series, the only thing that distinguishes one of your points from another is the timestamp. For example:
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=0,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=52.22 1723558912388000000
vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,cpu_number=0,datasource=vendordevice_api_system_cpu_headend,host=telegrafserver,source=devicename,stat=vendorstat vendorstat=32.92 1723558907373000000
If the precision you chose is not finer than 1 second, these points may get read in as the same point! You have never shared your config, so I have no idea what you are actually doing with your outputs or precision, but it had better be set to something other than 1 second.
Yes, that is correct. It goes through cpu "ALL" and then cpu 0-47 for the same timestamp, then wraps around and does the exact same ones for the next timestamp.
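The collision risk depends on how the precision compares with the spacing between samples. The sketch below is not Telegraf's code: it assumes timestamps are rounded to the nearest multiple of the precision, and it uses a shortened, illustrative tag set. The two timestamps and values are the ones from the example lines above.

```python
def round_to_precision(ts_ns: int, precision_ns: int) -> int:
    """Round a nanosecond timestamp to the nearest multiple of precision_ns.

    Assumption: round-to-nearest is used here only to illustrate the effect.
    """
    return (ts_ns + precision_ns // 2) // precision_ns * precision_ns


# Same series (measurement + tags); only timestamp and value differ.
# The tag set is abbreviated for readability.
SERIES = "vendordevice_api_cpu_headend_vendorstat,cpu_number=0,source=devicename"
points = [
    (SERIES, 1723558912388000000, 52.22),
    (SERIES, 1723558907373000000, 32.92),
]


def distinct_points(points, precision_ns: int) -> int:
    """Count points that stay distinct after their timestamps are rounded."""
    return len(
        {(series, round_to_precision(ts, precision_ns)) for series, ts, _ in points}
    )


for label, precision_ns in [("1ns", 1), ("1s", 10**9), ("10s", 10**10)]:
    print(label, distinct_points(points, precision_ns))
# 1ns 2
# 1s 2
# 10s 1
```

These two samples are about five seconds apart, so under this assumption they only merge once the rounding granularity approaches that spacing. Points from the same series that land on the same timestamp are treated as one point by InfluxDB.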
[agent]
collection_jitter = "0s"
flush_interval = "5s"
flush_jitter = "0s"
precision = "1ns"
hostname = ""
omit_hostname = false
debug = true
I have tried outputting to InfluxDB and also VictoriaMetrics, but since I figured out that the issue occurs before them, I am just testing printing to a file right now.
I set the precision as 1ns since that’s the granularity of the epoch time stamp. Should it be set differently than that?
No.
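For reference, the device reports epoch milliseconds while the metrics shown above carry 19-digit nanosecond timestamps. A minimal sketch of that conversion follows. The helper names `ms_to_ns` and `to_line` are hypothetical and are not taken from the actual script, and the tag set is shortened.

```python
NS_PER_MS = 1_000_000


def ms_to_ns(ts_ms: int) -> int:
    """Convert an epoch timestamp in milliseconds to nanoseconds."""
    return int(ts_ms) * NS_PER_MS


def to_line(measurement: str, tags: dict, field: str, value: float, ts_ms: int) -> str:
    """Build one line-protocol line with a nanosecond timestamp.

    Assumes tag keys and values contain no spaces, commas, or equals signs,
    so no escaping is performed.
    """
    tag_str = ",".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"{measurement},{tag_str} {field}={value} {ms_to_ns(ts_ms)}"


line = to_line(
    "vendordevice_api_cpu_headend_vendorstat",
    {"source": "devicename", "cpu_number": "0"},  # abbreviated tag set
    "vendorstat",
    52.22,
    1723558912388,
)
print(line)
# vendordevice_api_cpu_headend_vendorstat,cpu_number=0,source=devicename vendorstat=52.22 1723558912388000000
```

Multiplying the integer gives the same result as appending "000000" to the string, and it fails loudly if the device ever returns a non-numeric value.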
Is there a way to see performance stats for Telegraf? I am still curious whether I am overloading it or overrunning buffers, or something along those lines. I changed the Python script from a list of 9 individual devices to just 1 and everything worked fine. I then created 9 individual scripts (one for each device) and 9 individual Telegraf config files (one for each script) so that it would treat them individually. Once I did that and deployed the missing data came back again (like it is dropping metrics again). All in all it is roughly 28,500 metric lines, between 2,800 and 3,800 lines per individual run of inputs.exec.
You would get a message about buffer overflow in your logs. If this was the case, then yes it would result in dropping metrics.
Once I did that and deployed the missing data came back again (like it is dropping metrics again).
Now you have a light switch that you can go work with. I would use 2 of these to limit the amount of data coming in and start working through what is going on. I would run two different Telegraf instances, collect the data from each into a file, and compare them.
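One way to do that comparison is to reduce every line to a key of measurement, sorted tags, and timestamp, so that tag order and the added host tag do not show up as differences. This is only a sketch: the parser is simplified and assumes no escaped spaces or commas, which holds for the metrics in this thread but not for line protocol in general. The function names are made up for illustration.

```python
def point_key(line: str, ignore_tags=("host",)):
    """Return (measurement, sorted tag pairs, timestamp) for one line.

    Simplified parser: assumes "measurement,tags fields timestamp" with no
    escaped spaces or commas and a timestamp on every line.
    """
    head, _fields, timestamp = line.strip().rsplit(" ", 2)
    measurement, *tag_parts = head.split(",")
    tags = sorted(
        tuple(part.split("=", 1))
        for part in tag_parts
        if part.split("=", 1)[0] not in ignore_tags
    )
    return measurement, tuple(tags), int(timestamp)


def load_keys(path: str) -> set:
    """Read a file of line-protocol metrics into a set of point keys."""
    with open(path) as handle:
        return {point_key(line) for line in handle if line.strip()}


def missing_points(expected_path: str, actual_path: str) -> set:
    """Points present in the expected file but absent from the actual one."""
    return load_keys(expected_path) - load_keys(actual_path)


# The same point as printed by the script and as it would look after
# Telegraf sorts the tags and adds the host tag (illustrative pair).
script_line = (
    "vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,"
    "datasource=vendordevice_api_system_cpu_headend,stat=vendorstat,"
    "source=devicename,cpu_number=59 vendorstat=67.13 1723560429169000000"
)
telegraf_line = (
    "vendordevice_api_cpu_headend_vendorstat,bucket_name=vendordevice,"
    "cpu_number=59,datasource=vendordevice_api_system_cpu_headend,"
    "host=telegrafserver,source=devicename,stat=vendorstat "
    "vendorstat=67.13 1723560429169000000"
)
print(point_key(script_line) == point_key(telegraf_line))  # True
```

Calling `missing_points("script.out", "telegraf.out")` (both file names are placeholders) would then list the points that the script produced but that never reached the output file.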
Welp looks like I found the problem:
2024-08-14T15:55:03Z W! [outputs.influxdb] Metric buffer overflow; 3346 metrics have been dropped
2024-08-14T15:55:03Z W! [outputs.influxdb] Metric buffer overflow; 1891 metrics have been dropped
2024-08-14T15:55:03Z W! [outputs.influxdb] Metric buffer overflow; 1891 metrics have been dropped
2024-08-14T15:55:03Z W! [outputs.influxdb] Metric buffer overflow; 1835 metrics have been dropped
So you suggested having multiple telegrafs running as an option. Would it help to decrease the flush interval to something like 1s, or better to just have multiple telegrafs instead?
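The overflow warnings relate to the agent's `metric_buffer_limit` setting (default 10000 metrics per output), with `metric_batch_size` (default 1000) controlling how many metrics are sent per write. Here is a rough worst-case estimate using the figures quoted earlier in the thread. It assumes the default buffer limit applies, since the shared [agent] section does not set it, and that the whole burst arrives before anything is flushed. The headroom factor of 2 is an arbitrary illustration, not a recommendation from the Telegraf documentation.

```python
import math


def worst_case_dropped(burst_metrics: int, buffer_limit: int) -> int:
    """Metrics dropped if the whole burst arrives before any flush completes."""
    return max(0, burst_metrics - buffer_limit)


def suggested_buffer_limit(burst_metrics: int, headroom: float = 2.0) -> int:
    """Buffer size that holds one full burst with some headroom (assumed factor)."""
    return math.ceil(burst_metrics * headroom)


BURST = 28_500                 # approximate metric lines per collection, from the thread
DEFAULT_BUFFER_LIMIT = 10_000  # Telegraf's default metric_buffer_limit

print(worst_case_dropped(BURST, DEFAULT_BUFFER_LIMIT))  # 18500
print(suggested_buffer_limit(BURST))                    # 57000
```

In practice the output writes batches while metrics are still arriving, so real drop counts will be lower than this bound, as the log lines above suggest.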