Reading a CSV file using Telegraf

I am reading a CSV file into InfluxDB using Telegraf, with the interval set to 10s in the config file.
At each interval, Telegraf reads the file from the beginning rather than from where it stopped, which creates duplicate entries in my measurement.
How can this be avoided?

It sounds like you are currently using the file input. If you switch over to the tail input, only new lines will be added.
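
For reference, a minimal tail input sketch might look like this (the file path is an assumption for illustration):

```toml
# Hypothetical tail input: follows the file and ingests only new lines.
[[inputs.tail]]
  files = ["/var/data/metrics.csv"]  # assumed path
  from_beginning = false             # start at the end of an existing file
  data_format = "csv"
  csv_header_row_count = 1           # first row holds the column names
```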

But the file gets created only once every day, and that is it. After that point I want to read this file into InfluxDB, which is why I used this plugin.

Apart from this file, there are many other files which I read using the tail plugin. Those files keep getting updated every minute for 5-6 hours, so the tail plugin works fine in that case and keeps pushing new data to the DB. For this once-a-day file, I think the tail plugin would just keep posting the last line over and over again.

I think when Telegraf starts collecting data at time T1, it picks some data from each of the files and stops as soon as it hits the limit defined by metric_batch_size. When it collects again at time T1+interval, it reads this file from the beginning rather than starting from where it stopped the last time.

So if this situation can be dealt with, I suppose I will achieve what I want. I have tried using from_beginning = false, but it didn't work, as this parameter is not part of the inputs.file plugin.

I have googled for a solution but no luck so far. Please let me know if something can be done about it.

Other possible solutions are:

  1. If the file is generated only once a day, you can set the interval of the input plugin to 24h (this overrides the interval set in the "Agent" section for this plugin only). This way, the file will be read once every day.
    There are two limitations to this approach:
    1.1. You can't choose the exact execution time (e.g. run at 03:00 PM).
    1.2. If the file is not re-generated, you will read duplicate data.

  2. Another way is to write a custom script (run via the exec plugin) that compares the file's creation/modification date with the time of the last run; at the given hour, if the CSV has changed since the last run, it loads the data.
    If the script/command is short, you may not need a separate script file; you can put the command string directly inside the exec config (this avoids dependencies between the conf and script files).
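
Option 1 above could look like this in telegraf.conf (the file path is an assumption for illustration):

```toml
# Hypothetical file input with a per-plugin interval override:
# this input runs once every 24h regardless of the agent-level interval.
[[inputs.file]]
  files = ["/var/data/daily.csv"]  # assumed path
  data_format = "csv"
  csv_header_row_count = 1
  interval = "24h"                 # overrides the agent interval for this input only
```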
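
For option 2, here is a minimal sketch of such a script, suitable for the exec plugin. The function name, both paths, and the state-file approach are all assumptions for illustration; `stat -c %Y` is the GNU/Linux form (on BSD/macOS it would be `stat -f %m`).

```shell
#!/bin/sh
# Hypothetical helper for the exec plugin: print the CSV only when its
# modification time differs from the one recorded on the previous run.
# Usage: emit_if_changed CSVFILE STATEFILE
emit_if_changed() {
    csv=$1
    state=$2
    mtime=$(stat -c %Y "$csv" 2>/dev/null) || return 0  # file missing: emit nothing
    last=$(cat "$state" 2>/dev/null)
    if [ "$mtime" != "$last" ]; then
        cat "$csv"                # exec picks this up (data_format = "csv")
        echo "$mtime" > "$state"  # remember this version for the next run
    fi
}

emit_if_changed "${1:-/var/data/daily.csv}" "${2:-/tmp/daily.csv.mtime}"
```

The exec input would then run this script at whatever interval you choose, e.g. with `commands = ["/usr/local/bin/emit_if_changed.sh"]` and `data_format = "csv"` (script path assumed). On runs where the file has not changed, the script prints nothing, so no duplicate data is written.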