Metrics sent when running telegraf.exe but not sent when running as a service

Hello.

I have an issue when running Telegraf as a service. Before I run Telegraf, I test the config with:
.\telegraf --config-directory "C:\Progra~1\telegraf" --test

It prints out metrics from all inputs. Then I start telegraf.exe with a double-click and all metrics are stored in InfluxDB.
However, when I start Telegraf as a service, it sends all metrics except those from the exec plugin. I have been trying to troubleshoot this for a long time. Any ideas on what the issue is?

The inputs part of my conf file is the following:

[[inputs.exec]]
  commands = ['powershell -Command "C:\Temp\csvfs.ps1"']
  timeout = "9s"
  data_format = "influx"

[[inputs.win_perf_counters]]
  [[inputs.win_perf_counters.object]]
    # Processor usage, alternative to native, reports on a per core.
    ObjectName = "Processor"
    Instances = ["*"]
    Counters = [
      "% Idle Time",
      "% Interrupt Time",
      "% Privileged Time",
      "% User Time",
      "% Processor Time"
    ]
    Measurement = "win_cpu"
    IncludeTotal=true

  [[inputs.win_perf_counters.object]]
    # Disk times and queues
    ObjectName = "LogicalDisk"
    Instances = ["*"]
    Counters = [
      "% Idle Time",
      "% Disk Time",
      "% Disk Read Time",
      "% Disk Write Time",
      "% User Time",
      "% Free Space",
      "Current Disk Queue Length",
      "Free Megabytes",
      "Disk Read Bytes/sec",
      "Disk Write Bytes/sec"
    ]
    Measurement = "win_disk"
    IncludeTotal=true

  [[inputs.win_perf_counters.object]]
    ObjectName = "System"
    Counters = [
      "Context Switches/sec",
      "System Calls/sec",
      "Processor Queue Length",
      "Threads",
      "System Up Time",
      "Processes"
    ]
    Instances = ["------"]
    Measurement = "win_system"
    IncludeTotal=true

  [[inputs.win_perf_counters.object]]
    # Example query where the Instance portion must be removed to get data back,
    # such as from the Memory object.
    ObjectName = "Memory"
    Counters = [
      "Available Bytes",
      "Cache Faults/sec",
      "Demand Zero Faults/sec",
      "Page Faults/sec",
      "Pages/sec",
      "Transition Faults/sec",
      "Pool Nonpaged Bytes",
      "Pool Paged Bytes"
    ]
    # Use 6 x - to remove the Instance bit from the query.
    Instances = ["------"]
    Measurement = "win_mem"
    IncludeTotal=true

  [[inputs.win_perf_counters.object]]
    # more counters for the Network Interface Object can be found at
    # https://msdn.microsoft.com/en-us/library/ms803962.aspx
    ObjectName = "Network Interface"
    Counters = [
      "Bytes Received/sec",
      "Bytes Sent/sec",
      "Packets Received/sec",
      "Packets Sent/sec"
    ]
    Instances = ["*"] # Use 6 x - to remove the Instance bit from the query.
    Measurement = "win_net"
    IncludeTotal=true

  [[inputs.win_perf_counters.object]]
    # Process metrics
    ObjectName = "Process"
    Counters = [
      "% Processor Time",
      "Handle Count",
      "Private Bytes",
      "Thread Count",
      "Virtual Bytes",
      "Working Set"
      ]
    Instances = ["*"]
    Measurement = "win_proc"
    IncludeTotal=true

When running it manually, are you doing this from an elevated cmd or PowerShell console?

You should be able to see what is happening in the Telegraf log file, but my first guess is either permissions or the execution policy.

Try changing this:
commands = ['powershell -Command "C:\Temp\csvfs.ps1"']

to this:

commands = ["powershell.exe -ExecutionPolicy Bypass /path_to_file/file.ps1"]

That option works for me, but if you have log output from telegraf when you run it as a service it would be easier to diagnose.

As a personal preference, I store my config and configd in programfiles/application/config/config.conf and programdata/application/config/configd respectively.

I would put the exec plugin in a separate config file, drop that into configd together with the PowerShell script, and run it from there. That is with the service running as LOCAL_SYSTEM.
Maybe also try loading the file from a different location on the server.
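
A minimal sketch of what such a separate file could look like. The file name exec_csvfs.conf is hypothetical, the script path is the one from the original post, and -NoProfile and -File are standard powershell.exe switches rather than anything Telegraf-specific, so adjust to your setup:

```toml
# Hypothetical file: <configd>/exec_csvfs.conf
# Picked up via --config-directory alongside the main config.
[[inputs.exec]]
  # Single-quoted TOML string, so the backslashes are taken literally.
  commands = ['powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\Temp\csvfs.ps1"']
  timeout = "9s"
  data_format = "influx"
```

Keeping the exec input in its own file makes it easy to move or disable it without touching the perf counter inputs.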

Hello philb, and thanks for your reply. When I run it manually I am using an elevated cmd. The installation didn’t ask anything about where to store logs. Is there a Telegraf log file?

Permissions and execution policy are not a problem when I run Telegraf manually or when I execute the script manually. Is it possible for them to still cause issues when Telegraf runs as a service?

Hi @Marios_Karatisoglou

You can set the log file in your main telegraf config, just before the outputs section there should be this section

  ## Logging configuration:
  ## Run telegraf in debug mode
  debug = false
  ## Run telegraf in quiet mode
  quiet = false
  ## Specify the log file name. The empty string means to log to stdout.
  logfile = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""

Once you’ve provided a location for the log file, restart the Telegraf Windows service. You should see it go through starting and loading plugins; then, if there are errors with the PS script, you should get some output there.

Re: execution policy, I’m not sure. I don’t know if running elevated might bypass the EP. Windows isn’t my usual OS, so I’m not 100% on that.

It does sound permissions related if you can run the commands elevated with no issues. A quick test would be to set the service account to a local admin or domain admin and check whether you receive data. Obviously I wouldn’t recommend running it with an admin account like that permanently, but it should clear up whether it’s permissions related or not.

Edit: you might want to use something like Notepad++ or VS Code to tail the log file; it should update while the service is running.

Hello philb. Thanks for the info, but I have an issue with the logging configuration.

When I use the following logging configuration, the service fails to start:

  ## Log at debug level.
  debug = false
  ## Log only error level messages.
  quiet = false

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  logtarget = "file"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  logfile = "C:\Users\mario\Downloads\telegraf-1.17.2_windows_amd64\telegraf-1.17.2\tlg.log"

However, if I comment out the log config lines above, the service starts normally.
Is there anything wrong with them? I have verified that the path is correct and that the tlg.log file exists in that directory. Thanks in advance.

What is the error message?


Maybe the backslashes in the Windows path are the problem?
Try the following settings in the agent section of your conf file:

# Configuration for telegraf agent
[agent]
  debug = true
  logtarget = "file"
  logfile = "telegraf.log"

If that works, try the Windows path with slashes instead of backslashes.
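
For context on why the backslashes can matter: the Telegraf config is TOML, and in a TOML double-quoted string a backslash starts an escape sequence, so a path such as "C:\Users\..." is likely rejected when the config is parsed (\U expects a Unicode escape), before the service can start. A minimal sketch of two spellings that should avoid this, reusing the path from the post above. Only one logfile line should be active at a time:

```toml
[agent]
  debug = true
  logtarget = "file"

  # Option 1: forward slashes inside a double-quoted (basic) string.
  logfile = "C:/Users/mario/Downloads/telegraf-1.17.2_windows_amd64/telegraf-1.17.2/tlg.log"

  # Option 2: single-quoted (literal) string, where backslashes are not escapes.
  # logfile = 'C:\Users\mario\Downloads\telegraf-1.17.2_windows_amd64\telegraf-1.17.2\tlg.log'
```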

Hi,

Remove the C:\ part of the path and swap \ for /.

That works for me.

I restarted the Telegraf service, and it now runs the PS script successfully and sends the data. I’m not quite sure what the problem was, since I am not the only one with access to the server running Telegraf. It must indeed have been something to do with privileges, although the logs didn’t include any error. Thanks for your help!