Influx doesn't work as a service in Windows 11

Suddenly, without any changes having been made, Telegraf no longer sends data when started as a service, although an identical twin system still works.
When started from cmd, it works without problems.

I tried activating the logs but there are no errors.
No problems are reported from the Windows event viewer either.
The problem does not occur if I deactivate the “inputs.nvidia_smi” plugin.

Well, then you will need to find out what changed :slight_smile:

The problem does not occur if I deactivate the “inputs.nvidia_smi” plugin.

Without any logs or a config, it is hard to provide any additional assistance.

It’s strange to me too but no changes have been made.

These are the logs that I managed to activate:

# Start Telegraf from cmd (data)
2024-01-22T11:37:05Z I! Starting Telegraf 1.29.2 brought to you by InfluxData the makers of InfluxDB
2024-01-22T11:37:05Z I! Available plugins: 241 inputs, 9 aggregators, 30 processors, 24 parsers, 60 outputs, 5 secret-stores
2024-01-22T11:37:05Z I! Loaded inputs: cpu disk diskio mem net nvidia_smi system
2024-01-22T11:37:05Z I! Loaded aggregators: 
2024-01-22T11:37:05Z I! Loaded processors: 
2024-01-22T11:37:05Z I! Loaded secretstores: 
2024-01-22T11:37:05Z I! Loaded outputs: influxdb_v2
2024-01-22T11:37:05Z I! Tags enabled: host=4090-2
2024-01-22T11:37:05Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"4090-2", Flush Interval:10s
2024-01-22T11:37:05Z W! DeprecationWarning: Value "false" for option "ignore_protocol_stats" of plugin "inputs.net" deprecated since version 1.27.3 and will be removed in 1.36.0: use the 'inputs.nstat' plugin instead
2024-01-22T11:37:44Z I! [agent] Hang on, flushing any cached metrics before shutdown
2024-01-22T11:37:44Z I! [agent] Stopping running outputs
# Start telegraf as a service (no data)
2024-01-22T11:37:52Z I! Starting Telegraf 1.29.2 brought to you by InfluxData the makers of InfluxDB
2024-01-22T11:37:52Z I! Available plugins: 241 inputs, 9 aggregators, 30 processors, 24 parsers, 60 outputs, 5 secret-stores
2024-01-22T11:37:52Z I! Loaded inputs: cpu disk diskio mem net nvidia_smi system
2024-01-22T11:37:52Z I! Loaded aggregators: 
2024-01-22T11:37:52Z I! Loaded processors: 
2024-01-22T11:37:52Z I! Loaded secretstores: 
2024-01-22T11:37:52Z I! Loaded outputs: influxdb_v2
2024-01-22T11:37:52Z I! Tags enabled: host=4090-2
2024-01-22T11:37:52Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"4090-2", Flush Interval:10s
2024-01-22T11:37:52Z W! DeprecationWarning: Value "false" for option "ignore_protocol_stats" of plugin "inputs.net" deprecated since version 1.27.3 and will be removed in 1.36.0: use the 'inputs.nstat' plugin instead

If it is possible to activate other logs, please tell me how to do it, and I will.

Please share your config and enable debug mode.

What platform are you running on?

conf file:

# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Log at debug level.
  # debug = false
  ## Log only error level messages.
  # quiet = false

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  logtarget = 'file'

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  logfile = 'C:\Program Files\Telegraf\file.log'

  ## The logfile will be rotated after the time interval specified.  When set
  ## to 0 no time based rotation is performed.  Logs are rotated only when
  ## written to, if there is no log activity rotation may be delayed.
  # logfile_rotation_interval = "0d"

  ## The logfile will be rotated when it becomes larger than the specified
  ## size.  When set to 0 no size based rotation is performed.
  # logfile_rotation_max_size = "0MB"

  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  # logfile_rotation_max_archives = 5

  ## Pick a timezone to use when logging or type 'local' for local time.
  ## Example: America/Chicago
  # log_with_timezone = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = false
[[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
  urls = ["http://192.168.154.160:8086"]

  ## Token for authentication.
  token = "-IOUzzFCnVed6EqndLW2CnbF6LNgNa9LLJz-wyPM3EtM0ioB3oBfD_1BEKAUekTaqjWrum92Ic2zbs0rQ1ANew=="

  ## Organization is the name of the organization you wish to write to; must exist.
  organization = "pveO"

  ## Destination bucket to write into.
  bucket = "WindowsTelegraf"

  ## The value of this tag will be used to determine the bucket.  If this
  ## tag is not set the 'bucket' option is used as the default.
  # bucket_tag = ""

  ## If true, the bucket tag will not be added to the metric.
  # exclude_bucket_tag = false

  ## Timeout for HTTP messages.
  # timeout = "5s"

  ## Additional HTTP headers
  # http_headers = {"X-Special-Header" = "Special-Value"}

  ## HTTP Proxy override, if unset values the standard proxy environment
  ## variables are consulted to determine which proxy, if any, should be used.
  # http_proxy = "http://corporate.proxy:3128"

  ## HTTP User-Agent
  # user_agent = "telegraf"

  ## Content-Encoding for write request body, can be set to "gzip" to
  ## compress body or "identity" to apply no encoding.
  # content_encoding = "gzip"

  ## Enable or disable uint support for writing uints influxdb 2.0.
  # influx_uint_support = false

  ## Optional TLS Config for use on HTTP connections.
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false









# Input plugin to counterPath Performance Counters on Windows operating systems
#[[inputs.win_perf_counters]]
  ## By default this plugin returns basic CPU and Disk statistics.
  ## See the README file for more examples.
  ## Uncomment examples below or write your own as you see fit. If the system
  ## being polled for data does not have the Object at startup of the Telegraf
  ## agent, it will not be gathered.
  ## Settings:
  # PrintValid = false # Print All matching performance counters
  # Whether request a timestamp along with the PerfCounter data or just use current time
  # UsePerfCounterTime=true
  # If UseWildcardsExpansion params is set to true, wildcards (partial wildcards in instance names and wildcards in counters names) in configured counter paths will be expanded
  # and in case of localized Windows, counter paths will be also localized. It also returns instance indexes in instance names.
  # If false, wildcards (not partial) in instance names will still be expanded, but instance indexes will not be returned in instance names.
  #UseWildcardsExpansion = false
  # When running on a localized version of Windows and with UseWildcardsExpansion = true, Windows will
  # localize object and counter names. When LocalizeWildcardsExpansion = false, use the names in object.Counters instead
  # of the localized names. Only Instances can have wildcards in this case. ObjectName and Counters must not have wildcards when this
  # setting is false.
  #LocalizeWildcardsExpansion = true
  # Period after which counters will be reread from configuration and wildcards in counter paths expanded
  #CountersRefreshInterval="1m"
  ## Accepts a list of PDH error codes which are defined in pdh.go, if this error is encountered it will be ignored
  ## For example, you can provide "PDH_NO_DATA" to ignore performance counters with no instances
  ## By default no errors are ignored
  ## You can find the list here: https://github.com/influxdata/telegraf/blob/master/plugins/inputs/win_perf_counters/pdh.go
  ## e.g.: IgnoredErrors = ["PDH_NO_DATA"]
  # IgnoredErrors = []

#  [[inputs.win_perf_counters.object]]
#    # Processor usage, alternative to native, reports on a per core.
#    ObjectName = "Processor"
#    Instances = ["*"]
#    Counters = [
#      "% Idle Time",
#      "% Interrupt Time",
#      "% Privileged Time",
#      "% User Time",
#      "% Processor Time",
#      "% DPC Time",
#    ]
#    Measurement = "win_cpu"
#    # Set to true to include _Total instance when querying for all (*).
#    # IncludeTotal=false
#    # Print out when the performance counter is missing from object, counter or instance.
#    # WarnOnMissing = false
#    # Gather raw values instead of formatted. Raw value is stored in the field name with the "_Raw" suffix, e.g. "Disk_Read_Bytes_sec_Raw".
#    # UseRawValues = true

#  [[inputs.win_perf_counters.object]]
#    # Disk times and queues
#    ObjectName = "LogicalDisk"
#    Instances = ["*"]
#    Counters = [
#      "% Idle Time",
#      "% Disk Time",
#      "% Disk Read Time",
#      "% Disk Write Time",
#      "% User Time",
#      "% Free Space",
#      "Current Disk Queue Length",
#      "Free Megabytes",
#    ]
#    Measurement = "win_disk"

#  [[inputs.win_perf_counters.object]]
#    ObjectName = "PhysicalDisk"
#    Instances = ["*"]
#    Counters = [
#      "Disk Read Bytes/sec",
#      "Disk Write Bytes/sec",
#      "Current Disk Queue Length",
#      "Disk Reads/sec",
#      "Disk Writes/sec",
#      "% Disk Time",
#      "% Disk Read Time",
#      "% Disk Write Time",
#    ]
#    Measurement = "win_diskio"

#  [[inputs.win_perf_counters.object]]
#    ObjectName = "Network Interface"
#    Instances = ["*"]
#    Counters = [
#      "Bytes Received/sec",
#      "Bytes Sent/sec",
#      "Packets Received/sec",
#      "Packets Sent/sec",
#      "Packets Received Discarded",
#      "Packets Outbound Discarded",
#      "Packets Received Errors",
#      "Packets Outbound Errors",
#    ]
#    Measurement = "win_net"


#  [[inputs.win_perf_counters.object]]
#    ObjectName = "System"
#    Counters = [
#      "Context Switches/sec",
#      "System Calls/sec",
#      "Processor Queue Length",
#      "System Up Time",
#    ]
#    Instances = ["------"]
#    Measurement = "win_system"

#  [[inputs.win_perf_counters.object]]
#    # Example counterPath where the Instance portion must be removed to get data back,
#    # such as from the Memory object.
#    ObjectName = "Memory"
#    Counters = [
#      "Committed Bytes",
#      "Available Bytes",
#      "% Committed Bytes In Use",
#    #   "Cache Faults/sec",
#    #   "Demand Zero Faults/sec",
#    #   "Page Faults/sec",
#    #   "Pages/sec",
#    #   "Transition Faults/sec",
#    #   "Pool Nonpaged Bytes",
#    #   "Pool Paged Bytes",
#    #   "Standby Cache Reserve Bytes",
#    #   "Standby Cache Normal Priority Bytes",
#    #   "Standby Cache Core Bytes",
#    ]
#    Instances = ["------"] # Use 6 x - to remove the Instance bit from the counterPath.
#    Measurement = "win_mem"

#  [[inputs.win_perf_counters.object]]
#    # Example query where the Instance portion must be removed to get data back,
#    # such as from the Paging File object.
#    ObjectName = "Paging File"
#    Counters = [
#      "% Usage",
#    ]
#    Instances = ["_Total"]
#    Measurement = "win_swap"

#  [[inputs.win_perf_counters.object]]
#    # GPU usage.
#    ObjectName = "GPU Engine"
#    Instances = ["*"]
#
#    Counters = [
#      "Running Time",
#      "Utilization Percentage",
#    ]
#    Measurement = "win_gpu"

#  [[inputs.win_perf_counters.object]]
#    # GPU usage.
#    ObjectName = "GPU Process Memory"
#    Instances = ["*"]
#
#    Counters = [
#      "Dedicated Usage",
#      "Local Usage",
#      "Non Local Usage",
#      "Shared Usage",
#      "Total Committed",
#    ]
#    Measurement = "win_gpu_mem"

[[inputs.mem]]

[[inputs.diskio]]

[[inputs.disk]]
  ## By default stats will be gathered for all mount points.
  ## Set mount_points will restrict the stats to only the specified mount points.
  # mount_points = ["/"]
  ## Ignore mount points by filesystem type.
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

  ## Ignore mount points by mount options.
  ## The 'mount' command reports options of all mounts in parentheses.
  ## Bind mounts can be ignored with the special 'bind' option.
  # ignore_mount_opts = []

[[inputs.net]]

[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## If true, collect raw CPU time metrics
  collect_cpu_time = false
  ## If true, compute and report the sum of all non-idle CPU states
  ## NOTE: The resulting 'time_active' field INCLUDES 'iowait'!
  report_active = false
  ## If true and the info is available then add core_id and physical_id tags
  core_tags = false

[[inputs.system]]

[[inputs.nvidia_smi]]

How do I enable debug mode?

What do you mean by platform?
I’m running on Windows 11.

Are you sure that system has the nvidia-smi binary?

Your TOML looks valid. If you run telegraf via the CLI with that config does it work?
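
One thing that may be worth ruling out (an assumption, not a confirmed cause): a Windows service runs under a different account and environment than an interactive cmd session, so it may not find or run nvidia-smi the same way. The sketch below uses the plugin's `bin_path` and `timeout` options to point at the binary explicitly. The path shown is a common install location, not one taken from this system, so adjust it to wherever nvidia-smi.exe actually lives.

```toml
[[inputs.nvidia_smi]]
  ## Assumed location of the binary; verify it on the affected machine.
  ## Older driver packages may instead install it under
  ## 'C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe'.
  bin_path = 'C:\Windows\System32\nvidia-smi.exe'

  ## Give up on a single nvidia-smi call after this long.
  timeout = "5s"
```

This replaces the bare `[[inputs.nvidia_smi]]` line at the end of the posted config. Restart the service after changing it.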

How do I enable debug mode?

Add --debug to the CLI, or add debug = true under the [agent] section.
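
For the config-file route, a minimal sketch based on the [agent] section already posted in this thread looks like the following. Edit the existing [agent] table rather than adding a second one, since TOML does not allow the same table to be defined twice.

```toml
[agent]
  ## Log at debug level (commented out as "# debug = false" in the posted config).
  debug = true

  ## Unchanged from the posted config: write logs to a file.
  logtarget = 'file'
  logfile = 'C:\Program Files\Telegraf\file.log'
```

The service has to be restarted for the change to take effect. Debug lines then appear in the log with a `D!` prefix.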

What do you mean by platform?

System architecture: x86, arm64, etc.

Yes.
These are two twin systems: on one, Telegraf does not show the problem, while on the other it does.
Working as a service.

Not working as a service.

Done, still no data.

2024-02-07T16:08:39Z I! Starting Telegraf 1.28.5 brought to you by InfluxData the makers of InfluxDB
2024-02-07T16:08:39Z I! Available plugins: 240 inputs, 9 aggregators, 29 processors, 24 parsers, 59 outputs, 5 secret-stores
2024-02-07T16:08:39Z I! Loaded inputs: cpu disk diskio mem net nvidia_smi system
2024-02-07T16:08:39Z I! Loaded aggregators: 
2024-02-07T16:08:39Z I! Loaded processors: 
2024-02-07T16:08:39Z I! Loaded secretstores: 
2024-02-07T16:08:39Z I! Loaded outputs: influxdb_v2
2024-02-07T16:08:39Z I! Tags enabled: host=4090-4
2024-02-07T16:08:39Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"4090-4", Flush Interval:10s
2024-02-07T16:08:39Z D! [agent] Initializing plugins

x64

So is the service still running, or does it stop? Can you add [[outputs.file]]?

Yes.

You didn’t tell me to do it.
What parameters should I put?
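
A minimal sketch of an `[[outputs.file]]` section, assuming the goal is only to see whether the service collects any metrics at all. The file name is hypothetical. Because a service has no visible console, the sketch writes to a file instead of "stdout".

```toml
[[outputs.file]]
  ## Hypothetical path; the account the service runs under needs write access to it.
  files = ['C:\Program Files\Telegraf\metrics.out']

  ## Write metrics in InfluxDB line protocol, the same format sent to InfluxDB.
  data_format = "influx"
```

This can sit next to the existing `[[outputs.influxdb_v2]]` section. If the file fills with lines while running as a service, the inputs are working and the problem is on the output side. If it stays empty, the inputs are not producing metrics.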