InfluxDB CLI import from NAS

Hello,

I’m new to the InfluxDB world, and I already have my first question.

I would like to read text files from a folder on a NAS into InfluxDB.

I managed to successfully import a single file using the CLI.

But here is my problem: the files have spaces in their names. How do I enter such a path in the CLI so that it recognizes the space?

PS C:\Program Files\InfluxData\influxcli> .\influx write --bucket Test --p s --format=lp -f \\192.168.1.198\share\Mike Power Day-2024-02-04.txt


Error: failed to open "\\\\192.168.1.198\\share\\Mike": open \\192.168.1.198\share\Mike: The system cannot find the file specified.

When I write a filename without spaces it works, so what is the correct way?

PS C:\Program Files\InfluxData\influxcli> .\influx write --bucket Test --p ms --format=lp -f \\192.168.1.198\share\Mike.txt
PS C:\Program Files\InfluxData\influxcli>

And is it also possible to read all the text files in a folder at once?

But here is my problem: the files have spaces in their names. How do I enter such a path in the CLI so that it recognizes the space?

@mikeanita You need to escape the spaces in your file path. What client are you using? It looks like PowerShell. The following should work:

PS C:\Program Files\InfluxData\influxcli> .\influx write --bucket Test --p s --format=lp -f \\192.168.1.198\share\Mike` Power` Day-2024-02-04.txt
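
Quoting the whole path should also work in PowerShell, since a quoted argument containing spaces is passed through as a single argument:

PS C:\Program Files\InfluxData\influxcli> .\influx write --bucket Test --p s --format=lp -f "\\192.168.1.198\share\Mike Power Day-2024-02-04.txt"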

And is it also possible to read all the text files in a folder at once?

The influx write command supports multiple -f/--file flags, so you can pass multiple files at once. With a little PowerShell scripting you could list the contents of the directory and add an -f flag for each file. I’m not a PowerShell user, so I can’t tell you exactly how to do it, but the resulting command would look something like:

PS C:\Program Files\InfluxData\influxcli> .\influx write --bucket Test --p s --format=lp `
-f \\192.168.1.198\share\Mike` Power` Day-2024-01-31.txt `
-f \\192.168.1.198\share\Mike` Power` Day-2024-02-01.txt `
-f \\192.168.1.198\share\Mike` Power` Day-2024-02-02.txt `
-f \\192.168.1.198\share\Mike` Power` Day-2024-02-03.txt `
-f \\192.168.1.198\share\Mike` Power` Day-2024-02-04.txt
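
For example, a rough PowerShell sketch of that idea (untested; it assumes the share is reachable and that you run it from the influxcli directory):

# Collect every .txt file on the share and build one -f flag per file.
$fileArgs = Get-ChildItem -Path '\\192.168.1.198\share' -Filter '*.txt' |
    ForEach-Object { '-f', $_.FullName }
# Splat the accumulated argument list into a single influx write call.
.\influx write --bucket Test --precision s --format=lp @fileArgs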

Thank you very much, it works!

Yes, it’s PowerShell. I’m glad I got it to work.

I’m currently only testing InfluxDB on a Windows PC, but later I would like to run InfluxDB on a Pi or something similar and have it automatically pick up the text files from the NAS folder.

What hardware would you recommend? Is a Pi okay?

Totally up to you and your use case. We see all kinds of setups. One thing you might consider is using Telegraf to read the files from wherever they live and write them to InfluxDB. Telegraf is a daemon that runs in the background and can routinely check for new files to write.
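
On a Raspberry Pi, for example, once the InfluxData package repository is configured, installing and running Telegraf as a background service is roughly:

sudo apt-get install telegraf
sudo systemctl enable --now telegraf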


Hello @scott,

Can you help me with the Telegraf configuration for the file input?

# Parse a complete file each interval
[[inputs.file]]
  ## Files to parse each interval.  Accept standard unix glob matching rules,
  ## as well as ** to match recursive files and directories.
  files = ["/tmp/metrics.out"]

  ## Character encoding to use when interpreting the file contents.  Invalid
  ## characters are replaced using the unicode replacement character.  When set
  ## to the empty string the data is not decoded to text.
  ##   ex: character_encoding = "utf-8"
  ##       character_encoding = "utf-16le"
  ##       character_encoding = "utf-16be"
  ##       character_encoding = ""
  # character_encoding = ""

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "influx"


  ## Name a tag containing the name of the file the data was parsed from. Leave empty
  ## to disable. Be cautious when file name variation is high; this can increase the
  ## cardinality significantly. Read more about cardinality here:
  ## https://docs.influxdata.com/influxdb/cloud/reference/glossary/#series-cardinality
  # file_tag = ""

I would like Telegraf to read automatically from the NAS folder \\192.168.1.198\share, which receives new text files every day with the date in the name (“Name + Date”). There are three files each day: “File one-2024-02.txt”, “File two-2024-02.txt”, and “File three-2024-02.txt”.

The timestamp precision is ms.
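
Each line in these files is line protocol with a millisecond timestamp, for example (made-up measurement and field names):

power,meter=mike value=42.5 1707004800000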

I’m currently doing it by hand with PowerShell:

PS C:\Program Files\InfluxData\influxcli> .\influx write --bucket Test --p ms --format=lp `
>> -f \\192.168.1.198\share\File` One-2024-02.txt `
>> -f \\192.168.1.198\share\File` Two-2024-02.txt `
>> -f \\192.168.1.198\share\File` Three-2024-02.txt

Can you help me?

Many thanks and best regards,
Mike

For your use case, I’d actually recommend the directory_monitor input plugin. It watches a directory, writes any new files it finds to InfluxDB, and then moves the processed files to another directory.

It’d look something like this:

# Ingests files in a directory and then moves them to a target directory.
[[inputs.directory_monitor]]
  ## The directory to monitor and read files from (including sub-directories if "recursive" is true).
  directory = '\\192.168.1.198\share'
  #
  ## The directory to move finished files to (maintaining directory hierarchy from source).
  finished_directory = '\\192.168.1.198\written'
  #
  ## Setting recursive to true will make the plugin recursively walk the directory and process all sub-directories.
  # recursive = false
  #
  ## The directory to move files to upon file error.
  ## If not provided, erroring files will stay in the monitored directory.
  error_directory = '\\192.168.1.198\erred'
  #
  ## The amount of time a file is allowed to sit in the directory before it is picked up.
  ## This time can generally be low but if you choose to have a very large file written to the directory and it's potentially slow,
  ## set this higher so that the plugin will wait until the file is fully copied to the directory.
  # directory_duration_threshold = "50ms"
  #
  ## A list of the only file names to monitor, if necessary. Supports regex. If left blank, all files are ingested.
  # files_to_monitor = ["^.*\\.csv"]
  #
  ## A list of files to ignore, if necessary. Supports regex.
  # files_to_ignore = [".DS_Store"]
  #
  ## Maximum lines of the file to process that have not yet been written by the
  ## output. For best throughput set to the size of the output's metric_buffer_limit.
  ## Warning: setting this number higher than the output's metric_buffer_limit can cause dropped metrics.
  # max_buffered_metrics = 10000
  #
  ## The maximum amount of file paths to queue up for processing at once, before waiting until files are processed to find more files.
  ## Lowering this value will result in *slightly* less memory use, with a potential sacrifice in speed efficiency, if absolutely necessary.
  # file_queue_size = 100000
  #
  ## Name a tag containing the name of the file the data was parsed from. Leave empty
  ## to disable. Be cautious when file name variation is high; this can increase the
  ## cardinality significantly. Read more about cardinality here:
  ## https://docs.influxdata.com/influxdb/cloud/reference/glossary/#series-cardinality
  # file_tag = ""
  #
  ## Specify if the file can be read completely at once or if it needs to be read line by line (default).
  ## Possible values: "line-by-line", "at-once"
  # parse_method = "line-by-line"
  #
  ## The data format to be read from the files.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "influx"
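
Once you’ve saved this into your Telegraf config (for example /etc/telegraf/telegraf.conf), you can sanity-check it with telegraf --config /etc/telegraf/telegraf.conf --test; that parses the config and attempts a one-shot run, so syntax errors show up immediately.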

@scott Thank you for the help.

Okay, I’ll create this part in InfluxDB under Telegraf.

Where do I write the user and password for the NAS login? In the config?

# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Log at debug level.
  # debug = false
  ## Log only error level messages.
  # quiet = false

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  # logtarget = "file"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  # logfile = ""

  ## The logfile will be rotated after the time interval specified.  When set
  ## to 0 no time based rotation is performed.  Logs are rotated only when
  ## written to, if there is no log activity rotation may be delayed.
  # logfile_rotation_interval = "0d"

  ## The logfile will be rotated when it becomes larger than the specified
  ## size.  When set to 0 no size based rotation is performed.
  # logfile_rotation_max_size = "0MB"

  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  # logfile_rotation_max_archives = 5

  ## Pick a timezone to use when logging or type 'local' for local time.
  ## Example: America/Chicago
  # log_with_timezone = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = false
[[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
  urls = ["http://docker:8086"]

  ## Token for authentication.
  token = "$INFLUX_TOKEN"

  ## Organization is the name of the organization you wish to write to; must exist.
  organization = "Privat"

  ## Destination bucket to write into.
  bucket = "Mike"

  ## The value of this tag will be used to determine the bucket.  If this
  ## tag is not set the 'bucket' option is used as the default.
  # bucket_tag = ""

  ## If true, the bucket tag will not be added to the metric.
  # exclude_bucket_tag = false

  ## Timeout for HTTP messages.
  # timeout = "5s"

  ## Additional HTTP headers
  # http_headers = {"X-Special-Header" = "Special-Value"}

  ## HTTP Proxy override, if unset values the standard proxy environment
  ## variables are consulted to determine which proxy, if any, should be used.
  # http_proxy = "http://corporate.proxy:3128"

  ## HTTP User-Agent
  # user_agent = "telegraf"

  ## Content-Encoding for write request body, can be set to "gzip" to
  ## compress body or "identity" to apply no encoding.
  # content_encoding = "gzip"

  ## Enable or disable uint support for writing uints influxdb 2.0.
  # influx_uint_support = false

  ## Optional TLS Config for use on HTTP connections.
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false
# Ingests files in a directory and then moves them to a target directory.
[[inputs.directory_monitor]]
  ## The directory to monitor and read files from (including sub-directories if "recursive" is true).
  directory = "\\192.168.1.198\share"
  user = "xxxx"
  password = "xxxx"

  #
  ## The directory to move finished files to (maintaining directory hierarchy from source).
  finished_directory = "\\192.168.1.198\written"
  #
  ## Setting recursive to true will make the plugin recursively walk the directory and process all sub-directories.
  # recursive = false
  #
  ## The directory to move files to upon file error.
  ## If not provided, erroring files will stay in the monitored directory.
  error_directory = "\\192.168.1.198\erred"
  #
  ## The amount of time a file is allowed to sit in the directory before it is picked up.
  ## This time can generally be low but if you choose to have a very large file written to the directory and it's potentially slow,
  ## set this higher so that the plugin will wait until the file is fully copied to the directory.
  # directory_duration_threshold = "50ms"
  #
  ## A list of the only file names to monitor, if necessary. Supports regex. If left blank, all files are ingested.
  # files_to_monitor = ["^.*\\.csv"]
  #
  ## A list of files to ignore, if necessary. Supports regex.
  # files_to_ignore = [".DS_Store"]
  #
  ## Maximum lines of the file to process that have not yet been written by the
  ## output. For best throughput set to the size of the output's metric_buffer_limit.
  ## Warning: setting this number higher than the output's metric_buffer_limit can cause dropped metrics.
  # max_buffered_metrics = 10000
  #
  ## The maximum amount of file paths to queue up for processing at once, before waiting until files are processed to find more files.
  ## Lowering this value will result in *slightly* less memory use, with a potential sacrifice in speed efficiency, if absolutely necessary.
  # file_queue_size = 100000
  #
  ## Name a tag containing the name of the file the data was parsed from. Leave empty
  ## to disable. Be cautious when file name variation is high; this can increase the
  ## cardinality significantly. Read more about cardinality here:
  ## https://docs.influxdata.com/influxdb/cloud/reference/glossary/#series-cardinality
  # file_tag = ""
  #
  ## Specify if the file can be read completely at once or if it needs to be read line by line (default).
  ## Possible values: "line-by-line", "at-once"
  # parse_method = "line-by-line"
  #
  ## The data format to be read from the files.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "influx"

When I start it on the Raspberry Pi, I get this error:

root@docker:/etc/telegraf# telegraf --config http://docker:8086/api/v2/telegrafs/0xxxxxxxxxx
2024-02-09T17:17:02Z I! Loading config: http://docker:8086/api/v2/telegrafs/0xxxxxxxxx
2024-02-09T17:17:02Z I! Error getting HTTP config.  Retry 0 of 3 in 10s.  Status=401
2024-02-09T17:17:12Z I! Error getting HTTP config.  Retry 1 of 3 in 10s.  Status=401
2024-02-09T17:17:22Z I! Error getting HTTP config.  Retry 2 of 3 in 10s.  Status=401

When I test it, I get this error:

sudo -u telegraf telegraf --config /etc/telegraf/telegraf.conf --config-directory /etc/telegraf/telegraf.d/ --output-filter influxdb_v2 -test

error loading config file /etc/telegraf/telegraf.conf: error parsing data: line 493: invalid TOML syntax

Line 493 = this entry: directory = " \192.168.1.198\share"
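
Two separate things appear to be going wrong here:

The 401 errors mean the request to fetch the remote config from InfluxDB was not authenticated. When loading a config from a URL, Telegraf sends the INFLUX_TOKEN environment variable with the request, so export a valid API token first:

export INFLUX_TOKEN=<your-api-token>
telegraf --config http://docker:8086/api/v2/telegrafs/0xxxxxxxxxx

The invalid TOML syntax comes from the backslashes: in a double-quoted (basic) TOML string, \ starts an escape sequence, so "\192.168.1.198\share" is not valid. Use single-quoted literal strings instead (or double every backslash in a double-quoted string):

directory = '\\192.168.1.198\share'
finished_directory = '\\192.168.1.198\written'
error_directory = '\\192.168.1.198\erred'

Also note that once Telegraf runs on Linux (the Pi), UNC paths like \\192.168.1.198\share will not resolve at all, and as far as I know the directory_monitor plugin has no user/password options. The usual approach is to mount the share first (for example with mount -t cifs, supplying the NAS username and password there) and point directory at the local mount point.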