Outputs.mqtt hierarchy topics limitation

I am posting this because I could not find a solution by searching the internet, and I don't know whether one exists.

I am trying to connect to an Azure IoT Hub using its MQTT broker feature. Any Azure IoT Hub can act as an MQTT broker with these parameters:

[[outputs.mqtt]]
servers = ["tcp://{iotHubName}.azure-devices.net:8883"]
topic_prefix = "devices/{clientID}/messages/events/"
username = "{iotHubName}.azure-devices.net/{clientID}/?api-version=2021-04-12"
password = "{SAS}"
client_id = "{clientID}"
tls_ca = "/etc/telegraf/IoTHubRootCA_Baltimore.pem"
insecure_skip_verify
data_format = "json"

Note: the SAS token is generated with a script or in the Azure Portal; you must specify how many seconds the token stays valid.
Example: az iot hub generate-sas-token -n {iotHubName} --duration 36000
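
For anyone who prefers a script over the az CLI, here is a minimal Python sketch of the HMAC-SHA256 signing scheme that Azure documents for IoT Hub SAS tokens. The function name, the `now` parameter, and the hub, device, and key values in the usage line are placeholders I made up for illustration; they are not values from this thread.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def generate_sas_token(resource_uri, device_key_b64, ttl_seconds, now=None):
    """Build a SAS token string for an IoT Hub device (illustrative sketch)."""
    # `now` is only there so the result can be reproduced; by default use the clock.
    now = time.time() if now is None else now
    expiry = int(now + ttl_seconds)

    # The string to sign is the URL-encoded resource URI, a newline, and the expiry.
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    to_sign = f"{encoded_uri}\n{expiry}".encode("utf-8")

    # The device key is base64 text; decode it before using it as the HMAC key.
    key = base64.b64decode(device_key_b64)
    digest = hmac.new(key, to_sign, hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")

    query = urllib.parse.urlencode(
        {"sr": resource_uri, "sig": signature, "se": str(expiry)}
    )
    return "SharedAccessSignature " + query


# Hypothetical values: replace with your own hub name, device ID, and device key.
placeholder_key = base64.b64encode(b"not-a-real-key").decode("ascii")
token = generate_sas_token(
    "myhub.azure-devices.net/devices/myDevice", placeholder_key, 36000
)
print(token)
```

The resulting string would go into the password field, in place of {SAS}.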

tls_ca is required so that the client can authenticate the server. You can find the CA certificate in the Azure-Samples/IoTMQTTSample repository on GitHub (MQTT samples for Azure IoT).

The issue is the following:

Telegraf connects to Azure IoT Hub without errors, but I do not see any messages. After some testing and research, I think the problem is the topic. Azure requires every client to publish to a topic that fits this schema: "devices/{clientID}/messages/events/". The Telegraf guide for outputs.mqtt says that further subtopics are appended to the main publishing topic:
MQTT Topic for Producer Messages
MQTT outputs send metrics to this topic format:
<topic_prefix>/<hostname>/<pluginname>/ (e.g. prefix/web01.example.com/mem).
So the complete topic would be:

devices/{clientID}/messages/events/{hostname}/{pluginname}

There seems to be a limitation in the Azure topic hierarchy: Azure allows only one additional level in publishing topics. At least, that is what my tests with mosquitto_pub suggest.

If I try:

mosquitto_pub -d -h {iotHubName}.azure-devices.net -p 8883 -i {clientID} -u "{iotHubName}.azure-devices.net/{clientID}/?api-version=2018-06-30" -P "$SAS_TOKEN" -t "devices/{clientID}/messages/events/" -m "Testing from mosquitto_pub" -V mqttv311 -q 1

It works and I see the message in the portal. The mosquitto_pub client gives me this log:

Client testInfluxDB sending CONNECT
Client testInfluxDB received CONNACK (0)
Client testInfluxDB sending PUBLISH (d0, q1, r0, m1, 'devices/{clientID}/messages/events/', … (X bytes))
Client testInfluxDB received PUBACK (Mid: 1, RC:0)
Client testInfluxDB sending DISCONNECT

If I try with a first sublevel in the MQTT topic, it still works:

mosquitto_pub -d -h {iotHubName}.azure-devices.net -p 8883 -i {clientID} -u "{iotHubName}.azure-devices.net/{clientID}/?api-version=2018-06-30" -P "$SAS_TOKEN" -t "devices/{clientID}/messages/events/level1/" -m "Testing from mosquitto_pub" -V mqttv311 -q 1

Client testInfluxDB sending CONNECT
Client testInfluxDB received CONNACK (0)
Client testInfluxDB sending PUBLISH (d0, q1, r0, m1, 'devices/{clientID}/messages/events/level1/', … (X bytes))
Client testInfluxDB received PUBACK (Mid: 1, RC:0)
Client testInfluxDB sending DISCONNECT

But if I try with a second sublevel in the MQTT topic, it fails:

mosquitto_pub -d -h {iotHubName}.azure-devices.net -p 8883 -i {clientID} -u "{iotHubName}.azure-devices.net/{clientID}/?api-version=2018-06-30" -P "$SAS_TOKEN" -t "devices/{clientID}/messages/events/level1/level2/" -m "Testing from mosquitto_pub" -V mqttv311 -q 1

Client testInfluxDB sending CONNECT
Client testInfluxDB received CONNACK (0)
Client testInfluxDB sending PUBLISH (d0, q1, r0, m1, 'devices/{clientID}/messages/events/level1/level2/', … (X bytes))
OpenSSL Error[0]: error:0A000126:SSL routines::unexpected eof while reading
Error: The connection was lost.
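
The three tests above differ only in how many levels follow the events prefix. As a rough aid, here is a small Python sketch that counts those levels for a given topic, so a candidate topic can be checked before publishing. The helper name extra_levels is hypothetical, and the idea that the level count is what matters is only what my tests suggest, not something confirmed by Azure.

```python
def extra_levels(topic, device_id):
    """Return the topic levels that follow devices/<device_id>/messages/events/."""
    prefix = f"devices/{device_id}/messages/events/"
    if not topic.startswith(prefix):
        raise ValueError(f"topic does not start with {prefix!r}")
    # Drop leading/trailing slashes so a trailing "/" does not count as a level.
    remainder = topic[len(prefix):].strip("/")
    return remainder.split("/") if remainder else []


# The three topics used in the mosquitto_pub tests, with testInfluxDB as client ID.
for topic in (
    "devices/testInfluxDB/messages/events/",
    "devices/testInfluxDB/messages/events/level1/",
    "devices/testInfluxDB/messages/events/level1/level2/",
):
    print(len(extra_levels(topic, "testInfluxDB")), topic)
```

In my tests, the topics with zero or one extra level were accepted and the one with two was not.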

Maybe I am missing something about how to use outputs.mqtt with the IoT Hub MQTT broker. Is there a workaround for this?

@Jay_Clifford can you help here?
Thank you!

Hi @francisjjp, can you try changing your URL from tcp:// to ssl://, since you are using port 8883?

I will try it explicitly, but I have noticed in the logs that when I put tcp://, Telegraf itself automatically changes it to ssl://.

Hmm, sorry, I reread your analysis of the topic structure. It's interesting that the error appears as an SSL issue. Just to understand correctly: level 1 publishes successfully and level 2 fails to write data?

Yes, it is strange. It seems to be a limitation of Azure IoT Hub. Is there any way to reduce the number of subtopics on the Telegraf side, for example by publishing only <hostname> instead of <hostname>/<pluginname>/?

So sadly this is currently not possible. I will add an issue to the Telegraf repo and we shall see if someone from the team or community picks it up. I agree, however, that you should be able to define your own topic for the resulting payloads.

Actually, @francisjjp, I told a lie. If you choose to omit the host in your config, you can bring this down to one level.

Here is an example:


# Telegraf Configuration
#
# Telegraf is entirely plugin driven. All metrics are gathered from the
# declared inputs, and sent to the declared outputs.
#
# Plugins must be declared in here to be active.
# To deactivate a plugin, comment out the name and any variables.
#
# Use 'telegraf -config telegraf.conf -test' to see what metrics a config
# file would generate.
#
# Environment variables can be used anywhere in this config file, simply surround
# them with ${}. For strings the variable must be within quotes (ie, "${STR_VAR}"),
# for numbers and booleans they should be plain (ie, ${INT_VAR}, ${BOOL_VAR})


# Global tags can be specified here in key="value" format.
[global_tags]
  # dc = "us-east-1" # will tag all metrics with dc=us-east-1
  # rack = "1a"
  ## Environment variables can be used as tags, and throughout the config file
  # user = "$USER"


# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "5s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "5s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Log at debug level.
  debug = true
  ## Log only error level messages.
  quiet = false

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  # logtarget = "file"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  # logfile = ""

  ## The logfile will be rotated after the time interval specified.  When set
  ## to 0 no time based rotation is performed.  Logs are rotated only when
  ## written to, if there is no log activity rotation may be delayed.
  # logfile_rotation_interval = "0d"

  ## The logfile will be rotated when it becomes larger than the specified
  ## size.  When set to 0 no size based rotation is performed.
  # logfile_rotation_max_size = "0MB"

  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  # logfile_rotation_max_archives = 5

  ## Pick a timezone to use when logging or type 'local' for local time.
  ## Example: America/Chicago
  # log_with_timezone = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = true


# Configuration for MQTT server to send metrics to
[[outputs.mqtt]]
  ## MQTT Brokers
  ## The list of brokers should only include the hostname or IP address and the
  ## port to the broker. This should follow the format `[{scheme}://]{host}:{port}`. For
  ## example, `localhost:1883` or `mqtt://localhost:1883`.
  ## Scheme can be any of the following: tcp://, mqtt://, tls://, mqtts://
  ## non-TLS and TLS servers can not be mix-and-matched.
  servers = ["broker.hivemq.com:1883"] # or ["mqtts://tls.example.com:1883"]

  ## Protocol can be `3.1.1` or `5`. Default is `3.1.1`
  # protocol = "3.1.1"

  ## MQTT Topic for Producer Messages
  ## MQTT outputs send metrics to this topic format:
  ## <topic_prefix>/<hostname>/<pluginname>/ (e.g. prefix/web01.example.com/mem)
  topic_prefix = ""

  ## QoS policy for messages
  ## The mqtt QoS policy for sending messages.
  ## See https://www.ibm.com/support/knowledgecenter/en/SSFKSJ_9.0.0/com.ibm.mq.dev.doc/q029090_.htm
  ##   0 = at most once
  ##   1 = at least once
  ##   2 = exactly once
  # qos = 2

  ## Keep Alive
  ## Defines the maximum length of time that the broker and client may not
  ## communicate. Defaults to 0 which turns the feature off.
  ##
  ## For version v2.0.12 and later mosquitto there is a bug
  ## (see https://github.com/eclipse/mosquitto/issues/2117), which requires
  ## this to be non-zero. As a reference eclipse/paho.mqtt.golang defaults to 30.
  # keep_alive = 0

  ## username and password to connect MQTT server.
  # username = "telegraf"
  # password = "metricsmetricsmetricsmetrics"

  ## client ID
  ## The unique client id to connect MQTT server. If this parameter is not set
  ## then a random ID is generated.
  # client_id = ""

  ## Timeout for write operations. default: 5s
  # timeout = "5s"

  ## Optional TLS Config
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"

  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

  ## When true, metrics will be sent in one MQTT message per flush. Otherwise,
  ## metrics are written one metric per MQTT message.
  # batch = false

  ## When true, metric will have RETAIN flag set, making broker cache entries until someone
  ## actually reads it
  # retain = false

  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
  data_format = "json"

Key fields:

  • Under agent, make sure to set omit_hostname = true

  • Make sure to leave this empty: topic_prefix = ""


Thanks a lot, Jay. I will try this and give you feedback on the test! :grinning:


No luck with these settings.

omit_hostname = true
topic_prefix = "devices/{clientID}/messages/events/"

  • With omit_hostname = true, messages still do not arrive. If I redirect the output to a local MQTT broker (removing all the SSL options and pointing to a plain local broker on port 1883), I see that the hostname subtopic is not removed; an empty level appears in its place, as shown in the screenshot.

  • On the other hand, I must set the topic prefix to "devices/{clientID}/messages/events/" so that it fits the Azure IoT Hub schema.

  • In this case, {clientID} is testInfluxDB.

[[outputs.mqtt]]
#   ## MQTT Brokers
#   ## The list of brokers should only include the hostname or IP address and the
#   ## port to the broker. This should follow the format `[{scheme}://]{host}:{port}`. For
#   ## example, `localhost:1883` or `mqtt://localhost:1883`.
#   ## Scheme can be any of the following: tcp://, mqtt://, tls://, mqtts://
#   ## non-TLS and TLS servers can not be mix-and-matched.
     servers=["tcp://192.168.1.123:1883"]
#
#   ## Protocol can be `3.1.1` or `5`. Default is `3.1.1`
#   protocol = "3.1.1"
#
#   ## MQTT Topic for Producer Messages
#   ## MQTT outputs send metrics to this topic format:
#   ## <topic_prefix>/<hostname>/<pluginname>/ (e.g. prefix/web01.example.com/mem)
    topic_prefix = "devices/testInfluxDB/messages/events/"
    ##topic_prefix=""
#
#   ## QoS policy for messages
#   ## The mqtt QoS policy for sending messages.
#   ## See https://www.ibm.com/support/knowledgecenter/en/SSFKSJ_9.0.0/com.ibm.mq.dev.doc/q029090_.htm
    qos = 2
  • As you can see, Telegraf sends this complete topic: "devices/testInfluxDB/messages/events/ {blank space}/mqtt_consumer".

  • The next image shows some logging from the IoT Hub side, with the MQTT output pointed at IoT Hub again and the SSL settings restored. It seems that whenever a message tries to arrive, the device is disconnected.

UPDATE: it works with the following settings.

It works, finally.
The only change needed was removing a single slash.
If I keep:

[agent]
  omit_hostname = true
[[outputs.mqtt]]
  topic_prefix = "devices/testInfluxDB/messages/events"

Now it works. The trailing slash (/) at the end of the prefix is not valid.
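
To illustrate why the trailing slash matters, here is a minimal Python sketch. It assumes Telegraf builds the topic by joining the non-empty parts (prefix, host tag, metric name) with "/". That is my reading of the behaviour seen in this thread, not a quote from the plugin source, and build_topic is a hypothetical helper.

```python
def build_topic(topic_prefix, hostname, metric_name):
    """Join the non-empty parts with '/' (assumed behaviour, see note above)."""
    parts = [p for p in (topic_prefix, hostname) if p]
    parts.append(metric_name)
    return "/".join(parts)


# omit_hostname = true leaves the host part empty in both cases.
with_slash = build_topic("devices/testInfluxDB/messages/events/", "", "mqtt_consumer")
without_slash = build_topic("devices/testInfluxDB/messages/events", "", "mqtt_consumer")

print(with_slash)     # devices/testInfluxDB/messages/events//mqtt_consumer
print(without_slash)  # devices/testInfluxDB/messages/events/mqtt_consumer
```

Under that assumption, the prefix with the trailing slash produces an empty level ("//") followed by the metric name, which is two levels after the events prefix, while the prefix without it produces exactly one.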

On the other hand, I don't understand why, with qos = 2 (which should mean the message is delivered exactly once), the message is sent many times, as shown in the screenshot. With qos = 1, the message arrives exactly once each time I click the publish MQTT button.