Parsing Telegraf output JSON as input JSON

I’m trying to “pipe” JSON between instances of Telegraf.

Data in Telegraf’s standard JSON output format:

{
    "fields": {
        "field_1": 30,
        "field_2": 4,
        "field_N": 59,
        "n_images": 660
    },
    "name": "my_measurement",
    "tags": {
        "tag_1": "bovik"
    },
    "timestamp": 1458229140
}

Given the input config

  data_format = "json"

we get

{
    "fields": {
        "fields_field_1": 30,
        "fields_field_2": 4,
        "fields_field_N": 59,
        "fields_n_images": 660,
        "timestamp": 1458229140
    },
    "name": "file",
    "tags": {
        "host": "my-computer.local"
    },
    "timestamp": 1661502531
}

where the measurement name and all the tags are lost, the field names are prefixed with fields_, and the original timestamp becomes just another field (Telegraf assigns a new ingestion timestamp).
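For illustration, the flattening can be mimicked with a short Python sketch. This is an approximation of the observed behavior, not Telegraf's actual implementation; by default the json parser keeps only numeric values and drops strings unless they are listed in json_string_fields or tag_keys:

```python
def flatten(obj, prefix=""):
    """Flatten nested JSON objects by joining keys with '_'.

    Approximates the default behavior of Telegraf's "json" parser:
    string values (like "name" and the tags) are dropped unless
    configured via json_string_fields or tag_keys.
    """
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "_"))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[name] = value
    return flat

data = {
    "fields": {"field_1": 30, "field_2": 4, "field_N": 59, "n_images": 660},
    "name": "my_measurement",
    "tags": {"tag_1": "bovik"},
    "timestamp": 1458229140,
}
print(flatten(data))
# {'fields_field_1': 30, 'fields_field_2': 4, 'fields_field_N': 59,
#  'fields_n_images': 660, 'timestamp': 1458229140}
```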

Given the input config

  data_format = "json"
  tag_keys = ["name", "tags_*"]

we get

{
    "fields": {
        "fields_field_1": 30,
        "fields_field_2": 4,
        "fields_field_N": 59,
        "fields_n_images": 660,
        "timestamp": 1458229140
    },
    "name": "file",
    "tags": {
        "host": "my-computer.local",
        "name": "my_measurement",
        "tags_tag_1": "bovik"
    },
    "timestamp": 1661502562
}

where the measurement name and tags are kept, but the tag and field names are prefixed with tags_ and fields_, respectively.

Given the input config plus a regex processor

  data_format = "json"
  tag_keys = ["name", "tags_*"]
  [[processors.regex]]
    [[processors.regex.tag_rename]]
      pattern = "^tags_(.+)$"
      replacement = "${1}"
    [[processors.regex.field_rename]]
      pattern = "^fields_(.+)$"
      replacement = "${1}"

we get

{
    "fields": {
        "field_1": 30,
        "field_2": 4,
        "field_N": 59,
        "n_images": 660,
        "timestamp": 1458229140
    },
    "name": "file",
    "tags": {
        "host": "my-computer.local",
        "name": "my_measurement",
        "tag_1": "bovik"
    },
    "timestamp": 1661502609
}

which at least resembles the original data (ignoring the extra timestamp) and is usable.
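The two rename patterns can be sanity-checked outside Telegraf; note that Python's re syntax uses \1 where Telegraf's regex processor uses ${1}:

```python
import re

# Same idea as the tag_rename/field_rename rules above:
# strip a leading "tags_" or "fields_" prefix, leave other names alone.
PREFIX = re.compile(r"^(?:tags|fields)_(.+)$")

def strip_prefix(name: str) -> str:
    return PREFIX.sub(r"\1", name)

print(strip_prefix("fields_field_1"))  # field_1
print(strip_prefix("tags_tag_1"))      # tag_1
print(strip_prefix("timestamp"))       # timestamp (unchanged)
```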

Using json_v2 was even more of a time sink. In the data, tags is an object, so I hit Add dynamic tag set to json_v2 · Issue #10576 · influxdata/telegraf · GitHub

Am I missing something, or shouldn’t there be an easy “out of the box” way to ingest JSON from another Telegraf instance, without hard-coding strings and regexes into the config?

Hi @john_gronska,
So in the original JSON parser, you have to specify the string values explicitly, like so:

  ## String fields is an array of keys that should be added as string fields.
  json_string_fields = []

Personally, for your use case, I would use the JSON v2 parser instead: it ingests strings by default and gives you more control over how the incoming JSON payload is parsed.

Better yet, why not use Telegraf’s native influx line protocol format to pipe one Telegraf instance into another?
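For reference, the original metric would look roughly like this in influx line protocol (timestamp in nanoseconds, integer fields suffixed with i):

```
my_measurement,tag_1=bovik field_1=30i,field_2=4i,field_N=59i,n_images=660i 1458229140000000000
```

With data_format = "influx" on both the serializer and parser sides, the measurement name, tags, fields, and timestamp round-trip without any renaming.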

Thanks for your response. At the bottom of my admittedly long post, I wrote “Using json_v2 was even more of a time sink. In the data, tags is an object, so I hit Add dynamic tag set to json_v2 · Issue #10576 · influxdata/telegraf · GitHub”. I actually started with json_v2 and backed down to json after not being able to come up with a config that would work for this use case.

That makes sense, yes, but they’re connected via MQTT, and other services are potentially on that “message bus” (plus I already have a fleet of devices speaking JSON in the field).

Okay, that wasn’t clear initially. Never mind what I said then…

Hey @john_gronska,

did you try to use the xpath parser with

  data_format = "xpath_json"

  [[inputs.file.xpath]]
    metric_name = "/name"
    timestamp = "/timestamp"
    timestamp_format = "unix"
    field_selection = "fields/*"
    tag_selection = "tags/*"
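For completeness, a sketch of the full file input this would live in (the files path is a placeholder, not from the original thread):

```toml
[[inputs.file]]
  ## Placeholder path; point this at the actual JSON file or feed.
  files = ["/tmp/telegraf-output.json"]
  data_format = "xpath_json"

  [[inputs.file.xpath]]
    metric_name = "/name"
    timestamp = "/timestamp"
    timestamp_format = "unix"
    field_selection = "fields/*"
    tag_selection = "tags/*"
```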