I have a JSON payload arriving via Kafka that looks very similar to this:
{
  "measurement": "weather",
  "batches": [
    {
      "time": 1735854321,
      "tags": {
        "tag1": "tagValueA",
        "tag2": "tagValueB"
      },
      "fields": {
        "field1": 1,
        "field2": 2
      }
    },
    {
      "time": 1735854322,
      "tags": {
        "tag1": "tagValueC",
        "tag2": "tagValueD"
      },
      "fields": {
        "field1": 3,
        "field2": 4
      }
    }
  ]
}
I’d like to parse this with Telegraf using json_v2, xpath_json, or another parser. In this situation, we don’t know the tag names in advance.
The expected LP result should be:
weather,tag1=tagValueA,tag2=tagValueB field1=1,field2=2 1735854321
weather,tag1=tagValueC,tag2=tagValueD field1=3,field2=4 1735854322
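To make the intended mapping explicit, here is a minimal Python sketch of the transformation I'm after. It is not a Telegraf configuration, only a reference for the desired output. The function name `to_line_protocol` is made up for illustration, the values come from the sample payload above, and escaping of special characters and the `i` suffix for integer fields are deliberately left out.

```python
import json

# Sample payload, copied from the JSON shown above.
PAYLOAD = json.loads("""
{
  "measurement": "weather",
  "batches": [
    {"time": 1735854321,
     "tags": {"tag1": "tagValueA", "tag2": "tagValueB"},
     "fields": {"field1": 1, "field2": 2}},
    {"time": 1735854322,
     "tags": {"tag1": "tagValueC", "tag2": "tagValueD"},
     "fields": {"field1": 3, "field2": 4}}
  ]
}
""")


def to_line_protocol(payload):
    """Emit one line per entry in "batches" (hypothetical helper).

    Tag and field names are read from the data, so none of them
    need to be known in advance. No escaping is performed.
    """
    lines = []
    for entry in payload["batches"]:
        # Tags are optional; sort them by key for a stable order.
        tags = "".join(
            f",{key}={value}"
            for key, value in sorted(entry.get("tags", {}).items())
        )
        fields = ",".join(
            f"{key}={value}" for key, value in entry["fields"].items()
        )
        lines.append(f"{payload['measurement']}{tags} {fields} {entry['time']}")
    return lines


if __name__ == "__main__":
    for line in to_line_protocol(PAYLOAD):
        print(line)
```

The point of the sketch is that the measurement name is shared by the whole message, while tags, fields, and the timestamp are taken per array element.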
Q) How can I accomplish this?
I’m aware of both of the following, but neither seems to cover my use case of dynamic tags AND batching in an array:
opened 06:17PM - 02 Feb 22 UTC
closed 02:35PM - 03 Apr 24 UTC
feature request
area/json
## Feature Request
Add support for dynamic tag names in json_v2 input format
…
### Proposal:
Currently you can specify a dynamic set of fields at a certain path
```toml
[[inputs.file.json_v2.object]]
path = "fields"
```
but this only applies to fields, not tags. In my case, the fields are unknown. What I am using is a JSON representation of line protocol, like this:
```json
{
"measurement": "measurementName",
"tags": {
"tag1": "tagValue1",
"tag2": "tagValue2"
},
"fields": {
"field1": 1.0,
"field2": 2.0
}
}
```
I want to be able to specify "tags_path", with the desired result being:
`measurementName,tag1=tagValue1,tag2=tagValue2 field1=1.0,field2=2.0 ...`
### Current behavior:
Currently tags must be specified explicitly in the telegraf configuration file as an array of strings. In my case, 'path' does what I want for fields (pretend it is named "field_path") but "tag_path" is missing.
If we create "tag_path" then the input could be documented like this:
```toml
[[inputs.file.json_v2.object]]
path = "" # A string with valid GJSON path syntax, can include arrays and objects
tag_path = "" # A string with valid JSON path syntax, can include arrays and objects
```
### Use case:
I am trying to get data into telegraf using Node Red. There is a plugin for Node Red that manipulates line protocol. It uses the object representation I described above. See [node-red-contrib-influxdb-line-protocol](https://github.com/opatut/node-red-contrib-influxdb-line-protocol) for an example.
There are some other requests out in the ether looking for "line protocol in json" as well. [example](https://community.influxdata.com/t/json-file-line-protocol-influxdb/7818/6) is one. So I think the generic capability would be worthwhile.
master
← HRI-EU:parser_xpath
opened 12:10PM - 04 Feb 22 UTC
- [x] Updated associated README.md.
- [x] Wrote appropriate unit tests.
- [x] … Pull request title or commits are in [conventional commit format](https://www.conventionalcommits.org/en/v1.0.0/#summary)
This PR allows processing tags contained in the data in a batch fashion by specifying only queries for selecting the tag nodes, the "name" part within each node, and the "value" part. This is necessary if you cannot or do not want to specify the tags explicitly (e.g. if there is a large number of them or the tag names are not known in advance).
The example given in PR #10576
```json
{
"measurement": "measurementName",
"timestamp": "2022-02-02T00:00:00Z",
"tags": {
"tag1": "tagValue1",
"tag2": "tagValue2"
},
"fields": {
"field1": 1.0,
"field2": 2.0
}
}
```
can then be parsed via
```toml
[[inputs.file]]
files = ["example.json"]
data_format = "xpath_json"
[[inputs.file.xpath]]
metric_name = "/measurement"
timestamp = "/timestamp"
timestamp_format = "2006-01-02T15:04:05Z"
field_selection = "fields/child::*"
tag_selection = "tags/child::*"
```
Thanks so much in advance!
REF: https://github.com/influxdata/telegraf/issues/16977
KEYWORDS:
Dynamic Tag Set Parsing
Parsing Dynamic Tags in Batched Array