Getting all data from a deeply nested JSON with Telegraf json_v2

Hello. I am trying to extract all the data from a deeply nested JSON document. I have looked at the test examples on GitHub, but none of them shows how to get every value out of deeply nested JSON files.
Example:

{
  trafficRegistrationPoints {
    id
    name
    location {
      roadLinkSequence {
        roadLinkSequenceId
        relativePosition
      }
      county {
        name
      }
      municipality {
        name
      }
      coordinates {
        latLon {
          lat
          lon
        }
      }
      roadReference {
        shortForm
        roadCategory {
          id
        }
      }
      roadReferenceHistory {
        validFrom
        validTo
        meteringDirectionChanged
      }
    }
    trafficRegistrationType
    direction {
      from
      fromAccordingToMetering
      to
      toAccordingToMetering
    }
    commissions {
      validFrom
      validTo
      lanes {
        laneNumber
        laneNumberAccordingToMetering
      }
    }
    manualLabels {
      validFrom
      validTo
      affectedLanes {
        lane {
          laneNumber
          laneNumberAccordingToMetering
        }
        states
      }
    }
    dataTimeSpan {
      firstData
      firstDataWithQualityMetrics
      latestData {
        volumeByHour
        volumeByDay
        volumeAverageDailyByMonth
        volumeAverageDailyBySeason
        volumeAverageDailyByYear
      }
    }
    meteringDirectionChanged
    operationalStatus
    registrationFrequency
  }
}

How do I get this into the line protocol that InfluxDB requires? Thanks.

This is a JSON-like structure, but not a valid JSON dataset.

  • Can you provide a dataset with real data in JSON?
  • Which of the nested parameters should be read?

Hello, thanks for the reply. An example similar to mine is the multiple_json_input example in the testdata on GitHub, with the object at root.station and the path #.etd.0.estimate.0.minutes. How can I change it to get all the minutes in each JSON file rather than only the first one? I've tried #.etd.0.estimate.#.minutes and #.etd.#.estimate.#.minutes as object.field, object.fields, object.tag, and object.object, but still have no solution. Thanks.

I would suggest you show us how far you have come. Show us your sample JSON data and your parser config, and then we’ll help.

My data is:

{
  "data": {
    "trafficRegistrationPoints": [
      {
        "id": "53887V521439",
        "name": "VOLD",
        "location": {
          "roadLinkSequence": {
            "roadLinkSequenceId": 521439,
            "relativePosition": 0.53887
          },
          "county": {
            "name": "Vestfold og Telemark"
          },
          "municipality": {
            "name": "Skien"
          },
          "coordinates": {
            "latLon": {
              "lat": 59.124808,
              "lon": 9.50871
            }
          },
          "roadReference": {
            "shortForm": "FV353 S2D1 m6526",
            "roadCategory": {
              "id": "F"
            }
          },
          "roadReferenceHistory": [
            {
              "validFrom": "2013-02-27T00:00:00+01:00",
              "validTo": "2020-01-01T00:00:00+01:00",
              "meteringDirectionChanged": false
            },
            {
              "validFrom": "2020-01-01T00:00:00+01:00",
              "validTo": null,
              "meteringDirectionChanged": false
            }
          ]
        }
]
}
}

My parser config is:

[[inputs.file]]
  files = ["/etc/telegraf/data/trafficRegistrationPointsj.json"]
  data_format = "json_v2"

  [[inputs.file.json_v2]]
    [[inputs.file.json_v2.object]]
      path = "data.trafficRegistrationPoints"
      disable_prepend_keys = false
      included_keys = [
        "name",
        "location_roadLinkSequence_roadLinkSequenceId",
        "location_roadLinkSequence_relativePosition",
        "location_municipality_name",
        "location_coordinates_latLon_lat",
        "location_coordinates_latLon_lon",
        "location_roadReference_shortForm",
        "location_roadReference_roadCategory_id"
      ]
      tags = ["id"]

      [[inputs.file.json_v2.object.field]]
        path = "location.roadReferenceHistory"

I have an issue getting the roadReferenceHistory data. This is just part of a large dataset, and I would like to get everything in it. Thanks.

Both my editor and Telegraf complain that this is not valid JSON.
Maybe something went wrong during copy and paste?
Can you provide a full example of your data?
If it is too large for the forum, provide a download link, a GitHub gist, or similar.

link to gist

thanks

Phew, this is a tough one.
I have had no success so far either.
The problem seems to be the arrays within the array.
The parser ignores them, and I don’t know how to address them properly in the config.
I agree that there is no example that covers this use case.
Maybe someone from the developers can help?
@jpowers

When you get to roadReferenceHistory, what do you want the line protocol to look like? Looking at the data, I see an array:

"roadReferenceHistory": [
    {
        "validFrom": "2014-05-23T00:00:00+02:00",
        "validTo": "2020-01-01T00:00:00+01:00",
        "meteringDirectionChanged": false
    },
    {
        "validFrom": "2020-01-01T00:00:00+01:00",
        "validTo": null,
        "meteringDirectionChanged": false
    }
]

How would you want that to look in line protocol? You said you wanted all the data, so I assume you want each element of that array in your final data, like:

metricname,id=00892V578228 0.meteringDirectionChanged="false",1.meteringDirectionChanged="false"
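
As a rough illustration of that shape, the array can be flattened into index-prefixed field keys with a few lines of Python. This is only a sketch: the helper name index_fields is made up for this example, and it keeps booleans as booleans and drops null values (line protocol has no null), whereas the line above shows the boolean as a quoted string.

```python
import json

# Sample copied from the roadReferenceHistory array shown above.
HISTORY_JSON = """
[
  {"validFrom": "2014-05-23T00:00:00+02:00",
   "validTo": "2020-01-01T00:00:00+01:00",
   "meteringDirectionChanged": false},
  {"validFrom": "2020-01-01T00:00:00+01:00",
   "validTo": null,
   "meteringDirectionChanged": false}
]
"""


def index_fields(entries):
    """Flatten a list of dicts into {"<index>.<key>": value}, skipping nulls.

    Hypothetical helper for illustration; not part of Telegraf or InfluxDB.
    """
    fields = {}
    for index, entry in enumerate(entries):
        for key, value in entry.items():
            if value is None:  # line protocol cannot represent null
                continue
            fields[f"{index}.{key}"] = value
    return fields


fields = index_fields(json.loads(HISTORY_JSON))
for key, value in fields.items():
    print(key, "=", value)
```

The second entry has "validTo": null, so it contributes only two fields, and the result holds five fields in total.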

Since you mentioned getting these into line protocol, my first instinct with something like this is not to use Telegraf, but instead, to use my programming language of choice and the InfluxDB Client Libraries. The Telegraf JSON parsers are great for relatively simple data, but as soon as you get nested and complex data it becomes far too difficult to keep straight, let alone debug.
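
To give an idea of what doing it in code involves, here is a minimal standard-library sketch of turning tags and fields into one line of line protocol. The client libraries handle this serialization for you; the function names (escape_key, format_field_value, to_line), the measurement name, and the shortened field keys are assumptions made for this example only.

```python
def escape_key(text):
    """Escape commas, equals signs and spaces in tag keys, tag values and field keys."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field_value(value):
    """Render a Python scalar as a line protocol field value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Everything else is written as a quoted string.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields):
    """Build one line without a timestamp; fields whose value is None are skipped."""
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(
        f",{escape_key(key)}={escape_key(str(value))}"
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_key(key)}={format_field_value(value)}"
        for key, value in fields.items()
        if value is not None
    )
    return f"{name}{tag_part} {field_part}"


line = to_line(
    "trafficRegistrationPoints",  # assumed measurement name
    {"id": "53887V521439", "county": "Vestfold og Telemark"},
    {"name": "VOLD", "lat": 59.124808, "roadLinkSequenceId": 521439, "validTo": None},
)
print(line)
```

The space-containing county value is escaped as a tag, the integer gets its i suffix, and the null validTo is dropped. A point with no non-null fields would produce an invalid line, so a real script should skip such points.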

@Cyrus_Jomo
That would have been my next suggestion as well.
Before you waste many hours trying to configure the parser: for data structures this complicated, I would have written the parser myself faster, either in an exec(d) or a Starlark plugin.
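
A script for the exec route could look roughly like the sketch below. It assumes the document layout shown earlier (data.trafficRegistrationPoints), uses each point's id as the only tag, joins nested keys with underscores, and uses array indexes as key parts. The function names and the measurement name are made up for this example, and it assumes keys and ids contain no spaces, commas or equals signs, so they are not escaped.

```python
import json
import sys


def flatten(node, prefix=""):
    """Yield (key, value) for every non-null scalar leaf under node.

    Object keys are joined with "_" and list elements contribute their
    index, e.g. location_roadReferenceHistory_1_validFrom.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}_{key}" if prefix else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, f"{prefix}_{index}" if prefix else str(index))
    elif node is not None:  # line protocol cannot represent null
        yield prefix, node


def format_value(value):
    """Render a scalar as a line protocol field value."""
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    return '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'


def point_to_line(point, measurement="trafficRegistrationPoints"):
    """Build one line per registration point, tagged with its id."""
    fields = [(key, value) for key, value in flatten(point) if key != "id"]
    field_part = ",".join(f"{key}={format_value(value)}" for key, value in fields)
    return f"{measurement},id={point['id']} {field_part}"


def main(path):
    """Read the JSON file at path and print one line per registration point."""
    with open(path, encoding="utf-8") as handle:
        document = json.load(handle)
    for point in document["data"]["trafficRegistrationPoints"]:
        print(point_to_line(point))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Telegraf could then run the script through inputs.exec with data_format = "influx", passing the JSON file path as the argument. Because every array element becomes its own set of fields, arrays of varying length give a varying field set; emitting one line per array element with an index tag is an alternative worth considering.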

Thank you guys, I appreciate it. @jpowers @Franky1