Telegraf json array parser causes the field keys to not line up

telegraf

#1

I am trying to use Telegraf to parse a JSON array of discovered HTTP status counters into InfluxDB. Is it possible to keep the array index from being prepended to the field key?

My desired output for each status code:

http_StatusCounters_200=31320849,
http_StatusCounters_404=1378251,
http_StatusCounters_408=16719,
http_StatusCounters_500=100164 1550618052000000000

The current output, which causes the difficulty:

http_StatusCounters_0_200=31320849,
http_StatusCounters_1_404=1378251,
http_StatusCounters_2_408=16719,
http_StatusCounters_3_500=100164 1550618052000000000

The JSON uses an array because new status codes are discovered over time, so the array can grow (after a restart it starts out sparse again). The status code should be the key for the counter value, not the offset into the array.

Because the JSON parser prepends the array index to the status code, the key shifts whenever the number of array elements changes. The samples below show the JSON array expanding and the resulting series written to InfluxDB:

JSON input 1

{
    "http": { 
        "RequestCount": 32826821,
        "ResponseCount": 32823588,
        "StatusCounters" : [ 
            { "200": 31320849 },
            { "404": 1378251 },
            { "408": 16719 },
            { "500": 100164 } ]
    }
}

JSON input 2

{
    "http": { 
        "RequestCount": 32830174,
        "ResponseCount": 32827825,
        "StatusCounters" : [ 
            { "200": 31322577 },
            { "400": 64 },
            { "404": 1378539 },
            { "405": 503 },
            { "408": 17432 },
            { "410": 178 },
            { "500": 100898 } ]
    }
}

Output1:
http_RequestCount=32826821,http_ResponseCount=32823588,http_StatusCounters_0_200=31320849,http_StatusCounters_1_404=1378251,http_StatusCounters_2_408=16719,http_StatusCounters_3_500=100164 1550618052000000000

Output2:
http_RequestCount=32830174,http_ResponseCount=32827825,http_StatusCounters_0_200=31322577,http_StatusCounters_1_400=64,http_StatusCounters_2_404=1378539,http_StatusCounters_3_405=503,http_StatusCounters_4_408=17432,http_StatusCounters_5_410=178,http_StatusCounters_6_500=100898 1550618070000000000

Notice that the running counter for the 500 status code does not line up between the two outputs, which produces awkward, fragmented InfluxDB series:
http_StatusCounters_3_500=100164
http_StatusCounters_6_500=100898
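
To illustrate why the key shifts, here is a minimal Python sketch of a flattener that joins nested keys and array offsets with underscores. It is an illustration only, not Telegraf's actual Go implementation; the `flatten` function is hypothetical, and the data is trimmed from the two sample inputs above.

```python
def flatten(value, prefix=""):
    """Flatten nested JSON into {field_key: number}, joining path parts with '_'."""
    fields = {}
    if isinstance(value, dict):
        for key, child in value.items():
            fields.update(flatten(child, f"{prefix}_{key}" if prefix else key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            # The array offset becomes part of the field key here.
            fields.update(flatten(child, f"{prefix}_{index}" if prefix else str(index)))
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        fields[prefix] = value
    return fields


sample1 = {"http": {"StatusCounters": [
    {"200": 31320849}, {"404": 1378251}, {"408": 16719}, {"500": 100164}]}}
sample2 = {"http": {"StatusCounters": [
    {"200": 31322577}, {"400": 64}, {"404": 1378539}, {"405": 503},
    {"408": 17432}, {"410": 178}, {"500": 100898}]}}

# The same status code lands under a different key once the array grows.
print([k for k in flatten(sample1) if k.endswith("_500")])
print([k for k in flatten(sample2) if k.endswith("_500")])
```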

I am using Telegraf 1.9.4 with the following simple input config. I would like to avoid any rigid assumptions about which status codes it might discover.

[[inputs.http]]
    name_suffix=".rest.http"
    urls = [
        "http://127.0.0.1:8100/httpstatus.json"
    ]
    method = "GET"
    data_format = "json"
    [inputs.http.tags]
        influxdb_database = "app"

I am still new to Telegraf and welcome suggestions. I might be able to alter the code if someone could point me to where the array index gets inserted.


#2

Is there any chance that you can modify the incoming JSON? Something like this would work well with the JSON parser:

{
    "http": {
        "RequestCount": 32830174,
        "ResponseCount": 32827825,
        "StatusCounters" : {
            "200": 31322577,
            "400": 64,
            "404": 1378539,
            "405": 503,
            "408": 17432,
            "410": 178,
            "500": 100898
        }
    }
}
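
The reshaping itself is small. As a rough sketch in Python, assuming you can run a step wherever the JSON is produced or served (the `merge_status_counters` helper is hypothetical and not part of Telegraf), the list of single-key objects can be merged into one object like this:

```python
import json


def merge_status_counters(doc):
    """Return a copy of doc with http.StatusCounters merged into one object.

    Assumes the input has the shape from the question: a list of
    single-key objects such as [{"200": 1}, {"404": 2}].
    """
    http = dict(doc["http"])
    merged = {}
    for entry in http["StatusCounters"]:
        merged.update(entry)  # each entry maps one status code to its count
    http["StatusCounters"] = merged
    return {**doc, "http": http}


original = {
    "http": {
        "RequestCount": 32826821,
        "ResponseCount": 32823588,
        "StatusCounters": [
            {"200": 31320849}, {"404": 1378251}, {"408": 16719}, {"500": 100164}
        ],
    }
}

reshaped = merge_status_counters(original)
print(json.dumps(reshaped, indent=4))
```

With the counters in a single object there is no array offset to flatten, so each status code keeps the same key no matter how many codes have been discovered.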

If that’s not a possibility, you could consider the following config. It grabs only the status counters and no longer collects the total request and response counts:

[[inputs.file]]
  files = ["test.json"]
  json_query = "http.StatusCounters"
  data_format = "json"

If you are feeling especially adventurous, the code that adds these indexes to the field name is here: https://github.com/influxdata/telegraf/blob/f8cc9719a237ab221594a7ac2c42c4cc18d1859a/plugins/parsers/json/parser.go#L238