How to use the inputs.http plugin for HTTP APIs that return XML data

Hello, I am using [[inputs.http]] to fetch the result from a REST API.
However, the REST API is returning XML data instead of JSON.
I am using data_format = "xml", but it doesn’t work. Just wondering if this plugin works for XML.

[root@h0012345 naypavk]# telegraf --config /home/naypavk/src/telegraf/mutedtests.conf -test
2021-08-02T15:51:17Z I! Starting Telegraf 1.19.1
2021-08-02T15:51:17Z D! [agent] Initializing plugins
2021-08-02T15:51:17Z D! [agent] Starting service inputs
2021-08-02T15:51:17Z D! [agent] Stopping service inputs
2021-08-02T15:51:17Z D! [agent] Input channel closed
2021-08-02T15:51:17Z D! [agent] Stopped Successfully

Of course you also have to configure the input format xml correctly:

Thanks for the quick response, @Franky1.
I am new to Telegraf and have one quick question.
I understand that [[inputs.file]] will help in this context.
But I do not understand how to pass the output of [[inputs.http]] to [[inputs.file]].

E.g., here is the conf portion I have for [[inputs.http]] (of course it doesn’t return anything, because the XML format still needs to be configured):

[[inputs.http]]
  ## One or more URLs from which to read formatted metrics
  urls = ["https://toscity.xxxxxxxxx.local/httpAuth/app/rest/mutes"]

  ## HTTP method
  method = "GET"

  ## Optional HTTP headers
  # headers = {"X-Special-Header" = "Special-Value"}
  headers = {"authorization" = "Basic xxxxxxxxxxxxxxxxxxxx"}

  ## Amount of time allowed to complete the HTTP request
  timeout = "30s"

  ## List of success status codes
  success_status_codes = [200]

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "xml"
  tag_keys = ["state"]
  name_override = "muted_tests"
  json_query = "values"
  #json_string_fields = ["id", "title", "createdDate","updatedDate"]

Please advise.

I do not understand how the inputs.file plugin will help here? :thinking:
You can’t forward inputs to inputs, it doesn’t work like that. What is the purpose of this?

I have an idea why you came up with the inputs.file plugin - don’t be confused by the example in the docs. :wink:
For the inputs.http plugin, the example config looks something like this:

[[inputs.http]]
  # your http config here
  data_format = "xml"
  
  [[inputs.http.xml]]
    #metric_selection = "/Bus/child::Sensor"
    #metric_name = "string('example')"
    #timestamp = "/Gateway/Timestamp"
    #timestamp_format = "2006-01-02T15:04:05Z"

    ## Tag definitions using the given XPath queries.
    [inputs.http.xml.tags]
      name   = "substring-after(Sensor/@name, ' ')"
      device = "string('the ultimate sensor')"

    ## Integer field definitions using XPath queries.
    [inputs.http.xml.fields_int]
      consumers = "Variable/@consumers"

    ## Non-integer field definitions using XPath queries.
    [inputs.http.xml.fields]

Thanks @Franky1, this is great input. Now I understand it better.
Telegraf is now able to recognize the XML data.
But Telegraf has trouble interpreting the first line of the XML, which seems to be included by default.

<!-- Add XML Data --><?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<mutes count="100" nextHref="/httpAuth/app/rest/mutes?locator=start:100,count:100" href="/httpAuth/app/rest/mutes">
    <mute id="2" href="/httpAuth/app/rest/mutes/id:2">
        <assign..............................

Telegraf fails with the error below:

[root@h00123456 naypavk]# telegraf --config /home/naypavk/src/telegraf/mutedtests.conf -test
2021-08-03T17:04:34Z I! Starting Telegraf 1.19.1
2021-08-03T17:04:34Z D! [agent] Initializing plugins
2021-08-03T17:04:34Z D! [agent] Starting service inputs
2021-08-03T17:04:34Z E! [inputs.http] Error in plugin: [url=https://toscity.xxxxxxxxxx.local/httpAuth/app/rest/mutes]: *metric parse error: expected timestamp at 1:21: "<?xml version=\"1.0\" <-- here"*
2021-08-03T17:04:34Z D! [agent] Stopping service inputs
2021-08-03T17:04:34Z D! [agent] Input channel closed
2021-08-03T17:04:34Z D! [agent] Stopped Successfully
2021-08-03T17:04:34Z E! [telegraf] Error running agent: input plugins recorded 1 errors

Is there any way to ignore it and proceed? Since the XML is the direct output of the HTTP API, I don’t have much control over it. If there is a way to save the output to a local file, I might be able to remove that line and parse the XML data as expected. Please advise.

I am not sure. I would guess that the XML parser should be able to handle it.
Maybe there is something wrong with your configuration.
I have not used the XML parser yet. So I could be wrong.

If you get stuck, please post an example XML payload and your XML input configuration.

I tried it with the snippet, it should work.
If it doesn’t work for you, please post a full example of the XML payload here.

As a new user, I am not allowed to upload files.
So I am pasting the conf file and the XML payload here. I know it is inconvenient for you, but I do not see any other option.

# Telegraf Configuration
#
# Telegraf is entirely plugin driven. All metrics are gathered from the
# declared inputs, and sent to the declared outputs.
#
# Plugins must be declared in here to be active.
# To deactivate a plugin, comment out the name and any variables.
#
# Use 'telegraf -config telegraf.conf -test' to see what metrics a config
# file would generate.
#
# Environment variables can be used anywhere in this config file, simply surround
# them with ${}. For strings the variable must be within quotes (ie, "${STR_VAR}"),
# for numbers and booleans they should be plain (ie, ${INT_VAR}, ${BOOL_VAR})


# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "30s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "20s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Log at debug level.
  debug = true
  ## Log only error level messages.
  quiet = false

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  # logtarget = "file"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  # logfile = ""

  ## The logfile will be rotated after the time interval specified.  When set
  ## to 0 no time based rotation is performed.  Logs are rotated only when
  ## written to, if there is no log activity rotation may be delayed.
  # logfile_rotation_interval = "0d"

  ## The logfile will be rotated when it becomes larger than the specified
  ## size.  When set to 0 no size based rotation is performed.
  # logfile_rotation_max_size = "0MB"

  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  # logfile_rotation_max_archives = 5

  ## Pick a timezone to use when logging or type 'local' for local time.
  ## Example: America/Chicago
  # log_with_timezone = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do no set the "host" tag in the telegraf agent.
  omit_hostname = false
  
###############################################################################
#                            INPUT PLUGINS                                    #
###############################################################################


# Read formatted metrics from one or more HTTP endpoints
[[inputs.http]]
  ## One or more URLs from which to read formatted metrics
  urls = ["https://teamcity.localsys.local/httpAuth/app/rest/mutes"]

  ## HTTP method
  method = "GET"

  ## Optional HTTP headers
  # headers = {"X-Special-Header" = "Special-Value"}
  headers = {"authorization" = "Basic xxxxxxxxxxxxxxxxx"}

  ## Amount of time allowed to complete the HTTP request
  timeout = "30s"

  ## List of success status codes
  success_status_codes = [200]

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  #data_format = "xml"
  #tag_keys = ["state"]
  #name_override = "muted_tests"
  #json_query = "values"
  #json_string_fields = ["id", "title", "createdDate","updatedDate"]

[[inputs.http]]
  # your http config here
  data_format = "xml"
  [[inputs.http.xml]]
    #metric_selection = "/Bus/child::Sensor"
    #metric_name = "string('example')"
    timestamp = "mutes/mute/assignment/timestamp"
    #timestamp_format = "2006-01-02T15:04:05Z"

    ## Tag definitions using the given XPath queries.
    [inputs.http.xml.tags]
      #name   = "substring-after(Sensor/@name, ' ')"
      #device = "string('the ultimate sensor')"

    ## Integer field definitions using XPath queries.
    [inputs.http.xml.fields_int]
      # consumers = "Variable/@consumers"

    ## Non-integer field definitions using XPath queries.
    [inputs.http.xml.fields]

Here is the XML payload:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<mutes count="100" nextHref="/httpAuth/app/rest/mutes?locator=start:100,count:100" href="/httpAuth/app/rest/mutes">
    <mute id="2" href="/httpAuth/app/rest/mutes/id:2">
        <assignment>
            <user username="abc123" name="~Disabled firstname1, lastname1" id="79" href="/httpAuth/app/rest/users/id:79"/>
            <timestamp>20150422T135143-0500</timestamp>
        </assignment>
        <scope>
            <project id="Project1" name="XYZ Project" parentProjectId="Builds" description="floatorfly project, see https://confluence.globalsys.local/display/XYZ" href="/httpAuth/app/rest/projects/id:Project1" webUrl="https://xyzcity.localsys.local/project.html?projectId=Project1"/>
        </scope>
        <target>
            <tests count="1">
                <test id="2189408212702110526" name="com.experteam.xyz.test.time.RoutingSwitchTest.testDisabledSwitching" href="/httpAuth/app/rest/tests/id:2189408212702110526"/>
            </tests>
        </target>
        <resolution type="manually"/>
    </mute>
    <mute id="5" href="/httpAuth/app/rest/mutes/id:5">
        <assignment>
            <user username="abc123" name="~Disabled firstname1, lastname1" id="79" href="/httpAuth/app/rest/users/id:79"/>
            <timestamp>20150427T211329-0500</timestamp>
        </assignment>
        <scope>
            <project id="Project1" name="XYZ Project" parentProjectId="Builds" description="floatorfly project, see https://confluence.globalsys.local/display/XYZ" href="/httpAuth/app/rest/projects/id:Project1" webUrl="https://xyzcity.localsys.local/project.html?projectId=Project1"/>
        </scope>
        <target>
            <tests count="1">
                <test id="-8974266636665971803" name="com.experteam.xyz.test.configuration.MarginParametersManagementTest.testDefaultMarginProfileCreation" href="/httpAuth/app/rest/tests/id:-8974266636665971803"/>
            </tests>
        </target>
        <resolution type="manually"/>
    </mute>
</mutes>

I put your XML payload in a file and ran this example Telegraf config; it seems to work:

[[inputs.file]]
  name_override = "teamcity"
  files = ["payload.xml"]
  data_format = "xml"
  [[inputs.file.xml]]
    metric_selection = "/mutes/mute"
    timestamp = "assignment/timestamp"
    timestamp_format = "20060102T150405-0700"
    [inputs.file.xml.tags]
      project = "string(scope/project/@id)"
    [inputs.file.xml.fields_int]
      mute = "@id"
      userid = "assignment/user/@id"
    [inputs.file.xml.fields]
      username = "string(assignment/user/@username)"
      resolution = "string(resolution/@type)"

[[outputs.file]]  # only for debugging
  files = ["xmlparser.out"]
  influx_sort_fields = true

Output:

teamcity,project=Project1 mute=2i,resolution="manually",userid=79i,username="abc123" 1429728703000000000
teamcity,project=Project1 mute=5i,resolution="manually",userid=79i,username="abc123" 1430187209000000000

Adjust the config to your needs and transfer it to the http plugin.
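
For reference, here is a minimal sketch of what that transfer might look like. It assumes the URL and authorization header from the config posted earlier in this thread (the credentials are placeholders) and has not been run against the real API. Everything lives in a single [[inputs.http]] block. The config posted above declares two such blocks, with urls in the first and data_format = "xml" in the second. Telegraf falls back to the influx line protocol parser when data_format is unset, which would be consistent with the "expected timestamp" error, though that is an inference and not something confirmed in this thread.

```toml
# Sketch only: the inputs.file example above, moved into inputs.http.
# URL and header are copied from the earlier post; credentials are placeholders.
[[inputs.http]]
  name_override = "teamcity"
  urls = ["https://teamcity.localsys.local/httpAuth/app/rest/mutes"]
  method = "GET"
  headers = {"authorization" = "Basic xxxxxxxxxxxxxxxxx"}
  timeout = "30s"
  success_status_codes = [200]

  ## Must be set in the same block as urls so the XML parser is used.
  data_format = "xml"

  [[inputs.http.xml]]
    ## One metric per <mute> element; the queries below are relative to it.
    metric_selection = "/mutes/mute"
    timestamp = "assignment/timestamp"
    ## Go reference-time layout matching e.g. 20150422T135143-0500
    timestamp_format = "20060102T150405-0700"

    [inputs.http.xml.tags]
      project = "string(scope/project/@id)"

    [inputs.http.xml.fields_int]
      mute = "@id"
      userid = "assignment/user/@id"

    [inputs.http.xml.fields]
      username = "string(assignment/user/@username)"
      resolution = "string(resolution/@type)"
```

Running telegraf with -test against this config, as in the earlier posts, should then print one metric per <mute> element, similar to the output shown above.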

Hello @Franky1,
Somehow I never got a chance to thank you, due to other overwhelming priorities.
But your timely and detailed assistance helped me to a great extent. I highly appreciate your help in this regard. Great respect!! Thanks much again!!
