Hi,
I am looking for a plugin to scrape the body content from a URL.
For example:
URL = http://192.168.2.13/js/status.js
Response:
var version=“H4.01.38Y1.0.09W1.0.08”;var m2mMid=“632788327”;var wlanMac=“BC:54:F9:F2:78:5C”;var m2mRssi=“44%”;var wanIp=“192.168.2.13”;var nmac=“BC54F9F2785F”;var fephy=“off”;var webData=“NLBN402017AL2144,NL2-V1.0-45943,V5.3-90170,omnik4000tl2,4000,900,396,201488 ,1,”; …
Data to extract:
Rated power = 4000 W
Current power = 900 W
Yield today = 396 kWh
Total yield = 201488 kWh
Alerts =
Last updated = 1 min ago
My data is in the bold section, which I can parse with the regex processor.
Is this possible with a standard Telegraf plugin?
jpowers
October 26, 2022, 11:35pm
You can use the http plugin to call a web page and get the response. However, that response needs to be in a format we can parse, such as JSON, CSV, or plain values, so that the data can be extracted correctly and easily.
The other option is to use the exec plugin to curl or wget the file and parse it with a script and send the output of that parsing to telegraf.
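A minimal sketch of what such a script could look like, assuming the device serves straight double quotes and that the webData columns are in the order shown in the sample response. The field names, the measurement name "inverter", and the choice to pass the URL as a command-line argument are illustrative assumptions, not something the device or Telegraf prescribes.

```python
#!/usr/bin/env python3
"""Fetch status.js, extract webData, and print one line of InfluxDB line protocol."""
import re
import sys
import urllib.request

# Column names are assumptions; the order follows the sample response.
FIELDS = ["sn", "device", "version", "name", "power_rated",
          "power_current", "yield_today", "yield_total", "last_updated", "alerts"]

# Assumes straight double quotes around the webData value.
WEBDATA_RE = re.compile(r'var webData="([^"]*)";')


def parse_webdata(body: str) -> dict:
    """Return the comma-separated webData values keyed by FIELDS."""
    match = WEBDATA_RE.search(body)
    if match is None:
        raise ValueError("webData not found in response")
    values = [value.strip() for value in match.group(1).split(",")]
    return dict(zip(FIELDS, values))


def to_line_protocol(record: dict, measurement: str = "inverter") -> str:
    """Format the numeric power and yield values as one line-protocol line."""
    numeric = ["power_rated", "power_current", "yield_today", "yield_total"]
    fields = ",".join(f"{key}={float(record[key])}" for key in numeric)
    return f"{measurement},name={record['name']} {fields}"


def main(url: str) -> None:
    with urllib.request.urlopen(url, timeout=10) as response:
        body = response.read().decode("utf-8", errors="replace")
    print(to_line_protocol(parse_webdata(body)))


# Only fetch when a URL is given, e.g. "script.py http://192.168.2.13/js/status.js".
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

The exec plugin would then run this command with data_format = "influx", and Telegraf assigns the timestamp itself because the line carries none.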
Thanks for the help!
It will have to be the second option then, because the response is not in a supported format!
You might also try the http plugin with the grok parser to extract the webData part, and then use the parser processor to split the inner CSV, like this:
[[inputs.http]]
  ...
  # Capture the quoted webData string into a field named "value"
  data_format = "grok"
  grok_patterns = ['''var webData="%{DATA:value}";''']

# Split the captured comma-separated string into typed columns
[[processors.parser]]
  parse_fields = ["value"]
  drop_original = true
  data_format = "csv"
  csv_column_names = ["SN", "device", "version", "name", "power_rated", "power_current", "yield_today", "yield_total", "last_updated", "alerts"]
  csv_column_types = ["string", "string", "string", "string", "float", "float", "float", "float", "int", "string"]
  csv_tag_columns = ["name"]
which leads to
file,name=omnik4000tl2 SN="NLBN402017AL2144",alerts="",device="NL2-V1.0-45943",last_updated=1i,power_current=900,power_rated=4000,version="V5.3-90170",yield_today=396,yield_total=201488 1666879627513394591
in your example. Please note that I replaced the typographic double quotes from your example with straight ones, so you might need to adapt the grok pattern to match what the device actually returns.
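If it is unclear which quote style the device really sends, the pattern can be made to accept both. The sketch below shows the idea as a plain Python regular expression, using a lazy match in place of grok's DATA. The helper name extract_webdata and the shortened sample strings are made up for illustration, and whether the same character class behaves identically inside a Telegraf grok pattern is untested here.

```python
import re
from typing import Optional

# Accept a straight quote or either typographic quote (U+201C, U+201D).
QUOTE = '["\u201c\u201d]'
WEBDATA_RE = re.compile(rf'var webData={QUOTE}(.*?){QUOTE};')


def extract_webdata(body: str) -> Optional[str]:
    """Return the raw webData string, or None if it is not present."""
    match = WEBDATA_RE.search(body)
    return match.group(1) if match else None


# Shortened, made-up samples in both quote styles.
straight = 'var fephy="off";var webData="a,b,1,";'
curly = 'var fephy=\u201coff\u201d;var webData=\u201ca,b,1,\u201d;'

print(extract_webdata(straight))  # a,b,1,
print(extract_webdata(curly))     # a,b,1,
```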
Hi,
Thanks for this solution, it looks very good!