Job Scheduling for periodic web API scraping into InfluxDB

Hi there,

I have just started my InfluxData journey, so I am still very new. I have some time series that I am scraping from a few web APIs. The goal of my workflow is the following:

  1. Every hour, trigger GET calls to several web APIs and receive JSON responses.
  2. Convert the JSON into a tabular format.
  3. Write the data into a staging time-series bucket for each API.
  4. Use Flux to aggregate these staging buckets and write the results into a final, refined bucket.
  5. Run various analytics or ML models on that final bucket to gain insight.
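
Roughly, I picture steps 1-3 looking something like the Python sketch below (the endpoint, token, org, bucket, and field names are all placeholders I made up, and I'm assuming the official influxdb-client library):

```python
import requests
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

API_URL = "https://example.com/api/metrics"  # placeholder endpoint
INFLUX_URL = "http://localhost:8086"
TOKEN = "my-token"        # placeholder
ORG = "my-org"            # placeholder
BUCKET = "staging-api"    # staging bucket for this API

def scrape_once():
    # Step 1: GET call, JSON response
    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()
    records = resp.json()  # assume a list like [{"ts": ..., "sensor": ..., "value": ...}]

    # Steps 2-3: map each JSON record onto a point and write it to the staging bucket
    with InfluxDBClient(url=INFLUX_URL, token=TOKEN, org=ORG) as client:
        write_api = client.write_api(write_options=SYNCHRONOUS)
        points = [
            Point("api_reading")
            .tag("sensor", r["sensor"])
            .field("value", float(r["value"]))
            .time(r["ts"])  # RFC3339 string or epoch nanoseconds
            for r in records
        ]
        write_api.write(bucket=BUCKET, record=points)

if __name__ == "__main__":
    scrape_once()  # question 2 below is about how to run this hourly
```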

My question is two-fold:

  1. Do InfluxDB buckets accept JSON records as-is, or do I need to preprocess each JSON record before it can be accepted (for example, by turning it into a tabular row with the timestamp as the primary key)? The docs aren't very clear about whether the bucket has any schema requirements:
    Create a bucket in InfluxDB | InfluxDB OSS 2.0 Documentation
  2. Can I use an Influx tool, such as the Telegraf agent or Kapacitor, to deploy my Python (or Flux) script and kick off these hourly API calls into InfluxDB? Or do I have to use my own job scheduler (e.g., SSIS) to kick off my script and call the Influx API to write to the DB?

Thanks,
~Kevin

Hi Kevin

You can write to InfluxDB using line protocol. Buckets don't ingest raw JSON as-is, so each JSON record does need to be mapped onto a measurement, tags, fields, and a timestamp first. You can find more about the format in the docs.
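
For example, a single point (a made-up measurement with one tag, one field, and a nanosecond timestamp) looks like this:

```
api_reading,sensor=alpha value=1.23 1640995200000000000
```

Each of your JSON records would be mapped onto one such line (or onto an equivalent Point object if you use a client library).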

You can also bulk-load data files directly from block storage (local or in the cloud) using the influx CLI.
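
For instance, a file of line protocol can be loaded in one command (bucket, org, and file name are placeholders):

```
influx write --bucket staging-api --org my-org --file data.lp
```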

For the second question: Kapacitor has all the functionality you need, and it's well integrated with the TICK stack.
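
That said, since you already mentioned Telegraf: for plain hourly polling of a JSON endpoint, Telegraf's http input plugin may be all you need, with no separate scheduler at all. A minimal sketch of the config (URLs, token, org, and bucket are placeholders):

```toml
[agent]
  interval = "1h"  # poll once per hour

[[inputs.http]]
  urls = ["https://example.com/api/metrics"]  # placeholder endpoint
  data_format = "json"  # parse the JSON response into fields

[[outputs.influxdb_v2]]
  urls = ["http://localhost:8086"]
  token = "$INFLUX_TOKEN"
  organization = "my-org"
  bucket = "staging-api"
```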