I have just started my InfluxData journey, so I am still very new. I have some time series that I am scraping from a few web APIs. The goal of my workflow is the following:
- Every hour, trigger GET calls to several web APIs and receive JSON responses.
- Convert the JSON responses into a tabular format.
- Write the result into a staging time-series bucket for each API.
- Use Flux to aggregate these staging buckets and write the results into a final, refined bucket.
- Run various analytics or ML models on that final bucket to gain insight.
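To make the conversion step concrete, here is a rough sketch of what I have in mind for flattening one nested JSON record into a single tabular row (the payload and field names below are made up purely for illustration):

```python
import json

def flatten(record, parent_key="", sep="."):
    """Recursively flatten a nested JSON object into one flat row dict."""
    row = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            row.update(flatten(value, full_key, sep))
        else:
            row[full_key] = value
    return row

# Hypothetical API payload, for illustration only.
payload = json.loads(
    '{"timestamp": "2024-01-01T00:00:00Z", "sensor": {"id": "abc", "temp": 21.5}}'
)
row = flatten(payload)
print(row)
# {'timestamp': '2024-01-01T00:00:00Z', 'sensor.id': 'abc', 'sensor.temp': 21.5}
```

Each flattened row would then become one point in the staging bucket, keyed by its timestamp.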
My question is twofold:
- Do InfluxDB buckets accept JSON records as-is, or do we need to preprocess each JSON record further before it will be accepted? For example, turning it into a tabular row with the timestamp as the primary key. The docs aren’t very clear about whether the bucket has any schema requirements.
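  For context on what I mean by preprocessing: my current understanding is that InfluxDB's write endpoint expects line protocol (`measurement,tag_set field_set timestamp`) rather than raw JSON, so I assume each record has to be serialized into something like the sketch below (escaping rules are simplified, and the measurement/tag/field names are my own invention):

  ```python
  def to_line_protocol(measurement, tags, fields, timestamp_ns):
      """Serialize one record into InfluxDB line protocol:
      measurement,tag_set field_set timestamp
      (Escaping of spaces/commas/quotes is omitted for brevity.)"""
      tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
      field_parts = []
      for k, v in sorted(fields.items()):
          if isinstance(v, bool):
              field_parts.append(f"{k}={str(v).lower()}")
          elif isinstance(v, str):
              field_parts.append(f'{k}="{v}"')
          else:
              field_parts.append(f"{k}={v}")
      return f"{measurement},{tag_str} {','.join(field_parts)} {timestamp_ns}"

  line = to_line_protocol(
      "api_scrape", {"source": "weather"}, {"temp": 21.5}, 1704067200000000000
  )
  print(line)
  # api_scrape,source=weather temp=21.5 1704067200000000000
  ```

  Is that roughly right, or do the client libraries handle this conversion for me?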
- Can I use any Influx tool, such as the Telegraf agent or Kapacitor, to deploy my Python (or Flux) script so that it kicks off these hourly API calls into InfluxDB? Or do I have to use my own job scheduler (e.g., SSIS) to kick off my script, which then calls the Influx API to “write” to the DB?
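In case it helps clarify the second question, this is roughly what the external-scheduler route would look like on my end: a plain-Python job that POSTs line protocol to the v2 `/api/v2/write` endpoint. The URL, org, bucket, and token below are placeholders for illustration, and I have split out the request-building into its own function:

```python
import urllib.parse
import urllib.request

def build_write_request(base_url, org, bucket, token, lines):
    """Build a POST request for InfluxDB v2's /api/v2/write endpoint.
    base_url, org, bucket, and token are placeholders, not real values."""
    query = urllib.parse.urlencode(
        {"org": org, "bucket": bucket, "precision": "ns"}
    )
    body = "\n".join(lines).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v2/write?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
    )

req = build_write_request(
    "http://localhost:8086", "my-org", "staging-bucket", "my-token",
    ["api_scrape,source=weather temp=21.5 1704067200000000000"],
)
print(req.full_url)

# The actual network call would only happen inside the scheduled job, e.g.:
# if __name__ == "__main__":
#     urllib.request.urlopen(req)
```

If Telegraf or Kapacitor can run this kind of script on an hourly trigger for me, I would much rather stay inside the Influx ecosystem than bring in SSIS.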