Hello there,
I am working on a project where we are collecting large-scale IoT data from hundreds of sensors deployed in industrial equipment. The data streams in real time, with each sensor generating metrics like temperature, pressure, and vibration at sub-second intervals. Naturally, this has led us to choose InfluxDB for its time-series capabilities and high write throughput.
However, as the scale of the data grows, I am starting to face challenges with efficient data management, querying, and storage optimization. Here are a few specific questions I am grappling with:
What is the best way to configure retention policies to ensure old data is automatically removed without affecting system performance? Would it make sense to create separate retention policies for different sensor groups based on their criticality?
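For context, here is roughly what I have in mind, assuming an InfluxDB 1.x database named "iot_metrics" and a split into critical and non-critical sensor groups (all names are placeholders, not our real schema):

    -- keep critical sensor data for about a year, everything else for 30 days
    CREATE RETENTION POLICY "critical_1y" ON "iot_metrics" DURATION 52w REPLICATION 1
    CREATE RETENTION POLICY "standard_30d" ON "iot_metrics" DURATION 30d REPLICATION 1 DEFAULT

Is this the right general approach, or does having several retention policies per database create problems of its own at scale?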
I understand that continuous queries can be used for downsampling. Are there any recommended practices for determining the right balance between granularity and storage savings?
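To make that question concrete, this is the kind of continuous query I have been sketching out (measurement and field names are placeholders): it rolls the raw sub-second readings up into 1-minute means that land in the longer-lived retention policy.

    CREATE CONTINUOUS QUERY "cq_downsample_1m" ON "iot_metrics"
    BEGIN
      SELECT mean("temperature") AS "temperature",
             mean("pressure") AS "pressure",
             mean("vibration") AS "vibration"
      INTO "critical_1y"."sensor_data_1m"
      FROM "sensor_data"
      GROUP BY time(1m), *
    END

Would 1-minute means be a sensible first rollup for vibration-style data, or do people usually keep a finer intermediate level as well?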
Are there specific guidelines or strategies for sharding and indexing when dealing with high-frequency data from a large number of sensors?
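On the sharding side, the main knob I am aware of is the shard group duration on each retention policy, something along these lines (durations picked arbitrarily), but I am not sure how to size it for hundreds of high-frequency sensors:

    -- shorter shards for short-lived raw data, longer shards for downsampled history
    ALTER RETENTION POLICY "standard_30d" ON "iot_metrics" SHARD DURATION 1d
    ALTER RETENTION POLICY "critical_1y" ON "iot_metrics" SHARD DURATION 7d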
Also, I have gone through this post: https://www.influxdata.com/resources/deploy-monitor-and-manage-your-iot-devices-salesforce-marketing, which definitely helped me out a lot.
I am also curious about best practices for integrating InfluxDB with visualization tools like Grafana, particularly when dealing with high-cardinality data. Are there specific configurations that optimize dashboard performance?
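For reference, a typical panel query on our side looks something like the following (using Grafana's $timeFilter and $__interval macros with an InfluxQL data source, and a template variable for the sensor id, all hypothetical names). I am wondering whether there is a better pattern when the sensor_id tag has very high cardinality:

    SELECT mean("temperature")
    FROM "critical_1y"."sensor_data_1m"
    WHERE "sensor_id" =~ /^$sensor_id$/ AND $timeFilter
    GROUP BY time($__interval) fill(null)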
Thanks in advance for your help.