I have started using InfluxDB to collect system and HTCondor job scheduler metrics in our cluster, which will grow to a few hundred compute nodes (plus about 6 servers that we also want to monitor). Monitoring is done with Grafana and collection with Telegraf. I am currently collecting the basic network and system metrics with Telegraf, plus roughly 6 to 12 different metrics from HTCondor.
The work is currently being done in the lab, but I would like to understand how to determine how much storage space the InfluxDB database will need, so that I can plan services for when we go into production. I am not yet sure how long we must retain the data; it could be 3 months, 6 months, or 1 year. Any guidance would be greatly appreciated!
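
For reference, this is the kind of back-of-envelope arithmetic I have in mind. It is only a sketch: the function name, the 300-node count, the 100 fields per node, the 10-second collection interval, and the 3 bytes per stored value are all placeholder assumptions, not figures measured on our setup. The idea is simply that storage should scale roughly as nodes × fields per node × samples per day × retention days × bytes per value.

```python
SECONDS_PER_DAY = 86_400


def estimate_storage_bytes(nodes, fields_per_node, interval_s,
                           retention_days, bytes_per_point):
    """Rough on-disk size estimate for a time-series database.

    All inputs are assumptions supplied by the caller; index overhead,
    tag cardinality, and compaction behaviour are deliberately ignored.
    """
    samples_per_day = SECONDS_PER_DAY / interval_s
    total_points = nodes * fields_per_node * samples_per_day * retention_days
    return round(total_points * bytes_per_point)


if __name__ == "__main__":
    # Placeholder inputs (assumed, not measured): 300 nodes, 100 fields
    # per node, one sample every 10 s, 3 bytes per value after compression.
    for days in (90, 180, 365):
        size = estimate_storage_bytes(300, 100, 10, days, 3)
        print(f"{days:>3} days: {size / 1024**3:.1f} GiB")
```

What I do not know is whether the bytes-per-value guess is realistic for Telegraf and HTCondor data, so I would presumably need to measure actual disk growth in the lab over a few days and plug that in instead.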