How to monitor the disk space used by InfluxDB databases?

I have two databases in InfluxDB:

Database #1 = the internal InfluxDB database (named _internal by default)
Database #2 = data inserted by Telegraf (called telegraf).

I am simply trying to monitor the size of the /data/ and /wal/ directories on the disk.

I obtain these readings in two different ways, but they do not agree with each other:

  1. Using the diskBytes field from the influxdb_shard metric of the InfluxDB input plugin in Telegraf, described here, and as recommended in the docs.

  2. Using a small bash script to grab the sizes of the /data/ and /wal/ directories using the du command, and executing it with the exec input plugin of Telegraf. Here is the script:

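#!/bin/bash
# Emit a JSON array with the size in bytes of each directory passed as an argument.
echo "["
du -s -B1 "$@" | awk '{if (NR!=1) {printf ",\n"}; printf "  { \"dir_size_bytes\": %s, \"path\": \"%s\" }", $1, $2}'
echo
echo "]"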

Now, as far as I understand, the influxdb_shard metric gives the sum of the /wal/ and /data/ directories, with one value per database (the database name is used as a tag). The result of du, on the other hand, sums data for both databases and wal for both databases (because they both sit in one directory).

Therefore, to get the total disk space used, I plot the following two sums in Grafana:

  1. du /data/ + du /wal/
  2. influxdb_shard (for database _internal) + influxdb_shard (for database telegraf)
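
A minimal sketch for breaking the du figures down per database, so they can be compared one-to-one with the per-database influxdb_shard values. It assumes GNU du; the function name db_bytes, the INFLUX_ROOT variable, and the /var/lib/influxdb default are assumptions to adjust for the actual install. It prints both the allocated size (what du reports by default) and the apparent size (the sum of file lengths), since the two can differ. Which of the two diskBytes corresponds to is not confirmed here.

```shell
#!/bin/bash
# Sketch: per-database disk usage (data + wal) for comparison with influxdb_shard.
# The directory layout <root>/data/<db> and <root>/wal/<db> is an assumption.

# Print "<allocated_bytes> <apparent_bytes>" summed over the given directories.
db_bytes() {
  local allocated apparent
  # -c appends a grand total as the last line; keep only its first (tab-separated) field.
  allocated=$(du -sc -B1 "$@" | tail -n 1 | cut -f1)
  apparent=$(du -sc -B1 --apparent-size "$@" | tail -n 1 | cut -f1)
  echo "$allocated $apparent"
}

root="${INFLUX_ROOT:-/var/lib/influxdb}"   # assumed default; adjust to your install
for db in _internal telegraf; do
  # Skip databases whose directories are not present under the assumed root.
  if [ -d "$root/data/$db" ] && [ -d "$root/wal/$db" ]; then
    echo "$db $(db_bytes "$root/data/$db" "$root/wal/$db")"
  fi
done
```

Each output line has the form "database allocated_bytes apparent_bytes", which can be set next to the diskBytes sum for the same database tag.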

Why don’t these two things agree with each other?

Surely this is a common thing to do, and someone can help?

I use an InfluxDB scraper and get the size of each of my buckets. No idea whether that will differ from what you do.

Create an InfluxDB scraper | InfluxDB OSS v2 Documentation (influxdata.com)

from(bucket: "Scrapper")
  |> range(start: -30s)
  |> filter(fn: (r) => r["_measurement"] == "storage_tsm_files_disk_bytes")
  |> aggregateWindow(every: 30s, fn: last, createEmpty: false)
  |> last()

I am using InfluxDB 1.8.