Cronjob to store data in Influxdb

Hi,
Unfortunately databases are something I know very little about, so please excuse my questions if they appear too simple.
I’ve set up an InfluxDB on my Pine64 PineHAB server to store monitoring data (which it successfully does, and I can view it in Chronograf). My concern is that over time I’ll overfill my 16GB micro SD and kill the server.
To try and avoid this I want to keep track of the available space and store it in my InfluxDB (say once a day) to keep an eye on it through something like Chronograf or Grafana.
I assume somehow combining a df command (e.g. df /dev/mmcblk0p2) with a daily cronjob to store the % remaining to influxdb may be the way to go.
Unfortunately I have no idea how to do any of that.
If anyone can help with that or has a better suggestion that would be great.
Thanks and regards,
KISA

There are a number of ways to go about this.

You could run Telegraf on your Pine64 using the Disk Input Plugin, and then specify in the Agent Configuration that you only want Telegraf to collect data every 24 hours. The relevant configuration options for this are interval, round_interval, and flush_interval, which are documented in the agent configuration link above.
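For reference, the relevant parts of telegraf.conf might look something like this (a sketch only — paths, database name, and intervals are assumptions to adjust for your setup):

```toml
# Sketch: gather disk stats once a day and write them to InfluxDB
[agent]
  interval = "24h"        # how often inputs are collected
  round_interval = true   # align collections to the interval boundary
  flush_interval = "30s"  # how often buffered metrics are flushed to outputs

[[inputs.disk]]
  mount_points = ["/"]    # only collect for the root filesystem

[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "telegraf"
```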

Alternatively, you could use a cronjob. How you go about this really depends on what languages and tools you’re comfortable with. You would need to write a script of some kind, which you could then configure cron to execute periodically. Bash, Python, Ruby, or Go would all be appropriate language choices for this kind of task, but there are other choices as well. You’ll find countless examples of this kind of script if you search Google for “cron disk space” or something similar. Your script could then write the data directly to InfluxDB over the network (maybe using curl if it’s a bash script, or using one of the available client libraries for Influx), or it could write data to a file which could then be read by Telegraf’s tail plugin.

Great, thanks for that. I’ll look over your suggestions and see which is easiest for me.

Just to clarify, can writing data to InfluxDB really be as easy as using curl?

InfluxDB has an HTTP API, so it’s easy to use curl. If you look at the Writing data with the HTTP API guide, all of the examples use curl.
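For instance, a single write over the HTTP API could look like this (a sketch assuming InfluxDB 1.x listening on localhost:8086 and a database named "test"; the curl line is left commented so the snippet is harmless without a server running):

```shell
# Build one point of line protocol and POST it to the /write endpoint.
pctfull=84
line="mystats,id=fsroot pctfull=${pctfull}i"
echo "$line"
# Uncomment once InfluxDB is reachable:
# curl -i -XPOST 'http://localhost:8086/write?db=test' --data-binary "$line"
```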

Awesome, thanks again.

It could be done the way you suggested, which would be kind of cool and instructive.

so first we run df and get something like this

```
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/root       98846220 78445048  15364732  84% /
devtmpfs         6142456        0   6142456   0% /dev
tmpfs            6145276        0   6145276   0% /dev/shm
tmpfs            6145276     9772   6135504   1% /run
tmpfs               5120        0      5120   0% /run/lock
tmpfs            6145276        0   6145276   0% /sys/fs/cgroup
tmpfs            1229056        0   1229056   0% /run/user/1000
```

so then we just get the filesystem we are interested in with grep root

```shell
df | grep root
```

and we get this:
```
/dev/root 98846220 78446448 15363332 84% /
```

then we split the line into two pieces using the percent sign as the delimiter

```shell
df | grep root | cut -f 1 -d "%"
```

and we get this:
```
/dev/root 98846220 78447872 15361908 84
```

then we use awk to grab just the number we want, which is the percentage used at the end

```shell
df | grep root | cut -f 1 -d "%" | awk '{ printf $5; }'
```

and that gives me 84 (because my disk is 84% used)
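As an aside, if the df on your Pine64 is GNU coreutils, there’s a slightly sturdier way to grab just that number (grep root can match more than one line on some systems):

```shell
# GNU df can print just the Use% column for a single filesystem;
# tail drops the header line and tr strips spaces and the % sign.
pct=$(df --output=pcent / | tail -n 1 | tr -d ' %')
echo "$pct"
```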

now we want to format that in line protocol… which would look like this

```
insert mystats,id=fsroot pctfull=84i
```

that means insert into a measurement “mystats” (created if it doesn’t exist) a reading with the id “fsroot” (the idea being that you could support additional statistics or filesystems in the same measurement)

84 is the value we extracted and 84i means force this to be an integer
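To illustrate the “additional filesystems” idea, a second filesystem would just be another series under the same measurement, distinguished by the tag (fshome and 42 are made-up examples here):

```
mystats,id=fsroot pctfull=84i
mystats,id=fshome pctfull=42i
```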

we put all of this into a shell script (bash) so that we can more easily manipulate the construction of the strings (I cannot upload yet), but here’s what it looks like (the formatting may remove some important characters)

```shell
#!/bin/bash
# cute script to store disk full into InfluxDB
# for KISA by Frank Inselbuch July 6, 2018 and placed into the public domain with no restrictions

pctfull=$(df | grep root | cut -f 1 -d "%" | awk '{ printf $5; }')
influx -database 'test' -execute "insert mystats,id=fsroot pctfull=${pctfull}i"
```

you can create that file by copying the text above, then typing this at the command line (paste the script, then press Ctrl-D to finish)

```shell
cat > mystats.sh
```

then you will need to make the file executable with the following command

```shell
chmod +x mystats.sh
```

this is what I have in my database so far

```
> select * from mystats
name: mystats
time                           id     pctfull
----                           --     -------
2018-07-06T18:58:52.525066566Z fsroot 84
2018-07-06T19:05:39.001033954Z fsroot 84
2018-07-06T19:06:44.482074596Z fsroot 84
2018-07-06T19:06:45.685583125Z fsroot 84
2018-07-06T19:06:46.158353308Z fsroot 84
2018-07-06T19:06:46.517829397Z fsroot 84
```

Now we just need to schedule this… perhaps hourly… using cron

I do this in the system-wide crontab as root

```shell
sudo vi /etc/crontab
```

and I added this line:

```
0 * * * * root /home/frank/mystats.sh
```

that means it runs on minute 0 of every hour of every darn day… it runs as root
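If you’d rather not edit the system-wide crontab, a per-user crontab (edited with crontab -e) works too; note that format has no user column:

```
0 * * * * /home/frank/mystats.sh
```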

let me know if you have troubles


That’s awesome, thank you very much.