First time to influx, am I doing this correctly?

trk204 · October 26, 2022, 3:58pm

Hey everyone, first time over here playing with InfluxDB. I have 2.4 running inside a Docker container and it all seems to be humming along.

My goal for this is to be able to glean some information/trends on the 1000s of files we ingest every day. We are pulling in a ton of different weather data/images from various sources, and I would like to be able to track the ingest rate of specific types of files, and also track the delta between the ingest time and the image creation time (amongst other items as well once I get going).

I did a bit of reading/youtubing, saw someone write out a schema design, and decided to try it for my data as well. Here is what I came up with.

database = image_stats
measurements = images 
tags  
--->host       ==  hostname of system (multiple systems to monitor)
--->source   ==  source of product  (may or may not be used)
--->product  ==  type of product
--->site         ==  site for product
--->type0      ==  product sub type 1  (may or may not be used)
--->type1      ==  product sub type 2  (may or may not be used)
fields 
--->issuetime (in epoch seconds)
--->timestamp (in epoch seconds)
--->size (file size)
precision=s

What I would like to be able to do eventually is say, for host X, how many images have come in over the last 10m/30m/1h for product Y on site Z (drilling down to sub types if needed).

Or, for host X, product Y, site Z, what is the average difference between ingesttime and timestamp over the last 10m/30m/1h, to monitor the lag between creation and ingest.
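
To make the two questions above concrete, here is a minimal Python sketch of the arithmetic, run over the three sample points from this post. It assumes the point time is the ingest time and the `timestamp` field is the image creation time; the function name `window_stats` is made up for illustration and is not part of any InfluxDB API.

```python
from statistics import mean

# (point_time, tags, fields) taken from the sample line protocol in this post.
# Assumption: point_time is the ingest time, fields["timestamp"] is the creation time.
points = [
    (1666798877, {"host": "testbox.local", "product": "URP", "site": "CASAG"},
     {"timestamp": 1666798560}),
    (1666798908, {"host": "testbox.local", "product": "URP", "site": "CASAG"},
     {"timestamp": 1666798560}),
    (1666798999, {"host": "testbox.local", "product": "URP", "site": "XNI"},
     {"timestamp": 1666798800}),
]


def window_stats(points, now, window_s, **tag_filter):
    """Return (count, mean lag in seconds) for points that match every
    tag in tag_filter and arrived within the last window_s seconds."""
    lags = [
        t - fields["timestamp"]
        for t, tags, fields in points
        if now - window_s < t <= now
        and all(tags.get(k) == v for k, v in tag_filter.items())
    ]
    return len(lags), (mean(lags) if lags else None)


# Last 10 minutes for host/product/site, evaluated at a fixed "now".
count, avg_lag = window_stats(
    points, now=1666799000, window_s=600,
    host="testbox.local", product="URP", site="CASAG",
)
print(count, avg_lag)
```

Because host, product and site are tags, both numbers reduce to a filter on tags plus a count or a mean, which is what the schema is meant to make cheap.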

Data that we get generally takes the form of

URP WMB CAPPI 1.0 AGL MPRATE SNOW 
URP WMB CAPPI 1.5 AGL MPRATE RAIN 
URP CASAG CAPPI 1.5 AGL MPRATE RAIN 
URP CASAG CAPPI 1.0 AGL MPRATE SNOW
URP XNI ECHOTOP 2.0 100M AGL 78 N 
URP XNI MAXR 2.0 AGL MPRATE
FOTO Composite North-America VIS-Red
FOTO Composite North-Atlantic WV-Lower

I’ve written a parser that takes a hook from our ingest process and am able to produce workable line protocol data. The parser is in Perl (I need some internal libs to work out some metadata on the images), so I’m using the InfluxDB::LineProtocol module to generate the output. I haven’t quite figured out how to connect to Influx from Perl yet, so I’m dumping to a file and manually importing for now while testing. My Perl is dusty.

example line protocol

images,host=testbox.local,product=URP,site=CASAG,source=XXX,type0=CAPPI,type1=1.0 issuetime=1666798832i,size=37124i,timestamp=1666798560i 1666798877
images,host=testbox.local,product=URP,site=CASAG,source=XXX,type0=CAPPI,type1=1.5 issuetime=1666798848i,size=33117i,timestamp=1666798560i 1666798908
images,host=testbox.local,product=URP,site=XNI,type0=ECHOTOP,type1=2.0 issuetime=1666798971i,size=18145i,timestamp=1666798800i 1666798999
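
For readers who want to see how lines like these are assembled, here is a rough Python sketch (not the Perl parser described above). The helper names `escape_tag` and `to_line` are invented for illustration, and the sketch assumes every field is an integer and that the timestamp is in seconds. Unused tags are simply left out, as in the third sample line.

```python
def escape_tag(value):
    """Backslash-escape the characters line protocol treats as special
    in tag keys and tag values: comma, equals sign and space."""
    text = str(value)
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, fields, ts):
    """Build one line protocol record.

    Assumptions for this sketch: all fields are integers (written with
    the `i` suffix) and ts is an epoch timestamp in seconds.
    """
    tag_part = ",".join(
        f"{escape_tag(k)}={escape_tag(v)}"
        for k, v in sorted(tags.items())
        if v not in (None, "")  # skip tags that are not used for this product
    )
    field_part = ",".join(f"{k}={int(v)}i" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {int(ts)}"


line = to_line(
    "images",
    {"host": "testbox.local", "product": "URP", "site": "XNI",
     "source": None, "type0": "ECHOTOP", "type1": "2.0"},
    {"issuetime": 1666798971, "size": 18145, "timestamp": 1666798800},
    1666798999,
)
print(line)
```

Escaping matters for products such as the FOTO composites, where a value like "Composite North-America" contains a space.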

So basically I’m just curious whether I’ve set up my database correctly for what I’m looking to do, and whether it’s safe to move on to figuring out Flux to try and get something useful out of this data.

Thanks!

scott · October 26, 2022, 10:59pm

Your schema design looks great! It will certainly work for your use case.
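
As a starting point for the Flux side, here is a hedged sketch of what the "how many images in the last 10m" query might look like, written as a small Python helper that only builds the query text. It assumes the data lives in a bucket named `image_stats` and counts points via the `size` field; `count_query` is a made-up name, and the values are interpolated without any escaping, so it is only suitable for trusted input.

```python
def count_query(bucket, window, host, product, site):
    """Return a Flux query counting `images` points for one
    host/product/site over the given window (e.g. "10m", "1h").

    Assumptions: the bucket name and the choice of the `size` field as
    the one to count are guesses based on the schema in this thread.
    """
    return (
        f'from(bucket: "{bucket}")\n'
        f'  |> range(start: -{window})\n'
        f'  |> filter(fn: (r) => r._measurement == "images" and r._field == "size")\n'
        f'  |> filter(fn: (r) => r.host == "{host}" and r.product == "{product}" '
        f'and r.site == "{site}")\n'
        f'  |> group()\n'
        f'  |> count()'
    )


query = count_query("image_stats", "10m", "testbox.local", "URP", "CASAG")
print(query)
```

The printed text can be pasted into the Data Explorer script editor. The empty `group()` merges the separate type0/type1 series into one table so that `count()` returns a single number; dropping it would give one count per sub type instead.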
