Hi all, I am a complete newbie trying to figure out the best way to design a schema for a dataset that looks as follows:
Product ID: A character string with 8 characters
Product Type: L, M or H
Air temperature [K]:
Process temperature [K]:
Rotational speed [rpm]:
Torque [Nm]: torque values are normally distributed around 40 Nm with σ = 10 Nm and no negative values.
Tool wear [min]: in minutes
Machine failure: a label that indicates whether the machine has failed at this particular data point, i.e. whether any of the following failure modes is true.
Tool wear failure (TWF): Boolean 0 or 1
Heat dissipation failure (HDF): Boolean 0 or 1
Power failure (PWF): Boolean 0 or 1
Overstrain failure (OSF): Boolean 0 or 1
Random failures (RNF): Boolean 0 or 1
So would the schema look something like this in line protocol?
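One possible shape, as a rough Python sketch that renders a single row as a line protocol string. The measurement name `machine_data`, the key names, the sample values and the timestamp are all assumptions made up for illustration (the dataset itself has no timestamp column). Treating `product_id` and `product_type` as tags and everything else as fields is just one option; whether `product_id` belongs in a tag depends on how many distinct values it has.

```python
# Sketch only: measurement name, key names, sample values and timestamp are assumptions.

def escape_tag(value: str) -> str:
    """Backslash-escape commas, equals signs and spaces in a tag value."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field(value) -> str:
    """Format a field value using line protocol type conventions."""
    if isinstance(value, bool):      # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"           # integers carry a trailing "i"
    if isinstance(value, float):
        return repr(value)           # floats are written without a suffix
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'            # strings are double-quoted


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build: measurement,tag=... field=... timestamp"""
    tag_part = ",".join(f"{k}={escape_tag(str(v))}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={format_field(v)}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


# Made-up example row, not taken from the dataset.
row_tags = {"product_id": "L0000001", "product_type": "L"}
row_fields = {
    "air_temp_k": 298.1,
    "process_temp_k": 308.6,
    "rotational_speed_rpm": 1551,
    "torque_nm": 42.8,
    "tool_wear_min": 0,
    "machine_failure": False,
    "hdf": False,  # the remaining failure flags would follow the same pattern
}

line = to_line_protocol("machine_data", row_tags, row_fields, 1700000000000000000)
print(line)
```

Tags are indexed and always stored as strings, while fields keep their type, which is why the numeric readings and failure flags are fields here.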
I found this webinar from InfluxDB about Schema Design for IoT extremely helpful. Watch it all the way through, then think about your own data. Write down your field names and tag names, then go back and watch the video again to check that your choices still make sense.
Just as an aside, your data is apparently related to machinery. Do you have several types of machines, or just one? If you have more than one, and you are monitoring each, then you could assign each an Equipment ID. Depending on your setup, you may have something like this:
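As a rough illustration of the Equipment ID idea, here is a small Python sketch that writes one line per machine with `equipment_id` as a tag. The IDs, the measurement name `machine_data`, the readings and the timestamp are all made up for the example, not taken from your data.

```python
# Sketch only: equipment IDs, measurement name and readings are hypothetical.
readings = [
    {"equipment_id": "EQ-001", "torque_nm": 41.2, "rotational_speed_rpm": 1500},
    {"equipment_id": "EQ-002", "torque_nm": 38.7, "rotational_speed_rpm": 1420},
]


def reading_to_line(reading, timestamp_ns):
    """Render one reading as line protocol, tagged by equipment_id."""
    return (
        f"machine_data,equipment_id={reading['equipment_id']} "
        f"torque_nm={reading['torque_nm']},"
        f"rotational_speed_rpm={reading['rotational_speed_rpm']}i "
        f"{timestamp_ns}"
    )


# Same assumed timestamp for both machines; the tag keeps the series separate.
lines = [reading_to_line(r, 1700000000000000000) for r in readings]
for entry in lines:
    print(entry)
```

Because each machine gets its own tag value, points from different machines at the same timestamp land in different series and do not overwrite each other.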
Hello @grant1, thank you, this is really helpful! This is actually a synthetic dataset, the AI4I 2020 Predictive Maintenance Dataset from the UCI repository - https://archive.ics.uci.edu/ml/dataset/AI4I+2020+Predictive+Maintenance+Dataset
It’s supposed to represent one kind of machine, and in my opinion product id is a misnomer. I’m essentially using the product id as a unique key, but maybe that’s not the right thing to do and I should generate an equipment id as you point out.
Thanks for the link to the webinar, I will watch it.
Best,
Chait