Project: ESP32 -> MQTT -> Node-Red -> InfluxDB -> Grafana

Apologies for the complexity of this topic, but I need specific help and, perhaps, guidance as to the overall strategy. The general trajectory of this project is to collect sensor data from:

  • ESP32 linked to sensors
  • Zigbee sensors

and display the values eventually in Grafana. A Raspberry Pi 5 is being used to run Mosquitto, Node-Red and Influxdb, eventually Grafana further down the line.

The ESP32 Units are standalone and work well. I can publish data from controller ESP32’s to an on-board web server using AsyncWebServer.h library in Adrduino IDE / Platformio (depends on the size of the unit test project). The data is displayed in “Cards” and also charts using “Highcharts”. I’m very happy with that deployment.

Zigbee to MQTT is a separate unit for now and can be put into touch.

The data pathway vexing me for quite some days, okay, weeks, is the route in the title. I want to store more permanent data looking back over months rather than the time-limited window coded into the Async Webserver (about 30 mins, refreshed from t=0 with each new browser instance). Highcharts is a delight to use.

ESP32 → MQTT (solved) - I can prepare any format for the data including JSON should that be required. The data from a single sensor is published to MQTT at any frequency I choose.

I can use a public broker (EMQX) as well as a MQTT broker (Eclipse-Mosquitto) running on the Raspberry Pi. Currently Portainer, MQTT, Node-Red, Influxdb all installed in docker containers.

I can intercept data using MQTTX client with either server confirming data is getting through to that point.

Node-Red on the Raspberry Pi was an easy install and easy to get working. MQTT input nodes intercept topics correctly and a debug node displays the values correctly.
Currently the sensor data is sent from the ESP 32 to the MQTT as a comma-delimited packet:

  • topic (string), value (string)

I’m pretty new to all of this (just a few weeks exposure to MQTT/Node-Red, etc., but the format isn’t a million miles away from what I understand to be JSON format. (Still much to learn there).
Node-Red MQTT-In node shows:

  • 04/02/2024, 14:38:14node: debug 40
    some_prefix/tempDHW_Flow : msg.payload : string[5]
    “78.75”

An intermediate function node (change/replace) strips out the topic prefix to yield:

  • 04/02/2024, 14:57:10node: debug 41
    tempDHW_Flow : msg.payload : string[5]
    “78.75”

An intermediate Change node transforms the value (as string) to a number:

  • 04/02/2024, 15:02:19node: debug 42
    tempDHW_Flow : msg.payload : number
    78.75

I am happy that data gets through to Node-Red and that the Function and Change nodes work as anticipated. (those interventions seem logical to me, but…)

I have installed Influxdb and have followed various recipes/gists and can enter data manually and display the results in a graph. I’m happy that influxdb is working (I’ve tried to install Influxdb 1.8 as a precursor to upgrading to influx db 3 so I can use InfluxQL rather than Flux, but I seem always to get v2.7.5, but that’s a separate issue to be solved some other time).

Here’s the stumbling block. I cannot get data out of Node-Red and into Influxdb. This topic is either for the Node-red Community or Influxdb, but I am reluctant to duplicate the posts on both fora. So trying here first…

The final node is an Influx Out node. No data appears in the InfluxDb database. And beyond that I am quite lost as to what to try/resolve next.

This topic is already lengthy so I am happy to post more specific data about the nodes and/or Influxdb configuration upon request to help isolate the issue(s).
Thanks, Ric

Hi Ric,

I’ve been down a similar path although for ¯_(ツ)_/¯ reasons, I didn’t use MQTT.

There are a couple of things to watch out for when using node-RED and InfluxDB. I should start by saying that all of the below is my personal opinion based on my particular set of skills and (probably more importantly) limitations. Folks that are more database/JavaScript/IoT/Linux experienced undoubtedly would have a different answer.

I should also say that I’m a huge fan of Influx. It’s been a fantastic tool. So anything that sounds negative in this post should be taken within that context i.e. I am not trying to dump on the product.

So in no particular order:

  1. Check out Getting Started - IOTstack. This is a Dockerized setup that is super simple to use and contains just about every software package you could possibly need, from Grafana and Influx to node-RED and Mosquitto. I’m no longer using a Docker-based system, but this project saved me enormous amounts of time and anguish when I was getting started. Once you go down the IoTStack route, you’ll find much of the tricky configuration is taken care of for you.

  2. If you use IOTStack, it becomes easy to install influxdb 1.8. You can simply select it from the setup menu. This, in my opinion, is a vital step.

  3. InfluxDB v2 had lots of new hotness, but I personally found the setup and use incredibly painful in comparison to v1.8. I found it had a much steeper learning curve. There are also a lot more beginner-friendly tutorials online for v1.8. For your use case, I would strongly recommend not using v2.

  4. Related to 1,2, and 3, take a look at videos by Andreas Spiess, particularly this one: https://youtu.be/a6mjt8tWUws?si=zFJJZRjKa4YLKpjd . Andreas is a fantastic resource.

  5. Within node-RED, you need to add the InfluxDB plugin. This is done using the node-RED Palette Manager (top right hamburger menu). Just search for InfluxDB.

  6. Once you add the plugin, you will have a collection of read/write nodes for InfluxDB. If you are writing multiple bits of data, the “Batch write” node is the one to use. You will need to configure it with the name, hostname, language etc of your Influx database. However, since everything is happening inside Docker, this is fairly easy. I think Andreas walks through it too.

  7. Spend some time understanding how tags work in InfluxDB. As I understand it, one of the key differences is that tags are very efficient to query with. For example, let’s say you have a temperature measurement from a sensor, and the sensor has a tag of “ID” for an ID/serial number, and the serial number of the device is “123”. Running queries that return data where ID = “123” will be very fast. Running a query on sensors at 20.1ºC would be very slow in comparison. Equally, imagine you have a bunch of sensors reading things from a single machine e.g. a server. You might want to tag all of those readings with the server name so you can quickly query everything for that machine.

  8. Finally, format the data from the sensors into a JSON object that has measurements and tags etc, and connect it to the InfluxDB batch write node. Then you should be all set. Here’s an example JSON payload for reference:

[
  {
    "measurement": "thingmeasurement",
    "fields": {
      "thing1": 12.6,
      "thing2": 38,
      "thing3": 1020.5
    },
    "tags": {
      "ID": "123",
      "SomeGroup": "1",
      "SomeTag": "Foo"
    },
    "timestamp": "2024-02-06T17:47:12.000Z"
  },
  {
    "measurement": "thing2measurement",
    "fields": {
      "thing21": 45,
      "thing22": 4992,
      "thing23": 0,
      "thing24": "somestring"
    },
    "tags": {
      "ID": "123",
      "SomeGroup": "1",
      "SomeTag": "Foo"
    },
    "timestamp": "2024-02-06T17:47:12.000Z"
  },
]

This would write three values (thing1, thing2 and thing 3) to the measurement “thingmeasurement”, and four values to the measurement “thing2measurement”. All of the data would be associated with the tags ID, SomeGroup and SomeTag.

I find it is sometimes helpful to think of measurements like traditional database tables, but that analogy only goes a very short distance before it gets you into trouble.

I hope that all helps. Good luck.

-Nick

Well, now. What an unexpected, interesting and helpful email.
Thank you Nick!

I have been involved on-and-off for many years in IT but mostly my career has been teaching pure maths, physics, etc. So I get rusty and find myself soon left behind with recent software developments. Yes, almost all of what I’m trying to do is new to me, but I’m not a bad learner and although it now takes longer to get to grips with stuff, I can usually get there.

I’m a great fan of MySql and similar databases. I find them simple, straighforward and easy to get up-and running. I have spent many an hour (plural, many, many hours) trying to work with influxDB and for some time had the feeling I’d landed on Mars with locals wondering why I couldn’t grasp Martian. Or through the Looking Glass trying to think 6 impossible things before breakfast.

I came across a website where I found I could load data manually into InfluxDB and the resulting graph was very impressive. That was the main reason for sticking it out. The journey you describe is one I’ve already largely undertaken. Here are some very brief comments to your points:

Yes, I’ve used IOTstack, having first come across it via GitHub; the same as your link (1). It floored me: whilst most packages installed correctly, InfluxDB would not. (see my post here: Login to new Influxdb 2 installation). I also didn’t like the fact that Portainer, one of the first packages to get up and running, couldn’t take control of everything because stacks, volumes, etc, had been created elsewhere. In the end I went for a CLI install of InfluxDB in Docker; all the other packages are still via IOTstack. Installing InfluxDB has consistently installed version 2.7.5 irrespective of whether I choose InfluxDB or InfluxDB2 in IOTstack.

(2) I installed InfluxDB2 without understanding the implications, simply on the naive belief that newer was better. Flux=Martian (at least to me). And (3), yes, I hope 1.8 will be easier bacause I’m OK with SQL. I say “will be” because I’ve given the MQTT->NodeRed->InfluxDB work a bit of a rest, instead I’m re-writing code for the ESP32 to be more event/interrupt driven and removing all blocking code.

(4) I’ve come across Andreas Speiss and it was his videos that encouraged me to persist with IOTstack. I’ve used quite a few times the PiBuilder script he encourages and I’m impressed it runs through to completion without a single error. I use a snapshot of The Raspberry Pi installation (when PiBuilder completes) to restart/reinstall quickly from a solid base.

(5), yep done that.

(6) haven’t yet tried to use batch mode. The advice is to get a single Influx-Out node working first and do the batch stuff when more fluent. I’ve yet to get anything out of Node-Red and into InfluxDB, even with a manual Inject node.

(7) is definitely an achilles heel for me. I really do need to get to grips with that. I understand the structure (somewhat, but not enough) and the distinction between fields and tags but there’s a huge psychological difficulty to overcome: embracing the new constructs and leaving conventional pre-declared fields and primary/secondary indexes to one side. It all seems rather fluid and that’s ‘unsettling’.

(8) The raw data handling in the ESP32 is by structs (peripherals → controller using ESP-NOW). Within the controller, sending data to a web server uses JSON document format. For some reason I didn’t do that for sending data to MQTT. Why? Dunno, perhaps some random cosmic ray hitting a neuron. But that’s easy to reconfigure. In Node-Red I can see data arriving with the topic and value which, when run through a function and change node, yield a “field” name and number. And the Rasperry Pi MQTT instance works just the same as the public EMQX broker. So that part seems OK.

… it should be easy to pass to InfluxDB. Except it hasn’t been:

I shall try your sample flow and maybe I’ll see what I’m doing wrong. I suspect I have mangled the data at some point or failed to adhere to the field/tag structure. If your flow works it will be the first occasion I’ve been able to send data from Node-Red to InfluxDB. If it doesn’t work then something’s wrong with the configuration of the containers and something’s not communicating.

It will be the weekend before I get back to this. Demands at home mean I need to do some plumbing and plastering. Which is wonderful because then only my back hurts rather than my head.

Thanks again, I’ll get back to you.
Ric

Glad it was useful. In my experience Influx 2 will almost certainly be the root of your issues. I’ll check and see if I still have a docker system somewhere, but I thought IoTStack had 2 install options, one for 1.8 and the other for v2. It has been a while since I looked at it.

In general, I find it helpful to think of “measurements” as tables, the actual data being written as records and fields, and “tags” as labels for each record in the DB. Apologies again to those with a deeper understanding.

The thing that makes a DB record unique is the timestamp and the tags. The timestamp is effectively the primary key. So for example, if you write two sets of data with a timestamp of 12:00:00 one will overwrite the other unless they have different tags (I’m leaving out the date for simplicity, but you get the idea.)


In the version of IoTStack I have (which may be out of date) there are two options for Influx. Not sure that’s helpful for you, but at least it means I wasn’t imagining it!

Here’s some sample code from a function node. There are a variety of bits missing, but hopefully it’s enough to be useful. I have used very similar code to write to both 1.8 and 2.x once things were set up.

// Parsing and mapping

const parsedId = data[1];
const parsedTime = new Date(data[2]);
const parsedThing1 = parseFloat(data[3]);
const parsedThing2 = parseInt(data[4]);
const parsedThing3 = parseInt(data[5]);


// < rest of the setup logic **some things are missing** >


// Prepare the final payload
msg.payload = [
    {
        measurement: "table1",
        fields: {
            thing1: parsedThing1,
            thing2: parsedThing2,
            thing3: parsedThing3
        },
        tags: {
            ID: parsedId,
            SomeGroup: parseGroup
        },
        timestamp: parsedTime
    },
   
];

return msg;

Hello Nick,
I’ve been able to find a few hours today to push this a bit further. There were many things misconfigured and now, even more things I don’t understand.

Everything is reverted to Influxdb 1.8

Your examples proved very useful and illustrated well the structure of fields and tags. I was able to construct an inject node and by trying a few different settings to the server configuration, get data into InfluxDB. I installed the Influx CLI and was able to create a database, users, etc. and eventually query the data. It was quite a straightforward step to link the MQTT In nodes in Node-Red to the Influxdb out and see some real-time data. Curiously, unexpectedly, the host setting to link to Influxdb was 172.20.0.5:8086. The IPs are visible in Portainer and are, I assume the internal IPs for each container mapping the container (inside) port to the outside (8086:8086).

Finally I’ve been able to get Grafana up-and-running and after quite some few error messages link Influxdb as a source. Very nice graphs too!

I’m going to go back at some point and do some fully documented installs before even considering real data. But for now I can see directly the effects of the change and function nodes as data moves from MQTT to InfluxDB. And it’s inconsequential if I mangle data or make mistakes: I can roll back and start again. There’s a lot to explore.

But before I go: you mentioned the batch node. At the moment the ESP32 is configured to send individual sensor data through MQTT as soon as it arrives. For MQTT → Node-Red → InfluxDB
I have a choice:

  1. Have lots of individual readings and post each “as and when” to InfluxDB. This will create a large number of points (rows?) each with a single sensor reading and timestamp.
  2. Or should I batch the data and post it as a single message containing all the sensor readings against the one timestamp?

I’m asking because I have no idea yet about the impact further down the line of either choice…
My best wishes,
Ric

Welcome to the journey :slight_smile:

I believe you are correct about the IP addresses being internal to Docker. I think you can also just put a name there (I don’t recall if it was “influx” or “influxdb”).

Grafana is great. I was dreading the piles of custom code I thought I would need to get nice graphs and it’s a doddle with Grafana.

Regarding batch or not to batch. I was assured by someone smarter than me that batch was almost always the answer. It doesn’t require any changes to the rest of your code – just swap out the node. I would suggest trying it and seeing how you get on.

Overall, InfluxDB 1.8 is astonishingly performant on a low power SBC (Pi included). Unless you are really hammering it with data, you most likely won’t run into a bottleneck there. On the setup I have, we get floods of data every few minutes so I decided to err on the side of caution and I added a rate limiter/buffer in node-RED. I think I limited writes to 40/s but I have 10-12 different data points for each individual reading. Your mileage may vary, as they say.

node-RED is also ridiculously performant for most things. I set up an Arduino to simulate messages from our devices en-masse and a lot of other parts broke before node-RED got fed up.

Glad to hear you are up and running!