I use the readsb Docker container to collect data about aircraft flying nearby. The container runs an instance of Telegraf, which sends data to InfluxDB with the following schema (full schema can be seen here):
_measurement: aircraft
Tag: Icao
Tag: Call
...
Tag: _field
Field Key: Altitude
Field Key: Latitude
Field Key: Longitude
Field Key: Airspeed
....
This gives me tens of thousands of rows in the result, showing the ICAO codes of aircraft seen in the past 24 hours. Note that a unique ICAO code appears hundreds of times in the results, since each aircraft sends hundreds of packets with its position and so on:
_stop                     Icao
2022-08-17 18:23:36.878   0101DB
2022-08-17 18:23:36.878   0101DB
2022-08-17 18:23:36.878   0101DB
…
2022-08-17 18:23:36.878   06A07A
2022-08-17 18:23:36.878   06A07A
2022-08-17 18:23:36.878   06A07A
You get the idea.
Now to my question.
I want to filter out rows where the same ICAO code appears several times within a few seconds, keeping only one row.
BUT if an ICAO code re-appears after an hour, I want to keep that row, not filter it out. (This is why the unique() function won't work for me.) This is the scenario where an aircraft with the same ICAO code is flying overhead again, possibly on another flight.
I did think about using the callsign instead of, or in combination with, the ICAO code. The issue is that the same callsign can be used by the aircraft again the next day, for example, so I can't use unique() there either.
So far I have used unique() on these queries, which shows me the unique ICAO codes (aircraft) seen over the selected period; but that misses the scenario where the same aircraft, with the same ICAO code, flies over me several times per day.
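To make the requirement concrete, here is a sketch of the filtering logic in plain Python rather than Flux (the function name, the row shape, and the one-hour threshold are all assumptions for illustration): a row is kept only if its ICAO hasn't been sighted at all within the threshold, so a continuous stream of packets collapses to one row, while a re-appearance after a gap starts a new "visit".

```python
from datetime import datetime, timedelta

def dedupe_sightings(rows, gap=timedelta(hours=1)):
    """Keep one row per ICAO 'visit': drop a row if the same ICAO
    was sighted less than `gap` ago; keep it if the aircraft
    reappears after the gap (a new flight overhead).
    `rows` is an iterable of (timestamp, icao) tuples."""
    last_seen = {}   # icao -> timestamp of the most recent sighting
    kept = []
    for ts, icao in sorted(rows):  # process in time order
        if icao not in last_seen or ts - last_seen[icao] >= gap:
            kept.append((ts, icao))
        last_seen[icao] = ts       # update on EVERY row, not just kept ones,
                                   # so a plane circling for hours stays one visit
    return kept
```

Updating `last_seen` on every row (not only on kept rows) is deliberate: it means "re-appears after an hour" is measured from the last packet received, so an aircraft loitering overhead continuously is still counted once. In Flux, the equivalent idea is grouping by Icao, sorting by _time, and filtering on the gap between consecutive records.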
Hello @tejasitraj,
I’m a little confused about why you want to group by time…
however to answer:
> I want to filter out the rows where the same ICAO code appears several times within seconds, keeping only one row. BUT if an ICAO code re-appears after an hour, I want to keep the row, not filter it out.
But I recognize that you're looking for something more complex. You probably don't want to just aggregate the data into 1h intervals, because aggregateWindow() aligns its windows relative to the Unix epoch by default, not to your data. Is your data written at regular intervals? Maybe you can make the window start the exact timestamp of your first point.