Hello @uniquewaheed,
What is your error?
I made some dummy points, the group by is to simulate the default grouping that happens (measurement and tags have groups).
import "array"
import "experimental"
state = array.from(rows: [
{
_time: 2024-05-15T10:00:00Z,
_field: "state",
_value: "up",
source: "deviceA",
interface_name: "Gig0/0/0",
_measurement: "Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-xr/interface",
host: "router1",
path: "/interfaces/interface-xr/interface",
subscription: "sub1"
},
{
_time: 2024-05-15T10:10:00Z,
_field: "state",
_value: "down",
source: "deviceA",
interface_name: "Gig0/0/1",
_measurement: "Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-xr/interface",
host: "router1",
path: "/interfaces/interface-xr/interface",
subscription: "sub1"
}
])
|> group(columns: ["source", "interface_name", "_measurement", "path", "subscription"], mode:"by")
// state |> yield(name: "state")
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
descriptions = array.from(rows: [
{
_time: 2024-05-15T10:00:00Z,
_field: "description",
_value: "Uplink to Core VLAN 10",
source: "deviceA",
interface_name: "Gig0/0/0",
_measurement: "Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-xr/interface",
host: "router1",
path: "/interfaces/interface-xr/interface",
subscription: "sub1",
},
{
_time: 2024-05-15T10:10:00Z,
_field: "description",
_value: "",
source: "deviceA",
interface_name: "Gig0/0/1",
_measurement: "Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-xr/interface",
host: "router1",
path: "/interfaces/interface-xr/interface",
subscription: "sub1",
}
])
|> group(columns: ["source", "interface_name", "_measurement", "path", "subscription"], mode:"by")
// descriptions |> yield(name: "descriptions")
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
Here is the query I’m testing on that dummy data:
union(tables: [state, descriptions])
|> filter(fn: (r) => exists r["description"])
// Uncomment to test regex:
// |> filter(fn: (r) => r["description"] =~ /.VLAN./)
// |> drop(columns: ["description"])
// |> pivot(rowKey: ["_time"], columnKey: ["interface_name"], valueColumn: "state")
// |> drop(columns: ["host", "path", "subscription"])
// |> columns()
Here are some points:
- you’re going to want to pivot before you union. If you don’t do that and union first and then pivot you’ll only retain one row for each identical timestamp thereby losing data if you have multiple points written at the same time.
- filtering for the values that only have a description column effectively undo the union and need to even query for the state data.
Take a look at what the deata looks like after it’s been pivoted and then unioned (I assumed one description was an empty string. Anways no values exist for state. So if you filter right there you’ll end up back with your pivoted descriptions data:
- At this point I ask that you notice that I’m uncommenting and debugging line by line to make sure that Im getting what I expect with each data transformation. I also like to use multiple yield functions like print statements to compare and contrast the values before and after additional functions. I also recommending adding a limit to your query so you can work with a smaller dataset before adding additonal transformations.
- Now if you drop the description column an then pivot again you’re also undoing a bunch of work. That being said the rest of your query does work. With the pivot happening first.
However the result feels not like what you want.
Please use multiple yield() statements like print statements and go line by line to make sure your transformation is as expected. Next if you can take a look at the dummy data I share with you with array.from() and make corrections to it so it resembles your data that would be helpful so we have somethign to work with. Finally can you share what your expected output is? Thanks