Incorrect binary format for nested json

input:

{"foobar":{"time":"2023-07-19T06:04:53.429Z","serial":"simulated","device":"1324","power1":2.6,"power2":96.6}}

Config

[[inputs.mqtt_consumer]]
	servers = [ "mqtt://broker:1883" ]

	topics = [ "#" ]

	data_format = "json_v2"

	[[inputs.mqtt_consumer.topic_parsing]]
		topic = "foo/+/+"
		measurement = "_/_/measurement"
		tags = "_/serial/_"

    [[inputs.mqtt_consumer.json_v2]]
        [[inputs.mqtt_consumer.json_v2.object]]
            path = "foobar"
            tags = [ "device" ]

[[processors.unpivot]]
	tag_key = "field"
	value_key = "value"

[[outputs.postgresql]]
	connection = "host=database user=postgres password=password dbname=bar"

	tag_table_add_column_templates = []
	tag_table_create_templates = []
	add_column_templates = []
	create_templates = []

	tagexclude = [ "topic", "host" ]

error:

telegraf-1           | 2023-07-19T06:00:53Z E! [outputs.postgresql] PG CopyFrom - map[columnNames:[time device field serial value] err:ERROR: incorrect binary data format (SQLSTATE 22P03) pid:1122 tableName:[public telemetry] time:564.899µs]
database-1           | 2023-07-19 06:00:53.516 UTC [1122] STATEMENT:  copy "public"."telemetry" ( "time", "device", "field", "serial", "value" ) from stdin binary;
telegraf-1           | 2023-07-19T06:00:53Z E! [outputs.postgresql] write error (permanent, dropping sub-batch): ERROR: incorrect binary data format (SQLSTATE 22P03)

This only happens when using nested input and a json_v2.object config

Can you view your postgres logs and see what column it is complaining about?
Can you also use the outputs.file output to print out what the resulting metric looks like?

Thanks

Updated json_v2 config:

    [[inputs.mqtt_consumer.json_v2]]
		timestamp_path = "foobar.time"
		timestamp_format = "2006-01-02T15:04:05.000Z"
        [[inputs.mqtt_consumer.json_v2.object]]
            path = "foobar"
            tags = [ "device", "foo_serial" ]

it’s the value column

2023-07-24 07:22:19.111 UTC [35] ERROR:  incorrect binary data format
2023-07-24 07:22:19.111 UTC [35] CONTEXT:  COPY telemetry, line 1, column value
2023-07-24 07:22:19.111 UTC [35] STATEMENT:  copy "public"."telemetry" ( "time", "device", "field", "foo_serial", "value" ) from stdin binary;

outputs.file

telegraf-1           | telemetry,device=foo-el-1,field=time,host=b368f48176e6,foo_serial=simulated,topic=foo/simulated/telemetry value="2023-07-24T07:41:03.641Z" 1690184463641000000
telegraf-1           | telemetry,device=foo-el-1,field=power1,host=b368f48176e6,foo_serial=simulated,topic=foo/simulated/telemetry value=335.7 1690184463641000000
telegraf-1           | telemetry,device=foo-el-1,field=power2,host=b368f48176e6,foo_serial=simulated,topic=foo/simulated/telemetry value=-574 1690184463641000000

I guess this happens because time gets unpivoted as well.

Using timestamp_key instead works:

    [[inputs.mqtt_consumer.json_v2]]
        [[inputs.mqtt_consumer.json_v2.object]]
            path = "foobar"
			timestamp_key = "time"
			timestamp_format = "2006-01-02T15:04:05.000Z"
			tags = [ "device", "foo_serial" ]

unpivot correctly ignores time now:

telegraf-1           | telemetry,device=foo-el-1,field=power1,host=b368f48176e6,foo_serial=simulated,topic=foo/simulated/telemetry value=1032.1 1690184844027000000
telegraf-1           | telemetry,device=foo-el-1,field=power2,host=b368f48176e6,foo_serial=simulated,topic=foo/simulated/telemetry value=-1643.5 1690184844027000000

and no postgres error is logged.

Previously at the time of the original post, before I modified tags, it also tried to unpivot the foo_serial tag which contains a string because I forgot it. Might have caused the same error.

Should I open an issue on github about this, or is this expected?

The config you provided, before the pivot, will produce a metric like the following:

file,device=1324,host=ryzen time="2023-07-19T06:04:53.429Z",serial="simulated",power1=2.6,power2=96.6 1689746693000000000

and after the pivot:

file,device=1324,field=time,host=ryzen value="2023-07-19T06:04:53.429Z" 1689746693000000000
file,device=1324,field=serial,host=ryzen value="simulated" 1689746693000000000
file,device=1324,field=power1,host=ryzen value=2.6 1689746693000000000
file,device=1324,field=power2,host=ryzen value=96.6 1689746693000000000

Which is exactly what you told it to do :wink: That is clearly not a bug.

What you should instead be doing is correctly setting the timestamp:

        [[inputs.file.json_v2.object]]
            path = "foobar"
            tags = [ "device", "foo_serial" ]
            timestamp_key = "time"
            timestamp_format = "2006-01-02T15:04:05.000Z"

Note this is under the object and uses timestamp_key, which produces:

file,device=1324,field=serial,host=ryzen value="simulated" 1689746693000000000
file,device=1324,field=power1,host=ryzen value=2.6 1689746693000000000
file,device=1324,field=power2,host=ryzen value=96.6 1689746693000000000

What about the usage of timestamp_path in Incorrect binary format for nested json - #3 by sezanzeb?

Your timestamp is under an object you are parsing. so it is better to grab that value under the object itself

If timestamp_path is discouraged to use, and if it doesn’t work with pivot, shouldn’t it be deprecated and eventually be removed from the docs? And then maybe make timestamp_key available for the root, if the timestamp is not inside a nested object.

Not discouraged, it is simply the wrong value to use in this scenario.

Ah interesting, please cite from JSON v2 input data format | Telegraf 1.21 Documentation where it says that it is wrong.

Because all I see is

timestamp_path = “” # A string with valid GJSON path syntax to a valid timestamp (single value)

and

timestamp_path (OPTIONAL): You can define a query with GJSON Path Syntax to set a timestamp from the JSON input. The query must return a single data value or it will default to the current time.