Avoiding redundancy when writing batches of points

sauvant · October 6, 2019, 12:55pm

Hi,

I am writing points in batches using the line protocol on the “/write” HTTP endpoint. Repeating the measurement and field names in every line produces a lot of redundant data to transmit. Is there a way to avoid that? I would like an option to specify a default measurement name and field/tag order that applies to every line, so that the lines themselves carry only the values.

Thanks and best regards
Keith

sauvant · February 6, 2022, 12:16pm

Nobody…? As currently implemented, the length of field and tag names has an unnecessarily large impact on the amount of data transferred between client and server. That may significantly increase the cost of your InfluxDB Cloud usage.

I simply don’t understand that. When I offer an API to write similar points in batches, it should minimize the need to send redundant data. Think of CSV: metadata (the column names) first, followed by pure data (one additional character per field as a separator is unavoidable).
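
To put a rough number on this, here is a minimal sketch that encodes the same batch twice: once as line protocol, and once in a CSV-like layout where the names are sent only once. The measurement, tag, and field names and all values are made up for illustration, and the CSV-like layout is hypothetical, not a format the “/write” endpoint accepts.

```python
# Hypothetical batch: one measurement, one tag, two fields, 1000 points.
# Timestamps are in seconds, which assumes the write uses precision=s.
measurement = "weather_station_readings"
points = [
    ("Prague", 25.3 + i * 0.01, 40.0 + i * 0.01, 1570366500 + i)
    for i in range(1000)
]

# Line protocol: measurement, tag key, and field keys repeat on every line.
line_protocol = "\n".join(
    f"{measurement},location={loc} temperature={temp:.2f},humidity={hum:.2f} {ts}"
    for loc, temp, hum, ts in points
)

# CSV-like layout (hypothetical): names once in a header, then pure values.
header = f"{measurement}\nlocation,temperature,humidity,time"
csv_like = header + "\n" + "\n".join(
    f"{loc},{temp:.2f},{hum:.2f},{ts}" for loc, temp, hum, ts in points
)

lp_bytes = len(line_protocol.encode("utf-8"))
csv_bytes = len(csv_like.encode("utf-8"))
print(f"line protocol: {lp_bytes} bytes")
print(f"csv-like:      {csv_bytes} bytes")
print(f"overhead:      {lp_bytes / csv_bytes:.2f}x")
```

The ratio depends entirely on how long the names are relative to the values, so the figure printed here only describes this made-up batch.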

Nobody else feeling that need?

Anaisdg · February 7, 2022, 7:33pm

Hello @sauvant,
You can do that with the clients. Here’s a Python example:

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

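bucket = "my-bucket"

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

# Synchronous writes: each call blocks until the server responds.
write_api = client.write_api(write_options=SYNCHRONOUS)

# The measurement, tag, and field names are given once per Point object.
p = Point("my_measurement").tag("location", "Prague").field("temperature", 25.3)
write_api.write(bucket=bucket, record=p)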

You can also do that with the CLI write command, which accepts CSV and lets you inject constants like:

influx write -b example-bucket \
  -f path/to/example.csv \
  --header "#constant measurement,birds" \
  --header "#datatype dateTime:2006-01-02,long,tag"

Anaisdg · February 7, 2022, 7:34pm

But either way you’ll notice that line protocol is still generated.
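
To make that concrete, here is a minimal sketch of what such a conversion amounts to: the names are given once, but each row is still expanded into a full line of line protocol before it is sent. The helper `rows_to_line_protocol` and the sample data are hypothetical and not part of any InfluxDB client. The sketch assumes numeric field values and only escapes commas, equals signs, and spaces in tag and field keys and tag values.

```python
import csv
import io


def escape_key(value: str) -> str:
    """Escape commas, equals signs, and spaces for tag keys/values and field keys."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def rows_to_line_protocol(measurement, tag_cols, field_cols, time_col, csv_text):
    """Expand CSV rows (names given once in the header) into line protocol.

    Hypothetical helper for illustration; field values are assumed to be
    numeric and are copied into the output unchanged.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        tags = "".join(f",{escape_key(k)}={escape_key(row[k])}" for k in tag_cols)
        fields = ",".join(f"{escape_key(k)}={row[k]}" for k in field_cols)
        lines.append(f"{measurement}{tags} {fields} {row[time_col]}")
    return lines


# Made-up sample data: the column names appear only once, in the header.
csv_text = """location,temperature,time
Prague,25.3,1570366500
New York,18.1,1570366500
"""

lines = rows_to_line_protocol(
    "my_measurement", ["location"], ["temperature"], "time", csv_text
)
for line in lines:
    print(line)
```

Every generated line carries the measurement, tag key, and field key again, so the compact input saves typing on the client side but not bytes on the wire.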

Anaisdg · February 7, 2022, 7:34pm

I encourage you to submit a feature request here:
