Avoiding redundancy when writing batches of points

Hi,

I am writing points in batches using the line protocol on the “/write” HTTP endpoint. Repeating the measurement and field names in every line means a lot of redundant data is sent over the wire. Is there a way to avoid that? I would like an option to specify a default measurement name and a field/tag order that applies to every line, followed by lines that contain only the values.

Thanks and best regards
Keith

Nobody…? As currently implemented, the length of field and tag names has an unnecessarily large impact on the amount of data transferred between client and server. That may significantly increase the cost of your InfluxDB Cloud usage.

I simply don’t understand that. When I offer an API to write similar points in batches, it should minimize the need to send redundant data. Think of CSV: metadata (= column names) first, followed by pure data (one additional character per field as a separator is unavoidable).

Nobody else feeling that need?

Hello @sauvant,
You can do that with the client libraries. Here’s a Python example:

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

bucket = "my-bucket"

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

# Use a synchronous write API so the write completes before the call returns
write_api = client.write_api(write_options=SYNCHRONOUS)

# Build the point from a measurement name, a tag and a field
p = Point("my_measurement").tag("location", "Prague").field("temperature", 25.3)
write_api.write(bucket=bucket, record=p)

You can also do that with the CLI write command, which lets you inject constants like this:

influx write -b example-bucket \
  -f path/to/example.csv \
  --header "#constant measurement,birds" \
  --header "#datatype dateTime:2006-01-02,long,tag"

But either way you’ll notice that line protocol is still generated.
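
Since the names end up repeated on the wire anyway, compressing the request body is one way to reduce what is actually transferred. The write endpoints accept gzip-compressed bodies (`Content-Encoding: gzip`), and the Python client has an `enable_gzip=True` option on `InfluxDBClient`. Below is a rough sketch, using only the standard library, of how a CSV-like "names once, values per row" batch could be expanded on the client side and how much the repeated names shrink under gzip. The helper `to_line_protocol`, the sample measurement and the generated values are made up for illustration; the helper does no escaping and writes every field as a float, so treat it as a starting point rather than a complete encoder.

```python
import gzip


def to_line_protocol(measurement, tag_keys, field_keys, rows):
    """Expand compact rows into line protocol text.

    Hypothetical helper: each row is (tag_values, field_values, timestamp_ns).
    No escaping of spaces or commas is done, and all fields are written as floats.
    """
    lines = []
    for tag_values, field_values, ts in rows:
        # Measurement and tags form the first comma-separated group
        head = ",".join([measurement] + [f"{k}={v}" for k, v in zip(tag_keys, tag_values)])
        # Fields form the second group
        fields = ",".join(f"{k}={float(v)}" for k, v in zip(field_keys, field_values))
        lines.append(f"{head} {fields} {ts}")
    return "\n".join(lines)


# Names are given once; each row carries only values (made-up sample data)
tag_keys = ["location"]
field_keys = ["temperature", "humidity"]
start_ns = 1_700_000_000_000_000_000
rows = [
    (("Prague",), (20.0 + i % 10, 50.0 + i % 5), start_ns + i * 1_000_000_000)
    for i in range(1000)
]

payload = to_line_protocol("my_measurement", tag_keys, field_keys, rows).encode("utf-8")
compressed = gzip.compress(payload)

print(f"raw line protocol: {len(payload)} bytes")
print(f"gzip-compressed:   {len(compressed)} bytes")
```

How much you save depends on your data, so it is worth running this against a real batch before drawing conclusions.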

I encourage you to submit a feature request here: