Store Python dictionary as field using InfluxDB 2

How can I store a Python dictionary in a field using InfluxDB 2? Example:

write_client.write(bucket, org, {
    "measurement": measurement,
    "fields": {
        "mydict": {'a': 1, 'b': 2}
    },
    "time": timestamp
})
A straightforward way is to convert the dictionary (mydict) to JSON and store it as a string. However, for binary data this is very space inefficient. Are there more efficient / compact ways? Thanks!
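For illustration, here is a minimal sketch of the JSON-as-string round-trip (just the serialization step, no InfluxDB client involved; the variable names are made up):

```python
import json

mydict = {'a': 1, 'b': 2}

# Serialize the dict to a JSON string so it can be stored in a string field
encoded = json.dumps(mydict)

# On read-back, decode the string field back into a dict
decoded = json.loads(encoded)
assert decoded == mydict
```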

Hello @karl,
Welcome! May I ask, why do you want to store the json as a field? You won’t be able to visualize that data in a meaningful way. Can you provide me with a larger context about what you’re trying to accomplish?

Thanks, Anaisdg! Most of the fields in my use case contain time series data (just plain numbers). Additionally, I want to store unstructured / varying auxiliary data (Python objects) for some points. That data need not be visualized and is only used for script-based postprocessing purposes. For simplicity reasons, I would prefer having them in the same database instead of storing them somewhere else. The only way to achieve this seems to be to serialize (pickle) the Python objects and encode them as base64 strings, which is not very efficient, however.
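A sketch of that pickle-plus-base64 round-trip, assuming the object is picklable (the example object is made up):

```python
import base64
import pickle

obj = {'a': 1, 'payload': b'\x00\x01\x02'}  # arbitrary picklable Python object

# Serialize with pickle, then base64-encode so the raw bytes survive as a string field
encoded = base64.b64encode(pickle.dumps(obj)).decode('ascii')

# Reverse the two steps on read-back
restored = pickle.loads(base64.b64decode(encoded))
assert restored == obj
```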

Hello @karl,
Thanks for explaining. I’ve asked someone on the storage team to help. That’s cool you’re building on top of InfluxDB. What specifically? I’m curious about your postprocessing purposes. Care to share more?

Hi @karl -

I’m not sure if you’re using influxv2 OSS or cloud but my advice here should apply to both. The storage engine uses gzip compression for strings already. If you convert the dict to a base64 string, you won’t get the compression advantage (since you already compressed it essentially). I suspect you are storing many of these dicts with overlapping key names. I would not compress in advance to allow compression across dicts.
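To see that effect, here is a rough sketch using Python's gzip/zlib as a stand-in for the storage engine's compression: gzipping plain JSON strings with overlapping keys compresses far better than gzipping strings that were each compressed and base64-encoded in advance (the sample dicts are made up):

```python
import base64
import gzip
import json
import zlib

# Many dicts with overlapping key names, as the storage engine would see them
dicts = [{"temperature": i, "humidity": i * 2, "status": "ok"} for i in range(100)]

# Option A: store each dict as a plain JSON string
plain = "\n".join(json.dumps(d) for d in dicts).encode()

# Option B: pre-compress and base64-encode each dict individually
pre_compressed = "\n".join(
    base64.b64encode(zlib.compress(json.dumps(d).encode())).decode()
    for d in dicts
).encode()

# The engine's gzip pass can exploit the repeated keys across plain JSON strings,
# but gains almost nothing on already-compressed base64 data
assert len(gzip.compress(plain)) < len(gzip.compress(pre_compressed))
```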

If these dicts become very long strings, you may run into the length limit (~64k roughly I believe). Additionally, I would store the string as a field so that it is not indexed (i.e. don’t use a tag for this value). Indexing long strings is expensive, and you said you don’t need to query for these directly.
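If you want to guard against that limit before writing, a minimal check on the serialized payload might look like this (the 64 KiB figure is approximate, as noted above, so verify it for your InfluxDB version):

```python
import json

# Assumption: approximate string-field length limit; confirm for your deployment
MAX_FIELD_BYTES = 64 * 1024

payload = json.dumps({"key%d" % i: i for i in range(10)})

# Measure the encoded byte length, not the character count
if len(payload.encode("utf-8")) > MAX_FIELD_BYTES:
    raise ValueError("serialized dict exceeds the field length limit")
```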

This is my expectation; depending on the characteristics of your data and how it serializes, your mileage may vary. If you’re on influx oss, you’ll be able to A/B test different approaches and measure the impact on on-disk data volume and query time. My general suggestion is to stick the serialized value into a field and continue until the performance is unsatisfactory, and then you can iterate on improving it.

Let us know how it works out! Storing metadata like this is a nice use case.
