InfluxDB spawns too many threads and is killed by OOM when I execute too many queries, but it should skip them

InfluxDB OSS version: InfluxDB 2.0.7 (git: 2a45f0c037) build_date: 2021-06-04T19:17:40Z (running in docker container)
Hardware: 64 GB RAM, 2 TB SSD (AWS EC2 m5.4xlarge)


My InfluxDB 2.0 instance runs in a Docker container and hits OOM when I execute hundreds of queries from many processes.
I've tried changing the influxd configuration fields: a concurrent-query limit of 2, a query queue size of 1, a query memory limit of 100000000 bytes, and an initial query memory allocation of 10000000 bytes. But influxd still creates too many threads (when it should reject the excess queries and put just one of them in the queue), uses too much RAM, and is killed within seconds of the queries starting.

Redefined fields in config:

query-concurrency: 2
query-initial-memory-bytes: 10000000
query-max-memory-bytes: 3200000000
query-memory-bytes: 100000000
query-queue-size: 1
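To make the expectation concrete, here is a minimal sketch (in Python, not influxd's actual code) of the admission behavior these settings should produce: with query-concurrency: 2 and query-queue-size: 1, two queries run, one waits, and everything else is rejected immediately instead of spawning a thread.

```python
import queue
import threading

class QueryController:
    """Toy model of the expected admission control for
    query-concurrency=2 / query-queue-size=1 (an assumption,
    not a reimplementation of influxd)."""

    def __init__(self, concurrency: int, queue_size: int):
        self.slots = threading.Semaphore(concurrency)   # running slots
        self.waiting = queue.Queue(maxsize=queue_size)  # bounded wait queue

    def submit(self, q: str) -> str:
        if self.slots.acquire(blocking=False):
            return "running"           # a concurrency slot was free
        try:
            self.waiting.put_nowait(q)
            return "queued"            # parked in the bounded queue
        except queue.Full:
            return "rejected"          # queue full: skip the query outright

ctrl = QueryController(concurrency=2, queue_size=1)
results = [ctrl.submit(f"q{i}") for i in range(5)]
print(results)  # ['running', 'running', 'queued', 'rejected', 'rejected']
```

Under this model, a burst of a hundred queries would cost at most two worker threads plus one queued entry; the observed behavior (one thread per query) suggests the limits are not being enforced at admission time.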

Result of influxd print-config:

assets-path: ""
bolt-path: /root/.influxdbv2/influxd.bolt
e2e-testing: false
engine-path: /root/.influxdbv2/engine
feature-flags: {}
http-bind-address: :8086
http-idle-timeout: 3m0s
http-read-header-timeout: 10s
http-read-timeout: 0s
http-write-timeout: 0s
influxql-max-select-buckets: 0
influxql-max-select-point: 0
influxql-max-select-series: 0
key-name: ""
log-level: info
metrics-disabled: false
nats-max-payload-bytes: 1048576
nats-port: -1
no-tasks: false
pprof-disabled: false
query-concurrency: 2
query-initial-memory-bytes: 10000000
query-max-memory-bytes: 3200000000
query-memory-bytes: 100000000
query-queue-size: 1
reporting-disabled: false
secret-store: bolt
session-length: 60
session-renew-disabled: false
storage-cache-max-memory-size: 200000000
storage-cache-snapshot-memory-size: 100000000
storage-cache-snapshot-write-cold-duration: 10m0s
storage-compact-full-write-cold-duration: 4h0m0s
storage-compact-throughput-burst: 50331648
storage-max-concurrent-compactions: 0
storage-max-index-log-file-size: 1048576
storage-retention-check-interval: 30m0s
storage-series-file-max-concurrent-snapshot-compactions: 0
storage-series-id-set-cache-size: 0
storage-shard-precreator-advance-period: 30m0s
storage-shard-precreator-check-interval: 10m0s
storage-tsm-use-madv-willneed: false
storage-validate-keys: false
storage-wal-fsync-delay: 0s
store: bolt
testing-always-allow-setup: false
tls-cert: ""
tls-key: ""
tls-min-version: "1.2"
tls-strict-ciphers: false
tracing-type: ""
vault-addr: ""
vault-cacert: ""
vault-capath: ""
vault-client-cert: ""
vault-client-key: ""
vault-client-timeout: 0s
vault-max-retries: 0
vault-skip-verify: false
vault-tls-server-name: ""
vault-token: ""

Query template (each query covers a 30-minute range):

from(bucket: "{bucket}") 
  |> range(start: {start}, stop: {stop})
  |> filter(fn: (r) => r["_measurement"] == "{measurement}")
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
  |> filter(fn: (r) => r["{tag1}"] == "{value1}")
  |> filter(fn: (r) => r["{tag2}"] == "{value2}")
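For reference, this is how the template above gets instantiated before being sent to the server; all the substituted values here are hypothetical placeholders, not my real bucket or tag names.

```python
# Fill the Flux query template with example values (all hypothetical).
template = """from(bucket: "{bucket}")
  |> range(start: {start}, stop: {stop})
  |> filter(fn: (r) => r["_measurement"] == "{measurement}")
  |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
  |> filter(fn: (r) => r["{tag1}"] == "{value1}")
  |> filter(fn: (r) => r["{tag2}"] == "{value2}")"""

flux = template.format(
    bucket="buck", start="-30m", stop="now()",
    measurement="sensors", tag1="host", value1="srv1",
    tag2="region", value2="eu",
)
print(flux)
```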

Cardinality: 435, calculated with the following Flux query:

from(bucket: "buck")
  |> range(start: -1y)
  |> last()
  |> toString()
  |> group()
  |> count()

5 seconds after Influx started processing the queries (and 5 seconds before it was killed by OOM):

Hello @Admin_Topflow,
Welcome. I don’t know how to configure skipping and queuing queries. Let me ask the team. Thanks.

Hi @Admin_Topflow !

Are your other processes running queries getting any kind of error response before influxd crashes? Or are all the queries completing successfully until the server crashes?

Also I’m curious to learn a little bit more about the kind of data you are querying so that I can reproduce this kind of OOM crash. Would it be possible to get a representative sample of the data you are using?


Is there any update on query scheduling?