Influxdb container recreation empties mounted host-volume?

#1

Hello,

I’m using influxdb to persist some data from my home (via Home Assistant). I’m using the official docker container with docker-compose, but I guess I’m using it wrong…

I wanted to implement a backup strategy and added a local /backup mount to the configuration (see below).
docker-compose up -d recreated the container and boom, all my data was gone. The statistics started at zero again (but the databases still exist). :scream:

Edit: I just found out that all my telegraf data is still available in the system, docker and influxdb dashboards; only my home dashboard data seems to have been reset. I don’t get it :confused: :see_no_evil:

When recreating the chronograf container, the database remains untouched and my dashboards are still available. Shouldn’t it be the same with influxdb, as I’m only recreating the container and not the data structure?

docker-compose.yaml
version: '3'
services:
  influxdb:
    image: influxdb:latest
    volumes:
      - ./influxdb/meta:/influxdb/meta
      - ./influxdb/data:/influxdb/data
      - ./influxdb/config/:/etc/influxdb/
      - ./influxdb/backups:/backups
    ports:
      - "8086:8086"
      - "8082:8082"
      - "8089:8089"
    restart: always

influxdb.conf
reporting-disabled = false
bind-address = ":8088"

[meta]
  dir = "/influxdb/meta"
  retention-autocreate = true
  logging-enabled = true

[data]
  dir = "/influxdb/data"
  index-version = "inmem"
  wal-dir = "/root/.influxdb/wal"
  wal-fsync-delay = "0s"
  validate-keys = false
  query-log-enabled = true
  cache-max-memory-size = 1073741824
  cache-snapshot-memory-size = 26214400
  cache-snapshot-write-cold-duration = "10m0s"
  compact-full-write-cold-duration = "4h0m0s"
  compact-throughput = 50331648
  compact-throughput-burst = 50331648
  max-series-per-database = 1000000
  max-values-per-tag = 100000
  max-concurrent-compactions = 0
  max-index-log-file-size = 1048576
  trace-logging-enabled = false
  tsm-use-madv-willneed = false

[coordinator]
  write-timeout = "10s"
  max-concurrent-queries = 0
  query-timeout = "0s"
  log-queries-after = "0s"
  max-select-point = 0
  max-select-series = 0
  max-select-buckets = 0

[retention]
  enabled = true
  check-interval = "30m0s"

[shard-precreation]
  enabled = true
  check-interval = "10m0s"
  advance-period = "30m0s"

[monitor]
  store-enabled = true
  store-database = "_internal"
  store-interval = "10s"

[subscriber]
  enabled = true
  http-timeout = "30s"
  insecure-skip-verify = false
  ca-certs = ""
  write-concurrency = 40
  write-buffer-size = 1000

[http]
  enabled = true
  bind-address = ":8086"
  auth-enabled = false
  log-enabled = true
  suppress-write-log = false
  write-tracing = false
  flux-enabled = false
  pprof-enabled = true
  debug-pprof-enabled = false
  https-enabled = false
  https-certificate = "/etc/ssl/influxdb.pem"
  https-private-key = ""
  max-row-limit = 0
  max-connection-limit = 0
  shared-secret = ""
  realm = "InfluxDB"
  unix-socket-enabled = false
  unix-socket-permissions = "0777"
  bind-socket = "/var/run/influxdb.sock"
  max-body-size = 25000000
  access-log-path = ""
  max-concurrent-write-limit = 0
  max-enqueued-write-limit = 0
  enqueued-write-timeout = 30000000000

[logging]
  format = "auto"
  level = "info"
  suppress-logo = false

[[graphite]]
  enabled = false
  bind-address = ":2003"
  database = "graphite"
  retention-policy = ""
  protocol = "tcp"
  batch-size = 5000
  batch-pending = 10
  batch-timeout = "1s"
  consistency-level = "one"
  separator = "."
  udp-read-buffer = 0

[[collectd]]
  enabled = false
  bind-address = ":25826"
  database = "collectd"
  retention-policy = ""
  batch-size = 5000
  batch-pending = 10
  batch-timeout = "10s"
  read-buffer = 0
  typesdb = "/usr/share/collectd/types.db"
  security-level = "none"
  auth-file = "/etc/collectd/auth_file"
  parse-multivalue-plugin = "split"

[[opentsdb]]
  enabled = false
  bind-address = ":4242"
  database = "opentsdb"
  retention-policy = ""
  consistency-level = "one"
  tls-enabled = false
  certificate = "/etc/ssl/influxdb.pem"
  batch-size = 1000
  batch-pending = 5
  batch-timeout = "1s"
  log-point-errors = true

[[udp]]
  enabled = false
  bind-address = ":8089"
  database = "udp"
  retention-policy = ""
  batch-size = 5000
  batch-pending = 10
  read-buffer = 0
  batch-timeout = "1s"
  precision = ""

[continuous_queries]
  log-enabled = true
  enabled = true
  query-stats-enabled = false
  run-interval = "1s"

[tls]
  min-version = ""
  max-version = ""

Am I missing anything?
I don’t want to create backups every time before updating the docker image etc.; that’s why I’ve mounted the data directory to the host.

#2

Hi,

Your write-ahead log (WAL) is located inside /root, which isn’t set up to survive the container being recreated (the directory isn’t a volume).

If not enough data had been written to trigger a flush to the TSM files, the data that existed only in the WAL was lost when the old container was removed.

You can move the WAL to one of your persisted mounts / volumes.
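
A minimal sketch of what that could look like in the docker-compose.yaml from the first post. The host directory ./influxdb/wal and the container path /influxdb/wal are assumptions chosen for illustration, not part of the original setup. The environment variable follows the InfluxDB 1.x INFLUXDB_<SECTION>_<KEY> override convention; setting wal-dir = "/influxdb/wal" under [data] in influxdb.conf should have the same effect.

```yaml
version: '3'
services:
  influxdb:
    image: influxdb:latest
    volumes:
      - ./influxdb/meta:/influxdb/meta
      - ./influxdb/data:/influxdb/data
      # Assumed extra mount so the WAL lives on the host and survives
      # container recreation (path is illustrative).
      - ./influxdb/wal:/influxdb/wal
      - ./influxdb/config/:/etc/influxdb/
      - ./influxdb/backups:/backups
    environment:
      # Overrides [data] wal-dir from influxdb.conf; alternatively,
      # edit wal-dir in the config file and drop this entry.
      - INFLUXDB_DATA_WAL_DIR=/influxdb/wal
    ports:
      - "8086:8086"
      - "8082:8082"
      - "8089:8089"
    restart: always
```

Data that was only in the old container’s /root/.influxdb/wal is not recovered by this change; it only protects writes made after the WAL has been moved.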

#3

Thank you, excellent point, wasn’t aware of that. I’ve “found” older data so the WAL could be the problem.

Is cache-snapshot-memory-size = 26214400 a limit for each database? I’m asking because the other data is still available.

I’m just wondering why I’ve lost a few days. Sure, only a few records were sent to influxdb in that specific database, so it’s possible that cache-snapshot-memory-size = 26214400 never got triggered, but shouldn’t cache-snapshot-write-cold-duration = "10m0s" flush the WAL every 10 minutes?

#4

A cache-snapshot-write-cold-duration of 10m0s causes a write to TSM only if the shard has received no writes for 10 minutes. So I’m assuming that your data was coming in more often than that, but never reached the cache-snapshot-memory-size limit that would have triggered the TSM write.

More details can be found here.

#5

Okay, I get it now. That’s possible :thinking:
Thank you!

And also thank you for the link, I’ve missed that section :slight_smile: