Sporadic `tls: bad record MAC` error

We recently updated our InfluxDB configuration (to mitigate SWEET32 issues) as follows:

[http]
  auth-enabled = true
  pprof-enabled = false
  flux-enabled = true
  https-enabled = true
  https-certificate = "client.crt"
  https-private-key = "client.key"
  [tls]
    min-version = "tls1.2"
    max-version = "tls1.3"
    # https://wiki.mozilla.org/Security/Server_Side_TLS#Intermediate_compatibility_.28recommended.29
    #
    # Can this be configured more cleanly?
    # strict-ciphers didn't work, or we're not sure where to configure it
    ciphers = [ "TLS_AES_128_GCM_SHA256",
                "TLS_AES_256_GCM_SHA384",
                "TLS_CHACHA20_POLY1305_SHA256",
                "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
                "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
                "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
                "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
                "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
                "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
                "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
                "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
                "TLS_RSA_WITH_AES_128_CBC_SHA",
                "TLS_RSA_WITH_AES_256_CBC_SHA"
              ]
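
To sanity-check that these settings actually take effect, the negotiated protocol and cipher can be inspected from a client host, e.g. (assuming a reasonably recent openssl; IP and port are the same ones we use from Grafana below):

# Force TLS 1.2 and TLS 1.3 in turn and print what the server negotiates
openssl s_client -connect 10.0.67.1:8086 -tls1_2 </dev/null 2>/dev/null | grep -E 'Protocol|Cipher'
openssl s_client -connect 10.0.67.1:8086 -tls1_3 </dev/null 2>/dev/null | grep -E 'Protocol|Cipher'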

Anyway, the configuration seemed to run fine for a while, and then suddenly yesterday our InfluxDB became inaccessible. Grafana started throwing 502 errors, and a curl command like the following:

curl --fail --silent --show-error -k -u grafana_user:<redacted> -G "https://10.0.67.1:8086/query?db=metrics" --data-urlencode "q=select LAST(value) from /^some.metric*/ where time > now() - 1m"

failed with:

curl: (7) Failed to connect to 10.0.67.1 port 8086: Connection refused
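
For what it's worth, "Connection refused" is a plain TCP-level failure (nothing was accepting on port 8086 at that moment), not a TLS error, so next time we could first check whether the listener is still bound and whether a bare handshake succeeds, along these lines:

# is anything still listening on the HTTPS port?
ss -tlnp | grep 8086
# does a plain TLS handshake against the unauthenticated /ping endpoint work?
curl -vk https://10.0.67.1:8086/ping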

After restarting the VM, of course, everything worked again. Checking the logs, the error reported by InfluxDB was:

http: TLS handshake error from 10.0.67.6:38084: local error: tls: bad record MAC
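
As far as I understand, "bad record MAC" means a decrypted TLS record failed its integrity check, which usually points to records being corrupted or truncated in transit (or a key/state mismatch) rather than to the cipher list itself. A packet capture on the InfluxDB port while the errors are occurring might show whether the records are being mangled on the way in; roughly (the interface name is just a placeholder):

# capture full packets to/from the InfluxDB HTTPS port for later inspection
tcpdump -i eth0 -s 0 -w influx-tls.pcap 'tcp port 8086'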

How could this be debugged, and what are the possible ways to fix it?

Using InfluxDB 1.8.10

@vipinvkmenon I’m not sure. Thanks in advance for your patience. Also out of curiosity, what are you using InfluxDB for?
@mhall119 do you have any thoughts here?

No, sorry. It looks like something is going wrong with the certs or the encryption library. Maybe somebody from the Edge team has seen this before?

Our initial assumption was that it might be because of manually listing the ciphers, so we changed the configuration to:

[http]
  auth-enabled = true
  log-enabled = <%= p('influxdb.enable_http_log') %>
  pprof-enabled = false
  flux-enabled = true
  https-enabled = true
  https-certificate = "/var/vcap/jobs/influxdb/config/client.crt"
  https-private-key = "/var/vcap/jobs/influxdb/config/client.key"
  [tls]
    min-version = "tls1.3"

[[graphite]]
  enabled = true
  database = "<%= p('influxdb.metrics_database_name') %>"

After the update to TLS 1.3 only, we noticed a reduction in the issue; however, it still occasionally creeps up. Since it happens sporadically, we haven't identified a trigger for it: it just suddenly starts in an otherwise healthy production setup. The issue fixes itself after a really long time, or when we restart InfluxDB. Any ideas on how to debug this?
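
Another option (assuming the [logging] section from the stock 1.8 influxdb.conf applies to our deployment) would be to raise the log level so the handshake errors come with more surrounding context:

[logging]
  level = "debug"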