Apparent discrepancy in memory usage reported by telegraf and docker stats

Hi everyone,

I have a question about container memory usage as reported by the docker input plugin. My understanding is that the usage measurement should equal the one reported by docker stats, but in my case it does not.

Telegraf:

docker run --rm --user telegraf:$(stat -c '%g' /var/run/docker.sock) -v /var/run/docker.sock:/var/run/docker.sock -v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf telegraf:1.24.2-alpine telegraf --input-filter docker --test|grep docker_container_mem|grep automatl-mariadb

> docker_container_mem,com.docker.compose.config-hash=a67af2ccb8aeac9848037677f3efd5ab9cc4f2c995ceb82f6dc43c7f06431b5b,com.docker.compose.container-number=1,com.docker.compose.oneoff=False,com.docker.compose.project=automatl-mariadb,com.docker.compose.project.config_files=docker-compose.yml,com.docker.compose.project.working_dir=/home/kykc/automatl-docker/automatl-mariadb,com.docker.compose.service=automatl-mariadb,com.docker.compose.version=1.26.2,container_image=automatl-mariadb_automatl-mariadb,container_name=automatl-mariadb,container_status=running,container_version=unknown,engine_host=halo,host=b011afded2cb,server_version=20.10.18 active_anon=4096i,active_file=29614080i,cache=77262848i,container_id="3e04de2449cf19e647cdd405380c3e71d1d4e910daf958080ee18bf1f2455042",hierarchical_memory_limit=9223372036854771712i,inactive_anon=87457792i,inactive_file=47648768i,limit=16706154496i,mapped_file=25022464i,max_usage=168194048i,pgfault=73829i,pgmajfault=251i,pgpgin=86129i,pgpgout=45913i,rss=87461888i,rss_huge=0i,total_active_anon=4096i,total_active_file=29614080i,total_cache=77262848i,total_inactive_anon=87457792i,total_inactive_file=47648768i,total_mapped_file=25022464i,total_pgfault=73829i,total_pgmajfault=251i,total_pgpgin=86129i,total_pgpgout=45913i,total_rss=87461888i,total_rss_huge=0i,total_unevictable=0i,total_writeback=0i,unevictable=0i,usage=90247168i,usage_percent=0.5402031210809652,writeback=0i 1665736707000000000

Docker stats:

CONTAINER ID   NAME                        CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
3e04de2449cf   automatl-mariadb            0.05%     114.3MiB / 15.56GiB   0.72%     987kB / 713kB     103MB / 14.1MB    9

So telegraf’s usage=90247168 bytes (86.07MiB), while docker stats reports 114.3MiB. The usage percentages also differ: 0.54% from telegraf versus 0.72% from docker stats.

Just playing around with the numbers, I noticed that active_file + usage = 29614080 + 90247168 = 119861248 bytes, which is about 114.3MiB and matches the docker stats figure.
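
For anyone who wants to check the arithmetic, here is a minimal Python sketch. The byte values are copied from the telegraf output above; the helper name `to_mib` is only for illustration and is not part of telegraf or docker.

```python
# Values copied from the telegraf docker_container_mem line above (bytes).
USAGE = 90_247_168
ACTIVE_FILE = 29_614_080
LIMIT = 16_706_154_496


def to_mib(n_bytes: int) -> float:
    """Convert bytes to MiB (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


# Sum that happens to line up with the docker stats figure.
combined = USAGE + ACTIVE_FILE

print(f"telegraf usage:      {to_mib(USAGE):.2f} MiB")
print(f"usage + active_file: {to_mib(combined):.2f} MiB")
print(f"combined / limit:    {100 * combined / LIMIT:.2f} %")
```

The last line is the same sum expressed as a percentage of the reported limit, which can be compared with the MEM % column of docker stats.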

Could you please clarify whether this is expected behavior, or have I stumbled upon something weird?

Thank you in advance.

I think this is the GitHub issue "inputs.docker mem usage stats are computed wrong for cgroup v1" (Issue #11596 in influxdata/telegraf), where we are subtracting the cache each container has, but I haven’t looked closely enough to be sure. It just sounded familiar.
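
A rough way to test that theory against the numbers in the question is sketched below in Python. It rests on two assumptions that I have not verified in either code base: that telegraf reported the raw cgroup usage minus cache, and that docker stats on cgroup v1 subtracts only total_inactive_file from the raw usage. The variable names are made up for illustration.

```python
# Fields copied from the telegraf output in the question (bytes).
stats = {
    "usage": 90_247_168,               # value telegraf reported
    "cache": 77_262_848,
    "total_inactive_file": 47_648_768,
}

# Assumption: telegraf reported raw cgroup usage minus cache,
# so adding cache back recovers the raw usage.
raw_usage = stats["usage"] + stats["cache"]

# Assumption: docker stats (cgroup v1) subtracts only inactive file pages.
docker_stats_style = raw_usage - stats["total_inactive_file"]

print(f"raw usage:          {raw_usage} bytes")
print(f"docker stats style: {docker_stats_style / 2**20:.1f} MiB")
```

If both assumptions hold, the second figure should come out at the value docker stats showed, which would also explain why adding active_file to telegraf's usage gave the same result: in this sample, cache equals active_file + inactive_file.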