cami
May 4, 2023, 1:50pm
1
Hi
We are trying to back up an InfluxDB instance (version 2.0.8) running via Docker.
The command we run is:
```
influx backup --bucket default -t "token" /influxdb_default_backup/
```
The backup dies with the following output:
```
2023-05-04T12:52:44.171092Z info Backing up shard {"log_id": "0hahxYl0000", "id": 197, "path": "/influxdb_manual_backup/default/20230504T125158Z.s197.tar.gz"}
2023-05-04T12:52:44.172172Z warn Shard removed during backup {"log_id": "0hahxYl0000", "id": 197}
2023-05-04T12:52:44.172204Z info Backing up shard {"log_id": "0hahxYl0000", "id": 205, "path": "/influxdb_manual_backup/default/20230504T125158Z.s205.tar.gz"}
2023-05-04T12:52:52.888456Z info Backing up shard {"log_id": "0hahxYl0000", "id": 213, "path": "/influxdb_manual_backup/default/20230504T125158Z.s213.tar.gz"}
2023-05-04T12:53:11.054548Z info Backing up shard {"log_id": "0hahxYl0000", "id": 221, "path": "/influxdb_manual_backup/default/20230504T125158Z.s221.tar.gz"}
2023-05-04T12:53:28.487491Z info Backing up shard {"log_id": "0hahxYl0000", "id": 229, "path": "/influxdb_manual_backup/default/20230504T125158Z.s229.tar.gz"}
Error: Failed to download shard backup: An internal error has occurred.
See 'influx backup -h' for help
```
The Docker logs do not provide helpful information:
```
influxdb_1 | ts=2023-05-04T12:53:11.056098Z lvl=info msg="Cache snapshot (end)" log_id=0hahjvWG000 service=storage-engine engine=tsm1 op_name=tsm1_cache_snapshot op_event=end op_elapsed=0.446ms
influxdb_1 | ts=2023-05-04T12:53:28.497755Z lvl=info msg="Cache snapshot (start)" log_id=0hahjvWG000 service=storage-engine engine=tsm1 op_name=tsm1_cache_snapshot op_event=start
influxdb_1 | ts=2023-05-04T12:53:28.498097Z lvl=info msg="Snapshot for path written" log_id=0hahjvWG000 service=storage-engine engine=tsm1 op_name=tsm1_cache_snapshot path=/root/.influxdbv2/engine/data/0dfadaa5214ae2fd/autogen/229 duration=0.380ms
influxdb_1 | ts=2023-05-04T12:53:28.498137Z lvl=info msg="Cache snapshot (end)" log_id=0hahjvWG000 service=storage-engine engine=tsm1 op_name=tsm1_cache_snapshot op_event=end op_elapsed=0.418ms
```
Any suggestions on how to back up this database?
Is it possible some files are corrupted? If so, can those files be repaired?
Thanks a lot!
cami
May 5, 2023, 6:02am
2
Update: we tried copying the InfluxDB data directory to a backup location, and rsync reports many "Input/output error" failures:
```
rsync: readlink_stat("/data2/influxdb-data/engine/data/0dfadaa5214ae2fd/autogen/301") failed: Input/output error (5)
rsync: readlink_stat("/data2/influxdb-data/engine/data/0dfadaa5214ae2fd/autogen/293") failed: Input/output error (5)
rsync: readlink_stat("/data2/influxdb-data/engine/data/0dfadaa5214ae2fd/autogen/277") failed: Input/output error (5)
rsync: readlink_stat("/data2/influxdb-data/engine/data/0dfadaa5214ae2fd/autogen/253") failed: Input/output error (5)
rsync: readlink_stat("/data2/influxdb-data/engine/data/0dfadaa5214ae2fd/autogen/261") failed: Input/output error (5)
rsync: readlink_stat("/data2/influxdb-data/engine/data/0dfadaa5214ae2fd/autogen/269") failed: Input/output error (5)
```
Can those files be repaired?
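(Editor's note, a diagnostic sketch: `Input/output error (5)` comes straight from the kernel, so it is the block device failing reads on those shard directories rather than rsync misbehaving. Before attempting repairs, it can help to enumerate everything that is unreadable by walking the data directory and probing each entry. The `/data2/influxdb-data` path below is taken from the rsync output above; adjust it for your installation.)

```python
import os

def find_unreadable(root):
    """Walk `root`, probing each file; collect paths that fail with OSError (e.g. EIO)."""
    bad = []
    # os.walk's onerror callback catches directories that cannot even be listed,
    # which matches the readlink_stat failures rsync reported above
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda e: bad.append(e.filename)):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    f.read(4096)  # probe the first blocks only; bad sectors deeper
                                  # in a file would need a full read to surface
            except OSError:
                bad.append(path)
    return sorted(bad)

if __name__ == "__main__":
    # Path taken from the rsync output above -- adjust for your installation.
    for path in find_unreadable("/data2/influxdb-data/engine/data"):
        print(path)
```

Anything this prints is a file or directory the disk cannot serve, which tells you how much data a filesystem repair is likely to discard.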
Hello @cami ,
Hmm, that error is not very helpful, huh.
Looking at this issue, someone was able to resolve it by restarting their pod (opened 06:01AM - 19 Jan 18 UTC, closed 05:58PM - 19 Jan 18 UTC):
I cannot save rules that I create in the Chronograf UI. The error that I get is a 500 input/output error:
```
➜ ~ curl 'https://chronograf.xxxx/chronograf/v1/sources/1/kapacitors/1/rules' -sv -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0' -H 'Accept: application/json, text/plain, */*' -H 'Content-Type: application/json;charset=utf-8' --data '{"id":"DEFAULT_RULE_ID","trigger":"threshold","values":{"operator":"greater than","value":"100","rangeValue":"","relation":"once","percentile":"90"},"message":"test","alerts":["slack"],"alertNodes":[{"name":"slack","args":[],"properties":[{"name":"channel","args":["#bitcoin"]}]}],"every":null,"name":"Bitcoin","query":{"id":"97ce122f-7b85-4a95-8668-0c085d2ff627","database":"metrics","measurement":"bitcoin_exporter_current_balance","retentionPolicy":"autogen","fields":[{"value":"value","type":"field"}],"tags":{},"groupBy":{"time":null,"tags":["currency"]},"areTagsAccepted":true,"rawText":null,"status":null}}'
> POST /chronograf/v1/sources/1/kapacitors/1/rules HTTP/2
> Accept-Encoding: deflate, gzip
> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0
> Accept: application/json, text/plain, */*
> Accept-Language: en-US,en;q=0.5
> Content-Type: application/json;charset=utf-8
> Connection: keep-alive
> Content-Length: 613
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* We are completely uploaded and fine
< HTTP/2 500
< date: Fri, 19 Jan 2018 05:58:13 GMT
< content-type: application/json
< content-length: 43
< x-chronograf-version: 1.3.10.0
< strict-transport-security: max-age=15724800; includeSubDomains;
<
{"code":500,"message":"input/output error"} ➜ ~
```
I see this in the interface as well when saving.
The error in the logs is:
```
time="2018-01-19T06:01:25Z" level=info msg=Request component=server method=POST remote_addr="10.2.1.3:36176" url=/chronograf/v1/sources/1/kapacitors/1/rules
time="2018-01-19T06:01:26Z" level=error msg="Error message input/output error" component=server http_status =500
time="2018-01-19T06:01:26Z" level=info msg="Response: Internal Server Error" code=500 component=server remote_addr="10.2.1.3:36176" response_time=188.247067ms
```
And also seeing this issue (opened 04:44PM - 03 Sep 21 UTC):
Hi,
I have a problem with my influxdb (latest version).
I'm running influx inside of a docker container (azure container instance).
Data is inserted by telegraf.
Everything is fine.
Now I implemented a second telegraf instance which sends the data to a second bucket, same org.
After a few seconds the following log messages appear.
I have no idea where this is coming from.
First of all, who is sending the message "unexpected end of JSON input", and when does this happen?
Is it related to the input data?
Or how can I enhance the debug message?
```
ts=2021-09-03T16:35:22.019827Z lvl=info msg="Series partition compaction (start)" log_id=0WN81wYG000 service=storage-engine partition=0 op_name=series_partition_compaction path=/var/lib/influxdb2/engine/data/1360cd9a8ff45dc9/_series/00 op_event=start
ts=2021-09-03T16:35:26.197561Z lvl=info msg="Series partition compaction (end)" log_id=0WN81wYG000 service=storage-engine partition=0 op_name=series_partition_compaction path=/var/lib/influxdb2/engine/data/1360cd9a8ff45dc9/_series/00 op_event=end op_elapsed=4177.741ms
ts=2021-09-03T16:35:26.477786Z lvl=info msg="Series partition compaction (end)" log_id=0WN81wYG000 service=storage-engine partition=2 op_name=series_partition_compaction path=/var/lib/influxdb2/engine/data/1360cd9a8ff45dc9/_series/02 op_event=end op_elapsed=5379.702ms
ts=2021-09-03T16:35:26.506912Z lvl=info msg="Series partition compaction (end)" log_id=0WN81wYG000 service=storage-engine partition=4 op_name=series_partition_compaction path=/var/lib/influxdb2/engine/data/1360cd9a8ff45dc9/_series/04 op_event=end op_elapsed=4661.029ms
ts=2021-09-03T16:35:46.184542Z lvl=info msg="Write failed" log_id=0WN81wYG000 service=storage-engine service=write shard=3 error="[shard 3] unexpected end of JSON input"
ts=2021-09-03T16:35:56.344869Z lvl=info msg="Write failed" log_id=0WN81wYG000 service=storage-engine service=write shard=3 error="[shard 3] unexpected end of JSON input"
ts=2021-09-03T16:36:05.670810Z lvl=info msg="Write failed" log_id=0WN81wYG000 service=storage-engine service=write shard=3 error="[shard 3] unexpected end of JSON input"
ts=2021-09-03T16:36:18.987579Z lvl=info msg="Write failed" log_id=0WN81wYG000 service=storage-engine service=write shard=3 error="[shard 3] unexpected end of JSON input"
```
Thx in advance
Martin
cami
May 8, 2023, 11:53am
4
Hi @Anaisdg
Thank you for the reply.
It looks like the filesystem was corrupted and destroyed some of the database files (autogen/301 etc…). Using fsck
we could repair the ext4 filesystem, but the damaged files were deleted in the process. After that, the database started up successfully and we were able to create a backup.
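(Editor's note, a sketch of the repair described above for anyone hitting the same thing. The device `/dev/sdb1`, the mount point `/data2`, and the compose service name `influxdb` are placeholders; substitute whatever actually holds your InfluxDB data. Note that fsck may delete files it cannot repair, as happened here, so image the disk first if the data matters.)

```shell
# Stop anything writing to the volume before touching the filesystem.
docker-compose stop influxdb

# fsck must not run on a mounted filesystem.
umount /data2

# -f forces a full check even if the fs is marked clean;
# -y auto-answers "yes" to repair prompts.
# WARNING: irreparably damaged files may be removed.
fsck.ext4 -f -y /dev/sdb1

mount /data2
docker-compose start influxdb
```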
All the best