New Features in InfluxDB Open Source Backup and Restore

Originally published at: New Features in InfluxDB Open Source Backup and Restore | InfluxData

New in version 1.5, the open source backup utility provides the option to run both backup and restore functions on a live database. It also provides features to back up and restore single or multiple databases, along with optional filtering based on data point timestamps. Finally, the restore tool supports importing data from an InfluxDB Enterprise cluster, while the backup tool can now generate backup files that may be imported into an Enterprise database. The offline backup/restore functions provided in InfluxDB versions 1.4 and earlier are retained unchanged in version 1.5 and are detailed in the open source documentation.

About File Formats

Prior to version 1.5, the open source backup tool created a different backup file format than the Enterprise version of InfluxDB. This legacy format remains fully supported and, in some cases, may even be used as input to the new online restore functionality. For new users, we recommend the new portable backup format, which uses less disk space and provides a clear transfer path for data between the Enterprise and open source versions of InfluxDB.

Backup

The improved backup command is similar to that of previous versions of InfluxDB, except that it can optionally generate backups in a portable format and offers new filtering options to constrain the range of data points exported to the backup. It is invoked via the influxd binary using the -portable flag:
influxd backup -portable [options] <path-to-backup>
Backup Options
  • -host <host:port> - The host to connect to and perform a snapshot of. Defaults to 127.0.0.1:8088.
  • -database <name> - The database to back up. Optional. If not given, all databases are backed up.
  • -retention <name> - The retention policy to back up. Optional.
  • -shard <id> - The shard ID to back up. Optional. If specified, -retention is required.
  • -since <2015-12-24T08:12:13Z> - Perform a file-level backup of data written since the given time. The time must be in RFC3339 format. Optional.
  • -start <2015-12-24T08:12:23Z> - All points earlier than this timestamp will be excluded from the export. Not compatible with -since.
  • -end <2015-12-24T08:12:23Z> - All points later than this timestamp will be excluded from the export. Not compatible with -since.
  • -portable - Generate backup files in the format used for InfluxDB Enterprise.
Example:

To back up all databases in an existing system:

[/tmp/backup_ex]$ influxd backup -portable /tmp/backup_ex
2018/03/15 12:55:26 backing up metastore to /tmp/backup_ex/meta.00
2018/03/15 12:55:26 No database, retention policy or shard ID given. Full meta store backed up.
2018/03/15 12:55:26 Backing up all databases in portable format
2018/03/15 12:55:26 backing up db=
2018/03/15 12:55:26 backing up db=collectd_db rp=autogen shard=1 to /tmp/backup_ex/collectd_db.autogen.00001.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=telegraf rp=autogen shard=2 to /tmp/backup_ex/telegraf.autogen.00002.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=telegraf rp=autogen shard=3 to /tmp/backup_ex/telegraf.autogen.00003.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=telegraf rp=autogen shard=4 to /tmp/backup_ex/telegraf.autogen.00004.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=telegraf rp=autogen shard=5 to /tmp/backup_ex/telegraf.autogen.00005.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=telegraf rp=autogen shard=6 to /tmp/backup_ex/telegraf.autogen.00006.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=telegraf rp=autogen shard=7 to /tmp/backup_ex/telegraf.autogen.00007.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=telegraf rp=autogen shard=8 to /tmp/backup_ex/telegraf.autogen.00008.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=mydb rp=forever shard=9 to /tmp/backup_ex/mydb.forever.00009.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=mydb rp=forever shard=10 to /tmp/backup_ex/mydb.forever.00010.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=tmp3 rp=autogen shard=11 to /tmp/backup_ex/tmp3.autogen.00011.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=tmp3 rp=autogen shard=12 to /tmp/backup_ex/tmp3.autogen.00012.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=tmp3 rp=autogen shard=13 to /tmp/backup_ex/tmp3.autogen.00013.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=tmp3 rp=autogen shard=14 to /tmp/backup_ex/tmp3.autogen.00014.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backing up db=_internal rp=monitor shard=15 to /tmp/backup_ex/_internal.monitor.00015.00 since 0001-01-01T00:00:00Z
2018/03/15 12:55:26 backup complete:
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.meta
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s1.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s2.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s3.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s4.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s5.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s6.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s7.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s8.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s9.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s10.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s11.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s12.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s13.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s14.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.s15.tar.gz
2018/03/15 12:55:26 	/tmp/backup_ex/20180315T165526Z.manifest
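The filtering flags can be combined with -database to export a slice of a single database. As a sketch (the database name and time range here are hypothetical), a backup of one month of the telegraf database might look like:

influxd backup -portable -database telegraf -start 2018-01-01T00:00:00Z -end 2018-02-01T00:00:00Z /tmp/telegraf_jan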

Restore

The restore function is improved in two important ways. First, the restore process no longer requires system down time. Second, a data restore no longer erases the data currently on the target system. A brief technical explanation is that the old restore process deactivated the system and replaced the data folder with the backup data, while the new online process imports data through a streaming API provided by the influxd program.

Whether you have existing backup automation that supports the legacy format or you are a new user, you may wish to try the new online restore functionality to gain the advantages described above. It is activated with either the -portable or -online flag, indicating that the input is in the new portable backup format (the same format that InfluxDB Enterprise uses) or the legacy backup format, respectively. It has the following options:

  • -host <host:port> - The host to connect to. Defaults to 127.0.0.1:8088.
  • -db <name> - Identifies the database from the backup that will be restored.
  • -newdb <name> - The name of the database into which the archived data will be imported on the target system. If not given, the value of -db is used. The new database name must be unique to the target system.
  • -rp <name> - Identifies the retention policy from the backup that will be restored. Requires that -db is set.
  • -newrp <name> - The name of the retention policy that will be created on the target system. Requires that -rp is set. If not given, the value of -rp is used.
  • -shard <id> - Optional. If given, -db and -rp are required. Restores the single shard's data.
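With a legacy-format backup as input, the invocation is the same except that -online replaces -portable; for example (the path here is hypothetical):

influxd restore -online /tmp/legacy_backup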
Example:

To restore the backup taken above to a new, empty instance of InfluxDB (note: the _internal database is skipped by default; though it would be uncommon to do so, it may be imported explicitly using the -db parameter as described above):

[/tmp/backup_ex]$ influxd restore -portable /tmp/backup_ex
2018/03/15 13:01:45 Restoring shard 13 live from backup 20180315T165526Z.s13.tar.gz
2018/03/15 13:01:45 Restoring shard 14 live from backup 20180315T165526Z.s14.tar.gz
2018/03/15 13:01:45 Restoring shard 5 live from backup 20180315T165526Z.s5.tar.gz
2018/03/15 13:01:45 Restoring shard 6 live from backup 20180315T165526Z.s6.tar.gz
2018/03/15 13:01:45 Restoring shard 11 live from backup 20180315T165526Z.s11.tar.gz
2018/03/15 13:01:45 Restoring shard 3 live from backup 20180315T165526Z.s3.tar.gz
2018/03/15 13:01:45 Restoring shard 10 live from backup 20180315T165526Z.s10.tar.gz
2018/03/15 13:01:45 Restoring shard 12 live from backup 20180315T165526Z.s12.tar.gz
2018/03/15 13:01:45 Meta info not found for shard 15 on database _internal. Skipping shard file 20180315T165526Z.s15.tar.gz
2018/03/15 13:01:45 Restoring shard 7 live from backup 20180315T165526Z.s7.tar.gz
2018/03/15 13:01:45 Restoring shard 8 live from backup 20180315T165526Z.s8.tar.gz
2018/03/15 13:01:46 Restoring shard 9 live from backup 20180315T165526Z.s9.tar.gz
2018/03/15 13:01:46 Restoring shard 1 live from backup 20180315T165526Z.s1.tar.gz
2018/03/15 13:01:46 Restoring shard 2 live from backup 20180315T165526Z.s2.tar.gz
2018/03/15 13:01:46 Restoring shard 4 live from backup 20180315T165526Z.s4.tar.gz
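If the target system already holds a database with the same name, the backup can be restored under a new name instead. A sketch with hypothetical names:

influxd restore -portable -db telegraf -newdb telegraf_restored /tmp/backup_ex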

Getting the following error while running

influxd restore -portable .

restore: restore failed while processing manifest files: read manifest: invalid character '\x1f' looking for beginning of value

It seems that your manifest file is missing or corrupted. There should be a file similar to 20180315T165526Z.manifest. That file is a plain-text JSON file. Open it and check that it contains valid JSON. If not, then you'll need to trace your steps back to when you created the backup to see when the manifest file might have been modified. If it looks visually OK, I would recommend running the file through a tool such as jq to see if it exposes any problems with the data.
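For example, from the backup directory (the file name will match your backup's timestamp):

jq . 20180315T165526Z.manifest

If the manifest is valid JSON, jq pretty-prints it; if not, it reports the position of the first parse error.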

When I just tried to do a live backup with 1.5.2, it was going fine, churning along (the database was 32GB as measured by du), until it eventually started showing this error, then stopped.
2018/09/14 10:34:46 backing up db=sightline rp=RP_1hr shard=35112 to /root/myBackup/mydb.RP_1hr.35112.00 since 0001-01-01T00:00:00Z
2018/09/14 10:34:46 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 2s and retrying (0)…
2018/09/14 10:34:48 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 2s and retrying (1)…
2018/09/14 10:34:50 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 2s and retrying (2)…
2018/09/14 10:34:52 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 2s and retrying (3)…
2018/09/14 10:34:54 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 2s and retrying (4)…
2018/09/14 10:34:56 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 2s and retrying (5)…
2018/09/14 10:34:58 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 3.01s and retrying (6)…
2018/09/14 10:35:01 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 11.441s and retrying (7)…
2018/09/14 10:35:12 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 43.477s and retrying (8)…
2018/09/14 10:35:56 Download shard 35112 failed copy backup to file: err=, n=0. Waiting 2m45.216s and retrying (9)…
2018/09/14 10:38:41 backup failed: copy backup to file: err=, n=0
backup: copy backup to file: err=, n=0

Looking at that retention policy, which is only 1 hour long, it probably doesn't have any data in it. Is this an edge case where the backup fails trying to back up a shard with no data, since it's expired out?

P.S. It looks like it had backed up about 9GB to the specified folder when it reached this error, which probably has no relevance, but I mention it to show it was backing up data until this point.

Edit: I see this in the 1.6.0 release notes: “Delete deleted shards in retention service.” This may avoid the issue I mentioned, since the shard is removed. Can anyone confirm?

Hi @Jeffery_K, looks like it might be a bug. I don't think I have a workaround, but I wanted to let you know I'll look into it and get back to you.

@Jeffery_K we have an idea of what's happening, and you are basically correct. At the start of the backup, we get a listing of shards to back up. If one of those shards is deleted by the retention policy in the meantime, then there can be a failure. In our InfluxDB nightly build, and the forthcoming 1.7, we have added an option to continue backing up if there is an error with the current shard:

https://github.com/influxdata/influxdb/blob/master/cmd/influxd/backup/backup.go#L621
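Per that source, the new behavior should be exposed as a -skip-errors flag on the backup command, so on a nightly or 1.7 build the invocation would look something like this (double-check the flag name against your build's influxd backup help output):

influxd backup -portable -skip-errors /tmp/backup_ex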

Where do we pass these commands, please?

Hi @Ahmed_OMRANE, welcome to the community,

The backup and restore commands can be passed on the command line,

Best regards

influxd backup -portable -database demo2 D:/Program Files/influxData/influxdb

backup: Exactly one backup path is required.

It doesn't work; it shows that exactly one backup path is required. What does that mean? How should I pass the command?

Hi ,

The problem is the space in your backup path.
Can you try putting the backup path between quotes?
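For example, with the path from your command above:

influxd backup -portable -database demo2 "D:/Program Files/influxData/influxdb"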

Best regards

Hi, sorry for reviving this old topic, but I was hoping it would help clarify the context of my question.

I'm trying to do an incremental restore to the target database, which, based on the sentence quoted below, should be possible, and yet I get the “database already exists” error message…

What am I missing, please?

Thanks,

The error message :

2019/10/05 20:12:12 error updating meta: DB metadata not changed. database may already exist
restore: DB metadata not changed. database may already exist

What made me think it should be possible:

The above commands work fine for me!
If you are running InfluxDB on a server, use the commands directly, like influxd backup -portable -database DatabaseName /directory/where/to/save.
If in a pod, then kubectl exec -it podname -- /bin/bash, then influxd backup -portable -database xyz /directory/where/to/save
Restore:
influxd restore -portable -db xyz -newdb abc /Directory/wherefromrestore (this solves the “DB already exists” problem)
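If I understand that correctly, the full workaround would be to restore into a temporary database and then copy the points into the existing one with the influx CLI, something like this (all names hypothetical):

influxd restore -portable -db mydb -newdb mydb_bak /tmp/backup
influx -execute 'SELECT * INTO mydb..:MEASUREMENT FROM /.*/ GROUP BY *' -database mydb_bak
influx -execute 'DROP DATABASE mydb_bak'

The GROUP BY * clause should keep tags as tags rather than converting them to fields.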