Backup, part 2: Review and test rsync-based backup tools


This note continues the backup series:
  1. Backup, part 1: Why backup, review of methods, technologies
  2. Backup, part 2: Review and test rsync-based backup tools
  3. Backup, part 3: Review and test duplicity, duplicaty, deja dup
  4. Backup, part 4: zbackup, restic, borgbackup review and testing
  5. Backup, Part 5: Testing bacula and veeam backup for linux
  6. Backup, Part 6: Comparison of backup tools
  7. Backup Part 7: Conclusions


As we wrote in the first article, there are a very large number of backup programs based on rsync.

Of those that best suit our conditions, I will consider 3: rdiff-backup, rsnapshot and burp.

Test file sets


The test file sets will be the same for all candidates, including future articles.

First set: 10 GB of media files plus about 50 MB of the site's PHP source code; file sizes range from a few kilobytes for the source code to tens of megabytes for the media files. The goal is to imitate a site with static content.

Second set: obtained from the first set by renaming a subdirectory containing 5 GB of media files. The goal is to study how the backup system handles a directory rename.

Third set: obtained from the first set by deleting 3 GB of media files and adding 3 GB of new ones. The goal is to study the backup system's behavior during a typical site update.
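The second set exists because rsync-family tools match files by relative path, so a renamed directory looks like a mass delete plus a mass add, and the whole subtree gets retransmitted. A toy Python illustration of that sync plan (the file names are made up for the example):

```python
# rsync-style tools pair up files by relative path. Renaming a
# directory therefore turns every file under it into "new on the
# source" plus "deleted on the destination", even though the content
# is identical.
def sync_plan(src_paths, dst_paths):
    src, dst = set(src_paths), set(dst_paths)
    return sorted(src - dst), sorted(dst - src)  # (to_send, to_delete)

before = {"media/a.bin", "media/b.bin", "site/index.php"}
after  = {"media2/a.bin", "media2/b.bin", "site/index.php"}  # dir renamed

to_send, to_delete = sync_plan(after, before)
print(to_send)    # ['media2/a.bin', 'media2/b.bin'] - retransmitted in full
print(to_delete)  # ['media/a.bin', 'media/b.bin']
```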

Getting Results


Every backup is performed at least 3 times, and each run is preceded by a reset of the file system caches with the commands sync and echo 3 > /proc/sys/vm/drop_caches, on both the test server and the backup storage server.

The server that acts as the backup source has the netdata monitoring software installed; it will be used to estimate the load that the backup process puts on that server.
I also assume that the backup storage server has a slower processor than the main server but more capacious disks with relatively low random write speed - the most common situation for backup storage. Since the backup server should not be doing much besides storing backups, I will not monitor its load with netdata.

I have also changed the servers on which the various backup systems will be tested.

They now have the following characteristics:
Processor

  sysbench --threads=2 --time=30 --cpu-max-prime=20000 cpu run
 sysbench 1.0.17 (using system LuaJIT 2.0.4)

 Running the test with the following options:
 Number of threads: 2
 Initializing random number generator from current time


 Prime numbers limit: 20000

 Initializing worker threads ...

 Threads started!

 CPU speed:
  events per second: 1081.62

 General statistics:
  total time: 30.0013s
  total number of events: 32453

 Latency (ms):
  min: 1.48
  avg: 1.85
  max: 9.84
  95th percentile: 2.07
  sum: 59973.40

 Threads fairness:
  events (avg/stddev): 16226.5000/57.50
  execution time (avg/stddev): 29.9867/0.00
  

RAM, reading ...

  sysbench --threads=4 --time=30 --memory-block-size=1K --memory-scope=global --memory-total-size=100G --memory-oper=read memory run
 sysbench 1.0.17 (using system LuaJIT 2.0.4)

 Running the test with the following options:
 Number of threads: 4
 Initializing random number generator from current time


 Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: read
  scope: global

 Initializing worker threads ...

 Threads started!

 Total operations: 104857600 (5837637.63 per second)

 102400.00 MiB transferred (5700.82 MiB/sec)


 General statistics:
  total time: 17.9540s
  total number of events: 104857600

 Latency (ms):
  min: 0.00
  avg: 0.00
  max: 66.08
  95th percentile: 0.00
  sum: 18544.64

 Threads fairness:
  events (avg/stddev): 26214400.0000/0.00
  execution time (avg/stddev): 4.6362/0.12
  

... and recording

  sysbench --threads=4 --time=30 --memory-block-size=1K --memory-scope=global --memory-total-size=100G --memory-oper=write memory run
 sysbench 1.0.17 (using system LuaJIT 2.0.4)

 Running the test with the following options:
 Number of threads: 4
 Initializing random number generator from current time


 Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

 Initializing worker threads ...

 Threads started!

 Total operations: 91414596 (3046752.56 per second)

 89272.07 MiB transferred (2975.34 MiB/sec)


 General statistics:
  total time: 30.0019s
  total number of events: 91414596

 Latency (ms):
  min: 0.00
  avg: 0.00
  max: 1022.90
  95th percentile: 0.00
  sum: 66430.91

 Threads fairness:
  events (avg/stddev): 22853649.0000/945488.53
  execution time (avg/stddev): 16.6077/1.76
  

Disk on the data source server

  sysbench --threads=4 --file-test-mode=rndrw --time=60 --file-block-size=4K --file-total-size=1G fileio run
 sysbench 1.0.17 (using system LuaJIT 2.0.4)

 Running the test with the following options:
 Number of threads: 4
 Initializing random number generator from current time


 Extra file open flags: (none)
 128 files, 8MiB each
 1GiB total file size
 Block size 4KiB
 Number of IO requests: 0
 Read/Write ratio for combined random IO test: 1.50
 Periodic FSYNC enabled, calling fsync () each 100 requests.
 Calling fsync () at the end of test, Enabled.
 Using synchronous I/O mode
 Doing random r/w test
 Initializing worker threads ...

 Threads started!


 File operations:
  reads/s: 4587.95
  writes/s: 3058.66
  fsyncs/s: 9795.73

 Throughput:
  read, MiB/s: 17.92
  written, MiB/s: 11.95

 General statistics:
  total time: 60.0241s
  total number of events: 1046492

 Latency (ms):
  min: 0.00
  avg: 0.23
  max: 14.45
  95th percentile: 0.94
  sum: 238629.34

 Threads fairness:
  events (avg/stddev): 261623.0000/1849.14
  execution time (avg/stddev): 59.6573/0.00
  

Disk on the backup storage server

  sysbench --threads=4 --file-test-mode=rndrw --time=60 --file-block-size=4K --file-total-size=1G fileio run
 sysbench 1.0.17 (using system LuaJIT 2.0.4)

 Running the test with the following options:
 Number of threads: 4
 Initializing random number generator from current time


 Extra file open flags: (none)
 128 files, 8MiB each
 1GiB total file size
 Block size 4KiB
 Number of IO requests: 0
 Read/Write ratio for combined random IO test: 1.50
 Periodic FSYNC enabled, calling fsync () each 100 requests.
 Calling fsync () at the end of test, Enabled.
 Using synchronous I/O mode
 Doing random r/w test
 Initializing worker threads ...

 Threads started!


 File operations:
  reads/s: 11.37
  writes/s: 7.58
  fsyncs/s: 29.99

 Throughput:
  read, MiB/s: 0.04
  written, MiB/s: 0.03

 General statistics:
  total time: 73.8868s
  total number of events: 3104

 Latency (ms):
  min: 0.00
  avg: 78.57
  max: 3840.90
  95th percentile: 297.92
  sum: 243886.02

 Threads fairness:
  events (avg/stddev): 776.0000/133.26
  execution time (avg/stddev): 60.9715/1.59
  

Network speed between servers

  iperf3 -c backup
 Connecting to host backup, port 5201
 [4] local x.x.x.x port 59402 connected to y.y.y.y port 5201
 [ID] Interval Transfer Bandwidth Retr Cwnd
 [4] 0.00-1.00 sec. 419 MBytes 3.52 Gbits/sec 810 182 KBytes
 [4] 1.00-2.00 sec 393 MBytes 3.30 Gbits/sec 810 228 KBytes
 [4] 2.00-3.00 sec 378 MBytes 3.17 Gbits/sec 810 197 KBytes
 [4] 3.00-4.00 sec 380 MBytes 3.19 Gbits/sec 855 198 KBytes
 [4] 4.00-5.00 sec 375 MBytes 3.15 Gbits/sec 810 182 KBytes
 [4] 5.00-6.00 sec 379 MBytes 3.17 Gbits/sec 765 228 KBytes
 [4] 6.00-7.00 sec 376 MBytes 3.15 Gbits/sec 810 180 KBytes
 [4] 7.00-8.00 sec 379 MBytes 3.18 Gbits/sec 765 253 KBytes
 [4] 8.00-9.00 sec 380 MBytes 3.19 Gbits/sec 810 239 KBytes
 [4] 9.00-10.00 sec 411 MBytes 3.44 Gbits/sec 855 184 KBytes
 - - - - - - - - - - - - - - - - - - - - - - - - - -
 [ID] Interval Transfer Bandwidth Retr
 [4] 0.00-10.00 sec. 3.78 GBytes 3.25 Gbits/sec 8100 sender
 [4] 0.00-10.00 sec 3.78 GBytes 3.25 Gbits/sec receiver
  


Testing Method


  1. The file system on the test server is prepared with the first test set; if necessary, a repository is initialized on the backup storage server.
    The backup is run and its time is measured.
  2. The files on the test server are changed into the second test set. The backup is run and its time is measured.
  3. The files on the test server are changed into the third test set. The backup is run and its time is measured.
  4. The resulting third test set is then treated as the new first set, and steps 1-3 are repeated 2 more times.
  5. The data is recorded in a summary table, and charts from netdata are attached.
  6. A report is produced for each backup tool separately.
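The timing in steps 1-3 can be sketched in a few lines of Python. The actual backup command differs per tool (rsnapshot, rdiff-backup, burp); a no-op Python call stands in here so the sketch is runnable without any backup tool installed:

```python
import subprocess
import sys
import time

def timed_run(cmd):
    # In the real setup, caches are flushed first on both servers with:
    #   sync; echo 3 > /proc/sys/vm/drop_caches   (as root)
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - start

# Stand-in for a real backup command such as ["rsnapshot", "daily"]
elapsed = timed_run([sys.executable, "-c", "pass"])
print(f"backup took {elapsed:.2f}s")
```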

Expected Results


Since all 3 candidates are based on the same technology (rsync), the results are expected to be close to those of plain rsync, including all of its benefits, namely:

  1. Files in the repository will be stored "as is."
  2. The repository will grow only by the difference between backups.
  3. There will be a relatively heavy load on the network during data transfer and a light load on the processor.
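The space behavior in point 2 comes from rsync's delta-transfer idea: the receiver's old file is split into blocks, and only bytes that do not match an existing block are sent. A toy Python sketch of that block-matching step (real rsync uses a rolling weak checksum plus a strong hash over much larger blocks; the 4-byte block here is just for illustration):

```python
# Toy delta: match fixed-size blocks of the old file inside the new
# file; anything that matches is a cheap "copy" reference, anything
# that does not is transmitted as a literal byte.
BLOCK = 4

def delta(old, new):
    blocks = {old[i:i + BLOCK]: i for i in range(0, len(old), BLOCK)}
    ops, i = [], 0
    while i < len(new):
        chunk = new[i:i + BLOCK]
        if chunk in blocks:
            ops.append(("copy", blocks[chunk]))   # reuse old data
            i += BLOCK
        else:
            ops.append(("literal", new[i:i + 1]))  # send this byte
            i += 1
    return ops

old = b"ABCDEFGHIJKL"
new = b"ABCDxxEFGHIJKL"  # two bytes inserted in the middle
ops = delta(old, new)
literal = sum(1 for op, _ in ops if op == "literal")
print(literal)  # 2 - only the inserted bytes travel as literals
```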

A run of plain rsync will be used as a reference; its results were as follows.

The bottleneck was the HDD-based disk on the backup storage server, which is clearly visible in the graphs as a saw-tooth pattern.

Data was copied in 4 minutes and 15 seconds.


Testing rdiff-backup


The first candidate is rdiff-backup, a Python script that backs up one directory to another. The current backup is stored "as is", while earlier backups are kept as increments in a special subdirectory, which saves space.
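A toy model of that reverse-increment layout: the repository always holds the latest version in full, plus a stack of increments that take you back to older versions. Real rdiff-backup stores compact binary deltas produced via librsync; here an "increment" is simply the full previous content, to keep the sketch short:

```python
# Minimal model of rdiff-backup's storage scheme: newest copy kept
# "as is", older versions reachable by stepping back through increments.
class ReverseIncrementStore:
    def __init__(self, content):
        self.latest = content   # current backup, readable without tools
        self.increments = []    # older versions, newest last

    def backup(self, new_content):
        self.increments.append(self.latest)  # old copy becomes an increment
        self.latest = new_content

    def restore(self, steps_back=0):
        if steps_back == 0:
            return self.latest  # latest restore needs no reconstruction
        return self.increments[-steps_back]

store = ReverseIncrementStore(b"v1")
store.backup(b"v2")
store.backup(b"v3")
print(store.restore(0))  # b'v3' - the current copy
print(store.restore(2))  # b'v1' - reconstructed from increments
```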

We will test the typical mode of operation: the client initiates the backup itself, and the server runs the process that receives the data.
Let's see what it can do in our conditions.



Run times for each test:

            First run   Second run   Third run
First set   16m32s      16m26s       16m19s
Second set  2h5m        2h10m        2h8m
Third set   2h9m        2h10m        2h10m


Rdiff-backup reacts very painfully to any large change in the data, and it also does not fully utilize the network.

Testing rsnapshot


The second candidate, rsnapshot, is a Perl script whose main requirement for effective operation is support for hard links. This saves disk space: files that have not changed since the previous backup are stored as hard links to the original file.
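The hard-link trick can be shown in a few lines of Python: linking an unchanged file into a new snapshot directory creates a second name for the same inode, so no data blocks are duplicated:

```python
import os
import tempfile

# Two snapshot directories; the unchanged file in the new snapshot is
# a hard link to the copy in the old one, as rsnapshot does it.
tmp = tempfile.mkdtemp()
snap1 = os.path.join(tmp, "daily.1")  # older snapshot
snap0 = os.path.join(tmp, "daily.0")  # newest snapshot
os.makedirs(snap1)
os.makedirs(snap0)

orig = os.path.join(snap1, "unchanged.txt")
with open(orig, "w") as f:
    f.write("x" * 1024)

link = os.path.join(snap0, "unchanged.txt")
os.link(orig, link)  # same inode, no extra data blocks

same_inode = os.stat(orig).st_ino == os.stat(link).st_ino
print(same_inode, os.stat(orig).st_nlink)  # True 2
```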

The logic of the backup process is also inverted: the server actively "walks" around its clients and pulls the data from them.

The test results were as follows:


            First run   Second run   Third run
First set   4m22s       4m19s        4m16s
Second set  2m6s        2m10s        2m6s
Third set   1m18s       1m10s        1m10s

It worked very, very fast, much faster than rdiff-backup and very close to pure rsync.

Testing burp


Another option is burp, a C implementation built on top of librsync. It has a client-server architecture with client authorization, as well as a web interface (not included in the basic distribution). Another interesting feature is backing up without granting clients the right to restore.

Let's look at its performance.



            First run   Second run   Third run
First set   11m21s      11m10s       10m56s
Second set  5m37s       5m40s        5m35s
Third set   3m33s       3m24s        3m40s

It ran about 2 times slower than rsnapshot, but still quickly enough, and certainly faster than rdiff-backup. The graphs are somewhat saw-toothed - again, performance is limited by the disk subsystem of the backup storage server, although this is not as pronounced as with rsnapshot.

Results


The repository size was about the same for all candidates: first growing to 10 GB, then to 15 GB, then to 18 GB, and so on, which follows from how rsync works. Also worth noting is that all candidates are single-threaded (CPU load was about 50% on a dual-core machine). All 3 candidates allowed restoring the latest backup "as is", that is, files could be restored without any third-party programs, including the ones used to create the repository. This, too, is part of rsync's heritage.

Findings


The more complex a backup system is and the more capabilities it has, the slower it works; but for not-very-demanding projects any of them will do, except perhaps rdiff-backup.


Author of the publication : Pavel Demkovich

Source text: Backup, part 2: Review and test rsync-based backup tools