While playing with the latest version of xtrabackup and compressing its output, I noticed that gzip is unacceptably slow for both compression and decompression. Peter wrote about this some time ago, but I wanted to review that data with some new information. In the current multi-core world a compression utility should use several CPUs to speed up the operation, and my other requirement was the ability to work with stdin / stdout, so I could script something like: innobackupex --stream | compressor | network_copy.
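To make the stdin / stdout requirement concrete, here is a minimal sketch of what the "compressor" stage of such a pipeline has to do: read its input in chunks and emit compressed output as it goes, without ever holding the whole backup in memory. It uses Python's standard zlib module as a stand-in; the function name `compress_stream` and the 1 MiB chunk size are my own illustrative choices, and this is not one of the tools benchmarked in this post.

```python
import io
import zlib

CHUNK = 1 << 20  # read 1 MiB at a time so memory use stays flat


def compress_stream(src, dst, level=1):
    """Copy src to dst through a gzip-format compressor.

    Returns (bytes_read, bytes_written).
    """
    # wbits=31 selects the gzip container, so standard gzip can read the output.
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    n_in = n_out = 0
    while True:
        chunk = src.read(CHUNK)
        if not chunk:
            break
        n_in += len(chunk)
        out = comp.compress(chunk)
        n_out += len(out)
        dst.write(out)
    tail = comp.flush()  # emit whatever the compressor is still buffering
    n_out += len(tail)
    dst.write(tail)
    return n_in, n_out


# Demo on an in-memory buffer standing in for the backup stream.
demo_in = io.BytesIO(b"InnoDB page " * 100_000)
demo_out = io.BytesIO()
print(compress_stream(demo_in, demo_out))
```

Called with `sys.stdin.buffer` and `sys.stdout.buffer` instead of the in-memory buffers, the same function could sit in the middle of a shell pipeline. It is single-threaded, which is exactly the limitation that motivated looking at parallel tools.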
My research gave me the following list: pigz (parallel gzip), pbzip2 (parallel bzip2), qpress (a command-line utility for QuickLZ), and I also wanted to try LZO (as the lzop 1.03 command-line tool plus the LZO 2 libraries). lzop does not support parallel operation, but it is known to have good decompression speed even with 1 thread. UPDATE 17-Mar-2009: I also added lzma results, as requested in the comments.
For the compression test I took ~12GB of InnoDB data files generated by the tpcc benchmark with 100 warehouses.
I tested 1, 2, and 4 parallel threads for the tools that support it, and different compression levels (1, 2, 3 for qpress; -1 and -5 for the other tools).
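The measurement itself is simple: time the compression, time the decompression, and record the size ratio for each tool and level. The sketch below shows that loop using the codecs bundled with Python (zlib, bz2, lzma) on a small synthetic buffer. These are single-threaded in-process bindings, not the pigz / pbzip2 / qpress / lzop binaries I actually ran, so the numbers it prints are not comparable to my results; the names `benchmark`, `CODECS` and `run_all` are hypothetical.

```python
import bz2
import lzma
import time
import zlib

# Each entry maps a name to (compress(data, level), decompress(packed)).
CODECS = {
    "zlib": (lambda d, lvl: zlib.compress(d, lvl), zlib.decompress),
    "bz2": (lambda d, lvl: bz2.compress(d, lvl), bz2.decompress),
    "lzma": (lambda d, lvl: lzma.compress(d, preset=lvl), lzma.decompress),
}


def benchmark(name, data, level):
    """Time one compress/decompress round trip and return the measurements."""
    compress, decompress = CODECS[name]
    t0 = time.perf_counter()
    packed = compress(data, level)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    if restored != data:
        raise ValueError(f"{name} level {level}: round trip changed the data")
    return {
        "tool": name,
        "level": level,
        "ratio": len(packed) / len(data),  # compressed size / original size
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


def run_all(data, levels=(1, 5)):
    """Run every codec at every requested level."""
    return [benchmark(name, data, lvl) for name in CODECS for lvl in levels]


# Synthetic 1 MiB sample; real InnoDB files will compress very differently.
sample = bytes(range(256)) * 4096
for row in run_all(sample):
    print(row)
```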
The raw results are available here http://spreadsheets.google.com/ccc?key=pOIo5aX59b6biPZ0QTVMXHg&hl=en, and I copy the table below in case Google stops working.
| tool | threads | level | compressed size, MB | compression ratio | compression time, sec | compression speed, MB/s | decompression time, sec | decompression speed, MB/s |
|---|---|---|---|---|---|---|---|---|
| qpress | 1 | 1 | 6,058.93 | 0.52 | 109 | 55.59 | 92 | 65.86 |
| qpress | 1 | 2 | 5,892.62 | 0.51 | 201 | 29.32 | 123 | 47.91 |
| qpress | 1 | 3 | 5,885.01 | 0.51 | 473 | 12.44 | 84 | 70.06 |
| qpress | 2 | 1 | 6,058.93 | 0.52 | 65 | 93.21 | 66 | 91.80 |
| qpress | 2 | 2 | 5,892.62 | 0.51 | 110 | 53.57 | 112 | 52.61 |
| qpress | 2 | 3 | 5,885.01 | 0.51 | 245 | 24.02 | 84 | 70.06 |
| qpress | 4 | 1 | 6,058.93 | 0.52 | 48 | 126.23 | 66 | 91.80 |
| qpress | 4 | 2 | 5,892.62 | 0.51 | 64 | 92.07 | 68 | 86.66 |
| qpress | 4 | 3 | 5,885.01 | 0.51 | 130 | 45.27 | 65 | 90.54 |
| pigz | 1 | 1 | 4,839.97 | 0.42 | 438 | 11.05 | 129 | 37.52 |
| pigz | 1 | 5 | 3,460.31 | 0.30 | 763 | 4.54 | 121 | 28.60 |
| pigz | 2 | 1 | 4,839.97 | 0.42 | 213 | 22.72 | 109 | 44.40 |
| pigz | 2 | 5 | 3,460.31 | 0.30 | 379 | 9.13 | 104 | 33.27 |
| pigz | 4 | 1 | 4,839.97 | 0.42 | 107 | 45.23 | 112 | 43.21 |
| pigz | 4 | 5 | 3,460.31 | 0.30 | 190 | 18.21 | 103 | 33.60 |
| LZOP | 1 | 1 | 5,831.25 | 0.50 | 184 | 31.69 | 83 | 70.26 |
| LZOP | 1 | 5 | 5,850.16 | 0.50 | 179 | 32.68 | 87 | 67.24 |
| pbzip2 | 1 | 1 | 4,154.41 | 0.36 | 1594 | 2.61 | 597 | 6.96 |
| pbzip2 | 1 | 5 | 4,007.07 | 0.34 | 1702 | 2.35 | 644 | 6.22 |
| pbzip2 | 2 | 1 | 4,154.41 | 0.36 | 800 | 5.19 | 605 | 6.87 |
| pbzip2 | 2 | 5 | 4,007.07 | 0.34 | 844 | 4.75 | 648 | 6.18 |
| pbzip2 | 4 | 1 | 4,154.41 | 0.36 | 399 | 10.41 | 602 | 6.90 |
| pbzip2 | 4 | 5 | 4,007.07 | 0.34 | 421 | 9.52 | 645 | 6.21 |
| LZMA | 1 | 1 | 3,623.66 | 0.31 | 1454 | 2.49 | 501 | 7.23 |
| LZMA | 1 | 5 | NA | NA | not done in 2h | NA | NA | NA |
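A note on reading the speed columns: judging by the numbers, they appear to be the compressed size divided by the elapsed time (for qpress level 1 at one thread, 6,058.93 / 109 ≈ 55.59), so they measure compressed output per second rather than input consumed per second. The short sketch below reproduces that arithmetic and also derives the speedup from extra threads for the qpress level 1 rows. The helper names are mine; the figures are copied from the table.

```python
def speed_mb_s(compressed_mb, seconds):
    """Throughput as the table seems to define it: compressed MB per second."""
    return round(compressed_mb / seconds, 2)


def speedup(baseline_s, parallel_s):
    """How many times faster a run is than the single-thread baseline."""
    return round(baseline_s / parallel_s, 2)


# qpress, level 1: compressed size and compression time per thread count,
# copied from the table.
QPRESS_L1_SIZE_MB = 6058.93
QPRESS_L1_COMPRESS_S = {1: 109, 2: 65, 4: 48}

baseline = QPRESS_L1_COMPRESS_S[1]
for threads, secs in QPRESS_L1_COMPRESS_S.items():
    print(threads, speed_mb_s(QPRESS_L1_SIZE_MB, secs), speedup(baseline, secs))
```

Going from one to four threads makes qpress level 1 compression a bit more than twice as fast, not four times, so scaling on this box is well short of linear.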
To summarize the results:
- pbzip2 compresses well, but it is too slow. Interestingly, at level 5 its compression is worse than pigz at level 5.
- pigz compresses well and is faster than pbzip2, though still not fast; with multiple threads it may be acceptable, especially if you need compatibility, e.g. when copying the result to boxes where only standard gzip is available.
- qpress is not as good in compression ratio, but its speed is impressive, and we may ship xtrabackup with this compression.
- LZO decompresses even faster than qpress at one thread (83 s vs. 92 s at level 1), but I would like to see a parallel version. There is a patch for that, but it did not apply cleanly to lzop 1.02, so I skipped it.
- In my opinion, in all cases level 1 offers a better tradeoff between archive size and compression/decompression time.
There is no obvious winner; it depends on what matters more to you, size or time, but with this data we can make a decision.