Compression for InnoDB backup

March 16, 2009
Author
Vadim Tkachenko

While playing with the latest version of xtrabackup and compressing its output, I noticed that gzip is unacceptably slow for both compression and decompression. Peter wrote about this some time ago, but I wanted to revisit that data with some new information. In the current multi-core world a compression utility should use several CPUs to speed up the operation. My other requirement was the ability to work with stdin / stdout, so that I could script something like: innobackupex --stream | compressor | network_copy.
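The stdin / stdout pipeline idea can also be driven from a script. Below is a minimal sketch in Python, not a recommended way to run backups: the helper name run_pipeline, the host name backup-host, and the remote path are hypothetical, and pigz stands in for whichever compressor you pick.

```python
import subprocess


def run_pipeline(stages):
    """Run argv lists as a shell-style pipeline: stages[0] | stages[1] | ...

    Returns (stdout bytes of the last stage, list of exit codes).
    The last stage's output is read into memory, so it should be small
    (for example a network copy command that prints little or nothing).
    """
    procs = []
    prev_stdout = None
    for argv in stages:
        proc = subprocess.Popen(argv, stdin=prev_stdout, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # Close our copy so the upstream stage sees a broken pipe
            # if the downstream stage exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    output = procs[-1].stdout.read()
    procs[-1].stdout.close()
    codes = [p.wait() for p in procs]
    return output, codes


# Hypothetical stages (host and path are placeholders, adjust to your setup):
# stages = [
#     ["innobackupex", "--stream=tar", "./"],
#     ["pigz", "-c", "-1", "-p", "4"],
#     ["ssh", "backup-host", "cat > /backups/backup.tar.gz"],
# ]
# output, codes = run_pipeline(stages)
```

Checking every exit code matters here: in a plain shell pipeline only the last command's status is reported by default, so a failed backup stage can go unnoticed.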

My research produced the following list: pigz (parallel gzip), pbzip2 (parallel bzip2), qpress (a command-line utility for QuickLZ), and I also wanted to try LZO (the lzop 1.03 command-line tool with the LZO 2 libraries). lzop does not support parallel operation, but it is known to have good decompression speed even with one thread. UPDATE 17-Mar-2009: I also added lzma results, as requested in the comments.


For the compression test I took ~12GB of InnoDB data files generated by the tpcc benchmark with 100 warehouses.

I tested 1, 2, and 4 parallel threads for the tools that support it, and different compression levels (1, 2, 3 for qpress; -1 and -5 for the other tools).
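The post does not say how the runs were timed, so the following is only one possible harness, sketched under assumptions: the helper names time_command and pigz_matrix are hypothetical, and only pigz is shown because its -p (threads), -1 to -9 (level), and -c (write to stdout) options are well known. The other tools would need their own argument lists.

```python
import subprocess
import time


def time_command(argv, src_path, dst_path):
    """Feed src_path to the command's stdin, write its stdout to dst_path,
    and return the elapsed wall-clock time in seconds."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        start = time.perf_counter()
        subprocess.run(argv, stdin=src, stdout=dst, check=True)
        return time.perf_counter() - start


def pigz_matrix(threads=(1, 2, 4), levels=(1, 5)):
    """Build one pigz command line per (threads, level) combination."""
    return [
        ["pigz", "-c", "-p", str(t), f"-{level}"]
        for t in threads
        for level in levels
    ]


# Hypothetical usage (file names are placeholders):
# for argv in pigz_matrix():
#     print(argv, time_command(argv, "ibdata.tar", "ibdata.tar.gz"))
```

Wall-clock time is used rather than CPU time because the point of the multi-threaded tools is to trade more CPU for less elapsed time.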

The raw results are available here http://spreadsheets.google.com/ccc?key=pOIo5aX59b6biPZ0QTVMXHg&hl=en, and I copy the table here in case Google stops working.

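| tool   | threads | level | compressed size, MB | compress ratio | compression time, sec | compr speed, MB/s | decomp time, sec | decomp speed, MB/s |
|--------|---------|-------|---------------------|----------------|-----------------------|-------------------|------------------|--------------------|
| qpress | 1       | 1     | 6,058.93            | 0.52           | 109                   | 55.59             | 92               | 65.86              |
| qpress | 1       | 2     | 5,892.62            | 0.51           | 201                   | 29.32             | 123              | 47.91              |
| qpress | 1       | 3     | 5,885.01            | 0.51           | 473                   | 12.44             | 84               | 70.06              |
| qpress | 2       | 1     | 6,058.93            | 0.52           | 65                    | 93.21             | 66               | 91.80              |
| qpress | 2       | 2     | 5,892.62            | 0.51           | 110                   | 53.57             | 112              | 52.61              |
| qpress | 2       | 3     | 5,885.01            | 0.51           | 245                   | 24.02             | 84               | 70.06              |
| qpress | 4       | 1     | 6,058.93            | 0.52           | 48                    | 126.23            | 66               | 91.80              |
| qpress | 4       | 2     | 5,892.62            | 0.51           | 64                    | 92.07             | 68               | 86.66              |
| qpress | 4       | 3     | 5,885.01            | 0.51           | 130                   | 45.27             | 65               | 90.54              |
| pigz   | 1       | 1     | 4,839.97            | 0.42           | 438                   | 11.05             | 129              | 37.52              |
| pigz   | 1       | 5     | 3,460.31            | 0.30           | 763                   | 4.54              | 121              | 28.60              |
| pigz   | 2       | 1     | 4,839.97            | 0.42           | 213                   | 22.72             | 109              | 44.40              |
| pigz   | 2       | 5     | 3,460.31            | 0.30           | 379                   | 9.13              | 104              | 33.27              |
| pigz   | 4       | 1     | 4,839.97            | 0.42           | 107                   | 45.23             | 112              | 43.21              |
| pigz   | 4       | 5     | 3,460.31            | 0.30           | 190                   | 18.21             | 103              | 33.60              |
| LZOP   | 1       | 1     | 5,831.25            | 0.50           | 184                   | 31.69             | 83               | 70.26              |
| LZOP   | 1       | 5     | 5,850.16            | 0.50           | 179                   | 32.68             | 87               | 67.24              |
| pbzip2 | 1       | 1     | 4,154.41            | 0.36           | 1594                  | 2.61              | 597              | 6.96               |
| pbzip2 | 1       | 5     | 4,007.07            | 0.34           | 1702                  | 2.35              | 644              | 6.22               |
| pbzip2 | 2       | 1     | 4,154.41            | 0.36           | 800                   | 5.19              | 605              | 6.87               |
| pbzip2 | 2       | 5     | 4,007.07            | 0.34           | 844                   | 4.75              | 648              | 6.18               |
| pbzip2 | 4       | 1     | 4,154.41            | 0.36           | 399                   | 10.41             | 602              | 6.90               |
| pbzip2 | 4       | 5     | 4,007.07            | 0.34           | 421                   | 9.52              | 645              | 6.21               |
| LZMA   | 1       | 1     | 3,623.66            | 0.31           | 1454                  | 2.49              | 501              | 7.23               |
| LZMA   | 1       | 5     | NA                  | NA             | not done in 2h        | NA                | NA               | NA                 |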
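A note on reading the speed columns: the reported MB/s figures match the compressed size divided by the elapsed time, so they count compressed megabytes per second rather than input megabytes. The sketch below reproduces a few of them and also estimates input-side throughput. That second figure is an assumption of mine, not a number from the measurements: it approximates the original size as compressed size divided by the (rounded) ratio, so treat it as rough.

```python
def speed_mb_s(compressed_mb, seconds):
    """Speed as reported in the table: compressed MB per second."""
    return compressed_mb / seconds


def input_speed_mb_s(compressed_mb, ratio, seconds):
    """Estimated input-side throughput.

    The original size is approximated as compressed size / ratio; the ratio
    is rounded to two decimals in the table, so this is only an estimate.
    """
    return (compressed_mb / ratio) / seconds


# (tool, compressed MB, ratio, compression time in seconds), 4 threads, level 1
rows = [
    ("qpress", 6058.93, 0.52, 48),
    ("pigz", 4839.97, 0.42, 107),
    ("pbzip2", 4154.41, 0.36, 399),
]

for tool, mb, ratio, secs in rows:
    print(
        f"{tool:7s} table: {speed_mb_s(mb, secs):7.2f} MB/s   "
        f"input-side (est.): {input_speed_mb_s(mb, ratio, secs):7.1f} MB/s"
    )
```

Because the tools that compress better emit fewer bytes, the table's MB/s figures understate their speed relative to qpress and LZOP when measured against the input; the elapsed times are the more direct comparison.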

To summarize the results:

  • pbzip2 obviously shows good compression, but its processing speed is too slow. Interestingly, at Level 5 its compression is worse than pigz at Level 5
  • pigz compresses well and is faster than pbzip2, but it is still not that fast; however, with multi-threaded processing it may be OK, especially if you need to keep compatibility, e.g. to copy the result to boxes where only standard gzip is available
  • qpress is not so good in compression ratio, but its speed is impressive, and maybe we will ship xtrabackup with this compression
  • LZO is even faster than qpress at decompression, but I would like to see a parallel version. There is a patch for it, but it did not apply cleanly to lzop 1.02, so I skipped it
  • In my opinion, in all cases Level 1 compression offers the better tradeoff between archive size and compression/decompression time

There is no obvious winner; it depends on what matters more to you, size or time, but with this data we can make a decision.
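One way to make the size-versus-time decision concrete is to add a network transfer to the picture. The sketch below is a simplified model of my own, not part of the benchmark: it assumes the three steps run one after another (compress, copy, decompress) rather than overlapping in a stream, and the link speeds are hypothetical. The tool figures are the Level 1 rows from the table above.

```python
# (tool, compressed MB, compression time s, decompression time s)
CANDIDATES = [
    ("qpress", 6058.93, 48, 66),    # 4 threads, level 1
    ("pigz", 4839.97, 107, 112),    # 4 threads, level 1
    ("pbzip2", 4154.41, 399, 602),  # 4 threads, level 1
    ("lzop", 5831.25, 184, 83),     # 1 thread, level 1
]


def end_to_end_seconds(compressed_mb, compress_s, decompress_s, link_mb_s):
    """Sequential model: compress, then transfer, then decompress."""
    return compress_s + compressed_mb / link_mb_s + decompress_s


def best_tool(link_mb_s):
    """Return the tool with the lowest modelled end-to-end time."""
    return min(
        CANDIDATES,
        key=lambda c: end_to_end_seconds(c[1], c[2], c[3], link_mb_s),
    )[0]


for link in (100, 5, 0.5):  # hypothetical link speeds in MB/s
    print(f"{link:>5} MB/s link -> {best_tool(link)}")
```

Under this model qpress comes out ahead on a 100 MB/s link, pigz at 5 MB/s, and pbzip2 only on a very slow 0.5 MB/s link, which is the size-or-time tradeoff in numbers.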
