GSoC 2020: Pre-conditioners applied to ROOT compression algorithms
https://summerofcode.withgoogle.com/projects/#5798059117641728
Author: Keisuke Kamahori
Organization: CERN-HSF
Mentors: Oksana Shadura, Brian Paul Bockelman, Ken Bloom
Introduction
ROOT is a set of C++ frameworks developed by CERN that provides the functionality needed to handle and analyze large amounts of data efficiently. ROOT has become the de facto standard tool in high energy physics because of its flexibility: it can store any type of C++ object in machine-independent, compressed formats such as TTree and RNTuple.
To deal with such huge amounts of data, ROOT has an advanced I/O system that relies heavily on data compression to reduce file sizes. ROOT supports several compression algorithms, such as ZLIB, LZ4, and ZSTD, and each has its own strengths and weaknesses in terms of the two most important factors: compression ratio and decompression speed.
A pre-conditioner is an algorithm that rearranges binary data in order to improve the performance of compression algorithms. Research in 2019 demonstrated that Bitshuffle, a pre-conditioner that transposes binary data at the bit level, can be beneficial for ROOT file formats. Bitshuffle does not compress data by itself, but combined with LZ4 it achieves a better compression ratio than LZ4 alone.
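To make the idea of transposing binary data concrete, here is a minimal pure-Python sketch of a bit-level transpose and its inverse. It is an illustration only: the function names, the fixed 32-bit width, and the bit order inside each packed byte are assumptions of this sketch, not the layout used by the Bitshuffle library or by ROOT.

```python
def bitshuffle(values, width=32):
    """Transpose unsigned integers into bit planes.

    Byte k of plane b packs bit b of elements 8k .. 8k+7
    (the packing order is a choice made for this sketch).
    """
    if len(values) % 8 != 0:
        raise ValueError("this sketch needs a multiple of 8 elements")
    out = bytearray()
    for bit in range(width):
        for start in range(0, len(values), 8):
            packed = 0
            for offset in range(8):
                packed |= ((values[start + offset] >> bit) & 1) << offset
            out.append(packed)
    return bytes(out)


def bitunshuffle(planes, count, width=32):
    """Invert bitshuffle(): rebuild the integers from their bit planes."""
    bytes_per_plane = count // 8
    values = [0] * count
    for bit in range(width):
        for k in range(bytes_per_plane):
            packed = planes[bit * bytes_per_plane + k]
            for offset in range(8):
                values[8 * k + offset] |= ((packed >> offset) & 1) << bit
    return values


sample = [3, 1, 0, 2, 1, 3, 2, 0]       # eight small 32-bit values
shuffled = bitshuffle(sample)            # 32 planes, one byte per plane
restored = bitunshuffle(shuffled, len(sample))
```

For this sample only the two lowest bit planes contain set bits, so the remaining 30 bytes of `shuffled` are zero. Long runs like this are what a byte-oriented compressor such as LZ4 can then exploit.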
This project focused on validating the feasibility of using Bitshuffle when compressing ROOT files (TTree and RNTuple) and on investigating how and when this functionality can be put to use.
Integration
Prior to the coding period, I wrote a unit test as an exercise: PR #5081.
Bitshuffle was integrated as a pre-conditioner for LZ4, forming a new compression algorithm, LZ4BS, in PR #6221. It can be used with the compression setting 6.
To make the most of Bitshuffle, LZ4BS takes a comparison approach: when LZ4BS is chosen as the compression algorithm, it compresses the data both with plain LZ4 and with Bitshuffle followed by LZ4, then keeps the better result. Vectorization (SSE and AVX) was also implemented, in the style of a fat binary.
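The comparison approach can be sketched as follows. This is a hedged illustration, not the code from PR #6221: `zlib` stands in for LZ4 (which is not in the Python standard library), a simple byte shuffle stands in for Bitshuffle, "better" is assumed to mean "smaller output", and the one-byte tag that records which path was taken is a hypothetical choice made for this example.

```python
import zlib

PLAIN, PRECONDITIONED = 0, 1   # hypothetical one-byte tags for this sketch


def shuffle_bytes(raw, width=4):
    """Stand-in pre-conditioner: byte 0 of every element, then byte 1, ..."""
    return b"".join(raw[i::width] for i in range(width))


def unshuffle_bytes(shuffled, width=4):
    """Invert shuffle_bytes() for inputs whose length is a multiple of width."""
    count = len(shuffled) // width
    out = bytearray(len(shuffled))
    for i in range(width):
        out[i::width] = shuffled[i * count:(i + 1) * count]
    return bytes(out)


def compress_best(raw):
    """Compress with and without the pre-conditioner, keep the smaller result."""
    plain = zlib.compress(raw)
    conditioned = zlib.compress(shuffle_bytes(raw))
    if len(conditioned) < len(plain):
        return bytes([PRECONDITIONED]) + conditioned
    return bytes([PLAIN]) + plain


def decompress_best(blob):
    """Read the tag, decompress, and undo the pre-conditioner if it was used."""
    tag, payload = blob[0], blob[1:]
    data = zlib.decompress(payload)
    return unshuffle_bytes(data) if tag == PRECONDITIONED else data


# 1000 little-endian 32-bit integers as example input
raw = b"".join(i.to_bytes(4, "little") for i in range(1000))
blob = compress_best(raw)
```

The cost of this design is that every buffer is compressed twice, which is consistent with compression taking longer than plain LZ4 while decompression only has to follow the one path that was chosen.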
Benchmarks
I also added some new I/O benchmarks to rootbench. For TTree, benchmarks for LHCb, NanoAOD, and ATLAS files were integrated in PR #145 and PR #179. For RNTuple, an ATLAS benchmark was added in PR #170.
Performance Analysis
Here I briefly summarize some of the performance analysis that I reported in the final presentation.
TTree
LZ4BS improved the compression ratio for the LHCb and NanoAOD files. It worked especially well for the NanoAOD file, where it also reduced decompression time. However, no improvement was seen for the ATLAS file (which is known to perform poorly with LZ4).
RNTuple
LZ4BS also improved the performance for RNTuple files.
Compression Time
Compressing files with LZ4BS took at most about 34% longer than with LZ4, which can be seen as reasonable considering the large improvement in the compression ratio.
File/Level | LZ4 (ms) | LZ4BS (ms) | Ratio |
---|---|---|---|
LHCb/1 | 1221.56 | 1400.14 | +14.6% |
LHCb/6 | 16639.5 | 22145.5 | +33.1% |
LHCb/9 | 20609.3 | 27557.8 | +33.7% |
NanoAOD/1 | 13692.0 | 15140.3 | +10.6% |
NanoAOD/6 | 382049 | 437289 | +14.5% |
NanoAOD/9 | 548143 | 549916 | +0.3% |
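
The Ratio column can be reproduced from the two timing columns as the relative increase (LZ4BS - LZ4) / LZ4. The short sketch below recomputes it from the values in the table; the dictionary layout and function name are just illustrative choices.

```python
# (LZ4 ms, LZ4BS ms) per file and compression level, copied from the table
timings_ms = {
    "LHCb/1": (1221.56, 1400.14),
    "LHCb/6": (16639.5, 22145.5),
    "LHCb/9": (20609.3, 27557.8),
    "NanoAOD/1": (13692.0, 15140.3),
    "NanoAOD/6": (382049, 437289),
    "NanoAOD/9": (548143, 549916),
}


def overhead_percent(lz4_ms, lz4bs_ms):
    """Relative increase in compression time of LZ4BS over LZ4, in percent."""
    return (lz4bs_ms - lz4_ms) / lz4_ms * 100.0


overheads = {name: overhead_percent(*pair) for name, pair in timings_ms.items()}
worst = max(overheads, key=overheads.get)

for name, value in overheads.items():
    print(f"{name}: {value:+.1f}%")
print(f"largest overhead: {worst}")
```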
Detailed Analysis
Branches (or pages for RNTuple) that consist only of small & positive int values have a good compression ratio in LZ4BS because there are many consecutive zeros after transposing such data.
On the other hand, float and bool values tend to have worse compression ratio than int values as shown in the table below, possibly because adjacent bytes are unlikely to correlate for such data types.
Type | Total Size in LZ4 | Total Size in LZ4BS | Ratio |
---|---|---|---|
int | 216550264 | 78717632 | -63.7% |
float | 4032419451 | 3198921162 | -21.7% |
bool | 270378042 | 216623912 | -20.0% |
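
The difference between integer and floating-point data can be illustrated with a small experiment that counts how many bytes are entirely zero after a bit-level transpose. This is a toy sketch under stated assumptions: the 32-bit width, the grouping of eight elements per byte, and the synthetic input values are choices made for this example and are not taken from the benchmark files.

```python
import struct


def zero_byte_fraction(words, width=32):
    """Fraction of all-zero bytes after transposing 32-bit words into bit planes.

    Each output byte holds one bit position of eight consecutive words, so a
    byte is zero exactly when that bit is clear in all eight of them.
    """
    if len(words) % 8 != 0:
        raise ValueError("this sketch needs a multiple of 8 elements")
    zero = total = 0
    for bit in range(width):
        for start in range(0, len(words), 8):
            group = words[start:start + 8]
            total += 1
            if all(((w >> bit) & 1) == 0 for w in group):
                zero += 1
    return zero / total


def float_bits(x):
    """Reinterpret a float32 value as its unsigned 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


small_ints = [i % 16 for i in range(64)]                    # values 0..15
floats = [float_bits(1.0 + i / 7.0) for i in range(64)]     # values 1.0 to 10.0

int_fraction = zero_byte_fraction(small_ints)
float_fraction = zero_byte_fraction(floats)
print(f"zero bytes, small ints: {int_fraction:.3f}")
print(f"zero bytes, floats:     {float_fraction:.3f}")
```

The small integers leave every bit plane above the fourth empty, while the mantissa bits of the floats populate most planes, so far fewer zero runs remain for the compressor to remove.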
ZSTD + Bitshuffle
Besides LZ4, Bitshuffle was also tested together with ZSTD, but it did not improve performance. As shown in this table, Bitshuffle reduced the ZSTD file size by only about 5% even in the best case, and the NanoAOD file grew by about 2%.
File | Size in ZSTD | Size in ZSTD+BS | Ratio |
---|---|---|---|
LHCb | 120601097 | 114500217 | -5.1% |
NanoAOD | 1588405426 | 1623656639 | +2.2% |
ATLAS (TTree) | 1981475287 | 1980677449 | -0.0% |
ATLAS (RNTuple) | 912145794 | 912145643 | -0.0% |
Community
I had several opportunities to participate in ROOT I/O Meetings and present my progress to community members. Here is the list of meetings that I attended: