Erasure Coder

Leopard-RS Multithreading Results

Results from a hand-tuned (Windows-only) worker thread pool for Leopard:

It’s hard to tell why 8 threads don’t make it 8x faster. My guess, based on my previous multi-threaded XOR test, is that it is hitting the processor’s memory bandwidth limit.

Packet-sized data (1.344 MB in 1344-byte buffers) with 10% redundancy:

Parameters: [original count=1000] [recovery count=100] [buffer bytes=1344] [loss count=100] [random seed=2] (multi-threading OFF)
Leopard Encoder(1.344 MB in 1000 pieces, 100 losses): Input=3385.39 MB/s, Output=338.539 MB/s
Leopard Decoder(1.344 MB in 1000 pieces, 100 losses): Input=578.811 MB/s, Output=57.8811 MB/s

Parameters: [original count=1000] [recovery count=100] [buffer bytes=1344] [loss count=100] [random seed=2] (multi-threading ON)
Leopard Encoder(1.344 MB in 1000 pieces, 100 losses): Input=3215.31 MB/s, Output=321.531 MB/s
Leopard Decoder(1.344 MB in 1000 pieces, 100 losses): Input=751.258 MB/s, Output=75.1258 MB/s

Decoder is about 30% faster with 8 threads; encoder is about 5% slower.

File-sized data (64 MB in 64 KB files) with 10% redundancy:

Parameters: [original count=1000] [recovery count=100] [buffer bytes=64000] [loss count=100] [random seed=2] (multi-threading OFF)
Leopard Encoder(64 MB in 1000 pieces, 100 losses): Input=2471.14 MB/s, Output=247.114 MB/s
Leopard Decoder(64 MB in 1000 pieces, 100 losses): Input=489.63 MB/s, Output=48.963 MB/s

Parameters: [original count=1000] [recovery count=100] [buffer bytes=64000] [loss count=100] [random seed=2] (multi-threading ON)
Leopard Encoder(64 MB in 1000 pieces, 100 losses): Input=2475.06 MB/s, Output=247.506 MB/s
Leopard Decoder(64 MB in 1000 pieces, 100 losses): Input=817.369 MB/s, Output=81.7369 MB/s

Decoder is about 67% faster with 8 threads; encoder throughput is essentially unchanged.
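
To double-check the speedup figures for both data sizes, here is a minimal sketch that recomputes the percentage change from the Input MB/s numbers printed above. The helper name `speedup_percent` is made up for illustration and is not part of Leopard or its benchmark.

```cpp
#include <cassert>

// Hypothetical helper (not part of Leopard): percentage change in throughput
// going from `before_mbps` to `after_mbps`.
// Positive means faster, negative means slower.
double speedup_percent(double before_mbps, double after_mbps) {
    assert(before_mbps > 0.0);  // a zero baseline has no meaningful ratio
    return (after_mbps / before_mbps - 1.0) * 100.0;
}
```

Feeding in the decoder numbers, `speedup_percent(578.811, 751.258)` comes out near 29.8 for packet-sized data and `speedup_percent(489.63, 817.369)` near 66.9 for file-sized data. The packet-sized encoder, `speedup_percent(3385.39, 3215.31)`, comes out near -5.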

Then I scrapped my own thread pool and just used OpenMP, which is far easier and performs better (on file-sized data the encoder runs at about 2x and the decoder at about 1.8x the single-threaded rates):

Parameters: [original count=1000] [recovery count=100] [buffer bytes=64000] [loss count=100] [random seed=2]
Leopard Encoder(64 MB in 1000 pieces, 100 losses): Input=4976.28 MB/s, Output=497.628 MB/s
Leopard Decoder(64 MB in 1000 pieces, 100 losses): Input=870.902 MB/s, Output=87.0902 MB/s

Parameters: [original count=1000] [recovery count=100] [buffer bytes=1344] [loss count=100] [random seed=2]
Leopard Encoder(1.344 MB in 1000 pieces, 100 losses): Input=5250 MB/s, Output=525 MB/s
Leopard Decoder(1.344 MB in 1000 pieces, 100 losses): Input=822.521 MB/s, Output=82.2521 MB/s
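
For a sense of why OpenMP is so much less work than a hand-rolled thread pool, here is a minimal sketch of parallelizing a per-buffer loop with one pragma. This is not Leopard's actual code: `xor_buffers` and its signature are assumptions for illustration, using the same kind of independent per-buffer XOR work as the XOR test mentioned earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Leopard's API): XOR each source buffer into the
// matching destination buffer. Each buffer pair is independent of the others,
// so the outer loop can be split across threads without any locking.
void xor_buffers(std::vector<std::vector<std::uint8_t>>& dst,
                 const std::vector<std::vector<std::uint8_t>>& src) {
    assert(dst.size() == src.size());
    // Signed loop index keeps the loop valid for older OpenMP implementations.
    const int count = static_cast<int>(dst.size());

    // When the compiler has OpenMP enabled (e.g. -fopenmp or /openmp), the
    // iterations are shared among worker threads. Otherwise the pragma is
    // compiled out and the loop simply runs serially.
#if defined(_OPENMP)
#pragma omp parallel for
#endif
    for (int i = 0; i < count; ++i) {
        assert(dst[i].size() == src[i].size());
        const std::size_t bytes = dst[i].size();
        for (std::size_t j = 0; j < bytes; ++j) {
            dst[i][j] ^= src[i][j];
        }
    }
}
```

The single pragma replaces all of the thread creation, work queueing and joining. Whether it actually helps still depends on memory bandwidth, since a loop like this does very little computation per byte touched.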

Posted June 6, 2017

Author: Christopher A Taylor (catid)

Development blog for Christopher A Taylor (catid), systems software engineer at Oculus/Facebook. Focus on erasure correction coding (ECC/FEC), cryptography, networking, lossless image compression.