Won't Fix

Votes

0

Found in [Package]

1.6.0

Issue ID

1358219

Regression

No

Burst compiled job is completed few times slower than IL2CPP

Package: Burst

How to reproduce:
1. Open the attached project “BurstPerformance.zip” and select SampleScene (Assets/Scenes/SampleScene.unity)
2. Go to Project Settings > Player > Other Settings > Scripting Backend and set it to IL2CPP
3. Build And Run (File > Build And Run)
4. In the Player click the Game Object button “Button”

Expected result: The Burst-compiled part of the code completes faster than the IL2CPP part
Actual result: The Burst-compiled part of the code completes a few times slower

Reproducible with: 1.6.0 (2019.4.30f1, 2020.3.18f1, 2021.1.20f1, 2021.2.0b9, 2022.1.0a7)

Resolution Note:

    Hey there - I'm from the Burst team and I thought I'd drop some information about why I'm marking this as won't fix.

    Firstly - floating point modulus (x % y) is currently too expensive in Burst, and I'm going to look at including a better variant of it. At present we use SLEEF, which keeps its calculations for float fmodf in 32-bit types, whereas the best way for us to implement floating point modulus for float types is to upscale to double and do the calculation there. Staying in 32-bit is by design in SLEEF, but I think we can do a better job for Burst. So thanks for finding that!

    Secondly - the other main difference between Burst and il2cpp is that il2cpp does not use vector operations to perform the calculation: all the vector math types (float4, etc.) map down to scalar operations underneath. Normally that is bad, but your code sample does a lot of things that vector units really dislike (swizzling data, extracting scalar elements for multiplication, inserting them back into vectors), so in this specific case running the code as scalar happens to be better.
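
To make the access pattern concrete, here is a minimal C sketch of swizzling, lane extraction and lane insertion written as scalar code, which is roughly what a float4 type looks like once it has been mapped down to scalars. The `float4` struct and the functions `swizzle_yzwx`, `scale_lane` and `check_lanes` are hypothetical stand-ins; the attached project's code is not shown in this report.

```c
#include <assert.h>

/* Hypothetical stand-in for a four-lane vector math type. */
typedef struct { float x, y, z, w; } float4;

/* Swizzle: reorder the lanes. On a SIMD unit this is a shuffle;
   as scalar code it is four plain copies. */
static float4 swizzle_yzwx(float4 v) {
    float4 r = { v.y, v.z, v.w, v.x };
    return r;
}

/* Extract two lanes, multiply them as scalars, insert the product back. */
static float4 scale_lane(float4 v) {
    float s = v.x * v.w;   /* lane extraction and scalar multiply */
    v.z = s;               /* lane insertion */
    return v;
}

/* Usage: {1,2,3,4} swizzles to {2,3,4,1}, then z becomes 2 * 1. */
static void check_lanes(void) {
    float4 v = { 1.0f, 2.0f, 3.0f, 4.0f };
    float4 r = scale_lane(swizzle_yzwx(v));
    assert(r.x == 2.0f && r.y == 3.0f && r.z == 2.0f && r.w == 1.0f);
}
```

When the same steps are kept in vector registers, each one typically needs its own shuffle, extract or insert instruction, which is the overhead the note refers to.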

    Burst relies on LLVM, and LLVM's underlying philosophy is that if a user provides vectorized types, the compiler should respect that and preserve the vectorization, even in a case like the one you provided, where it would actually be better for the compiler to completely scalarize the code.

    So the overall il2cpp vs Burst delta will remain more favourable to il2cpp in this example, but I do think I can make Burst a good bit closer to il2cpp by making fmod better.

    Thanks for your interest in Burst!
