Won't Fix
Votes
0
Found in [Package]
3.0.X - Serialization
Issue ID
ECSB-1147
Regression
Yes
Unicode escape sequences are incorrectly parsed when using SerializedObjectReader.Read()
How to reproduce:
1. Open the attached “serialization_unicode_issue” project
2. In the menu bar, select “Tools → Test”
3. Observe the result in the Console window
Expected result: Unicode sequences are parsed successfully and "TEST" is shown in the Console
Actual result: Unicode sequences are parsed incorrectly: null characters are inserted and the Console output is blank
Reproducible in: 3.0.0-pre.1 (2022.3.32f1), 3.1.1 (2022.3.32f1, 6000.0.5f1)
Not reproducible in: 2.1.2-exp.1 (2021.3.39f1, 2022.3.32f1)
Reproducible on: Windows 11
Not reproducible on: No other environments tested
Note: Also reproducible in Player
Resolution Note:
UnsafePackedBinaryWriter relies on a couple of "streaming" data structures, none of which is guaranteed to hold, in a single buffer, every sequence of bytes that makes up a char stream.
For unescaped sequences that's fine: you can serially output data from start to finish. Two-character escape sequences always start with a backslash and aren't too hard to handle either. But Unicode escape sequences are between 3 and 6 characters long and require substantially more handling to be robust and correct.
I threw together a patch to handle Unicode escape sequences, but taking a step back, this is functionality that, were we really to commit to maintaining it, would require a more substantial rewrite of the internals to avoid having to maintain the very fragile "continuation context" needed to pipeline escaped-character state across token boundaries.
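To illustrate what a "continuation context" involves, here is a minimal Python sketch of a decoder that receives text in arbitrary chunks and must remember a half-read escape sequence between calls. It is not the package's implementation: the class name `StreamingUnescaper`, its methods, and the assumption of JSON-style `\uXXXX` escapes with exactly four hex digits are all hypothetical and chosen only for illustration.

```python
class StreamingUnescaper:
    """Hypothetical sketch: decodes backslash escapes from chunked input."""

    # Two-character escapes (JSON-style set, assumed for this sketch).
    SIMPLE = {'"': '"', '\\': '\\', '/': '/', 'b': '\b',
              'f': '\f', 'n': '\n', 'r': '\r', 't': '\t'}
    HEX = "0123456789abcdefABCDEF"

    def __init__(self):
        # The continuation context: the part of an escape sequence that
        # has been read so far but not yet completed.
        self.pending = ""

    def feed(self, chunk):
        """Decode one chunk; an unfinished escape is kept for the next call."""
        out = []
        for ch in chunk:
            if not self.pending:
                if ch == "\\":
                    self.pending = ch          # start of an escape sequence
                else:
                    out.append(ch)             # plain character
            elif self.pending == "\\":
                if ch == "u":
                    self.pending += ch         # Unicode escape: expect 4 hex digits
                elif ch in self.SIMPLE:
                    out.append(self.SIMPLE[ch])
                    self.pending = ""
                else:
                    raise ValueError("invalid escape: \\" + ch)
            else:
                # Inside \uXXXX: collect hex digits until four are present.
                if ch not in self.HEX:
                    raise ValueError("invalid hex digit in Unicode escape")
                self.pending += ch
                if len(self.pending) == 6:     # backslash + 'u' + 4 digits
                    # Note: surrogate pairs are not combined in this sketch.
                    out.append(chr(int(self.pending[2:], 16)))
                    self.pending = ""
        return "".join(out)

    def finish(self):
        """Call after the last chunk to detect a truncated escape."""
        if self.pending:
            raise ValueError("truncated escape sequence")


# Chunk boundaries deliberately fall in the middle of escape sequences.
decoder = StreamingUnescaper()
chunks = ["\\u00", "54\\u0045\\", "u0053\\u005", "4"]
decoded = "".join(decoder.feed(c) for c in chunks)
decoder.finish()
print(decoded)
```

Even in this small form, every branch has to account for state left over from a previous chunk, which is the fragility the note describes.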
The recommendation is for users to use the StripStringEscapeCharacters mechanism to strip Unicode characters after the data has been linearized.
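The reason linearized data is easier to handle can be sketched in a few lines of Python. This does not reproduce StripStringEscapeCharacters, whose exact behaviour is not described here; the function `unescape_linearized` and its JSON-style escape set are assumptions made purely for illustration. Once all chunks are joined into one string, no escape sequence can straddle a boundary, so a single pass with no carried state is enough.

```python
import re

# A backslash followed by either 'u' plus four hex digits or any single character.
_ESCAPE = re.compile(r"\\(u[0-9a-fA-F]{4}|.)", re.DOTALL)

# Two-character escapes (JSON-style set, assumed for this sketch).
_SIMPLE = {'"': '"', '\\': '\\', '/': '/', 'b': '\b',
           'f': '\f', 'n': '\n', 'r': '\r', 't': '\t'}


def _replace(match):
    body = match.group(1)
    if len(body) == 5:                      # 'u' + 4 hex digits
        return chr(int(body[1:], 16))
    if body in _SIMPLE:
        return _SIMPLE[body]
    raise ValueError("invalid escape: \\" + body)


def unescape_linearized(chunks):
    """Hypothetical helper: join all chunks first, then decode in one pass."""
    text = "".join(chunks)                  # linearize into a single buffer
    return _ESCAPE.sub(_replace, text)


print(unescape_linearized(["\\u00", "54\\u0045\\", "u0053\\u005", "4"]))
```

The trade-off is that the whole string must be held in memory before decoding, which is what the streaming design avoids.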