Make printf unicode-aware

Specifically, the width and precision format specifiers are interpreted as
referring to the width of the grapheme clusters rather than the byte count of
the string. Note that grapheme clusters can differ in width.

If a precision is specified for a string, meaning its "maximum number of
characters", we consider this to limit the width displayed.
If there is a grapheme cluster whose width is greater than 1,
it might not be possible to get precisely the desired width.
In such cases, this last grapheme cluster is excluded from the output.

Note that the definitions used here are not consistent with the `string length`
builtin at the moment, but this has already been the case.
This commit is contained in:
Daniel Rainer
2025-05-03 01:43:53 +02:00
parent 0d59e89374
commit 09eae92888
6 changed files with 112 additions and 22 deletions

View File

@@ -10,6 +10,8 @@ license = "MIT"
[dependencies]
libc = "0.2.155"
widestring = { version = "1.0.2", optional = true }
unicode-segmentation = "1.12.0"
unicode-width = "0.2.0"
[lints]
workspace = true