mirror of
https://github.com/fish-shell/fish-shell.git
synced 2026-05-24 14:01:15 -03:00
Work around broken rendering of pasted multibyte chars in non-UTF-8-ish locale
Run
printf \Xf6 | wl-copy # ö in ISO-8859-1
LANG=de_DE LC_ALL=$LANG gnome-terminal -- build/fish
and press ctrl-v. The pasted data looks like this:
$ set data (wl-paste -n 2>/dev/null | string collect -N)
$ set -S data
$data: set in local scope, unexported, with 1 elements
$data[1]: |\Xf6|
we pass $data directly to "commandline -i", which is supposed to insert it
into the commandline verbatim. What's actually inserted is "�".
This is because of all of:
1. We never decode "\Xf6 -> ö" in this scenario. Decoding it -- like we do
for non-pasted keyboard input -- would fix the issue.
2. We've switched to using Rust's char, which, for better or worse, disallows
code points that are not valid in Unicode (see b77d1d0e2 (Stop crashing
on invalid Unicode input, 2024-02-27)). This means that we don't simply
store \Xf6 as '\u{00f6}'. Instead we use our PUA encoding trick, making it
\u{f6f6} internally.
3. Finally, b77d1d0e2 renders reserved codepoints (which includes PUA chars)
using the replacement character � (sic). This was deemed more
user-friendly than printing an invalid character (which is probably not
mapped to a glyph). Yet it causes problems here: since we think that
\u{f6f6} is garbage, we try to render the replacement character. Apparently
that one is not defined(?) in ISO-8859-1; we get "�".
Fix this regression by removing the replacement character feature.
In future we should maybe decode pasted input instead. We could do that
lazily in "commandline -i", or eagerly in "set data (wl-paste ...)".
This commit is contained in:
@@ -19,7 +19,7 @@
|
||||
use libc::{ONLCR, STDERR_FILENO, STDOUT_FILENO};
|
||||
|
||||
use crate::common::{
|
||||
fish_reserved_codepoint, get_ellipsis_char, get_omitted_newline_str, get_omitted_newline_width,
|
||||
get_ellipsis_char, get_omitted_newline_str, get_omitted_newline_width,
|
||||
has_working_tty_timestamps, shell_modes, str2wcstring, wcs2string, write_loop, ScopeGuard,
|
||||
ScopeGuarding,
|
||||
};
|
||||
@@ -1825,9 +1825,6 @@ fn compute_layout(
|
||||
// \n.
|
||||
// See https://unicode-table.com/en/blocks/control-pictures/
|
||||
fn rendered_character(c: char) -> char {
|
||||
if fish_reserved_codepoint(c) {
|
||||
return '<27>'; // replacement character
|
||||
}
|
||||
if c <= '\x1F' {
|
||||
char::from_u32(u32::from(c) + 0x2400).unwrap()
|
||||
} else {
|
||||
|
||||
Reference in New Issue
Block a user