Users have tried to get a list of all tokens -- including operators
-- using "commandline --tokens-raw". That one has been deprecated
by cc2ca60baa (commandline.rst: deprecate --tokens-raw option,
2025-05-05). Part of the reason is that the above command is broken
for multi-line tokens.
Let's support this use case in a way that's less ambiguous.
Closes#11084
Two issues:
1. typing the codepoint 0x123456 into fish_key_reader:
$ fish_key_reader -cV
# decoded from: \xf4\xa3\x91
bind \xf4 'do something'
# decoded from:
bind \xa3 'do something'
# decoded from:
bind \x91 'do something'
The invalid codepoint is represented in its original encoding, which leaks
to the UI. This was more or less intentionally added by b77d1d0e2b (Stop
crashing on invalid Unicode input, 2024-02-27). That commit rendered it
as replacement byte, but that was removed for other reasons in e25a1358e6
(Work around broken rendering of pasted multibyte chars in non-UTF-8-ish
locale, 2024-08-03).
We no longer insert such (PUA) codepoints into the commandline. The "bind"
comes above would work however. I don't think this is something we want
to support. Discard invalid codepoints in the reader, so they can't be
bound and fish_key_reader shows nothing.
2. builtin read silently drops invalid encodings This builtin is not really
suited to read binary data (#11383 is an error scenario), but I guess it can
be bent to do that. Some of its code paths use str2wcstring which passes
through e.g. invalid UTF-8. The read-one-char-at-a-time code path doesn't.
Fix this.
As mentioned in
https://github.com/fish-shell/fish-shell/pull/9688#discussion_r1155089596,
commit b77d1d0e2b (Stop crashing on invalid Unicode input, 2024-02-27), Rust's
char type doesn't support arbitrary 32-bit values. Out-of-range Unicode
codepoints would cause crashes. That commit addressed this by converting
the encoded bytes (e.g. UTF-8) to special private-use-area characters that
fish knows about. It didn't bother to update the code path in builtin read
that relies on mbrtowc as well.
Fix that. Move and rename parse_codepoint() and rename/reorder its input/output
parameters.
Note that the behavior is still wrong if builtin read can't decode the
input; see the next commit.
Fixes#11383
Our use of the terminfo database in /usr/share/terminfo/$TERM is both
1. a way for users to configure app behavior in their terminal (by
setting TERM, copying around and modifying terminfo files)
2. a way for terminal emulator developers to advertise support for
backwards-incompatible features that are not otherwise easily observable.
To 1: this is not ideal (it's very easy to break things). There's not many
things that realistically need configuration; let's use shell variables
instead.
To 2: in practice, feature-probing via terminfo is often wrong. There's not
many backwards-incompatible features that need this; for the ones that do
we can still use terminfo capabilities but query the terminal via XTGETTCAP
directly, skipping the file (which may not exist on the same system as
the terminal).
---
Get rid of terminfo. If anyone finds a $TERM where we need different behavior,
we can hardcode that into fish.
* Allow to override this with `fish_features=no-ignore-terminfo fish`
Not sure if we should document this, since it's supposed to be removed soon,
and if someone needs this (which we don't expect), we'd like to know.
* This is supported on a best-effort basis; it doesn't match the previous
behavior exactly. For simplicity of implementation, it will not change
the fact that we now:
* use parm_left_cursor (CSI Ps D) instead of cursor_left (CSI D) if
terminfo claims the former is supported
* no longer support eat_newline_glitch, which seems no longer present
on today's ConEmu and ConHost
* Tested as described in https://github.com/fish-shell/fish-shell/pull/11345#discussion_r2030121580
* add `man fish-terminal-compatibility` to state our assumptions.
This could help terminal emulator developers.
* assume `parm_up_cursor` is supported if the terminal supports XTGETTCAP
* Extract all control sequences to src/terminal_command.rs.
* Remove the "\x1b(B" prefix from EXIT_ATTRIBUTE_MODE. I doubt it's really
needed.
* assume it's generally okay to output 256 colors
Things have improved since commit 3669805627 (Improve compatibility with
0-16 color terminals., 2016-07-21).
Apparently almost every actively developed terminal supports it, including
Terminal.app and GNU screen.
* That is, we default `fish_term256` to true and keep it only as a way to
opt out of the the full 256 palette (e.g. switching to the 16-color
palette).
* `TERM=xterm-16color` has the same opt-out effect.
* `TERM` is generally ignored but add back basic compatiblity by turning
off color for "ansi-m", "linux-m" and "xterm-mono"; these are probably
not set accidentally.
* Since `TERM` is (mostly) ignored, we don't need the magic "xterm" in
tests. Unset it instead.
* Note that our pexpect tests used a dumb terminal because:
1. it makes fish do a full redraw of the commandline everytime, making it
easier to write assertions.
2. it disables all control sequences for colors, etc, which we usually
don't want to test explicitly.
I don't think TERM=dumb has any other use, so it would be better
to print escape sequences unconditionally, and strip them in
the test driver (leaving this for later, since it's a bit more involved).
Closes#11344Closes#11345
This replaces the test_driver.sh/test.fish/interactive.fish system with a test driver written in python that calls into littlecheck directly and runs pexpect in a subprocess.
This means we reduce the reliance on the fish that we're testing, and we remove a posix sh script that is a weird stumbling block (see my recent quest to make it work on directories with spaces).
To run specific tests, e.g. all the tmux tests and bind.py:
tests/test_driver.py target/release/ tests/checks/tmux*.fish tests/pexpects/bind.py
This is somewhat subtle:
The #RUN line in a littlecheck file will be run by a posix shell,
which means the substitutions will also be mangled by it.
Now, we *have* shell-quoted them, but unfortunately what we need is to
quote them for inside a pre-existing layer of quotes, e.g.
# RUN: fish -C 'set -g fish %fish'
here, %fish can't be replaced with `'path with spaces/fish'`, because
that ends up as
# RUN: fish -C 'set -g fish 'path with spaces/fish''
which is just broken.
So instead, we pass it as a variable to that fish:
# RUN: fish=%fish fish...
In addition, we need to not mangle the arguments in our test_driver.
For that, because we insist on posix shell, which has only one array,
and we source a file, we *need* to stop having that file use
arguments.
Which is okay - test_env.sh could previously be used to start a test,
and now it no longer can because that is test_*driver*.sh's job.
For the interactive tests, it's slightly different:
pexpect.spawn(foo) is sensitive to shell metacharacters like space.
So we shell-quote it.
But if you pass any args to pexpect.spawn, it no longer uses a shell,
and so we cannot shell-quote it.
There could be a better way to fix this?
See the changelog additions for user-visible changes.
Since we enable/disable terminal protocols whenever we pass terminal ownership,
tests can no longer run in parallel on the same terminal.
For the same reason, readline shortcuts in the gdb REPL will not work anymore.
As a remedy, use gdbserver, or lobby for CSI u support in libreadline.
Add sleep to some tests, otherwise they fall (both in CI and locally).
There are two weird failures on FreeBSD remaining, disable them for now
https://github.com/fish-shell/fish-shell/pull/10359/checks?check_run_id=23330096362
Design and implementation borrows heavily from Kakoune.
In future, we should try to implement more of the kitty progressive
enhancements.
Closes#10359
The read test is now failing on GitHub actions even though it passes on
my Mac. It may be due to differences in dd between these two
environments. Stop using dd and just use head.
The read.fish check has a test where it limits the amount of data passed to
`read` to 8192 bytes, and verifies that fish reads exactly that amount.
This check occasionally fails on the OBS builds; it's very hard to repro a
failure locally, but I finally did it.
The amount of data written is limited via `yes` and `dd`:
yes $line | dd bs=1024 count=(math "$fish_read_limit / 1024")
The bug is that `dd` outputs a fixed number of "blocks" where a block
corresponds to a single read. As `yes` and `dd` are running concurrently,
it may happen that `dd` performs a short read; this then counts as a single
block. So `dd` may output less than the desired amount of data.
This can be verified by removing the 2>/dev/null redirection; on a
successful run dd reports `8+0 records out`, on a failed run it reports
`7+1 records out` because one of the records was short.
Fix this by using `fullblock` so that dd will no longer count a short read
as a single block. `head` would probably be a simpler tool to use but we'll
do this for now.
Happily it's not a fish bug. No need to relnote it.
We can't always read in chunks because we often can't bear to
overread:
```fish
echo foo\nbar | begin
read -l foo
read -l bar
end
```
needs to have the first read read `foo` and the second read `bar`. So
here we can only read one byte at a time.
However, when we are directly redirected:
```fish
echo foo | read foo
```
we can, because the data is only for us anyway. The stream will be
closed after, so anything not read just goes away. Nobody else is
there to read.
This dramatically speeds up `read` of long lines through a pipe. How
much depends on the length of the line.
With lines of 5000 characters it's about 15x, with lines of 50
characters about 2x, lines of 5 characters about 1.07x.
See #8542.
Changes it from
```
$fish_color_user: not set in local scope
$fish_color_user: set in global scope, unexported, with 1 elements
$fish_color_user[1]: length=3 value=|080|
$fish_color_user: set in universal scope, unexported, with 1 elements
$fish_color_user[1]: length=7 value=|brgreen|
```
(with the trailing empty line - not just a newline)
to
```
$fish_color_user: set in global scope, unexported, with 1 elements
$fish_color_user[1]: |080|
$fish_color_user: set in universal scope, unexported, with 1 elements
$fish_color_user[1]: |brgreen|
```
It's now good enough to do so.
We don't allow grid-alignment:
```fish
complete -c foo -s b -l barnanana -a '(something)'
complete -c foo -s z -a '(something)'
```
becomes
```fish
complete -c foo -s b -l barnanana -a '(something)'
complete -c foo -s z -a '(something)'
```
It's just more trouble than it is worth.
The one part I'd change:
We align and/or'd parts of an if-condition with the in-block code:
```fish
if true
and false
dosomething
end
```
becomes
```fish
if true
and false
dosomething
end
```
but it's not used terribly much and if we ever fix it we can just
reindent.
Do this only when splitting on IFS characters which usually contains
whitespace characters --- read --delimiter is unchanged; it still
consumes no more than one delimiter per variable. This seems better,
because it allows arbitrary delimiters in the last field.
Fixes#6406
This splits a string into variables according to the shell's
tokenization rules, considering quoting, escaping etc.
This runs an automatic `unescape` on the string so it's presented like
it would be passed to the command. E.g.
printf '%s\n' a\ b
returns the tokens
printf
%s\n
a b
It might be useful to add another mode "--tokenize-raw" that doesn't
do that, but this seems to be the more useful of the two.
Fixes#3823.