Use `clap_complete` to generate completions for our xtasks. This comes
with two complications:
- Due to the unusual CLI of the whole xtask CLI definition being itself
a subcommand under `cargo` (via the `cargo xtask` alias), we need to
tell `clap_complete` that we're generating completions for `cargo`,
which is fairly straightforward to do via two new types which are only
used for generating completions.
- Our completions for `cargo xtask` only make sense within fish's
workspace, so we need to ensure that they are not active elsewhere.
`clap_complete` does not support adding such conditions, so we hack
them together by post-processing its output with sed.
Closes#12739
As described in
https://github.com/fish-shell/fish-shell/pull/9990#discussion_r1382494440,
prior to 77aeb6a2a8 (Port execution, 2023-10-08), "Parser" was
passed by mutable reference ("parser_t&"), even though operation
context was passed as "const operation_context_t &". This worked
because C++ doesn't propagate const to pointers by default (see
https://en.cppreference.com/cpp/experimental/propagate_const).
class operation_context_t {
std::shared_ptr<parser_t> parser;
...
};
So "*ctx->parser" was a "parser_t&", not "const parser_t&".
Rust has stricter const propagation rules which means that const
operation context can't simply hand out a non-const reference to parser.
To be able to port code without changing its structure,
77aeb6a2a8 passed "Parser" by shared reference, using interior
mutability (RefCell) to modify parser fields. This is a bit ugly
(c.f. https://doc.rust-lang.org/std/cell/index.html "interior mutability
is something of a last resort") and means that some borrowing conflicts
are not found at compile time but runtime.
Pass both parser and operation context by exclusive reference, and
remove the interior mutability wrappers from parser's fields.
Since "libdata" is no longer inside a "RefCell", add a "ScopedRefCell"
around "transient_commandline".
The downside is that "ScopeGuard" use can become more intrusive
when we pass "Parser" or "OperationContext" as context (especially
when we use "zelf" since we can't shadow "self"), see
* 2930466d53 (Introduce ScopedCell and ScopedRefCell, 2025-03-15)
* 29ae571afa (Make scoped_push nicer, 2024-12-28)
Avoid this in some cases, specifically when using "ScopedCell" or
"ScopedRefCell". Since "&mut Parser" prevents the "ScopeCell"'s
"ScopeGuard" from holding a shared reference, use an "Rc" to capture
a dynamically-checked reference to the Cell. We could also use raw
pointers instead.
Change "Completer::apply_var_assignments" to return a block ID, to
avoid the need to return a "zelf" "ScopeGuard". In future, we could
probably untangle completer and get away with returning a "ScopeGuard"
called "ctx".
Closes#12694
This command
cd $(mktemp -d)
mkdir a "a b"
complete -C": "
prints
a b/
a/
which is wrong ordering.
Usually the trailing slash should not be compared.
Fix this by always sorting slashes first. Not sure if this is correct
for middle slashes but I couldn't find a case where it matters.
Closes#12695
As of commit a296ee085c (Stop returning a value from ScopeGuarding::commit,
2025-03-15) "ScopeGuard::commit()" is equivalent to "drop()".
Let's use that instead.
Rewrite the PO file handling logic in Rust and make it available via an
xtask. Replaces the
`build_tools/{update_translations,fish_xgettext}.fish` scripts.
Main benefits:
- Better ergonomics
- Better error handling
- Eliminates the need for a fish executable for updating PO files,
which is particularly useful in CI
- Improved performance, mainly due to concurrent threads working on the
PO files in parallel
The behavior is mostly unchanged, with the minor exception that section
headers for empty sections are now omitted in PO files.
The interface for invoking the tooling is quite different. Instead of
working with flags, `cargo xtask gettext` has 3 subcommands:
- `update` modifies the PO files to match the current sources
- `check` is like update, but instead of modifying the PO files, it
shows diffs between the current version of the PO files and what they
would look like after updating. When there is a difference, the xtask
exits non-zero, making it useful for checks to detect outdated PO
files.
- `new` creates a new PO file for the given language.
Both the `update` and `check` command take any number of file paths to
specify the PO files to consider. If none are specified, all files in
`localization/po/` are considered.
Extracting gettext messages from Rust still requires compiling with the
`gettext-extract` feature active. In situations where compilation is
needed for other purposes as well, it can make sense to only build once
and then tell the gettext xtask about the directory into which the
messages have been extracted. This can be done via the
`--rust-extraction-dir` flag. If we stop having gettext messages in
Rust, this logic can be removed.
Closes#12676
Terminating the process at arbitrary points with `std::process::exit`
when errors occur has several problems. There is a lack of information
about what lead up to the error, and it prevents destructors from
running, which in the cases of xtasks can for example result in
temporary files being left on the file system.
Instead, use `anyhow` which conveniently integrates with Rust's Result
type, allowing to return `anyhow::Result<T>`, which is an alias for
`Result<T, anyhow::Error>`, which is compatible with any error type that
implements `std::error::Error`. The advantages of using `anyhow` over
plain `Result`s are that it makes it easier to handle different error
types, attach context to errors, and show the call/context stack
associated with the error. Returning an `anyhow::Result<()>` from `main`
is possible because it implements `std::process::Termination`, so we get
automatic error reporting and corresponding exit codes by simply
bubbling up errors to `main`, attaching context as desired, and finally
returning the result from `main.`
In addition to removing the `std::process::exit` calls, this commit also
improves error handling in a few spots in other ways, such as replacing
`unwrap` by returning errors.
Closes#12674
ShellCheck does not have a built-in way of detecting which files it
should check, so we use ripgrep's `ignore` library to find files not
ignored by our gitignore rules, and then look for a non-fish shebang in
the first line of the file. The resulting shell scripts are then passed
to ShellCheck.
Part of #12661
Move the functions for escaping and unescaping strings from
`src/common.rs` into `fish_common`. It might make sense to move them
into a dedicated crate at some point, but for now just move them to the
preexisting crate to unblock other extraction.
Closes#12625
This time, move char constants from `src/expand.rs` to
`fish_widestring`, which resolves a dependency cycle between
`src/expand.rs` and `src/common.rs`.
Part of #12625
The decoding functions for our widestrings are already in the
`fish_widestring` crate, so by symmetry, it makes sense to put the
encoding functions there as well. This also makes it easier to depend on
these functions, giving more options when it comes to further code
extraction.
Part of #12625
Use `fish_widestring` as the place where char definitions live. This has
the advantage that all our code can depend on `fish_widestring` without
introducing dependency cycles. Having a common place for character
definitions also makes it easier to see which chars have a special
meaning assigned to them.
This change also unblocks some follow-up refactoring by removing a
dependency cycle between `src/common.rs` and `src/wildcard.rs`.
Part of #12625
These functions don't depend on `wcstringutil` functionality, so there
is no need for them to be there. The advantage of putting them into our
`widestring` crate is that quite a lot of code depends on it, and
extracting some of that code would result in crate dependency cycles if
the functions stayed in the `wcstringutil` crate. Our `widestring` crate
does not depend on any of our other crates, so there won't be any cyclic
dependency issues with code in it.
Part of #12625
Exporting it as both `safe_write_loop` and `write_loop` is redundant and
causes inconsistencies. Remove the `pub use` and use `write_loop` for
the function name. It is shorter, and in Rust the default assumption is
that code is safe unless otherwise indicated, so there is no need to be
explicit about it.
Part of #12625
Reduce verbosity of const definitions. Define a dedicated const for the
base of the special key encoding range. This range is 256 bytes wide, so
by defining consts via an u8 offset from the base, we can guarantee that
the consts fall into the allocated range. Ideally, we would also check
for collisions, but Rust's const capabilities don't allow for that as
far as I'm aware.
Having `SPECIAL_KEY_ENCODE_BASE` in the `rust-widestring` crate allows
getting rid of the dependency on `key::Backspace` in the
`fish_reserved_codepoint` function, which unblocks code extraction.
Part of #12625
Another step in splitting up the main library crate.
Note that this change requires removing the `#[cfg(test)]` annotations
around the `LOCAL_OVERRIDE_STACK` code, because otherwise the code would
be removed in test builds for other packages, making the `#[cfg(test)]`
functions unusable from other packages, and functions with such feature
gates in their body would have the code guarded by these gates removed
in test builds for tests in other packages.
Closes#12494
Return None rather than -1 for nonprintables. We probably still
differ from wcwidth which is bad (given we use the same name), but
hopefully not in a way that matters.
Fixes 146384abc6 (Stop using wcwidth entirely, 2026-03-15).
wcwidth isn't a great idea - it returns "-1" for anything it doesn't
know and non-printables, which can easily break text.
It is also unclear that it would be accurate to the system console,
and that's a minority use-case over using ssh to access older systems.
Additionally, it means we use one less function from libc and
simplifies the code.
Closes#12562
"Emoji width" refers to the width of emoji codepoints. Since Unicode
9, they're classified as "wide" according to
TR11 (https://www.unicode.org/reports/tr11/).
Unicode 9 was released in 2016, and this slowly percolated into C
libraries and terminals. Glibc updated its default in 2.26, released
in August 2017.
Until now, we'd guess support for unicode 9 by checking the system
wcwidth function for an emoji - if it returned 2, we'd set our emoji
width to 2 as well.
However, that's a problem in the common case of using ssh to connect
to an old server - modern desktop OS, old server LTS OS, boom.
So now we instead just figure you've got a system that's *displaying*
the emoji that has been updated in the last 9 years.
In effect we're putting the burden on those who run old RHEL et al as
their client OS. They need to set $fish_emoji_width to 1.
Fixes#12500
Part of #12562
Previously, we chose the ellipsis character/string based on the locale.
We now assume a UTF-8 locale, and accordingly always use the Unicode
HORIZONTAL ELLIPSIS U+2026 `…`. When this was changed, some of the logic
for handling different ellipsis values was left behind. It no longer
serves a purpose, so remove it.
The functions returning constants are replaced by constants. Since the
ellipsis as a `wstr` is only used in a single file, make it a local
const there and define it via the `ELLIPSIS_CHAR` const.
Put the `ELLIPSIS_CHAR` definition into `fish-widestring`, removing the
dependency of `fish-wcstringutil` on `fish-common`, helping future
extraction efforts.
One localized message contains an ellipsis, which was inserted via a
placeholder, preventing translators from localizing it. Since the
ellipsis is a constant, put it directly into the localized string.
Closes#12493
This time, functions for decoding `wstr` into various types and the
`ToCString` trait are extracted.
Part of the wider goal of slimming down the main library to improve
incremental build performance and reduce dependency cycles.
Part of #12492
Panicking suggests that an assumption of our code was violated.
The current use of panics in xtasks is for expected failures, so it's
better to avoid panicking and instead just print the error message to
stderr and exit 1.
Closes#12482
Accurately computing the width of arbitrary strings is a non-trivial
problem. We outsource the logic for it to the `unicode-width` crate. But
directly passing our PUA-encoded strings to the crate would give
incorrect results whenever a PUA codepoint is encoded in our string,
since one input PUA codepoint is converted into 3 consecutive codepoints
in our encoding. Therefore, we need to decode before performing width
calculations. Our regular decoding decodes to raw bytes, which is
incompatible with the `unicode-width` crate, since it expects `char`s,
and the decoded bytes could be invalid UTF-8, making their width
undefined. We tackle this problem by building a custom iterator which
does on-the-fly decoding. Encoded PUA codepoints are turned back into
the original codepoints, and any other PUA-encoded bytes are replaced by
one replacement character (U+FFFD) per byte. The latter is not necessary
since PUA codepoints have a defined width of 1, so we could also forward
the PUA-encoded bytes which encode invalid UTF-8 input instead of
inserting the replacement character. The choice to use the replacement
character is made to avoid producing a char sequence where some PUA
codepoints represent themselves, whereas others still encode non-UTF-8
bytes. Such a mix of semantics would be confusing if the char sequence
is ever used for anything else. Replacement characters make it clear
that there are no remaining encoded semantics. Note that using the char
sequences produced in this way for any purpose other than width
computation is not intended. For output, our pre-existing decoding to
bytes should be used, which allows preserving non-UTF-8 bytes.
The implementation of the iterator is not entirely straightforward,
since we need to read up to 3 chars to be able to decide whether we have
an encoded PUA character. Therefore, we need to cache some chars across
invocations of the iterator's `next` and `next_back` invocations. This
is done via a custom buffer struct, which does not require dynamic
allocations.
The tests for the new functionality are only in the main crate because
the encoding function is not available in the `fish-widestring` crate.
Once that is resolved, the tests should be moved.
Part of #12457
Replace the `build_tools/style.fish` script by an xtask. This eliminates
the need for a fish binary for performing the formatting/checking. The
`fish_indent` binary is still needed. Eventually, this should be made
available as a library function, so the xtask can use that instead of
requiring a `fish_indent` binary in the `$PATH`.
The new xtask is called `format` rather than `style`, because that's a
more fitting description of what it does (and what the script it
replaces did).
The old script's behavior is not replicated exactly:
- Specifying `--all` and explicit paths is supported within a single
invocation.
- Explicit arguments no longer have to be files. If a directory is
specified, all files within it will be considered.
- The git check for un-staged changes is no longer filtered by file
names, mainly to simplify the implementation.
- A warning is now printed if neither the `--all` flag nor a path are
provided as arguments. The reason for this is that one might assume
that omitting these arguments would default to formatting everything
in the current directory, but instead no formatting will happen in
this case.
- The wording of some messages is different.
The design of the new code tries to make it easy to add formatters for
additional languages, or change the ones we already have. This is
achieved by separating the code into one function per language, which
can be modified without touching the code for the other languages.
Adding support for a new formatter/language only requires adding a
function which builds the formatter command line based on the arguments
to the xtask, and calling that function from the main `format` function.
Closes#12467
Now that we have trimmed our msgids, add an assertion to ensure that
they stay trimmed. Note that we don't check msgstrs. We could do so when
building the maps which get put into the executable, but there we also
include messages originating from fish scripts, and there we don't
enforce trimmed messages, so limiting the checks to only messages
originating from the Rust code there would not be trivial.
Closes#12405