This completely removes our runtime dependency on gettext. As a
replacement, we have our own code for runtime localization in
`src/wutil/gettext.rs`. It considers the relevant locale variables to
decide which message catalogs to take localizations from. The use of
locale variables is mostly the same as in gettext, with the notable
exception that we do not support "default dialects". If `LANGUAGE=ll` is
set and we don't have a `ll` catalog but a `ll_CC` catalog, we will use
the catalog with the country code suffix. If multiple such catalogs
exist, we use an arbitrary one. (At the moment we have at most one
catalog per language, so this is not particularly relevant.)
By using an `EnvStack` to pass variables to gettext at runtime, we now
respect locale variables which are not exported.
For early output, we don't have an `EnvStack` to pass, so we add an
initialization function which constructs an `EnvStack` containing the
relevant locale variables from the corresponding Environment variables.
Treat `LANGUAGE` as path variable. This add automatic colon-splitting.
The sourcing of catalogs is completely reworked. Instead of looking for
MO files at runtime, we create catalogs as Rust maps at build time, by
converting PO files into MO data, which is not stored, but immediately
parsed to extract the mappings. From the mappings, we create Rust source
code as a build artifact, which is then macro-included in the crate's
library, i.e. `crates/gettext-maps/src/lib.rs`. The code in
`src/wutil/gettext.rs` includes the message catalogs from this library,
resulting in the message catalogs being built into the executable.
The `localize-messages` feature can now be used to control whether to
build with gettext support. By default, it is enabled. If `msgfmt` is
not available at build time, and `gettext` is enabled, a warning will be
emitted and fish is built with gettext support, but without any message
catalogs, so localization will not work then.
As a performance optimization, for each language we cache a separate
Rust source file containing its catalog as a map. This allows us to
reuse parsing results if the corresponding PO files have not changed
since we cached the parsing result.
Note that this approach does not eliminate our build-time dependency on
gettext. The process for generating PO files (which uses `msguniq` and
`msgmerge`) is unchanged, and we still need `msgfmt` to translate from
PO to MO. We could parse PO files directly, but these are significantly
more complex to parse, so we use `msgfmt` to do it for us and parse the
resulting MO data.
Advantages of the new approach:
- We have no runtime dependency on gettext anymore.
- The implementation has the same behavior everywhere.
- Our implementation is significantly simpler than GNU gettext.
- We can have localization in cargo-only builds by embedding
localizations into the code.
Previously, localization in such builds could only work reliably as
long as the binary was not moved from the build directory.
- We no longer have to take care of building and installing MO files in
build systems; everything we need for localization to work happens
automatically when building fish.
- Reduced overhead when disabling localization, both in compilation time
and binary size.
Disadvantages of this approach:
- Our own runtime implementation of gettext needs to be maintained.
- The implementation has a more limited feature set (but I don't think
it lacks any features which have been in use by fish).
Part of #11726Closes#11583Closes#11725Closes#11683
FindRust is too clever by half. It tries to do rustup's job for it.
See b38551dde9 for how that can break.
So we simplify it, and only let it check three things:
- Where's rustc? Look in $PATH and ~/.cargo/bin
- Where's cargo? Look in $PATH and ~/.cargo/bin
- What is the rust target (because we pass it explicitly)?
If any of these aren't that simple, we'll ask the user to tell us,
by setting Rust_COMPILER, Rust_CARGO or Rust_CARGO_TARGET.
None of the other things are helpful to us - we do not support windows
or whatever a "unikraft" is, and if the rust version doesn't work
it'll print its own error.
We could add a rustc version check, but that will become less and less
useful because rustc versions since 1.56 (released October 2021) will check rust-version in
Cargo.toml. So even at this point it's only pretty old rust versions already.
We turned it off, but for some reason (cmake version?) that stopped working on my system.
So instead we just remove all the code that does it.
To be honest I do not know why this exists anyway.
Commit 4e79ec5f tried to restore the static PCRE2 build after the update to the
pcre2 crate, but it set an environment variable at configure time, not build
time.
Properly set the environment variable at build time.
The recent update to the rust-pcre2 crate lost the property where a static
PCRE2 build could be enabled with a Cargo feature. This means that static
PCRE2 builds can no longer be forced.
Switch to setting the "PCRE2_SYS_STATIC" variable again, which is how the
official rust-pcre2 crate expects to work.
- Ubuntu focal is the lowest LTS release that we can support with only
distro packages (e.g. no rustup).
- Remove tsan from Cirrus (it's not working currently, and also not really
important).
- Remove Centos (it passes tests but I'm not sure it's worth adding; there
isn't even an official docker image for CentOS Stream).
Makes it possible to use the sanitizers again.
Note that this requires RUSTFLAGS to be set when running CMake, and will not be
updated when running the build system if the environment variable changes.
pcre2-sys includes a vendored copy of PCRE2, which allows for
statically-linked PCRE2. Hook this up to the CMake build variable, and
remove the C++ integration for PCRE2.
Use Rust for executables
Drops the C++ entry points and restructures the Rust package into a
library and three binary crates.
Renames the fish-rust package to fish.
At least on Ubuntu, "fish_indent" is built before "fish".
Make sure export CURSES_LIBRARY_LIST to all binaries to make sure
that "cached-curses-libnames" is populated.
Closes#10198
This makes "ninja test" write only to the build directory, not to the source
tree. This enables our docker script which mounts the source as read-only.
Keep running tests serially to avoid breaking assumptions.
I think many of these tests can run in parallel and/or don't need test_init().
Use the safe variant everywhere, to get it done faster.
This will allow to use "cargo test" for unit tests that depend on our
curses.rs.
This means that Rust.cmake depends on ConfigureChecks, so move that one to
the front.
- Make CMake use the correct target-path
- Make build.rs use the correct target dir
Workspaces place it in the project root by default, the alternative to making
this change is to add a `.cargo/config.toml` file with
```toml
[build]
target-dir = "fish-rust/target"
```
Which I think is unnecessary, as we likely want to use the new location anyways.
This is more complicated than it needs to be thanks to the presence of CMake and
the C++ ffi in the picture. rsconf can correctly detect the required libraries
and instruct rustc to link against them, but since we generate a static rust
library and have CMake link it against the C++ binaries, we are still at the
mercy of CMake picking up the symbols we want.
Unfortunately, we could detect the gettext symbols but discover at runtime that
they weren't linked in because CMake was compiled with `-DWITH_GETTEXT=0` or
similar (as the macOS CI runner does). This means we also need to pass state
between CMake and our build script to communicate which CMake options were
enabled.
The new asan exit handlers are called to get proper ASAN leak reports (as
calling _exit(0) skips the LSAN reporting stage and exits with success every
time).
They are no-ops when not compiled for ASAN.
Use a "cmake-vendored" directory if it exists, to avoid accessing the
network if it's available, and a target to create an appropriate tarball
to create that directory.
Rust has multiple sanitizers available (with llvm integration).
-Zsanitizer=address catches the most likely culprits but we may want to set up a
separate job w/ -Zsanitizer=memory to catch uninitialized reads.
It might be necessary to execute `cargo build` as `cargo build -Zbuild-std` to
get full coverage.
When we're linking against the hybrid C++ codebase, the sanitizer library is
injected into the binary by also include `-fsanitize=address` in CXXFLAGS - we
do *not* want to manually opt-into `-lasan`. We also need to manually specify
the desired target triple as a CMake variable and then explicitly pass it to all
`cargo` invocations if building with ASAN.
Corrosion has been patched to make sure it follows these rules.
The `cargo-test` target is failing to link under ASAN. For some reason it has
autocxx/ffi dependencies even though only rust-native, ffi-free code should be
tested (and one would think the situation wouldn't change depending on the
presence of the sanitizer flag). It's been disabled under ASAN for now.