27 Commits

Author SHA1 Message Date
Johannes Altmanninger
3ecf1bc46d po: add section markers to indicate translation priority
Part of #11833

(cherry picked from commit a5db91dd85)
2025-09-30 11:52:41 +02:00
Johannes Altmanninger
27db0e5fed Rename repo_root to workspace_root
This seems like a slightly better term
because I think it also applies to tarballs.
Ref: https://github.com/fish-shell/fish-shell/pull/11785#discussion_r2335280389
2025-09-13 15:12:23 +02:00
Daniel Rainer
514d34cc52 Add cargo feature for enabling gettext extraction
This allows having the proc macro crate as an optional dependency and speeds up
compilation in situations where `FISH_GETTEXT_EXTRACTION_FILE` changes, such as
the `build_tools/check.sh` script. Because we don't need to recompile on changes
to the environment variable when the feature is disabled, cargo can reuse
earlier compilation results instead of recompiling everything.
This speeds up the compilation work in `build_tools/check.sh` when no changes
were made which necessitate recompilation.
For such runs of `build_tools/check.sh`, these changes reduce the runtime on my
system by about 10 seconds, from 70 to 60, approximately.
The difference comes from the following two commands recompiling code without
the changes in this commit, but not with them:
- `cargo test --doc --workspace`
- `cargo doc --workspace`
2025-08-18 10:37:59 +02:00
Daniel Rainer
85fb937a4d Add --use-existing-template argument
This is intended to allow translation updates in contexts where building within
the `fish_xgettext.fish` script is undesirable.

Specifically, this allows checking for PO file updates in the tests run by
`test_driver.py`. Because these use a tmpdir for `$HOME`, building within such a
test requires installing the entire Rust toolchain and doing a clean build,
which is a waste of resources.
With this argument, it is possible to build the template before running the
tests and passing the file path into the script.
2025-06-15 23:43:31 +02:00
Daniel Rainer
1e571263a0 Ensure translation script is CWD-independent
2025-06-15 23:43:31 +02:00
Johannes Altmanninger
ac44b3da91 build_tools/fish_xgettext.fish: fix formatting
2025-06-07 11:15:44 +02:00
Daniel Rainer
80033adcf5 Use LocalizableString for gettext
This new wrapper type can be constructed via macros which invoke the
`gettext_extract` proc macro to extract the string literals for PO file
generation.

The type checking enabled by this wrapper should prevent trying to obtain
translations for a string for which none exist.

Because some strings (e.g. for completions) are not defined in Rust, but rather
in fish scripts, the `LocalizableString` type can also be constructed from
non-literals, in which case no extraction happens.
In such cases, it is the programmer's responsibility to only construct the type
for strings which are available for localization.

This approach is a replacement for the `cargo-expand`-based extraction.

When building with the `FISH_GETTEXT_EXTRACTION_FILE` environment variable set,
the `gettext_extract` proc macro will write the messages marked for extraction
to a file in the directory specified by the variable.

Updates to the po files:
- This is the result of running the `update_translations.fish` script using the
  new proc_macro extraction. It finds additional messages compared to the
  `cargo-expand` based approach.
- Message IDs corresponding to paths are removed. They do not have localizations
  in any language and localizing paths would not make sense. I have not
  investigated how they made it into the po files in the first place.
- Some messages are reordered due to `msguniq` sorting differing from `sort`.

Remove docs about installing `cargo-expand`
These are no longer needed due to the switch to our extraction macro.
2025-06-07 00:10:05 +02:00
Daniel Rainer
e5fa047412 Mark format strings in po files
This allows msgfmt to detect issues with translations of format strings.
The detection used here is very simple. It just checks if a string contains '%',
and if it does, the entry in the po file is preceded by '#, c-format'.
Any entries with this marker are checked by msgfmt in our tests, so if an issue
arises, we will notice before it is merged.
2025-05-15 22:09:57 +02:00
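The marking rule described in e5fa047412 can be sketched in POSIX shell as follows. This is an illustration only, not the code in `fish_xgettext.fish` (which is a fish script); the function name `emit_po_entry` is hypothetical, and escaping of quotes and backslashes inside the message is left out.

```sh
#!/bin/sh
# Hypothetical helper: print a PO template entry for one message.
# Any message containing '%' is preceded by the '#, c-format' flag,
# which tells msgfmt to treat the entry as a C format string.
# Note: escaping of '"' and '\' in the message is omitted in this sketch.
emit_po_entry() {
    case $1 in
        *%*) printf '#, c-format\n' ;;
    esac
    printf 'msgid "%s"\nmsgstr ""\n\n' "$1"
}

emit_po_entry 'Unknown option: %s'
emit_po_entry 'No matches found'
```

Only the first entry receives the flag. Because the check is a plain substring test, a message with a literal percent sign that is not a format directive would also be flagged, which matches the "very simple" detection the commit describes.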
Daniel Rainer
cb31887941 Do not hardcode xgettext output path
Instead output on stdout, which lets the caller decide what to do with it.
2025-05-13 21:18:39 +02:00
Daniel Rainer
122f39de66 Replace loop by pipeline
This simplifies the logic a bit and performs better.

Performance improvements for extract_fish_script_messages (time in
microseconds):
- explicit regex: from 128241 to 83471 (speedup 1.5)
- implicit regex: from 682203 to 463635 (speedup 1.5)
2025-05-12 20:54:32 +02:00
Daniel Rainer
1df8fbff67 Replace long list by file
This replaces the `strs` list with a corresponding file, which eliminates the need
for looping over the list.

Use sed to transform strings into gettext po format entries.

Format the file with fish_indent and use a more expressive variable name for the
file cargo expand writes to.

Performance improvements (in microseconds):
- sort+format rust strings: from 21750 to 11096 (speedup 2.0)
2025-05-12 20:35:41 +02:00
Daniel Rainer
ff5ff50183 Speed up constant string extraction
The fish builtin string functions are significantly slower than grep + sed.
The final replacement of \' to ' also does not make any sense here, because
single quotes appear unescaped in Rust strings.

Performance improvement: from 404880 to 44843 (speedup 9.0)

Profiling details (from separate runs):
Time (μs)   Sum (μs)  Command
       174     404880 > set -a strs (string match -rv 'BUILD_VERSION:|PACKAGE_NAME' <$tmpfile |
             string match -rg 'const [A-Z_]*: &str = "(.*)"' | string replace -a "\'" "'")
    404706     404706 -> string match -rv 'BUILD_VERSION:|PACKAGE_NAME' <$tmpfile |
             string match -rg 'const [A-Z_]*: &str = "(.*)"' | string replace -a "\'" "'"

       202      44843 > set -a strs (grep -Ev 'BUILD_VERSION:|PACKAGE_NAME' <$tmpfile |
             grep -E 'const [A-Z_]*: &str = "(.*)"' |
             sed -E -e 's/^.*const [A-Z_]*: &str = "(.*)".*$/\1/' -e "s_\\\'_'_g")
      4952      44641 -> grep -Ev 'BUILD_VERSION:|PACKAGE_NAME' <$tmpfile |
             grep -E 'const [A-Z_]*: &str = "(.*)"' |
             sed -E -e 's/^.*const [A-Z_]*: &str = "(.*)".*$/\1/' -e "s_\\\'_'_g"
     28716      28716 --> command grep --color=auto $argv
     10973      10973 --> command grep --color=auto $argv
2025-05-12 20:11:05 +02:00
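The grep + sed pipeline profiled in ff5ff50183 can be tried on a small input with the POSIX shell sketch below. The function name `extract_const_strings` and the sample input are made up for illustration, and the `\'` replacement is omitted in this sketch.

```sh
#!/bin/sh
# Extract the contents of `const NAME: &str = "..."` declarations from
# stdin, skipping the two constants that the commit excludes.
extract_const_strings() {
    grep -Ev 'BUILD_VERSION:|PACKAGE_NAME' |
        grep -E 'const [A-Z_]*: &str = "(.*)"' |
        sed -E 's/^.*const [A-Z_]*: &str = "(.*)".*$/\1/'
}

# Made-up sample input standing in for the expanded Rust source.
extract_const_strings <<'EOF'
pub const BUILD_VERSION: &str = "4.0";
const HELLO_MSG: &str = "Hello, world";
let x = 5;
EOF
```

Only the `HELLO_MSG` line survives both filters, so the sketch prints its string contents without the surrounding declaration.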
Daniel Rainer
c0d93e4740 Do not use huge fish list
Using a file is significantly faster.

Profiling overview (times in microseconds):
- cargo expand: from 4959320 to 4503409 (speedup 1.1)
- gettext call pipeline: from 436996 to 13536 (speedup 32.3)
- static string pipeline: from 477429 to 404880 (speedup 1.18)
2025-05-12 18:12:37 +02:00
Daniel Rainer
a86a4dfabf Remove source locations from translations
Source locations (file name and line number) where a string originates are not
required by gettext tooling. They can help translators to identify context,
but the value of this is reduced by our lack of context support, meaning that
all occurrences of a string will receive the same translation.
Translators can use `rg` or similar tools to find the source locations.
For further details see this thread:
https://github.com/fish-shell/fish-shell/pull/11463#discussion_r2079378627

The main advantage is that updates to the PO files are now only necessary when
the source strings change, which greatly reduces the diff noise.

A secondary benefit is that the string extraction logic is simplified.
We can now directly extract the strings from fish scripts,
and several issues are fixed alongside, mostly related to quoting.
The regex for extracting implicit messages from fish scripts has been tweaked to
ignore commented-out lines, and properly support lines starting with `and`/`or`.
2025-05-11 21:10:03 +02:00
Daniel Rainer
22bc8e12c9 Fix xgettext implicit regex
The old regex has the problem that it does not handle lines containing any
non-space characters in front of ` complete` (or ` function`), which results in
`string replace` leaving this part in the resulting string.
For example,
`and complete -d "foo"`
would turn into
`andN_ foo`
if passed to
`string replace --regex $regex 'N_ $1'` (where `$regex` is the `$implicit_regex` variable).
Commented-out lines are another issue.
2025-05-11 21:10:03 +02:00
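The failure mode described in 22bc8e12c9 can be reproduced with `sed` in POSIX shell. Both regexes below are illustrative stand-ins, not the ones used in `fish_xgettext.fish`, and the function names `old_extract` and `new_extract` are hypothetical.

```sh
#!/bin/sh
# Unanchored: the match starts at " complete", so anything in front of it
# is left in the output by the substitution.
old_extract() {
    sed -E 's/ complete .*-d "([^"]*)".*/N_ \1/'
}

# Anchored at line start with an optional and/or prefix. Commented-out
# lines do not match, and -n with the p flag prints only matching lines.
new_extract() {
    sed -nE 's/^[[:space:]]*((and|or)[[:space:]]+)?complete .*-d "([^"]*)".*$/N_ \3/p'
}

printf '%s\n' 'and complete -d "foo"' | old_extract   # prints: andN_ foo
printf '%s\n' 'and complete -d "foo"' | new_extract   # prints: N_ foo
printf '%s\n' '# complete -d "foo"' | new_extract     # prints nothing
```

Anchoring the pattern at the start of the line makes the substitution consume the whole line, so no prefix can leak into the extracted message.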
Daniel Rainer
2d58cfe4cb Remove line numbers from translation strings
This greatly reduces the number of changes necessary to the PO files when the
Rust/fish source files are updated. (Changes to the line number can be applied
automatically, but this adds a lot of noise to the git history.)

Due to the way we have been extracting Rust strings, differentiation between
the same source string in different contexts has not been possible regardless
of the change.

It seems that duplicate msgid entries are not permitted in PO files, so since we
do not use context to distinguish the strings we extract, there is no way to
have context-/location-dependent translations, so we might as well reduce the
git noise by eliminating line numbers.

Including source locations helps translators with understanding context.
Because we do not distinguish between contexts for a given source string,
this is of limited utility, but keeping file names at least allows opening the
relevant files and searching them for the string. This might also be helpful to
identify translations which do not make sense in all contexts in which they are
used. (Although without adding context support, the only remedy would be to
remove the translation altogether, as far as I can tell.)

For extraction from Rust, additional issues are fixed:
- File name extraction from the grep results now works properly. Previously,
  lines not starting with whitespace resulted in missing or corrupted matches.
  (missing if the source line contains no colon followed by a whitespace,
  corrupted if it does, then the match included the part of the line in front of
  the colon, instead of just the location)
- Only a single source location per string was supported (`head -n1`). The new
  approach using sed does not have this limitation.
2025-05-08 18:15:56 +02:00
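The effect of 2d58cfe4cb on location comments can be illustrated with a small POSIX shell filter. This is only a sketch of the resulting format: the commit changes how the script generates the entries rather than post-processing them, and the function name `strip_line_numbers` and the sample entry are hypothetical.

```sh
#!/bin/sh
# Hypothetical filter: drop ':<line>' from PO location comments ('#:' lines)
# so that entries only change when a string moves to a different file.
strip_line_numbers() {
    sed -E '/^#:/ s/:[0-9]+( |$)/\1/g'
}

# Made-up sample entry.
strip_line_numbers <<'EOF'
#: src/builtins/set.rs:12 src/parser.rs:340
msgid "Unknown option"
msgstr ""
EOF
```

With line numbers gone, editing unrelated code above a string no longer touches the PO files, which is the reduction in diff noise the commit is after.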
Daniel Rainer
dd5864ce13 Add quotes around gettext string
This should prevent occurrences of the search string from being found in other
locations (e.g. in a comment).

The whole approach of string extraction from Rust sources is sketchy,
but this at least prevents producing garbage when the content of a string
appears somewhere else unquoted.
2025-05-03 16:07:20 +02:00
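The difference that dd5864ce13 makes can be sketched in POSIX shell with a fixed-string search. The function names `find_bare` and `find_quoted` and the sample input are hypothetical; the actual script is written in fish and may search differently.

```sh
#!/bin/sh
# Bare search: also matches the text inside comments or identifiers.
find_bare() {
    grep -nF -- "$1"
}

# Quoted search: only matches where the text appears as a complete
# double-quoted string literal.
find_quoted() {
    grep -nF -- "\"$1\""
}

# Made-up sample source.
sample='// TODO: reword the error message
let msg = "error";'

printf '%s\n' "$sample" | find_bare error     # matches lines 1 and 2
printf '%s\n' "$sample" | find_quoted error   # matches line 2 only
```

The quoted variant still cannot tell a gettext call from any other string literal with the same contents, which is consistent with the commit calling the overall approach sketchy.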
Daniel Rainer
d31dc9ffd8 Fix fish script translation file generation
The previous version generates files which do not preserve the line number from
the original fish script file, resulting in translation not working.

The new approach is quite ugly, and might have some issues,
but at least it seems to work in some cases.
2025-05-03 16:07:03 +02:00
Daniel Rainer
d5e80d43d9 Extract function for gettext extraction
Extracting explicit and implicit messages works essentially the same way, which
is also reflected in the code being identical, except for the regex.

Extract the duplicated code into a function.
2025-05-03 16:03:03 +02:00
Fabian Boehm
d3a66b2d96 translations: Remove tmpdir from location
This avoids changing the location every time you run fish_xgettext.
2024-03-10 16:40:58 +01:00
Fabian Boehm
d91ad2976c Make fish_xgettext sorta work with rust
This is absolutely disgusting code, but it works out okay-ish.

The problem is xgettext has no rust support (it's stuck in review
limbo). So we use cargo-expand to extract all invocations of
gettext, and massage all that to generate a
messages.pot ourselves.

We also assume any string constant could be translated.
2024-03-09 11:48:29 +01:00
Érico Rolim
3e3a42c127 build_tools/fish_xgettext.fish: use temporary directory.
Instead of using /tmp/fish as a temporary directory for this operation,
which could lead to clobbering user files, use mktemp to create an
actual temporary directory.
2020-11-13 16:29:37 +01:00
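The pattern adopted in 3e3a42c127 looks roughly like this in POSIX shell, assuming a `mktemp` that supports `-d`. The file name inside the directory is a hypothetical example, and the cleanup trap is an addition of this sketch rather than something the commit message mentions.

```sh
#!/bin/sh
# Create a private temporary directory instead of a fixed path such as
# /tmp/fish, and remove it again when the script exits.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

# Hypothetical intermediate file; the real script's file names may differ.
printf 'msgid "example"\nmsgstr ""\n' >"$tmpdir/messages.pot"
cat "$tmpdir/messages.pot"
```

Because the directory name is generated fresh on every run, an existing `/tmp/fish` belonging to the user is never touched.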
Johannes Altmanninger
826db22dbf Adjust deprecated stderr redirection in fish_xgettext.fish
2020-07-05 08:55:11 +02:00
Johannes Altmanninger
49c5f96470 Use set -l to force use of a local variable
Bare set overwrites a global/universal variable if it exists.
2020-05-15 08:25:07 +02:00
Jason
3cf6ebc0e1 Amend typos and grammar errors
2019-11-25 13:07:15 +01:00
David Adam
66fd52aa15 fish_xgettext: update translation generation for new build system
Closes #6123.
2019-09-21 22:29:19 +08:00
Kurtis Rader
3e29793d04 improve detection of msgs to be translated
This change does several things. First, it works around a quirk of the
`xgettext` command that only recognizes description strings in even-numbered
positions on the command. Second, it allows descriptions
introduced by the `-d` short flag to be recognized.

More importantly, it normalizes the strings so that `xgettext` correctly
extracts them into the *.po file. Prior to this change many fish script
strings were ignored due to how they were written (e.g., single versus
double quotes).

Fixes #4073
2017-06-02 17:52:55 -07:00